DeepSeek V4 Pro Hits Together AI

Together AI launches DeepSeek V4 Pro, a 1.6T MoE model with a 512K context window and new cached input pricing for cost-effective long-context reasoning.

[Image: Together AI platform interface showing DeepSeek V4 Pro availability. DeepSeek V4 Pro's availability on Together AI signifies a leap in accessible long-context reasoning. Credit: Together AI]

Together AI has launched DeepSeek V4 Pro, a 1.6-trillion-parameter Mixture-of-Experts (MoE) model, now accessible via its platform. The launch brings advanced long-context reasoning capabilities to developers without the overhead of self-hosting.

The model boasts a 512K token context window on Together AI, expandable to a full 1 million tokens on dedicated infrastructure. This capacity is designed for complex tasks like analyzing entire code repositories or large document sets.
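As a rough feasibility check for such workloads, the sketch below estimates whether a document set fits in the 512K window. It assumes the common ~4-characters-per-token heuristic for English text and code; real counts depend on the model's tokenizer, so treat the numbers as ballpark only.

```python
# Rough check of whether a corpus fits in a 512K-token context window.
# The chars/4 ratio is a heuristic assumption, not the model's tokenizer.

CONTEXT_WINDOW = 512_000  # tokens on Together AI's serverless endpoint

def estimated_tokens(texts):
    """Approximate total tokens as total characters divided by 4."""
    return sum(len(t) for t in texts) // 4

def fits_in_context(texts, reserve_for_output=8_000):
    """True if the corpus plus an output budget fits in the window."""
    return estimated_tokens(texts) + reserve_for_output <= CONTEXT_WINDOW

docs = ["def main():\n    pass\n" * 1_000]  # stand-in for repository files
print(estimated_tokens(docs), fits_in_context(docs))
```

A real pipeline would tokenize with the model's own tokenizer before deciding whether to truncate, chunk, or move to a dedicated 1M-context deployment.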

Controllable Reasoning Modes

DeepSeek V4 Pro introduces three distinct reasoning modes: Non-Think for rapid, low-complexity tasks; Think High for deeper analysis and multi-step reasoning; and Think Max for maximum effort in challenging scenarios like deep debugging.


This flexibility allows users to tailor the model's processing depth to specific workload requirements, optimizing for speed or thoroughness.
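One way to wire that choice into application code is to pick the mode per request. The sketch below builds a chat-completion payload; the `reasoning_effort` field name, the mode strings, and the model identifier are assumptions based on the article's description, not confirmed API details, so check Together AI's API reference before use.

```python
# Sketch: selecting a reasoning mode per request.
# "reasoning_effort", the mode names, and the model id are assumptions.

MODES = {
    "non-think": "rapid, low-complexity tasks",
    "think-high": "deeper analysis and multi-step reasoning",
    "think-max": "maximum effort, e.g. deep debugging",
}

def build_request(prompt: str, mode: str = "non-think") -> dict:
    """Assemble a chat-completion payload with the chosen reasoning mode."""
    if mode not in MODES:
        raise ValueError(f"unknown reasoning mode: {mode!r}")
    return {
        "model": "deepseek-ai/DeepSeek-V4-Pro",  # hypothetical model id
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": mode,  # assumed parameter name
    }

payload = build_request("Find the race condition in this trace.", "think-max")
print(payload["reasoning_effort"])
```

Routing cheap classification traffic to the non-think mode while reserving the heavier modes for debugging or planning keeps both latency and cost predictable.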

Cost-Effective Long Context

A key innovation is the introduction of cached input pricing. This feature significantly reduces costs for workloads that reuse large context windows, such as analyzing a stable set of documents or code. Prices are set at $2.10 per 1M input tokens, $0.20 per 1M cached input tokens, and $4.40 per 1M output tokens.

This pricing structure represents up to a 90% cost reduction for the reused context portion of queries, making extensive analysis more economically viable.
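A quick worked example makes the savings concrete. Using the per-million-token prices quoted above, the sketch below compares a cold query (nothing cached) against a warm query where a hypothetical 400K-token codebase is served from the cache; only the scenario sizes are invented.

```python
# Worked cost comparison using the article's quoted prices.
PRICE_INPUT = 2.10 / 1_000_000    # $/token, fresh input
PRICE_CACHED = 0.20 / 1_000_000   # $/token, cached input
PRICE_OUTPUT = 4.40 / 1_000_000   # $/token, output

def query_cost(fresh_in: int, cached_in: int, out: int) -> float:
    """Dollar cost of one query given its token breakdown."""
    return (fresh_in * PRICE_INPUT
            + cached_in * PRICE_CACHED
            + out * PRICE_OUTPUT)

# Hypothetical workload: a 400K-token codebase reused across queries,
# plus a 2K-token fresh question and a 4K-token answer each time.
cold = query_cost(402_000, 0, 4_000)      # first call, nothing cached
warm = query_cost(2_000, 400_000, 4_000)  # later calls hit the cache
print(f"cold ${cold:.3f}  warm ${warm:.3f}")
```

The cached rate of $0.20 versus $2.10 per million input tokens is the ~90% reduction on the reused portion; output tokens are billed at the full rate either way.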

Performance and Deployment

The model leverages hybrid attention mechanisms, combining Compressed Sparse Attention and Heavily Compressed Attention. DeepSeek reports this architecture reduces single-token inference FLOPs and KV cache requirements compared to previous versions.

DeepSeek V4 Pro is available through Together AI's Serverless Inference for evaluation and development, with options for Monthly Reserved instances and Dedicated Inference for production workloads requiring full 1M context and greater control.

This release positions Together AI as a provider of demanding long-context reasoning models, enabling complex applications such as advanced code agents and comprehensive research synthesis.

© 2026 StartupHub.ai. All rights reserved.