DeepSeek V4 Pro Hits Together AI

Together AI launches DeepSeek V4 Pro, a 1.6T MoE model with a 512K context window and new cached input pricing for cost-effective long-context reasoning.

[Image: Together AI platform interface showing DeepSeek V4 Pro availability. DeepSeek V4 Pro's availability on Together AI signifies a leap in accessible long-context reasoning. Credit: Together AI]

Together AI has launched DeepSeek V4 Pro, a 1.6-trillion-parameter Mixture-of-Experts (MoE) model, now accessible via its platform. The launch brings advanced long-context reasoning capabilities to developers without the overhead of self-hosting.

The model boasts a 512K token context window on Together AI, expandable to a full 1 million tokens on dedicated infrastructure. This capacity is designed for complex tasks like analyzing entire code repositories or large document sets.
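As a rough feasibility check for such workloads, the sketch below estimates whether a document set fits in the 512K window. It assumes the common ~4-characters-per-token heuristic for English text and code; real counts depend on the model's tokenizer, so treat the numbers as ballpark only.

```python
# Rough check of whether a corpus fits in a 512K-token context window.
# The chars/4 ratio is a heuristic assumption, not the model's tokenizer.

CONTEXT_WINDOW = 512_000  # tokens on Together AI's serverless endpoint

def estimated_tokens(texts):
    """Approximate total tokens as total characters divided by 4."""
    return sum(len(t) for t in texts) // 4

def fits_in_context(texts, reserve_for_output=8_000):
    """True if the corpus plus an output budget fits in the window."""
    return estimated_tokens(texts) + reserve_for_output <= CONTEXT_WINDOW

docs = ["def main():\n    pass\n" * 1_000]  # stand-in for repository files
print(estimated_tokens(docs), fits_in_context(docs))
```

A real pipeline would tokenize with the model's own tokenizer before deciding whether to truncate, chunk, or move to a dedicated 1M-context deployment.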

Controllable Reasoning Modes

DeepSeek V4 Pro introduces three distinct reasoning modes: Non-Think for rapid, low-complexity tasks; Think High for deeper analysis and multi-step reasoning; and Think Max for maximum effort in challenging scenarios like deep debugging.


This flexibility allows users to tailor the model's processing depth to specific workload requirements, optimizing for speed or thoroughness.
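One way to wire that choice into application code is to pick the mode per request. The sketch below builds a chat-completion payload; the `reasoning_effort` field name, the mode strings, and the model identifier are assumptions based on the article's description, not confirmed API details, so check Together AI's API reference before use.

```python
# Sketch: selecting a reasoning mode per request.
# "reasoning_effort", the mode names, and the model id are assumptions.

MODES = {
    "non-think": "rapid, low-complexity tasks",
    "think-high": "deeper analysis and multi-step reasoning",
    "think-max": "maximum effort, e.g. deep debugging",
}

def build_request(prompt: str, mode: str = "non-think") -> dict:
    """Assemble a chat-completion payload with the chosen reasoning mode."""
    if mode not in MODES:
        raise ValueError(f"unknown reasoning mode: {mode!r}")
    return {
        "model": "deepseek-ai/DeepSeek-V4-Pro",  # hypothetical model id
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": mode,  # assumed parameter name
    }

payload = build_request("Find the race condition in this trace.", "think-max")
print(payload["reasoning_effort"])
```

Routing cheap classification traffic to the non-think mode while reserving the heavier modes for debugging or planning keeps both latency and cost predictable.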

Cost-Effective Long Context

A key innovation is the introduction of cached input pricing. This feature significantly reduces costs for workloads that reuse large context windows, such as analyzing a stable set of documents or code. Prices are set at $2.10 per 1M input tokens, $0.20 per 1M cached input tokens, and $4.40 per 1M output tokens.

This pricing structure represents up to a 90% cost reduction for the reused context portion of queries, making extensive analysis more economically viable.
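A quick worked example makes the savings concrete. Using the per-million-token prices quoted above, the sketch below compares a cold query (nothing cached) against a warm query where a hypothetical 400K-token codebase is served from the cache; only the scenario sizes are invented.

```python
# Worked cost comparison using the article's quoted prices.
PRICE_INPUT = 2.10 / 1_000_000    # $/token, fresh input
PRICE_CACHED = 0.20 / 1_000_000   # $/token, cached input
PRICE_OUTPUT = 4.40 / 1_000_000   # $/token, output

def query_cost(fresh_in: int, cached_in: int, out: int) -> float:
    """Dollar cost of one query given its token breakdown."""
    return (fresh_in * PRICE_INPUT
            + cached_in * PRICE_CACHED
            + out * PRICE_OUTPUT)

# Hypothetical workload: a 400K-token codebase reused across queries,
# plus a 2K-token fresh question and a 4K-token answer each time.
cold = query_cost(402_000, 0, 4_000)      # first call, nothing cached
warm = query_cost(2_000, 400_000, 4_000)  # later calls hit the cache
print(f"cold ${cold:.3f}  warm ${warm:.3f}")
```

The cached rate of $0.20 versus $2.10 per million input tokens is the ~90% reduction on the reused portion; output tokens are billed at the full rate either way.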

Performance and Deployment

The model leverages hybrid attention mechanisms, combining Compressed Sparse Attention and Heavily Compressed Attention. DeepSeek reports this architecture reduces single-token inference FLOPs and KV cache requirements compared to previous versions.

DeepSeek V4 Pro is available through Together AI's Serverless Inference for evaluation and development, with options for Monthly Reserved instances and Dedicated Inference for production workloads requiring full 1M context and greater control.

This release positions Together AI as a provider of demanding long-context reasoning models, enabling complex applications such as advanced code agents and comprehensive research synthesis.

© 2026 StartupHub.ai. All rights reserved.