#Together AI
17 articles with this tag

Coding Agent Inference Benchmark Revealed
Together AI unveils a new benchmark for coding agent inference, highlighting performance under real-world load and significant cost advantages.

Together AI Taps Blockchain for Cheaper AI
Together AI and Pearl Research Labs are integrating blockchain to cut AI inference costs, offering discounted model access subsidized by cryptocurrency mining.

Violin: AI Translates Video Content
Together AI launches Violin, an open-source AI tool for video translation and interactive content analysis.

Together AI Voice Finder Simplifies Voice Selection
Together AI's new Voice Finder tool allows developers to search over 600 voices using prompts or audio samples, simplifying voice selection for AI applications.

Together AI: Deploy Any Hugging Face Model Instantly
Together AI's Dedicated Container Inference lets developers deploy any Hugging Face model instantly, bypassing complex setups and accelerating AI experimentation.

DeepSeek-V4: Million-Token Context is a Serving Problem
DeepSeek-V4's million-token context window presents an inference systems challenge, demanding sophisticated cache management and serving strategies to unlock its potential.

Together AI Supercharges LLM Inference
Together AI unveils ATLAS, accelerating LLM inference by up to 4x with adaptive speculative decoding, tackling the growing cost challenge for AI-native companies.

Together AI Halts Copy Fail Exploit
Together AI swiftly contained the Copy Fail CVE-2026-31431 vulnerability by disabling a vulnerable Linux kernel module, safeguarding its AI infrastructure.

Together AI Partners with Adaption
Together AI and Adaption partner to integrate fine-tuning into data optimization, streamlining AI model development for open-source models.

DeepSeek V4 Pro Hits Together AI
Together AI launches DeepSeek V4 Pro, a 1.6T MoE model with a 512K context window and new cached input pricing for cost-effective long-context reasoning.

Together AI Adds NVIDIA Nemotron 3
Together AI makes NVIDIA's Nemotron 3 Nano Omni, a unified multimodal AI model, available to developers, simplifying agentic application creation.

Together AI Slashes RL Training Time
Together AI's new distribution-aware speculative decoding slashes RL training time by up to 50%, tackling a major bottleneck in LLM post-training.

Shared GPUs, Zero Conflict
Together AI's multi-tenant GPU clusters offer a path to cost-effective, scalable AI compute without sacrificing team isolation.

AI Agents Collaborate to Solve Math Problems
Together AI's EinsteinArena platform enables AI agents to collaborate on complex scientific problems, achieving new breakthroughs in mathematics.

Together AI's Aurora Learns on the Fly
Together AI's Aurora framework uses RL to continuously adapt speculative decoding for faster LLM inference, outperforming static models.

Divide and Conquer LLMs Beat Giants
Smaller LLMs using a 'Divide & Conquer' strategy can outperform top models like GPT-4o on long context tasks, offering cost and speed benefits.

Mamba-3: Inference-First SSMs Arrive
Together AI's Mamba-3 advances state space models with a focus on inference speed, outperforming previous versions and some Transformers.