#LLM Inference

8 articles with this tag

Together AI Supercharges LLM Inference

Together AI unveils ATLAS, accelerating LLM inference up to 4x with adaptive speculative decoding, tackling the growing cost challenge for AI-native companies.

about 6 hours ago

Technology

Together AI's Aurora Learns on the Fly

Together AI's Aurora framework uses RL to continuously adapt speculative decoding for faster LLM inference, outperforming static models.

about 1 month ago

Artificial Intelligence

Mamba-3: Inference-First SSMs Arrive

Together AI's Mamba-3 advances state space models with a focus on inference speed, outperforming previous versions and some Transformers.

about 1 month ago

Artificial Intelligence

NVIDIA Nemotron 3 Nano launches on FriendliAI

The race to serve the next generation of efficient, open AI agents is heating up, and FriendliAI is aggressively positioning itself as the crucial infrastruc...

5 months ago

Technology
