#LLM Inference

9 articles with this tag

Together AI Masters MiniMax M3 Inference
Technology

Together AI details engineering feats enabling efficient MiniMax M3 inference, unlocking 1M-token context and multimodality.

16 days ago
Together AI Supercharges LLM Inference
Technology

Together AI unveils ATLAS, accelerating LLM inference up to 4x with adaptive speculative decoding, tackling the growing cost challenge for AI-native companies.

about 2 months ago
Together AI's Aurora Learns on the Fly
Technology

Together AI's Aurora framework uses RL to continuously adapt speculative decoding for faster LLM inference, outperforming static models.

3 months ago
Mamba-3: Inference-First SSMs Arrive
Artificial Intelligence

Together AI's Mamba-3 advances state space models with a focus on inference speed, outperforming previous versions and some Transformers.

3 months ago
Technology

NVIDIA Nemotron 3 Nano launches on FriendliAI

The race to serve the next generation of efficient, open AI agents is heating up, and FriendliAI is aggressively positioning itself as the crucial infrastruc...

6 months ago
Startup News

Clarifai Hits Fastest GPT-OSS-120B Inference and Narrows the GPU, ASIC Gap

Clarifai’s latest benchmark on OpenAI’s GPT-OSS-120B model points to a quiet but important shift in AI infrastructure.

7 months ago
Pliops Unveils Breakthrough AI Performance Enhancements
Press Release

about 1 year ago