#LLM Optimization

4 articles with this tag

Faster LLMs by Reshaping Sparsity

Sakana AI and NVIDIA unveil a method that reshapes sparsity in LLMs to improve GPU efficiency, achieving speedups of over 20%.

about 3 hours ago

AI Research

LLM Reasoning Fix: LPSR

Latent Phase-Shift Rollback (LPSR) corrects LLM reasoning errors at inference time without fine-tuning, improving both accuracy and efficiency.

17 days ago

AI Research

Prism: Symbolic Superoptimization for Tensors

Prism, a symbolic superoptimizer, uses sGraphs to represent families of tensor programs, achieving significant speedups and shorter optimization times for LLM workloads.

21 days ago

AI Research

Beyond Token Count: Semantic Compression for LLMs

Researchers recast LLM reasoning as lossy compression using the Conditional Information Bottleneck (CIB), employing semantic surprisal for efficient token pruning.

about 2 months ago