# LLM Optimization
4 articles with this tag

Technology
Faster LLMs by Reshaping Sparsity
Sakana AI and NVIDIA unveil a method that reshapes sparsity patterns in LLMs to boost GPU efficiency, achieving speedups of over 20%.
about 3 hours ago
AI Research
LLM Reasoning Fix: LPSR
Latent Phase-Shift Rollback (LPSR) corrects LLM reasoning errors at inference time without fine-tuning, improving both accuracy and efficiency.
17 days ago
AI Research
Prism: Symbolic Superoptimization for Tensors
Prism, a novel symbolic superoptimizer, uses sGraphs to represent families of tensor programs, delivering significant speedups and reduced optimization time for LLM workloads.
21 days ago
AI Research
Beyond Token Count: Semantic Compression for LLMs
Researchers recast LLM reasoning as lossy compression via the Conditional Information Bottleneck (CIB), using semantic surprisal to prune tokens efficiently.
about 2 months ago