#Transformer Models
7 articles with this tag

Technology
Faster LLMs by Reshaping Sparsity
Sakana AI and NVIDIA unveil a new method that reshapes sparsity in LLMs to boost GPU efficiency, achieving over 20% speedups.
18 days ago

AI Research
AI Brains vs. Human Minds
Exploring the fundamental differences between transformer AI models and the human brain's continuous learning and sensory grounding.
2 months ago

Artificial Intelligence
AI's Consciousness Debate
Vishal Misra and Martin Casado discuss LLM functionality, the path to AGI, and the role of data in AI development.
2 months ago

AI Research
Predicting Transformer Training Instability
Researchers introduce RKSP, a method to predict transformer training divergence from a single forward pass, and KSS, a technique to actively prevent it, saving compute and enabling higher learning rates.
3 months ago

AI Research
TabICLv2: Spreadsheets Meet AI's Future
TabICLv2 emerges as a breakthrough tabular foundation model, challenging traditional methods with zero-shot, in-context learning on massive datasets.
3 months ago

AI Research
AlphaProof system proves its worth at the Math Olympiad
6 months ago

AI Research
Nested Learning AI Tackles Catastrophic Forgetting
7 months ago