In a recent presentation, Rachel Nabors, known for her work on AI and UI standards, discussed the practical advantages of small, local AI models over large frontier models. Nabors, who has contributed to standards work at Mozilla and the W3C and worked with the React team, argued that companies can cut costs and improve performance by choosing smaller, more specialized models, particularly ones that run on device.
The Cost of One-Size-Fits-All Inference
Nabors began with the costs of using large frontier models, especially for tasks that don't need their full capabilities. Every API call to a large language model (LLM), she emphasized, incurs costs for both the user and the business. Relying on cloud-hosted models also carries risks of data exposure and outages, which she illustrated with a hypothetical scenario in which a model cannot connect to the web.
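To make the per-call cost point concrete, here is a minimal back-of-the-envelope sketch. It does not come from the talk: the function name, the traffic figures, and the price per million tokens are all hypothetical placeholders, and real provider pricing differs (for example, input and output tokens are often billed at different rates).

```python
def monthly_api_cost(
    calls_per_day: int,
    tokens_per_call: int,
    usd_per_million_tokens: float,
    days: int = 30,
) -> float:
    """Estimate monthly spend on a hosted LLM API.

    All inputs are illustrative assumptions, not real pricing.
    """
    total_tokens = calls_per_day * tokens_per_call * days
    return total_tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical workload: 100,000 calls per day, 500 tokens per call,
# billed at a made-up 2 USD per million tokens.
estimate = monthly_api_cost(100_000, 500, 2.0)
print(f"Estimated monthly cost: ${estimate:,.2f}")
```

With these placeholder numbers the estimate is 3,000 USD per month. The figure scales linearly with call volume, which is why routing simple tasks away from a metered API can matter at scale.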
She then made the case for small language models (SLMs) and task-specific models. With millions to a few billion parameters, compared with the hundreds of billions or trillions in frontier LLMs, these models need far less compute, energy, and memory. That efficiency makes them well suited to on-device deployment, where local processing brings lower latency and better privacy.
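The memory gap between these model sizes can be approximated with simple arithmetic: weight storage is roughly the parameter count multiplied by the bytes used per parameter. The sketch below is an illustration under that assumption, not something shown in the talk. The function name and the example model sizes are hypothetical, and the estimate covers weights only, ignoring activations, caches, and runtime overhead.

```python
# Approximate storage per parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Rough size of a model's weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# Hypothetical 3-billion-parameter small model, quantized to 4 bits.
small = weight_memory_gb(3e9, "int4")
# Hypothetical 500-billion-parameter large model at 16-bit precision.
large = weight_memory_gb(500e9, "fp16")

print(f"small model: ~{small:.1f} GB, large model: ~{large:.0f} GB")
```

Under these assumptions the small model's weights take about 1.5 GB, which is within reach of a phone or laptop, while the large model's take about 1,000 GB and require server-class hardware.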
