Cormac Brick, a Principal Engineer at Google AI Edge, recently gave a presentation on fine-tuning Tiny LLMs (TLMs) for on-device agents. The talk, titled "From 46% to 90%: Fine-Tuning Tiny LLMs for On-Device Agents," described how these smaller, more efficient language models are being integrated into devices to deliver powerful, responsive AI experiences. Brick stressed the growing importance of running AI directly on devices, citing lower latency, stronger privacy, offline usability, and cost savings compared with cloud-based solutions.
The Google AI Edge Stack and TLMs
Brick outlined the Google AI Edge stack, whose components include MediaPipe and LiteRT-LM, the latter a cross-platform runtime designed to run powerful LLMs on Android, Chrome, and iOS. He explained that TLMs, which he defined as models with fewer than a billion parameters, are small enough to be integrated directly into applications, giving developers greater customization and reach. He also noted that these models are applicable beyond Android, extending to iOS, macOS, Windows, web, and IoT devices.
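To give a rough sense of why a sub-billion-parameter model can fit inside an application, the sketch below estimates weight storage as parameters times bits per weight. The parameter counts and bit widths are illustrative assumptions, not figures from the talk, and the estimate ignores runtime overhead such as activations and caches.

```python
def weight_size_mb(num_params: int, bits_per_weight: int) -> float:
    """Approximate size of the model weights in megabytes (10^6 bytes).

    Assumes every weight is stored at the same bit width and ignores
    file-format metadata and any runtime memory beyond the weights.
    """
    return num_params * bits_per_weight / 8 / 1e6


if __name__ == "__main__":
    # Hypothetical model sizes chosen for illustration only.
    for params in (270_000_000, 1_000_000_000):
        for bits in (16, 8, 4):
            size = weight_size_mb(params, bits)
            print(f"{params / 1e6:.0f}M params at {bits}-bit: about {size:.0f} MB")
```

Under these assumptions, lowering the bit width through quantization shrinks the download proportionally, which is one reason small models are practical to ship with an app.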
