In a conversation with Latent Space, Barak Lenz, CTO of AI21, introduced the company's latest model: Jamba 3B. This small hybrid of transformer and state space layers aims to bring long-context AI capabilities to edge devices. "Images quickly become 'long context' problems," Lenz stated, explaining that even a few images can require thousands of tokens, which is why he sees hybrid architectures as vital for on-device AI.
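To make the "thousands of tokens" point concrete, here is a rough back-of-the-envelope sketch. It assumes a patch-based image encoder in which each 14x14-pixel patch becomes one token, and 448x448-pixel inputs. Both numbers are illustrative assumptions, not figures from the interview, and the helper names are hypothetical; this does not describe how Jamba processes images.

```python
import math


def image_tokens(width, height, patch=14):
    """Token count for one image, assuming one token per patch x patch tile.

    The patch size is an assumed value for illustration only.
    """
    return math.ceil(width / patch) * math.ceil(height / patch)


def total_tokens(images, patch=14):
    """Sum the token counts over a list of (width, height) image sizes."""
    return sum(image_tokens(w, h, patch) for w, h in images)


# Four hypothetical 448x448 images.
images = [(448, 448)] * 4
print(image_tokens(448, 448))  # 32 * 32 = 1024 tokens per image
print(total_tokens(images))    # 4 * 1024 = 4096 tokens in total
```

Under these assumptions, four modest images already occupy about four thousand tokens of context before any text is added, which is the kind of growth Lenz is pointing to.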
Speaking with Swyx, Lenz traced how AI21 became a pioneer of hybrid architectures that combine attention mechanisms with Mamba state space models. According to Lenz, this approach improves efficiency on long inputs without sacrificing performance. He also described the company's broader motivation: "Deep learning is super cool and super useful, but it's not enough. We wanted to bridge classical AI with new AI."
