"LLMs and AI now enable Gen 3 of language learning, which is something that is very AI native, very focused on functional fluency." This declaration by Andrew Ng, CEO of Speak, succinctly encapsulates the core thesis behind his company's distinct approach to mastering new languages. Speaking recently with Latent Space, Ng outlined how artificial intelligence is not merely an augmentation but the foundational technology for a new paradigm in language acquisition, moving beyond the limitations of previous generations.
Speak’s journey, detailed in the interview and its strategic pivots, involved ditching free users and adopting a premium-only model, coupled with a focused expansion into the South Korean market. This decisive shift allowed the company to channel resources into developing a truly AI-native product, differentiating itself from the pervasive, gamified mobile applications that define the second generation of language learning.
Ng categorizes language learning evolution into distinct eras. The first generation, epitomized by Rosetta Stone, relied on CD-ROMs and structured, often repetitive, lessons. The subsequent "Gen 2" ushered in the mobile era, dominated by applications like Duolingo and Babel. These platforms, while massively popular and highly engaging, often lean heavily into gamification, functioning "closer to a mobile game, something that feels productive, something that's very engaging, very gamified." While acknowledging Duolingo's mastery of this approach, Ng posits that this model, despite its broad appeal, often falls short of delivering true functional fluency.
Speak's "Gen 3" is a direct response to this perceived gap. Their methodology eschews traditional vocabulary and grammar drills in favor of an immersive, practice-intensive model. "We don't teach vocabulary and grammar. We teach sentence patterns and we try to get you to just repeat and drill and drill and drill, almost like you're in a gym, until it's automatic, because that's what speaking is, right? Like it has to be spontaneous and automatic," Ng explained. This emphasis on spontaneous, automatic speech through AI-powered role-plays, such as practicing Spanish with an "Uber driver" scenario, represents a significant departure.
This commitment to deep, conversational fluency rather than superficial engagement likely underpins Speak’s premium-only business model. Developing and maintaining AI models capable of nuanced, interactive role-playing demands substantial investment, a cost structure better matched to subscription revenue than to an ad-supported freemium approach. The strategic focus on South Korea further suggests a preference for deep penetration and product refinement within a single cultural and linguistic context, which allows AI experiences to be tailored closely to local users.
Such a concentrated effort allows Speak to optimize its AI for specific linguistic challenges and user behaviors, fostering a more effective learning environment. It is a bold bet on quality and depth over widespread, casual adoption.
Ultimately, Speak’s strategy represents a calculated move to capture value at the high end of the language learning market by leveraging advanced AI for truly functional outcomes. By defining and delivering "Gen 3" language learning, the company aims to move users from passive engagement to active, spontaneous linguistic competence, demonstrating a clear vision for AI's transformative potential in education.

