Talk about AI tends to center on grand visions of general intelligence, yet a more basic challenge persists: getting AI agents to reliably navigate and interact with standard computer interfaces. At the AI Engineer World's Fair in San Francisco, Danielle Perszyk, a cognitive scientist at the new Amazon AGI SF Lab, detailed the core limitations of current agentic models and introduced Amazon's approach, Useful General Intelligence (UGI), powered by the Nova Act model and SDK. She highlighted a paradox: "it turns out that getting AI to click, type, and scroll is more challenging than getting it to generate code," which makes robust, general-purpose computer use a significant hurdle.
Perszyk argued that despite advances in language models, existing AI agents remain "incredibly brittle" when faced with the dynamic, often unstructured environment of a graphical user interface. Unlike clean APIs, UIs are messy and inconsistent, and they demand a human-like adaptability that current AI often lacks. This brittleness shows up as frequent failures on multi-step tasks, which limits how useful agents are for automating complex real-world workflows. Amazon AGI's UGI paradigm addresses this directly by prioritizing utility and reliability over abstract, ungrounded intelligence.
Rather than pursuing general intelligence in the abstract, the approach aims for agents that are useful now, with dependable interaction taking precedence over open-ended reasoning.
The solution, Nova Act, is presented as an agentic model and SDK designed to bridge this gap. It aims to mimic how people interact with computers: agents "see" the screen, interpret the visual context, and perform actions with a high degree of reliability. A critical feature is its capacity for self-correction and recovery: when something unexpected happens, the agent can re-plan or adapt, in contrast to less resilient systems that simply fail. Nova Act also incorporates long-term memory, which lets it retain context and learn from past interactions, improving performance over extended workflows.
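The self-correction idea described above can be illustrated with a small, generic sketch. This is not the Nova Act SDK or its API. Every name here (`StepResult`, `Agent`, `FakeUI`, `recover`) is hypothetical, and the "UI" is a toy stand-in in which a popup blocks the first action. The sketch only shows the control flow of acting, noticing a failure, taking a recovery action, and retrying the original step.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class StepResult:
    ok: bool
    note: str = ""  # short description of what went wrong, if anything


class FakeUI:
    """Toy stand-in for a real screen: a popup blocks actions until dismissed."""

    def __init__(self) -> None:
        self.popup_open = True

    def act(self, instruction: str) -> StepResult:
        if instruction == "dismiss popup":
            self.popup_open = False
            return StepResult(ok=True)
        if self.popup_open:
            return StepResult(ok=False, note="popup is blocking the page")
        return StepResult(ok=True)


def recover(note: str) -> Optional[str]:
    """Hypothetical re-planning rule: map an observed problem to a fix."""
    if "popup" in note:
        return "dismiss popup"
    return None  # no known recovery; the caller gives up


@dataclass
class Agent:
    act: Callable[[str], StepResult]
    recover: Callable[[str], Optional[str]]
    max_retries: int = 2
    # Simple action log for this run; not the long-term memory described above.
    history: List[Tuple[str, bool]] = field(default_factory=list)

    def run(self, steps: List[str]) -> bool:
        for step in steps:
            for _ in range(self.max_retries + 1):
                result = self.act(step)
                self.history.append((step, result.ok))
                if result.ok:
                    break
                fix = self.recover(result.note)
                if fix is None:
                    return False
                fix_result = self.act(fix)
                self.history.append((fix, fix_result.ok))
            else:
                # Retries exhausted without the step succeeding.
                return False
        return True


ui = FakeUI()
agent = Agent(act=ui.act, recover=recover)
succeeded = agent.run(["click 'New meeting'", "type the meeting title"])
```

In this toy run the first click fails because of the popup, the agent dismisses it, and the original click is retried before moving on. A brittle script with no recovery step would have stopped at the first failure.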
Amazon's decision to release Nova Act as an SDK signals a clear intent to foster a developer ecosystem around UGI. By giving developers the tools to build workflows, Amazon AGI aims to accelerate the creation of practical agentic applications that automate complex computer tasks, from setting up virtual meetings to managing project boards. Perszyk affirmed, "We want to build systems that are useful, reliable, and robust, that can perform tasks that are meaningful to people," underscoring the product's foundational purpose. This focus on developer tooling and real-world utility positions Nova Act as a pragmatic step toward more capable and trustworthy AI automation in enterprise and personal computing.

