Remember that thing you saw in that video, that one time? Good luck finding it. While large language models (LLMs) have made searching and analyzing text almost magically simple, video remains a dense, unstructured mess, largely stuck on a timeline. Most tools only skim the surface, indexing audio or thumbnails, missing the crucial actions, objects, and context that truly define a moment. Without a sophisticated visual memory layer, AI struggles to pinpoint exact moments or answer broader questions within a sea of frames.
That's about to change. Memories.ai today unveiled its Large Visual Memory Model 2.0 (LVMM), a significant step towards giving AI systems true visual memory and, crucially, bringing it on-device for the first time. The company also announced a strategic collaboration with Qualcomm Technologies, Inc., which will see LVMM 2.0 running natively on Qualcomm processors starting in 2026. This partnership promises to transform how consumers and businesses interact with their visual data, making raw video searchable and structured, securely and rapidly, right on their phones, cameras, and wearables.
The Future of Visual Search Lives On-Device
The implications of this shift are profound. Currently, video analysis often relies on cloud processing, introducing latency, incurring costs, and raising privacy concerns as data leaves the device. Memories.ai’s LVMM 2.0 tackles this head-on by encoding frames, compressing them, and building an index that supports sub-second search, all locally. Users will be able to ask complex questions in plain language or use an image cue to jump to the exact moment they’re looking for.
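Memories.ai has not published LVMM 2.0's internals, but the pipeline it describes, encode frames, compress the result, build a local index, then answer text or image queries, maps onto a familiar embedding-search pattern. The sketch below illustrates that pattern only; the encoder stand-in, the int8 compression, and names like encode_frame and LocalVideoIndex are assumptions for illustration, not the product's API.

```python
# Minimal sketch of an embedding-based, fully local video index (hypothetical;
# not Memories.ai's actual pipeline). Frames are encoded into vectors,
# compressed by int8 quantization, and searched with cosine similarity.
import numpy as np

EMBED_DIM = 256  # assumed embedding size


def encode_frame(frame: np.ndarray, seed: int = 0) -> np.ndarray:
    """Stand-in encoder: a fixed random projection of the flattened frame.
    A real system would use a learned visual encoder here."""
    rng = np.random.default_rng(seed)  # fixed seed -> same projection every call
    proj = rng.standard_normal((frame.size, EMBED_DIM)).astype(np.float32)
    vec = frame.astype(np.float32).ravel() @ proj
    return vec / (np.linalg.norm(vec) + 1e-8)


class LocalVideoIndex:
    """Keeps quantized frame embeddings plus timestamps, entirely in memory."""

    def __init__(self):
        self.codes, self.scales, self.timestamps = [], [], []

    def add(self, frame: np.ndarray, timestamp: float) -> None:
        vec = encode_frame(frame)
        scale = float(np.abs(vec).max()) or 1.0
        # int8 codes take a quarter of the space of float32 embeddings
        self.codes.append(np.round(vec / scale * 127).astype(np.int8))
        self.scales.append(scale)
        self.timestamps.append(timestamp)

    def search(self, query_vec: np.ndarray, k: int = 3):
        """Return the k best-matching (timestamp, score) pairs."""
        mat = np.stack(self.codes).astype(np.float32)
        mat *= np.array(self.scales, dtype=np.float32)[:, None] / 127  # dequantize
        scores = mat @ query_vec
        top = np.argsort(scores)[::-1][:k]
        return [(self.timestamps[i], float(scores[i])) for i in top]


if __name__ == "__main__":
    index = LocalVideoIndex()
    rng = np.random.default_rng(42)
    for t in range(10):  # pretend these are decoded video frames
        index.add(rng.integers(0, 256, (32, 32, 3)), timestamp=float(t))
    query = encode_frame(rng.integers(0, 256, (32, 32, 3)))  # image-cue query
    print(index.search(query))
```

In a real deployment the index would live on flash rather than in RAM and the query could come from a text encoder as well as an image cue, but the core property is the same: nothing leaves the device, and lookup is a local vector search rather than a cloud round trip.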
Qualcomm's role is pivotal here. As a leader in edge computing and on-device AI, their processors provide the necessary horsepower to run such a sophisticated model locally. Vinesh Sukumar, VP of Product Management and Head of Gen AI/ML at Qualcomm Technologies, emphasized this synergy, stating, "By combining Qualcomm’s expertise in edge computing, connectivity, and on-device AI with Memories.ai’s innovative Large Visual Memory Model (LVMM), we are transforming how machines perceive, learn, and remember." This collaboration isn't just about convenience; it's about enabling AI platforms that are context-aware, retain visual information over long periods, and perform reliably at the edge.
The benefits of this on-device approach are manifold: lower latency for instant results, significantly reduced cloud costs, and enhanced data security by keeping sensitive visual information local.
LVMM 2.0 also fuses video, audio, and images so that search results carry rich context, while a unified memory format promises consistent experiences across diverse devices, as the sketch below suggests.
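Memories.ai has not disclosed what its unified memory format actually contains. Purely as an illustration, a fused record for one moment might bundle the visual embedding, recognized speech, detected tags, and timeline position into a single schema; the MemoryRecord class and its fields below are hypothetical.

```python
# Hypothetical shape of a "unified memory" record that fuses modalities.
# Shown only to illustrate how one schema could keep results consistent
# across phones, cameras, and wearables; not LVMM 2.0's real format.
from dataclasses import dataclass, field


@dataclass
class MemoryRecord:
    source_id: str                  # which video, camera, or album the moment came from
    start_s: float                  # moment boundaries on the timeline, in seconds
    end_s: float
    visual_embedding: list[float]   # frame-level visual features
    audio_transcript: str = ""      # speech recognized in the same window
    tags: list[str] = field(default_factory=list)  # detected objects and actions
    device: str = "phone"           # phone, camera, wearable, ...


record = MemoryRecord(
    source_id="vacation_2025.mp4",
    start_s=42.0,
    end_s=47.5,
    visual_embedding=[0.12, -0.05, 0.33],
    audio_transcript="look at that sunset",
    tags=["beach", "sunset", "waving"],
)
print(record.tags)
```

The appeal of a single record type like this is that a query answered on a phone, a camera, or a pair of smart glasses returns the same kind of object, which is what makes a consistent cross-device experience plausible.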
The real-world applications span a wide range, from personal to enterprise. Imagine "AI Albums" on your phone that automatically organize and surface personal video memories with unprecedented detail. Smart glasses and other wearables could gain significantly enhanced AI recall, allowing them to understand and react to their environment with a persistent memory. In security, cameras could move beyond simple motion detection to real-time understanding and response. Robotics stands to benefit too, as richer context about their surroundings helps machines act in the real world.
Memories.ai, founded in 2024 by former Meta researchers and backed by a strong roster of investors, is positioning itself at the forefront of this visual AI revolution. As Shawn Shen, Co-founder and CEO of Memories.ai, noted, "We’re thrilled to partner with [Qualcomm] to bring the incredible power of LVMM to hundreds of millions of phones, computers, and wearables in the coming years."



