AI Research

50 articles in this category

Diffusion Research: Drug Discovery Outshines Image Generation

AI's diffusion model research is making bigger waves in drug discovery than image generation, tackling complex molecular interactions with physics-informed AI.

2 days ago
Google Boosts AI: Faster Images, Video Tools

Google launches Nano Banana 2 Lite for fast, cheap image generation and makes Gemini Omni Flash available for video editing.

2 days ago
Memora: Microsoft's AI Memory Upgrade

Microsoft's Memora AI memory system revolutionizes long-term AI interactions by balancing detailed recall with efficient retrieval, outperforming existing solutions.

3 days ago
Meta's Nishant Gupta on Deterministic AI Infrastructure

Nishant Gupta from Meta discusses the critical need for deterministic infrastructure to reliably run non-deterministic AI agents, highlighting the shift from model-centric to systems-centric development.

4 days ago
RL Agent Automates ETL Pipeline Failure Remediation

Anna Marie Benzon presents an RL agent designed to automate ETL pipeline failure detection and remediation, significantly reducing recovery time and enhancing system reliability.

4 days ago
OpenAI Weighs 2027 IPO Amid Market Volatility

OpenAI is reportedly considering a 2027 IPO, navigating market volatility and aiming to solidify its position in the rapidly evolving AI landscape.

6 days ago
Benchmarks Fail Modern AI, Says OpenAI Scientist

OpenAI's Noam Brown discusses why traditional benchmarks fail modern AI, emphasizing the need for new evaluation methods that account for computational budgets and model capabilities.

6 days ago
OpenAI's Mark Chen on AGI, Scaling Laws, and Evals

OpenAI's Chief of Research, Mark Chen, shares insights on the path to AGI, the impact of scaling laws, and the importance of robust evaluations for AI safety.

7 days ago
Raymond Weitekamp on Recursive Coding Agents

Raymond Weitekamp of OpenProse discusses recursive coding agents, exploring how AI systems can autonomously generate and refine their own code.

7 days ago
Meta's Nishant Gupta on Evaluating Agentic AI Systems

Nishant Gupta from Meta's Superintelligence Labs discusses the shift from accuracy-based evaluation to reliability-focused methods for agentic AI systems.

7 days ago
OpenData Pipeline Elevates Agentic AI

The OpenThoughts-Agent project introduces an open data pipeline that significantly enhances generalization for agentic language models, outperforming existing benchmarks.

8 days ago
AI Models Storms with Unprecedented Accuracy

AI models are achieving surprising accuracy in predicting mega storms, outperforming traditional methods and offering crucial insights into future weather patterns.

8 days ago
Engram's AI: Memory and Continual Learning

Engram's Dan Biderman and Jessy Lin discuss the critical role of memory and continual learning in AI, aiming to overcome catastrophic forgetting.

8 days ago
AI Security Post-Codex & Claude: Kolter & Fredrikson

AI security experts Zico Kolter & Matt Fredrikson discuss the challenges posed by models like Codex & Claude, and Gray Swan's approach to securing AI.

10 days ago
FlashRT: Execution State for Latency-First AI

FlashRT revolutionizes on-device AI serving with execution-state capsules, enabling sub-millisecond state restoration and significant TTFT speedups for latency-critical applications.

13 days ago
PDE Solutions Get Analytical

Agentic Symbolic Search (ASYS) automates the discovery of analytical forms for PDE solutions, bridging computation and mathematical insight.

13 days ago
Anthropic's Co-founder on AI Research at the Frontier

Anthropic's Co-founder and Top Economist discuss the frontier of AI research, covering economics, safety, and future implications.

13 days ago
Hybrid AI Models Get Orthogonal

OrthoReg, a novel regularization method, ensures clear separation between symbolic and neural components in hybrid dynamical systems, boosting interpretability and generalization.

14 days ago
Autonomous Agents Streamline Data Integration

The Data Intelligence Agents (DIA) system revolutionizes data integration by using autonomous coding agents to generate, execute, and validate concrete artifacts, achieving state-of-the-art results.

14 days ago
OneCanvas: Unified 3D Scene Representation

OneCanvas revolutionizes 3D scene understanding in VLMs by projecting multi-view features onto a unified equirectangular canvas, enabling efficient situated reasoning and SOTA performance.

14 days ago
LoopWM: A New Scaling Axis for World Models

Looped World Models (LoopWM) redefine world simulation with iterative refinement, achieving 100x parameter efficiency and establishing latent depth as a new scaling axis.

15 days ago
WEQA: Bridging LLMs and Wearable Health Data

WEQA, a novel agent framework, unifies LLM reasoning with specialized tools for wearable health data, achieving 24% higher accuracy and expert-validated clinical soundness.

15 days ago
UK trials AI for faster house planning

The UK government is partnering with Google DeepMind on an AI prototype to drastically cut house planning application times.

16 days ago
Phase Dominance in AI Image Recognition

AI image classifiers exhibit a striking phase dominance for identity encoding, mirroring human vision principles, with architectural differences shaping its expression.

16 days ago
TokenPilot: Reining in LLM Context Costs

TokenPilot offers a dual-granularity context management framework, slashing LLM inference costs by up to 87% while preserving performance.

16 days ago
ActiveSAM: Efficient Open-Vocabulary Segmentation

ActiveSAM revolutionizes open-vocabulary semantic segmentation with a training-free framework that dynamically identifies relevant classes, boosting speed and accuracy while enhancing robustness for real-world AI.

16 days ago
Nvidia's Ziv Ilan on Faster Diffusion Models

Nvidia's Ziv Ilan explains how to reduce diffusion model latency using quantization, caching, and distillation, plus the new FastGen library.

16 days ago
Compute Once: Unlocking AI Agent Efficiency

A radical proposal to precompute LLM KV caches, slashing inference costs by up to 50x and enabling a new compute-efficient AI agent paradigm.

19 days ago
HYDRA-X: Unifying Image & Video Tokenization

HYDRA-X, a novel Vision Transformer-based UMM, unifies image and video tokenization, enhancing editing consistency and performance through causal attention and latent-level manipulation.

19 days ago
Humanoids Learn Self-Other Distinction

Humanoid robots now learn self-other distinction and build predictive self-models from sensory data, enabling better collaboration and task performance in human-robot environments.

19 days ago
AI spots new LOTUSLITE variant

Microsoft's AI agent 'Ire' has identified a new LOTUSLITE malware variant missed by traditional security tools, showcasing AI's prowess in behavioral analysis.

20 days ago
Unlocking Ultra-Long Context for LLMs

MiniMax Sparse Attention breaks the context window barrier for LLMs, enabling millions of tokens with significant compute reduction and practical speedups.

20 days ago
Mana Reimagines Dexterous Robotics

The Mana framework reinterprets dexterous robotics as animation, achieving zero-shot sim-to-real transfer for articulated tool manipulation.

20 days ago
From LLM Agents to Scientific Knowledge Graphs

Agents-K1 revolutionizes LLM research agents by creating agent-native scientific knowledge graphs from full papers, enabling deeper scientific reasoning.

20 days ago
5 AI Research Papers Shaping AI's Future

Discover five key AI research papers that reveal the current trajectory and future directions of artificial intelligence development.

20 days ago
Rethinking VLM Token Reduction

Reroute transforms VLM token reduction from irreversible pruning to recoverable routing, improving grounding performance without sacrificing efficiency.

21 days ago
Automating Scientific Discovery

ATLAS, an active learning framework, automates the discovery of interpretable mechanistic models, achieving 5-10x sample efficiency gains.

21 days ago
VLA Models Unlock Decentralized Multi-Robot Teams

CHORUS leverages pretrained VLA models for decentralized multi-robot collaboration, achieving significant performance gains without inference-time communication.

21 days ago
Codex Aids Black Hole Simulation Breakthrough

This video explores how the AI model Codex is revolutionizing the creation of black hole simulations, making previously intractable problems computationally feasible and accelerating astrophysical research.

21 days ago
DeepMind's Kilpatrick on AI Models Eating Harnesses

Google DeepMind's Logan Kilpatrick delves into the AI concept of models "eating the harness," explaining how over-specialization hinders generalization and what can be done to prevent it.

21 days ago
Causal Inference's Counterfactual Blind Spot

Predictive AI models fail on counterfactual couplings. A new world model using semidefinite kernels offers a solution for robust causal inference.

22 days ago
Steering LRMs Beyond Output Degradation

A new probe-based method, FPCG, distinguishes prediction features from detection features to enable precise steering of large reasoning models (LRMs) with minimal degradation in output quality.

22 days ago
LLMs Accelerate FPGA Design

LLMs are now automating complex FPGA accelerator design, reducing the time and expertise needed for efficient AI hardware deployment.

22 days ago
DiffusionGemma: Google's AI is 4x Faster

Google DeepMind's DiffusionGemma model offers up to 4x faster text generation, enabling new real-time AI applications.

22 days ago
Google DeepMind Discusses Open Models & AI Ownership

Google DeepMind's Gus Martins and Ian Ballantyne discuss the benefits of open AI models like Gemma for ownership, control, and custom applications.

22 days ago
Topology-Aware Operator Learning

Topological Neural Operators (TNOs) provide a unified framework for operator learning on cell complexes, improving PDE benchmark accuracy by integrating topological structures.

23 days ago
Personalized AI Agents Now Have a Benchmark

A new iOSWorld benchmark reveals AI agents' struggles with personalized, multi-app tasks, highlighting the need for richer context and advanced reasoning capabilities.

23 days ago
Images as the New Reasoning Medium

This paper introduces optical reasoning, enabling images to serve as the primary medium for LLM and MLLM reasoning, achieving higher token efficiency and competitive performance.

23 days ago
Gemini's Audio Stack: From Transcription to Music Generation

Google DeepMind's Thor Schaeff explores Gemini's audio stack, from advanced transcription to music generation with Lyria 3.

23 days ago
Google Rolls Out Gemini 3.5 Live Translate

Google's new Gemini 3.5 Live Translate offers real-time speech-to-speech translation across 70+ languages, enhancing Google Translate and Meet.

23 days ago