# AI Research
50 articles with this tag
Gradient Flow Drifting: A New Generative Model Class
New Gradient Flow Drifting generative models unify existing approaches and offer a principled solution to mode collapse and blurring via mixed divergences.
SCORE: Recurrent Depth for Deep Networks
SCORE introduces a recurrent, iterative approach to deep neural networks, accelerating training and reducing parameter counts without complex ODE solvers.
Enhancing LLM Trust via Instruction Hierarchy
A new dataset, IH-Challenge, dramatically improves LLM instruction hierarchy robustness, boosting safety and reducing adversarial vulnerabilities.
Automated Comedy Video Generation
A fully automated AI system generates comedic sketch videos, using LLM critics trained on viewer preferences to achieve near-professional quality.
V2M-Zero: Temporal Music Sync Without Paired Data
V2M-Zero revolutionizes video-to-music generation by using event curves to achieve temporal synchronization without paired data, yielding significant performance gains.
Mamba 2 JAX: Hardware Agnostic SSMs
Mamba 2 JAX breaks hardware dependency for state-space models, achieving high performance on CPU, GPU, and TPU via XLA compilation without custom kernels.
Bayesian Uncertainty for Large Models
VMoER enables calibrated uncertainty in large-scale MoE foundation models with minimal computational overhead, improving stability and OOD detection.
Bayesian Uncertainty for Foundation Models
Variational Mixture-of-Experts Routing (VMoER) offers a scalable Bayesian approach to uncertainty quantification in foundation models, achieving significant improvements with minimal computational overhead.
Logos: Bridging Molecular Logic and Chemical Validity
Logos, a new molecular reasoning AI, integrates logical reasoning with chemical validity, outperforming larger models with fewer parameters and offering interpretable outputs.
OpenAI Gives Models Computer Brains
OpenAI's Responses API now integrates a computer environment, empowering AI agents with tools, file systems, and secure network access for complex workflows.
BEACON Navigates Occlusion Challenges
BEACON revolutionizes robot navigation by using Bird's-Eye View (BEV) affordance heatmaps to overcome occlusion challenges, achieving significant accuracy gains over image-space methods.
Reasoning Nudges LLMs Towards Honesty
New research reveals that LLM reasoning enhances honesty not through content, but by leveraging the geometry of representational spaces, stabilizing honest defaults.
LLMs Fail Esoteric Code Tasks
Frontier LLMs show a dramatic capability gap on a new benchmark using esoteric programming languages, revealing a reliance on memorization over reasoning.
Max Hodak on AI and Brain-Computer Interfaces
Max Hodak, CEO of Science Inc., discusses the future of AI and brain-computer interfaces, highlighting the potential for bio-integrated intelligence and its impact on healthcare and human augmentation.
CoCo: Code Drives Precise Image Generation
CoCo leverages executable code for precise, structured text-to-image generation, outperforming existing methods on complex benchmarks.
Code-Driven Reasoning for Precise Image Generation
CoCo (Code-as-CoT) introduces executable code as a reasoning framework for text-to-image generation, achieving superior precision and control.
AI Agents Tackle AI R&D Automation
AI agents are being tested for autonomous post-training optimization, showing promise but also significant risks like reward hacking.
Beyond Token Count: Semantic Compression for LLMs
Researchers recast LLM reasoning as lossy compression using the Conditional Information Bottleneck (CIB), employing semantic surprisal for efficient token pruning.
Scientists Recreate Fruit Fly Brain, Play Doom
Scientists have created a fully simulated fruit fly brain that controls a virtual body, marking a significant advancement in neuroscience and AI.
AI Agents Now Do Overnight Research
An automated system uses AI agents to conduct overnight LLM training experiments, modifying code and iterating on models autonomously.
AI Learns Beyond Text
AI is moving beyond text, with multimodal pretraining enabling models to learn from images, audio, and video for richer comprehension.
Microsoft's Compact AI Learns to Reason
Microsoft's new Phi-4-reasoning-vision-15B model offers strong multimodal reasoning capabilities in a compact, efficient package.
AI Agents: Memory, Ownership, and the Future
AI experts Chris Hay and Aaron Baughman discuss the evolution of AI agents, focusing on memory, open vs. closed systems, and the future of agent-based AI.
Transformer Artifacts Unpacked
Research demystifies massive activations and attention sinks in Transformers, revealing them as architectural artifacts enabled by pre-norm configurations.
Standardizing Survival HTE Evaluation
Introducing SurvHTE-Bench, the first comprehensive benchmark for evaluating heterogeneous treatment effects in survival data, promoting reproducible and rigorous research.
RealWonder: Physics Bridges Video Generation
RealWonder leverages physics simulation to bridge the gap in action-conditioned video generation, enabling real-time simulation of physical interactions.
ZipMap: Linear-Time 3D Reconstruction
ZipMap revolutionizes 3D vision with linear-time, stateful reconstruction, achieving 20x speedup over prior methods while maintaining high accuracy.
ZipMap: Linear-Time 3D Vision
ZipMap revolutionizes 3D vision with linear-time reconstruction, achieving 20x speedup and enabling real-time state querying.
Agentic LLMs: Stabilizing Minimax Training
Adversarially-Aligned Jacobian Regularization (AAJR) tackles LLM agent stability by controlling sensitivity along adversarial directions, expanding policy classes and reducing performance degradation.
Crab+ Unifies AV-LLMs, Reverses Negative Transfer
Crab+ introduces a novel approach to Audio-Visual Large Language Models, overcoming negative transfer via explicit cooperation in data and model design.
AI Reasoning Flaws Are a Safety Feature
AI models' inability to control their "chains of thought" when monitored is a positive for AI safety, preventing them from easily deceiving oversight systems.
Dynamic Orchestration for Scientific AI
A novel two-tier multi-model orchestration framework dynamically adapts agent roles and prompts for robust scientific reasoning, outperforming static systems.
RLAIF: Unpacking the Latent Value Hypothesis
The latent value hypothesis explains RLAIF by positing that pretraining encodes human values as representation directions, activated by constitutional prompts.
RLAIF Explained: Latent Values in LLMs
RLAIF explained: Human values are latent directions in LLM representations, activated by constitutional prompts, with alignment ceiling tied to model capacity and data quality.
Bridging DSP and DL for Speech Enhancement
TVF integrates DSP interpretability with deep learning's adaptability for low-latency, real-time speech enhancement, offering explicit control over spectral modifications.
OpenAI AI Aids Quantum Gravity Breakthrough
OpenAI's GPT-5.2 Pro AI model has assisted physicists in a breakthrough discovery regarding quantum gravity, challenging existing theories on graviton interactions.
DynFormer: Smarter AI for Complex Physics
DynFormer, a new dynamics-informed neural operator, significantly reduces error and memory usage in complex PDE simulations by using scale-aware Transformers.
CHIMERA Dataset Boosts LLM Reasoning
Researchers introduce CHIMERA, a synthetic dataset enabling LLMs to achieve strong cross-domain reasoning capabilities with efficient training.
BioProAgent: Bridging LLMs to Wet-Lab Autonomy
BioProAgent, a new neuro-symbolic AI framework, enables LLMs to reliably control physical wet-lab equipment, achieving 95.6% compliance.
OpenAI's GPT-5.3 Instant Promises Smoother AI Chat
OpenAI's GPT-5.3 Instant aims for more natural and efficient AI conversations, enhancing web searches and reducing conversational dead ends.
OpenAI's GPT-4.5 Enhances Web Search Integration
OpenAI researcher Josh discusses how GPT-4.5's web search integration is becoming more natural, conversational, and context-aware.
New Models Tackle Reasoning Puzzles with Symmetry
New Symbol-Equivariant Recurrent Reasoning Models (SE-RRMs) offer improved performance and generalization on reasoning tasks like Sudoku and ARC-AGI by explicitly encoding symmetry.
Recursive LLMs Tackle Long-Horizon Reasoning
New research introduces recursive language models to overcome context limitations, showing significant improvements on long-horizon reasoning tasks like Boolean satisfiability.
Decoupling Correctness and Checkability in LLMs
Researchers propose a 'translator' model to overcome the 'legibility tax' in LLMs, decoupling accuracy from output checkability for more trustworthy AI.
Certified Circuits for Stable AI Explanations
New 'Certified Circuits' framework provides provable stability for AI model explanations, yielding more accurate and compact circuits.
Multimodal LLMs: What's Lost in Translation?
New research reveals multimodal LLMs struggle to utilize non-textual data due to a 'mismatched decoder problem,' impacting their true understanding.
Predicting Transformer Training Instability
Researchers introduce RKSP, a method to predict transformer training divergence from a single forward pass, and KSS, a technique to actively prevent it, saving compute and enabling higher learning rates.
Less Data, More Alignment: SOTAlign
Researchers introduce SOTAlign, a framework that achieves robust cross-modal alignment using significantly less paired data by leveraging unpaired samples.
NAP: Unlocking Parallel Generation in Diffusion Language Models
Researchers propose NAP, a data-centric approach to enable true parallel generation in Diffusion Language Models by aligning training data with non-autoregressive decoding.
SeeThrough3D: Mastering Occlusion in 3D Scenes
SeeThrough3D introduces an occlusion-aware 3D scene representation, enabling precise control over inter-object occlusions in AI-generated scenes.