#Machine Learning

50 articles with this tag

Technology

AI Governance: Control, Not Code, Drives Success

Enterprise AI success hinges on robust governance, focusing on control and trust rather than just code, as Databricks leaders explain.

about 17 hours ago
AI Research

Microsoft Debugs AI Agents with AgentRx

Microsoft Research launches AgentRx, an open-source framework and benchmark for systematically debugging AI agent failures, improving accuracy by over 23%.

about 21 hours ago
Technology

Databricks Serverless Simplifies Data Ops

Databricks serverless compute automates infrastructure management, boosting performance and cutting costs for data engineering workflows.

about 23 hours ago
AI Research

V2M-Zero: Temporal Music Sync Without Paired Data

V2M-Zero uses event curves to temporally synchronize generated music with video without paired data, reporting significant performance gains.

1 day ago
AI Research

Bayesian Uncertainty for Foundation Models

Variational Mixture-of-Experts Routing (VMoER) offers a scalable Bayesian approach to uncertainty quantification in foundation models, achieving significant improvements with minimal computational overhead.

2 days ago
AI Research

Logos: Bridging Molecular Logic and Chemical Validity

Logos, a new molecular reasoning AI, integrates logical reasoning with chemical validity, outperforming larger models with fewer parameters and offering interpretable outputs.

2 days ago
Artificial Intelligence

Databricks CEO on AI Agents and Market Trends

Databricks CEO Ali Ghodsi discusses the launch of 'Genie Code,' an AI agent for non-technical users, and the acquisition of Quotient AI to enhance AI monitoring.

2 days ago
Artificial Intelligence

OpenAI Gives Models Computer Brains

OpenAI's Responses API now integrates a computer environment, empowering AI agents with tools, file systems, and secure network access for complex workflows.

2 days ago
Artificial Intelligence

Wayfair Taps OpenAI for Catalog and Support Overhaul

Wayfair integrates OpenAI's AI models into its core operations, boosting product catalog accuracy and supplier support efficiency.

2 days ago
Artificial Intelligence

AI for Climate: Priya Dhawale on Data & Solutions

MIT's Priya Dhawale discusses AI's role in climate solutions, the energy cost of AI, and the need for democratization in the field.

2 days ago
Technology

Databricks Buys Quotient AI

Databricks acquires Quotient AI to enhance AI agent reliability and performance in production environments, integrating its evaluation technology into key products.

2 days ago
Technology

Databricks' Genie Code: AI for Data Work

Databricks launches Genie Code, an AI agent designed to automate and optimize complex data workflows, promising to double success rates over traditional coding agents.

2 days ago
Technology

Databricks Unleashes Genie Code AI

Databricks launches Genie Code, an AI agent designed to automate data tasks and significantly improve success rates in data science.

2 days ago
AI Research

Reasoning Nudges LLMs Towards Honesty

New research reveals that LLM reasoning enhances honesty not through content, but by leveraging the geometry of representational spaces, stabilizing honest defaults.

2 days ago
Technology

GitHub Copilot SDK: Execution is the New Interface

GitHub's new SDK allows developers to embed AI execution and agentic workflows directly into their applications, moving beyond simple text generation.

3 days ago
AI Research

Beyond Token Count: Semantic Compression for LLMs

Researchers recast LLM reasoning as lossy compression using the Conditional Information Bottleneck (CIB), employing semantic surprisal for efficient token pruning.

3 days ago
Artificial Intelligence

OpenAI Tames AI Chaos with Instruction Hierarchy

OpenAI's new IH-Challenge dataset trains AI models to prioritize instructions, enhancing safety and mitigating risks like prompt injection.

3 days ago
Technology

Snowflake Targets Manufacturing with AI

Snowflake is integrating AI into its data cloud to offer manufacturers actionable insights for optimizing operations and improving quality control.

3 days ago
AI Research

AI Memory Gets a Brain Upgrade

Microsoft Research's PlugMem system transforms AI interaction logs into structured knowledge, boosting agent efficiency and performance.

3 days ago
Artificial Intelligence

AI Agents Need Humans: The HITL Advantage

IBM AI Engineer Anna Gutowska explains why human intervention in AI agents is critical for preventing subtle errors and ensuring safe, effective deployment.

3 days ago
Funding Round

LeCun Starts $1B AI Firm

Yann LeCun launches Advanced Machine Intelligence (AMI Labs) with $1.03B seed funding to build AI systems grounded in 'world models'.

3 days ago
Artificial Intelligence

AI Agents Now Do Overnight Research

An automated system uses AI agents to conduct overnight LLM training experiments, modifying code and iterating on models autonomously.

6 days ago
Technology

Databricks Automates PII Discovery with LLMs

Databricks deploys LogSentinel, an LLM-powered system, to automate PII detection and data governance across its platform, slashing review times and enhancing security.

7 days ago
AI Research

Microsoft's Compact AI Learns to Reason

Microsoft's new Phi-4-reasoning-vision-15B model offers strong multimodal reasoning capabilities in a compact, efficient package.

7 days ago
Artificial Intelligence

Balyasny's AI Engine

Balyasny Asset Management built a powerful AI research engine using OpenAI models, slashing analysis times and boosting investment team confidence.

7 days ago
Artificial Intelligence

AI Agents: Memory, Ownership, and the Future

AI experts Chris Hay and Aaron Baughman discuss the evolution of AI agents, focusing on memory, open vs. closed systems, and the future of agent-based AI.

7 days ago
AI Research

Standardizing Survival HTE Evaluation

Introducing SurvHTE-Bench, the first comprehensive benchmark for evaluating heterogeneous treatment effects in survival data, promoting reproducible and rigorous research.

7 days ago
Technology

Copilot Code Review Hits 60 Million

GitHub's AI code review tool has processed over 60 million reviews, evolving to provide high-signal feedback that accelerates development.

8 days ago
Artificial Intelligence

OpenAI Unveils GPT-5.4 for Pro Work

OpenAI releases GPT-5.4, its most advanced model for professional tasks, integrating enhanced reasoning, coding, and computer-use capabilities.

8 days ago
Artificial Intelligence

AI Reasoning Flaws Are a Safety Feature

AI models' inability to control their "chains of thought" when monitored is a positive for AI safety, preventing them from easily deceiving oversight systems.

8 days ago
Technology

Databricks' KARL Cuts Agent Costs

Databricks' new KARL AI agent drastically cuts costs and latency for enterprise knowledge tasks using custom reinforcement learning.

8 days ago
AI Research

Microsoft's Phi-4-reasoning-vision-15B compact AI model

Microsoft Research's Phi-4-reasoning-vision-15B offers efficient multimodal AI, excelling in reasoning and vision tasks with less data and compute.

9 days ago
AI Research

DynFormer: Smarter AI for Complex Physics

DynFormer, a new dynamics-informed neural operator, significantly reduces error and memory usage in complex PDE simulations by using scale-aware Transformers.

9 days ago
AI Research

Robots Learn to Peel Like Humans

Researchers developed a two-stage robot learning framework that uses imitation and human feedback to master complex, subjective manipulation tasks like peeling produce.

9 days ago
AI Research

LM Agents Still Prone to Goal Drift

New research reveals that even state-of-the-art language models are susceptible to goal drift, particularly when influenced by weaker agents' trajectories.

9 days ago
Artificial Intelligence

AI Steals AI's Own Secrets: Distillation Attacks

New research reveals how 'distillation attacks' can steal proprietary AI models, creating significant intellectual property and security risks for businesses.

9 days ago
Artificial Intelligence

Google's Interactions API Evolves Gemini

Google's new Interactions API for Gemini models offers a unified interface for complex AI tasks, supporting multimodal inputs, agents, and tool integration.

10 days ago
AI Research

Google's Gemini 3.1 Flash-Lite Targets Scale, Cuts Costs

Google DeepMind's Gemini 3.1 Flash-Lite arrives as its most cost-effective AI model, designed for scale and speed.

10 days ago
AI Research

CHIMERA Dataset Boosts LLM Reasoning

Researchers introduce CHIMERA, a synthetic dataset enabling LLMs to achieve strong cross-domain reasoning capabilities with efficient training.

10 days ago
AI Research

New Models Tackle Reasoning Puzzles with Symmetry

New Symbol-Equivariant Recurrent Reasoning Models (SE-RRMs) offer improved performance and generalization on reasoning tasks like Sudoku and ARC-AGI by explicitly encoding symmetry.

10 days ago
AI Research

Recursive LLMs Tackle Long-Horizon Reasoning

New research introduces recursive language models to overcome context limitations, showing significant improvements on long-horizon reasoning tasks like Boolean satisfiability.

10 days ago
AI Research

DCDP: Dynamic Diffusion Policies for Robotics

The DCDP framework enhances robotic adaptability in dynamic environments by integrating real-time environmental dynamics for improved action correction, achieving significant performance gains with minimal computational overhead.

10 days ago
Technology

Spark Ditches Dual Engines for Real-Time Mode

Databricks' new Real-Time Mode for Spark aims to deliver sub-second streaming speeds, eliminating the need for separate processing engines.

11 days ago
AI Research

Decoupling Correctness and Checkability in LLMs

Researchers propose a 'translator' model to overcome the 'legibility tax' in LLMs, decoupling accuracy from output checkability for more trustworthy AI.

13 days ago
AI Research

LLMs Revolutionize Vehicle Routing Optimization

A new LLM-powered approach, AILS-AHD, significantly advances vehicle routing optimization by dynamically designing heuristics, setting new performance records.

13 days ago
AI Research

Certified Circuits for Stable AI Explanations

New 'Certified Circuits' framework provides provable stability for AI model explanations, yielding more accurate and compact circuits.

13 days ago
AI Research

Edge AI Acceleration Gets Flexible

Researchers developed a novel FPGA-based accelerator that dynamically adjusts neural network precision at runtime, boosting inference speed for edge AI.

13 days ago
AI Research

AI Drives Safely Without Expert Data

Researchers introduce Risk-aware World Model Predictive Control (RaWMPC), enabling autonomous driving without expert data by predicting and avoiding risks.

13 days ago
AI Research

AI Governance: Optimization's Normative Limits

A new arXiv paper argues that optimization-based AI systems, including RLHF-trained LLMs, are formally incapable of normative governance due to inherent structural limitations.

13 days ago
AI Research

Predicting Transformer Training Instability

Researchers introduce RKSP, a method to predict transformer training divergence from a single forward pass, and KSS, a technique to actively prevent it, saving compute and enabling higher learning rates.

13 days ago