#Coding Agents

13 articles with this tag

Evaluating Coding Agents: Lessons from SWE-rebench
AI Research

Ibragim Badertdinov from Nebius shares key lessons from evaluating coding agents using the SWE-rebench benchmark, highlighting the importance of real-world tasks, reliable verification, and cost-effectiveness.

27 days ago

Devin's 80% Moment: AI Coding Agents Evolve
Artificial Intelligence

Walden Yan and Cole Murray discuss Devin's '80% moment' in AI coding, highlighting background agents, multiple PRs, and the end of hand-held coding.

about 1 month ago

Hugging Face's Ben Burtenshaw on AI System Engineering
Artificial Intelligence

Ben Burtenshaw from Hugging Face discusses how AI coding agents can be used for AI system engineering, kernel optimization, and building multi-agent autoresearch labs.

about 1 month ago

Coding Agent Inference Benchmark Revealed
Technology

Together AI unveils a new benchmark for coding agent inference, highlighting performance under real-world load and significant cost advantages.

about 1 month ago

Marlene Mhangami: Playwright for Functionality Testing
Technology

Marlene Mhangami from Microsoft and GitHub discusses leveraging Playwright and AI agents for effective functionality testing, emphasizing clean code and behavior-driven development.

about 2 months ago

OpenAI's "Parameter Golf" Reveals AI's Role
Artificial Intelligence

OpenAI's "Parameter Golf" competition revealed how AI coding agents are transforming machine learning research, pushing innovation under tight constraints.

about 2 months ago

VIBE✓ adds friction to AI coding agents
Technology

Mozilla.ai's VIBE✓ framework introduces deliberate friction to coding agent workflows, mitigating automation bias and ensuring human oversight.

about 2 months ago

Embedding OpenClaw Coding Agent in Your Product
Artificial Intelligence

Matthias Luebken from Tavon.ai discusses embedding the OpenClaw coding agent, Pi, into products, highlighting its utility for developers and the future of AI in software systems.

about 2 months ago

OpenAI's Safety Playbook for Codex
Artificial Intelligence

OpenAI details the safety measures behind its Codex coding agent, emphasizing sandboxing, network controls, and detailed telemetry for secure deployment.

about 2 months ago

Databricks Tames Coding AI Chaos
Technology

Databricks introduces Unity AI Gateway to manage AI coding agents, offering centralized governance, cost controls, and observability for enterprises.

3 months ago

Databricks Centralizes Coding AI
Technology

Databricks launches AI Gateway to centralize governance, security, and cost controls for the growing number of AI coding agents used by enterprises.

3 months ago

Exa Unveils New Code Search Benchmarks
Artificial Intelligence

Exa.ai releases 'WebCode', a new benchmark suite for evaluating search performance in coding agents, addressing limitations in existing tools.

3 months ago

AI Agents Leveled Up by Harness Engineering
Artificial Intelligence

LangChain's harness engineering approach dramatically improved an AI coding agent's performance by refining its surrounding system, not the core model.

4 months ago