#Large Language Models

50 articles with this tag

Memora: Microsoft's AI Memory Upgrade
AI Research

Microsoft's Memora AI memory system revolutionizes long-term AI interactions by balancing detailed recall with efficient retrieval, outperforming existing solutions.

2 days ago
AI Coding Token Reduction: Rajkumar Sakthivel on Local Code Index
Artificial Intelligence

Rajkumar Sakthivel from Tesco discusses how a local code index reduced AI coding tokens by 94%, optimizing costs and performance by focusing on context over model improvements.

3 days ago
Erik Hanchett: Cut AI Agent Token Costs
Artificial Intelligence

AWS Developer Advocate Erik Hanchett shares five essential strategies to cut AI agent token costs, including caching prompts, routing by difficulty, and managing conversation history.

3 days ago
Isadora Martin-Dye: Layering AI Tone Instructions
Artificial Intelligence

Isadora Martin-Dye explains why simple tone instructions for AI are insufficient, advocating for a four-layered approach to prompt engineering.

5 days ago
AI Agents Need More Than Just Brains
Technology

AI agents require more than just powerful LLMs; they need a robust harness infrastructure for reliable real-world task execution.

13 days ago
Uber Eats Tries Agentic Shopping
Technology

Uber Eats' Cart Assistant uses AI to translate natural language requests into draft grocery carts, simplifying the shopping process.

15 days ago
Simulating Humans at Scale: Simile's Joon Sung Park
Artificial Intelligence

Simile's Joon Sung Park discusses simulating human behavior at scale using LLMs to understand societal dynamics and emergent phenomena.

16 days ago
TCS Taps Anthropic's Claude for Regulated Industries
Artificial Intelligence

TCS partners with Anthropic to bring Claude AI to regulated industries like finance and healthcare, integrating it into its own operations and client solutions.

19 days ago
DeepMind's Kilpatrick on AI Models Eating Harnesses
AI Research

Google DeepMind's Logan Kilpatrick delves into the AI concept of models "eating the harness," explaining how over-specialization hinders generalization and what can be done to prevent it.

21 days ago
Steering LRMs Beyond Output Degradation
AI Research


A new probe-based method, FPCG, distinguishes prediction features from detection features, enabling precise steering of large reasoning models (LRMs) with minimal degradation in output quality.

21 days ago
Google DeepMind Discusses Open Models & AI Ownership
AI Research

Google DeepMind's Gus Martins and Ian Ballantyne discuss the benefits of open AI models like Gemma for ownership, control, and custom applications.

21 days ago
Claude Fable 5 Arrives on Databricks
Technology

Anthropic's Claude Fable 5 is now available on Databricks, offering advanced AI capabilities with enterprise-grade governance and cost controls.

22 days ago
Alex Bowcut on RAG: Accuracy Over Obsolescence
Artificial Intelligence

Alex Bowcut of Sphere discusses why Retrieval Augmented Generation (RAG) remains vital for AI applications demanding accuracy, especially in specialized fields like tax compliance.

22 days ago
Images as the New Reasoning Medium
AI Research

This paper introduces optical reasoning, enabling images to serve as the primary medium for LLM and MLLM reasoning, achieving higher token efficiency and competitive performance.

22 days ago
Google's Gemma 4 12B: AI on Your Laptop
AI Research

Google's Gemma 4 12B model brings efficient, multimodal AI directly to laptops with a novel unified architecture.

23 days ago
AI Agents Get Dumber With More Context, Expert Warns
Artificial Intelligence

Nupur Sharma of Qodo explains how too much context can hinder AI agents, leading to the 'lost in the middle' problem, and discusses solutions like context engines and hybrid orchestration.

23 days ago
NF-CoT: High-Bandwidth Latent Reasoning
AI Research

NF-CoT framework enables high-bandwidth latent reasoning using normalizing flows, boosting LLM performance and efficiency while preserving autoregressive strengths.

26 days ago
Brendon Dillon on Text Diffusion at Google DeepMind
AI Research

Brendon Dillon from Google DeepMind discusses the advancements and potential of text diffusion models in language generation, highlighting advantages over autoregressive models.

27 days ago
Databricks Search Gets 3x Faster
Technology

Databricks' Instructed-Retriever-1 model uses parallel test-time scaling to boost Knowledge Assistant search speed by over 3x.

27 days ago
Together AI Masters MiniMax M3 Inference
Technology

Together AI details engineering feats enabling efficient MiniMax M3 inference, unlocking 1M-token context and multimodality.

29 days ago
Claude Code Embraces Opus 4.8
Technology

Claude Code updates its default model to Opus 4.8, introduces dynamic workflows, security plugins, and faster, cheaper Opus options.

about 1 month ago
Snowflake Adds Claude Opus 4.8
Technology

Snowflake integrates Anthropic's Claude Opus 4.8 into its Cortex AI platform, boosting agentic workflows, data analysis, and code generation for enterprises.

about 1 month ago
Anthropic Debuts Claude Opus 4.8
Artificial Intelligence

Anthropic unveils Claude Opus 4.8, boosting AI performance with new features like 'effort control' and 'dynamic workflows' for complex coding.

about 1 month ago
CAG vs. Long Context: AI's Memory Explained
Artificial Intelligence

IBM's Martin Keen explains how AI models use Long Context and Cache Augmented Generation (CAG) to process information, highlighting the trade-offs and efficiency gains of each approach.

about 1 month ago
Angus McLean on Bounded Autonomy in AI
Artificial Intelligence

Angus J. McLean of Oliver discusses 'Bounded Autonomy' in AI, exploring the shift to agentic processes in advertising and offering practical advice for building AI agents.

about 1 month ago
Unlocking LLM Recall: Data Composition is Key
AI Research

New research reveals a sigmoid scaling law for LLM factual recall, driven by model size and training data composition, explaining up to 94% of performance variance.

about 1 month ago
Google Launches Gemini 3.5 Flash
Artificial Intelligence

Google unveils Gemini 3.5 Flash, a fast and intelligent AI model optimized for agentic tasks, now powering consumer and developer tools.

about 1 month ago
Spotify's Shivam Verma on LLMs and Personalization
Artificial Intelligence

Shivam Verma from Spotify discusses how LLMs are transforming personalization in recommendation systems, moving towards steerable and context-aware content discovery.

about 1 month ago
AI Delegation: Reliability Concerns Emerge
AI Research

New Microsoft Research highlights how AI can degrade document fidelity in long, delegated tasks, stressing the need for better verification and orchestration.

about 2 months ago
ChatGPT Gets Smarter on Sensitive Chats
Artificial Intelligence

OpenAI's latest ChatGPT safety updates help the AI better understand context in sensitive conversations, improving its response to potential harm.

about 2 months ago
AI Agents Flunk Social Reasoning Test
AI Research

Microsoft's SocialReasoning-Bench reveals AI agents struggle to negotiate effectively in users' best interests, prioritizing task completion over optimal outcomes.

about 2 months ago
Sally-Ann Delucia on AI Agent Context Management
AI Research

Sally-Ann Delucia of Arize discusses the challenges and strategies for context management in AI agents, highlighting the importance of memory and sub-agents.

about 2 months ago
Databricks' Genie Data Agent
Technology

Databricks unveils Genie, a sophisticated data agent designed to navigate complex enterprise data, leveraging specialized search, parallel thinking, and multi-LLM designs for enhanced accuracy.

about 2 months ago
JACTUS AI Unifies Compression and Adaptation
AI Research

JACTUS AI unifies parameter compression and task adaptation, outperforming sequential methods with fewer retained parameters across vision and language tasks.

about 2 months ago
OpenAI boosts ChatGPT with GPT-5.5 Instant
Artificial Intelligence

OpenAI upgrades ChatGPT with GPT-5.5 Instant, boosting accuracy, personalization, and user control over AI memory.

about 2 months ago
Training LLMs Locally: ElevenLabs Expert Shares How-To
AI Research

Angelos Perivolaropoulos of ElevenLabs shares a practical guide to training Large Language Models (LLMs) from scratch on local hardware.

about 2 months ago
Andrej Karpathy: AI Models Need Human-Like Reasoning
Artificial Intelligence

Andrej Karpathy discusses the evolution of AI from programming to prompting, emphasizing the current need for models to develop human-like reasoning.

2 months ago
Perplexity CTO on GPT-5.5 Efficiency
Artificial Intelligence

Perplexity CTO Denis Yarats reveals GPT-5.5's impressive efficiency, using 56% fewer tokens for complex tasks and enabling faster user feedback.

2 months ago
Anthropic Delays 'Myths' AI Model Amid Security Concerns
Artificial Intelligence

Anthropic delays release of its 'Myths' AI model after a security researcher found it could be prompted to simulate a bank robbery, raising safety concerns.

2 months ago
OpenAI Unveils GPT-5.5
Artificial Intelligence

OpenAI launches GPT-5.5, boasting enhanced intelligence, autonomy, and speed for complex tasks, alongside advanced safety features.

2 months ago
AI's Memory Problem
Investors News

AI models currently struggle to learn and adapt post-deployment, relying on external memory. Continual learning research aims to change that.

2 months ago
Sunil Pai on AI Agents & the Future of Software
Artificial Intelligence

Cloudflare's Sunil Pai discusses the future of AI agents, moving from tool-calling to code generation for more efficient and powerful interactions.

2 months ago
Anthropic Unveils Opus 4.7: A Leap in AI Coding and Vision
Artificial Intelligence

Anthropic unveils its updated Opus 4.7 AI model, boasting enhanced coding and computer vision capabilities, with a key focus on cybersecurity.

3 months ago
Anthropic's Claude Opus 4.7 Arrives, Sharper Than Ever
Artificial Intelligence

Anthropic unveils Claude Opus 4.7, boosting AI's coding prowess, multimodal input, and safety features for enterprise use.

3 months ago
OpenAI Demystifies AI Basics
Artificial Intelligence

OpenAI's new 'AI Fundamentals' course simplifies AI, explaining LLMs and model evolution for everyone.

3 months ago
AI Agents Need Better Memories
Technology

Databricks research explores how AI agents can improve by accessing vast stores of past interactions and organizational knowledge, moving beyond just larger models.

3 months ago
LLM Adaptation Without Retraining
AI Research

In-Place Test-Time Training enables LLMs to adapt to new data at inference without retraining, enhancing performance and paving the way for continual learning.

3 months ago
LLMs Learn to Play Tic-Tac-Toe with Reinforcement Learning
Artificial Intelligence

Stefano Fiorucci discusses the power of reinforcement learning for training LLMs, showcasing Tic-Tac-Toe as a case study for building interactive environments and improving model capabilities.

3 months ago
AI Hacker "Pliny the Liberator" Tests GPT-4 Security
AI Research

AI security researcher "Pliny the Liberator" demonstrates a novel jailbreaking technique using "tokenades" to manipulate AI models, showcasing the ongoing challenges in AI security.

3 months ago
China's AI Surge: Open Source Models Face Scrutiny
Artificial Intelligence

Bloomberg Opinion columnist Catherine Thorbecke discusses China's booming AI sector, the rise of open-source models, and the critical need for security and data privacy.

3 months ago