#Large Language Models

50 articles with this tag

AI Delegation: Reliability Concerns Emerge

New work from Microsoft Research highlights how AI can degrade document fidelity in long, delegated tasks, stressing the need for better verification and orchestration.

1 day ago

Artificial Intelligence

ChatGPT Gets Smarter on Sensitive Chats

OpenAI's latest ChatGPT safety updates help the AI better understand context in sensitive conversations, improving its response to potential harm.

3 days ago

AI Research

AI Agents Flunk Social Reasoning Test

Microsoft's SocialReasoning-Bench reveals AI agents struggle to negotiate effectively in users' best interests, prioritizing task completion over optimal outcomes.

6 days ago

AI Research

Sally-Ann Delucia on AI Agent Context Management

Sally-Ann Delucia of Arize discusses the challenges and strategies for context management in AI agents, highlighting the importance of memory and sub-agents.

7 days ago

Technology

Databricks' Genie Data Agent

Databricks unveils Genie, a sophisticated data agent designed to navigate complex enterprise data, leveraging specialized search, parallel thinking, and multi-LLM designs for enhanced accuracy.

9 days ago

AI Research

JACTUS AI Unifies Compression and Adaptation

JACTUS AI unifies parameter compression and task adaptation, outperforming sequential methods with fewer retained parameters across vision and language tasks.

12 days ago

Artificial Intelligence

OpenAI boosts ChatGPT with GPT-5.5 Instant

OpenAI upgrades ChatGPT with GPT-5.5 Instant, boosting accuracy, personalization, and user control over AI memory.

12 days ago

AI Research

Training LLMs Locally: ElevenLabs Expert Shares How-To

Angelos Perivolaropoulos of ElevenLabs shares a practical guide to training Large Language Models (LLMs) from scratch on local hardware.

13 days ago

Artificial Intelligence

Andrej Karpathy: AI Models Need Human-Like Reasoning

Andrej Karpathy discusses the evolution of AI from programming to prompting, emphasizing the current need for models to develop human-like reasoning.

15 days ago

Artificial Intelligence

Perplexity CTO on GPT-5.5 Efficiency

Perplexity CTO Denis Yarats reveals GPT-5.5's impressive efficiency, using 56% fewer tokens for complex tasks and enabling faster user feedback.

23 days ago

Artificial Intelligence

Anthropic Delays 'Myths' AI Model Amid Security Concerns

Anthropic delays release of its 'Myths' AI model after a security researcher found it could be prompted to simulate a bank robbery, raising safety concerns.

24 days ago

Artificial Intelligence

OpenAI Unveils GPT-5.5

OpenAI launches GPT-5.5, boasting enhanced intelligence, autonomy, and speed for complex tasks, alongside advanced safety features.

24 days ago

Investors News

AI's Memory Problem

AI models currently struggle to learn and adapt post-deployment, relying on external memory. Continual learning research aims to change that.

25 days ago

Artificial Intelligence

Sunil Pai on AI Agents & the Future of Software

Cloudflare's Sunil Pai discusses the future of AI agents, moving from tool-calling to code generation for more efficient and powerful interactions.

28 days ago

Artificial Intelligence

Anthropic Unveils Opus 4.7: A Leap in AI Coding and Vision

Anthropic unveils its updated Opus 4.7 AI model, boasting enhanced coding and computer vision capabilities, with a key focus on cybersecurity.

about 1 month ago

Artificial Intelligence

Anthropic's Claude Opus 4.7 Arrives, Sharper Than Ever

Anthropic unveils Claude Opus 4.7, boosting AI's coding prowess, multimodal input, and safety features for enterprise use.

about 1 month ago

Artificial Intelligence

OpenAI Demystifies AI Basics

OpenAI's new 'AI Fundamentals' course simplifies AI, explaining LLMs and model evolution for everyone.

about 1 month ago

Technology

AI Agents Need Better Memories

Databricks research explores how AI agents can improve by accessing vast stores of past interactions and organizational knowledge, moving beyond just larger models.

about 1 month ago

AI Research

LLM Adaptation Without Retraining

In-Place Test-Time Training enables LLMs to adapt to new data at inference without retraining, enhancing performance and paving the way for continual learning.

about 1 month ago

Artificial Intelligence

LLMs Learn to Play Tic-Tac-Toe with Reinforcement Learning

Stefano Fiorucci discusses the power of reinforcement learning for training LLMs, showcasing Tic-Tac-Toe as a case study for building interactive environments and improving model capabilities.

about 1 month ago

AI Research

AI Hacker "Pliny the Liberator" Tests GPT-4 Security

AI security researcher "Pliny the Liberator" demonstrates a novel jailbreaking technique using "tokenades" to manipulate AI models, showcasing the ongoing challenges in AI security.

about 1 month ago

Artificial Intelligence

China's AI Surge: Open Source Models Face Scrutiny

Bloomberg Opinion columnist Catherine Thorbecke discusses China's booming AI sector, the rise of open-source models, and the critical need for security and data privacy.

about 2 months ago

Technology

Divide and Conquer LLMs Beat Giants

Smaller LLMs using a 'Divide & Conquer' strategy can outperform top models like GPT-4o on long context tasks, offering cost and speed benefits.

about 2 months ago

AI Research

Google Researchers Explore AI Storage Efficiency

Google researchers are developing AI compression techniques to reduce model storage needs by sixfold, aiming to lower costs and boost efficiency in AI development.

about 2 months ago

Artificial Intelligence

AI Storage Efficiency & Corebridge Deal Highlighted

Google researchers have developed an AI storage efficiency technique, while Corebridge Financial faces acquisition and Pony.ai plans global driverless vehicle expansion.

about 2 months ago

Technology

Namazu AI Adapts Global Models for Japan

Sakana AI launches Namazu AI, adapting global LLMs for Japan with improved neutrality and integrated web search via Sakana Chat.

about 2 months ago

AI Research

Perceptio: Spatial Grounding for LVLMs

The Perceptio LVLM integrates explicit spatial tokens (segmentation, depth) to overcome the limitations of large vision-language models (LVLMs) in fine-grained visual grounding, achieving state-of-the-art results across benchmarks.

about 2 months ago

Technology

Cloudflare Bets Big on Open-Source LLMs

Cloudflare's Workers AI now supports large language models, integrating Kimi K2.5 to offer cost-effective AI agent development.

about 2 months ago

Artificial Intelligence

Mistral Small 4 Unifies AI Capabilities

Mistral AI unveils Mistral Small 4, a unified model combining text, image, reasoning, and coding capabilities under an open-source license.

2 months ago

Artificial Intelligence

Run LLMs Locally with Llama.cpp

Cedric Clyburn explains how Llama.cpp makes running large language models locally on consumer hardware possible, highlighting GGUF format and optimized kernels for efficiency and accessibility.

2 months ago

Startup News

Tiiny AI Pocket Lab Hits $1M on Kickstarter

Tiiny AI's Pocket Lab, a personal AI supercomputer, raised over $1 million in five hours on Kickstarter, signaling demand for local AI processing.

2 months ago

Artificial Intelligence

IBM's Martin Keen on LLM Context Windows

IBM's Martin Keen explains how larger context windows in LLMs simplify deployments and improve reasoning by reducing reliance on complex RAG systems.

2 months ago

AI Research

Agentic LLMs: Stabilizing Minimax Training

Adversarially-Aligned Jacobian Regularization (AAJR) tackles LLM agent stability by controlling sensitivity along adversarial directions, expanding policy classes and reducing performance degradation.

2 months ago

AI Research

RLAIF Explained: Latent Values in LLMs

RLAIF explained: human values are latent directions in LLM representations, activated by constitutional prompts, with the alignment ceiling tied to model capacity and data quality.

2 months ago

Artificial Intelligence

OpenAI GPT-5.4 Launch Amid Intensifying AI Race

OpenAI is reportedly fast-tracking the launch of GPT-5.4, a new AI model, in response to rapid advancements from competitors like Anthropic.

2 months ago

Artificial Intelligence

OpenAI's GPT-5.3 Instant Promises Smoother AI Chat

OpenAI's GPT-5.3 Instant aims for more natural and efficient AI conversations, enhancing web searches and reducing conversational dead ends.

2 months ago

Artificial Intelligence

OpenAI's GPT-4.5 Enhances Web Search Integration

OpenAI researcher Josh discusses how GPT-4.5's web search integration is becoming more natural, conversational, and context-aware.

2 months ago

AI Research

Recursive LLMs Tackle Long-Horizon Reasoning

New research introduces recursive language models to overcome context limitations, showing significant improvements on long-horizon reasoning tasks like Boolean satisfiability.

2 months ago

AI Research

Decoupling Correctness and Checkability in LLMs

Researchers propose a 'translator' model to overcome the 'legibility tax' in LLMs, decoupling accuracy from output checkability for more trustworthy AI.

3 months ago

AI Research

LLMs Revolutionize Vehicle Routing Optimization

A new LLM-powered approach, AILS-AHD, significantly advances vehicle routing optimization by dynamically designing heuristics, setting new performance records.

3 months ago

AI Research

Multimodal LLMs: What's Lost in Translation?

New research reveals multimodal LLMs struggle to utilize non-textual data due to a 'mismatched decoder problem,' impacting their true understanding.

3 months ago

Technology

OpenClaw Agents: The Future of AI Autonomy?

OpenClaw Agents, powered by advanced reasoning LLMs, are poised to redefine AI autonomy and potentially disrupt current application paradigms.

3 months ago

Artificial Intelligence

OpenAI Lands $110B, Valued at $730B

OpenAI has announced a massive $110 billion funding round at a $730 billion pre-money valuation, backed by Amazon, NVIDIA, and SoftBank.

3 months ago

Technology

Etched Secures $500M for AI Chip Battle

Google alum Reiner Pope's startup, Etched, raises $500M to develop specialized AI chips designed to compete with Nvidia.

3 months ago

Technology

Intuit Taps Anthropic for AI Partnership

Intuit's stock saw a modest gain following its multi-year partnership with Anthropic, aimed at integrating custom AI agents for businesses and consumers.

3 months ago

AI Research

Arcee Trinity Large Breaks Cover

Arcee.ai unveils Trinity Large, a 400B-parameter Mixture-of-Experts model engineered for inference efficiency and enterprise long-context use, alongside smaller variants.

3 months ago

Technology

Governing Agentic AI by 2026

As agentic AI trends accelerate towards 2026, robust governance frameworks encompassing identity, policy, and enforcement are crucial for safe and ethical autonomous AI deployment.

3 months ago

AI Research

GPT-OSS-Puzzle-88B: Faster AI, Same Brains

GPT-OSS-Puzzle-88B offers substantial inference speedups for large language models without sacrificing accuracy, utilizing techniques like MoE pruning and window attention.

3 months ago

AI Research

AI Societies' Safety Problem

Self-evolving AI societies face a trilemma: they cannot achieve continuous learning, isolation, and safety alignment simultaneously.

3 months ago

Technology

Testing AI Guardrails Across Languages

Researchers tested context-aware AI guardrails across English and Farsi in humanitarian scenarios, finding nuanced performance differences and highlighting the need for language-specific safety evaluations.

3 months ago