Quantifying LLM Impact on Labor Skills

As Large Language Models (LLMs) spread through the economy, measuring their effect on the labor market has become a priority. New research introduces the Skill Automation Feasibility Index (SAFI), a framework for benchmarking frontier LLMs against granular occupational skills. The work, available on arXiv, offers empirical data for policymakers and investors tracking AI-driven automation.

Benchmarking LLMs for Skill Automation Feasibility

The study evaluates four leading LLMs (LLaMA 3.3 70B, Mistral Large, Qwen 2.5 72B, and Gemini 2.5 Flash) on 263 text-based tasks covering all 35 skills defined by the U.S. Department of Labor's O*NET taxonomy. On the resulting Skill Automation Feasibility Index (SAFI), Mathematics (SAFI: 73.2) and Programming (71.8) score highest, making them the skills most susceptible to automation. Active Listening (42.2) and Reading Comprehension (45.5) score lowest, indicating areas where human skills remain robust against current LLM capabilities. Performance converges across the evaluated models, with a narrow 3.6-point spread, which suggests that the potential for text-based automation depends more on the nature of the skill than on the specific model architecture.
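To make the ranking concrete, here is a minimal Python sketch that orders the four SAFI scores quoted above and computes the gap between the highest and lowest of them. The helper names `rank_by_safi` and `spread` are illustrative assumptions, not part of the study, and only the four scores reported in this article are used.

```python
# SAFI scores as quoted in this article; the other 31 skills are not listed here.
reported_safi = {
    "Mathematics": 73.2,
    "Programming": 71.8,
    "Reading Comprehension": 45.5,
    "Active Listening": 42.2,
}


def rank_by_safi(scores):
    """Return (skill, score) pairs sorted from highest to lowest SAFI."""
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


def spread(values):
    """Return the max-minus-min range of a collection of scores."""
    values = list(values)
    return max(values) - min(values)


ranking = rank_by_safi(reported_safi)
for skill, score in ranking:
    print(f"{skill}: {score}")

# Gap between the most and least automatable of the four quoted skills.
gap = round(spread(reported_safi.values()), 1)
print(f"Gap between highest and lowest quoted skill: {gap}")
```

The same `spread` helper could in principle be applied to per-model average scores to check a figure like the 3.6-point spread, but the article does not list per-model results, so that step is left out.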

The Capability-Demand Inversion and AI Augmentation

A striking finding is the "capability-demand inversion": the skills most critical for jobs exposed to AI are precisely those where LLMs currently underperform on the benchmark. This points to a strategic gap and an opportunity for human expertise. By cross-referencing SAFI with real-world AI adoption data, the research also proposes an AI Impact Matrix, which sorts skills into four categories: High Displacement Risk, Upskilling Required, AI-Augmented, and Lower Displacement Risk. The analysis indicates that 78.7% of observed AI interactions in the workplace are currently augmentation, enhancing human capabilities rather than replacing them outright. Because SAFI measures LLM performance on text-based representations of skills, it offers a way to examine this interaction skill by skill.
