Microsoft Research is shedding light on a critical challenge in AI-powered workflows: the reliability of AI delegation over extended tasks. Their latest findings, detailed in a recent paper, reveal that AI systems can degrade the fidelity of important documents, spreadsheets, and code through repeated edits.
Related startups
The research specifically examines a scenario termed "delegated work," where users entrust AI with multi-step modifications with minimal human oversight. Using chained transformation and inversion tasks, the study evaluates how well semantic content is preserved. The focus is on meaningful changes, not just superficial formatting.
Degradation Over Time
Across various settings, state-of-the-art AI models demonstrated a significant drop in artifact fidelity. Over 20 delegated iterations, this degradation ranged from 19% to 34%.
