The vast sums locked in cryptocurrency smart contracts are increasingly a target for sophisticated AI. To gauge this evolving threat landscape, researchers have launched EVMbench, a new benchmark suite designed to rigorously evaluate AI agents' prowess in identifying, fixing, and exploiting vulnerabilities within blockchain environments.
Smart contracts, the backbone of decentralized finance, manage over $100 billion in open-source crypto assets. As AI models become more adept at coding, their ability to navigate these complex financial systems, for both malicious and defensive purposes, requires careful measurement. EVMbench aims to provide this critical assessment and to foster the development of AI systems for auditing and fortifying deployed contracts.
EVMbench: A Tri-Modal AI Security Challenge
Developed in collaboration with Paradigm, EVMbench presents AI agents with three distinct challenges:

