1 articles with this tag
New Hedge-Bench 1.0 benchmark reveals frontier AI models score under 16% on real-world financial reasoning tasks, exposing a critical gap in expert-level judgment.