1 article with this tag
The ClinEnv benchmark reveals that LLMs struggle with sequential medical decision-making, exposing a gap between their diagnostic and management capabilities.