Tools · Hugging Face Blog

EVA-Bench Data 2.0: 3 Domains, 121 Tools, 213 Scenarios

Hugging Face announced EVA-Bench Data 2.0, expanding the benchmark to 3 domains, 121 tools, and 213 scenarios for evaluating tool use and agent performance. The update provides a broader test set for measuring capabilities across more tasks and environments.
