AI Model Evaluation

FlagEval

FlagEval, jointly developed by Zhiyuan Research Institute (Beijing Academy of Artificial Intelligence, BAAI) and several university teams, is an evaluation platform for large models. It adopts a three-dimensional “capability-task-metric” evaluation framework, aiming to provide comprehensive, fine-grained evaluation results. The platform covers more than 600 evaluation dimensions, formed by combining over 30 capabilities, 5 tasks, and 4 categories of metrics. The task dimension includes 22 subjective and objective evaluation datasets with a total of 84,433 questions.
