AI Model Evaluation


A large-scale multitask language understanding benchmark


MMLU (Massive Multitask Language Understanding) is a benchmark for evaluating the language understanding ability of large language models. It is currently one of the best-known semantic understanding evaluations for large models, introduced by researchers at UC Berkeley in September 2020. The test covers 57 tasks, including elementary mathematics, US history, computer science, law, and more. The tasks span a wide range of knowledge, all in English, and evaluate a model's basic knowledge coverage and understanding ability.
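MMLU questions are multiple choice, and results are typically reported as per-task accuracy plus an average across the 57 tasks. Below is a minimal sketch of that scoring scheme; the task names, predictions, and the `mmlu_accuracy` helper are illustrative assumptions, not the benchmark's official tooling, and the unweighted mean over tasks is one common reporting convention.

```python
# Hypothetical sketch of MMLU-style scoring. Each record is
# (task, predicted choice, correct choice); choices are letters A-D.
from collections import defaultdict

def mmlu_accuracy(records):
    """Return per-task accuracy and the unweighted mean over tasks."""
    per_task = defaultdict(lambda: [0, 0])  # task -> [num correct, num total]
    for task, pred, gold in records:
        per_task[task][0] += pred == gold
        per_task[task][1] += 1
    task_acc = {t: c / n for t, (c, n) in per_task.items()}
    # Headline number: average the per-task accuracies (macro average).
    overall = sum(task_acc.values()) / len(task_acc)
    return task_acc, overall

# Illustrative predictions for three of the 57 tasks.
records = [
    ("elementary_mathematics", "B", "B"),
    ("elementary_mathematics", "C", "A"),
    ("us_history", "D", "D"),
    ("computer_science", "A", "A"),
]
task_acc, overall = mmlu_accuracy(records)
print(task_acc["elementary_mathematics"])  # 0.5
```

Note that a macro average over tasks weights each subject equally regardless of how many questions it contains; averaging over individual questions instead would give a slightly different number.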

