MMLU
MMLU (Massive Multitask Language Understanding) is a broad-coverage knowledge and reasoning benchmark spanning 57 subjects across STEM, the humanities, the social sciences, law, medicine, and other professional domains. As described in the original paper, it contains 14,079 four-choice multiple-choice test items, a 1,540-item validation split, and a small few-shot development split of 5 items per subject (285 in total) from which few-shot examples are drawn. Item difficulty ranges from elementary to advanced-professional level. Released in 2020, it became the de facto industry standard for measuring an LLM's breadth of world knowledge and remains one of the most widely cited LLM benchmarks despite its age.
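As a rough illustration of how a four-choice item is typically turned into a few-shot prompt and scored, here is a minimal Python sketch. The `Item` class, the `format_item`, `build_prompt`, and `accuracy` helpers, and the exact prompt layout are assumptions made for this example; they are not the official evaluation code or any particular harness's format.

```python
from dataclasses import dataclass
from typing import List, Tuple

CHOICE_LABELS = ("A", "B", "C", "D")


@dataclass(frozen=True)
class Item:
    """Hypothetical container for one four-choice question."""
    question: str
    choices: Tuple[str, str, str, str]
    answer: int  # index into `choices` (0 = A, ..., 3 = D)


def format_item(item: Item, include_answer: bool) -> str:
    """Render one item as text, optionally with its gold answer letter."""
    lines = [item.question]
    lines += [f"{label}. {text}" for label, text in zip(CHOICE_LABELS, item.choices)]
    suffix = f" {CHOICE_LABELS[item.answer]}" if include_answer else ""
    lines.append("Answer:" + suffix)
    return "\n".join(lines)


def build_prompt(shots: List[Item], target: Item) -> str:
    """Concatenate answered few-shot examples, then the unanswered target."""
    blocks = [format_item(shot, include_answer=True) for shot in shots]
    blocks.append(format_item(target, include_answer=False))
    return "\n\n".join(blocks)


def accuracy(predictions: List[str], items: List[Item]) -> float:
    """Fraction of predicted letters that match the gold answer letters."""
    if not items:
        return 0.0
    correct = sum(
        pred == CHOICE_LABELS[item.answer]
        for pred, item in zip(predictions, items)
    )
    return correct / len(items)
```

In this sketch the model's reply is assumed to have already been reduced to a single letter before `accuracy` is called; real harnesses differ in how they extract or score that letter.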
28 models · 2026-05-26T00:39:08.035851Z