1. google/gemma-4-31B-it
acc: 51.6% ± 2.1%
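The page does not say how the ± figure is computed. One common convention for benchmark accuracy is the binomial standard error, sqrt(p(1 - p) / n). The sketch below applies that convention as an illustration only: the function name `binomial_standard_error` is hypothetical, and the question count of 546 is an assumption taken from the size reported for the Extended set in the GPQA paper, not from this page.

```python
import math


def binomial_standard_error(accuracy: float, n_questions: int) -> float:
    """Standard error of a proportion, treating each question as an independent trial."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_questions)


accuracy = 0.516    # score shown on the leaderboard
n_questions = 546   # ASSUMPTION: Extended set size; not stated on this page

se = binomial_standard_error(accuracy, n_questions)
print(f"acc: {accuracy:.1%} ± {se:.1%}")
```

Under these assumptions the result rounds to the ± 2.1% shown above, which is consistent with a one-standard-error interval but does not confirm that this is the method the leaderboard uses.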
GPQA Extended includes the Main set plus additional questions that were authored but did not pass the rigorous filtering for the Main set.
Latest run: 2026-05-25
The additional questions may be flawed, ambiguous, or trivial compared with those in the Main set.
Citation
FrozeBench. "GPQA Extended." https://frozebench.com/benchmarks/gpqa-extended. Retrieved 2026-06-04.
BibTeX
@misc{frozebench_gpqa_extended,
title = {GPQA Extended},
howpublished = {\url{https://frozebench.com/benchmarks/gpqa-extended}},
year = {2026},
note = {FrozeBench. Retrieved 2026-06-04.}
}
URL
https://frozebench.com/benchmarks/gpqa-extended