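Evaluated on 13 benchmarks. Ranks in the top 3 on 2 of them: MBPP (Instruct) (#1 of 24) and LongBench (#3 of 17). Highest score is on GSM8K (88.5% exact_match, #6 of 28). Lowest score is on GPQA Extended (36.6% acc, #5 of 28), while the lowest rank is on MGSM (#18 of 28).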
| Benchmark | Metric | Score | Rank |
|---|---|---|---|
| GSM8K | exact_match | 88.5%±0.9% | #6 of 28 |
| IFEval | prompt_level_strict_acc | 82.3%±1.6% | #9 of 28 |
| EQ-Bench | eqbench | 80.7±1.4 | #7 of 28 |
| MBPP (Instruct) | pass@1 | 76.0%±1.9% | #1 of 24 |
| MMLU | acc | 74.1%±0.4% | #11 of 28 |
| MBPP | pass@1 | 71.4%±2.0% | #6 of 28 |
| MMLU-Pro | exact_match | 66.7%±0.4% | #10 of 28 |
| BBH | exact_match | 62.5%±0.4% | #5 of 28 |
| MGSM | exact_match | 60.8%±0.8% | #18 of 28 |
| GPQA Diamond | acc | 39.4%±3.5% | #4 of 28 |
| GPQA Main | acc | 38.4%±2.3% | #5 of 28 |
| LongBench | aggregate | 37.0%±0.5% | #3 of 17 |
| GPQA Extended | acc | 36.6%±2.1% | #5 of 28 |
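The summary figures (top-3 count, highest and lowest score, lowest rank) can be re-derived from the table rows. The sketch below is one way to do that. The `ROWS` list is transcribed by hand from the table, and the `summarize` helper and its `top_k` parameter are hypothetical names introduced for illustration, not part of any FrozeBench tooling. It also compares scores at face value, even though EQ-Bench is reported on its own scale rather than as a percentage.

```python
# Rows transcribed from the results table: (benchmark, score, rank, models_ranked).
# Scores are compared as plain numbers; EQ-Bench is not a percentage.
ROWS = [
    ("GSM8K", 88.5, 6, 28),
    ("IFEval", 82.3, 9, 28),
    ("EQ-Bench", 80.7, 7, 28),
    ("MBPP (Instruct)", 76.0, 1, 24),
    ("MMLU", 74.1, 11, 28),
    ("MBPP", 71.4, 6, 28),
    ("MMLU-Pro", 66.7, 10, 28),
    ("BBH", 62.5, 5, 28),
    ("MGSM", 60.8, 18, 28),
    ("GPQA Diamond", 39.4, 4, 28),
    ("GPQA Main", 38.4, 5, 28),
    ("LongBench", 37.0, 3, 17),
    ("GPQA Extended", 36.6, 5, 28),
]


def summarize(rows, top_k=3):
    """Return headline statistics for a list of benchmark rows."""
    top = [name for name, _score, rank, _n in rows if rank <= top_k]
    return {
        "benchmarks": len(rows),
        "top_k": top,
        "highest_score": max(rows, key=lambda r: r[1])[0],
        "lowest_score": min(rows, key=lambda r: r[1])[0],
        "lowest_rank": max(rows, key=lambda r: r[2])[0],
    }


summary = summarize(ROWS)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Note that the lowest score and the lowest rank fall on different benchmarks, so a summary that reports only one of them can give a skewed impression of where the model is weakest.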
Citation
FrozeBench. "google/gemma-3-27b-it." https://frozebench.com/models/google-gemma-3-27b-it. Retrieved 2026-06-04.
BibTeX
@misc{frozebench_google_gemma_3_27b_it,
title = {google/gemma-3-27b-it},
howpublished = {\url{https://frozebench.com/models/google-gemma-3-27b-it}},
year = {2026},
note = {FrozeBench. Retrieved 2026-06-04.}
}
URL
https://frozebench.com/models/google-gemma-3-27b-it