google/gemma-3-12b-it

Weights: 24.4 GB · safetensors
Source: Hugging Face page
Commit: 96b6f1e

Evaluated across 13 benchmarks. Ranks in the top 3 on 2 of 13: LongBench (#1 of 17) and MBPP(Instruct) (#2 of 24). Highest score: GSM8K (86.3% exact_match, #8 of 28). Lowest score: GPQA Extended (33.3% acc, #7 of 28).

Benchmark results

Benchmark	Metric	Score	Rank
GSM8K	exact_match	86.3%±0.9%	#8 of 28
IFEval	prompt_level_strict_acc	81.0%±1.7%	#12 of 28
EQ-Bench	eqbench	72.7±1.9	#21 of 28
MBPP(Instruct)	pass@1	70.8%±2.0%	#2 of 24
MMLU	acc	70.7%±0.4%	#13 of 28
MGSM	exact_match	68.4%±0.8%	#10 of 28
MBPP	pass@1	65.4%±2.1%	#9 of 28
BBH	exact_match	62.3%±0.4%	#6 of 28
MMLU-Pro	exact_match	59.7%±0.4%	#15 of 28
LongBench	aggregate	46.2%±0.5%	#1 of 17
GPQA Diamond	acc	35.9%±3.4%	#7 of 28
GPQA Main	acc	35.7%±2.3%	#6 of 28
GPQA Extended	acc	33.3%±2.0%	#7 of 28
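
The summary figures for this model (top-3 count, highest and lowest score) can be re-derived from the table rows. The following is a minimal sketch, not FrozeBench's own code: the `ROWS` list is hand-copied from the table, and the `summarize` helper and its field names are hypothetical. It treats the EQ-Bench score, which is reported without a percent sign, as comparable to the percentage scores; that is an assumption.

```python
# Rows hand-copied from the table: (benchmark, score, rank, field size).
# The uncertainty (the "±" part) is omitted here.
ROWS = [
    ("GSM8K", 86.3, 8, 28),
    ("IFEval", 81.0, 12, 28),
    ("EQ-Bench", 72.7, 21, 28),  # reported without "%"; treated as comparable (assumption)
    ("MBPP(Instruct)", 70.8, 2, 24),
    ("MMLU", 70.7, 13, 28),
    ("MGSM", 68.4, 10, 28),
    ("MBPP", 65.4, 9, 28),
    ("BBH", 62.3, 6, 28),
    ("MMLU-Pro", 59.7, 15, 28),
    ("LongBench", 46.2, 1, 17),
    ("GPQA Diamond", 35.9, 7, 28),
    ("GPQA Main", 35.7, 6, 28),
    ("GPQA Extended", 33.3, 7, 28),
]


def summarize(rows):
    """Hypothetical helper: recompute the headline summary from table rows."""
    top3 = [name for name, _score, rank, _field in rows if rank <= 3]
    highest = max(rows, key=lambda r: r[1])  # row with the largest score
    lowest = min(rows, key=lambda r: r[1])   # row with the smallest score
    return {
        "benchmarks": len(rows),
        "top3": top3,
        "highest": highest[0],
        "lowest": lowest[0],
    }


summary = summarize(ROWS)
print(summary)
```

Note that "highest score" and "best rank" are different things: the best rank here is on LongBench, which has a smaller field (17 models) than most of the other benchmarks (28).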

Last evaluated: 2 weeks ago

How to cite

Citation

FrozeBench. "google/gemma-3-12b-it." https://frozebench.com/models/google-gemma-3-12b-it. Retrieved 2026-06-04.

BibTeX

@misc{frozebench_google_gemma_3_12b_it,
  title = {google/gemma-3-12b-it},
  howpublished = {\url{https://frozebench.com/models/google-gemma-3-12b-it}},
  year = {2026},
  note = {FrozeBench. Retrieved 2026-06-04.}
}

URL

https://frozebench.com/models/google-gemma-3-12b-it