microsoft/phi-4-reasoning-plus

Weights: 29.3 GB · safetensors
Source: Hugging Face page
Commit: master

Evaluated on 13 benchmarks, with no top-3 rankings. Highest score: MMLU (77.6% acc, #8 of 28). Lowest score: MBPP(Instruct) (0.0% pass@1, #21 of 24). Ranks last (#28 of 28) on MGSM and IFEval.

Benchmark results

Benchmark	Metric	Score	Rank
MMLU	acc	77.6%±0.3%	#8 of 28
MMLU-Pro	exact_match	29.4%±0.4%	#24 of 28
GPQA Diamond	acc	27.8%±3.2%	#21 of 28
GPQA Extended	acc	26.9%±1.9%	#17 of 28
GPQA Main	acc	26.1%±2.1%	#21 of 28
MGSM	exact_match	23.5%±0.6%	#28 of 28
IFEval	prompt_level_strict_acc	23.3%±1.8%	#28 of 28
LongBench	aggregate	10.2%±0.2%	#14 of 17
GSM8K	exact_match	9.2%±0.8%	#25 of 28
EQ-Bench	eqbench	3.6±1.4	#26 of 28
BBH	exact_match	3.3%±0.2%	#18 of 28
MBPP	pass@1	0.4%±0.3%	#22 of 28
MBPP(Instruct)	pass@1	0.0%±0.0%	#21 of 24
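
The headline summary can be reproduced from the table with a few lines of Python. This is a minimal sketch, not the site's own code: the `RESULTS` tuples are copied by hand from the table, the `summarize` helper is hypothetical, and it assumes "strongest" and "weakest" mean highest and lowest raw score (EQ-Bench uses its own scale, so comparing it with percentages is a simplification).

```python
# (benchmark, score, rank, number of models ranked), copied from the table.
# Assumption: scores are compared as raw numbers, including EQ-Bench,
# which is not a percentage.
RESULTS = [
    ("MMLU", 77.6, 8, 28),
    ("MMLU-Pro", 29.4, 24, 28),
    ("GPQA Diamond", 27.8, 21, 28),
    ("GPQA Extended", 26.9, 17, 28),
    ("GPQA Main", 26.1, 21, 28),
    ("MGSM", 23.5, 28, 28),
    ("IFEval", 23.3, 28, 28),
    ("LongBench", 10.2, 14, 17),
    ("GSM8K", 9.2, 25, 28),
    ("EQ-Bench", 3.6, 26, 28),
    ("BBH", 3.3, 18, 28),
    ("MBPP", 0.4, 22, 28),
    ("MBPP(Instruct)", 0.0, 21, 24),
]


def summarize(results):
    """Derive the headline figures from the per-benchmark rows."""
    strongest = max(results, key=lambda row: row[1])  # highest raw score
    weakest = min(results, key=lambda row: row[1])    # lowest raw score
    return {
        "benchmarks": len(results),
        "top3": sum(1 for _, _, rank, _ in results if rank <= 3),
        "strongest": strongest[0],
        "weakest": weakest[0],
        # Benchmarks where the model is ranked last among all models.
        "last_place": [name for name, _, rank, total in results if rank == total],
    }


summary = summarize(RESULTS)
print(summary)
```

Note that the lowest score and the worst rank fall on different benchmarks: MBPP(Instruct) has the lowest score, while MGSM and IFEval are the two last-place rankings.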

Last evaluated: 2 weeks ago

How to cite

Citation

FrozeBench. "microsoft/phi-4-reasoning-plus." https://frozebench.com/models/microsoft-phi-4-reasoning-plus. Retrieved 2026-06-04.

BibTeX

@misc{frozebench_microsoft_phi_4_reasoning_plus,
  title = {microsoft/phi-4-reasoning-plus},
  howpublished = {\url{https://frozebench.com/models/microsoft-phi-4-reasoning-plus}},
  year = {2026},
  note = {FrozeBench. Retrieved 2026-06-04.}
}

URL

https://frozebench.com/models/microsoft-phi-4-reasoning-plus