Evaluated across 13 benchmarks, with no top-3 rank on any of them. Strongest showing on IFEval (85.4% prompt_level_strict_acc, #8 of 28); weakest on MBPP(Instruct) (0.0% pass@1, #17 of 24).
| Benchmark | Metric | Score | Rank |
|---|---|---|---|
| IFEval | prompt_level_strict_acc | 85.4%±1.5% | #8 of 28 |
| GSM8K | exact_match | 81.5%±1.1% | #11 of 28 |
| EQ-Bench | eqbench | 80.8±1.8 | #6 of 28 |
| MGSM | exact_match | 74.9%±0.7% | #5 of 28 |
| MMLU-Pro | exact_match | 60.4%±0.4% | #14 of 28 |
| BBH | exact_match | 49.7%±0.5% | #7 of 28 |
| MBPP | pass@1 | 48.6%±2.2% | #14 of 28 |
| LongBench | score | 35.5%±0.4% | #6 of 10 |
| GPQA Diamond | acc | 28.3%±3.2% | #18 of 28 |
| MMLU | acc | 27.1%±0.4% | #26 of 28 |
| GPQA Main | acc | 26.1%±2.1% | #20 of 28 |
| GPQA Extended | acc | 25.8%±1.9% | #21 of 28 |
| MBPP(Instruct) | pass@1 | 0.0%±0.0% | #17 of 24 |
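
The summary figures can be re-derived from the table rows. The sketch below is a minimal illustration, not FrozeBench's actual method: it assumes "strongest" and "weakest" mean the highest and lowest raw score, and that "top 3" means a rank of 3 or better. The `rows` list and the `summarize` helper are hypothetical names introduced here; the values are transcribed from the table.

```python
# Rows transcribed from the table: (benchmark, score, rank, models_ranked).
# Scores are the reported point estimates without the ± term.
rows = [
    ("IFEval", 85.4, 8, 28),
    ("GSM8K", 81.5, 11, 28),
    ("EQ-Bench", 80.8, 6, 28),
    ("MGSM", 74.9, 5, 28),
    ("MMLU-Pro", 60.4, 14, 28),
    ("BBH", 49.7, 7, 28),
    ("MBPP", 48.6, 14, 28),
    ("LongBench", 35.5, 6, 10),
    ("GPQA Diamond", 28.3, 18, 28),
    ("MMLU", 27.1, 26, 28),
    ("GPQA Main", 26.1, 20, 28),
    ("GPQA Extended", 25.8, 21, 28),
    ("MBPP(Instruct)", 0.0, 17, 24),
]


def summarize(rows, top_n=3):
    """Hypothetical helper: recompute the page's summary line from the rows.

    Assumes 'strongest'/'weakest' are picked by raw score, which may not be
    how the site itself selects them.
    """
    strongest = max(rows, key=lambda r: r[1])
    weakest = min(rows, key=lambda r: r[1])
    in_top = sum(1 for r in rows if r[2] <= top_n)
    return {
        "benchmarks": len(rows),
        "top_n_count": in_top,
        "strongest": strongest[0],
        "weakest": weakest[0],
    }


summary = summarize(rows)
print(summary)
```

Under these assumptions the result agrees with the summary line: 13 benchmarks, none in the top 3, IFEval strongest, MBPP(Instruct) weakest. Note that EQ-Bench is reported as a plain score rather than a percentage, so comparing raw scores across benchmarks is only a rough ordering.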
Citation
FrozeBench. "Qwen/Qwen3.5-122B-A10B-NVFP4." https://frozebench.com/models/qwen-qwen3-5-122b-a10b-nvfp4. Retrieved 2026-06-04.
BibTeX
@misc{frozebench_Qwen_Qwen3_5_122B_A10B_NVFP4,
title = {Qwen/Qwen3.5-122B-A10B-NVFP4},
howpublished = {\url{https://frozebench.com/models/qwen-qwen3-5-122b-a10b-nvfp4}},
year = {2026},
note = {FrozeBench. Retrieved 2026-06-04.}
}
URL
https://frozebench.com/models/qwen-qwen3-5-122b-a10b-nvfp4