Evaluated on 13 benchmarks. Ranks in the top 3 on 4 of 13. Highest score on IFEval (87.1% prompt_level_strict_acc, #7 of 28). Lowest score on MBPP(Instruct) (0.0% pass@1, #9 of 24).
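The summary figures can be reproduced from the per-benchmark scores and ranks reported on this page. The sketch below is illustrative only: the tuple layout and the `summarize` helper are hypothetical, and it assumes "strongest" and "weakest" mean highest and lowest raw score, which matches the numbers here but is not stated by the page. Note that EQ-Bench is reported on its own scale rather than as a percentage, so comparing raw scores across metrics is approximate.

```python
# (benchmark, score, rank, field size) copied from this page's results.
RESULTS = [
    ("IFEval", 87.1, 7, 28),
    ("MGSM", 83.2, 1, 28),
    ("MMLU-Pro", 80.8, 4, 28),
    ("MBPP", 79.6, 2, 28),
    ("EQ-Bench", 76.8, 15, 28),
    ("GSM8K", 73.1, 16, 28),
    ("MMLU", 64.0, 18, 28),
    ("LongBench", 49.2, 1, 10),
    ("GPQA Main", 42.4, 3, 28),
    ("GPQA Extended", 40.8, 4, 28),
    ("GPQA Diamond", 38.9, 5, 28),
    ("BBH", 2.3, 21, 28),
    ("MBPP(Instruct)", 0.0, 9, 24),
]


def summarize(results, top_k=3):
    """Hypothetical helper: count top-k placements and find score extremes."""
    top = [name for name, _score, rank, _field in results if rank <= top_k]
    strongest = max(results, key=lambda row: row[1])  # assumed: highest raw score
    weakest = min(results, key=lambda row: row[1])    # assumed: lowest raw score
    return {
        "benchmarks": len(results),
        "top_k": top,
        "strongest": strongest[0],
        "weakest": weakest[0],
    }


summary = summarize(RESULTS)
print(summary)
```

Ranking by raw score rather than by leaderboard position explains why IFEval is listed as the strongest result even though the model places #7 there and #1 on MGSM and LongBench.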
| Benchmark | Metric | Score | Rank |
|---|---|---|---|
| IFEval | prompt_level_strict_acc | 87.1%±1.4% | #7 of 28 |
| MGSM | exact_match | 83.2%±0.7% | #1 of 28 |
| MMLU-Pro | exact_match | 80.8%±0.3% | #4 of 28 |
| MBPP | pass@1 | 79.6%±1.8% | #2 of 28 |
| EQ-Bench | eqbench | 76.8±1.7 | #15 of 28 |
| GSM8K | exact_match | 73.1%±1.2% | #16 of 28 |
| MMLU | acc | 64.0%±0.4% | #18 of 28 |
| LongBench | score | 49.2%±0.5% | #1 of 10 |
| GPQA Main | acc | 42.4%±2.3% | #3 of 28 |
| GPQA Extended | acc | 40.8%±2.1% | #4 of 28 |
| GPQA Diamond | acc | 38.9%±3.5% | #5 of 28 |
| BBH | exact_match | 2.3%±0.2% | #21 of 28 |
| MBPP(Instruct) | pass@1 | 0.0%±0.0% | #9 of 24 |
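The page does not say what the ± values represent. One plausible reading, offered here as an assumption, is the standard error of a proportion, sqrt(p(1 - p) / n). The sketch below checks that reading against two rows. The sample sizes (541 prompts for IFEval, 198 questions for GPQA Diamond) are the commonly cited public dataset sizes and do not come from this page; `proportion_stderr` is a hypothetical helper name.

```python
import math


def proportion_stderr(p, n):
    """Standard error of a proportion p estimated from n independent items."""
    return math.sqrt(p * (1.0 - p) / n)


# Assumed sample sizes (not from this page): IFEval 541, GPQA Diamond 198.
ifeval_se = proportion_stderr(0.871, 541)
gpqa_diamond_se = proportion_stderr(0.389, 198)

# Express as percentage points, rounded the way the page reports them.
print(round(100 * ifeval_se, 1))
print(round(100 * gpqa_diamond_se, 1))
```

Under these assumptions the two values round to the 1.4 and 3.5 points reported for those rows, which is consistent with, though not proof of, a standard-error interpretation. It also suggests why the smaller GPQA subsets carry wider margins than MMLU-Pro.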
Citation
FrozeBench. "Qwen/Qwen3-Next-80B-A3B-Instruct." https://frozebench.com/models/qwen-qwen3-next-80b-a3b-instruct. Retrieved 2026-06-04.
BibTeX
@misc{frozebench_Qwen_Qwen3_Next_80B_A3B_Instruct,
title = {Qwen/Qwen3-Next-80B-A3B-Instruct},
howpublished = {\url{https://frozebench.com/models/qwen-qwen3-next-80b-a3b-instruct}},
year = {2026},
note = {FrozeBench. Retrieved 2026-06-04.}
}
URL
https://frozebench.com/models/qwen-qwen3-next-80b-a3b-instruct