FrozeBench

EQ-Bench

eq_bench
Emotional Intelligence

EQ-Bench (Emotional Intelligence Benchmark) is a 60-question English benchmark that asks models to predict, on a 0-10 intensity scale, the emotional states experienced by characters in short dialogue scenarios. The model output is compared against author-defined reference intensities and scored by a normalized error metric. The original paper reports a Pearson correlation of approximately 0.97 between EQ-Bench and MMLU scores, and the benchmark has been used as a lightweight, fast-to-run signal of emotional reasoning ability in conversational language models.
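The comparison described above can be illustrated with a minimal sketch. It is not the official EQ-Bench scoring formula, which this page does not specify; it assumes one plausible normalized error metric (mean absolute error divided by the 0-10 scale width). The function names and the emotion labels are hypothetical and chosen only for illustration.

```python
from statistics import mean


def question_score(predicted, reference, max_intensity=10.0):
    """Score one question in [0, 1]; 1.0 means every intensity matches.

    ASSUMPTION: mean absolute error normalized by the scale width.
    The real benchmark may normalize differently.
    """
    if predicted.keys() != reference.keys():
        raise ValueError("predicted and reference must rate the same emotions")
    # Average the per-emotion absolute differences for this question.
    mae = mean(abs(predicted[e] - reference[e]) for e in reference)
    return 1.0 - mae / max_intensity


def benchmark_score(predictions, references):
    """Average the per-question scores and scale to 0-100."""
    return 100.0 * mean(
        question_score(p, r) for p, r in zip(predictions, references)
    )


# Hypothetical example: one dialogue, four rated emotions.
reference = {"surprise": 2, "anger": 8, "relief": 0, "embarrassment": 5}
predicted = {"surprise": 3, "anger": 6, "relief": 0, "embarrassment": 5}
print(question_score(predicted, reference))
```

Under this assumed metric, a model that is off by a point or two on half of the emotions still scores above 0.9 on the question, which suggests why small differences between models can be hard to separate.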

Source paper
Latest run: 2026-05-25

Benchmark results

28 models

Caveats

The most fundamental concern is sample size. With only 60 questions, run-to-run variance is high, scores are sensitive to sampling temperature and minor prompt-template changes, and small deltas between models on a single run should be treated as noise rather than signal.

There is also no human-cohort baseline. Reference answers are author-defined with no inter-rater reliability check, so "correct" emotional intensity has no external anchor, and the ground truth itself reflects one annotator's intuitions about emotional plausibility.

Construct validity is the deeper open question. The reported r≈0.97 correlation with MMLU suggests EQ-Bench may be measuring general language-model capability rather than emotional intelligence specifically: if a benchmark moves in lockstep with broad-knowledge multiple-choice scores, its claim to test a distinct capability is weak.

The dialogues are also entirely synthetic and were generated by GPT-4, which can impose stylistic homogeneity and GPT-4-era biases on what counts as emotionally plausible behavior. Finally, the benchmark is English-only, and emotional norms are culturally specific, so EQ-Bench scores do not generalize to emotional reasoning across cultures or in non-English deployment contexts.
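The sample-size concern can be made concrete with a percentile bootstrap over per-question scores. The sketch below is illustrative only: `bootstrap_ci` is a hypothetical helper, and the 60 scores are synthetic random values, not results from any model. It also captures only the uncertainty from which questions happen to be in the set; variance from sampling temperature or prompt templates would come on top of it.

```python
import random
from statistics import mean


def bootstrap_ci(per_question_scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean score."""
    rng = random.Random(seed)
    n = len(per_question_scores)
    # Resample the questions with replacement and record each resample's mean.
    means = sorted(
        mean(rng.choices(per_question_scores, k=n)) for _ in range(n_resamples)
    )
    lower = means[round((alpha / 2) * n_resamples)]
    upper = means[round((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# SYNTHETIC data: 60 made-up per-question scores in [0.4, 1.0].
data_rng = random.Random(1)
scores = [data_rng.uniform(0.4, 1.0) for _ in range(60)]

low, high = bootstrap_ci(scores)
print(f"mean={mean(scores):.3f}  95% CI=({low:.3f}, {high:.3f})")
```

If two models' intervals overlap substantially, a difference between their single-run scores is hard to distinguish from noise, which is the reading the caveat above recommends.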

How to cite

Citation

FrozeBench. "EQ-Bench." https://frozebench.com/benchmarks/eq-bench. Retrieved 2026-06-04.

BibTeX

@misc{frozebench_eq_bench,
  title = {EQ-Bench},
  howpublished = {\url{https://frozebench.com/benchmarks/eq-bench}},
  year = {2026},
  note = {FrozeBench. Retrieved 2026-06-04.}
}

URL

https://frozebench.com/benchmarks/eq-bench