Nature · 2026
Humanity's Last Exam: A Benchmark of Expert-Level Academic Questions to Assess AI Capabilities
Long Phan, ..., Wei Hao, ..., Dan Hendrycks
Long Phan, Alice Gatti, Nathaniel Li, et al. (including Wei Hao). "Humanity's Last Exam: A Benchmark of Expert-Level Academic Questions to Assess AI Capabilities." Nature, vol. 649 (2026). https://doi.org/10.1038/s41586-025-09962-4