NAACL · 2025

Diversity Helps Jailbreak Large Language Models

Weiliang Zhao, Daniel Ben-Levi, Wei Hao, Junfeng Yang, Chengzhi Mao

Weiliang Zhao, Daniel Ben-Levi, Wei Hao, Junfeng Yang, Chengzhi Mao. "Diversity Helps Jailbreak Large Language Models", Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL), 2025.