NAACL · 2025
Diversity Helps Jailbreak Large Language Models
Weiliang Zhao, Daniel Ben-Levi, Wei Hao, Junfeng Yang, Chengzhi Mao
Weiliang Zhao, Daniel Ben-Levi, Wei Hao, Junfeng Yang, Chengzhi Mao. "Diversity Helps Jailbreak Large Language Models", Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL), 2025.