Jailbreak Detection for Large Language Model Based on Deep Semantic Mining
LIU Hui, ZHU Zhengdao, WANG Songhe, WU Yongcheng, HUANG Linquan
Netinfo Security (信息网络安全). 2025, (9): 1377-1384. DOI: 10.3969/j.issn.1671-1122.2025.09.006