Netinfo Security ›› 2026, Vol. 26 ›› Issue (1): 24-37. doi: 10.3969/j.issn.1671-1122.2026.01.002

• Review •

A Survey on the Trustworthiness of Large Language Models in the Public Security Domain: Risks, Countermeasures, and Challenges

TONG Xin1, JIAO Qiang2, WANG Jingya1, YUAN Deyu1, JIN Bo3()   

  1. School of Information and Cyber Security, People’s Public Security University of China, Beijing 100038, China
    2. Bureau of Science and Technology Information, Ministry of Public Security of the People’s Republic of China, Beijing 100741, China
    3. The Third Research Institute of the Ministry of Public Security of China, Shanghai 200031, China
  • Received: 2025-10-10 Online: 2026-01-10 Published: 2026-02-13
  • Corresponding author: JIN Bo, jinbo@gass.cn
  • About the authors: TONG Xin (1995—), male, Henan, Ph.D. candidate, CCF member; main research interest: security of large language models. JIAO Qiang (1981—), male, Hebei, master’s degree; main research interests: big data and artificial intelligence. WANG Jingya (1966—), female, Beijing, professor, master’s degree; main research interest: natural language processing. YUAN Deyu (1986—), male, Hebei, associate professor, Ph.D.; main research interests: information content security and AI security. JIN Bo (1970—), male, Shanghai, research professor, Ph.D., CCF member; main research interest: domain-specific large models.
  • Funding:
    National Key Research and Development Program of China (2023YFB3107105); Scientific Research Program of Beijing Municipal Education Commission (KM202414019003); Graduate Research and Innovation Project of People’s Public Security University of China (2025yjsky006)



Abstract:

With the rapid development of large language models (LLMs), their application potential in the public security domain has become increasingly prominent. However, issues such as insufficient capability transparency, over-alignment that weakens usability, hallucination, and security threats prevent LLMs from meeting the high-sensitivity, high-risk, and high-precision requirements of public security scenarios. This paper systematically reviews the trustworthiness of LLMs in the public security context: it surveys their current applications in tasks such as risk early warning, security incident response, internal management, and public services; defines trustworthiness and categorizes risks into three classes (internal vulnerabilities, external threats, and concomitant issues); and, drawing on the characteristics of the general foundation, private-network, and internet domains, proposes five trustworthiness dimensions: task suitability, factual accuracy, safe completion, adversarial robustness, and accountability. Following this structure, the paper reviews the corresponding enhancement strategies and open challenges, with the aim of promoting reliable, secure, and controllable applications of LLMs in the public security sector.

Key words: large language models, trustworthiness, public security

CLC number: