Netinfo Security ›› 2024, Vol. 24 ›› Issue (9): 1396-1408. doi: 10.3969/j.issn.1671-1122.2024.09.008

• Theoretical Research •

A Prompt-Focused Privacy Evaluation and Obfuscation Method for Large Language Model

JIAO Shiqin, ZHANG Guiyang, LI Guoqi

  1. School of Reliability and Systems Engineering, Beihang University, Beijing 100191, China
  • Received: 2024-04-11  Online: 2024-09-10  Published: 2024-09-27
  • Corresponding author: LI Guoqi, gqli@buaa.edu.cn
  • About the authors: JIAO Shiqin (born 2000), female, from Shanxi, M.S. candidate; her main research interests include multimodal computing and privacy protection for large models. ZHANG Guiyang (born 2000), male, from Anhui, M.S. candidate; his main research interests include control system security and privacy protection for large models. LI Guoqi (born 1977), male, from Shandong, lecturer, Ph.D.; his main research interests include aviation safety, information security, and unmanned aerial vehicle systems.
  • Supported by:
    Fund of the Key Laboratory of Reliability and Environmental Engineering Technology (10100002019114012)


Abstract:

Although large language models (LLM) perform impressively in semantic understanding, frequent user interactions introduce many privacy risks. This paper evaluated the privacy of existing LLM through partial recall attacks and simulated inference games. The findings indicate that common LLM still face two challenging privacy risks: data anonymization may degrade the quality of model responses, and potential privacy information can still be inferred through reasoning. To address these challenges, this paper proposed a prompt-focused privacy evaluation and obfuscation method for large language models. The method unfolds as a structured process comprising initial description decomposition, fabricated description generation, and description obfuscation. The experimental results show that the proposed method effectively enhances privacy protection: compared with existing methods, the normalized Levenshtein distance, Jaccard similarity, and cosine similarity between the model responses before and after processing all decrease to some extent. Additionally, the approach significantly limits the privacy inference capabilities of LLM, with inference accuracy dropping from 97.14% without processing to 34.29%. This study not only deepens the understanding of privacy risks in LLM interactions but also introduces a comprehensive approach to enhancing user privacy security, effectively addressing the two aforementioned challenging privacy risk scenarios.
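As a minimal illustration (not the authors' implementation), the three response-similarity metrics reported in the abstract could be computed between a model response to the original prompt and a response to the obfuscated prompt roughly as sketched below; the whitespace tokenization, function names, and example strings are assumptions made for this sketch only.

    # Sketch: the three similarity metrics named in the abstract, computed
    # between two model responses. Pure Python, no external dependencies.
    from collections import Counter
    from math import sqrt

    def normalized_levenshtein(a: str, b: str) -> float:
        """Levenshtein edit distance divided by the length of the longer string."""
        if not a and not b:
            return 0.0
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                curr.append(min(prev[j] + 1,                 # deletion
                                curr[j - 1] + 1,             # insertion
                                prev[j - 1] + (ca != cb)))   # substitution
            prev = curr
        return prev[-1] / max(len(a), len(b))

    def jaccard_similarity(a: str, b: str) -> float:
        """Token-set overlap: |A ∩ B| / |A ∪ B| over whitespace tokens."""
        sa, sb = set(a.split()), set(b.split())
        return len(sa & sb) / len(sa | sb) if (sa or sb) else 1.0

    def cosine_similarity(a: str, b: str) -> float:
        """Cosine similarity between term-frequency vectors of the two texts."""
        va, vb = Counter(a.split()), Counter(b.split())
        dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
        norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
        return dot / norm if norm else 0.0

    if __name__ == "__main__":
        # Hypothetical responses to an original and an obfuscated prompt.
        original_response = "The patient lives in Berlin and works as a nurse."
        obfuscated_response = "The person lives in a large European city and works in healthcare."
        print("normalized Levenshtein:", normalized_levenshtein(original_response, obfuscated_response))
        print("Jaccard similarity:    ", jaccard_similarity(original_response, obfuscated_response))
        print("cosine similarity:     ", cosine_similarity(original_response, obfuscated_response))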

Key words: privacy risk, LLM, prompt engineering, description obfuscation

CLC number: