Netinfo Security ›› 2025, Vol. 25 ›› Issue (7): 1074-1091. doi: 10.3969/j.issn.1671-1122.2025.07.007

• Theoretical Research •

  • Author biographies: YAN Yukun (1997—), male, from Hebei, Ph.D. candidate, CCF member; main research interests: data security and privacy protection | TANG Peng (1987—), male, from Shandong, associate professor, Ph.D.; main research interests: data security and privacy protection | CHEN Rui (1983—), male, from Sichuan, professor, Ph.D.; main research interests: data security and privacy protection | DU Ruochen (1996—), male, from Heilongjiang, Ph.D. candidate, CCF member; main research interest: defenses against data poisoning backdoor attacks | HAN Qilong (1974—), male, from Heilongjiang, professor, Ph.D., CCF member; main research interests: data security and privacy protection
  • Supported by:
    National Natural Science Foundation of China (62002203); Key Research and Development Program of Heilongjiang Province (GA23A915)

A Randomness Enhanced Bi-Level Optimization Defense Method against Data Poisoning Backdoor Attacks

YAN Yukun1,2, TANG Peng3, CHEN Rui1,2, DU Ruochen1,2, HAN Qilong1,2

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
    2. National Engineering Laboratory for Modeling and Emulation in E-Government, Harbin 150001, China
    3. School of Cyber Science and Technology, Shandong University, Qingdao 266237, China
  • Received: 2025-05-07 Online: 2025-07-10 Published: 2025-08-07
  • Contact: CHEN Rui E-mail:ruichen@hrbeu.edu.cn


Abstract:

Data poisoning backdoor attacks have revealed vulnerabilities in the security of deep neural networks, posing serious threats to their reliability in real-world applications. Although numerous defense strategies have been proposed, their practical deployment still faces two key challenges: 1) heavy reliance on prior knowledge of attacker behavior or training data characteristics, which limits the generalizability of these methods; and 2) difficulty in balancing model performance and defense effectiveness. To address these challenges, this paper proposed RADAR, a randomness enhanced bi-level optimization defense framework tailored for data poisoning backdoor attacks. Centered on data identification, RADAR integrated robust training with a sample selection mechanism. It dynamically separated clean samples from suspicious poisoned ones during training without requiring any prior knowledge, and then fine-tuned the model on the selected trusted subset to obtain a backdoor-resilient model. Specifically, RADAR combined noise-augmented self-supervised pretraining with differentially private, parameter-adaptive fine-tuning. This allowed the model to identify poisoned samples as global outliers and suppress fitting them even in extreme scenarios where they dominated the target class, thereby ensuring accurate clean-sample selection. In addition, RADAR introduced a randomized smoothing-based fitting-disentanglement strategy for clean features, which, under limited clean data conditions, effectively removed the backdoored model's ability to fit clean features, thereby reducing the false positive rate of suspicious poisoned sample identification. Extensive experiments against diverse data poisoning backdoor attacks demonstrate that RADAR not only maintains strong classification performance on clean data but also exhibits outstanding defensive capability, consistently suppressing attack success rates below 7%. These results highlight the security and practical applicability of RADAR.
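The abstract names two generic randomness mechanisms: Gaussian noise augmentation of training inputs, and differentially private fine-tuning, for which per-sample gradient clipping plus Gaussian noise (as in DP-SGD) is the standard construction. The sketch below illustrates only those generic building blocks, not RADAR's actual algorithm; all function names and parameter values are invented for illustration.

```python
import numpy as np

def noise_augment(x, sigma=0.1, rng=None):
    """Gaussian input perturbation, the generic form of noise-augmented
    training / randomized smoothing: the model sees x + N(0, sigma^2)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    return x + rng.normal(0.0, sigma, size=x.shape)

def dp_noisy_mean_gradient(per_sample_grads, clip_norm=1.0,
                           noise_multiplier=1.0, rng=None):
    """DP-SGD-style aggregation: clip each per-sample gradient to L2 norm
    clip_norm, average, then add Gaussian noise calibrated to the clipping
    bound. Clipping caps any single (possibly poisoned) sample's influence
    on the update; the noise masks whatever influence remains."""
    rng = rng if rng is not None else np.random.default_rng(0)
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_sample_grads]
    mean = np.mean(clipped, axis=0)
    sigma = noise_multiplier * clip_norm / len(per_sample_grads)
    return mean + rng.normal(0.0, sigma, size=mean.shape)
```

With noise_multiplier set to 0 the update reduces to the clipped mean, whose norm can never exceed clip_norm; this bounded-influence property is what prevents a minority of outlier gradients from steering training, which is the intuition behind using differential privacy to suppress fitting of poisoned samples.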

Key words: backdoor defenses, data poisoning, differential privacy
