Netinfo Security ›› 2025, Vol. 25 ›› Issue (7): 1074-1091.doi: 10.3969/j.issn.1671-1122.2025.07.007


A Randomness Enhanced Bi-Level Optimization Defense Method against Data Poisoning Backdoor Attacks

YAN Yukun1,2, TANG Peng3, CHEN Rui1,2(), DU Ruochen1,2, HAN Qilong1,2   

  1. College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China
    2. National Engineering Laboratory for Modeling and Emulation in E-Government, Harbin 150001, China
    3. School of Cyber Science and Technology, Shandong University, Qingdao 266237, China
  • Received: 2025-05-07  Online: 2025-07-10  Published: 2025-08-07
  • Contact: CHEN Rui  E-mail: ruichen@hrbeu.edu.cn

Abstract:

Data poisoning backdoor attacks have exposed vulnerabilities in deep neural networks, posing serious threats to their reliability in real-world applications. Although numerous defense strategies have been proposed, their practical deployment still faces two key challenges: 1) heavy reliance on prior knowledge of attacker behavior or training data characteristics, which limits the generalizability of these methods; and 2) difficulty in balancing model performance against defense effectiveness. To address these challenges, this paper proposed RADAR, a randomness-enhanced bi-level optimization defense framework tailored to data poisoning backdoor attacks. Centered on data identification, RADAR organically integrated robust training with a sample selection mechanism. It dynamically distinguished clean samples from suspicious poisoned samples during training without requiring any prior knowledge, and subsequently fine-tuned the model on a trusted subset to obtain a backdoor-resilient model. Specifically, RADAR combined noise-augmented self-supervised pretraining with differentially private, parameter-adaptive fine-tuning. This allowed the model to identify poisoned samples as global outliers even in extreme scenarios where they dominated the target class, thereby ensuring accurate clean-sample selection. In addition, RADAR introduced a randomized smoothing-based disentangled training strategy for clean features under limited clean-data conditions, effectively reducing the false positive rate in suspicious poisoned sample identification. Extensive experiments across diverse data poisoning backdoor attacks demonstrate that RADAR not only maintains strong classification performance on clean data but also exhibits outstanding defensive capability, consistently suppressing attack success rates to below 7%. These results highlight the security and practical applicability of RADAR.
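The abstract's core loop (separate clean from suspicious samples during training, then fine-tune only on the trusted subset) can be sketched generically. The loss-ranking criterion below is an illustrative stand-in, not RADAR's actual selection rule, which the abstract does not specify; the function names and the trust fraction are assumptions for the sketch.

```python
import numpy as np

def select_trusted_subset(per_sample_losses, trust_fraction=0.5):
    """Rank training samples by per-sample loss and keep the lowest-loss
    fraction as the trusted (presumed-clean) subset; flag the rest as
    suspicious. This is a generic separation heuristic standing in for
    RADAR's identification step, which the abstract does not detail."""
    losses = np.asarray(per_sample_losses, dtype=float)
    n_trusted = max(1, int(trust_fraction * len(losses)))
    order = np.argsort(losses)            # ascending: low loss first
    trusted = order[:n_trusted].tolist()  # indices to fine-tune on
    suspicious = order[n_trusted:].tolist()
    return trusted, suspicious

# Toy example: poisoned samples often separate from clean ones as
# loss-space outliers (indices 2 and 4 here).
losses = [0.10, 0.20, 2.50, 0.15, 3.00, 0.12]
trusted, suspicious = select_trusted_subset(losses, trust_fraction=2 / 3)
```

In a full pipeline, the trusted indices would feed a fine-tuning pass and the separation would be recomputed each round, so the two levels of the optimization inform each other.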

Key words: backdoor defenses, data poisoning, differential privacy

CLC Number: