Netinfo Security ›› 2025, Vol. 25 ›› Issue (12): 1878-1888. doi: 10.3969/j.issn.1671-1122.2025.12.004


Detecting Poisoned Samples for Untargeted Backdoor Attacks

PANG Shuchao1, LI Zhengxiao1, QU Junyi1, MA Ruhao1, CHEN Hechang2, DU Anan3

    1. School of Cyber Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
    2. School of Artificial Intelligence, Jilin University, Changchun 130012, China
    3. School of Computer and Software, Nanjing University of Industry Technology, Nanjing 210023, China
  • Received: 2025-10-05 Online: 2025-12-10 Published: 2026-01-06
  • Contact: DU Anan E-mail: anan.du@niit.edu.cn

Abstract:

Backdoor attacks, an important form of data poisoning, pose a significant threat to the reliability of datasets and the security of model training. Existing defenses focus mainly on targeted backdoor attacks, and untargeted backdoor attacks remain little studied. This study proposes a poisoned-sample detection method for untargeted backdoor attacks: a black-box method that identifies potential untargeted backdoor samples from anomalies in prediction behavior. The method consists of two modules. The first is a poisoned-sample detection module based on prediction-behavior anomalies, which flags suspicious samples according to the discrepancy between the model's predictions on the original samples and on their reconstructions. The second is a diffusion-model-based data generation module for poisoned samples, which generates a new dataset that is similar to the original dataset but free of triggers. Experiments with different types of untargeted backdoor attacks and different generative models demonstrate the feasibility of the method. They also show the potential and practical value of generative models, especially diffusion models, for backdoor detection and defense.
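
The detection idea summarized above (compare a black-box model's predictions on each original sample with its predictions on a trigger-free reconstruction) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `flag_suspicious`, the use of total variation distance as the discrepancy score, the threshold of 0.5, and the toy softmax classifier are all assumptions made for illustration, and the reconstructions are supplied directly rather than produced by a diffusion model.

```python
import numpy as np


def flag_suspicious(predict_proba, originals, reconstructions, threshold=0.5):
    """Flag samples whose predictions change sharply after reconstruction.

    predict_proba is treated as a black box that maps a batch of samples to
    class probabilities. The score is the total variation distance between
    the two probability vectors (an assumed choice, not taken from the paper).
    """
    p_orig = np.asarray(predict_proba(originals), dtype=float)
    p_rec = np.asarray(predict_proba(reconstructions), dtype=float)
    scores = 0.5 * np.abs(p_orig - p_rec).sum(axis=1)
    return scores, scores > threshold


def toy_model(x):
    """Hypothetical stand-in classifier: softmax over the raw features."""
    x = np.asarray(x, dtype=float)
    z = np.exp(x - x.max(axis=1, keepdims=True))
    return z / z.sum(axis=1, keepdims=True)


# Two samples whose reconstructions are identical to the originals, and one
# sample whose reconstruction leads to a very different prediction.
originals = np.array([[2.0, 0.0], [0.0, 2.0], [0.0, 5.0]])
reconstructions = np.array([[2.0, 0.0], [0.0, 2.0], [5.0, 0.0]])

scores, flags = flag_suspicious(toy_model, originals, reconstructions)
print(flags)
```

In this toy setting only the third sample is flagged, because it is the only one whose predicted distribution shifts after reconstruction. In practice the threshold would have to be calibrated on data, since reconstruction by a generative model also perturbs the predictions of clean samples to some degree.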

Key words: data security, untargeted backdoor attacks, image recognition, generative models, deep learning

CLC Number: