Netinfo Security ›› 2024, Vol. 24 ›› Issue (4): 545-554. doi: 10.3969/j.issn.1671-1122.2024.04.005


Defense Scheme for Removing Deep Neural Network Backdoors Based on JSMA Adversarial Attacks

ZHANG Guanghua1,2, LIU Yichun2, WANG He1, HU Boning2

  1. School of Cyber Engineering, Xidian University, Xi’an 710071, China
  2. School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang 050018, China
  • Received: 2023-09-10 Online: 2024-04-10 Published: 2024-05-16

Abstract:

Deep learning models lack transparency and interpretability, and abnormal behavior triggered by malicious attacks at inference time can degrade their performance. To address this problem, this paper proposes a defense scheme that removes deep neural network (DNN) backdoors based on the JSMA (Jacobian-based Saliency Map Attack) adversarial attack. First, the perturbations generated by simulating JSMA are used to restore the pattern of the hidden backdoor trigger. Second, a heatmap is used to locate the weights associated with the restored trigger. Finally, a ridge regression function is used to reset those weights to zero, which removes the backdoor from the DNN. The scheme is tested on the MNIST and CIFAR10 datasets, and model performance is evaluated after backdoor removal. The experimental results show that the scheme effectively removes backdoors from DNN models while reducing the DNN's test accuracy by less than 3%.
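For readers unfamiliar with JSMA, the saliency map at its core can be sketched as follows. This is a minimal illustration of the standard JSMA saliency computation on a toy linear softmax classifier, not the implementation used in the paper; the model, the function names, and the example values are all assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D logit vector.
    e = np.exp(z - z.max())
    return e / e.sum()

def jacobian(W, b, x):
    # Jacobian of F(x) = softmax(W @ x + b) with respect to x.
    # Entry [j, i] is dF_j/dx_i = p_j * (W[j, i] - sum_k p_k * W[k, i]).
    p = softmax(W @ x + b)
    return p[:, None] * (W - p @ W)

def jsma_saliency(J, target):
    # Standard JSMA saliency for a target class: a feature is salient
    # only if increasing it raises the target output (alpha > 0) and
    # lowers the summed output of all other classes (beta < 0).
    alpha = J[target]
    beta = J.sum(axis=0) - alpha
    mask = (alpha > 0) & (beta < 0)
    return np.where(mask, alpha * np.abs(beta), 0.0)

# Toy (hypothetical) model: 2 classes, 2 input features.
W = np.eye(2)
b = np.zeros(2)
x = np.array([0.5, 0.5])

S = jsma_saliency(jacobian(W, b, x), target=0)
most_salient = int(np.argmax(S))  # feature that JSMA would perturb first
```

In this toy case only feature 0 receives a nonzero saliency score for target class 0, so it is the feature a JSMA step would perturb. How the paper turns such perturbations into a restored trigger pattern is not specified in the abstract and is not modeled here.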

Key words: deep learning model, adversarial attack, JSMA, ridge regression function
