Netinfo Security ›› 2020, Vol. 20 ›› Issue (9): 12-16.doi: 10.3969/j.issn.1671-1122.2020.09.003

A Generation Method of Word-level Adversarial Samples for Chinese Text Classification

TONG Xin1, WANG Luona2, WANG Runzheng1, WANG Jingya1

  1. College of Information and Cyber Security, People’s Public Security University of China, Beijing 100038, China
  2. Beijing Bytedance Technology Co., Ltd., Beijing 100000, China
  • Received: 2020-07-16 Online: 2020-09-10 Published: 2020-10-15
  • Contact: WANG Jingya E-mail: wangjingya@ppsuc.edu.cn

Abstract:

To evaluate the robustness of deep-learning-based Chinese text classification models, this paper proposes CWordAttacker, a word-level black-box method for generating adversarial samples. The algorithm uses a targeted deletion scoring mechanism to locate the key words that most strongly affect the classification result, without requiring knowledge of the model's internal details. It then applies several attack strategies, such as Traditional Chinese and Pinyin replacement, to generate adversarial samples that remain semantically consistent with the original sentence, and it supports both targeted and non-targeted attacks. Experiments on LSTM, TextCNN, and CNN-with-attention models over sentiment, spam message, and news classification datasets show that CWordAttacker greatly reduces the accuracy of the target models with only small perturbations.
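
As a rough illustration of the idea described in the abstract, the following minimal sketch scores each word by how much the classifier's confidence drops when that word is deleted, then replaces the highest-scoring words with a visually or phonetically similar variant. It is not the authors' implementation: the function names (`deletion_scores`, `attack`), the `predict` interface, the toy classifier, and the one-entry substitution table are all assumptions made for illustration, and only the non-targeted mode is shown.

```python
from typing import Callable, Dict, List, Sequence

# Assumed black-box interface: words in, class probabilities out.
Classifier = Callable[[Sequence[str]], List[float]]


def deletion_scores(words: Sequence[str], predict: Classifier, label: int) -> List[float]:
    """Score each word by the drop in P(label) when that word is deleted."""
    base = predict(words)[label]
    scores = []
    for i in range(len(words)):
        reduced = list(words[:i]) + list(words[i + 1:])
        scores.append(base - predict(reduced)[label])
    return scores


def attack(words: Sequence[str], predict: Classifier, label: int,
           substitutes: Dict[str, str], max_changes: int) -> List[str]:
    """Non-targeted attack: replace the most important words until the label flips."""
    scores = deletion_scores(words, predict, label)
    # Visit word positions from most to least important.
    order = sorted(range(len(words)), key=lambda i: scores[i], reverse=True)
    adversarial = list(words)
    changes = 0
    for i in order:
        if changes >= max_changes:
            break
        if adversarial[i] not in substitutes:
            continue  # no known variant for this word (hypothetical table)
        adversarial[i] = substitutes[adversarial[i]]
        changes += 1
        probs = predict(adversarial)
        if max(range(len(probs)), key=probs.__getitem__) != label:
            break  # the predicted class changed, so stop perturbing
    return adversarial


def toy_predict(words: Sequence[str]) -> List[float]:
    """Toy stand-in for a real model: returns [P(positive), P(negative)]."""
    p_negative = 0.9 if "烂" in words else 0.1
    return [1.0 - p_negative, p_negative]


words = ["这", "电影", "很", "烂"]
# Hypothetical substitution table: Simplified -> Traditional form.
adversarial = attack(words, toy_predict, label=1,
                     substitutes={"烂": "爛"}, max_changes=1)
print(adversarial)
```

The sketch only queries the model's output probabilities, which is what makes it black-box. A real attack would need a word segmenter, a full Simplified-to-Traditional or Pinyin mapping, and a perturbation budget tuned to keep the text readable.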

Key words: adversarial samples, natural language processing, Chinese text classification, black-box attack, AI security
