A Generation Method of Word-level Adversarial Samples for Chinese Text Classification

doi:10.3969/j.issn.1671-1122.2020.09.003

Abstract

To assess the robustness of deep-learning-based Chinese text classification models, this paper proposes CWordAttacker, a word-level black-box method for generating adversarial samples. The algorithm uses a targeted deletion scoring mechanism to locate the key words that most strongly affect the classification result, without access to the model's internal details. It then applies several attack strategies, such as traditional Chinese and Pinyin replacement, to generate adversarial samples that preserve the semantics of the original sentence, and it supports both targeted and non-targeted attacks. Experiments on LSTM, TextCNN, and CNN with attention, using sentiment, spam message, and news classification datasets, show that CWordAttacker greatly reduces the accuracy of the target models with little perturbation.
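
The abstract does not give the scoring formula, so the following is only a minimal sketch of how deletion-based scoring could work in a black-box setting. It assumes the score of a word is the change in class probability observed when that word is removed: the drop in the originally predicted class for a non-targeted attack, or the rise in the target class for a targeted attack. The names `toy_classifier` and `deletion_scores`, the cue words, and the word segmentation are hypothetical and stand in for the real model and tokenizer.

```python
from typing import Callable, Dict, List, Optional, Tuple

Probs = Dict[str, float]


def toy_classifier(words: List[str]) -> Probs:
    """Hypothetical black-box model: it exposes class probabilities only."""
    negative_cues = {"晕", "累"}  # illustrative cue words, not from the paper
    p_neg = 0.9 if any(w in negative_cues for w in words) else 0.2
    return {"negative": p_neg, "positive": 1.0 - p_neg}


def deletion_scores(
    words: List[str],
    predict: Callable[[List[str]], Probs],
    target: Optional[str] = None,
) -> List[Tuple[str, float]]:
    """Rank words by how much deleting each one shifts the prediction."""
    base = predict(words)
    original_label = max(base, key=base.get)
    scored = []
    for i, word in enumerate(words):
        # Query the model on the sentence with the i-th word removed.
        probs = predict(words[:i] + words[i + 1:])
        if target is None:
            # Non-targeted (assumed): loss of confidence in the original label.
            score = base[original_label] - probs[original_label]
        else:
            # Targeted (assumed): gain in probability of the target label.
            score = probs[target] - base[target]
        scored.append((word, score))
    # Highest-scoring words are the candidates for perturbation.
    return sorted(scored, key=lambda item: item[1], reverse=True)


words = ["讲课", "真心", "是", "力气活", "晕"]  # hypothetical segmentation
ranking = deletion_scores(words, toy_classifier)
print(ranking[0][0])
```

Only model outputs are queried, which is what makes the procedure black-box; the cost is one extra query per word in the sentence.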

Key words: adversarial samples, natural language processing, Chinese text classification, black-box attack, AI security

CLC Number:

TP309

TONG Xin, WANG Luona, WANG Runzheng, WANG Jingya. A Generation Method of Word-level Adversarial Samples for Chinese Text Classification[J]. Netinfo Security, 2020, 20(9): 12-16.

Figures/Tables 6

References 13

[1]	SZEGEDY C, ZAREMBA W, SUTSKEVER I, et al. Intriguing Properties of Neural Networks[EB/OL]. https://arxiv.org/abs/1312.6199, 2013-5-1.
[2]	PAN Wenwen, WANG Xinyu, SONG Mingli, et al. Survey on Generating Adversarial Examples[J]. Journal of Software, 2020,31(1):67-81.
	潘文雯, 王新宇, 宋明黎, 等. 对抗样本生成技术综述[J]. 软件学报, 2020,31(1):67-81.
[3]	WANG Wenqi, WANG Lina, WANG Run, et al. Towards a Robust Deep Neural Network in Text Domain: A Survey[EB/OL]. https://arxiv.org/abs/1902.07285, 2019-1-3.
[4]	BELINKOV Y, BISK Y. Synthetic and Natural Noise both Break Neural Machine Translation[EB/OL]. https://arxiv.org/abs/1711.02173, 2017-2-24.
[5]	GAO Ji, LANCHANTIN J, SOFFA ML, et al. Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers[C]// IEEE. 2018 IEEE Security and Privacy Workshops (SP Workshops 2018), May 24, 2018, San Francisco, CA, USA. New York: IEEE, 2018: 50-56.
[6]	JIN Di, JIN Zhijing, ZHOU Tianyi, et al. Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment[EB/OL]. https://arxiv.org/abs/1907.11932, 2019-12-30.
[7]	WANG Wenqi, WANG Run, WANG Lina, et al. Adversarial Examples Generation Approach for Tendency Classification on Chinese Texts[J]. Journal of Software, 2019,30(8):2415-2427.
	王文琦, 汪润, 王丽娜, 等. 面向中文文本倾向性分类的对抗样本生成方法[J]. 软件学报, 2019,30(8):2415-2427.
[8]	PAPERNOT N, MCDANIEL P, SWAMI A, et al. Crafting Adversarial Input Sequences for Recurrent Neural Networks[C]// IEEE. 2016 IEEE Military Communications Conference (MILCOM 2016), November 1-3, 2016, Baltimore, MD, USA. New York: IEEE, 2016: 49-54.
[9]	JIA Robin, LIANG Percy. Adversarial Examples for Evaluating Reading Comprehension Systems[EB/OL]. https://arxiv.org/abs/1707.07328, 2017-7-23.
[10]	HOCHREITER S, SCHMIDHUBER J. Long Short-term Memory[J]. Neural Computation, 1997,9(8):1735-1780. doi: 10.1162/neco.1997.9.8.1735. PMID: 9377276.
[11]	KIM Y. Convolutional Neural Networks for Sentence Classification[C]// ACL. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), October 25-29, 2014, Doha, Qatar. Stroudsburg, PA: ACL, 2014: 1746-1751.
[12]	VASWANI A, SHAZEER N, PARMAR N, et al. Attention is All You Need[C]// NIPS. Advances in Neural Information Processing Systems (NIPS 2017), December 4-9, 2017, Long Beach, CA, USA. Cambridge, MA: MIT Press, 2017: 5998-6008.
[13]	SUN Maosong, LI Jingyang, GUO Zhipeng, et al. THUCTC: An Efficient Chinese Text Classifier[EB/OL]. http://thuctc.thunlp.org. 2016-12-30.
	孙茂松, 李景阳, 郭志芃, 等. THUCTC:一个高效的中文文本分类工具包[EB/OL]. http://thuctc.thunlp.org. 2016-12-30.

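Type	Sample	Predicted label	Confidence
Original text	口干舌燥讲课真心是个力气活晕	Negative sentiment	99.49%
Adversarial sample	口干舌燥讲课真心是个力气活暈	Positive sentiment	87.28%
Original text	欢迎致电中山市美亦纸类制品厂	Spam message	99.91%
Adversarial sample	歡迎电致中山市美亦纸类制品厂	Normal message	84.34%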
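
The traditional-character substitution visible in the samples above (晕 → 暈, 欢 → 歡) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the mapping `SIMPLIFIED_TO_TRADITIONAL` is a tiny hand-written table, and `to_traditional`, `perturb`, and the word segmentation are hypothetical. The second adversarial sample also reorders 致电 into 电致, which this sketch does not cover.

```python
from typing import Iterable, List

# Tiny hand-written simplified-to-traditional table (assumption for illustration;
# a real attack would need a full conversion dictionary).
SIMPLIFIED_TO_TRADITIONAL = {"晕": "暈", "欢": "歡"}


def to_traditional(word: str) -> str:
    """Replace every character that has an entry in the table."""
    return "".join(SIMPLIFIED_TO_TRADITIONAL.get(ch, ch) for ch in word)


def perturb(words: List[str], key_indices: Iterable[int]) -> List[str]:
    """Rewrite only the words selected as key words; leave the rest intact."""
    out = list(words)
    for i in key_indices:
        out[i] = to_traditional(out[i])
    return out


sentence = ["口干舌燥", "讲课", "真心", "是", "个", "力气活", "晕"]
print("".join(perturb(sentence, [6])))
```

Because a traditional character is read the same way by a human, the perturbed sentence keeps its meaning, while a model trained mostly on simplified text may see an unfamiliar token.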

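Item	Weibo sentiment data	Spam message data	THUCNews^[13]
Number of classes	2	2	10
Label distribution	Balanced	Balanced	Balanced
Dataset size	18000	18000	50000
Validation set (samples)	2000	2000	5000
Test set (samples)	2000	2000	10000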

Model	Dataset	Original accuracy	Random attack: accuracy	Random attack: decrease	CWordAttacker: accuracy	CWordAttacker: decrease
LSTM	Weibo sentiment data	96.40%	92.15%	4.25%	58.10%	38.30%
LSTM	Spam message data	98.35%	97.95%	0.40%	90.45%	7.90%
LSTM	THUCNews	92.39%	91.59%	0.80%	49.10%	43.29%
TextCNN	Weibo sentiment data	96.30%	91.25%	5.05%	55.40%	40.90%
TextCNN	Spam message data	98.00%	97.15%	0.85%	87.55%	10.45%
TextCNN	THUCNews	94.04%	93.40%	0.64%	63.82%	30.22%
CNN-Attention	Weibo sentiment data	96.05%	90.75%	5.30%	57.60%	38.45%
CNN-Attention	Spam message data	97.20%	96.70%	0.50%	86.65%	10.55%
CNN-Attention	THUCNews	92.48%	90.95%	1.53%	68.07%	24.41%

Model	Dataset	Original misclassification rate	Random attack: misclassification rate	Random attack: attack success rate	CWordAttacker: misclassification rate	CWordAttacker: attack success rate
LSTM	Weibo sentiment data	0.10%	12.20%	12.10%	73.00%	72.90%
LSTM	Spam message data	2.10%	3.30%	1.20%	18.80%	16.70%
LSTM	THUCNews	1.06%	1.13%	0.07%	9.46%	8.40%
TextCNN	Weibo sentiment data	2.90%	13.90%	11.00%	89.00%	86.10%
TextCNN	Spam message data	1.40%	2.60%	1.20%	19.20%	17.80%
TextCNN	THUCNews	1.47%	1.52%	0.05%	17.39%	15.92%
CNN-Attention	Weibo sentiment data	4.40%	17.20%	12.80%	86.30%	81.90%
CNN-Attention	Spam message data	4.30%	5.80%	1.50%	24.90%	20.60%
CNN-Attention	THUCNews	1.27%	1.30%	0.03%	10.10%	8.83%
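
The derived columns in the two result tables above appear to be plain differences in percentage points; this is inferred from the reported numbers rather than stated in the source. The sketch below checks that reading on the LSTM / Weibo sentiment row of each table. The function names are hypothetical, and `Decimal` is used only to avoid floating-point rounding noise.

```python
from decimal import Decimal


def accuracy_drop(original_acc: str, attacked_acc: str) -> Decimal:
    """Assumed definition: original accuracy minus accuracy under attack."""
    return Decimal(original_acc) - Decimal(attacked_acc)


def attack_success_rate(original_miscls: str, attacked_miscls: str) -> Decimal:
    """Assumed definition: misclassification rate under attack minus the original rate."""
    return Decimal(attacked_miscls) - Decimal(original_miscls)


# LSTM on the Weibo sentiment data, values in percent as reported above.
print(accuracy_drop("96.40", "58.10"))
print(attack_success_rate("0.10", "73.00"))
```

Under this reading, the attack success rate discounts samples the model already assigned to the target class before any perturbation.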
