Netinfo Security ›› 2022, Vol. 22 ›› Issue (10): 121-128. doi: 10.3969/j.issn.1671-1122.2022.10.017


Research on Multi-Strategy Data Enhancement Technology for Fraud Short Message Identification

HU Mianning1, LI Xin1,2, LI Mingfeng1, SUN Haichun1

  1. School of Information Network Security, People’s Public Security University of China, Beijing 100038, China
    2. Key Laboratory of Security Technology and Risk Assessment, Ministry of Public Security, Beijing 100038, China
  • Received: 2022-07-21 Online: 2022-10-10 Published: 2022-11-15
  • Contact: LI Xin E-mail: lixin@ppsuc.edu.cn

Abstract:

To address the low robustness of fraud short message identification models to new fraud short messages, this paper proposes a model training method based on data fusion enhancement that combines text generation and deep synthesis. Statistical analysis shows that new fraud short messages differ from ordinary fraud short messages in both content and structure. The training set of native fraud short messages is enhanced separately with text generation, deep synthesis, and their fusion, and comparative experiments on new and native fraud short messages are conducted with CNN, LSTM, GRU, and other models to measure the improvement in model performance. Experimental results show that, with data fusion enhancement, the model's recognition rate for new fraud short messages increases from 73.4% to 98.4% and the F1 value increases from 0.64 to 0.98, improving the overall performance of the fraud short message identification model.
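
The fusion step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function `fuse_augmented`, the two toy augmenters `swap_words` and `add_prefix` (stand-ins for text generation and deep synthesis), and the choice to augment only fraud-labelled samples are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Tuple

Sample = Tuple[str, int]                  # (message text, label); 1 = fraud, 0 = normal
Augmenter = Callable[[str], List[str]]    # maps one message to zero or more variants


def fuse_augmented(train: List[Sample],
                   augmenters: List[Augmenter],
                   seed: int = 0) -> List[Sample]:
    """Return the original samples plus the variants produced by every
    augmenter for fraud samples, deduplicated and shuffled.

    Assumption: only fraud messages (label 1) are augmented.
    """
    seen = set()
    fused: List[Sample] = []
    for text, label in train:
        candidates = [text]
        if label == 1:
            for augment in augmenters:
                candidates.extend(augment(text))
        for candidate in candidates:
            if (candidate, label) not in seen:
                seen.add((candidate, label))
                fused.append((candidate, label))
    random.Random(seed).shuffle(fused)    # fixed seed keeps the order reproducible
    return fused


# Hypothetical toy augmenters; real ones would be a text generation model
# and a deep synthesis pipeline.
def swap_words(text: str) -> List[str]:
    words = text.split()
    if len(words) < 2:
        return []
    words[0], words[-1] = words[-1], words[0]
    return [" ".join(words)]


def add_prefix(text: str) -> List[str]:
    return ["[notice] " + text]


train_set = [("win a prize now", 1), ("see you at noon", 0)]
fused_set = fuse_augmented(train_set, [swap_words, add_prefix])
```

Here the fused set holds the two original messages plus one variant of the fraud message from each augmenter; a classifier such as a CNN, LSTM, or GRU would then be trained on `fused_set` instead of `train_set`.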

Key words: fraud short message identification, data enhancement, text generation, deep synthesis

CLC Number: