Netinfo Security, 2020, Vol. 20, Issue (9): 107-111. doi: 10.3969/j.issn.1671-1122.2020.09.022

Spam Filtering Model Based on ALBERT Dynamic Word Vector

ZHOU Zhining, WANG Binjun, ZHAI Yiming, TONG Xin

  1. College of Information and Cyber Security, People’s Public Security University of China, Beijing 100038, China
  • Received: 2020-07-16 Online: 2020-09-10 Published: 2020-10-15
  • Contact: WANG Binjun E-mail: wangbinjun@ppsuc.edu.cn

Abstract:

To address insufficient word-vector learning in spam classification, this paper introduces ALBERT dynamic word vectors and proposes an ALBERT-RNN model that combines them with a recurrent neural network. On the public spam dataset TREC06C, two traditional statistical models are compared with four ALBERT-RNN models that use different RNN structures, and the cross-entropy loss function of ALBERT-RNN is optimized with the Focal Loss method. The experimental results show that the ALBERT-LSTM model with Focal Loss achieves the highest accuracy (99.13%) on the TREC06C dataset.
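
As a rough illustration of how Focal Loss modifies cross-entropy for a two-class (spam vs. ham) problem, the following minimal sketch computes both losses for a single prediction. It is not the authors' implementation: the function names are hypothetical, and the defaults gamma = 2.0 and alpha = 0.25 are common illustrative choices, not values reported in this abstract.

```python
import math


def cross_entropy(p_spam, label):
    """Binary cross-entropy for one example.

    p_spam: predicted probability that the message is spam.
    label:  1 for spam, 0 for ham.
    """
    # Probability the model assigned to the true class.
    p_t = p_spam if label == 1 else 1.0 - p_spam
    # Clip to avoid log(0).
    p_t = min(max(p_t, 1e-12), 1.0)
    return -math.log(p_t)


def focal_loss(p_spam, label, gamma=2.0, alpha=0.25):
    """Binary focal loss for one example.

    gamma and alpha are assumed, illustrative hyperparameters.
    The factor (1 - p_t) ** gamma shrinks the loss of examples the
    model already classifies confidently, so training focuses on
    the hard ones.
    """
    p_t = p_spam if label == 1 else 1.0 - p_spam
    # Class weight: alpha for spam, 1 - alpha for ham.
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    for p in (0.9, 0.6, 0.3):
        print(p, cross_entropy(p, 1), focal_loss(p, 1))
```

With gamma = 0 the modulating factor disappears and the focal loss reduces to cross-entropy scaled by the class weight, which is why it can be described as an adjustment of the cross-entropy loss rather than a replacement for it.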

Key words: Chinese spam, recurrent neural network, ALBERT model, dynamic word vector

CLC Number: