Netinfo Security ›› 2019, Vol. 19 ›› Issue (12): 72-78. doi: 10.3969/j.issn.1671-1122.2019.12.009

Analyzing Malware Behavior and Capability Related Text Based on Feature Extraction

Xuruirui FENG, Jiayong LIU, Pengsen CHENG

  1. College of Cybersecurity, Sichuan University, Chengdu Sichuan 610065, China
  • Received: 2019-08-10 Online: 2019-12-10 Published: 2020-05-11

Abstract:

In response to the threat that malware poses to cyberspace security, cybersecurity agencies have released a large number of malware reports. These reports contain a great deal of security-related information, such as a malware's capabilities and the specific actions it takes. By analyzing the reports and extracting this information, researchers can fully understand the malware's functions and mount an effective defense. Automatically extracting text related to malware capabilities and behaviors from reports is difficult because the reports are numerous, the text is loosely structured, and many words are polysemous. The proposed method uses the pre-trained BERT model to disambiguate polysemous words, then feeds BERT's output into a BiLSTM network with an attention mechanism to extract further features and train a classifier. In experiments on the MalwareTextDB dataset, the model achieves a recall of 85.56% and an F1 score of 66.67%. Compared with other methods, the model extracts text related to malware behavior and capabilities from malware reports more automatically and efficiently.
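
The attention step named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact attention formulation, so the dot-product scoring below is an assumption, as are the names `attention_pool`, `classify`, `query`, and the toy numbers. The `hidden_states` list stands in for per-token BiLSTM outputs computed over BERT embeddings, and the result is one sentence vector passed to a binary classifier (relevant to malware behavior or not).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Collapse per-token vectors into one sentence vector.

    hidden_states: list of T vectors (stand-ins for BiLSTM outputs).
    query: a vector of the same dimension (hypothetical learned parameter).
    Scoring by dot product is an assumption, not the paper's stated method.
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return pooled, weights


def classify(pooled, w, b):
    """Logistic classifier on the pooled vector; returns P(relevant)."""
    logit = sum(p * w_d for p, w_d in zip(pooled, w)) + b
    return 1.0 / (1.0 + math.exp(-logit))


# Toy input: three tokens with 2-dimensional hidden states (illustrative values).
hidden_states = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2]]
query = [1.0, 1.0]
pooled, weights = attention_pool(hidden_states, query)
prob = classify(pooled, [0.5, -0.25], 0.0)
```

In this toy input the second token scores highest against `query`, so it receives the largest attention weight and dominates the pooled vector. In a trained model the query and classifier weights would be learned jointly with the BiLSTM.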

Key words: malware, text classification, BERT, BiLSTM, attention mechanism
