Netinfo Security, 2021, Vol. 21, Issue (10): 54-62. doi: 10.3969/j.issn.1671-1122.2021.10.008

Malicious Code Visual Classification Algorithm Based on Behavior Knowledge Graph Sieve

ZHU Chaoyang1, ZHOU Liang1, ZHU Yayun1, LIN Qingwen2,3

    1. Institute of Information and Communication, China Electric Power Research Institute Co., Ltd, Beijing 100192, China
    2. Beijing HXIS Technology Co. Ltd, Beijing 100876, China
    3. National Engineering Laboratory of Mobile Internet Security Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2021-06-25; Online: 2021-10-10; Published: 2021-10-14
  • Contact: LIN Qingwen, E-mail: 2019140856@bupt.edu.cn

Abstract:

In recent years, the malware industry has grown into a well-organized market involving large sums of money. The main challenge facing anti-malware systems today is evaluating large volumes of data and file samples to determine whether they carry malicious intent. To address this, this paper proposes a visual classification algorithm for malicious code based on a behavior knowledge graph sieve. The algorithm analyzes the assembly instruction flow of malicious code samples, extracts a program behavior fingerprint, and uses a knowledge graph to encode the fingerprint content, producing a fingerprint sieve for each sample. By locating the spots in the fingerprint sieve, the algorithm removes noise from the malware samples and generates the corresponding sifted fingerprint. While retaining the features of the original fingerprint, the sifted fingerprint achieves a compression rate of 76.3%. Finally, the algorithm performs visual analysis and opcode sequence analysis on the sifted fingerprint and classifies the samples with a random forest, reaching 98.8% accuracy. The experiments show that the proposed algorithm achieves better results in malicious code classification.
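
The opcode-level steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the fixed `NOISE_OPCODES` set is a hypothetical stand-in for the knowledge-graph sieve, the helper names are invented for illustration, and "compression rate" is assumed here to mean the fraction of the opcode sequence removed by sifting, which the abstract does not define.

```python
from collections import Counter

# Hypothetical noise set: opcodes assumed to carry little behavioral signal.
# The paper derives its filter from a behavior knowledge graph; this fixed
# set is only an illustrative stand-in.
NOISE_OPCODES = {"nop", "int3"}


def extract_opcodes(asm_lines):
    """Return the opcode (first token) of each instruction line."""
    opcodes = []
    for line in asm_lines:
        code = line.split(";", 1)[0].strip()  # drop trailing ';' comments
        if code and not code.endswith(":"):   # skip blank lines and labels
            opcodes.append(code.split()[0].lower())
    return opcodes


def sieve(opcodes, noise=NOISE_OPCODES):
    """Drop opcodes that belong to the assumed noise set."""
    return [op for op in opcodes if op not in noise]


def compression_rate(original, sifted):
    """Fraction of the original sequence removed (assumed definition)."""
    if not original:
        return 0.0
    return 1 - len(sifted) / len(original)


def ngram_counts(opcodes, n=2):
    """Count opcode n-grams, a common feature for sequence-based classifiers."""
    return Counter(tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1))


asm = ["start:", "push ebp ; prologue", "nop", "mov ebp, esp", "nop", "call foo"]
ops = extract_opcodes(asm)
sifted = sieve(ops)
features = ngram_counts(sifted)
```

The resulting n-gram counts could then be vectorized and passed to a random forest classifier, the model family the abstract names; that step is omitted here to keep the sketch dependency-free.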

Key words: knowledge graph, malicious code classification, visual classification algorithm
