Netinfo Security ›› 2021, Vol. 21 ›› Issue (10): 25-32. doi: 10.3969/j.issn.1671-1122.2021.10.004

• Selected Papers •

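Static Detection Model of Malware Based on Image Recognition

YANG Ming1,2, ZHANG Jian1,2

  1. College of Cyber Science, Nankai University, Tianjin 300350, China
  2. Tianjin Key Laboratory of Network and Data Security Technology, Tianjin 300350, China
  • Received: 2021-04-30 Online: 2021-10-10 Published: 2021-10-14
  • Contact: ZHANG Jian E-mail: zhang.jian@nankai.edu.cn
  • About the authors: YANG Ming (born 1999), male, from Hunan, master's student; main research interests: cloud security, network security, and system security. ZHANG Jian (born 1968), male, from Tianjin, professor-level senior engineer, Ph.D.; main research interests: cloud security, system security, and malicious code prevention.
  • Supported by:
    National Key Research and Development Program of China (2021YFF0307202); Tianjin New Generation Artificial Intelligence Science and Technology Major Project (19ZXZNGX00090); Tianjin Key Research and Development Program (20YFZCGX00680)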

Abstract:

Malware is one of the main threats to Internet security. To detect malware quickly and effectively, this paper proposes the SIC model. The model applies the SimHash method to the positional and count features of a malware sample's opcodes to convert the sample into a feature vector, which is then converted into a grayscale image. A convolutional neural network (CNN) then identifies the family to which the malware belongs. The SIC model is further optimized with multiple hashing (multi-hash) and a block selection algorithm. The model was trained on the malware classification challenge dataset released by Microsoft in 2015. Experimental results show that the detection and recognition accuracy of the SIC model can reach 96.774%, an improvement over other malware classification models based on traditional machine learning.

Key words: malware, static analysis, SimHash, CNN
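
The abstract does not spell out how opcode position and count features are fed into SimHash, so the following is only a minimal sketch of the general idea and not the authors' implementation. It assumes that opcode counts act as SimHash weights, that opcode bigrams stand in for positional information, and that each fingerprint bit becomes one black or white pixel. The names `simhash_bits` and `bits_to_gray_image`, the MD5 token hash, and the sample opcode list are all hypothetical.

```python
import hashlib
from collections import Counter


def simhash_bits(tokens, n_bits=64):
    """Return the SimHash fingerprint of `tokens` as a list of 0/1 ints."""
    if not 0 < n_bits <= 128:
        raise ValueError("n_bits must be in 1..128 (one MD5 digest is used)")
    totals = [0] * n_bits
    # Each distinct token votes on every bit, weighted by its count.
    for token, weight in Counter(tokens).items():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        value = int.from_bytes(digest, "big") >> (128 - n_bits)
        for i in range(n_bits):
            if (value >> i) & 1:
                totals[i] += weight
            else:
                totals[i] -= weight
    # A bit is set when the weighted votes for it are positive.
    return [1 if total > 0 else 0 for total in totals]


def bits_to_gray_image(bits, side):
    """Map bits to pixels (1 -> 255, 0 -> 0) and reshape to side x side rows."""
    if len(bits) != side * side:
        raise ValueError("len(bits) must equal side * side")
    pixels = [255 * b for b in bits]
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Hypothetical opcode sequence from a disassembled sample.
opcodes = ["push", "mov", "mov", "call", "add", "mov", "pop", "ret"]
# Bigrams keep some ordering information; this choice is an assumption.
bigrams = [f"{a} {b}" for a, b in zip(opcodes, opcodes[1:])]
image = bits_to_gray_image(simhash_bits(opcodes + bigrams, n_bits=64), side=8)
```

The resulting `image` is an 8 x 8 grid of 0/255 values that could be passed to an image classifier such as a CNN. A practical model would likely use a much longer fingerprint, or several hash functions, to produce a larger image.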

CLC Number: