Netinfo Security, 2023, Vol. 23, Issue (11): 104-117. doi: 10.3969/j.issn.1671-1122.2023.11.011

• Theoretical Research •

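Malicious Code Classification Method Based on BiTCN-DLP

LI Sicong1,2, WANG Jian1, SONG Yafei1, HUANG Wei1,2

  1. Air and Missile Defense College, Air Force Engineering University, Xi’an 710051, China
  2. Graduate School, Air Force Engineering University, Xi’an 710051, China
  • Received: 2023-08-25 Online: 2023-11-10 Published: 2023-11-10
  • Corresponding author: SONG Yafei, yafei_song@163.com
  • About the authors: LI Sicong (born 2000), female, from Shaanxi, master's student; research interests: cyberspace security and malicious code detection | WANG Jian (born 1982), male, from Shaanxi, associate professor, master's degree; research interests: intelligent information processing and malware detection | SONG Yafei (born 1988), male, from Henan, associate professor, Ph.D.; research interests: machine learning and its applications in target recognition, intrusion detection, and related fields | HUANG Wei (born 1998), male, from Jiangxi, master's student; research interests: cyberspace security and malicious code detection
  • Funding:
    National Natural Science Foundation of China (61806219); National Natural Science Foundation of China (61703426); National Natural Science Foundation of China (61876189); Science Foundation of Shaanxi Province (2021JM-226); Young Talent Support Program of the Shaanxi University Association for Science and Technology (20190108); Young Talent Support Program of the Shaanxi University Association for Science and Technology (20220106); Innovation Capability Support Program of Shaanxi Province (2020KJXX-065)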

Abstract:

To cope with the continually evolving variants of malicious code, and to address the insufficient feature extraction and declining classification accuracy of existing malicious code classification methods, this article proposes a classification method (BiTCN-DLP) based on a bidirectional temporal convolution network (BiTCN) and double layer pooling (DLP). First, the method fuses the opcode and bytecode features of malicious code so that each contributes different details. Then, a BiTCN model is built to exploit the forward and backward dependencies among the features, and a pooling fusion mechanism is introduced to further mine the deep dependencies within the malicious code data. Finally, the model is validated on the Kaggle dataset. The experimental results show that the classification accuracy of BiTCN-DLP can reach 99.54%, with fast convergence and low classification error. Comparison experiments and ablation experiments further demonstrate the effectiveness of the model.

Key words: malicious code classification, feature fusion, bidirectional temporal convolution network (BiTCN), double layer pooling (DLP)
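The pipeline summarized in the abstract (fuse features, run a temporal convolution in both directions, then combine two pooling operations) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The function names, the single-channel layout, the fixed kernel weights, fusion by plain concatenation, and the reading of double layer pooling as "global max pooling concatenated with global average pooling" are all assumptions made for illustration; the abstract does not specify these details.

```python
from typing import List


def fuse_features(opcode_feats: List[float], byte_feats: List[float]) -> List[float]:
    """Assumed fusion: concatenate the two feature sequences end to end."""
    return opcode_feats + byte_feats


def causal_dilated_conv(x: List[float], kernel: List[float], dilation: int) -> List[float]:
    """Causal dilated 1-D convolution followed by ReLU.

    The output at step t depends only on x[t], x[t - d], x[t - 2d], ...
    Steps before the start of the sequence are treated as zero padding.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(max(acc, 0.0))  # ReLU activation
    return out


def bidirectional_tcn(x: List[float], kernel: List[float],
                      dilations: List[int]) -> List[List[float]]:
    """Run a stack of causal dilated convolutions forward and backward.

    The backward branch sees the reversed sequence, so after re-reversing
    its output, step t carries information from steps t, t + 1, ...
    Returns one [forward, backward] pair per time step.
    """
    fwd, bwd = x, x[::-1]
    for d in dilations:
        fwd = causal_dilated_conv(fwd, kernel, d)
        bwd = causal_dilated_conv(bwd, kernel, d)
    bwd = bwd[::-1]  # realign with the forward time axis
    return [[f, b] for f, b in zip(fwd, bwd)]


def double_pooling(features: List[List[float]]) -> List[float]:
    """Assumed pooling fusion: global max pooling and global average
    pooling per channel, concatenated into one vector."""
    channels = list(zip(*features))
    max_pool = [max(c) for c in channels]
    avg_pool = [sum(c) / len(c) for c in channels]
    return max_pool + avg_pool


if __name__ == "__main__":
    # Toy inputs standing in for opcode and bytecode feature sequences.
    fused = fuse_features([1.0, 2.0], [3.0, 4.0])
    feats = bidirectional_tcn(fused, kernel=[0.5, 0.5], dilations=[1, 2])
    print(double_pooling(feats))
```

In a real model the kernel weights would be learned and the pooled vector would feed a classifier. The sketch only shows how the backward branch gives each time step access to later context, and how two pooling views of the same feature map can be combined.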

CLC Number: