Netinfo Security ›› 2023, Vol. 23 ›› Issue (12): 29-37. doi: 10.3969/j.issn.1671-1122.2023.12.004
Received:
2023-09-16
Online:
2023-12-10
Published:
2023-12-13
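Corresponding author:
TAN Zhenhua
E-mail: tanzh@mail.neu.edu.cn
About the authors:
LIU Jun (born 1978), male, from Liaoning, engineer, M.S.; main research interests: network routing and security technology | WU Zhichao (born 1997), male, from Inner Mongolia, master's student; main research interest: network security | WU Jian (born 1982), male, from Shenyang, engineer, M.S.; main research interests: network routing and security technology | TAN Zhenhua (born 1980), male, from Hunan, professor, Ph.D., CCF senior member; main research interests: data security, privacy protection, and network behavior analysis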
LIU Jun1, WU Zhichao2, WU Jian1, TAN Zhenhua1,2
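Abstract:
Malicious code recognition helps protect users' privacy and conserve computing resources. Existing recognition models typically convert malicious code into images and then classify those images with deep learning. Images produced this way have two characteristics. First, the end of the image is usually padded with black pixels, so the image contains clearly distinguishable key features (the code region) and non-key features (the padding region). Second, code is semantically correlated, and when it is converted into pixels in order, this correlation is preserved between pixels. However, existing malicious code detection models are not designed around these characteristics, which leaves them relatively weak at extracting deep features from malware images. This paper therefore proposes a new malicious code detection model designed specifically for these two characteristics. The original malicious code is first converted into an image and preprocessed. One FA-SA module then extracts the key features, and two FA-SeA modules capture the correlation features between pixels. The proposed model both simplifies the network structure for malicious code detection and improves deep feature extraction and detection accuracy. Experimental results show that fusing attention modules noticeably improves recognition performance: on the Malimg dataset, recognition accuracy reaches 96.38%, which is 3.56% higher than existing CNN-based models.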
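The byte-to-pixel conversion described in the abstract can be illustrated with a minimal sketch. It is not the authors' preprocessing code: the function names, the fixed row width of 32, and the 70-byte sample are assumptions made for illustration only. The sketch maps each byte to one grayscale pixel in order and fills the unused tail of the last row with zeros, which is where the black padding region comes from.

```python
import math

def bytes_to_image(code: bytes, width: int = 32) -> list:
    """Map each byte to one grayscale pixel (0-255), row by row.

    The tail of the last row is filled with 0 (black) so that the
    result is a rectangular height x width grid.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = max(1, math.ceil(len(code) / width))
    # bytes(n) yields n zero bytes, i.e. black padding pixels.
    padded = code + bytes(height * width - len(code))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]

def padding_fraction(code: bytes, width: int = 32) -> float:
    """Share of pixels that are padding rather than code."""
    height = max(1, math.ceil(len(code) / width))
    return 1 - len(code) / (height * width)

sample = bytes(range(70))          # hypothetical 70-byte "binary"
image = bytes_to_image(sample, width=32)
print(len(image), len(image[0]))   # 3 32
print(image[2][5], image[2][6])    # 69 0  (last code byte, first padding pixel)
```

Because neighbouring bytes become neighbouring pixels, any ordering-dependent relationship in the code is carried over to the image, which is the second characteristic the abstract refers to.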
LIU Jun, WU Zhichao, WU Jian, TAN Zhenhua. A Malicious Code Recognition Model Fusing Image Spatial Feature Attention Mechanism[J]. Netinfo Security, 2023, 23(12): 29-37.
Table 1
Malware families in the Malimg dataset and number of samples per family
Malware family | Samples | Malware family | Samples |
---|---|---|---|
Allaple.L | 1591 | Alueron.gen!J | 198 |
Allaple.A | 2949 | Malex.gen!J | 136 |
Yuner.A | 800 | Lolyda.AT | 159 |
Lolyda.AA 1 | 213 | Adialer.C | 125 |
Lolyda.AA 2 | 184 | Wintrim.BX | 97 |
Lolyda.AA 3 | 123 | Dialplatform.B | 177 |
C2Lop.P | 146 | Dontovo.A | 162 |
C2Lop.gen!G | 200 | Obfuscator.AD | 142 |
Instantaccess | 431 | Agent.FYI | 116 |
Swizzor.gen!I | 132 | Autorun.K | 106 |
Swizzor.gen!E | 128 | Rbot!gen | 158 |
VB.AT | 408 | Skintrim.N | 80 |
Fakerean | 381 | | |
Table 2
Detailed parameters of each layer of the network model
Module | Layer | Parameters |
---|---|---|
Baseline Block | Input | size=(32,32,1) |
Baseline Block | C1 | kernel=(3,3,3,128), stride=1, relu |
Baseline Block | P1, P2 | padding=(2,2) |
Baseline Block | Maxpool | kernel=2 |
Baseline Block | C2 | kernel=(3,3,128,64), stride=1, relu |
FA-SA Block | C1 | kernel=(7,7,2,1), stride=1, sigmoid |
FA-SA Block | P1 | padding=(3,3) |
FA-SeA Block1 | C(Q) | kernel=(1,1,128,16), stride=1 |
FA-SeA Block1 | C(K) | kernel=(1,1,128,16), stride=1 |
FA-SeA Block1 | C(V) | kernel=(1,1,128,128), stride=1 |
FA-SeA Block2 | C(Q) | kernel=(1,1,64,8), stride=1 |
FA-SeA Block2 | C(K) | kernel=(1,1,64,8), stride=1 |
FA-SeA Block2 | C(V) | kernel=(1,1,64,64), stride=1 |
FC1, FC2, FC3 | Input | 5184 |
FC1, FC2, FC3 | Output | 25 |
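The spatial sizes implied by Table 2 can be checked with a short walk-through. The table does not state the order of the layers, so this sketch assumes that each stage is padding, then convolution, then max pooling, and that the single Maxpool row is applied after both C1 and C2. The helper names `conv_out` and `pool_out` are hypothetical. Under those assumptions the flattened feature size comes out to the 5184 listed as the FC input, which is consistent with the assumed ordering but does not prove it.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output side length of a square convolution (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size: int, kernel: int) -> int:
    """Output side length of max pooling with stride equal to the kernel size."""
    return size // kernel

side = 32                                    # Input: 32 x 32
side = conv_out(side, kernel=3, padding=2)   # P1 + C1 -> 34
side = pool_out(side, kernel=2)              # Maxpool -> 17
stage1_side = side

# A 7x7 convolution with padding 3 (FA-SA Block) keeps the spatial size.
fa_sa_side = conv_out(stage1_side, kernel=7, padding=3)

side = conv_out(side, kernel=3, padding=2)   # P2 + C2 -> 19
side = pool_out(side, kernel=2)              # Maxpool (assumed second use) -> 9
flattened = side * side * 64                 # C2 has 64 output channels

print(stage1_side, fa_sa_side, side, flattened)  # 17 17 9 5184
```

The Q and K projections in both FA-SeA blocks reduce the channel count by a factor of 8 (128 to 16, 64 to 8), while V keeps the full channel count.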
Table 4
Test results of the proposed model
Family | Recall | Precision | F1 |
---|---|---|---|
Allaple.L | 98.7% | 98.4% | 98.5% |
Allaple.A | 98.8% | 99.0% | 98.9% |
Yuner.A | 100% | 87.9% | 93.6% |
Lolyda.AA1 | 100% | 93.5% | 96.6% |
Lolyda.AA2 | 91.9% | 100% | 95.8% |
Lolyda.AA3 | 100% | 100% | 100% |
C2LOP.P | 63.3% | 86.4% | 73.1% |
C2LOP.gen!g | 92.5% | 82.2% | 87.0% |
Instantaccess | 100% | 100% | 100% |
Swizzor.gen!E | 70.4% | 90.5% | 79.2% |
VB.AT | 94.4% | 65.4% | 77.3% |
Fakerean | 100% | 98.8% | 99.4% |
Alueron.gen!J | 100% | 100% | 100% |
Malex.gen!J | 100% | 100% | 100% |
Lolyda.AT | 100% | 100% | 100% |
Adialer.C | 100% | 100% | 100% |
Wintrim.BX | 100% | 100% | 100% |
Dialplatform.B | 95.0% | 100% | 97.4% |
Dontovo.A | 100% | 100% | 100% |
Obfuscator.AD | 100% | 100% | 100% |
Agent.FYI | 100% | 100% | 100% |
Autorun.K | 0% | 0% | N/A |
Rbot!gen | 100% | 100% | 100% |
Skintrim.N | 100% | 100% | 100% |
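The F1 column in Table 4 follows from the Recall and Precision columns as their harmonic mean, F1 = 2PR / (P + R). The sketch below recomputes a few rows as a consistency check. The function name `f1_score` and the choice to return `None` when both inputs are zero are assumptions of this sketch, chosen to mirror the missing F1 value for Autorun.K.

```python
def f1_score(recall: float, precision: float):
    """Harmonic mean of precision and recall.

    Returns None when both are zero, since the ratio is undefined.
    """
    if recall + precision == 0:
        return None
    return 2 * precision * recall / (precision + recall)

# (recall, precision) pairs copied from Table 4.
rows = {
    "Yuner.A": (1.000, 0.879),
    "C2LOP.P": (0.633, 0.864),
    "VB.AT": (0.944, 0.654),
    "Autorun.K": (0.0, 0.0),
}

for family, (recall, precision) in rows.items():
    f1 = f1_score(recall, precision)
    print(family, "undefined" if f1 is None else f"{f1:.1%}")
# Yuner.A 93.6%
# C2LOP.P 73.1%
# VB.AT 77.3%
# Autorun.K undefined
```

The recomputed values match the table, and the undefined case explains why no F1 is reported for a family with zero recall and zero precision.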