Netinfo Security ›› 2021, Vol. 21 ›› Issue (6): 52-62. doi: 10.3969/j.issn.1671-1122.2021.06.007
Received:
2021-03-08
Online:
2021-06-10
Published:
2021-07-01
Contact:
XU Guotian*
E-mail: 459536384@qq.com
About the authors:
XU Guotian (born 1978), male, from Liaoning, associate professor, master's degree; main research interests: cyberspace security and electronic data forensics | SHEN Yaotong (born 1998), male, from Henan, master's student; main research interest: electronic data forensics
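Abstract:
In multi-class malware detection, traditional static and dynamic detection methods are strongly affected by anti-forensics techniques. In newer detection methods based on network traffic, the traffic features of different malware categories are highly similar, so manually extracted flow features combined with traditional machine learning methods cannot achieve high accuracy. To address these problems, this paper proposes a multi-class malware detection method based on a fusion model of XGBoost and Stacking. After the outbound communication traffic of the target malware is captured and the initial network features are extracted automatically, the initial dataset is preprocessed and passed through multiple rounds of feature selection. An XGBoost-based feature creation algorithm then automatically generates a set of high-level features from the initial features, and the Stacking ensemble algorithm fuses multiple models to improve the accuracy of multi-class malware detection. Within this process, Bayesian optimization is used to determine the optimal parameter combination for each model, which reduces the time spent searching for it, and several regularization strategies are adopted to counter model overfitting. Experimental results show that, compared with other traditional methods, the proposed method substantially improves the accuracy of multi-class malware classification.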
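The Stacking step described in the abstract can be illustrated with a minimal Python sketch that uses only the standard library. It is not the authors' implementation: the two base learners (a nearest-centroid classifier and k-NN) are toy stand-ins for tuned models such as XGBoost, and the lookup-table meta-learner, the synthetic two-feature "flows", and all function and class names are assumptions made for illustration. What the sketch shows is the core mechanism: each base model produces out-of-fold predictions, and those predictions become the input features of a second-level model.

```python
import random
from collections import Counter

def make_toy_flows(n_per_class=60, seed=0):
    """Synthetic 2-feature samples for 3 hypothetical malware classes."""
    rng = random.Random(seed)
    centers = {0: (0.0, 0.0), 1: (4.0, 0.0), 2: (2.0, 4.0)}
    data = [([rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)], label)
            for label, (cx, cy) in centers.items() for _ in range(n_per_class)]
    rng.shuffle(data)
    return [x for x, _ in data], [t for _, t in data]

def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

class NearestCentroid:
    """Toy base learner: predict the class whose mean vector is closest."""
    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for x, t in zip(X, y):
            acc = sums.setdefault(t, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
        self.centroids = {t: [v / counts[t] for v in s] for t, s in sums.items()}
        return self

    def predict(self, X):
        return [min(self.centroids, key=lambda t: sq_dist(x, self.centroids[t]))
                for x in X]

class KNN:
    """Toy base learner: majority label among the k nearest training samples."""
    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        out = []
        for x in X:
            order = sorted(range(len(self.X)), key=lambda i: sq_dist(x, self.X[i]))
            out.append(Counter(self.y[i] for i in order[:self.k]).most_common(1)[0][0])
        return out

class VoteTable:
    """Toy meta-learner: map each tuple of base predictions to its majority label."""
    def fit(self, Z, y):
        table = {}
        for z, t in zip(Z, y):
            table.setdefault(tuple(z), Counter())[t] += 1
        self.table = {z: c.most_common(1)[0][0] for z, c in table.items()}
        return self

    def predict(self, Z):
        # Unseen combinations fall back to the first base model's prediction.
        return [self.table.get(tuple(z), z[0]) for z in Z]

def out_of_fold_predictions(make_model, X, y, n_folds=5):
    """Predict every sample with a model that never saw it during training."""
    oof = [None] * len(X)
    for f in range(n_folds):
        train = [i for i in range(len(X)) if i % n_folds != f]
        held = [i for i in range(len(X)) if i % n_folds == f]
        model = make_model().fit([X[i] for i in train], [y[i] for i in train])
        for i, p in zip(held, model.predict([X[i] for i in held])):
            oof[i] = p
    return oof

def fit_stacking(base_factories, X, y, n_folds=5):
    """Level 1: out-of-fold predictions. Level 2: meta-learner on those predictions."""
    columns = [out_of_fold_predictions(f, X, y, n_folds) for f in base_factories]
    meta = VoteTable().fit([list(row) for row in zip(*columns)], y)
    bases = [f().fit(X, y) for f in base_factories]  # refit on all training data
    return bases, meta

def predict_stacking(bases, meta, X):
    columns = [b.predict(X) for b in bases]
    return meta.predict([list(row) for row in zip(*columns)])

def accuracy(pred, y):
    return sum(p == t for p, t in zip(pred, y)) / len(y)

if __name__ == "__main__":
    X, y = make_toy_flows()
    split = int(0.75 * len(X))
    bases, meta = fit_stacking([NearestCentroid, lambda: KNN(k=5)], X[:split], y[:split])
    print("held-out accuracy:", accuracy(predict_stacking(bases, meta, X[split:]), y[split:]))
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for samples the base models were fitted on, is a common guard against overfitting in stacked ensembles. In practice each base model's hyperparameters would be tuned separately before the fusion step.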
XU Guotian*, SHEN Yaotong. Multiple Classification Detection Method for Malware Based on XGBoost and Stacking Fusion Model[J]. Netinfo Security, 2021, 21(6): 52-62.