A Vulnerability Detection Method Based on Random Detection Algorithm and Information Aggregation

doi:10.3969/j.issn.1671-1122.2019.01.001

Netinfo Security, 2019, Vol. 19, Issue (1): 1-7. doi: 10.3969/j.issn.1671-1122.2019.01.001

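WEN Weiping¹, LI Jingwei¹, JIAO Yingnan², LI Hailin¹

1. School of Software and Microelectronics, Peking University, Beijing 102600, China
2. National Computer Network Emergency Response Technical Team / Coordination Center, Beijing 100029, China

Received: 2018-10-16 Online: 2019-01-20 Published: 2020-05-11
About the authors:
WEN Weiping (born 1976), male, from Hunan, professor, Ph.D.; main research interests include network attack and defense, malicious code research, reverse engineering of information systems, and trusted computing technology. LI Jingwei (born 1995), male, from Liaoning, master's student; main research interests are vulnerability analysis and vulnerability discovery. JIAO Yingnan (born 1983), female, from Liaoning, engineer, master's degree; main research interests include software engineering and information security. LI Hailin (born 1993), male, from Sichuan, master's student; main research interest is software engineering.
Supported by:
National Natural Science Foundation of China [U1736218]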

Abstract:

As computer software grows more complex, the security of software architectures keeps declining. High coupling between software modules has driven a sharp increase in the number of software vulnerabilities, and techniques for detecting and defending against security vulnerabilities have become a key research direction in network security. Existing static vulnerability detection methods perform poorly, while fuzz testing is time-consuming, and the industry lacks a method for quickly scanning large-scale binary programs for vulnerabilities. Based on machine learning, this paper uses a random detection algorithm to extract lightweight static features from decompiled programs and aggregates parameter information during dynamic feature extraction. Text-CNN, logistic regression, and random forest models are then trained separately on the extracted dynamic features and static features. Experiments show that the method can effectively detect vulnerabilities in binary programs.

Key words: vulnerability detection, feature extraction, machine learning
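
The abstract does not spell out how the random detection algorithm works. The sketch below is only one possible reading, assuming that it means sampling random walks over a call graph recovered from the decompiled program and treating each sampled call sequence as a lightweight static feature. The graph, the function names, and the parameters `n_walks` and `max_steps` are all hypothetical and do not come from the paper.

```python
import random

# Hypothetical call graph of a decompiled program (illustrative names only).
# Each key is a function or library call; each value lists what it may call next.
CALL_GRAPH = {
    "main": ["read_input", "parse"],
    "read_input": ["malloc", "fgets"],
    "parse": ["strcpy", "strlen", "free"],
    "malloc": [],
    "fgets": ["parse"],
    "strcpy": ["free"],
    "strlen": [],
    "free": [],
}


def random_walk(graph, start, max_steps, rng):
    """Follow randomly chosen outgoing edges until a dead end or max_steps."""
    path = [start]
    node = start
    for _ in range(max_steps):
        successors = graph[node]
        if not successors:
            break  # dead end: no outgoing calls from this node
        node = rng.choice(successors)
        path.append(node)
    return path


def sample_static_features(graph, start, n_walks, max_steps, seed=0):
    """Collect n_walks call sequences; each sequence is one list of tokens."""
    rng = random.Random(seed)  # seeded so that sampling is reproducible
    return [random_walk(graph, start, max_steps, rng) for _ in range(n_walks)]


# Usage: sample three walks from the assumed entry point and print them.
for walk in sample_static_features(CALL_GRAPH, "main", n_walks=3, max_steps=5):
    print(" ".join(walk))
```

Token sequences of this kind could be fed to a text-style classifier such as Text-CNN, or turned into bag-of-words vectors for logistic regression and random forest models. Whether the paper encodes its features this way is not stated in the abstract.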

CLC Number: TP309

Cite this article:

Weiping WEN, Jingwei LI, Yingnan JIAO, Hailin LI. A Vulnerability Detection Method Based on Random Detection Algorithm and Information Aggregation[J]. Netinfo Security, 2019, 19(1): 1-7.

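Experimental environment	Component	Configuration
Hardware	CPU	2.5 GHz Intel Core i7-6700
Hardware	Memory	8 GB 1600 MHz DDR3
Hardware	GPU	GTX 1080Ti
Software	Docker	17.12.0-ce
Software	Ubuntu	16.04.3 LTS
Software	Python	2.7.13
Software	Sklearn	0.19
Software	cuDNN	5.1
Software	CUDA	8.0
Software	Keras	1.2.0
Software	NVIDIA driver	384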

Classifier	Feature set	Precision	Recall	F1
Logistic regression	Static	0.332	0.523	0.406
Random forest	Static	0.383	0.568	0.457
Text-CNN	Static	0.572	0.208	0.305
Logistic regression	Dynamic	0.342	0.432	0.381
Random forest	Dynamic	0.401	0.543	0.461
Text-CNN	Dynamic	0.421	0.552	0.478
Random prediction	-	0.08	0.08	0.08
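
The F1 column in the results table above can be cross-checked against the precision and recall columns. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the page itself does not state which formula was used. The helper names `f1_score` and `max_f1_deviation` are illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (classifier, feature set, precision, recall, reported F1), copied from the table.
RESULTS = [
    ("Logistic regression", "static", 0.332, 0.523, 0.406),
    ("Random forest", "static", 0.383, 0.568, 0.457),
    ("Text-CNN", "static", 0.572, 0.208, 0.305),
    ("Logistic regression", "dynamic", 0.342, 0.432, 0.381),
    ("Random forest", "dynamic", 0.401, 0.543, 0.461),
    ("Text-CNN", "dynamic", 0.421, 0.552, 0.478),
    ("Random prediction", "-", 0.08, 0.08, 0.08),
]


def max_f1_deviation(rows):
    """Largest gap between the recomputed F1 and the reported F1."""
    return max(abs(f1_score(p, r) - f1) for _, _, p, r, f1 in rows)


# Usage: print recomputed F1 next to the reported value for each row.
for name, features, p, r, reported in RESULTS:
    print(f"{name:20s} {features:8s} computed={f1_score(p, r):.3f} reported={reported:.3f}")
```

Under this assumption every reported F1 value agrees with the recomputed one to within 0.001. The highest F1 in the table is Text-CNN on dynamic features, while Text-CNN on static features has the highest precision but the lowest recall among the trained models.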
