Netinfo Security ›› 2019, Vol. 19 ›› Issue (1): 1-7. doi: 10.3969/j.issn.1671-1122.2019.01.001

• Classified Protection •

A Vulnerability Detection Method Based on Random Detection Algorithm and Information Aggregation

Weiping WEN1, Jingwei LI1, Yingnan JIAO2, Hailin LI1

  1. School of Software and Microelectronics, Peking University, Beijing 102600, China
  2. National Computer Network Emergency Response Technical Team / Coordination Center, Beijing 100029, China
  • Received: 2018-10-16 Online: 2019-01-20 Published: 2020-05-11
  • About the authors:

    Weiping WEN (born 1976), male, from Hunan, professor, Ph.D.; his main research interests include network attack and defense, malicious code research, reverse engineering of information systems, and trusted computing. Jingwei LI (born 1995), male, from Liaoning, master's student; his main research interests are vulnerability analysis and vulnerability discovery. Yingnan JIAO (born 1983), female, from Liaoning, engineer, master's degree; her main research interests include software engineering and information security. Hailin LI (born 1993), male, from Sichuan, master's student; his main research interest is software engineering.

  • Funding:
    National Natural Science Foundation of China [U1736218]

Abstract:

As computer software grows more complex, the security of software architectures continues to decline. Excessive coupling between software modules has driven a sharp increase in the number of vulnerabilities, and the detection of and protection against security vulnerabilities have become key research directions in network security. Existing static vulnerability detection methods perform poorly, while fuzz testing is time-consuming, and the industry lacks a method for rapidly scanning large-scale binary programs for vulnerabilities. Using machine learning, this paper applies a random detection algorithm to extract lightweight static features from decompiled programs and aggregates parameter information during dynamic feature extraction. Models are then trained on the extracted dynamic and static features separately, using algorithms such as Text-CNN, logistic regression, and random forest. Experiments show that the method can effectively detect vulnerabilities in binary programs.

Key words: vulnerability detection, feature extraction, machine learning

CLC number: