Netinfo Security ›› 2022, Vol. 22 ›› Issue (10): 59-68. doi: 10.3969/j.issn.1671-1122.2022.10.009

• Selected Papers •

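Vulnerability Detection Method for the C Language Family Based on Graph Neural Network and Generic Vulnerability Analysis Framework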

ZHU Lina1, MA Mingrui2,3,4, ZHU Dongzhao5

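  1. Department of Network Information Security, Guangdong Police College, Guangzhou 510442, China
  2. School of Cyber Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China
  3. Hubei Key Laboratory of Distributed System Security, Wuhan 430074, China
  4. Hubei Engineering Research Center on Big Data Security, Wuhan 430074, China
  5. Heilongjiang Branch of China Mobile Information Technology Co., Ltd., Harbin 150001, China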
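  • Received: 2022-07-01 Online: 2022-10-10 Published: 2022-11-15
  • Corresponding author: MA Mingrui E-mail: jkpathfinder@126.com
  • About the authors: ZHU Lina (born 1974), female, from Shandong, lecturer, master's degree; main research interest: network information security | MA Mingrui (born 2000), male, from Heilongjiang, master's student; main research interests: neural networks, deep learning, and network information security | ZHU Dongzhao (born 1977), male, from Shandong, senior engineer, master's degree; main research interests: big data and network information security
  • Supported by:
    National Natural Science Foundation of China (6217071437); National Natural Science Foundation of China (62072200); National Natural Science Foundation of China (62127808); Natural Science Foundation of Guangdong Province (2020A1515011096); Natural Science Foundation of Guangdong Province (2019A1515011841); Guangdong Police College Institutional Research Project (2022SY02)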

Abstract:

Most existing automated vulnerability mining tools generalize poorly and suffer from high false positive and false negative rates. This paper proposes CSVDM, a static detection model for multi-class vulnerabilities in the C language family. CSVDM mines vulnerabilities at the source code level using a code similarity comparison module and a generic vulnerability analysis framework module. The code similarity comparison module applies the longest common subsequence (LCS) algorithm and a graph neural network to perform code clone and homology detection between the source code under test and vulnerability templates, and generates a vulnerability similarity list according to a preset threshold. The generic vulnerability analysis framework module performs context-dependent data flow and control flow analysis on the source code under test, which compensates for the similarity module's high false negative rate on vulnerabilities not caused by code cloning, and generates a vulnerability analysis list. CSVDM combines the vulnerability similarity list and the vulnerability analysis list to produce the final vulnerability detection report. Experimental results show that CSVDM substantially improves on vulnerability mining tools such as Checkmarx across the evaluation metrics.

Key words: generic vulnerability analysis framework, LCS algorithm, Skip-Gram model, graph neural network, graph attention mechanism
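
As a rough illustration of the LCS step mentioned in the abstract, the sketch below scores a code fragment against vulnerability templates and keeps the matches above a preset threshold. It is not the authors' implementation: the regex tokenizer, the normalized score 2·LCS/(|a|+|b|), the threshold of 0.8, and the template names are all assumptions made for this example, and the graph neural network stage is omitted entirely.

```python
import re


def tokenize(source: str) -> list[str]:
    """Split C-like source into identifiers, integers, and single symbols.
    (Assumed tokenizer; the paper's preprocessing is not specified here.)"""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)


def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence, using two DP rows."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a: list[str], b: list[str]) -> float:
    """Normalized LCS score in [0, 1]; an assumed metric, not the paper's."""
    total = len(a) + len(b)
    return 2 * lcs_length(a, b) / total if total else 0.0


def similarity_list(target: str, templates: dict[str, str],
                    threshold: float = 0.8) -> list[tuple[str, float]]:
    """Return (template name, score) pairs at or above the threshold,
    sorted from most to least similar."""
    target_tokens = tokenize(target)
    scored = [(name, similarity(target_tokens, tokenize(src)))
              for name, src in templates.items()]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical templates, invented for this example only.
templates = {
    "unbounded_strcpy": "strcpy(dst, src);",
    "gets_call": "gets(buf);",
}
print(similarity_list("strcpy(dst, src);", templates))
```

With these hypothetical templates, only the exact `strcpy` match clears the 0.8 threshold; the `gets` template shares just the punctuation tokens with the target and scores 0.5. A purely token-level score like this cannot see vulnerabilities that do not stem from cloned code, which is the gap the abstract assigns to the data flow and control flow analysis module.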

CLC Number: