Netinfo Security ›› 2019, Vol. 19 ›› Issue (3): 26-33. doi: 10.3969/j.issn.1671-1122.2019.03.004

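Research on Browser Fuzz Sample Generation Technology Based on Deep Learning

Yong FANG1, Guangxiatian ZHU2, Luping LIU2, Peng JIA2

  1. College of Cybersecurity, Sichuan University, Chengdu, Sichuan 610207, China
  2. College of Electronics and Information, Sichuan University, Chengdu, Sichuan 610065, China
  • Received: 2019-01-10 Online: 2019-03-19 Published: 2020-05-11
  • About the authors:

    Yong FANG (方勇, born 1966), male, from Sichuan, professor, Ph.D.; main research interests: information security theory and applications, network attack and defense, and network behavior supervision technology. Guangxiatian ZHU (朱光夏天, born 1993), male, from Hubei, master's student; main research interests: Windows security, vulnerability discovery and exploitation. Luping LIU (刘露平, born 1988), male, from Sichuan, doctoral student; main research interests: binary security and vulnerability discovery. Peng JIA (贾鹏, born 1988), male, from Henan, doctoral student; main research interests: virus propagation dynamics, binary security, and malicious code analysis.

  • Supported by:
    National Key Research and Development Program of China [2017YFB0802900]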

Abstract:

Fuzz testing is one of the most mature and effective methods for discovering software vulnerabilities. However, traditional Fuzz testing commonly suffers from insufficient exploration depth and from samples that lack directivity. To address these problems, this paper proposes a framework that uses a long short-term memory (LSTM) network to guide the generation of the sample set needed for browser Fuzz testing. The framework consists of two parts: sample generation and Fuzz testing. First, the samples are preprocessed: each sample is parsed into vectors, which are fed into the neural network for learning. Second, once training is complete, the trained network is used to generate samples, and the generated samples are mutated with traditional mutation strategies to form the test set. Finally, the test set is used as input for browser Fuzz testing. To verify the effectiveness of the framework, the learning results of the LSTM network, the generated samples, and the Fuzz results were collected and analyzed statistically. The experiments show that the framework meets the needs of browser Fuzz sample generation and overcomes the insufficient mining depth and weak directivity of samples in traditional browser Fuzz testing, making it suitable for mining one or several classes of browser vulnerabilities.

Key words: browser Fuzz, deep learning, sample generation, LSTM neural network, file vectorization
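
The preprocessing and mutation steps described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the encoding or the mutation operators, so everything here is an assumption: character-level integer encoding stands in for file vectorization, sliding windows stand in for the training data a next-character LSTM would consume, and a random delete/duplicate/replace pass stands in for the traditional mutation strategies. The LSTM itself is omitted, and all function names are hypothetical.

```python
import random


def build_vocab(samples):
    """Assign an integer id to every distinct character in the corpus."""
    chars = sorted(set("".join(samples)))
    return {ch: idx for idx, ch in enumerate(chars)}


def vectorize(sample, vocab):
    """Encode one sample as a list of character ids (assumed encoding)."""
    return [vocab[ch] for ch in sample]


def devectorize(ids, vocab):
    """Decode a list of character ids back into text."""
    inverse = {idx: ch for ch, idx in vocab.items()}
    return "".join(inverse[i] for i in ids)


def training_pairs(ids, window):
    """Build (input window, next id) pairs, the shape of data a
    next-character sequence model such as an LSTM is trained on."""
    return [(ids[i:i + window], ids[i + window])
            for i in range(len(ids) - window)]


def mutate(sample, rng, rate=0.05):
    """Hypothetical mutation pass: with probability `rate`, each character
    is deleted, duplicated, or replaced by a markup-significant character."""
    alphabet = "<>/\"'=&;0123456789"
    out = []
    for ch in sample:
        if rng.random() >= rate:
            out.append(ch)  # keep the character unchanged
            continue
        op = rng.choice(("delete", "duplicate", "replace"))
        if op == "duplicate":
            out.append(ch * 2)
        elif op == "replace":
            out.append(rng.choice(alphabet))
        # "delete": append nothing
    return "".join(out)


corpus = ["<html><body></body></html>", "<div id='a'></div>"]
vocab = build_vocab(corpus)
ids = vectorize(corpus[0], vocab)
pairs = training_pairs(ids, window=4)
# In the framework, a trained network would generate new samples here;
# this sketch mutates a corpus sample directly instead.
test_case = mutate(corpus[0], random.Random(0), rate=0.2)
```

Passing a seeded `random.Random` instance makes each mutated test case reproducible, which helps when a crashing input needs to be replayed.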

CLC Number: