Netinfo Security, 2019, Vol. 19, Issue (6): 61-67. doi: 10.3969/j.issn.1671-1122.2019.06.008

• Technical Research •

Optimizing AFL Mutation Based on an Exploration-Exploitation Model

Peng XU1, Jiayong LIU2, Bo LIN1

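  1. College of Electronics and Information, Sichuan University, Chengdu, Sichuan 610065, China
  2. College of Cybersecurity, Sichuan University, Chengdu, Sichuan 610065, China
  • Received: 2019-01-10 Online: 2019-06-10 Published: 2020-05-11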
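  • About the authors:
    Peng XU (born 1981), male, from Sichuan, master's student; main research interests: information system security and vulnerability discovery. Jiayong LIU (born 1962), male, from Sichuan, professor, Ph.D.; main research interests: information security theory and applications, network communication, and network security. Bo LIN (born 1984), male, from Sichuan, master's student; main research interest: information system security.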

  • Supported by:
    National Key Research and Development Program of China [2017YFB0802904]

Method on the Model of Exploration and Exploitation to Optimize the AFL Mutation

Peng XU1, Jiayong LIU2, Bo LIN1

  1. College of Electronics and Information, Sichuan University, Chengdu, Sichuan 610065, China
  2. College of Cybersecurity, Sichuan University, Chengdu, Sichuan 610065, China
  • Received: 2019-01-10 Online: 2019-06-10 Published: 2020-05-11

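Abstract:

Fuzzing tests a program by continuously generating different inputs in order to discover and identify security vulnerabilities, and it is widely used in vulnerability discovery. Grey-box fuzzing is currently the most popular fuzzing strategy; it combines lightweight code instrumentation with feedback-driven guidance to generate new program inputs. AFL (American Fuzzy Lop) is an outstanding grey-box fuzzing tool, known for its efficient forkserver execution, reliable genetic algorithm, and variety of mutation strategies, but its mutation strategy relies mainly on random mutation and is therefore largely blind. This paper proposes a reinforcement learning method to optimize the mutation strategy. Using the multi-armed bandit problem as the model, the method records how inputs produced by different mutation operators perform when executed in the target program, uses an exploration-exploitation algorithm to adaptively learn the probability distribution of mutation outcomes, and adjusts the mutation strategy accordingly to improve the fuzzing performance of AFL. With Thompson sampling as the optimization algorithm, the paper designs and implements the AFL-EE fuzzing tool and validates it on five types of commonly used file-processing programs. The experiments show that the method automatically adjusts the mutation strategy and effectively generates test inputs with high coverage; it is feasible, incurs little additional resource consumption, and overall outperforms AFL.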

Key words: AFL, multi-armed bandit, exploration-exploitation, Thompson sampling

Abstract:

Fuzzing detects and identifies security vulnerabilities by continuously generating different inputs to test a program, and it is widely used in vulnerability discovery. Grey-box fuzzing is currently the most popular fuzzing strategy; it combines lightweight code instrumentation with feedback-driven guidance to generate new program inputs. AFL (American Fuzzy Lop) is an excellent grey-box fuzzing tool, known for its efficient forkserver execution, reliable genetic algorithm, and variety of mutation strategies. However, its mutation strategy relies mainly on random mutation and is therefore largely blind. This paper proposes a reinforcement learning method to optimize the mutation strategy. Modeling the problem as a multi-armed bandit, the method records how inputs generated by different mutation operators perform when executed in the target program. An exploration-exploitation algorithm adaptively learns the probability distribution of mutation outcomes, and the mutation strategy is adjusted accordingly to improve the fuzzing performance of AFL. On this basis, Thompson sampling is chosen as the optimization algorithm to design and implement the AFL-EE fuzzing tool, which is validated on five types of commonly used file-processing programs. Experiments show that the method automatically adjusts the mutation strategy and effectively generates test inputs with high coverage. The method is feasible, incurs little additional resource consumption, and generally outperforms the original AFL.

Key words: AFL, multi-armed bandit, exploration-exploitation, Thompson sampling
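
The abstract's core idea, treating each mutation operator as an arm of a multi-armed bandit and choosing among arms with Thompson sampling, can be illustrated with a minimal sketch. This is not the AFL-EE implementation: the Beta-Bernoulli model, the binary reward ("did the mutated input reach new coverage?"), the class and function names, and the per-operator success rates are all assumptions made for illustration. The operator labels borrow AFL stage names, but which operators AFL-EE actually treats as arms is not stated in the abstract.

```python
import random


class ThompsonMutationScheduler:
    """Beta-Bernoulli Thompson sampling over a fixed set of mutation operators.

    Hypothetical illustration only; not the AFL-EE code.
    """

    def __init__(self, operators, seed=None):
        self.operators = list(operators)
        # Beta(1, 1) prior per arm: alpha counts successes, beta counts failures.
        self.alpha = {op: 1.0 for op in self.operators}
        self.beta = {op: 1.0 for op in self.operators}
        self.rng = random.Random(seed)

    def choose(self):
        # Draw one plausible success rate per arm from its posterior,
        # then play the arm with the highest draw.
        samples = {
            op: self.rng.betavariate(self.alpha[op], self.beta[op])
            for op in self.operators
        }
        return max(samples, key=samples.get)

    def update(self, op, found_new_coverage):
        # Assumed binary reward: 1 if the mutated input hit new coverage.
        if found_new_coverage:
            self.alpha[op] += 1.0
        else:
            self.beta[op] += 1.0

    def pulls(self, op):
        # Number of times this arm has been played (prior counts removed).
        return int(self.alpha[op] + self.beta[op] - 2)


def simulate(rounds=5000, seed=0):
    # Made-up probabilities that each operator yields new coverage;
    # a real fuzzer would observe this from its coverage feedback instead.
    true_rates = {"bitflip": 0.02, "arith": 0.05, "interest": 0.03, "havoc": 0.20}
    scheduler = ThompsonMutationScheduler(true_rates, seed=seed)
    env = random.Random(seed + 1)
    for _ in range(rounds):
        op = scheduler.choose()
        scheduler.update(op, env.random() < true_rates[op])
    return scheduler


if __name__ == "__main__":
    result = simulate()
    for name in result.operators:
        print(name, result.pulls(name))
```

Early rounds spread trials across all operators (exploration). As the posteriors sharpen, most trials go to the operator with the highest observed success rate (exploitation), which is the adaptive behavior the abstract describes.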

CLC Number: