Netinfo Security ›› 2021, Vol. 21 ›› Issue (10): 69-75. doi: 10.3969/j.issn.1671-1122.2021.10.010

• Selected Papers •

Malicious Domain Name Training Data Generation Technology Based on Improved CNN Model

MA Xiao, CAI Manchun, LU Tianliang

  1. College of Information Network Security, People's Public Security University of China, Beijing 100038, China
  • Received: 2021-05-06 Online: 2021-10-10 Published: 2021-10-14
  • Contact: CAI Manchun E-mail: caimanchun@ppsuc.edu.cn
  • About the authors: MA Xiao (born 1997), male, from Shandong, master's student; main research interest: information network security | CAI Manchun (born 1972), male, from Hebei, associate professor, Ph.D.; main research interests: cryptography and communication security | LU Tianliang (born 1985), male, from Hebei, associate professor, Ph.D.; main research interests: malicious code and artificial intelligence security
  • Funding:
    Key Project on Cryptographic Theory Research of the National Cryptography Development Fund during the 13th Five-Year Plan (MMJJ20180108); Major Project of the 2019 Fundamental Research Funds of People's Public Security University of China (2019JKF108)

Abstract:

In recent years, new botnets have come to rely on command and control (C&C) server communication to carry out attacks, and they use domain generation algorithms (DGAs) to evade detection. Traditional domain generation algorithms have notable drawbacks: addressing efficiency is low, and the large volume of traffic associated with the generated domains makes the communication easy to detect. This paper improves the traditional CNN model and, drawing on ideas from text generation, uses a Bi-LSTM with a self-attention mechanism to generate malicious domain names. The results show that the domain name data generated by this method performs well in comparative experiments, can simulate real domain name data, and improves the efficiency of malicious domain name detection.

Key words: malicious domain name, convolutional neural network, botnet, machine learning

CLC number: