Netinfo Security ›› 2023, Vol. 23 ›› Issue (10): 8-15. doi: 10.3969/j.issn.1671-1122.2023.10.002

• Outstanding Papers •

Research on DGA Malicious Domain Name Detection Method Based on Transfer Learning and Threat Intelligence

YE Huanrong1,2, LI Muyuan1,3, JIANG Bo4,5

  1. School of Information Network Security, People’s Public Security University of China, Beijing 100038, China
  2. Cyber Police Division of Zigong Municipal Public Security Bureau, Zigong 643000, China
  3. Cyber Police Division of Qingdao Municipal Public Security Bureau, Qingdao 266000, China
  4. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 10085, China
  5. School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2023-05-09 Online: 2023-10-10 Published: 2023-10-11
  • Corresponding author: JIANG Bo E-mail: jiangbo@iie.ac.cn
  • About the authors: YE Huanrong (born 1988), male, from Sichuan, master's student, CCF member; main research interests: network attack and defense technology, network threat situational awareness | LI Muyuan (born 1988), male, from Shandong, master's student; main research interests: intelligent information processing, natural language processing | JIANG Bo (born 1985), male, from Anhui, associate research fellow, Ph.D., CCF member; main research interests: situational awareness, behavior analysis, intelligent information processing
  • Supported by:
    National Key Research and Development Program of China (2021YFF0307203); National Key Research and Development Program of China (2019QY1303); Strategic Priority Research Program (Category C) of the Chinese Academy of Sciences (XDC02040100)

Abstract:

Domain generation algorithms (DGAs) are widely used in many types of cyber attacks. DGA samples change rapidly, have many variants, and are difficult to obtain, so existing traditional models achieve low detection accuracy and poor early-warning capability. To address this, the article proposes a DGA malicious domain name detection method based on transfer learning and threat intelligence. The method builds a combined model of a bidirectional long short-term memory (Bi-LSTM) neural network and a Transformer to extract context and semantic relationship features of malicious domain names, pre-trains the model on a publicly available large-sample malicious domain name dataset, and transfers the trained parameters to new, unknown small-sample malicious domain names of APT organizations, recorded in threat intelligence, to test the model's detection performance. Experimental results show that the model achieves an average detection accuracy of 96.14% on small-sample datasets of malicious domain names used by multiple APT organizations, indicating good detection performance.

Key words: malicious domain name, transfer learning, threat intelligence, Bi-LSTM, Transformer
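
The workflow summarized in the abstract (a combined Bi-LSTM and Transformer model, pre-training on a large dataset, then transferring parameters to a small-sample task) could be sketched roughly as follows in PyTorch. This is a minimal illustration under stated assumptions, not the authors' implementation. The character vocabulary, layer sizes, mean pooling, the decision to freeze everything except the classification head, and the names `encode`, `BiLstmTransformer`, and `transfer` are all hypothetical. The two example domains and their labels are made up.

```python
import torch
import torch.nn as nn

# Assumed character set for domain names; id 0 is reserved for padding/unknown.
VOCAB = "abcdefghijklmnopqrstuvwxyz0123456789-._"
PAD = 0
MAX_LEN = 64  # assumed maximum domain length


def encode(domain: str, max_len: int = MAX_LEN) -> torch.Tensor:
    """Map a domain name to a fixed-length tensor of character ids."""
    ids = [VOCAB.find(c) + 1 for c in domain.lower()[:max_len]]  # unknown -> 0
    ids += [PAD] * (max_len - len(ids))
    return torch.tensor(ids, dtype=torch.long)


class BiLstmTransformer(nn.Module):
    """Hypothetical combined model: embedding -> Bi-LSTM -> Transformer encoder -> classifier."""

    def __init__(self, emb=32, hidden=32, heads=4, layers=1, classes=2):
        super().__init__()
        self.embed = nn.Embedding(len(VOCAB) + 1, emb, padding_idx=PAD)
        self.bilstm = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        enc_layer = nn.TransformerEncoderLayer(
            d_model=2 * hidden, nhead=heads, dim_feedforward=64, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)
        self.head = nn.Linear(2 * hidden, classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, _ = self.bilstm(self.embed(x))  # (batch, length, 2 * hidden)
        h = self.encoder(h)                # self-attention over Bi-LSTM outputs
        return self.head(h.mean(dim=1))    # mean-pool positions, then classify


def transfer(pretrained: BiLstmTransformer, classes: int = 2) -> BiLstmTransformer:
    """Copy pre-trained feature layers into a new model and freeze them.

    Assumption: only the classification head is re-trained on the small dataset.
    """
    model = BiLstmTransformer(classes=classes)
    state = {k: v for k, v in pretrained.state_dict().items()
             if not k.startswith("head.")}
    model.load_state_dict(state, strict=False)  # head keeps its fresh initialization
    for name, param in model.named_parameters():
        param.requires_grad = name.startswith("head.")
    return model


# `source` stands in for a model already pre-trained on a large public dataset.
source = BiLstmTransformer()
target = transfer(source)

# One fine-tuning step on a made-up two-sample batch (0 = benign, 1 = DGA).
x = torch.stack([encode("example.com"), encode("xjkqpzvwmtbn.net")])
y = torch.tensor([0, 1])
optimizer = torch.optim.Adam(
    [p for p in target.parameters() if p.requires_grad], lr=1e-3
)
loss = nn.functional.cross_entropy(target(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Freezing the feature layers is only one possible transfer strategy; fine-tuning all layers at a lower learning rate would be an equally plausible reading of "transferring the trained parameters".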

CLC Number: