Netinfo Security (信息网络安全) ›› 2024, Vol. 24 ›› Issue (3): 462-472. doi: 10.3969/j.issn.1671-1122.2024.03.011

• Technical Research •

基于深度度量学习的异常流量检测方法

张强1, 何俊江1, 李汶珊1,2, 李涛1

  1. 四川大学网络空间安全学院,成都 610065
  2. 成都信息工程大学网络空间安全学院,成都 610225
  • 收稿日期:2023-07-12 出版日期:2024-03-10 发布日期:2024-04-03
  • 通讯作者: 何俊江 E-mail:hejunjiang@scu.edu.cn
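  • About the authors:
    ZHANG Qiang (born 1999), male, from Henan, master's student; main research interests: network traffic classification, deep learning, and data mining
    HE Junjiang (born 1993), male, from Sichuan, assistant research fellow, Ph.D.; main research interests: network traffic analysis and identification, information security, and data mining
    LI Wenshan (born 1995), female, from Sichuan, lecturer, Ph.D. candidate; main research interests: data science, machine learning, and bioinformatics
    LI Tao (born 1965), male, from Sichuan, professor, Ph.D.; main research interests: artificial immune systems, network security, information security, and data security
  • Supported by:
    National Natural Science Foundation of China (62032002); National Natural Science Foundation of China (62101358); National Key Research and Development Program of China (2020YFB1805400); China Postdoctoral Science Foundation (2020M683345); Fundamental Research Funds for the Central Universities (2023SCU12127); Sichuan Province Youth Fund (2023NSFSC1395); Joint Innovation Fund of Sichuan University and Nuclear Power Institute of China (HG2022143)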

Anomaly Traffic Detection Based on Deep Metric Learning

ZHANG Qiang1, HE Junjiang1, LI Wenshan1,2, LI Tao1

  1. School of Cyber Science and Engineering, Sichuan University, Chengdu 610065, China
  2. School of Cybersecurity, Chengdu University of Information Technology, Chengdu 610225, China
  • Received: 2023-07-12 Online: 2024-03-10 Published: 2024-04-03
  • Contact: HE Junjiang E-mail: hejunjiang@scu.edu.cn

摘要:

网络异常流量识别是目前网络安全的重要任务之一。然而传统流量分类模型是依据流量数据训练得到,由于大部分流量数据分布不均导致分类边界模糊,极大限制了模型的分类性能。为解决上述问题,文章提出一种基于深度度量学习的异常流量检测方法。首先,与传统深度度量学习每个类别单一代理的算法不同,文章设计双代理机制,通过目标代理指引更新代理的优化方向,提升模型的训练效率,增强同类别流量数据的聚集能力和不同类别流量数据的分离能力,实现最小化类内距离和最大化类间距离,使数据的分类边界更清晰;然后,搭建基于1D-CNN和Bi-LSTM的神经网络,分别从空间和时间的角度高效提取流量特征。实验结果表明,NSL-KDD流量数据经过模型处理,其类内距离显著减小并且类间距离显著增大,类内距离相比原始类内距离减小了73.5%,类间距离相比原始类间距离增加了52.7%,且将文章搭建的神经网络比广泛使用的深度残差网络训练时间更短、效果更好。将文章所提模型应用在流量分类任务中,在NSL-KDD和CICIDS2017数据集上,相比传统的流量分类算法,其分类效果更好。

关键词: 深度度量学习, 异常流量检测, 流量数据分布, 神经网络

Abstract:

Identifying anomalous network traffic is one of the important tasks in cyber security today. However, traditional traffic classification models are trained directly on traffic data, and most traffic data are unevenly distributed, which blurs classification boundaries and greatly limits the classification performance of the model. To solve this problem, this paper proposes an anomaly traffic detection method based on deep metric learning. First, unlike traditional deep metric learning algorithms that use a single proxy for each category, a double-proxy mechanism is designed in which a target proxy guides the optimization direction of an updatable proxy. This improves training efficiency and strengthens the model's ability to aggregate traffic data of the same category and to separate traffic data of different categories, minimizing the intra-class distance and maximizing the inter-class distance. As a result, the classification boundaries of the data become clearer, breaking the performance bottleneck of traditional traffic classification models. Second, a neural network based on 1D-CNN and Bi-LSTM is built to extract traffic features efficiently from the spatial and temporal perspectives, respectively. The experimental results show that, after NSL-KDD traffic data are processed by the model, the intra-class distance decreases by 73.5% and the inter-class distance increases by 52.7% relative to the original data. In addition, when used for deep metric learning, the neural network built in this paper trains in less time and achieves better results than the widely used deep residual network. When the proposed model is applied to the traffic classification task on the NSL-KDD and CICIDS2017 datasets, the classification performance is also significantly improved compared with traditional traffic classification algorithms.

Key words: deep metric learning, abnormal traffic detection, traffic data distribution, neural network
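
The abstract reports changes in intra-class and inter-class distance but does not state how they are computed. The following is a minimal illustrative sketch under an assumed definition: intra-class distance is the mean Euclidean distance from each sample to its own class centroid, and inter-class distance is the mean Euclidean distance over all pairs of class centroids. The function names `centroid` and `class_distances` are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def class_distances(embeddings, labels):
    """Return (intra, inter) under the assumed centroid-based definition.

    intra: mean distance from each sample to the centroid of its class.
    inter: mean distance over all unordered pairs of class centroids.
    Requires at least two distinct labels.
    """
    # Group the vectors by class label.
    groups = {}
    for vec, lab in zip(embeddings, labels):
        groups.setdefault(lab, []).append(vec)

    centroids = {lab: centroid(pts) for lab, pts in groups.items()}

    # Average sample-to-own-centroid distance over all samples.
    intra = sum(
        math.dist(vec, centroids[lab]) for vec, lab in zip(embeddings, labels)
    ) / len(embeddings)

    # Average centroid-to-centroid distance over all class pairs.
    pairs = list(combinations(centroids.values(), 2))
    inter = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
    return intra, inter
```

Under this assumed definition, a relative change such as the reported reduction in intra-class distance would be obtained by evaluating `class_distances` once on the raw feature vectors and once on the learned embeddings, then comparing the two values as `(before - after) / before`.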
