Netinfo Security, 2023, Vol. 23, Issue (9): 12-24. doi: 10.3969/j.issn.1671-1122.2023.09.002

• Technical Research •

A Method of Feature Extraction for Network Traffic Based on Time-Frequency Diagrams and Improved E-GraphSAGE

ZHANG Yuchen, ZHANG Yawen, WU Yue, LI Cheng

  1. Department of Cryptogram Engineering, Information Engineering University of PLA, Zhengzhou 450001, China
  • Received: 2023-05-25 Online: 2023-09-10 Published: 2023-09-18
  • Contact: ZHANG Yawen E-mail: wyyw4ever@qq.com
  • About the authors: ZHANG Yuchen (born 1977), male, from Henan, professor, Ph.D.; main research interests: cyberspace security, network situation analysis, and big data processing | ZHANG Yawen (born 1996), female, from Anhui, master's student; main research interests: network traffic classification, network intrusion detection, and network security management | WU Yue (born 1996), male, from Zhejiang, master's student; main research interests: network security management, network situation awareness, and decision support | LI Cheng (born 1993), male, from Henan, master's student; main research interests: systems engineering project management and system risk assessment
  • Supported by:
    National Natural Science Foundation of China (61902427)

Abstract:

Because network systems vary over time, traffic in the time domain is unstable and hard to separate, and traditional spatiotemporal network models neither characterize the spatial structure of spatiotemporal sequence data adequately nor mine its spatiotemporal features fully. To address these problems, this paper proposes a network traffic feature extraction method based on time-frequency diagrams and an improved E-GraphSAGE. First, with the bior1.3 wavelet basis function as the potential-change basis, the original traffic is mapped from the one-dimensional time domain to the time-frequency domain, and noise bands are removed through visual analysis. Next, a 1D ConvLSTM model is fused inside the E-GraphSAGE model to build a three-dimensional feature extraction method that incorporates long-term spatiotemporal dependencies. Finally, edge embeddings of the three-dimensional spatial-temporal-frequency features, containing both local and global information, are obtained, which addresses the loss of global information in traditional spatiotemporal feature extraction models. Visual analysis and multi-class classification experiments show that the processed traffic features are more stable and more separable. Compared with other closely related methods, the proposed method also performs better in accuracy, precision, recall, and F1-score.

Key words: traffic classification, time-frequency analysis, flow spectrum theory, feature extraction, E-GraphSAGE
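
For readers unfamiliar with E-GraphSAGE, the edge-embedding step mentioned in the abstract can be illustrated with a minimal sketch of a single, plain E-GraphSAGE-style layer. It is not the improved model proposed in the paper: the ConvLSTM fusion and the wavelet preprocessing are omitted. The function names, the all-ones initial node state, and the toy weights are assumptions made for illustration only.

```python
from collections import defaultdict


def mean_vec(vectors, dim):
    """Element-wise mean of equal-length vectors (zeros if the list is empty)."""
    if not vectors:
        return [0.0] * dim
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def relu_linear(weights, x):
    """Apply a weight matrix (list of rows) to x, followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]


def edge_embeddings(edges, edge_feats, node_dim, weights):
    """One simplified E-GraphSAGE-style layer (hypothetical helper).

    edges      -- list of (u, v) endpoint pairs, one per flow
    edge_feats -- list of per-flow feature vectors, aligned with edges
    node_dim   -- size of the initial all-ones node state (an assumption)
    weights    -- matrix with node_dim + feature_dim columns
    """
    feat_dim = len(edge_feats[0])

    # Collect the features of every edge incident to each node.
    incident = defaultdict(list)
    for (u, v), feat in zip(edges, edge_feats):
        incident[u].append(feat)
        incident[v].append(feat)

    # Node embedding: ReLU(W * concat(previous node state, mean of edge features)).
    node_emb = {}
    for node, feats in incident.items():
        h_prev = [1.0] * node_dim
        h_neigh = mean_vec(feats, feat_dim)
        node_emb[node] = relu_linear(weights, h_prev + h_neigh)

    # Edge embedding: concatenation of the two endpoint node embeddings.
    return [node_emb[u] + node_emb[v] for u, v in edges]


# Toy graph: two flows, A-B and B-C, each with two features.
toy_edges = [("A", "B"), ("B", "C")]
toy_feats = [[1.0, 3.0], [3.0, 5.0]]
toy_weights = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]
toy_result = edge_embeddings(toy_edges, toy_feats, 1, toy_weights)
```

In this sketch each flow is an edge between two endpoints, so classifying a flow amounts to classifying its edge embedding. A real model would learn the weights and stack several such layers.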

CLC Number: