Netinfo Security ›› 2022, Vol. 22 ›› Issue (8): 64-71. doi: 10.3969/j.issn.1671-1122.2022.08.008
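Received: 2022-04-12
Online: 2022-08-10
Published: 2022-09-15
Corresponding author: LI Chenghao
E-mail: 120106333731@njust.edu.cn
About the authors:
WEI Songjie (born 1977), male, from Jiangsu, associate professor, Ph.D. His main research interests are network attack and defense, traffic analysis, intrusion detection, identity authentication, security protocol design, security situational awareness, and blockchain.
LI Chenghao (born 1997), male, from Anhui, master's student. His main research interests are traffic analysis and intrusion detection.
SHEN Haotong (born 1998), male, from Jiangsu, master's student. His main research interests are software-defined networking, DDoS detection, and traceback.
ZHANG Wenzhe (born 1997), male, from Henan, master's student. His main research interests are reinforcement learning, cooperative games, and intrusion detection.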
Authors: WEI Songjie, LI Chenghao, SHEN Haotong, ZHANG Wenzhe
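Abstract:
Network traffic classification has long been a focus of research, and the widespread use of data encryption has turned it into an open technical challenge. Encryption is a key technique in many privacy-enhancing tools. Among these, the darknet built on the anonymous communication system Tor is currently the largest anonymous communication entity and is often used by criminals for illegal activities, so identifying Tor traffic efficiently is of considerable research significance. Based on the characteristics of Tor anonymous communication traffic, this paper designs a set of network flow features for detecting Tor traffic behavior. To address the memory and time overhead limitations of the original deep forest model, it proposes an improved deep forest model for identifying Tor network traffic. Experimental results show that, compared with existing identification methods, the proposed model reaches an accuracy of 99.86% while also reducing detection time and memory requirements.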
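The abstract does not say how the deep forest is modified. One published way to reduce the time and memory cost of a deep forest cascade is confidence screening: instances that a cascade level already classifies with high confidence are finalized at that level and are not passed to deeper levels. The Python sketch below illustrates only that idea, and whether the paper's model works this way is an assumption. The names `screen`, `cascade_predict`, the stub levels, and the threshold value are all hypothetical, and the stub levels return fixed probabilities instead of training real forests.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Proba = Sequence[float]                           # class probabilities for one flow
Level = Callable[[List[int]], Dict[int, Proba]]   # hypothetical cascade-level interface


def argmax(p: Proba) -> int:
    return max(range(len(p)), key=p.__getitem__)


def screen(probas: Dict[int, Proba], threshold: float) -> Tuple[Dict[int, int], List[int]]:
    """Split instances into confidently labelled ones and ones needing more levels."""
    finalized: Dict[int, int] = {}
    remaining: List[int] = []
    for idx, p in probas.items():
        best = argmax(p)
        if p[best] >= threshold:
            finalized[idx] = best      # confident: stop here
        else:
            remaining.append(idx)      # uncertain: pass to the next level
    return finalized, remaining


def cascade_predict(n: int, levels: Sequence[Level], threshold: float) -> Dict[int, int]:
    """Run instances 0..n-1 through the cascade, screening after every level."""
    if not levels:
        raise ValueError("at least one cascade level is required")
    active = list(range(n))
    labels: Dict[int, int] = {}
    last: Dict[int, Proba] = {}
    for level in levels:
        if not active:
            break
        last = level(active)           # only still-uncertain instances are evaluated
        done, active = screen(last, threshold)
        labels.update(done)
    for idx in active:                 # still undecided: take the last level's argmax
        labels[idx] = argmax(last[idx])
    return labels


# Stub levels with fixed outputs (illustration only; class 1 stands for "Tor").
seen: List[List[int]] = []
level1_out = {0: [0.97, 0.03], 1: [0.55, 0.45], 2: [0.02, 0.98]}
level2_out = {1: [0.30, 0.70]}


def level1(ids: List[int]) -> Dict[int, Proba]:
    seen.append(list(ids))
    return {i: level1_out[i] for i in ids}


def level2(ids: List[int]) -> Dict[int, Proba]:
    seen.append(list(ids))
    return {i: level2_out[i] for i in ids}


labels = cascade_predict(3, [level1, level2], threshold=0.9)
print(labels, seen)
```

In this toy run the second level evaluates only instance 1, because instances 0 and 2 were finalized at the first level. A real cascade would also concatenate each level's output with the original features before the next level, which is omitted here.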
Cite this article:
WEI Songjie, LI Chenghao, SHEN Haotong, ZHANG Wenzhe. Research and Application of Network Anonymous Traffic Detection Method Based on Deep Forest[J]. Netinfo Security, 2022, 22(8): 64-71.
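Table 2
Feature description
Feature | Description
---|---
Flow Bytes/s | Flow bytes per second
Flow Duration | Duration of the flow
Fwd IAT mean/max/min/std | Mean/max/min/std of packet inter-arrival time in the forward direction
Bwd IAT mean/max/min/std | Mean/max/min/std of packet inter-arrival time in the backward direction
Fwd Packet Length mean/max/min/std | Mean/max/min/std of packet length in the forward direction
Bwd Packet Length mean/max/min/std | Mean/max/min/std of packet length in the backward direction
Flow IAT mean/max/min/std | Mean/max/min/std of packet inter-arrival time over the whole flow
Active mean/max/min/std | Mean/max/min/std of the time the flow was active before becoming idle
Idle mean/max/min/std | Mean/max/min/std of the time the flow was idle before becoming active
Total Fwd Packet | Total number of packets in the forward direction
Total Bwd packets | Total number of packets in the backward direction
Total Length of Fwd Packet | Total size of packets in the forward direction
Total Length of Bwd Packet | Total size of packets in the backward direction
Flow Packets/s | Flow packets per second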
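
To make the definitions in Table 2 concrete, the following Python sketch computes most of them for a single flow. It is a minimal illustration, not the paper's feature extractor. The `Packet` record, the function names, the use of seconds as the time unit, and the choice of population standard deviation are all assumptions. The Active and Idle features are left out because they depend on an activity threshold that this excerpt does not give.

```python
from statistics import mean, pstdev
from typing import Dict, List, NamedTuple


class Packet(NamedTuple):      # hypothetical record, not from the paper
    ts: float                  # capture timestamp in seconds
    length: int                # packet length in bytes
    forward: bool              # True if sent in the direction of the flow's first packet


def summarize(values: List[float], prefix: str) -> Dict[str, float]:
    """mean/max/min/std of a value list; zeros when the list is empty."""
    if not values:
        return {f"{prefix} {k}": 0.0 for k in ("mean", "max", "min", "std")}
    return {
        f"{prefix} mean": float(mean(values)),
        f"{prefix} max": float(max(values)),
        f"{prefix} min": float(min(values)),
        f"{prefix} std": float(pstdev(values)),   # population std (assumption)
    }


def inter_arrival_times(timestamps: List[float]) -> List[float]:
    return [b - a for a, b in zip(timestamps, timestamps[1:])]


def flow_features(packets: List[Packet]) -> Dict[str, float]:
    if not packets:
        raise ValueError("a flow needs at least one packet")
    pkts = sorted(packets, key=lambda p: p.ts)
    fwd = [p for p in pkts if p.forward]
    bwd = [p for p in pkts if not p.forward]
    duration = pkts[-1].ts - pkts[0].ts
    total_bytes = sum(p.length for p in pkts)
    feats: Dict[str, float] = {
        "Flow Duration": duration,
        "Flow Bytes/s": total_bytes / duration if duration > 0 else 0.0,
        "Flow Packets/s": len(pkts) / duration if duration > 0 else 0.0,
        "Total Fwd Packet": len(fwd),
        "Total Bwd packets": len(bwd),
        "Total Length of Fwd Packet": sum(p.length for p in fwd),
        "Total Length of Bwd Packet": sum(p.length for p in bwd),
    }
    feats.update(summarize(inter_arrival_times([p.ts for p in pkts]), "Flow IAT"))
    feats.update(summarize(inter_arrival_times([p.ts for p in fwd]), "Fwd IAT"))
    feats.update(summarize(inter_arrival_times([p.ts for p in bwd]), "Bwd IAT"))
    feats.update(summarize([p.length for p in fwd], "Fwd Packet Length"))
    feats.update(summarize([p.length for p in bwd], "Bwd Packet Length"))
    return feats


# Toy flow: two forward and two backward packets over two seconds.
demo = [
    Packet(0.0, 100, True),
    Packet(0.5, 500, False),
    Packet(1.0, 200, True),
    Packet(2.0, 300, False),
]
feats = flow_features(demo)
print(feats["Flow Bytes/s"], feats["Fwd IAT mean"], feats["Bwd IAT mean"])
```

For the toy flow, 1100 bytes over 2 seconds give a Flow Bytes/s of 550, and each direction has a single inter-arrival gap (1.0 s forward, 1.5 s backward).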
Table 5
Tor traffic detection experiments under different flow timeout values
Method | Accuracy | Precision | F1-Score | Recall | Training time/s | Test time/s | Memory overhead/MB | Flow timeout/s
---|---|---|---|---|---|---|---|---
gcCS | 99.90% | 99.24% | 98.96% | 98.68% | 189.54 | 11.94 | 24.67 | 2
gc | 99.90% | 99.62% | 99.20% | 98.78% | 692.11 | 29.06 | 25.14 | 2
gcCS | 99.85% | 99.56% | 99.40% | 99.24% | 197.87 | 3.89 | 20.84 | 4
gc | 99.87% | 99.67% | 99.48% | 99.30% | 467.00 | 21.25 | 25.55 | 4
gcCS | 99.83% | 99.62% | 99.46% | 99.54% | 159.12 | 3.20 | 20.02 | 8
gc | 99.85% | 99.68% | 99.59% | 99.51% | 378.96 | 20.94 | 23.125 | 8
gcCS | 99.86% | 99.82% | 99.72% | 99.62% | 114.92 | 2.72 | 20.00 | 16
gc | 99.87% | 99.82% | 99.68% | 99.75% | 291.86 | 32.57 | 22.60 | 16
gcCS | 99.85% | 98.61% | 97.61% | 96.93% | 213.00 | 8.76 | 20.14 | 32
gc | 99.89% | 98.96% | 98.37% | 97.78% | 259.00 | 29.43 | 23.09 | 32
gcCS | 99.85% | 98.76% | 96.67% | 94.66% | 162.98 | 6.42 | 20.09 | 64
gc | 99.85% | 98.46% | 96.68% | 94.96% | 148.86 | 17.75 | 21.13 | 64
gcCS | 99.85% | 94.39% | 95.36% | 96.35% | 166.86 | 2.80 | 22.04 | 128
gc | 99.90% | 96.88% | 96.88% | 96.88% | 141.00 | 24.55 | 22.18 | 128
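
The columns of Table 5 are linked by simple arithmetic, which the short Python sketch below works through for the two rows with a 16 s flow timeout. It assumes the F1-Score column is the usual harmonic mean of precision and recall. The helper name `f1_from` and the dictionary layout are illustrative only, and all numbers are copied from the table.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    return 2 * precision * recall / (precision + recall)


# Values copied from the 16 s rows of Table 5.
gccs = {"precision": 99.82, "recall": 99.62, "train_s": 114.92, "test_s": 2.72, "mem_mb": 20.00}
gc = {"train_s": 291.86, "test_s": 32.57, "mem_mb": 22.60}

f1 = f1_from(gccs["precision"], gccs["recall"])
train_speedup = gc["train_s"] / gccs["train_s"]
test_speedup = gc["test_s"] / gccs["test_s"]
mem_saving = 1 - gccs["mem_mb"] / gc["mem_mb"]

print(f"F1 = {f1:.2f}%")                                   # F1 = 99.72%
print(f"training {train_speedup:.2f}x faster")             # training 2.54x faster
print(f"testing {test_speedup:.2f}x faster")               # testing 11.97x faster
print(f"memory reduced by {mem_saving:.1%}")               # memory reduced by 11.5%
```

At this timeout the recomputed F1 matches the reported 99.72%, and gcCS trains about 2.5 times faster and tests about 12 times faster than gc, while using roughly 11.5% less memory.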