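Netinfo Security ›› 2023, Vol. 23 ›› Issue (11): 94-103. doi: 10.3969/j.issn.1671-1122.2023.11.010
Authors:
LIAO Liyun, ZHANG Bolei, WU Lifa
Received:
2023-08-10
Online:
2023-11-10
Published:
2023-11-10
Corresponding author:
WU Lifa
About the authors:
LIAO Liyun (born 1997), female, from Hainan, master's student; main research interest: Internet of Things security | ZHANG Bolei (born 1988), male, from Shaanxi, lecturer, Ph.D., CCF member; main research interests: data mining and machine learning | WU Lifa (born 1968), male, from Hubei, professor, Ph.D.; main research interests: network security and software security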
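Abstract:
Data imbalance in current IoT anomaly detection algorithms leads to incomplete feature learning, which in turn degrades the detection of minority-class attack samples. To address this problem, the article proposes CS-CTIAD, an IoT anomaly detection model based on cost-sensitive learning. The model combines a convolutional neural network and a Transformer to learn both the spatial and the temporal features of IoT traffic, which mitigates the incomplete learning of minority-class attack features by any single model. In addition, cost-sensitive learning is introduced during training to dynamically adjust the loss weights of the minority and majority classes, so that the classifier does not ignore minority-class attack samples because of the imbalance, thereby raising their recognition rate. Test results on the CSE-CIC-IDS2018 and IoT-23 datasets show that detection performance on minority-class attack samples improves markedly. Compared with existing work, the proposed method achieves better overall evaluation metrics (accuracy, precision, recall, and F1).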
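The cost-sensitive idea in the abstract can be illustrated with a class-weighted cross-entropy loss. This is a minimal NumPy sketch, not the CS-CTIAD loss itself: the abstract does not give the weighting rule, so the inverse-frequency weights below, and the function names `class_weights` and `weighted_cross_entropy`, are assumptions made for illustration. The weights here are also fixed, whereas the abstract describes them as adjusted dynamically during training.

```python
import numpy as np

def class_weights(counts):
    """Inverse-frequency class weights, rescaled to average 1 (assumed scheme)."""
    counts = np.asarray(counts, dtype=float)
    weights = counts.sum() / (len(counts) * counts)
    return weights / weights.mean()

def weighted_cross_entropy(logits, labels, weights):
    """Cross-entropy in which each sample is scaled by the weight of its true class."""
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    per_sample = -log_probs[np.arange(len(labels)), labels]
    w = weights[labels]
    return float((w * per_sample).sum() / w.sum())

# Two classes with the Benign and SQL Injection counts from the dataset table.
w = class_weights([200000, 53])
logits = np.array([[2.0, 0.0], [2.0, 0.0]])  # both samples predicted as class 0
labels = np.array([0, 1])                    # the second sample is a missed minority attack
print(weighted_cross_entropy(logits, labels, np.ones(2)))  # unweighted loss
print(weighted_cross_entropy(logits, labels, w))           # minority error dominates
```

With weights like these, a misclassified minority sample contributes far more to the loss than a misclassified majority sample, which is the mechanism that keeps the classifier from ignoring rare attacks.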
LIAO Liyun, ZHANG Bolei, WU Lifa. IoT Anomaly Detection Model Based on Cost-Sensitive Learning[J]. Netinfo Security, 2023, 23(11): 94-103.
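Table 1
Related work on the data imbalance problem in network traffic
Reference | Dataset | Method | Implementation |
---|---|---|---|
Ref. [ | KDD CUP99 | Feature selection | Information gain |
Ref. [ | KDD CUP99 | Data resampling | SOINN undersampling |
Ref. [ | CSE-CIC-IDS2018 | Data resampling | SMOTE oversampling |
Ref. [ | NSL-KDD, CSE-CIC-IDS2018 | Data resampling | DSSTE oversampling |
Ref. [ | NSL-KDD, UNSW-NB15, CICIDS2017 | Data resampling | IGAN oversampling |
Ref. [ | KDD CUP99, AAGM17, UNSW-NB15, CICIDS2017 | Data resampling | GAN oversampling |
Ref. [ | NSL-KDD, UNSW-NB15 | Data resampling | OSS undersampling + SMOTE oversampling |
Ref. [ | NSL-KDD | Data resampling | SMOTE oversampling + random oversampling + NearMiss undersampling |
Ref. [ | KDD CUP99, UNSW-NB15, CICIDS2017 | Data resampling | KNN undersampling + TACGAN oversampling |
Ref. [ | KDD CUP99, NSL-KDD | Cost-sensitive learning | Cost matrix + CSSAE |
Ref. [ | InsectWingbeatSound | Cost-sensitive learning | Cost factor + CNN |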
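Several rows of Table 1 rely on SMOTE-style oversampling. The sketch below shows the core interpolation step only, assuming Euclidean distance and a hypothetical helper name `smote_like`; it is not the implementation used by any of the cited works, and a production pipeline would more likely use an established library.

```python
import numpy as np

def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic rows by interpolating between minority-class neighbours."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)  # needs at least two rows
    # Pairwise Euclidean distances within the minority class.
    diffs = minority[:, None, :] - minority[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    k = min(k, len(minority) - 1)
    neighbours = np.argsort(dists, axis=1)[:, :k]

    base = rng.integers(0, len(minority), size=n_new)          # seed sample per new row
    picked = neighbours[base, rng.integers(0, k, size=n_new)]  # one of its k neighbours
    gap = rng.random((n_new, 1))                               # interpolation factor in [0, 1)
    return minority[base] + gap * (minority[picked] - minority[base])

points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(smote_like(points, n_new=5, k=2))
```

Each synthetic row lies on the segment between a real minority sample and one of its nearest minority neighbours, so the new points stay inside the region the minority class already occupies.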
Table 2
Class distribution of the CSE-CIC-IDS2018 dataset
Traffic class | Count | Share |
---|---|---|
Benign | 200000 | 23.8298% |
DDoS-HOIC | 100000 | 11.9149% |
DoS-GoldenEye | 30585 | 3.6442% |
SQL Injection | 53 | 0.0063% |
DoS-Hulk | 93659 | 11.1594% |
Bot | 100000 | 11.9149% |
SSH-Bruteforce | 94237 | 11.2282% |
Brute Force-XSS | 117 | 0.0139% |
DoS-SlowHTTPTest | 100000 | 11.9149% |
Brute Force-Web | 268 | 0.0319% |
DoS-Slowloris | 13475 | 1.6055% |
FTP-BruteForce | 100000 | 11.9149% |
DDoS-LOIC-UDP | 6682 | 0.7962% |
Infiltration | 209 | 0.0249% |
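The shares in Table 2 follow directly from the counts, and the same counts make the degree of imbalance explicit. The snippet below is a small check using only the table's numbers; the 0.1% cut-off used to label a class as "minority" is an assumption chosen for illustration, not a threshold taken from the article.

```python
counts = {
    "Benign": 200000, "DDoS-HOIC": 100000, "DoS-GoldenEye": 30585,
    "SQL Injection": 53, "DoS-Hulk": 93659, "Bot": 100000,
    "SSH-Bruteforce": 94237, "Brute Force-XSS": 117,
    "DoS-SlowHTTPTest": 100000, "Brute Force-Web": 268,
    "DoS-Slowloris": 13475, "FTP-BruteForce": 100000,
    "DDoS-LOIC-UDP": 6682, "Infiltration": 209,
}

total = sum(counts.values())
shares = {name: 100 * n / total for name, n in counts.items()}

# Ratio between the largest and the smallest class.
imbalance_ratio = max(counts.values()) / min(counts.values())

# Classes below an assumed 0.1% share are treated as minority classes here.
minority = sorted(name for name, share in shares.items() if share < 0.1)

print(total)                        # 839285
print(round(shares["Benign"], 4))   # 23.8298
print(round(imbalance_ratio))
print(minority)
```

Under that cut-off the minority classes are Brute Force-Web, Brute Force-XSS, Infiltration, and SQL Injection, which are also the classes with the lowest scores in the per-class results.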
Table 5
Precision and F1 of different models on CSE-CIC-IDS2018
Traffic class | CTIAD precision | CTIAD F1 | CS-CTIAD precision | CS-CTIAD F1 |
---|---|---|---|---|
Benign | 99.933% | 99.904% | 99.998% | 99.937% |
DDoS-HOIC | 99.993% | 99.997% | 100% | 100% |
DoS-GoldenEye | 99.978% | 99.989% | 99.989% | 99.995% |
SQL Injection | 75.000% | 75.000% | 78.947% | 85.714% |
DoS-Hulk | 99.982% | 99.980% | 99.993% | 99.986% |
Bot | 100% | 99.997% | 99.990% | 99.995% |
SSH-Bruteforce | 100% | 99.986% | 100% | 99.986% |
Brute Force-XSS | 81.818% | 90.000% | 81.818% | 90.000% |
DoS-SlowHTTPTest | 99.970% | 99.985% | 99.973% | 99.987% |
Brute Force-Web | 76.000% | 84.444% | 77.451% | 86.813% |
DoS-Slowloris | 99.950% | 99.975% | 99.950% | 99.975% |
FTP-BruteForce | 99.990% | 99.995% | 99.977% | 99.988% |
DDoS-LOIC-UDP | 100% | 100% | 100% | 100% |
Infiltration | 54.839% | 54.839% | 66.667% | 80.000% |
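Table 5 reports precision and F1 but not per-class recall. Because F1 = 2PR / (P + R), recall can be recovered as R = F1 · P / (2P − F1). The helper below, whose name `recall_from` is introduced here for illustration, applies that identity to two minority-class rows of the table; the results are derived values subject to the rounding of the published figures, not numbers reported by the article.

```python
def recall_from(precision, f1):
    """Invert F1 = 2PR / (P + R) to recover recall; inputs and output are percentages."""
    p, f = precision / 100.0, f1 / 100.0
    return 100.0 * f * p / (2.0 * p - f)

# Infiltration: precision and F1 as listed for CTIAD and CS-CTIAD.
print(round(recall_from(54.839, 54.839), 2))  # CTIAD
print(round(recall_from(66.667, 80.000), 2))  # CS-CTIAD

# SQL Injection under CS-CTIAD.
print(round(recall_from(78.947, 85.714), 2))
```

For Infiltration, the implied recall rises from about 54.8% with CTIAD to about 100% with CS-CTIAD, and for SQL Injection under CS-CTIAD it is about 93.75%. This is consistent with the abstract's claim that cost-sensitive training mainly helps the classifier stop missing minority-class attacks.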
Table 6
Precision and F1 of different models on IoT-23
Traffic class | CTIAD precision | CTIAD F1 | CS-CTIAD precision | CS-CTIAD F1 |
---|---|---|---|---|
Benign | 99.998% | 99.999% | 100% | 100% |
CC | 99.844% | 99.902% | 99.961% | 99.980% |
Okiru | 100% | 99.997% | 100% | 99.997% |
PartOfAHorizontalPortScan | 100% | 100% | 100% | 100% |
DDoS | 100% | 100% | 100% | 100% |
HeartBeat | 99.957% | 99.979% | 99.957% | 99.979% |
Torii | 100% | 99.130% | 100% | 99.712% |
Mirai | 100% | 100% | 100% | 100% |
FileDownload | 100% | 85.714% | 100% | 100% |
Table 7
Comparison with Refs. [21-23] on different datasets
Model | CSE-CIC-IDS2018 accuracy | CSE-CIC-IDS2018 precision | CSE-CIC-IDS2018 recall | CSE-CIC-IDS2018 F1 | IoT-23 accuracy | IoT-23 precision | IoT-23 recall | IoT-23 F1 |
---|---|---|---|---|---|---|---|---|
CNN_LSTM[ | 85.778% | 88.135% | 98.276% | 89.937% | 99.994% | 99.980% | 99.870% | 99.925% |
TCN_LSTM[ | 99.387% | 79.727% | 98.654% | 82.697% | 92.481% | 83.802% | 85.487% | 74.843% |
U-Net[ | 99.961% | 92.984% | 98.447% | 95.334% | 99.987% | 99.808% | 99.931% | 99.897% |
RTIDS[ | 99.782% | 85.023% | 98.482% | 88.314% | 99.948% | 93.337% | 97.017% | 94.780% |
CS-CTIAD | 99.964% | 93.197% | 99.452% | 95.884% | 99.998% | 99.991% | 99.935% | 99.963% |
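The overall CS-CTIAD precision and F1 in Table 7 can be reproduced as unweighted (macro) averages of the per-class CS-CTIAD values listed in the per-class tables. The page does not state the averaging method, so reading the overall figures as macro averages is an inference from the arithmetic below rather than something the article confirms here.

```python
from statistics import mean

# Per-class CS-CTIAD values (percent), in the row order of the per-class tables.
ids2018_precision = [99.998, 100, 99.989, 78.947, 99.993, 99.990, 100,
                     81.818, 99.973, 77.451, 99.950, 99.977, 100, 66.667]
ids2018_f1 = [99.937, 100, 99.995, 85.714, 99.986, 99.995, 99.986,
              90.000, 99.987, 86.813, 99.975, 99.988, 100, 80.000]
iot23_precision = [100, 99.961, 100, 100, 100, 99.957, 100, 100, 100]
iot23_f1 = [100, 99.980, 99.997, 100, 100, 99.979, 99.712, 100, 100]

print(round(mean(ids2018_precision), 3))  # 93.197
print(round(mean(ids2018_f1), 3))         # 95.884
print(round(mean(iot23_precision), 3))    # 99.991
print(round(mean(iot23_f1), 3))           # 99.963
```

Macro averaging gives every class equal weight, so the four rare CSE-CIC-IDS2018 classes pull the overall precision down to about 93% even though accuracy, which is dominated by the large classes, stays near 99.96%.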
Table 8
Comparison between single models and the proposed model
Model | CSE-CIC-IDS2018 precision | CSE-CIC-IDS2018 recall | CSE-CIC-IDS2018 F1 | IoT-23 precision | IoT-23 recall | IoT-23 F1 |
---|---|---|---|---|---|---|
TextCNN | 91.705% | 93.866% | 92.659% | 96.279% | 99.682% | 97.608% |
Transformer | 84.990% | 88.942% | 86.471% | 99.734% | 91.602% | 93.167% |
LSTM | 91.793% | 90.654% | 89.764% | 88.843% | 88.563% | 88.701% |
CTIAD | 91.961% | 94.618% | 93.149% | 99.978% | 97.026% | 98.302% |
References:
[1] JSOF. CVE-2020-11896/CVE-2020-11898 Whitepaper[EB/OL]. (2020-10-25)[2023-05-11]. https://www.jsof-tech.com/wp-content/u-ploads/2020/08/Ripple20_CVE-2020-11901-August20.pdf.
[2] FU Lidong, ZHANG Wenbo, TAN Xiaobo, et al. An Algorithm for Detection of Traffic Attribute Exceptions Based on Cluster Algorithm in Industrial Internet of Things[J]. IEEE Access, 2021, 9: 53370-53378. doi: 10.1109/ACCESS.2021.3068756
[3] LIU Xiangyu, LU Tianliang, DU Yanhui, et al. Lightweight IoT Intrusion Detection Method Based on Feature Selection[J]. Netinfo Security, 2023, 23(1): 66-72. (in Chinese)
[4] FERRAG A M, MAGLARAS L, MOSCHOYIANNIS S, et al. Deep Learning for Cyber Security Intrusion Detection: Approaches, Datasets, and Comparative Study[EB/OL]. [2023-07-16]. https://doi.org/10.1016/j.jisa.2019.102419.
[5] YIN Ying, ZHOU Zhihong, YAO Lihong. Research on LSTM-Based CAN Intrusion Detection Model[J]. Netinfo Security, 2022, 22(12): 57-66. (in Chinese)
[6] TESFAHUN A, BHASKARI D L. Intrusion Detection Using Random Forests Classifier with SMOTE and Feature Reduction[C]// IEEE. 2013 International Conference on Cloud & Ubiquitous Computing & Emerging Technologies. New York: IEEE, 2013: 127-132.
[7] WU Shuguang, WANG Hongyan, WANG Yu, et al. Application of SOINN Based Undersampling Method in Network Intrusion Detection[J]. Modern Electronics Technique, 2022, 45(21): 88-92. (in Chinese)
[8] KARATAS G, DEMIR O, SAHINGOZ O K. Increasing the Performance of Machine Learning-Based IDSs on an Imbalanced and Up-to-Date Dataset[J]. IEEE Access, 2020, 8: 32150-32162. doi: 10.1109/Access.6287639
[9] LIU Lan, WANG Pengcheng, LIN Jun, et al. Intrusion Detection of Imbalanced Network Traffic Based on Machine Learning and Deep Learning[J]. IEEE Access, 2021, 9: 7550-7563. doi: 10.1109/Access.6287639
[10] HUANG Shuokang, LEI Kai. IGAN-IDS: An Imbalanced Generative Adversarial Network towards Intrusion Detection System in Ad-Hoc Networks[EB/OL]. (2020-08-01)[2023-07-16]. https://doi.org/10.1016/j.adhoc.2020.102177.
[11] ANDRESINI G, APPICE A, ROSE L D, et al. GAN Augmentation to Deal with Imbalance in Imaging-Based Intrusion Detection[J]. Future Generation Computer Systems, 2021, 123: 108-127. doi: 10.1016/j.future.2021.04.017
[12] JIANG Kaiyuan, WANG Wenya, WANG Aili, et al. Network Intrusion Detection Combined Hybrid Sampling With Deep Hierarchical Network[J]. IEEE Access, 2020, 8: 32464-32476. doi: 10.1109/Access.6287639
[13] MA Xiangyu, SHI Wei. AESMOTE: Adversarial Reinforcement Learning with SMOTE for Anomaly Detection[J]. IEEE Transactions on Network Science and Engineering, 2021, 8(2): 943-956. doi: 10.1109/TNSE.2020.3004312
[14] DING Hongwei, CHEN Leiyang, DONG Liang, et al. Imbalanced Data Classification: A KNN and Generative Adversarial Networks-Based Hybrid Approach for Intrusion Detection[J]. Future Generation Computer Systems, 2022, 131: 240-254. doi: 10.1016/j.future.2022.01.026
[15] TELIKANI A, GANDOMI A H. Cost-Sensitive Stacked Auto-Encoders for Intrusion Detection in the Internet of Things[EB/OL]. [2023-07-16]. https://doi.org/10.1016/j.iot.2019.100122.
[16] FUQUA D, RAZZAGHI T. A Cost-Sensitive Convolution Neural Network Learning for Control Chart Pattern Recognition[EB/OL]. (2020-07-15)[2023-07-16]. https://doi.org/10.1016/j.eswa.2020.113275.
[17] KIM Y. Convolutional Neural Networks for Sentence Classification[C]// ACL. 2014 Empirical Methods in Natural Language Processing (EMNLP). Stroudsburg: ACL, 2014: 1746-1751.
[18] LIN Tianyang, WANG Yuxin, LIU Xiangyang, et al. A Survey of Transformers[J]. AI Open, 2022, 3: 111-132. doi: 10.1016/j.aiopen.2022.10.001
[19] SHARAFALDIN I, LASHKARI A H, GHORBANI A A. Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization[C]// INSTICC. 4th International Conference on Information Systems Security and Privacy (ICISSP). Lisbon: SCITEPRESS, 2018: 108-116.
[20] GARCIA S, PARMISANO A, ERQUIAGA M J. IoT-23: A Labeled Dataset with Malicious and Benign IoT Network Traffic[EB/OL]. [2023-08-10]. https://zenodo.org/record/4743746.
[21] SAHU A K, SHARMA S, TANVEER M, et al. Internet of Things Attack Detection Using Hybrid Deep Learning Model[J]. Computer Communications, 2021, 176(3): 146-154. doi: 10.1016/j.comcom.2021.05.024
[22] MEZINA A, BURGET R, TRAVIESO-GONZÁLEZ C M. Network Anomaly Detection with Temporal Convolutional Network and U-Net Model[J]. IEEE Access, 2021, 9: 143608-143622. doi: 10.1109/ACCESS.2021.3121998
[23] WU Zihan, ZHANG Hong, WANG Penghai, et al. RTIDS: A Robust Transformer-Based Approach for Intrusion Detection System[J]. IEEE Access, 2022, 10: 64375-64387. doi: 10.1109/ACCESS.2022.3182333