Netinfo Security ›› 2026, Vol. 26 ›› Issue (1): 91-101. doi: 10.3969/j.issn.1671-1122.2026.01.008
WANG Huanzhen1, XU Hongping2, LI Kuangdai1, LIU Yang1, YAO Linyuan1
Received: 2025-03-17
Online: 2026-01-10
Published: 2026-02-13
WANG Huanzhen, XU Hongping, LI Kuangdai, LIU Yang, YAO Linyuan. A Study on Autonomous Decision-Making for Network Defense Based on Hierarchical Reinforcement Learning[J]. Netinfo Security, 2026, 26(1): 91-101.
URL: http://netinfo-security.org/EN/10.3969/j.issn.1671-1122.2026.01.008