Netinfo Security, 2023, Vol. 23, Issue (9): 47-57. doi: 10.3969/j.issn.1671-1122.2023.09.005
ZHANG Guomin, ZHANG Shaoyong, ZHANG Jinwei
Received: 2023-05-22
Online: 2023-09-10
Published: 2023-09-18
Contact: ZHANG Shaoyong
E-mail: 1345150105@qq.com
ZHANG Guomin, ZHANG Shaoyong, ZHANG Jinwei. Discovery and Optimization Method of Attack Paths Based on PPO Algorithm[J]. Netinfo Security, 2023, 23(9): 47-57.
URL: http://netinfo-security.org/EN/10.3969/j.issn.1671-1122.2023.09.005
Address | Operating system | Host value | Services | Processes |
---|---|---|---|---|
(1,0) | Linux | 0 | HTTP | - |
(2,0) | Windows | 100 | SMTP | Schtask |
(2,1) | Windows | 0 | SMTP | Schtask |
(3,0),(3,3),(3,4),(4,3) | Windows | 0 | FTP | Schtask |
(3,1) | Windows | 0 | FTP, HTTP | Daclsvc |
(3,2) | Windows | 0 | FTP | - |
(4,0),(4,1),(4,2),(5,2) | Linux | 0 | SSH | - |
(5,0) | Windows | 100 | SSH, SAMBA | Tomcat |
(5,1) | Linux | 0 | SSH, HTTP | Tomcat |
(5,3) | Linux | 0 | SSH | Daclsvc |
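
The host table above can be encoded as plain data for scripting or sanity checks. The following is a minimal sketch, not the authors' implementation: the `Host` class, the `expand` helper, and the reading of each address as a (subnet, host) pair are assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    # (subnet, host) pair; this reading of the Address column is an assumption.
    address: tuple
    os: str
    value: int
    services: tuple
    processes: tuple


def expand(addresses, os, value, services, processes=()):
    """Hypothetical helper: build one Host per address sharing the same row."""
    return [Host(a, os, value, tuple(services), tuple(processes)) for a in addresses]


# Rows copied from the table; "-" in the Processes column becomes an empty tuple.
HOSTS = (
    expand([(1, 0)], "Linux", 0, ["HTTP"])
    + expand([(2, 0)], "Windows", 100, ["SMTP"], ["Schtask"])
    + expand([(2, 1)], "Windows", 0, ["SMTP"], ["Schtask"])
    + expand([(3, 0), (3, 3), (3, 4), (4, 3)], "Windows", 0, ["FTP"], ["Schtask"])
    + expand([(3, 1)], "Windows", 0, ["FTP", "HTTP"], ["Daclsvc"])
    + expand([(3, 2)], "Windows", 0, ["FTP"])
    + expand([(4, 0), (4, 1), (4, 2), (5, 2)], "Linux", 0, ["SSH"])
    + expand([(5, 0)], "Windows", 100, ["SSH", "SAMBA"], ["Tomcat"])
    + expand([(5, 1)], "Linux", 0, ["SSH", "HTTP"], ["Tomcat"])
    + expand([(5, 3)], "Linux", 0, ["SSH"], ["Daclsvc"])
)


def valuable_hosts(hosts):
    """Return the addresses of hosts whose value is greater than zero."""
    return sorted(h.address for h in hosts if h.value > 0)
```

Calling `valuable_hosts(HOSTS)` picks out the two hosts with value 100, which are the natural targets when an attack path is scored by host value.
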
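Hyperparameter | Meaning | Value (Scenarios 1-4) | Value (Scenarios 5-7) |
---|---|---|---|
Actor learning rate | Learning rate of the Actor network | 1e-4 | 1e-5 |
Critic learning rate | Learning rate of the Critic network | 5e-3 | 5e-4 |
GAE parameter | Parameter used in the GAE computation | 0.9 | 0.9 |
Discount factor | Discount factor | 0.9 | 0.9 |
Hidden layer size | Number of hidden layers and neurons per layer | [128,128] | [128,128] |
n_steps | Controls the length of each sampled trajectory | 2000 (Scenario 1), 3000 (Scenario 2), 5000 (Scenario 3), 8000 (Scenario 4) | 8000 |
Epochs | Number of training epochs run on each sampled sequence of data | 10 | 10 |
Clip Ratio | Clipping-range parameter in PPO | 0.2 | 0.2 |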
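
To show where these hyperparameters enter a PPO update, here is a minimal pure-Python sketch of generalized advantage estimation and the clipped surrogate term. It is a generic illustration of the standard formulas, not the authors' code. The dictionary keys and function names are hypothetical, the values are taken from the Scenarios 1-4 column, and reading the GAE parameter as the usual lambda is an assumption because the table does not show its symbol.

```python
# Values copied from the Scenarios 1-4 column; key names are hypothetical.
PPO_CONFIG = {
    "actor_lr": 1e-4,
    "critic_lr": 5e-3,
    "gae_lambda": 0.9,   # assumed to be the GAE lambda; the symbol is not shown
    "gamma": 0.9,        # discount factor
    "hidden_sizes": [128, 128],
    "epochs": 10,
    "clip_ratio": 0.2,
}


def gae_advantages(rewards, values, gamma, lam):
    """Generalized advantage estimation over one trajectory.

    `values` must hold len(rewards) + 1 entries; the last one is the
    bootstrap value of the state reached after the final step.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def clipped_surrogate(ratio, advantage, clip_ratio):
    """PPO clipped objective for one sample (to be maximized)."""
    clipped = max(1.0 - clip_ratio, min(1.0 + clip_ratio, ratio))
    return min(ratio * advantage, clipped * advantage)
```

With a clip ratio of 0.2, a probability ratio outside [0.8, 1.2] stops contributing extra gain to the objective, which limits how far each of the 10 training epochs can move the policy away from the one that collected the data.
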