Netinfo Security, 2026, Vol. 26, Issue 4: 654-664. doi: 10.3969/j.issn.1671-1122.2026.04.012
DONG Yingjuan1, LYU Ping2, LIU Bing3
Received: 2026-02-03
Online: 2026-04-10
Published: 2026-04-29
DONG Yingjuan, LYU Ping, LIU Bing. An Automated Penetration Testing System Based on Multi-Agent Architecture[J]. Netinfo Security, 2026, 26(4): 654-664.
URL: http://netinfo-security.org/EN/10.3969/j.issn.1671-1122.2026.04.012

| Challenge | Category | PentestGPT tests (runs) | PentestGPT successes (runs) | This system successes (runs) | This system tests (runs) |
|---|---|---|---|---|---|
| login | Web | 5 | 5 | 5 | 5 |
| advance-potion-making | forensics | 5 | 3 | 3 | 4 |
| spelling-quiz | crypto | 5 | 4 | 3 | 5 |
| caas | Web | 5 | 2 | 5 | 5 |
| XtrOrdinary | crypto | 5 | 5 | 3 | 5 |
| tripplesecure | crypto | 5 | 3 | 2 | 5 |
| clutteroverflow | binary | 5 | 1 | 3 | 5 |
| not crypto | reverse | 5 | 0 | 0 | 5 |
| scrambled-bytes | forensics | 5 | 0 | 0 | 5 |
| breadth | reverse | 5 | 0 | 0 | 5 |
| notepad | Web | 5 | 1 | 4 | 5 |
| college-rowing-team | crypto | 5 | 2 | 1 | 5 |
| fermat-strings | binary | 5 | 0 | 0 | 5 |
| corrupt-key-1 | crypto | 5 | 0 | 0 | 5 |
| SaaS | binary | 5 | 0 | 0 | 5 |
| riscy business | reverse | 5 | 0 | 0 | 5 |
| homework | binary | 5 | 0 | 0 | 5 |
| lockdown-horses | binary | 5 | 0 | 0 | 5 |
| corrupt-key-2 | crypto | 5 | 0 | 0 | 5 |
| vr-school | binary | 5 | 0 | 0 | 5 |
| MATRIX | reverse | 5 | 0 | 0 | 5 |

| Test item | AWVS | Xray | This system |
|---|---|---|---|
| Reflected XSS into HTML context with nothing encoded | Success | Success | Success |
| Stored XSS into HTML context with nothing encoded | Success | Failure | Success |
| DOM XSS in document.write sink using source location.search | Success | Failure | Success |
| DOM XSS in innerHTML sink using source location.search | Success | Failure | Success |
| DOM XSS in jQuery anchor href attribute sink using location.search source | Failure | Failure | Success |
| DOM XSS in jQuery selector sink using a hashchange event | Failure | Failure | Success |
| Reflected XSS into attribute with angle brackets HTML-encoded | Success | Success | Success |
| Stored XSS into anchor href attribute with double quotes HTML-encoded | Failure | Failure | Failure |
| Reflected XSS into a JavaScript string with angle brackets HTML-encoded | Success | Success | Success |
| DOM XSS in document.write sink using source location.search inside a select element | Success | Success | Success |
| DOM XSS in AngularJS expression with angle brackets and double quotes HTML-encoded | Success | Success | Success |
| Reflected DOM XSS | Failure | Failure | Failure |
| Reflected XSS into HTML context with most tags and attributes blocked | Success | Success | Failure |
| Reflected XSS into HTML context with all tags blocked except custom ones | Success | Success | Success |
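
The per-tool totals implied by the XSS comparison table above can be tallied with a short script. This is only an illustrative sketch: the `RESULTS` strings are a hand transcription of the table rows (one character per test item, in table order), and the names `RESULTS` and `tally` are hypothetical helpers, not part of the paper's system.

```python
# Outcomes transcribed from the XSS comparison table, one string per tool.
# Each character is one test item in table order: S = success, F = failure.
# This encoding is an assumption made for illustration only.
RESULTS = {
    "AWVS": "SSSSFFSFSSSFSS",
    "Xray": "SFFFFFSFSSSFSS",
    "This system": "SSSSSSSFSSSFFS",
}


def tally(results: dict[str, str]) -> dict[str, tuple[int, int]]:
    """Return (successes, total test items) for each tool."""
    return {tool: (row.count("S"), len(row)) for tool, row in results.items()}


if __name__ == "__main__":
    for tool, (ok, total) in tally(RESULTS).items():
        print(f"{tool}: {ok}/{total} test items detected")
```

On the 14 listed test items this gives 10 for AWVS, 7 for Xray, and 11 for the paper's system. The totals describe only the rows shown here and say nothing about test items outside the table.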