Netinfo Security, 2025, Vol. 25, Issue (8): 1254-1262. doi: 10.3969/j.issn.1671-1122.2025.08.007

An Attack Path Discovery Method Based on Multi-Agent Adversarial Learning

ZHANG Guomin, ZHANG Junfeng, TU Zhixin, WANG Zipeng

  1. Institute of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210001, China
  • Received: 2024-09-13 Online: 2025-08-10 Published: 2025-09-09

Abstract:

Attack path discovery is a key technology in intelligent penetration testing. Because of security measures and other factors, target networks often change dynamically. Existing methods, however, are trained in static virtual network environments, so agents struggle to adapt to environmental changes because previously learned experience becomes invalid. To address this issue, this paper designs a fully competitive agent adversarial game framework (AGF) that simulates the adversarial game between red and blue agents during the red team's attack path discovery in dynamic defense networks. Building on the proximal policy optimization (PPO) algorithm, it also proposes an improved algorithm, PPODRP, which plans and processes states and actions so that agents can adapt to dynamic environments. Experimental results show that, compared with the traditional PPO algorithm, PPODRP converges more efficiently in dynamic defense networks and completes the attack path discovery task at a lower cost.
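
For readers unfamiliar with the baseline, the following is a minimal sketch of the standard PPO clipped surrogate loss that the abstract compares against. It is not the PPODRP algorithm, whose planning and processing of states and actions is not specified in this abstract. The function name `ppo_clip_loss`, its parameter names, and the default clipping range of 0.2 are assumptions chosen for illustration.

```python
import math


def ppo_clip_loss(new_logps, old_logps, advantages, clip_eps=0.2):
    """Standard PPO clipped surrogate loss (to be minimized), averaged over samples.

    new_logps / old_logps: log-probabilities of the taken actions under the
    current and the data-collecting policy. All names here are illustrative.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed in log space.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped objectives.
        total += min(ratio * adv, clipped * adv)
    # Negate because optimizers minimize, while PPO maximizes the surrogate.
    return -total / len(advantages)
```

The clipping removes the incentive to move the policy far from the one that collected the data when doing so would improve the objective, which is one reason experience gathered in an environment that has since changed can be of limited use to a plain PPO agent.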

Key words: automated penetration testing, PPO algorithm, attack path discovery, adversarial reinforcement learning
