Netinfo Security ›› 2026, Vol. 26 ›› Issue (3): 341-354.doi: 10.3969/j.issn.1671-1122.2026.03.001

Previous Articles     Next Articles

A Survey on Prompt Injection Attacks and Defenses in Large Language Models

YUAN Ming1,2(), ZOU Qilin3, YUAN Wenqi4, WANG Qun1   

  1. 1. Department of Computer Information and Cyber Security, Jiangsu Police Institute, Nanjing 210031, China
    2. School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
    3. Sheyang County Public Security Bureau, Yancheng 224300, China
    4. Dafeng Branch of Yancheng Public Security Bureau, Yancheng 224199, China
  • Received:2025-08-11 Online:2026-03-10 Published:2026-03-30

Abstract:

With the widespread application of Large Language Models and their powered AI Agents in various domains, the security of LLMs has become increasingly prominent. As an emerging security threat, prompt injection attacks pose huge security risks to large language models. They exploit the weakness that large language models cannot distinguish user instructions from injected instructions, thereby inducing the model to deviate from the intended task and execute the attacker’s commands, leading to issues such as data leakage and system intrusion. This paper systematically reviewed the current research status of prompt injection attacks, covering attack types such as early direct injection, role-based injection, payload splitting, obfuscation injection, and optimization-based injection. In terms of defenses, this paper classified existing methods into detection-based defenses and prevention-based defenses according to defense mechanisms.

Key words: large language models, prompt injection attacks, AI agent, AI security

CLC Number: