A Payload Generation Method for SQL Injection Vulnerability Detection Based on Large Language Models

doi:10.3969/j.issn.1671-1122.2026.02.008

Abstract

Existing SQL injection vulnerability detection methods suffer from insufficient robustness and a lack of targeted test cases. To address these limitations, this paper proposes a large language model (LLM)-based approach that generates targeted detection payloads to identify SQL injection vulnerabilities. Specifically, by combining prompt engineering with the DeepSeek-V3 model, the method automatically extracts heterogeneous vulnerability features and organizes them into a unified semantic representation. A contribution-based feature selection mechanism then identifies the most influential features, which serve as the core input to the model. The key features are further structured into a chain-of-thought format to fuse multi-dimensional vulnerability representations. Domain-adaptive supervised fine-tuning is performed on the Qwen model using low-rank adaptation (LoRA). Extensive experiments were conducted on multiple public vulnerability benchmarks to evaluate both the detection performance and the payload generation quality of the proposed method against SqliGPT, GPT-2-web, and SQLMap. In addition, an in-depth analysis examines the capability of DeepSeek-V3 to extract meaningful features from complex SQL injection vulnerability data. Experimental results show that the fine-tuned Qwen model achieves an average detection accuracy of over 75%, representing improvements of 49.18%, 59.64%, and 15.19% over SqliGPT, GPT-2-web, and SQLMap, respectively. Moreover, the quality of its generated payloads is significantly higher than that of existing models, demonstrating the effectiveness of using large language models to generate detection payloads for SQL injection vulnerability identification.
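
For readers unfamiliar with low-rank adaptation, the standard formulation from the LoRA paper [23] is reproduced here as background. The symbols ($W_0$, $A$, $B$, $r$, $\alpha$) follow that paper and are not taken from this article, which does not state the rank or scaling it used.

```latex
h = W_0 x + \Delta W x = W_0 x + \frac{\alpha}{r} B A x,
\qquad B \in \mathbb{R}^{d \times r},\quad A \in \mathbb{R}^{r \times k},\quad r \ll \min(d, k)
```

Only $A$ and $B$ are trained while the pretrained weight $W_0$ stays frozen, which is what makes domain-adaptive fine-tuning of a large model affordable.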

Key words: large language model, SQL injection vulnerability, code generation, detection payload

CLC Number: TP309

GU Zhaojun, LI Li, SUI He. A Payload Generation Method for SQL Injection Vulnerability Detection Based on Large Language Models[J]. Netinfo Security, 2026, 26(2): 274-290.

Figures/Tables 17

References 33

[1]	JAIN S. 160 Cybersecurity Statistics[EB/OL]. (2025-01-09)[2025-05-24]. https://www.getastra.com/blog/security-audit/cyber-security-statistics/.
[2]	FreeBuf. 2023 Global Top 10 Security Vulnerabilities | FreeBuf Annual Review[EB/OL]. (2024-01-04)[2025-05-24]. https://www.freebuf.com/news/388742.html.
	FreeBuf. 2023 全球年度安全漏洞TOP 10 | FreeBuf 年度盘点[EB/OL]. (2024-01-04)[2025-05-24]. https://www.freebuf.com/news/388742.html.
[3]	HUANG Kaijie, WANG Jian, CHEN Jiongyi. A Large Language Model Based SQL Injection Attack Detection Method[J]. Netinfo Security, 2023, 23(11): 84-93.
	黄恺杰, 王剑, 陈炯峄. 一种基于大语言模型的SQL注入攻击检测方法[J]. 信息网络安全, 2023, 23(11):84-93.
[4]	LU Dongzhe, FEI Jinlong, LIU Long. A Semantic Learning-Based SQL Injection Attack Detection Technology[EB/OL]. (2023-02-09)[2025-05-10]. https://doi.org/10.3390/electronics1206134.
[5]	BOLOTNIKOV I V, BORODIN A E. Interprocedural Static Analysis for Finding Bugs in Go Programs[J]. Programming and Computer Software, 2021, 47(5): 344-352. doi: 10.1134/S0361768821050030
[6]	LIVSHITS V B, LAM M S. Finding Security Vulnerabilities in Java Applications with Static Analysis[C]// USENIX. The 14th Conference on USENIX Security Symposium. New York: USENIX, 2005: 18-29.
[7]	LI Qi, LI Weishi, WANG Junfeng, et al. A SQL Injection Detection Method Based on Adaptive Deep Forest[J]. IEEE Access, 2019, 7: 145385-145394. doi: 10.1109/ACCESS.2019.2944951
[8]	RADFORD A, NARASIMHAN K, SALIMANS T, et al. Improving Language Understanding by Generative Pre-Training[EB/OL]. [2025-05-17]. https://api.semanticscholar.org/CorpusID:49313245.
[9]	TOUVRON H, LAVRIL T, IZACARD G, et al. LLaMA: Open and Efficient Foundation Language Models[EB/OL]. (2023-02-27)[2025-05-17]. https://arxiv.org/abs/2302.13971.
[10]	GUI Zhiwen, WANG Enze, DENG Binbin, et al. SqliGPT: Evaluating and Utilizing Large Language Models for Automated SQL Injection Black-Box Detection[EB/OL]. (2024-08-07)[2025-05-17]. https://doi.org/10.3390/app1416692.
[11]	ĆIRKOVIĆ S, MLADENOVIĆ V, TOMIĆ S, et al. Utilizing Fine-Tuning of Large Language Models for Generating Synthetic Payloads: Enhancing Web Application Cybersecurity through Innovative Penetration Testing Techniques[J]. Computers, Materials & Continua, 2025, 82(3): 4409-4430.
[12]	WU Peize, LI Guanghui, WU Jinyu. Research on Automated Vulnerability Verification Code Generation Based on Large Language Models[EB/OL]. (2024-06-20)[2025-05-10]. https://kns.cnki.net/kcms2/article/abstract?v=MXvIvFkaDQz0Ed1hcQN9CL-gXr5KEIhM5964CkAGitVLQj534FnW1QowKkJ4WAgttjFFL0fZhSaGn07arFP_v3d_Buwl9snK_NfzS-YnI0oSzgnHjsO-O0TrWBMHKVS99os3LXwpBVAl_JCWrFc-_pT5Ybux81d8cT6Gw2I5naP9T-kI9v978mcS2fJKkXwY&uniplatform=NZKPT&language=CHS.
	吴佩泽, 李光辉, 吴津宇. 基于大语言模型的自动化漏洞验证代码生成方法研究[EB/OL]. (2024-06-20)[2025-05-10]. https://kns.cnki.net/kcms2/article/abstract?v=MXvIvFkaDQz0Ed1hcQN9CL-gXr5KEIhM5964CkAGitVLQj534FnW1QowKkJ4WAgttjFFL0fZhSaGn07arFP_v3d_Buwl9snK_NfzS-YnI0oSzgnHjsO-O0TrWBMHKVS99os3LXwpBVAl_JCWrFc-_pT5Ybux81d8cT6Gw2I5naP9T-kI9v978mcS2fJKkXwY&uniplatform=NZKPT&language=CHS.
[13]	YANG Guang, ZHOU Yu, CHEN Xiang, et al. ExploitGen: Template-Augmented Exploit Code Generation Based on CodeBERT[EB/OL]. (2023-03-01)[2025-05-17]. https://doi.org/10.1016/j.jss.2022.11157.
[14]	PENG Qi, CAI Yi, LIU Jiankun, et al. Integration of Multi-Source Medical Data for Medical Diagnosis Question Answering[J]. IEEE Transactions on Medical Imaging, 2025, 44(3): 1373-1385. doi: 10.1109/TMI.2024.3496862 pmid: 40030182
[15]	LIAO Xingming, CHEN Chong, WANG Zhuowei, et al. Large Language Model Assisted Fine-Grained Knowledge Graph Construction for Robotic Fault Diagnosis[EB/OL]. (2025-05-01)[2025-06-17]. https://doi.org/10.1016/j.aei.2025.10313.
[16]	WEI J, WANG Xuezhi, SCHUURMANS D, et al. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models[C]// ACM. The 36th International Conference on Neural Information Processing Systems. New York: ACM, 2022: 24824-24837.
[17]	OU Jianjiu, ZHOU Jianlong, DONG Yifei, et al. Chain of Thought Prompting in Vision-Language Model for Vision Reasoning Tasks[C]// Springer. The 37th Australasian Joint Conference on Artificial Intelligence. Heidelberg: Springer, 2024: 298-311.
[18]	YAO Chengyuan, FUJITA S. Adaptive Control of Retrieval-Augmented Generation for Large Language Models through Reflective Tags[EB/OL]. (2024-11-25)[2025-07-15]. https://doi.org/10.3390/electronics1323464.
[19]	OffSEC. Exploit Database[EB/OL]. (2010-11-01)[2025-09-07]. https://www.exploit-db.com/.
[20]	Private Internet Access. PacketStorm Security Archive[EB/OL]. [2025-07-15]. https://packetstormsecurity.com/.
[21]	OpenAI. Completion-OpenAI API[EB/OL]. [2025-12-05]. https://beta.openai.com/docs/guides/completion/prompt-design.
[22]	DeepSeek. DeepSeek-V3[EB/OL]. [2025-03-07]. https://huggingface.co/deepseek-ai/DeepSeek-V3.
[23]	HU E J, SHEN Yelong, WALLIS P, et al. LoRA: Low-Rank Adaptation of Large Language Models[EB/OL]. (2021-06-17)[2025-04-17]. https://doi.org/10.48550/arXiv.2106.0968.
[24]	STAMPAR M, DAMELE A G B. SQLMap: Automatic SQL Injection and Database Takeover Tool[EB/OL]. [2025-03-26]. https://github.com/sqlmapproject/sqlmap.
[25]	MITRE. Common Vulnerabilities and Exposures(CVE)[EB/OL]. (2000-01-01)[2025-07-15]. https://cve.mitre.org/.
[26]	Audi-1. SQLI-Labs[EB/OL]. (2014-04-01)[2025-04-17]. https://github.com/Audi-1/sqli-labs.
[27]	DEWHURST R. Damn Vulnerable Web Application(DVWA)[EB/OL]. (2023-05-21)[2025-04-17]. https://github.com/digininja/DVWA.
[28]	HUN Lu. Pikachu[EB/OL]. [2025-04-18]. https://github.com/zhuifengshaonianhanlu/pikachu.
[29]	MALIK B. BWAPP[EB/OL]. (2013-01-08)[2025-04-18]. https://sourceforge.net/projects/bwapp/.
[30]	SHITOU CLOUD. AutoDL: High-Performance Cloud Computing Platform for Deep Learning[EB/OL]. [2025-05-07]. https://www.autodl.com.
	视拓云. AutoDL:高性能深度学习算力云平台[EB/OL]. [2025-05-07]. https://www.autodl.com.
[31]	ZHENG Yaowei, ZHANG Richong, ZHANG Junhao, et al. LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models[C]// Association for Computational Linguistics. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics. Pennsylvania: Association for Computational Linguistics, 2024: 400-410.
[32]	PAPINENI K, ROUKOS S, WARD T, et al. Bleu: A Method for Automatic Evaluation of Machine Translation[C]// ACL. The 40th Annual Meeting of the Association for Computational Linguistics. Philadelphia: ACL, 2002: 311-318.
[33]	BERGER B, WATERMAN M S, YU Y W. Levenshtein Distance, Sequence Comparison and Biological Database Search[J]. IEEE Transactions on Information Theory, 2021, 67(6): 3287-3294. doi: 10.1109/tit.2020.2996543 pmid: 34257466

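Prompt component	Purpose
Global rules	Define the basic data-processing guidelines to ensure data consistency and standardization
Field requirements	Specify the key information to be extracted from the vulnerability text and its format
Processing workflow	Guide the model through the concrete steps from identifying the vulnerability to producing the final output
Example output	Show how the extracted information is organized and presented according to the rules above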

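Variable	Meaning
$D=\{d_1, d_2, \ldots, d_N\}$	Set of SQL injection vulnerability samples, where $d_i$ is the $i$-th sample and $N$ is the total number of samples
$K_i=\{k_{i,1}, k_{i,2}, \ldots, k_{i,M}\}$	Keyword set of sample $d_i$; the number of keywords in the sample is $M=|K_i|$
$F=\{f_1, f_2, \ldots, f_S\}$, $F_i=\{f_{i,1}, f_{i,2}, \ldots, f_{i,S}\}$	$F$ is the overall vulnerability feature set and $F_i$ is the feature set of sample $d_i$; each feature is a key-value pair $(f_{i,j}, v_{i,j})$ with $f_{i,j} \in F_i$, and the total number of features is $S$
$Count_{f_{i,j}}$	Co-occurrence count of feature $f_{i,j}$ in sample $d_i$
$Freq_{f_{i,j}}$	Co-occurrence frequency of feature $f_{i,j}$ in sample $d_i$
$TotalFreq_{f_j}$	Total co-occurrence frequency of feature $f_j$ across all vulnerability samples
$W_{f_j}$	Initial weight of feature $f_j$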
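
The variables above can be illustrated with a minimal Python sketch. The table defines the quantities but not their formulas, so the definitions used here are assumptions: the co-occurrence count is taken as the number of sample keywords found in a feature's value, the frequency divides that count by $M$, and the initial weight is the total frequency normalized to sum to 1. The function name, field names, and sample data are hypothetical.

```python
from collections import defaultdict


def initial_feature_weights(samples):
    """Return a normalized initial weight for every feature name.

    Each sample is a dict with:
      "keywords": list of keyword strings (K_i)
      "features": dict mapping a feature name f_j to its value text v_{i,j}
    """
    total_freq = defaultdict(float)  # TotalFreq_{f_j}, summed over all samples

    for sample in samples:
        keywords = [k.lower() for k in sample["keywords"]]
        m = len(keywords)  # M = |K_i|
        for name, value in sample["features"].items():
            text = value.lower()
            # Count_{f_{i,j}}: ASSUMED to be the number of sample keywords
            # that occur in the feature's value text.
            count = sum(1 for k in keywords if k in text)
            # Freq_{f_{i,j}}: ASSUMED to be the count divided by M.
            freq = count / m if m else 0.0
            total_freq[name] += freq

    norm = sum(total_freq.values())
    # W_{f_j}: ASSUMED to be TotalFreq_{f_j} normalized so weights sum to 1.
    return {name: (tf / norm if norm else 0.0) for name, tf in total_freq.items()}


# Hypothetical samples, for illustration only.
samples = [
    {
        "keywords": ["union", "select", "id"],
        "features": {
            "injection_point": "GET parameter id",
            "payload_type": "UNION SELECT based",
        },
    },
    {
        "keywords": ["sleep", "id"],
        "features": {
            "injection_point": "POST parameter id",
            "payload_type": "time blind using SLEEP",
        },
    },
]

weights = initial_feature_weights(samples)
# Rank features by weight, as a contribution-based selection step might.
ranked = sorted(weights, key=weights.get, reverse=True)
print(ranked, weights)
```

Ranking by weight and keeping the top entries is one simple way to realize a contribution-based selection; the paper's actual selection criterion may differ.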

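Variable	Meaning
$R_i^{initial}$	Initial Reasoning feature of the $i$-th sample, containing key functions and structures
$f_{i,injection}$	Injection point feature of the $i$-th sample, including the parameter, initial value, and closure method
$f_{i,type}$	Vulnerability type feature of the $i$-th sample
$S_{prompt}$	System prompt template used to guide the model
$P_i$	Detection payload of the $i$-th sample, i.e., $P_i \in d_i$
$T(\cdot)$	Processing function of the model, which converts the input into a CoT prompt
$C_i=\{c_{i,1}, c_{i,2}, \ldots, c_{i,k}\}$	CoT prompt of the $i$-th sample, where $c_{i,j}$ denotes one reasoning step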
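
As a rough illustration of how these variables fit together, the sketch below assembles a chain-of-thought training record from the injection point feature, the vulnerability type, the initial Reasoning feature, and the payload. In the paper $T(\cdot)$ is performed by the model itself; the fixed template here is only a hypothetical stand-in, and the class name, function name, step wording, and example values are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class InjectionPoint:
    """Hypothetical container for f_{i,injection}."""
    parameter: str       # injectable parameter name
    initial_value: str   # benign value the parameter normally carries
    closure: str         # characters that close the original query context


def build_cot_sample(system_prompt, reasoning, injection, vuln_type, payload):
    """Template stand-in for T(.): returns S_prompt, the steps C_i, and P_i."""
    steps = [
        f"Locate the injection point: parameter '{injection.parameter}' "
        f"with initial value '{injection.initial_value}'.",
        f"Close the original query context using {injection.closure!r}.",
        f"Select a technique for the vulnerability type: {vuln_type}.",
        f"Apply the key functions and structures: {reasoning}.",
    ]
    # Number the reasoning steps c_{i,1}, ..., c_{i,k}.
    numbered = [f"Step {n}: {text}" for n, text in enumerate(steps, start=1)]
    return {"system": system_prompt, "cot": numbered, "payload": payload}


# Hypothetical example values, suitable only for an authorized test lab.
sample = build_cot_sample(
    system_prompt="You generate SQL injection detection payloads for authorized testing.",
    reasoning="SLEEP() inside an AND condition",
    injection=InjectionPoint(parameter="id", initial_value="1", closure="'"),
    vuln_type="Time Blind",
    payload="1' AND SLEEP(5)-- -",
)
for line in sample["cot"]:
    print(line)
```

Keeping the payload separate from the reasoning steps lets the record serve as an input-output pair for supervised fine-tuning.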

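Dataset	Source	Description
Training data	Exploit-Database	Collects a large number of real attack payloads and exploit code, improving the model's understanding of real attack behavior
	PacketStorm	Provides extensive security documentation and exploit examples, increasing the diversity and coverage of the dataset
	CVE^[25]	Provides standardized vulnerability identifiers and descriptions, ensuring that the data are representative and current
Test data	SQLI-Labs^[26]	A resource focused on SQL injection vulnerabilities, providing a variety of attack examples and remediation methods
	DVWA^[27]	Provides a legal environment for common Web attack techniques, including basic, intermediate, and advanced SQL injection scenarios
	Pikachu^[28]	A security testing platform for practicing Web vulnerabilities, covering a wide range of vulnerability types including but not limited to SQL injection and XSS
	bWAPP^[29]	A practice platform supporting more than 100 vulnerability scenarios, applicable to multiple server-side programming languages and including SQL injection vulnerabilities
	Newsqliset	Collects injection vulnerability cases such as ORM, NoSQL, and GraphQL injection from CVE, CNVD, and FreeBuf reports and from personal reproductions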

Injection type	Exploit-Database (count)	PacketStorm (count)	CVE (count)	Total (count)
Union-based	1032	33	54	1119
Time Blind	126	147	88	361
Error-based	48	41	42	131
Boolean Blind	240	94	28	362
Stacked Query	49	6	7	62
Wide Byte	0	0	1	1
Column Probing	10	2	1	13
File Write	11	3	3	17
Bypass	173	51	13	237
Multi Parameter	5	0	0	5
ORM	0	2	1	3
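
The per-type counts above can be checked and summarized with a short script. This is a sketch for inspecting the distribution, not part of the proposed method; the counts are copied from the table and the function name is hypothetical. The row totals sum to 2311 samples, with Union-based accounting for just under half of them.

```python
# Rows copied from the table above: (Exploit-Database, PacketStorm, CVE).
counts = {
    "Union-based": (1032, 33, 54),
    "Time Blind": (126, 147, 88),
    "Error-based": (48, 41, 42),
    "Boolean Blind": (240, 94, 28),
    "Stacked Query": (49, 6, 7),
    "Wide Byte": (0, 0, 1),
    "Column Probing": (10, 2, 1),
    "File Write": (11, 3, 3),
    "Bypass": (173, 51, 13),
    "Multi Parameter": (5, 0, 0),
    "ORM": (0, 2, 1),
}


def summarize(counts):
    """Return per-type totals, the grand total, and each type's share."""
    row_totals = {name: sum(row) for name, row in counts.items()}
    grand_total = sum(row_totals.values())
    shares = {name: total / grand_total for name, total in row_totals.items()}
    return row_totals, grand_total, shares


row_totals, grand_total, shares = summarize(counts)

# Print types from most to least frequent to expose the class imbalance.
for name in sorted(shares, key=shares.get, reverse=True):
    print(f"{name:<16}{row_totals[name]:>6}{shares[name]:>8.1%}")
print(f"{'Total':<16}{grand_total:>6}")
```

The long tail (Wide Byte, ORM, Multi Parameter) has only a handful of samples each, which is worth keeping in mind when interpreting per-type results.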