Research on Protocol Fuzzing Technology Guided by Large Language Models

doi:10.3969/j.issn.1671-1122.2025.12.002

Abstract

Security vulnerabilities in network protocol software occur frequently and pose serious threats to cyberspace security. Gray-box protocol fuzzing tools such as AFLNet have improved vulnerability detection by introducing coverage feedback and state modeling. However, constrained by a persistent “semantic barrier”, these tools struggle to comprehend protocol syntax and contextual logic, which limits their testing efficiency. In recent years, large language models have demonstrated strong generalization and comprehension capabilities in tasks such as semantic modeling, contextual reasoning, and code generation, offering a promising way to overcome this barrier. This paper proposes LPF (LLMProFuzz), a protocol fuzzing framework guided by large language models, which addresses the limitations of traditional methods in three ways. First, it automatically extracts protocol syntax templates through few-shot prompt engineering. Second, it uses a seed enrichment mechanism based on the characteristics of historical vulnerabilities to generate high-value initial test cases that cover boundary and exceptional scenarios. Third, it introduces a structure-aware strategy for selecting mutation locations, which increases the proportion of effective test cases. Experimental results on representative protocol implementations, including HTTP, FTP, and RTSP, demonstrate that LPF significantly outperforms baseline tools such as AFLNet and StateAFL in terms of code coverage, state coverage, and testing efficiency.
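The idea of structure-aware mutation location selection can be illustrated with a minimal sketch. This is not the LPF implementation: the template format, the `TEMPLATE` value, and the helper names `mutable_spans` and `mutate` are assumptions made for illustration. The sketch assumes a syntax template that marks which parts of a message are fixed literals and which are mutable fields, and it restricts bit flips to the mutable fields so that the command keyword and line terminator stay valid.

```python
import random

# Hypothetical syntax template for one FTP command line (an assumption,
# not LPF's actual format): literals stay fixed, fields may be mutated.
TEMPLATE = [("literal", b"USER "), ("field", b"anonymous"), ("literal", b"\r\n")]

def mutable_spans(template):
    """Return (start, end) byte offsets of every mutable field."""
    spans, offset = [], 0
    for kind, value in template:
        if kind == "field":
            spans.append((offset, offset + len(value)))
        offset += len(value)
    return spans

def mutate(template, rng):
    """Flip one random bit inside a randomly chosen non-empty field."""
    message = bytearray(b"".join(value for _, value in template))
    spans = [(s, e) for s, e in mutable_spans(template) if e > s]
    start, end = rng.choice(spans)          # raises if no mutable field exists
    position = rng.randrange(start, end)    # byte offset inside the field
    message[position] ^= 1 << rng.randrange(8)
    return bytes(message)

mutated = mutate(TEMPLATE, random.Random(0))
```

Because every mutation lands inside a field span, the mutated message keeps its `USER ` prefix and `\r\n` terminator, so a server is more likely to parse it far enough to reach deeper code than it would with a blind byte-level mutation.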

Key words: large language models, network protocol, fuzzing, structure-aware mutation, prompt engineering

CLC Number: TP309

YANG Liqun, LI Zhen, WEI Chaoren, YAN Zhimin, QIU Yongxin. Research on Protocol Fuzzing Technology Guided by Large Language Models[J]. Netinfo Security, 2025, 25(12): 1847-1862.


Server software	Protocol	Version
ProFTPD	FTP	6903c5e
Live555	RTSP	2c92a57
Lighttpd1	HTTP	51d48de
Exim	SMTP	e3c7712
Kamailio	SIP	8d482b1

Model	Parameters (B)	Accuracy	Missing rate	Excess rate
GPT-4	1760	96.7%	3.3%	4.7%
Qwen3	235	94.8%	5.2%	7.9%
Code Llama	34	91.2%	8.8%	10.4%

Target	LPF branches covered	AFLNet branches covered	LPF improvement over AFLNet	StateAFL branches covered	LPF improvement over StateAFL
ProFTPD	5043	4812	4.80%	4733	6.55%
Live555	3026	2792	8.38%	2801	8.03%
Lighttpd1	4726	4610	2.52%	4778	-1.09%
Exim	3880	3694	5.04%	3564	8.87%
Kamailio	9775	9565	2.20%	9323	4.85%
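
The percentage columns in the table above are consistent with a relative gain of LPF over each baseline, (LPF - baseline) / baseline. The short sketch below recomputes them from the branch counts in the table; the function name `improvement` is chosen here for illustration, and the formula is inferred from the tabulated values rather than quoted from the paper.

```python
def improvement(lpf, baseline):
    """Relative gain of LPF over a baseline, in percent."""
    return (lpf - baseline) / baseline * 100

# Branch counts copied from the table above: (LPF, AFLNet, StateAFL).
branches = {
    "ProFTPD": (5043, 4812, 4733),
    "Live555": (3026, 2792, 2801),
    "Lighttpd1": (4726, 4610, 4778),
    "Exim": (3880, 3694, 3564),
    "Kamailio": (9775, 9565, 9323),
}

for target, (lpf, aflnet, stateafl) in branches.items():
    print(f"{target}: {improvement(lpf, aflnet):.2f}% vs AFLNet, "
          f"{improvement(lpf, stateafl):.2f}% vs StateAFL")
```

A negative value, as for Lighttpd1 against StateAFL, means the baseline covered more branches than LPF on that target.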

Target	LPF states covered	AFLNet states covered	LPF improvement over AFLNet	StateAFL states covered	LPF improvement over StateAFL
ProFTPD	23.5	20.4	15.20%	21.2	10.85%
Live555	11.3	8.8	28.41%	11.0	2.73%
Lighttpd1	8.5	7.2	18.06%	8.7	-2.30%
Exim	14.4	11.0	30.91%	12.1	19.01%
Kamailio	10.1	9.5	6.32%	11.7	-13.68%

Target	AFLNet time to crash (s)	StateAFL time to crash (s)	LPF time to crash (s)
ProFTPD	—	—	5845.4
Live555	2427.5	4618.7	2213.6
Lighttpd1	—	9228.3	4845.4
Exim	1880.0	594.9	864.6
Kamailio	1977.5	3250.6	1893.2