Endogenous Security Heterogeneous Entity Generation Method Based on Large Language Model

doi:10.3969/j.issn.1671-1122.2024.08.009

Abstract:

To address the security threats posed by unknown vulnerabilities and backdoors in software systems, this paper proposes an endogenous security heterogeneous entity generation method based on large language models. Built around the endogenous security strategy, the method diversifies the vulnerable code in a program into multiple heterogeneous entities, so that when one of them is attacked the system can quickly switch to a healthy heterogeneous entity and keep running stably. The method uses large language models to generate the diverse heterogeneous entities. It also improves existing fuzz testing with a seed-distance-based method that raises test case quality and code coverage, and uses the improved fuzzing to verify that the heterogeneous entities are functionally equivalent. Experimental results show that the method can effectively repair code vulnerabilities and produce functionally equivalent heterogeneous entities. Compared with the existing AFL algorithm, the optimized fuzz testing method takes less time to reach the same code coverage. The proposed method can therefore significantly improve the security and robustness of software systems, and it offers a new strategy for defending against unknown threats.
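The functional-equivalence requirement can be illustrated with a small differential test. This is only a sketch under stated assumptions, not the paper's implementation: the abstract does not describe how the heterogeneous entities are represented or how the seed-distance-based fuzzer selects inputs, so the three sorting variants, the helper names (`find_disagreement`, `random_inputs`) and the plain random input generator below are all hypothetical stand-ins.

```python
import random
from typing import Callable, List, Optional, Sequence

Variant = Callable[[List[int]], List[int]]


def sort_builtin(xs: List[int]) -> List[int]:
    """Reference entity (hypothetical): the original implementation."""
    return sorted(xs)


def sort_insertion(xs: List[int]) -> List[int]:
    """Heterogeneous entity (hypothetical): same behaviour, different code."""
    out: List[int] = []
    for x in xs:
        i = len(out)
        while i > 0 and out[i - 1] > x:
            i -= 1
        out.insert(i, x)
    return out


def sort_dedup(xs: List[int]) -> List[int]:
    """Non-equivalent entity (hypothetical): silently drops duplicates."""
    return sorted(set(xs))


def random_inputs(count: int, seed: int = 0) -> List[List[int]]:
    """Plain random test inputs; a real fuzzer would choose them more cleverly."""
    rng = random.Random(seed)
    return [
        [rng.randint(0, 9) for _ in range(rng.randint(0, 8))]
        for _ in range(count)
    ]


def find_disagreement(
    reference: Variant, candidate: Variant, inputs: Sequence[List[int]]
) -> Optional[List[int]]:
    """Return the first input on which the two entities differ, else None."""
    for inp in inputs:
        # Pass copies so that neither entity can mutate the shared input.
        if reference(list(inp)) != candidate(list(inp)):
            return inp
    return None


if __name__ == "__main__":
    tests = random_inputs(200) + [[1, 1]]  # include one input with a duplicate
    print("insertion sort:", find_disagreement(sort_builtin, sort_insertion, tests))
    print("dedup sort:", find_disagreement(sort_builtin, sort_dedup, tests))
```

A candidate that disagrees with the reference on any test input would be rejected. Agreement on all inputs is evidence of equivalence rather than proof, which is why the quality and coverage of the generated test cases matter.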

Key words: endogenous security, large language model, fuzz testing

CLC Number: TP309

CHEN Haoran, LIU Yu, CHEN Ping. Endogenous Security Heterogeneous Entity Generation Method Based on Large Language Model[J]. Netinfo Security, 2024, 24(8): 1231-1240.

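Program	Complexity	Method	Time (h)	Coverage
Program 1	O(n log n)	AFL	1	58.34%
Program 1	O(n log n)	AFL	3	73.23%
Program 1	O(n log n)	AFL	6	89.45%
Program 1	O(n log n)	Proposed method	1	52.86%
Program 1	O(n log n)	Proposed method	3	69.92%
Program 1	O(n log n)	Proposed method	6	91.56%
Program 2	O(n²)	AFL	5	46.74%
Program 2	O(n²)	AFL	10	60.23%
Program 2	O(n²)	AFL	15	71.49%
Program 2	O(n²)	Proposed method	5	49.01%
Program 2	O(n²)	Proposed method	10	65.59%
Program 2	O(n²)	Proposed method	15	77.03%
Program 3	O(n³)	AFL	12	39.58%
Program 3	O(n³)	AFL	18	49.89%
Program 3	O(n³)	AFL	24	57.42%
Program 3	O(n³)	Proposed method	12	43.57%
Program 3	O(n³)	Proposed method	18	55.28%
Program 3	O(n³)	Proposed method	24	65.24%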
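
To make the comparison in the table above easier to read, the following sketch computes the coverage difference between the proposed method and AFL at each time checkpoint, in percentage points. The figures are transcribed from the table. The names `COVERAGE`, `coverage_gain` and `final_gains` are illustrative and do not come from the paper.

```python
from typing import Dict, Tuple

# (program number, hours) -> (AFL coverage %, proposed-method coverage %),
# transcribed from the table above.
COVERAGE: Dict[Tuple[int, int], Tuple[float, float]] = {
    (1, 1): (58.34, 52.86),
    (1, 3): (73.23, 69.92),
    (1, 6): (89.45, 91.56),
    (2, 5): (46.74, 49.01),
    (2, 10): (60.23, 65.59),
    (2, 15): (71.49, 77.03),
    (3, 12): (39.58, 43.57),
    (3, 18): (49.89, 55.28),
    (3, 24): (57.42, 65.24),
}


def coverage_gain(program: int, hours: int) -> float:
    """Proposed-method coverage minus AFL coverage, in percentage points."""
    afl, proposed = COVERAGE[(program, hours)]
    return round(proposed - afl, 2)


def final_gains() -> Dict[int, float]:
    """Gain at the longest time budget recorded for each program."""
    last_hours: Dict[int, int] = {}
    for program, hours in COVERAGE:
        last_hours[program] = max(hours, last_hours.get(program, 0))
    return {p: coverage_gain(p, h) for p, h in sorted(last_hours.items())}


if __name__ == "__main__":
    for program, hours in sorted(COVERAGE):
        gain = coverage_gain(program, hours)
        print(f"Program {program} at {hours:>2} h: {gain:+.2f} points")
```

At the longest time budget for each program, the difference is +2.11, +5.54 and +7.82 percentage points for Programs 1, 2 and 3 respectively. For Program 1, AFL is ahead at the 1 h and 3 h checkpoints (by 5.48 and 3.31 points).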