Netinfo Security ›› 2024, Vol. 24 ›› Issue (5): 778-793.doi: 10.3969/j.issn.1671-1122.2024.05.011
ZHANG Changlin1, TONG Xin2, TONG Hui3, YANG Ying4
Received: 2024-02-26
Online: 2024-05-10
Published: 2024-06-24
Contact: TONG Xin
E-mail: tongxindotnet@outlook.com
ZHANG Changlin, TONG Xin, TONG Hui, YANG Ying. A Survey of Large Language Models in the Domain of Cybersecurity[J]. Netinfo Security, 2024, 24(5): 778-793.
URL: http://netinfo-security.org/EN/10.3969/j.issn.1671-1122.2024.05.011
Year | Model | Developer | Open-source status | Parameter scale | Access link
---|---|---|---|---|---
2019 | T5 [20] | Google | Open source | 60M, 220M, 770M, 3B, 11B | github.com/google-research/text-to-text-transfer-transformer
2022 | ChatGPT | OpenAI | Closed source | Unknown | chat.openai.com
2022 | BLOOMZ [21] | BigScience | Open source | 560M, 1.1B, 1.7B, 3B, 7.1B, 176B | huggingface.co/bigscience/bloomz
2023 | GPT-4 [2] | OpenAI | Closed source | Unknown | chat.openai.com
2023 | Gemini | Google | Closed source | Unknown | gemini.google.com
2023 | Claude 2 | Anthropic | Closed source | Unknown | claude.ai
2023 | LLaMA 2 [4] | Meta | Open source | 7B, 13B, 70B | llama.meta.com
2023 | Mistral | Mistral AI | Open source | 7B, 46.7B | mistral.ai
2023 | Alpaca [22] | Stanford | Open source | 7B | github.com/tatsu-lab/stanford_alpaca
2023 | ChatGLM [23] | Zhipu AI | Partially open source | 6B, 12B, 130B | chatglm.cn
2023 | Baichuan2 [25] | Baichuan Intelligence | Partially open source | 7B, 13B, 53B | www.baichuan-ai.com
2023 | QWEN [26] | Alibaba | Open source | 1.8B, 7B, 14B, 72B | tongyi.aliyun.com/qianwen
2023 | iFLYTEK Spark | iFLYTEK | Closed source | Unknown | xinghuo.xfyun.cn
2023 | ERNIE Bot | Baidu | Closed source | Unknown | yiyan.baidu.com
2023 | InternLM | Shanghai AI Laboratory | Open source | 1.8B, 7B, 20B | internlm.org
2023 | Aquila2 | BAAI | Open source | 7B, 34B, 70B | github.com/FlagAI-Open/Aquila2
Reference | Base model | Method | Main function
---|---|---|---
PentestGPT [27] | GPT-3.5, GPT-4 | Prompt engineering | Automated penetration testing via task decomposition
Ref. [28] | GPT-3.5 | Prompt engineering | Analyzes the cybersecurity task-planning capability of LLMs at the macro level, and tests their effectiveness in concrete penetration assistance at the micro level
KARTAL [29] | MPNet, MiniLM, DistillRoBERTa | Fine-tuning | Web vulnerability hunting via task decomposition
Ref. [30] | GPT-3.5, GPT-4 | Prompt engineering | Uses ChatGPT to detect CWE-653 vulnerabilities
RatGPT [31] | GPT-4 | Prompt engineering | Uses LLMs as a proxy between the attacker and the target machine, assisting the interaction
Ref. [32] | GPT-4, Bard | Prompt engineering | Explores whether LLMs can serve as a cybersecurity red team
HuntGPT [35] | GPT-3.5 | Prompt engineering | Uses ChatGPT to provide explanations and user interaction for a machine-learning-based intrusion detection system
NetGPT [36] | GPT-2 | Fine-tuning | GPT-2-based network traffic understanding and generation model, with good results on intrusion detection tasks
FlowTransformer [37] | GPT-2 | Fine-tuning | GPT-2-based intrusion detection model, with optimizations to the input encoding and classification head to further improve efficiency and accuracy on the task
Lens [38] | T5 | Fine-tuning | Designs three task-specific fine-tuning objectives for network traffic analysis, performing well on both traditional and IoT intrusion detection
Ref. [ | ChatGPT | Prompt engineering | Uses LLMs to build highly deceptive honeypots that interact with attackers
Ref. [44] | GPT-3 | Fine-tuning | Generates highly deceptive honeywords
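Several prompt-engineering systems in the table above (HuntGPT being the clearest case) share one step: an ML-based detector raises an alert, and a prompt is assembled so the LLM can explain it to an analyst. The following is a minimal sketch of that prompt-assembly step only; the field names and wording are illustrative assumptions, not taken from the surveyed papers.

```python
# Sketch of the HuntGPT-style pattern: format an IDS alert as an LLM prompt
# asking for a plain-language explanation. All field names are assumed.

def build_explanation_prompt(alert: dict) -> str:
    """Turn a raw IDS alert dict into a prompt for an LLM analyst assistant."""
    lines = [
        "You are a security analyst assistant.",
        "Explain the following intrusion-detection alert and suggest next steps.",
        "",
    ]
    # Emit alert fields in a fixed order so prompts stay comparable across alerts.
    for key in ("timestamp", "src_ip", "dst_ip", "signature", "model_score"):
        lines.append(f"{key}: {alert.get(key, 'unknown')}")
    return "\n".join(lines)

alert = {
    "timestamp": "2024-05-10T12:00:00Z",
    "src_ip": "203.0.113.7",
    "dst_ip": "10.0.0.5",
    "signature": "possible SSH brute force",
    "model_score": 0.97,
}
prompt = build_explanation_prompt(alert)
print(prompt)
```

The resulting string would then be sent to whichever chat model the deployment uses; the LLM adds explanation and interaction on top of the detector, rather than replacing it.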
Reference | Base model | Method | Main function
---|---|---|---
Ref. [45] | GPT-3 | Prompt engineering | Generates functional modules of malicious code
Ref. [46] | ChatGPT, text-davinci-003 | Prompt engineering | Bypasses LLM safety guardrails with the help of Auto-GPT, then generates malicious code and intrusion tools
Ref. [47] | ChatGPT, Bard | Prompt engineering | Uses ChatGPT and Bard to generate attack code for the ten MITRE techniques most popular in 2022
VulDetect [48] | GPT-2 | Fine-tuning | Uses LLMs to audit code for vulnerabilities and risk points
Ref. [51] | FalconLLM | Fine-tuning | Uses LLMs to audit C code for vulnerabilities
Charalambous et al. [52] | GPT-3.5 | Prompt engineering | Combines ChatGPT with a bounded model checker to identify and repair code vulnerabilities
Ref. [53] | Codex, Jurassic J-1 | Prompt engineering | Uses LLMs to analyze and repair code vulnerabilities; handles synthetic risky code effectively, but still needs further optimization on real-world samples
Ref. [54] | GPT-3.5, GPT-4 | Prompt engineering | Uses LLMs to identify smart contract vulnerabilities; ChatGPT outperforms the baseline methods on 4 of 7 sample analyses
GPTScan [55] | GPT-3.5, GPT-4 | Prompt engineering | The first smart contract logic vulnerability detection tool to combine GPT with static analysis, reducing task complexity to some extent and improving analysis quality
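A common thread in the prompt-engineering rows above is decomposition: a whole program rarely fits a model's context or attention budget, so the code is split into units and each unit is audited separately. As an illustrative sketch only (not the GPTScan implementation, whose static analysis is far more involved), the following splits a C-like source file into function-level chunks and pairs each with an audit question; the splitter heuristic and prompt wording are assumptions.

```python
# Illustrative decomposition for LLM code auditing: chunk by function, then
# build one vulnerability-audit prompt per chunk. Demo heuristics only.
import re

def split_functions(c_source: str) -> list[str]:
    """Very rough function-level splitter for C-like code (demo only):
    split before lines that look like `type name(args) {`."""
    parts = re.split(r"\n(?=\w[\w\s\*]*\([^)]*\)\s*\{)", c_source)
    return [p.strip() for p in parts if p.strip()]

def audit_prompts(c_source: str, cwe: str = "CWE-787") -> list[str]:
    template = ("Does the following function contain a {cwe} "
                "(out-of-bounds write) vulnerability? Answer yes/no and explain.\n\n"
                "{code}")
    return [template.format(cwe=cwe, code=fn) for fn in split_functions(c_source)]

src = """int copy(char *dst, char *src) {
    while (*src) *dst++ = *src++;
    return 0;
}

int main(void) {
    return 0;
}"""
prompts = audit_prompts(src)
print(len(prompts))  # one prompt per function
```

Each prompt would then be sent to the model independently, and the per-function verdicts aggregated; production tools replace the regex splitter with a real parser.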
Reference | Base model | Method | Main function
---|---|---|---
FraudGPT [56] | Unknown | Unknown | Offers a range of social engineering attack capabilities, such as generating spear-phishing emails and creating cracking tools
Ref. [57] | GPT-3.5 | Prompt engineering | Completes the full phishing-site pipeline: cloning target websites, stealing credentials, obfuscating code, auto-deploying the site on cloud servers, registering phishing domains, and integrating the site with a reverse proxy
WormGPT [58] | GPT-J | Fine-tuning | Used to launch business email compromise attacks to extort targets
Ref. [60] | GPT-3.5, GPT-4 | Prompt engineering | Automatically collects victim information and generates targeted phishing emails at very low cost
PassGPT [61] | GPT-2 | Fine-tuning | Efficiently guesses passwords under user-supplied constraints
Ref. [64] | ChatGPT | Prompt engineering | Uses LLMs to interact with phishing-email scammers in order to waste their time
Ref. [65] | T5 | Fine-tuning | Uses a T5 model to detect malicious URLs, with knowledge distillation to improve analysis efficiency
ChatPhishDetector [66] | GPT-4V | Prompt engineering | Combines a crawler with a multimodal GPT model for automated phishing-site detection
Reference | Base model | Method | Main function
---|---|---|---
Ref. [67] | GPT-3.5, GPT-4 | Prompt engineering | Uses LLMs to support cybersecurity policy and management for small and medium-sized enterprises
Ref. [68] | ChatGPT, Alpaca, Falcon | Prompt engineering | Explores LLM applications in open-source cyber threat intelligence analysis, covering two tasks: entity recognition and text classification
Ref. [69] | ChatGPT, Bard, Bing | Prompt engineering | Examines LLM capability in CTF competitions, showing that models such as ChatGPT have solid knowledge of Web security, binary exploitation, cryptography, reverse engineering, and forensic analysis
CyberSecEval [70] | Various mainstream LLMs | Prompt engineering | A benchmark for evaluating LLMs on insecure code generation and assistance with cyberattacks
CyberMetric [71] | Various mainstream LLMs | Prompt engineering | A benchmark for evaluating LLM knowledge of cybersecurity topics such as cryptography, reverse engineering, and risk assessment
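Benchmarks such as CyberMetric score models on multiple-choice cybersecurity questions. A minimal harness sketch follows; the sample question, the item schema, and the `ask_model` stub are placeholders for illustration, not the published benchmark data or API.

```python
# Minimal multiple-choice evaluation harness sketch. The question format and
# `ask_model` callable are assumptions; a real run would call an actual LLM.

def evaluate(questions, ask_model) -> float:
    """Return accuracy of `ask_model` over (question, options, answer) items."""
    correct = 0
    for q in questions:
        # Render the question with lettered options, as benchmarks typically do.
        prompt = q["question"] + "\n" + "\n".join(
            f"{label}) {text}" for label, text in q["options"].items())
        # Accept answers that start with the correct option letter.
        if ask_model(prompt).strip().upper().startswith(q["answer"]):
            correct += 1
    return correct / len(questions)

sample = [{
    "question": "Which port does HTTPS use by default?",
    "options": {"A": "80", "B": "443", "C": "22", "D": "25"},
    "answer": "B",
}]
# A stub standing in for a real LLM call:
acc = evaluate(sample, lambda prompt: "B")
print(acc)  # 1.0
```

Real benchmarks add answer-extraction robustness (models rarely reply with a bare letter) and report per-topic breakdowns, but the scoring loop has this shape.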
[1] | WU Tianyu, HE Shizhu, LIU Jingping, et al. A Brief Overview of ChatGPT: The History, Status Quo and Potential Future Development[J]. IEEE/CAA Journal of Automatica Sinica, 2023, 10(5): 1122-1136. |
[2] | ACHIAM J, ADLER S, AGARWAL S, et al. GPT-4 Technical Report[EB/OL]. (2024-03-04)[2024-03-15]. https://arxiv.org/pdf/2303.08774.pdf. |
[3] | TOUVRON H, LAVRIL T, IZACARD G, et al. Llama: Open and Efficient Foundation Language Models[EB/OL]. (2023-02-27)[2024-02-13]. https://arxiv.org/pdf/2302.13971.pdf. |
[4] | TOUVRON H, MARTIN L, STONE K, et al. Llama 2: Open Foundation and Fine-Tuned Chat Models[EB/OL]. (2023-07-19)[2024-02-13]. https://arxiv.org/pdf/2307.09288.pdf. |
[5] | CUI Jiaxi, LI Zongjian, YAN Yang, et al. Chatlaw: Open-Source Legal Large Language Model with Integrated External Knowledge Bases[EB/OL]. (2023-06-28)[2024-02-13]. https://arxiv.org/pdf/2306.16092.pdf. |
[6] | NGUYEN H T. A Brief Report on LawGPT 1.0: A Virtual Legal Assistant Based on GPT-3[EB/OL]. (2023-02-14)[2024-02-13]. https://arxiv.org/pdf/2302.05729.pdf. |
[7] | ZHANG Kai, YU Jun, ADHIKARLA E, et al. BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks[EB/OL]. (2024-01-09)[2024-02-13]. https://arxiv.org/pdf/2305.17100.pdf. |
[8] | WANG Haochun, LIU Chi, XI Nuwa, et al. Huatuo: Tuning Llama Model with Chinese Medical Knowledge[EB/OL]. (2023-04-14)[2024-02-13]. https://arxiv.org/pdf/2304.06975.pdf. |
[9] | WU Shijie, IRSOY O, LU S, et al. Bloomberggpt: A Large Language Model for Finance[EB/OL]. (2024-01-09)[2024-02-13]. https://arxiv.org/pdf/2303.17564.pdf. |
[10] | DAN Yuhao, LEI Zhikai, GU Yiyang, et al. Educhat: A Large-Scale Language Model-Based Chatbot System for Intelligent Education[EB/OL]. (2023-08-05)[2024-02-13]. https://arxiv.org/pdf/2308.02773.pdf. |
[11] | VASWANI A, SHAZEER N, PARMAR N, et al. Attention is All You Need[C]// ACM. Proceedings of the 31st International Conference on Neural Information Processing Systems. New York: ACM, 2017: 6000-6010. |
[12] | XU Jingjing, SUN Xu, ZHANG Zhiyuan, et al. Understanding and Improving Layer Normalization[C]// ACM. Proceedings of the 33rd International Conference on Neural Information Processing Systems. New York: ACM, 2019: 4381-4391. |
[13] | ZHAO W X, ZHOU Kun, LI Junyi, et al. A Survey of Large Language Models[EB/OL]. (2023-11-24)[2024-02-13]. https://arxiv.org/pdf/2303.18223.pdf. |
[14] | RADFORD A, NARASIMHAN K, SALIMANS T, et al. Improving Language Understanding by Generative Pre-Training[EB/OL]. (2018-06-12)[2024-02-13]. https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf. |
[15] | SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal Policy Optimization Algorithms[EB/OL]. (2017-08-28)[2024-02-13]. https://arxiv.org/pdf/1707.06347.pdf. |
[16] | LONG Ouyang, JEFF W, XU Jiang, et al. Training Language Models to Follow Instructions with Human Feedback[J]. Advances in Neural Information Processing Systems, 2022, 35: 27730-27744. |
[17] | WEI J, TAY Y, BOMMASANI R, et al. Emergent Abilities of Large Language Models[EB/OL]. (2022-10-26)[2024-02-13]. https://arxiv.org/pdf/2206.07682.pdf. |
[18] | BROWN T, MANN B, RYDER N, et al. Language Models are Few-Shot Learners[J]. Advances in Neural Information Processing Systems, 2020, 33: 1877-1901. |
[19] | WEI J, WANG Xuezhi, SCHUURMANS D, et al. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models[J]. Advances in Neural Information Processing Systems, 2022, 35: 24824-24837. |
[20] | RAFFEL C, SHAZEER N, ROBERTS A, et al. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer[J]. The Journal of Machine Learning Research, 2020, 21(1): 5485-5551. |
[21] | MUENNIGHOFF N, WANG T, SUTAWIKA L, et al. Crosslingual Generalization through Multitask Finetuning[EB/OL]. (2023-05-29)[2024-02-13]. https://arxiv.org/pdf/2211.01786.pdf. |
[22] | TAORI R, GULRAJANI I, ZHANG T, et al. Stanford Alpaca: An Instruction-Following Llama Model[EB/OL]. (2023-05-30)[2024-02-13]. https://github.com/tatsu-lab/stanford_alpaca. |
[23] | DU Zhengxiao, QIAN Yujie, LIU Xiao, et al. GLM: General Language Model Pretraining with Autoregressive Blank Infilling[C]// ACL. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. Dublin:ACL, 2022: 320-335. |
[24] | ZENG Aohan, LIU Xiao, DU Zhengxiao, et al. GLM-130B: An Open Bilingual Pre-Trained Model[EB/OL]. (2023-10-25)[2024-02-13]. https://arxiv.org/pdf/2210.02414.pdf. |
[25] | YANG Aiyuan, XIAO Bin, WANG Bingning, et al. Baichuan 2: Open Large-Scale Language Models[EB/OL]. (2023-09-20)[2024-02-13]. https://arxiv.org/pdf/2309.10305.pdf. |
[26] | BAI Jinzu, BAI Shuai, CHU Yunfei, et al. Qwen Technical Report[EB/OL]. (2024-01-09)[2024-02-13]. https://arxiv.org/pdf/2309.16609.pdf. |
[27] | DENG Gelei, LIU Yi, MAYORAL-VILCHES V, et al. Pentestgpt: An LLM-Empowered Automatic Penetration Testing Tool[EB/OL]. (2023-08-13)[2024-02-13]. https://arxiv.org/pdf/2308.06782.pdf. |
[28] | HAPPE A, CITO J. Getting Pwn’d by Ai: Penetration Testing with Large Language Models[C]// ACM. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. New York: ACM, 2023: 2082-2086. |
[29] | SAKAOGLU S. KARTAL: Web Application Vulnerability Hunting Using Large Language Models: Novel Method for Detecting Logical Vulnerabilities in Web Applications with Finetuned Large Language Models[D]. Stockholm: KTH Royal Institute of Technology, 2023. |
[30] | SZABÓ Z, BILICKI V. A New Approach to Web Application Security: Utilizing GPT Language Models for Source Code Inspection[EB/OL]. (2023-09-28)[2024-02-13]. https://doi.org/10.3390/fi15100326. |
[31] | BECKERICH M, PLEIN L, CORONADO S. Ratgpt: Turning Online LLMs into Proxies for Malware Attacks[EB/OL]. (2023-09-07)[2024-02-13]. https://arxiv.org/pdf/2308.09183.pdf. |
[32] | GARVEY B, SVENDSEN A. Can Generative-AI (ChatGPT and Bard) Be Used as Red Team Avatars in Developing Foresight Scenarios?[EB/OL]. (2023-08-28)[2024-02-13]. http://dx.doi.org/10.2139/ssrn.4703135. |
[33] | RESENDE P A A, DRUMMOND A C. A Survey of Random Forest Based Methods for Intrusion Detection Systems[J]. ACM Computing Surveys (CSUR), 2018, 51(3): 1-36. |
[34] | WANG Huiwen, GU Jie, WANG Shanshan. An Effective Intrusion Detection Framework Based on SVM with Feature Augmentation[J]. Knowledge-Based Systems, 2017, 136: 130-139. |
[35] | ALI T, KOSTAKOS P. HuntGPT: Integrating Machine Learning-Based Anomaly Detection and Explainable AI with Large Language Models (LLMs)[EB/OL]. (2023-09-27)[2024-02-13]. https://arxiv.org/pdf/2309.16021.pdf. |
[36] | MENG Xuying, LIN Chuangang, WANG Yuquan, et al. NetGPT: Generative Pretrained Transformer for Network Traffic[EB/OL]. (2023-05-17)[2024-02-13]. https://arxiv.org/pdf/2304.09513.pdf. |
[37] | MANOCCHIO L D, LAYEGHY S, LO W W, et al. Flowtransformer: A Transformer Framework for Flow-Based Network Intrusion Detection Systems[EB/OL]. (2023-04-28)[2024-02-13]. https://arxiv.org/pdf/2304.14746v1.pdf. |
[38] | WANG Qineng, QIAN Chen, LI Xiaochang, et al. Lens: A Foundation Model for Network Traffic in Cybersecurity[EB/OL]. [2024-02-13]. https://arxiv.org/pdf/2402.03646.pdf. |
[39] | NETO E C P, DADKHAH S, FERREIRA R, et al. CICIoT2023: A Real-Time Dataset and Benchmark for Large-Scale Attacks in IoT Environment[J]. Sensors, 2023, 23(13): 5941-5958. |
[40] | PROVOS N. A Virtual Honeypot Framework[C]// USENIX. USENIX Security Symposium. Berkeley: USENIX, 2004, 173(2004): 1-14. |
[41] | MCKEE F, NOEVER D. Chatbots in a Honeypot World[EB/OL]. (2023-01-10)[2024-02-13]. https://doi.org/10.48550/arXiv.2301.03771. |
[42] | RAUT U, NAGARKAR A, TALNIKAR C, et al. Engaging Attackers with a Highly Interactive Honeypot System Using ChatGPT[C]// IEEE. 2023 7th International Conference On Computing, Communication, Control and Automation (ICCUBEA). New York: IEEE, 2023: 1-5. |
[43] | SLADIĆ M, VALEROS V, CATANIA C, et al. LLM in the Shell: Generative Honeypots[EB/OL]. (2023-08-31)[2024-02-13]. https://arxiv.org/pdf/2309.00155v1.pdf. |
[44] | YU Fangyi, MARTIN M V. Honey, I Chunked the Passwords: Generating Semantic Honeywords Resistant to Targeted Attacks Using Pre-trained Language Models[C]// Springer. International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment. Heidelberg: Springer, 2023: 89-108. |
[45] | BOTACIN M. Gpthreats-3: Is Automatic Malware Generation a Threat?[C]// IEEE. 2023 IEEE Security and Privacy Workshops (SPW). New York: IEEE, 2023: 238-254. |
[46] | PA P Y M, TANIZAKI S, KOU T, et al. An Attacker’s Dream? Exploring the Capabilities of ChatGPT for Developing Malware[C]// ACM. Proceedings of the 16th Cyber Security Experimentation and Test Workshop. New York: ACM, 2023: 10-18. |
[47] | CHARAN P V, CHUNDURI H, ANAND P M, et al. From Text to MITRE Techniques: Exploring the Malicious Use of Large Language Models for Generating Cyber Attack Payloads[EB/OL]. (2024-01-09)[2024-02-13]. https://arxiv.org/pdf/2305.15336.pdf. |
[48] | OMAR M, SHIAELES S. VulDetect: A Novel Technique for Detecting Software Vulnerabilities Using Language Models[C]// IEEE. 2023 IEEE International Conference on Cyber Security and Resilience (CSR). New York: IEEE, 2023: 105-110. |
[49] | LI Zhen, ZOU Deqing, XU Shouhuai, et al. Sysevr: A Framework for Using Deep Learning to Detect Software Vulnerabilities[J]. IEEE Transactions on Dependable and Secure Computing, 2021, 19(4): 2244-2258. |
[50] | KIM S, CHOI J, AHMED M E, et al. VulDeBERT: A Vulnerability Detection System Using BERT[C]// IEEE. 2022 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW). New York: IEEE, 2022: 69-74. |
[51] | FERRAG M A, BATTAH A, TIHANYI N, et al. Securefalcon: The Next Cyber Reasoning System for Cyber Security[EB/OL]. (2023-07-13)[2024-02-13]. https://arxiv.org/pdf/2307.06616.pdf. |
[52] | CHARALAMBOUS Y, TIHANYI N, JAIN R, et al. A New Era in Software Security: Towards Self-Healing Software via Large Language Models and Formal Verification[EB/OL]. (2023-05-24)[2024-02-13]. https://arxiv.org/pdf/2305.14752.pdf. |
[53] | PEARCE H, TAN B, AHMAD B, et al. Examining Zero-Shot Vulnerability Repair with Large Language Models[C]// IEEE. 2023 IEEE Symposium on Security and Privacy (SP). New York: IEEE, 2023: 2339-2356. |
[54] | CHEN Chong, SU Jianzhong, CHEN Jiachi, et al. When ChatGPT Meets Smart Contract Vulnerability Detection: How Far are We?[EB/OL]. (2023-09-14)[2024-02-13]. https://arxiv.org/pdf/2309.05520.pdf. |
[55] | SUN Yuqiang, WU Daoyuan, XUE Yue, et al. GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis[EB/OL]. (2023-12-25)[2024-02-13]. https://arxiv.org/pdf/2308.03314.pdf. |
[56] | TUSHAR S D. FraudGPT: New Black Hat AI Tool Launched by Cybercriminals[EB/OL]. (2023-07-27) [2024-02-13] https://cybersecuritynews.com/fraudgpt-new-black-hat-ai-tool/. |
[57] | BEGOU N, VINOY J, DUDA A, et al. Exploring the Dark Side of AI: Advanced Phishing Attack Design and Deployment Using ChatGPT[C]// IEEE. 2023 IEEE Conference on Communications and Network Security (CNS). New York: IEEE, 2023: 1-6. |
[58] | FALADE P V. Decoding the Threat Landscape: ChatGPT, FraudGPT, and WormGPT in Social Engineering Attacks[EB/OL]. (2023-10-09)[2024-02-13]. https://arxiv.org/ftp/arxiv/papers/2310/2310.05595.pdf. |
[59] | KARANJAI R. Targeted Phishing Campaigns Using Large Scale Language Models[EB/OL]. (2022-12-30)[2024-02-13]. https://arxiv.org/pdf/2301.00665.pdf. |
[60] | HAZELL J. Large Language Models Can Be Used to Effectively Scale Spear Phishing Campaigns[EB/OL]. (2023-05-11)[2024-02-13]. https://arxiv.org/pdf/2305.06972.pdf. |
[61] | RANDO J, PEREZ-CRUZ F, HITAJ B. PassGPT: Password Modeling and (Guided) Generation with Large Language Models[EB/OL]. (2023-06-14)[2024-02-13]. https://arxiv.org/pdf/2306.01545.pdf. |
[62] | Wikipedia. Rockyou[EB/OL]. (2024-01-21)[2024-02-13]. https://en.wikipedia.org/wiki/RockYou#Data_breach. |
[63] | HITAJ B, GASTI P, ATENIESE G, et al. PassGAN: A Deep Learning Approach for Password Guessing[C]// Springer. Applied Cryptography and Network Security:17th International Conference, ACNS 2019. Heidelberg: Springer, 2019: 217-237. |
[64] | CAMBIASO E, CAVIGLIONE L. Scamming the Scammers: Using ChatGPT to Reply Mails for Wasting Time and Resources[EB/OL]. (2023-02-10)[2024-02-13]. https://arxiv.org/pdf/2303.13521.pdf. |
[65] | VÖRÖS T, BERGERON S P, BERLIN K. Web Content Filtering through Knowledge Distillation of Large Language Models[EB/OL]. (2023-05-10)[2024-02-13]. https://arxiv.org/pdf/2305.05027.pdf. |
[66] | KOIDE T, FUKUSHI N, NAKANO H, et al. Detecting Phishing Sites Using ChatGPT[EB/OL]. [2024-02-13]. https://arxiv.org/pdf/2306.05816.pdf. |
[67] | KEREOPA-YORKE B. Building Resilient SMEs: Harnessing Large Language Models for Cyber Security in Australia[EB/OL]. (2023-06-05)[2024-02-13]. https://arxiv.org/ftp/arxiv/papers/2306/2306.02612.pdf. |
[68] | SHAFEE S, BESSANI A, FERREIRA P M. Evaluation of LLM Chatbots for Osint-Based Cyberthreat Awareness[EB/OL]. (2024-01-26)[2024-02-13]. https://arxiv.org/pdf/2401.15127v1.pdf. |
[69] | TANN W, LIU Yuancheng, SIM J H, et al. Using Large Language Models for Cybersecurity Capture-the-Flag Challenges and Certification Questions[EB/OL]. (2023-08-21)[2024-02-13]. https://arxiv.org/pdf/2308.10443.pdf. |
[70] | BHATT M, CHENNABASAPPA S, NIKOLAIDIS C, et al. Purple Llama Cyberseceval: A Secure Coding Benchmark for Language Models[EB/OL]. (2023-12-07)[2024-02-13]. https://arxiv.org/pdf/2312.04724.pdf. |
[71] | TIHANYI N, FERRAG M A, JAIN R, et al. CyberMetric: A Benchmark Dataset for Evaluating Large Language Models Knowledge in Cybersecurity[EB/OL]. (2024-02-12)[2024-02-23]. https://arxiv.org/pdf/2402.07688.pdf. |
[72] | ZHOU Y, MURESANU A I, HAN Z, et al. Large Language Models are Human-Level Prompt Engineers[EB/OL]. (2023-03-10)[2024-02-13]. https://arxiv.org/pdf/2211.01910.pdf. |
[73] | LIU Haokun, TAM D, MUQEETH M, et al. Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning[J]. Advances in Neural Information Processing Systems, 2022, 35: 1950-1965. |
[74] | ZHANG Jiliang, LI Chen. Adversarial Examples: Opportunities and Challenges[J]. IEEE Transactions on Neural Networks and Learning Systems, 2019, 31(7): 2578-2593. |
[75] | GUPTA M, AKIRI C K, ARYAL K, et al. From ChatGPT to ThreatGPT: Impact of Generative AI in Cybersecurity and Privacy[J]. IEEE Access, 2023, 11: 80218-80245. |
[76] | KUANG Weirui, QIAN Bingchen, LI Zitao, et al. Federatedscope-LLM: A Comprehensive Package for Fine-Tuning Large Language Models in Federated Learning[EB/OL]. (2023-09-01)[2024-02-13]. https://arxiv.org/pdf/2309.00363.pdf. |
[77] | ALON G, KAMFONAS M. Detecting Language Model Attacks with Perplexity[EB/OL]. (2023-11-07)[2024-02-13]. https://arxiv.org/pdf/2308.14132.pdf. |
[78] | SHAFAHI A, NAJIBI M, GHIASI A, et al. Adversarial Training for Free![C]// ACM. Proceedings of the 33rd International Conference on Neural Information Processing Systems. New York: ACM, 2019: 3358-3369. |
[79] | KUMAR A, AGARWAL C, SRINIVAS S, et al. Certifying LLM Safety against Adversarial Prompting[EB/OL]. (2024-02-12)[2024-02-13]. https://arxiv.org/pdf/2309.02705.pdf. |
[80] | KORBAK T, SHI Kejian, CHEN A, et al. Pretraining Language Models with Human Preferences[C]// PMLR. International Conference on Machine Learning. New York: PMLR, 2023: 17506-17533. |
[81] | CHEFER H, GUR S, WOLF L. Transformer Interpretability beyond Attention Visualization[C]// IEEE. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2021: 782-791. |
[82] | JACOBS C, GLAZEBROOK K, QIN A K, et al. Exploring the Interpretability of Deep Neural Networks Used for Gravitational Lens Finding with a Sensitivity Probe[J]. Astronomy and Computing, 2022, 38: 100535-100548. |
[83] | BILLS S, CAMMARATA N, MOSSING D, et al. Language Models Can Explain Neurons in Language Models[EB/OL]. (2023-05-14)[2024-02-13]. https://openai.com/research/language-models-can-explain-neurons-in-language-models. |
[84] | WIGHTMAN G P, DELUCIA A, DREDZE M. Strength in Numbers: Estimating Confidence of Large Language Models by Prompt Agreement[C]// ACL. Proceedings of the 3rd Workshop on Trustworthy Natural Language Processing (TrustNLP 2023). Stroudsburg: ACL, 2023: 326-362. |
[85] | XU Yuzhuang, HAN Xu, YANG Zonghan, et al. OneBit: Towards Extremely Low-Bit Large Language Models[EB/OL]. [2024-02-13]. https://arxiv.org/pdf/2402.11295.pdf. |
[86] | SHAZEER N. Fast Transformer Decoding: One Write-Head is All You Need[EB/OL]. (2019-11-06)[2024-02-13]. https://arxiv.org/pdf/1911.02150.pdf. |
[87] | KWON W, LI Zhuohan, ZHUANG Siyuan, et al. Efficient Memory Management for Large Language Model Serving with Pagedattention[C]// ACM. Proceedings of the 29th Symposium on Operating Systems Principles. New York: ACM, 2023: 611-626. |
[88] | LIU Xiao, LAI Hanyu, YU Hao, et al. WebGLM: Towards an Efficient Web-Enhanced Question Answering System with Human Preferences[C]// ACM. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. New York: ACM, 2023: 4549-4560. |
[89] | TALEBIRAD Y, NADIRI A. Multi-Agent Collaboration: Harnessing the Power of Intelligent LLM Agents[EB/OL]. (2023-06-05)[2024-02-13]. https://arxiv.org/pdf/2306.03314.pdf. |
[90] | LEWIS P, PEREZ E, PIKTUS A, et al. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks[J]. Advances in Neural Information Processing Systems, 2020, 33: 9459-9474. |
[91] | Xinhua Net. The National Engineering Research Center for Classified Protection of Cybersecurity and Security Defense Technology Established an Artificial General Intelligence Security Working Group[EB/OL]. (2023-07-14)[2024-02-23]. http://www.xinhuanet.com/legal/2023-07/14/c_1212244260.htm. |
[92] | TONG Xin, JIN Bo, LIN Zhi, et al. CPSDBench: A Large Language Model Evaluation Benchmark and Baseline for Chinese Public Security Domain[EB/OL]. (2024-02-11)[2024-02-23]. https://arxiv.org/pdf/2402.07234v1.pdf. |
[93] | KE Pei, WEN Bosi, FENG Zhuoer, et al. Critiquellm: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation[EB/OL]. (2023-11-30)[2024-02-13]. https://arxiv.org/pdf/2311.18702.pdf. |