Smart Contract Vulnerability Detection Method Combining Prompt Tuning

doi:10.3969/j.issn.1671-1122.2025.04.014

Abstract

With the rapid development of blockchain trading platforms, the number of deployed smart contracts has increased significantly. In recent years, however, a steady stream of smart contract vulnerabilities has caused substantial economic losses for these platforms, drawing considerable attention from researchers to smart contract security. Existing vulnerability detection methods either rely heavily on expert rules or complex data processing steps, or employ models or learning strategies that are misaligned with the objectives of this field, resulting in poor detection performance. To address this, the paper proposes PC-Detector, a smart contract vulnerability detection method based on prompt tuning of large language models. By introducing task-specific prompt knowledge, the method keeps the target task consistent with the model's pretraining task, thereby improving model adaptation and detection performance. Specifically, the paper proposes four prompt design methods tailored to smart contract vulnerability detection and examines how the position at which the code is embedded in the prompt affects detection performance. Furthermore, the paper applies prompt tuning to the CodeT5 series of models using code-embedded prompts to detect vulnerabilities in smart contracts. Experimental results show that the method significantly improves detection performance.

Key words: smart contract, blockchain, vulnerability detection, prompt tuning
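
As a rough illustration of what keeping the target task consistent with the pretraining task can mean for a T5-style model, the sketch below casts detection as filling a masked span. It is an assumption-laden sketch, not the paper's implementation: the sentinel token `<extra_id_0>` is the first mask token in T5-style vocabularies, while the yes/no verbalizer, the prompt wording, the simplified target format, and the helper names `build_cloze_example` and `decode_label` are all hypothetical.

```python
# Sketch only: cloze-style formulation of vulnerability detection.
# The verbalizer and helper names are assumptions, not taken from the paper.

SENTINEL = "<extra_id_0>"  # first sentinel (mask) token in T5-style vocabularies

# Assumed verbalizer: maps the word the model fills in to a binary label.
VERBALIZER = {"yes": 1, "no": 0}


def build_cloze_example(code, vuln_type, label=None):
    """Return (source, target) strings for a span-filling model.

    The source ends with a sentinel token, so the model is asked to fill in
    a word, which resembles its pretraining objective. The target is None
    at inference time, when no label is known.
    """
    source = (
        f'"{code}". Does this smart contract have a '
        f"{vuln_type} vulnerability? {SENTINEL}"
    )
    target = None
    if label is not None:
        word = next(w for w, y in VERBALIZER.items() if y == label)
        target = f"{SENTINEL} {word}"
    return source, target


def decode_label(generated):
    """Map generated text back to a label; None if no known word is found."""
    text = generated.replace(SENTINEL, " ").strip().lower()
    first = text.split()[0] if text else ""
    return VERBALIZER.get(first.strip(".,!"))


source, target = build_cloze_example("contract A {}", "reentrancy", label=1)
print(source)
print(target)
print(decode_label("<extra_id_0> Yes."))
```

In this framing the classification head of ordinary fine-tuning is replaced by the model's own vocabulary, which is the general idea behind prompt tuning; how PC-Detector maps outputs to labels is not specified in this section.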

CLC number: TP309

ZHANG Yuxuan, HUANG Cheng, LIU Rong, LENG Tao. Smart Contract Vulnerability Detection Method Combining Prompt Tuning[J]. Netinfo Security (信息网络安全), 2025, 25(4): 664-673.


References

[1]	WOOD G. Ethereum: A Secure Decentralized Generalized Transaction Ledger[J]. Ethereum Project Yellow Paper, 2014, 151: 1-32.
[2]	BUTERIN V. A Next-Generation Smart Contract and Decentralized Application Platform[EB/OL]. (2022-05-19)[2025-01-02]. https://github.com/ethereum/wiki/wiki/White-Paper.
[3]	MENG Bo, LIU Jiabing, LIU Qin, et al. Survey of Smart Contract Security[J]. Chinese Journal of Network and Information Security, 2020, 6(3): 1-13. (in Chinese)
[4]	DHILLON V, METCALF D, HOOPER M. The DAO Hacked[J]. Blockchain Enabled Applications, 2021: 113-128.
[5]	KUSHWAHA S S, JOSHI S, SINGH D, et al. Ethereum Smart Contract Analysis Tools: A Systematic Review[J]. IEEE Access, 2022, 10: 57037-57062.
[6]	CONSENSYS. Mythril[EB/OL]. (2024-03-28)[2025-01-02]. https://github.com/ConsenSys/mythril.
[7]	TSANKOV P, DAN A, DRACHSLER-COHEN D, et al. Securify: Practical Security Analysis of Smart Contracts[C]// ACM. 2018 ACM SIGSAC Conference on Computer and Communications Security. New York: ACM, 2018: 67-82.
[8]	BADRUDDOJA S, DANTU R, HE Yanyan, et al. Making Smart Contracts Smarter[C]// IEEE. Proceedings of the 2021 IEEE International Conference on Blockchain and Cryptocurrency. New York: IEEE, 2021: 1-3.
[9]	LIU Daifu. Research on Vulnerability Detection Methods for Ethereum Smart Contracts[D]. Nanjing: Southeast University, 2022. (in Chinese)
[10]	COLIN L S H, MOHAN P M, PAN J, et al. An Integrated Smart Contract Vulnerability Detection Tool Using Multi-Layer Perceptron on Real-Time Solidity Smart Contracts[J]. IEEE Access, 2024, 12: 23549-23567.
[11]	MIKOLOV T, CHEN Kai, CORRADO G, et al. Efficient Estimation of Word Representations in Vector Space[EB/OL]. (2013-09-07)[2025-01-02]. https://arxiv.org/abs/1301.3781.
[12]	ZHANG Xiaosong, NIU Weina, HUANG Shiping, et al. A Survey of Smart Contract Vulnerability Detection Methods Based on Deep Learning[J]. Journal of Sichuan University (Natural Science Edition), 2023, 60(2): 7-18. (in Chinese)
[13]	JIE Wanqing, CHEN Qi, WANG Jiaqi, et al. A Novel Extended Multimodal AI Framework Towards Vulnerability Detection in Smart Contracts[EB/OL]. (2023-03-23)[2025-01-02]. https://doi.org/10.1016/j.ins.2023.03.132.
[14]	LE T T H, KIM J, LEE S, et al. Robust Vulnerability Detection in Solidity-Based Ethereum Smart Contracts Using Fine-Tuned Transformer Encoder Models[J]. IEEE Access, 2024, 12: 154700-154717.
[15]	HE Fei, LI Fei, LIANG Peili. Enhancing Smart Contract Security: Leveraging Pre-Trained Language Models for Advanced Vulnerability Detection[J]. IET Blockchain, 2024, 4(1): 543-554.
[16]	DUY P T, KHOA N H, QUYEN N H, et al. Vulnsense: Efficient Vulnerability Detection in Ethereum Smart Contracts by Multimodal Learning with Graph Neural Network and Language Model[EB/OL]. (2023-09-05)[2025-01-02]. https://arxiv.org/abs/2309.08474v1.
[17]	JAIN V K, TRIPATHI M. An Integrated Deep Learning Model for Ethereum Smart Contract Vulnerability Detection[J]. International Journal of Information Security, 2024, 23(1): 557-575.
[18]	ABDELAZIZ T, HOBOR A. Smart Learning to Find Dumb Contracts[C]// USENIX. Proceedings of the 32nd USENIX Security Symposium (USENIX Security 23). Berkeley: USENIX, 2023: 1775-1792.
[19]	NGUYEN H H, NGUYEN N M, XIE Chunyao, et al. MANDO-HGT: Heterogeneous Graph Transformers for Smart Contract Vulnerability Detection[C]// IEEE. Proceedings of the 2023 IEEE/ACM 20th International Conference on Mining Software Repositories. New York: IEEE, 2023: 334-346.
[20]	CAI Jie, LI Bin, ZHANG Tao, et al. Fine-Grained Smart Contract Vulnerability Detection by Heterogeneous Code Feature Learning and Automated Dataset Construction[EB/OL]. (2023-12-14)[2025-01-02]. https://doi.org/10.1016/j.jss.2023.111919.
[21]	LUO Feng, LUO Ruijie, CHEN Ting, et al. SCVHunter: Smart Contract Vulnerability Detection Based on Heterogeneous Graph Attention Network[C]// IEEE. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering. New York: IEEE, 2024: 2098-2110.
[22]	QIAN Peng, LIU Zhenguang, YIN Yifang, et al. Cross-Modality Mutual Learning for Enhancing Smart Contract Vulnerability Detection on Bytecode[C]// ACM. Proceedings of the ACM Web Conference. New York: ACM, 2023: 2220-2229.
[23]	ZHANG Ningyu, LI Luoqiu, CHEN Xiang, et al. Differentiable Prompt Makes Pre-Trained Language Models Better Few-Shot Learners[EB/OL]. (2022-05-04)[2025-01-02]. https://arxiv.org/abs/2108.13161v7.
[24]	LESTER B, ALRFOU R, CONSTANT N. The Power of Scale for Parameter-Efficient Prompt Tuning[C]// ACL. 2021 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2021: 3045-3059.
[25]	LI L, LIANG P. Prefix-Tuning: Optimizing Continuous Prompts for Generation[C]// ACL. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. Stroudsburg: ACL, 2021: 4582-4597.
[26]	WANG Chaozheng, YANG Yuanhang, GAO Cuiyun, et al. Prompt Tuning in Code Intelligence: An Experimental Evaluation[J]. IEEE Transactions on Software Engineering, 2023, 49(11): 4869-4885.
[27]	CHEN Yizheng, DING Zhoujie, ALOWAIN L, et al. DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection[C]// ACM. 26th International Symposium on Research in Attacks, Intrusions and Defenses. New York: ACM, 2023: 654-668.
[28]	THAPA C, JANG S I, AHMED M E, et al. Transformer-Based Language Models for Software Vulnerability Detection[C]// ACM. Proceedings of the 38th Annual Computer Security Applications Conference. New York: ACM, 2022: 481-496.
[29]	BROWN T B, MANN B, RYDER N, et al. Language Models are Few-shot Learners[C]// NIPS. 34th International Conference on Neural Information Processing Systems. New York: CAI, 2020: 1877-1901.
[30]	PyTorch. PyTorch[EB/OL]. (2024-11-18)[2025-01-02]. https://pytorch.org.
[31]	HUGGING FACE. Hugging Face[EB/OL]. (2024-11-18)[2025-01-02]. https://huggingface.co.
[32]	LIU Zhenguang, QIAN Peng, YANG Jiaxu, et al. Rethinking Smart Contract Fuzzing: Fuzzing With Invocation Ordering and Important Branch Revisiting[J]. IEEE Transactions on Information Forensics and Security, 2023, 18: 1237-1251.
[33]	SMARTBUGS. SB Curated: A Curated Dataset of Vulnerable Solidity Smart Contracts[EB/OL]. (2024-06-17)[2025-01-02]. https://github.com/smartbugs/smartbugs-curated.
[34]	GHALEB A, PATTABIRAMAN K. How Effective are Smart Contract Analysis Tools? Evaluating Smart Contract Static Analysis Tools Using Bug Injection[C]// ACM. Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. New York: ACM, 2020: 415-427.
[35]	ABDELAZIZ T, HOBOR A. Reentrancy Benchmark[EB/OL]. (2023-02-01)[2025-01-02]. https://bit.ly/Reentrancy_benchmark.
[36]	ABDELAZIZ T, HOBOR A. EthereumSClarge[EB/OL]. (2023-02-01)[2025-01-02]. https://bit.ly/EthereumSC_Dataset_Large.
[37]	ABDELAZIZ T, HOBOR A. EthereumSCsmall[EB/OL]. (2023-02-01)[2025-01-02]. https://bit.ly/EthereumSC_Dataset_Small.
[38]	ABDELAZIZ T, HOBOR A. SolidiFI Benchmark[EB/OL]. (2023-02-01)[2025-01-02]. https://bit.ly/SolidiFI_benchmark.
[39]	GEMINI TEAM, ANIL R, BORGEAUD S, et al. Gemini: A Family of Highly Capable Multimodal Models[EB/OL]. (2024-06-17)[2025-01-02]. https://arxiv.org/abs/2312.11805v4.
[40]	NGUYEN T D, PHAM L H, SUN Jun, et al. sFuzz: An Efficient Adaptive Fuzzer for Solidity Smart Contracts[C]// ACM. Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering. New York: ACM, 2020: 778-788.
[41]	ZHUANG Yuan, LIU Zhenguang, QIAN Peng, et al. Smart Contract Vulnerability Detection Using Graph Neural Networks[C]// IJCAI. Proceedings of the Twenty-Ninth International Conference on International Joint Conferences on Artificial Intelligence. San Francisco: Morgan Kaufmann, 2021: 3283-3290.
[42]	LIU Zhenguang, QIAN Peng, WANG Xiang, et al. Smart Contract Vulnerability Detection: From Pure Neural Network to Interpretable Graph Feature and Expert Pattern Fusion[EB/OL]. (2021-06-17)[2025-01-02]. https://arxiv.org/abs/2106.09282.
[43]	GAO Zhipeng, JIANG Lingxiao, XIA Xin, et al. Checking Smart Contracts with Structural Code Embedding[J]. IEEE Transactions on Software Engineering, 2021, 47(12): 2874-2891.
[44]	QIAN Peng, LIU Zhenguang, HE Qingming, et al. Towards Automated Reentrancy Detection for Smart Contracts Based on Sequential Models[J]. IEEE Access, 2020, 8: 19685-19695.

Type	Example
Direct question (1-a)	“[X]”. Does this smart contract have bugs of the ‘{vulnerability_type}’? Fill the word for ‘{vulnerability_type}’.
Direct question (1-b)	Detect if this smart contract “[X]” vulnerable to attacks of the ‘{vulnerability_type}’? Fill the word for ‘{vulnerability_type}’.
Direct question (1-c)	Does the following smart contract have a ‘{vulnerability_type}’ vulnerability? Fill the word for ‘{vulnerability_type}’.“[X]”.
Vulnerability example reference (2-a)	This is an example of Solidity code with [VT] vulnerability:“[VE]”. Detect if this code “[X]” has [VT] vulnerability.
Vulnerability cause description (3-a)	This is the cause of the [VT] vulnerability:“[VC]”. Based on the knowledge above detect if this code “[X]” has [VT] vulnerability.
Combined cause and example (4-a)	This is an example of Solidity code with [VT] vulnerability:“[VE]”. This is the cause of the [VT] vulnerability:“[VC]”. Based on the knowledge above detect if this code “[X]” has [VT] vulnerability.
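
The knowledge-augmented templates (2-a), (3-a) and (4-a) above can be instantiated by substituting their placeholders. The sketch below is a minimal illustration, assuming that [X] stands for the contract under test, [VT] for the vulnerability type, [VE] for a vulnerable example, and [VC] for a textual description of the cause; the names `TEMPLATES` and `fill_prompt` are hypothetical and not from the paper.

```python
import re

# Template strings copied from prompt types (2-a), (3-a) and (4-a).
# The dictionary and function names are hypothetical.
TEMPLATES = {
    "2-a": "This is an example of Solidity code with [VT] vulnerability:“[VE]”. "
           "Detect if this code “[X]” has [VT] vulnerability.",
    "3-a": "This is the cause of the [VT] vulnerability:“[VC]”. "
           "Based on the knowledge above detect if this code “[X]” has [VT] vulnerability.",
    "4-a": "This is an example of Solidity code with [VT] vulnerability:“[VE]”. "
           "This is the cause of the [VT] vulnerability:“[VC]”. "
           "Based on the knowledge above detect if this code “[X]” has [VT] vulnerability.",
}

PLACEHOLDER = re.compile(r"\[(VT|VE|VC|X)\]")


def fill_prompt(prompt_id, code, vuln_type, example="", cause=""):
    """Fill one template in a single pass.

    A single regex pass avoids re-substituting placeholder-like text that
    happens to appear inside the inserted contract code itself.
    """
    values = {"X": code, "VT": vuln_type, "VE": example, "VC": cause}
    return PLACEHOLDER.sub(lambda m: values[m.group(1)], TEMPLATES[prompt_id])


prompt = fill_prompt(
    "4-a",
    code="contract Wallet { /* contract under test */ }",
    vuln_type="reentrancy",
    example="contract Bank { /* known vulnerable sample */ }",
    cause="an external call is made before the balance is updated",
)
print(prompt)
```

The position of [X] inside the template is part of the design space: types (1-a) to (1-c) differ mainly in whether the code comes first, in the middle, or last.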

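Dataset	Description
Dataset Ⅰ [22]	More than 1,000 contracts covering 4 vulnerability types
Dataset Ⅱ [32]	More than 12,000 contracts covering 8 vulnerability types
Dataset Ⅲ [33]	More than 100 contracts covering 9 vulnerability types
Dataset Ⅳ [34]	9,369 samples covering 7 vulnerability types
Dataset Ⅴ [35]	473 contract addresses in total, covering reentrancy-vulnerable and normal contracts
Dataset Ⅵ [36]	Contract addresses of 29 types, including 22,634 normal contracts
Dataset Ⅶ [37]	Contract addresses of 21 types, including 1,381 normal contracts
Dataset Ⅷ [38]	444 contract addresses covering 4 vulnerability types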

Model	Training method	Accuracy
CodeT5-small	Fine-tuning	87.82%
CodeT5-small	Prompt tuning	90.45%
CodeT5-base	Fine-tuning	88.34%
CodeT5-base	Prompt tuning	91.74%
CodeT5-large	Fine-tuning	88.96%
CodeT5-large	Prompt tuning	91.89%
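
The gain of prompt tuning over conventional fine-tuning can be read directly from the accuracy figures above. The short sketch below computes it per model size; the data structure and the function name `prompt_tuning_gain` are illustrative only, and the numbers are copied from the table.

```python
# Accuracy values (in percent) copied from the table above.
ACCURACY = {
    "CodeT5-small": {"fine-tuning": 87.82, "prompt tuning": 90.45},
    "CodeT5-base": {"fine-tuning": 88.34, "prompt tuning": 91.74},
    "CodeT5-large": {"fine-tuning": 88.96, "prompt tuning": 91.89},
}


def prompt_tuning_gain(model):
    """Accuracy gain of prompt tuning over fine-tuning, in percentage points."""
    scores = ACCURACY[model]
    return round(scores["prompt tuning"] - scores["fine-tuning"], 2)


for model in ACCURACY:
    print(f"{model}: +{prompt_tuning_gain(model):.2f} percentage points")
```

For all three model sizes the gain lies between 2.63 and 3.40 percentage points, with the largest gain on CodeT5-base.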

Prompt	Reentrancy vulnerability				Integer overflow vulnerability
Prompt	Accuracy	Recall	Precision	F1 score	Accuracy	Recall	Precision	F1 score
(1-a)	91.53%	93.34%	90.03%	91.67%	92.33%	91.07%	92.75%	91.90%
(1-b)	91.63%	92.42%	90.50%	91.44%	92.42%	91.62%	91.68%	91.65%
(1-c)	90.40%	86.62%	93.64%	89.96%	91.27%	92.06%	89.97%	91.00%
(2-a)	93.33%	92.79%	92.87%	92.83%	92.55%	94.39%	93.22%	93.80%
(3-a)	92.83%	90.11%	91.75%	90.91%	92.70%	93.60%	92.98%	93.29%
(4-a)	94.29%	96.09%	93.20%	94.62%	93.67%	95.37%	93.10%	94.22%
Prompt	Timestamp manipulation vulnerability				Delegatecall vulnerability
Prompt	Accuracy	Recall	Precision	F1 score	Accuracy	Recall	Precision	F1 score
(1-a)	90.88%	87.39%	90.11%	88.73%	89.71%	82.06%	90.21%	85.94%
(1-b)	91.12%	88.09%	91.69%	88.58%	89.73%	83.22%	89.98%	86.47%
(1-c)	90.62%	87.15%	88.34%	87.74%	88.92%	82.00%	86.62%	84.25%
(2-a)	93.87%	91.42%	94.21%	92.79%	90.44%	85.44%	91.33%	88.29%
(3-a)	94.65%	92.71%	92.98%	92.84%	90.02%	85.49%	90.42%	87.89%
(4-a)	94.55%	92.92%	94.43%	93.67%	91.98%	86.62%	91.88%	89.17%

Method	Accuracy	Recall	Precision	F1 score
Slither	68.50%	63.31%	70.26%	66.60%
Mythril	65.24%	60.26%	55.55%	57.81%
Securify	70.78%	71.06%	67.31%	69.13%
Oyente	64.96%	56.51%	51.61%	53.95%
sFuzz	48.94%	28.18%	29.33%	28.74%
PC-Detector	91.74%	92.64%	90.95%	91.79%
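
The F1 score in these tables is the harmonic mean of precision and recall. The sketch below recomputes it for two rows of the comparison table as a consistency check; the function name `f1_score` is chosen here for illustration, and the inputs are the percentages reported above.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall (in percent) copied from the table above.
pc_detector = f1_score(precision=90.95, recall=92.64)
slither = f1_score(precision=70.26, recall=63.31)

print(f"PC-Detector F1: {pc_detector:.2f}%")  # 91.79%
print(f"Slither F1: {slither:.2f}%")          # 66.60%
```

Both recomputed values match the reported F1 scores to two decimal places.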