[1] MAUS N, CHAO P, WONG E, et al. Black Box Adversarial Prompting for Foundation Models[EB/OL]. (2023-02-08)[2025-01-20]. https://arxiv.org/abs/2302.04237.
[2] GUO Wei, TONDI B, BARNI M. An Overview of Backdoor Attacks against Deep Neural Networks and Possible Defences[J]. IEEE Open Journal of Signal Processing, 2022, 3: 261-287.
[3] NING Rui, LI Jiang, XIN Chunsheng, et al. Invisible Poison: A Blackbox Clean Label Backdoor Attack to Deep Neural Networks[C]// IEEE. IEEE INFOCOM 2021-IEEE Conference on Computer Communications. New York: IEEE, 2021: 1-10.
[4] LI Yiming, ZHAI Tongqing, JIANG Yong, et al. Backdoor Attack in the Physical World[EB/OL]. (2021-04-06)[2025-01-20]. https://arxiv.org/abs/2104.02361v2.
[5] GU Naibin, FU Peng, LIU Xiyu, et al. A Gradient Control Method for Backdoor Attacks on Parameter-Efficient Tuning[C]// ACL. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2023: 3508-3520.
[6] PETROV A, TORR P, BIBI A. When Do Prompting and Prefix-Tuning Work? A Theory of Capabilities and Limitations[EB/OL]. (2023-10-30)[2025-01-20]. https://arxiv.org/abs/2310.19698.
[7] SHENG Xuan, HAN Zhaoyang, LI Piji, et al. A Survey on Backdoor Attack and Defense in Natural Language Processing[C]// IEEE. 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security (QRS). New York: IEEE, 2022: 809-820.
[8] PAN Xudong, ZHANG Mi, SHENG Beina, et al. Hidden Trigger Backdoor Attack on NLP Models via Linguistic Style Manipulation[C]// USENIX. The 31st USENIX Security Symposium. Berkeley: USENIX, 2022: 3611-3628.
[9] SHAO Kun, ZHANG Yu, YANG Junan, et al. The Triggers that Open the NLP Model Backdoors are Hidden in the Adversarial Samples[EB/OL]. (2022-07-01)[2025-01-20]. https://doi.org/10.1016/j.cose.2022.102730.
[10] ZHANG Jinghuai, LIU Hongbin, JIA Jinyuan, et al. Data Poisoning Based Backdoor Attacks to Contrastive Learning[C]// IEEE. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE, 2024: 24357-24366.
[11] DAI Jiazhu, CHEN Chuanshuai, LI Yufeng. A Backdoor Attack against LSTM-Based Text Classification Systems[J]. IEEE Access, 2019, 7: 138872-138878. doi: 10.1109/ACCESS.2019.2941376.
[12] KURITA K, MICHEL P, NEUBIG G. Weight Poisoning Attacks on Pretrained Models[C]// ACL. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: ACL, 2020: 2793-2806.
[13] KWON H, LEE S. Textual Backdoor Attack for the Text Classification System[EB/OL]. (2021-10-22)[2025-01-20]. https://doi.org/10.1155/2021/2938386.
[14] QI Fanchao, YAO Yuan, XU S, et al. Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution[EB/OL]. (2021-06-11)[2025-01-20]. https://arxiv.org/abs/2106.06361v1.
[15] LI Shaofeng, ZHU Haojin, WU Wen, et al. Hidden Backdoor Attacks in NLP Based Network Services[C]// Springer. Backdoor Attacks against Learning-Based Algorithms. Heidelberg: Springer, 2024: 79-122.
[16] ZHANG Zheng, YUAN Xu, ZHU Lei, et al. BadCM: Invisible Backdoor Attack against Cross-Modal Learning[J]. IEEE Transactions on Image Processing, 2024, 33: 2558-2571.
[17] CHENG Pengzhou, WU Zongru, DU Wei, et al. Backdoor Attacks and Countermeasures in Natural Language Processing Models: A Comprehensive Security Review[EB/OL]. (2023-09-12)[2025-01-20]. https://arxiv.org/abs/2309.06055v5.
[18] ZENG Yueqi, LI Ziqiang, XIA Pengfei, et al. Efficient Trigger Word Insertion[C]// IEEE. 2023 9th International Conference on Big Data and Information Analytics (BigDIA). New York: IEEE, 2023: 21-28.
[19] LYU Weimin, ZHENG Songzhu, PANG Lu, et al. Attention-Enhancing Backdoor Attacks against BERT-Based Models[EB/OL]. (2023-10-23)[2025-01-20]. https://arxiv.org/abs/2310.14480v2.
[20] JIANG Wenbo, LI Hongwei, XU Guowen, et al. Color Backdoor: A Robust Poisoning Attack in Color Space[C]// IEEE. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE, 2023: 8133-8142.
[21] SHEN Lujia, JI Shouling, ZHANG Xuhong, et al. Backdoor Pre-Trained Models Can Transfer to All[C]// ACM. Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security. New York: ACM, 2021: 3141-3158.
[22] YANG Yuchen, HUI Bo, YUAN Haolin, et al. SneakyPrompt: Evaluating Robustness of Text-to-Image Generative Models' Safety Filters[EB/OL]. (2023-05-20)[2025-01-20]. https://arxiv.org/abs/2305.12082.
[23] CAI Xiangrui, XU Haidong, XU Sihan, et al. BadPrompt: Backdoor Attacks on Continuous Prompts[J]. Advances in Neural Information Processing Systems, 2022, 35: 37068-37080.
[24] MEI Kai, LI Zheng, WANG Zhenting, et al. NOTABLE: Transferable Backdoor Attacks against Prompt-Based NLP Models[EB/OL]. (2023-05-28)[2025-01-20]. https://arxiv.org/abs/2305.17826.
[25] ZHANG Changlin, TONG Xin, TONG Hui, et al. A Survey of Large Language Models in the Domain of Cybersecurity[J]. Netinfo Security, 2024, 24(5): 778-793.
[26] SOCHER R, PERELYGIN A, WU J, et al. Recursive Deep Models for Semantic Compositionality over a Sentiment Treebank[C]// ACL. Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2013: 1631-1642.
[27] ALMEIDA T A, HIDALGO J M G, YAMAKAMI A. Contributions to the Study of SMS Spam Filtering: New Collection and Results[C]// ACM. Proceedings of the 11th ACM Symposium on Document Engineering. New York: ACM, 2011: 259-262.
[28] ZHANG Xiang, ZHAO Junbo, LECUN Y. Character-Level Convolutional Networks for Text Classification[C]// NIPS. Advances in Neural Information Processing Systems 28 (NeurIPS). New York: Curran Associates, 2015: 649-657.
[29] KASHNITSKY Y. Amazon Product Reviews Dataset for Hierarchical Text Classification[EB/OL]. (2020-04-22)[2025-01-20]. https://www.kaggle.com/kashnitsky/hierarchical-text-classification.
[30] CHEN Chuanshuai, DAI Jiazhu. Mitigating Backdoor Attacks in LSTM-Based Text Classification Systems by Backdoor Keyword Identification[J]. Neurocomputing, 2021, 452: 253-262.
[31] GAN Leilei, LI Jiwei, ZHANG Tianwei, et al. Triggerless Backdoor Attack for NLP Tasks with Clean Labels[C]// ACL. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: ACL, 2022: 2942-2952.
[32] YANG Zhilin, DAI Zihang, YANG Yiming, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding[EB/OL]. (2019-06-19)[2025-01-10]. https://arxiv.org/abs/1906.08237.
[33] QI Fanchao, CHEN Yangyi, LI Mukai, et al. ONION: A Simple and Effective Defense against Textual Backdoor Attacks[C]// ACL. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2021: 9558-9566.
[34] YANG Zhilin, DAI Zihang, YANG Yiming, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding[EB/OL]. (2019-06-19)[2025-01-20]. https://arxiv.org/abs/1906.08237v2.
[35] QI Fanchao, LI Mukai, CHEN Yangyi, et al. Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger[C]// ACL. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. Stroudsburg: ACL, 2021: 443-453.