Netinfo Security, 2024, Vol. 24, Issue 10: 1528-1536. doi: 10.3969/j.issn.1671-1122.2024.10.006
Received: 2024-06-05
Online: 2024-10-10
Published: 2024-09-27
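Corresponding author: ZHOU Changling
About the authors: ZHANG Zihan (born 2000), male, from Shanghai, master's student; research interest: software testing. LAI Qingnan (born 1990), male, from Jiangxi, engineer, master's degree; research interests: network attack and defense, artificial intelligence, and network and information security. ZHOU Changling (born 1977), male, from Chongqing, senior engineer, Ph.D.; research interests: network security and network attack and defense, multi-agent systems, and network big-data analysis.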
Authors: ZHANG Zihan, LAI Qingnan, ZHOU Changling
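Abstract:
As deep learning is applied widely across many fields, the security and stability of deep learning frameworks have become especially important. Taking the user's perspective, this article analyzes the types of vulnerabilities that different user groups may encounter and the corresponding fuzzing methods. It first introduces the development background and importance of deep learning frameworks. It then discusses in detail the state of research on fuzzing model libraries, deep learning frameworks, and compilers, reviews key techniques such as model mutation, weight generation, test-case construction, and model testing, and analyzes the causes of vulnerabilities using PyTorch and MLIR issues as examples. Finally, it outlines future research directions, including error localization and automatic repair, and fuzzing enhanced by large language models.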
ZHANG Zihan, LAI Qingnan, ZHOU Changling. Survey on Fuzzing Test in Deep Learning Frameworks[J]. Netinfo Security, 2024, 24(10): 1528-1536.
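Table 1: Model generators for model-level fuzzing
Model generator | Model generation strategy | Mutation guidance rule | Model weight generation | Test-case construction |
---|---|---|---|---|
LEMON [17] | Six mutation strategies, such as adding, swapping, and copying layers | Markov chain Monte Carlo (MCMC) | Five strategies, such as adding Gaussian noise and changing the state of activation functions | Six datasets, including MNIST, CIFAR-10, and ImageNet |
Audee [18] | Uses seven models such as LeNet-5 and ResNet20 as seeds and randomly changes layer parameters | Genetic algorithm | Adds Cauchy noise to pretrained models | LeNet-5, MNIST, CIFAR-10 |
Muffin [19] | Generates a basic structure from templates and instantiates the computation graph with operators such as convolution and pooling | Fitness-proportionate selection | Not mentioned | Six datasets, including MNIST, F-MNIST, and CIFAR-10 |
NNSmith [20] | Generates graphs incrementally under hand-written operator attribute constraints and instantiates layer attributes with an SMT solver | N/A | Uses backpropagation to generate model weights that do not produce abnormal values | Uses backpropagation to generate computation inputs that do not produce abnormal values |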
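
The layer-level mutations and weight perturbations summarized in Table 1 can be sketched on a toy model. The snippet below is a minimal illustration, not the implementation of LEMON or any other tool: the list-of-dicts model representation, the function names, and the noise scale `sigma` are all assumptions, and mutators are chosen uniformly at random rather than by MCMC or a genetic algorithm.

```python
import copy
import random

# Toy model: an ordered list of layers, each with a type and a flat weight list.
# This representation is an assumption for illustration only.
def make_seed_model():
    return [
        {"type": "conv", "weights": [0.1, -0.2, 0.3]},
        {"type": "relu", "weights": []},
        {"type": "dense", "weights": [0.5, 0.5]},
    ]

def add_layer(model, rng):
    """Insert a weightless activation layer at a random position."""
    mutant = copy.deepcopy(model)
    mutant.insert(rng.randrange(len(mutant) + 1), {"type": "relu", "weights": []})
    return mutant

def swap_layers(model, rng):
    """Exchange two distinct layers (needs at least two layers)."""
    mutant = copy.deepcopy(model)
    i, j = rng.sample(range(len(mutant)), 2)
    mutant[i], mutant[j] = mutant[j], mutant[i]
    return mutant

def copy_layer(model, rng):
    """Duplicate a random layer directly after itself."""
    mutant = copy.deepcopy(model)
    i = rng.randrange(len(mutant))
    mutant.insert(i + 1, copy.deepcopy(mutant[i]))
    return mutant

def gaussian_noise(model, rng, sigma=0.01):
    """Perturb every weight with zero-mean Gaussian noise."""
    mutant = copy.deepcopy(model)
    for layer in mutant:
        layer["weights"] = [w + rng.gauss(0.0, sigma) for w in layer["weights"]]
    return mutant

MUTATORS = [add_layer, swap_layers, copy_layer, gaussian_noise]

def mutate(model, rng, steps=3):
    """Apply `steps` uniformly chosen mutators in sequence (not MCMC-guided)."""
    for _ in range(steps):
        model = rng.choice(MUTATORS)(model, rng)
    return model

rng = random.Random(0)
seed = make_seed_model()
mutant = mutate(seed, rng)
print([layer["type"] for layer in mutant])
```

In a real fuzzer, each mutant would then be executed on several frameworks or backends and the outputs compared.
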
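Table 2: Analysis of PyTorch vulnerability types
Vulnerability type | Error description | Issue number |
---|---|---|
Automatic differentiation error | Forward-mode automatic differentiation of torch.pow raises an error when the base and the exponent do not match | 77493 |
Automatic differentiation error | logaddexp2 does not support backpropagation | 77963 |
Inconsistent backend results | When an incorrect number of rows or columns is passed to torch.nn.functional.embedding, the CPU backend raises an error but the CUDA backend does not | 66751 |
Inconsistent backend results | The CUDA implementation of nn.Conv2d and the cuDNN-based implementation produce inconsistent results | 55381 |
Unsupported type | torch.allclose does not support comparisons between different types (for example, float32 and float16) | 55356 |
Unsupported type | torch.trace does not support float16 on the CPU | 65447 |
Runtime error | int8 matrix multiplication raises an error on the GPU | 49890 |
Runtime error | torch.sigmoid raises an error when its input is of a complex type | 55359 |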
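
Several entries in Table 2 are backend inconsistencies, which are typically exposed by differential testing: the same input is run on two implementations and any disagreement in status or value is reported. The following is a minimal sketch under stated assumptions; `embedding_strict` and `embedding_lenient` are hypothetical stand-ins for two backends, not PyTorch code, and only mimic the pattern in which one backend rejects an invalid index while the other accepts it.

```python
import math

def run(backend, args):
    """Run one backend and capture either its result or the exception type."""
    try:
        return ("ok", backend(*args))
    except Exception as exc:
        return ("error", type(exc).__name__)

def embedding_strict(table, index):
    # Hypothetical backend that validates the index before the lookup.
    if not 0 <= index < len(table):
        raise IndexError("index out of range")
    return table[index]

def embedding_lenient(table, index):
    # Hypothetical backend that skips validation and clamps silently.
    return table[max(0, min(index, len(table) - 1))]

def differential_test(backends, inputs, rel_tol=1e-6):
    """Report inputs on which the backends disagree in status or value."""
    findings = []
    for args in inputs:
        outcomes = {name: run(fn, args) for name, fn in backends.items()}
        statuses = {status for status, _ in outcomes.values()}
        if len(statuses) > 1:
            findings.append((args, "status mismatch", outcomes))
        elif statuses == {"ok"}:
            values = [value for _, value in outcomes.values()]
            if not all(math.isclose(values[0], v, rel_tol=rel_tol) for v in values[1:]):
                findings.append((args, "value mismatch", outcomes))
    return findings

table = [0.5, 1.5, 2.5]
backends = {"strict": embedding_strict, "lenient": embedding_lenient}
findings = differential_test(backends, [(table, 1), (table, 7), (table, -1)])
for args, kind, outcomes in findings:
    print(kind, "index =", args[1], outcomes)
```

A status mismatch does not say which backend is wrong; it only flags the input for manual triage.
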
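Table 3: Analysis of MLIR vulnerability types
Error type | Error description | Issue number |
---|---|---|
Conversion between dialects | In the --convert-scf-to-openmp pass, a division-by-zero error occurs when the second operand of an index.rems operation is 0 | 59714 |
Conversion between dialects | In the --convert-scf-to-spirv pass, the verifier does not validate the f80 floating-point type, which causes a crash | 60199 |
Conversion between dialects | In the --gpu-to-llvm pass, the maskOp operand in vector.mask is not checked for a null pointer, which causes a segmentation fault | 61094 |
General transformation within a dialect | In the --inline pass, inlining a function that contains only an llvm.return statement causes a crash | 60093 |
General transformation within a dialect | In the --cse pass, the case of a rank-0 vector is not handled, which causes a crash | 60193 |
General transformation within a dialect | In the --canonicalize pass, tensor dimensions are not checked for validity, which causes a crash | 59703 |
Dialect-specific internal transformation | In the func dialect's --convert-func-to-llvm pass, an assertion expects the dim operation to be a constant although it can be a variable, which causes an assertion failure | 59993 |
Dialect-specific internal transformation | In the llvm dialect's --llvm-legalize-for-export pass, the llvm.br operation is not registered, so generating llvm.br causes a crash | 59462 |
Dialect-specific internal transformation | In the affine dialect's --affine-loop-unroll pass, the code wrongly assumes that values yielded in the loop body always live in the block that corresponds to the loop, so it generates a reference that does not exist and crashes | 59234 |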
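
Many of the crashes in Table 3 come from a missing check on a boundary case, such as a zero divisor. The sketch below shows how boundary-biased random inputs can expose that pattern. It is a toy model only: `fold_rems_unguarded` and `fold_rems_guarded` are hypothetical constant folders written for this example, not MLIR code, and the C-style truncated remainder they compute is an assumption about the operation's semantics.

```python
import random

def trunc_rem(lhs, rhs):
    """C-style signed remainder: the result takes the sign of the dividend."""
    r = abs(lhs) % abs(rhs)  # raises ZeroDivisionError when rhs == 0
    return -r if lhs < 0 else r

def fold_rems_unguarded(lhs, rhs):
    # Hypothetical folder that mirrors the bug pattern: no zero check.
    return trunc_rem(lhs, rhs)

def fold_rems_guarded(lhs, rhs):
    # Decline to fold (return None) when the divisor is zero.
    if rhs == 0:
        return None
    return trunc_rem(lhs, rhs)

def fuzz(folder, rng, trials=200):
    """Feed operand pairs biased toward boundary values and collect crashes."""
    boundary = [0, 1, -1, 2**31 - 1, -(2**31)]
    crashes = []
    for _ in range(trials):
        lhs = rng.choice(boundary + [rng.randint(-100, 100)])
        rhs = rng.choice(boundary + [rng.randint(-100, 100)])
        try:
            folder(lhs, rhs)
        except ZeroDivisionError:
            crashes.append((lhs, rhs))
    return crashes

unguarded_crashes = fuzz(fold_rems_unguarded, random.Random(0))
guarded_crashes = fuzz(fold_rems_guarded, random.Random(0))
print("unguarded crashes:", len(unguarded_crashes))
print("guarded crashes:", len(guarded_crashes))
```

Biasing the generator toward values such as 0, 1, and the integer limits is a common design choice because uniformly random operands rarely hit these cases.
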
[1] | HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep Residual Learning for Image Recognition[EB/OL]. (2016-12-12)[2024-05-10]. https://ieeexplore.ieee.org/document/7780459/metrics#metrics. |
[2] | GRIGORESCU S, TRASNEA B, COCIAS T, et al. A Survey of Deep Learning Techniques for Autonomous Driving[J]. Journal of Field Robotics, 2020, 37(3): 362-386. doi: 10.1002/rob.21918 |
[3] | TORFI A, SHIRVANI R A, KENESHLOO Y, et al. Natural Language Processing Advancements by Deep Learning: A Survey[EB/OL]. (2020-03-02)[2024-06-01]. http://arxiv.org/abs/2003.01200. |
[4] | KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet Classification with Deep Convolutional Neural Networks[J]. Communications of the ACM, 2017, 60(6): 84-90. |
[5] | HICKMANN B, CHEN Jiesheng, ROTZIN M, et al. Intel Nervana Neural Network Processor-T(NNP-T) Fused Floating Point Many-Term Dot Product[C]// IEEE. 2020 IEEE 27th Symposium on Computer Arithmetic(ARITH). New York: IEEE, 2020: 133-136. |
[6] | NVIDIA. NVIDIA Tensor Cores: Versatility for HPC & AI[EB/OL]. [2024-05-10]. https://www.nvidia.com/en-us/data-center/tensor-cores/. |
[7] | CHEN Tianqi, MOREAU T, JIANG Ziheng, et al. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning[EB/OL]. (2021-02-27)[2024-05-30]. http://arxiv.org/abs/1802.04799. |
[8] | PHAM H V, LUTELLIER T, QI Weizhen, et al. CRADLE: Cross-Backend Validation to Detect and Localize Bugs in Deep Learning Libraries[C]// IEEE. 2019 IEEE/ACM 41st International Conference on Software Engineering(ICSE). New York: IEEE, 2019: 1027-1038. |
[9] | KLIPPENSTEIN K. Exclusive: Surveillance Footage of Tesla Crash on SF’s Bay Bridge Hours After Elon Musk Announces “Self-Driving” Feature[EB/OL]. (2023-01-10)[2024-06-01]. https://theintercept.com/2023/01/10/tesla-crash-footage-autopilot/. |
[10] | ZHANG Xiaoyu, JIANG Weipeng, SHEN Chao, et al. Survey: A Survey of Deep Learning Library Testing Methods[EB/OL]. (2024-04-27)[2024-06-02]. http://arxiv.org/abs/2404.17871. |
[11] | JI Jiahe, KONG Wei, TIAN Jianwen, et al. Survey on Fuzzing Techniques in Deep Learning Libraries[C]// IEEE. 2023 8th International Conference on Data Science in Cyberspace(DSC). New York: IEEE, 2023: 461-467. |
[12] | PAN R, BISWAS S, CHAKRABORTY M, et al. An Empirical Study on the Bugs Found while Reusing Pre-Trained Natural Language Processing Models[EB/OL]. (2022-11-30)[2024-06-01]. http://arxiv.org/abs/2212.00105. |
[13] | CHEN Junjie, LIANG Yihua, SHEN Qingchao, et al. Toward Understanding Deep Learning Framework Bugs[J]. ACM Transactions on Software Engineering and Methodology, 2023, 32(6): 1-31. |
[14] | DENG Yao, ZHENG Xi, ZHANG Tianyi, et al. A Declarative Metamorphic Testing Framework for Autonomous Driving[J]. IEEE Transactions on Software Engineering, 2023, 49(4): 1964-1982. |
[15] | CAO Junming, CHEN Bihuan, SUN Chao, et al. Understanding Performance Problems in Deep Learning Systems[C]// ACM. 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. New York: ACM, 2022: 357-369. |
[16] | WEI Moshi, HARZEVILI N S, HUANG Yuekai, et al. Demystifying and Detecting Misuses of Deep Learning APIs[C]// ACM. IEEE/ACM 46th International Conference on Software Engineering. New York: ACM, 2024: 1-12. |
[17] | WANG Zan, YAN Ming, CHEN Junjie, et al. LEMON: Deep Learning Library Testing via Effective Model Generation[C]// ACM. 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. New York: ACM, 2020: 788-799. |
[18] | GUO Qianyu, XIE Xiaofei, LI Yi, et al. Audee: Automated Testing for Deep Learning Frameworks[C]// ACM. 35th IEEE/ACM International Conference on Automated Software Engineering. New York: ACM, 2020: 486-498. |
[19] | GU Jiazhen, LUO Xuchuan, ZHOU Yangfan, et al. Muffin: Testing Deep Learning Libraries via Neural Architecture Fuzzing[C]// ACM. The 44th International Conference on Software Engineering. New York: ACM, 2022: 1418-1430. |
[20] | LIU Jiawei, LIN Jinkun, RUFFY F, et al. NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers[C]// ACM. The 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems. New York: ACM, 2023: 530-543. |
[21] | SHI Jingyi, XIAO Yang, LI Yuekang, et al. ACETest: Automated Constraint Extraction for Testing Deep Learning Operators[C]// ACM. The 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis. New York: ACM, 2023: 690-702. |
[22] | XIE Danning, LI Yitong, KIM Mijung, et al. Documentation-Guided Fuzzing for Testing Deep Learning API Functions[C]// ACM. The 31st ACM SIGSOFT International Symposium on Software Testing and Analysis. New York: ACM, 2022: 176-188. |
[23] | WEI Anjiang, DENG Yinlin, YANG Chenyuan, et al. FreeFuzz: Free Lunch for Testing: Fuzzing Deep-Learning Libraries from Open Source[C]// ACM. The 44th International Conference on Software Engineering. New York: ACM, 2022: 995-1007. |
[24] | DENG Yinlin, YANG Chenyuan, WEI Anjiang, et al. Fuzzing Deep-Learning Libraries via Automated Relational API Inference[C]// ACM. The 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. New York: ACM, 2022: 44-56. |
[25] | YANG Chenyuan, DENG Yinlin, YAO Jiayi, et al. Fuzzing Automatic Differentiation in Deep-Learning Libraries[C]// IEEE. 2023 IEEE/ACM 45th International Conference on Software Engineering(ICSE). New York: IEEE, 2023: 1174-1186. |
[26] | CHRISTOU N, JIN Di, KEMERLIS V. IvySyn: Automated Vulnerability Discovery in Deep Learning Frameworks[EB/OL]. (2022-09-29)[2024-05-10]. https://www.semanticscholar.org/paper/IvySyn%3A-Automated-Vulnerability-Discovery-in-Deep-Christou-Jin/58b1b17a04279361fb5d138f0cd8f8ab94029d69. |
[27] | Github. Remove Some Interface Block Decoration by Llehtahw Pull Request #8102 Apache/TVM[EB/OL]. [2024-05-28]. https://github.com/apache/tvm/pull/8102. |
[28] | Github. dpankratz/TVMFuzz[EB/OL]. (2024-02-18)[2024-05-12]. https://github.com/dpankratz/TVMFuzz. |
[29] | WANG Zihan, NIE Pengbo, MIAO Xinyuan, et al. GenCoG: A DSL-Based Approach to Generating Computation Graphs for TVM Testing[C]// ACM. The 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis. New York: ACM, 2023: 904-916. |
[30] | WANG Haoyu, CHEN Junjie, XIE Chuyue, et al. MLIRSmith: Random Program Generation for Fuzzing MLIR Compiler Infrastructure[C]// IEEE. 38th IEEE/ACM International Conference on Automated Software Engineering(ASE). New York: IEEE, 2023: 1555-1566. |
[31] | SU Qidong, GENG Chuqin, PEKHIMENKO G, et al. TorchProbe: Fuzzing Dynamic Deep Learning Compilers[EB/OL]. (2023-10-30)[2024-06-02]. http://arxiv.org/abs/2310.20078. |
[32] | LIMPANUKORN B, WANG Jiyuan, KANG Hongjin, et al. Fuzzing MLIR by Synthesizing Custom Mutations[EB/OL]. (2024-04-25)[2024-05-12]. http://arxiv.org/abs/2404.16947. |
[33] | LIU Jiawei, WEI Yuxiang, YANG Sen, et al. Coverage-Guided Tensor Compiler Fuzzing with Joint IR-Pass Mutation[J]. The ACM on Programming Languages, 2022, 6: 1-26. |
[34] | MA Haoyang, SHEN Qingchao, TIAN Yongqiang, et al. Fuzzing Deep Learning Compilers with HirGen[EB/OL]. (2022-08-03)[2024-05-10]. http://arxiv.org/abs/2208.02193. |
[35] | LIN Kuiliang, SONG Xiangpu, ZENG Yingpei, et al. DeepDiffer: Find Deep Learning Compiler Bugs via Priority-Guided Differential Fuzzing[C]// IEEE. 2023 IEEE 23rd International Conference on Software Quality, Reliability, and Security(QRS). New York: IEEE, 2023: 616-627. |
[36] | AGRAWAL H, DEMILLO R A, SPAFFORD E H. Debugging with Dynamic Slicing and Backtracking[J]. Software: Practice and Experience, 1993, 23(6): 589-616. |
[37] | ZELLER A, HILDEBRANDT R. Simplifying and Isolating Failure-Inducing Input[J]. IEEE Transactions on Software Engineering, 2002, 28(2): 183-200. |
[38] | HU Mingzhe, ZHAO Qi, ZHANG Yu, et al. FROG: Cross-Language Call Graph Construction Supporting Different Host Languages[C]// IEEE. 2023 IEEE International Conference on Software Analysis, Evolution and Reengineering(SANER). New York: IEEE, 2023: 155-166. |
[39] | LI Wen, MING Jiang, LUO Xiapu, et al. POLYCRUISE: A Cross-Language Dynamic Information Flow Analysis[C]// USENIX. 31st USENIX Security Symposium(USENIX Security 22). Berkeley: USENIX, 2022: 2513-2530. |
[40] | KIM M, KIM Y, LEE E. Denchmark: A Bug Benchmark of Deep Learning-Related Software[C]// IEEE. 2021 IEEE/ACM 18th International Conference on Mining Software Repositories(MSR). New York: IEEE, 2021: 540-544. |
[41] | DENG Yinlin, XIA C S, PENG Haoran, et al. Large Language Models Are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models[C]// ACM. The 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis. New York: ACM, 2023: 423-435. |
[42] | DENG Yinlin, XIA Chunqiu, YANG Chenyuan, et al. Large Language Models are Edge-Case Generators: Crafting Unusual Programs for Fuzzing Deep Learning Libraries[C]// ACM. The IEEE/ACM 46th International Conference on Software Engineering. New York: ACM, 2024: 1-13. |
[43] | CHEN M, TWOREK J, JUN H, et al. Evaluating Large Language Models Trained on Code[EB/OL]. (2021-07-07)[2024-05-29]. https://arxiv.org/abs/2107.03374. |
[44] | NIJKAMP E, PANG B, HAYASHI H, et al. CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis[EB/OL]. (2022-03-05)[2024-05-29]. http://arxiv.org/abs/2203.13474. |