Netinfo Security, 2026, Vol. 26, Issue (4): 566-578. doi: 10.3969/j.issn.1671-1122.2026.04.005
YI Wenzhe1,2, XU Xiaoyang1,2, SHI Lei3, ZHUANG Yong1,2, WANG Juan1,2
Received: 2025-06-16
Online: 2026-04-10
Published: 2026-04-29
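Corresponding author: WANG Juan, E-mail: jwang@whu.edu.cn
About the authors:
YI Wenzhe (b. 2001), male, from Hubei, Ph.D. candidate. Research interests: artificial intelligence privacy and security, trustworthy artificial intelligence.
XU Xiaoyang (b. 1999), male, from Hebei, Ph.D. candidate. Research interests: artificial intelligence security, distributed learning security.
SHI Lei (b. 1980), male, from Shandong, senior engineer, M.S., CCF member. Research interests: network security, data security.
ZHUANG Yong (b. 1999), female, from Hubei, Ph.D. candidate. Research interests: generative artificial intelligence security, trustworthy artificial intelligence.
WANG Juan (b. 1976), female, from Hubei, professor, Ph.D., CCF member. Research interests: network security, trusted computing, system security, artificial intelligence security.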
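Abstract:
With the rapid development and wide adoption of deep learning, the privacy and security problems it raises are drawing growing attention. Among them, model inversion attacks can reconstruct users' face images from the model parameters alone, posing a serious threat to user privacy. Although existing work has proposed a variety of defenses, these still struggle to balance model performance against defensive strength and offer insufficient protection against newer attacks. To address these problems, this article proposes a model inversion defense method based on knowledge transfer and freezing. The method freezes the fully connected layers responsible for classification, which effectively prevents private information from being extracted, and transfers the parameters of the layers immediately adjacent to the fully connected layers to further strengthen the defense. Experimental results show that, compared with existing defenses, the method achieves better defensive performance and stability across multiple models and datasets.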
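The abstract names two mechanisms: freezing the classification layers and transferring the parameters next to them. The framework-free toy below is only one possible reading of that description, not the authors' implementation. The `Layer` class, the helper names `transfer_and_freeze` and `sgd_step`, the layer names `features`, `pre_fc` and `fc`, and every numeric value are hypothetical. Whether the transferred layer is fine-tuned afterwards is also an assumption; the sketch leaves it trainable.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """A named parameter group with a flag saying whether training may update it."""
    name: str
    weights: list[float]
    trainable: bool = True


def transfer_and_freeze(source: list[Layer], target: list[Layer],
                        transfer: set[str], freeze: set[str]) -> list[Layer]:
    """Copy the `transfer` layers from source into target, then freeze the `freeze` layers."""
    source_by_name = {layer.name: layer for layer in source}
    for layer in target:
        if layer.name in transfer:
            # Knowledge transfer: reuse weights learned elsewhere (assumed: on public data).
            layer.weights = list(source_by_name[layer.name].weights)
        if layer.name in freeze:
            # Freezing: exclude the layer from every later gradient update.
            layer.trainable = False
    return target


def sgd_step(model: list[Layer], grads: dict[str, list[float]], lr: float = 0.5) -> None:
    """Plain gradient descent that skips frozen layers."""
    for layer in model:
        if layer.trainable:
            layer.weights = [w - lr * g for w, g in zip(layer.weights, grads[layer.name])]


# Hypothetical three-block models: feature extractor, block adjacent to the FC layer, FC classifier.
public_model = [Layer("features", [0.5, 0.5]), Layer("pre_fc", [1.0, 2.0]), Layer("fc", [0.3, 0.7])]
private_model = [Layer("features", [0.0, 0.0]), Layer("pre_fc", [0.0, 0.0]), Layer("fc", [0.1, 0.2])]

private_model = transfer_and_freeze(public_model, private_model,
                                    transfer={"pre_fc"}, freeze={"fc"})
sgd_step(private_model, {"features": [1.0, 1.0], "pre_fc": [1.0, 1.0], "fc": [1.0, 1.0]})

for layer in private_model:
    print(layer.name, layer.weights, "trainable" if layer.trainable else "frozen")
```

In a real deep learning framework the same idea would typically be expressed by copying the relevant entries of a pretrained checkpoint and disabling gradients on the classifier parameters. The toy only shows that the frozen `fc` weights never change while the other layers continue to train.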
YI Wenzhe, XU Xiaoyang, SHI Lei, ZHUANG Yong, WANG Juan. Model Inversion Defense Method Based on Knowledge Transfer and Freezing[J]. Netinfo Security, 2026, 26(4): 566-578.
Table 2 Comparison of defense effectiveness on the VGG-16 model
| Target model | Attack method | Defense method | Test accuracy | Attack accuracy | Feature distance |
|---|---|---|---|---|---|
| VGG-16 | GMI | No defense | 85.39% | 5.97% | 1893.23 |
| VGG-16 | GMI | MID | 86.84% | 7.99% | 1821.10 |
| VGG-16 | GMI | BiDO | 86.42% | 7.33% | 1841.95 |
| VGG-16 | GMI | NLS | 85.54% | 3.93% | 1927.11 |
| VGG-16 | GMI | TLDMI | 85.74% | 5.79% | 1916.22 |
| VGG-16 | GMI | TF | 86.62% | 2.93% | 2140.34 |
| VGG-16 | KEDMI | No defense | 85.39% | 36.12% | 1378.43 |
| VGG-16 | KEDMI | MID | 86.84% | 45.73% | 1308.22 |
| VGG-16 | KEDMI | BiDO | 86.42% | 42.06% | 1350.15 |
| VGG-16 | KEDMI | NLS | 85.54% | 25.53% | 1487.14 |
| VGG-16 | KEDMI | TLDMI | 85.74% | 34.79% | 1378.00 |
| VGG-16 | KEDMI | TF | 86.62% | 17.13% | 1813.13 |
| VGG-16 | LOMMA | No defense | 85.39% | 59.56% | 1241.25 |
| VGG-16 | LOMMA | MID | 86.84% | 63.73% | 1224.74 |
| VGG-16 | LOMMA | BiDO | 86.42% | 64.13% | 1229.07 |
| VGG-16 | LOMMA | NLS | 85.54% | 61.73% | 1154.26 |
| VGG-16 | LOMMA | TLDMI | 85.74% | 62.19% | 1224.45 |
| VGG-16 | LOMMA | TF | 86.62% | 53.13% | 1408.93 |
| VGG-16 | PLGMI | No defense | 85.39% | 85.25% | 1038.55 |
| VGG-16 | PLGMI | MID | 86.84% | 69.86% | 1189.83 |
| VGG-16 | PLGMI | BiDO | 86.42% | 74.53% | 1166.52 |
| VGG-16 | PLGMI | NLS | 85.54% | 62.40% | 1339.57 |
| VGG-16 | PLGMI | TLDMI | 85.74% | 80.80% | 1080.82 |
| VGG-16 | PLGMI | TF | 86.62% | 46.06% | 1549.52 |
Table 3 Comparison of defense effectiveness on the FaceNet64 model
| Target model | Attack method | Defense method | Test accuracy | Attack accuracy | Feature distance |
|---|---|---|---|---|---|
| FaceNet64 | GMI | No defense | 87.41% | 11.33% | 1783.41 |
| FaceNet64 | GMI | MID | 84.74% | 17.86% | 1815.32 |
| FaceNet64 | GMI | BiDO | 82.76% | 5.06% | 1908.71 |
| FaceNet64 | GMI | NLS | 86.07% | 5.06% | 1840.55 |
| FaceNet64 | GMI | TLDMI | 83.18% | 4.73% | 1910.75 |
| FaceNet64 | GMI | TF | 87.34% | 4.26% | 2035.10 |
| FaceNet64 | KEDMI | No defense | 87.41% | 39.46% | 1355.96 |
| FaceNet64 | KEDMI | MID | 84.74% | 55.06% | 1280.27 |
| FaceNet64 | KEDMI | BiDO | 82.76% | 30.59% | 1458.41 |
| FaceNet64 | KEDMI | NLS | 86.07% | 27.59% | 1465.72 |
| FaceNet64 | KEDMI | TLDMI | 83.18% | 35.73% | 1385.97 |
| FaceNet64 | KEDMI | TF | 87.34% | 23.00% | 1747.04 |
| FaceNet64 | LOMMA | No defense | 87.41% | 60.92% | 1262.41 |
| FaceNet64 | LOMMA | MID | 84.74% | 55.00% | 1341.44 |
| FaceNet64 | LOMMA | BiDO | 82.76% | 51.93% | 1355.91 |
| FaceNet64 | LOMMA | NLS | 86.07% | 58.13% | 1236.25 |
| FaceNet64 | LOMMA | TLDMI | 83.18% | 62.40% | 1226.23 |
| FaceNet64 | LOMMA | TF | 87.34% | 36.26% | 1552.27 |
| FaceNet64 | PLGMI | No defense | 87.41% | 89.98% | 996.34 |
| FaceNet64 | PLGMI | MID | 84.74% | 62.66% | 1328.30 |
| FaceNet64 | PLGMI | BiDO | 82.76% | 55.93% | 1361.91 |
| FaceNet64 | PLGMI | NLS | 86.07% | 90.39% | 958.19 |
| FaceNet64 | PLGMI | TLDMI | 83.18% | 82.59% | 1060.09 |
| FaceNet64 | PLGMI | TF | 87.34% | 50.19% | 1482.70 |
Table 4 Comparison of defense effectiveness on the ResNet-152 model
| Target model | Attack method | Defense method | Test accuracy | Attack accuracy | Feature distance |
|---|---|---|---|---|---|
| ResNet-152 | GMI | No defense | 92.22% | 18.00% | 1737.80 |
| ResNet-152 | GMI | MID | 92.06% | 28.26% | 1613.31 |
| ResNet-152 | GMI | BiDO | 84.16% | 7.59% | 1871.73 |
| ResNet-152 | GMI | NLS | 90.14% | 8.60% | 1792.02 |
| ResNet-152 | GMI | TLDMI | 87.70% | 8.20% | 1841.32 |
| ResNet-152 | GMI | TF | 89.17% | 6.46% | 2050.48 |
| ResNet-152 | KEDMI | No defense | 92.22% | 52.39% | 1261.05 |
| ResNet-152 | KEDMI | MID | 92.06% | 70.66% | 1156.38 |
| ResNet-152 | KEDMI | BiDO | 84.16% | 34.20% | 1406.98 |
| ResNet-152 | KEDMI | NLS | 90.14% | 35.26% | 1372.31 |
| ResNet-152 | KEDMI | TLDMI | 87.70% | 48.33% | 1323.27 |
| ResNet-152 | KEDMI | TF | 89.17% | 20.93% | 1810.63 |
| ResNet-152 | LOMMA | No defense | 92.22% | 72.06% | 1181.79 |
| ResNet-152 | LOMMA | MID | 92.06% | 79.73% | 1085.46 |
| ResNet-152 | LOMMA | BiDO | 84.16% | 45.53% | 1359.48 |
| ResNet-152 | LOMMA | NLS | 90.14% | 71.73% | 1146.22 |
| ResNet-152 | LOMMA | TLDMI | 87.70% | 70.80% | 1185.46 |
| ResNet-152 | LOMMA | TF | 89.17% | 39.53% | 1553.92 |
| ResNet-152 | PLGMI | No defense | 92.22% | 95.13% | 925.34 |
| ResNet-152 | PLGMI | MID | 92.06% | 92.79% | 980.15 |
| ResNet-152 | PLGMI | BiDO | 84.16% | 63.99% | 1265.38 |
| ResNet-152 | PLGMI | NLS | 90.14% | 95.13% | 892.71 |
| ResNet-152 | PLGMI | TLDMI | 87.70% | 94.13% | 911.18 |
| ResNet-152 | PLGMI | TF | 89.17% | 62.93% | 1341.79 |
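To make the trade-off in Tables 2 to 4 easier to read, the short script below computes, for each target model, how many percentage points the TF rows gain or lose in test accuracy and how far they reduce attack accuracy relative to the undefended rows. All numbers are copied from the tables. The function names `attack_drop` and `utility_change` are illustrative. TF is presumably the proposed transfer-and-freezing defense, although this excerpt does not expand the abbreviation.

```python
# Attack accuracy (%) from Tables 2-4, stored as (no defense, TF) per attack method.
ATTACK_ACC = {
    "VGG-16": {"GMI": (5.97, 2.93), "KEDMI": (36.12, 17.13),
               "LOMMA": (59.56, 53.13), "PLGMI": (85.25, 46.06)},
    "FaceNet64": {"GMI": (11.33, 4.26), "KEDMI": (39.46, 23.00),
                  "LOMMA": (60.92, 36.26), "PLGMI": (89.98, 50.19)},
    "ResNet-152": {"GMI": (18.00, 6.46), "KEDMI": (52.39, 20.93),
                   "LOMMA": (72.06, 39.53), "PLGMI": (95.13, 62.93)},
}

# Test accuracy (%) from Tables 2-4, stored as (no defense, TF).
TEST_ACC = {
    "VGG-16": (85.39, 86.62),
    "FaceNet64": (87.41, 87.34),
    "ResNet-152": (92.22, 89.17),
}


def attack_drop(model: str, attack: str) -> float:
    """Percentage-point reduction in attack accuracy when TF replaces no defense."""
    baseline, defended = ATTACK_ACC[model][attack]
    return round(baseline - defended, 2)


def utility_change(model: str) -> float:
    """Percentage-point change in test accuracy when TF replaces no defense."""
    baseline, defended = TEST_ACC[model]
    return round(defended - baseline, 2)


for model, attacks in ATTACK_ACC.items():
    drops = ", ".join(f"{name} -{attack_drop(model, name):.2f}" for name in attacks)
    print(f"{model}: test accuracy {utility_change(model):+.2f} pp; attack accuracy {drops} pp")
```

Reading the two quantities side by side shows the balance the abstract describes: a defense is useful only if the attack-accuracy reduction is large while the test-accuracy change stays small.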