Netinfo Security, 2025, Vol. 25, Issue (4): 524-535. doi: 10.3969/j.issn.1671-1122.2025.04.002
Received: 2024-11-25
Online: 2025-04-10
Published: 2025-04-25
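Corresponding author: SONG Xiao
About the authors: LI Xiao (born 2002), male, from Heilongjiang, Ph.D. candidate; main research interests: knowledge graphs, differential privacy, and large language models | SONG Xiao (born 1976), male, from Sichuan, professor, Ph.D.; main research interests: deep learning and reinforcement learning, industrial Internet security, and aerospace equipment modeling and simulation | LI Yong (born 1993), male, from Hubei, Ph.D. candidate; main research interests: differential privacy algorithms and deep learning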
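Abstract:
With the rapid development of intelligent healthcare systems, the scarcity of labeled data has become one of the key factors limiting research progress, and knowledge distillation, as an effective data-utilization strategy, can alleviate this problem. In intelligent healthcare, however, models are often used in place of humans to diagnose from images and data, which not only places higher demands on protecting the privacy of medical information but also makes model accuracy decisive for the correctness of the diagnosis. This article therefore proposes a knowledge distillation scheme combined with differential privacy and applies it to graph neural network models, protecting sensitive user information during distillation while maintaining high diagnostic accuracy. To validate the method, the article builds a graph attention network (GAT) model and a convolutional neural network (CNN) model for comparison and runs experiments on three real-world medical image datasets. The results show that the proposed method is more accurate with the GAT model than with the CNN model: on the three datasets, accuracy rises from 61% to 68%, from 83% to 93%, and from 67% to 80%, respectively. Because the GAT model is resource-intensive, the article further designs a lightweight GAT architecture. This lightweight model significantly reduces resource consumption while still outperforming the CNN model in classification, thereby improving medical diagnosis under differential privacy protection.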
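
The abstract does not say which mechanism injects noise during distillation, so the following is only a minimal sketch of one well-known option: PATE-style noisy vote aggregation, where several teacher models vote on each unlabeled sample and the student is trained on the label with the largest Laplace-perturbed count. The function names, the 50-teacher setup, and the value of `epsilon` are assumptions made for illustration, not details from the article.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_vote(teacher_votes, num_classes, epsilon, rng):
    """Return the class whose Laplace-perturbed vote count is largest.

    One teacher changing its vote moves two counts by one each, so the
    L1 sensitivity of the count vector is 2 and the noise scale is 2 / epsilon.
    """
    counts = Counter(teacher_votes)
    scale = 2.0 / epsilon
    noisy = [counts[c] + laplace_noise(scale, rng) for c in range(num_classes)]
    return max(range(num_classes), key=lambda c: noisy[c])


def label_public_samples(votes_per_sample, num_classes, epsilon, rng):
    """Produce one noisy label per unlabeled sample for training the student."""
    return [noisy_vote(v, num_classes, epsilon, rng) for v in votes_per_sample]


rng = random.Random(0)
# Hypothetical votes from 50 teachers on two unlabeled samples (assumption).
votes = [[1] * 45 + [0] * 5, [0] * 48 + [1] * 2]
student_labels = label_public_samples(votes, num_classes=2, epsilon=2.0, rng=rng)
print(student_labels)
```

Each call to `noisy_vote` spends `epsilon` of privacy budget, and under basic composition the budgets of successive queries add up, so the number of samples labeled this way bounds the total privacy cost.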
Cite this article: LI Xiao, SONG Xiao, LI Yong. Research on Differential Privacy Methods for Medical Diagnosis Based on Knowledge Distillation[J]. Netinfo Security, 2025, 25(4): 524-535.
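Table 1. Neural network structure of the CNN model
| Layer type | Name | Input size | Output size | Description |
|---|---|---|---|---|
| Input | - | (3,32,32) | (3,32,32) | Input image, 3 channels, 32×32 pixels |
| Convolution | Conv2d | (3,32,32) | (32,32,32) | 32 kernels of size 3×3, stride 1, padding 1 |
| Activation | ReLU | (32,32,32) | (32,32,32) | ReLU activation function |
| Pooling | MaxPool2d | (32,32,32) | (32,16,16) | Max pooling, 2×2 window, stride 2 |
| Flatten | - | (32,16,16) | (401408) | Flattens the feature maps into a 1D vector |
| Fully connected | Linear | (401408) | (2) | Fully connected layer with two output classes |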
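
The layer sizes in the CNN table can be checked with the standard output-size formula for convolution and pooling, `(size + 2 * padding - kernel) // stride + 1`. The sketch below is illustrative only, and its function names are not from the article. For the 32×32 input listed in the table, the flattened length works out to 32·16·16 = 8192, whereas the listed value 401408 equals 32·112·112, which is what a 224×224 input would produce. The page does not state which input size the experiments actually used.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def cnn_shapes(height, width, filters=32):
    """Trace Conv2d(3x3, stride 1, padding 1) -> MaxPool2d(2x2, stride 2) -> flatten."""
    conv_h, conv_w = conv_out(height, 3, 1, 1), conv_out(width, 3, 1, 1)
    pool_h, pool_w = conv_out(conv_h, 2, 2, 0), conv_out(conv_w, 2, 2, 0)
    return {
        "conv": (filters, conv_h, conv_w),
        "pool": (filters, pool_h, pool_w),
        "flat": filters * pool_h * pool_w,
    }


print(cnn_shapes(32, 32))    # shapes for the 32x32 input listed in the table
print(cnn_shapes(224, 224))  # shapes for a hypothetical 224x224 input
```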
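Table 2. Neural network structure of the GAT model
| Layer type | Name | Input size | Output size | Description |
|---|---|---|---|---|
| Input | - | (N,3) | (N,3) | Input node feature matrix; N is the number of nodes |
| Graph convolution | GATConv(conv1) | (N,3) | (N,96) | 96 kernels, each with 27 weights and one bias, 2688 parameters in total |
| Activation | ELU | (N,96) | (N,96) | ELU activation function |
| Dropout | Dropout | (N,96) | (N,96) | Dropout rate of 0.5 |
| Graph convolution | GATConv(conv2) | (N,96) | (N,32) | 32 kernels, each with 288 weights and one bias, 9248 parameters in total |
| Activation | ELU | (N,32) | (N,32) | ELU activation function |
| Dropout | Dropout | (N,32) | (N,32) | Dropout rate of 0.5 |
| Graph convolution | GATConv(conv3) | (N,32) | (N,32) | 32 kernels, each with 96 weights and one bias, 3104 parameters in total |
| Activation | ELU | (N,32) | (N,32) | ELU activation function |
| Pooling | Global Mean Pooling | (N,32) | (1,32) | Global mean pooling that aggregates the features of the whole graph |
| Dropout | Dropout | (1,32) | (1,32) | Dropout rate of 0.5 |
| Fully connected | Linear | (1,32) | (1,2) | Fully connected layer mapping 32 hidden units to 2 output classes, 66 parameters in total |
| Activation | Log Softmax | (1,2) | (1,2) | log_softmax used as the output activation |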
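
The parameter totals in the GAT table follow the rule "kernels × (weights per kernel + 1 bias)", and the last three rows describe a graph-level readout. The sketch below reproduces both under that reading; the helper names are illustrative and not from the article. The counting rule is the table's own convention, and graph libraries may count the parameters of an attention layer differently (for example, by including attention vectors).

```python
import math


def layer_params(kernels, weights_per_kernel):
    """Parameter count under the table's rule: each kernel has its weights plus one bias."""
    return kernels * (weights_per_kernel + 1)


gat_params = {
    "conv1": layer_params(96, 27),
    "conv2": layer_params(32, 288),
    "conv3": layer_params(32, 96),
    "linear": 32 * 2 + 2,  # 32 inputs x 2 outputs, plus 2 biases
}


def global_mean_pool(node_features):
    """Average an (N, F) list of node feature rows into one length-F graph vector."""
    n = len(node_features)
    return [sum(column) / n for column in zip(*node_features)]


def log_softmax(logits):
    """Numerically stable log-softmax over a list of class scores."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


print(gat_params, sum(gat_params.values()))
print(global_mean_pool([[1.0, 3.0], [3.0, 5.0]]))
print(log_softmax([0.5, -0.5]))
```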