Netinfo Security ›› 2025, Vol. 25 ›› Issue (1): 63-77. doi: 10.3969/j.issn.1671-1122.2025.01.006

• Theoretical Research •

Research on Federated Learning Adaptive Differential Privacy Method Based on Heterogeneous Data

XU Ruzhi, TONG Yumeng(), DAI Lipeng

  1. School of Control and Computer Engineering, North China Electric Power University, Beijing 102206, China
  • Received: 2024-09-28  Online: 2025-01-10  Published: 2025-02-14
  • Contact: TONG Yumeng  E-mail: tongym02@163.com
  • About the authors: XU Ruzhi (1966—), female, from Jiangxi, professor, Ph.D., main research interests: AI security and smart grid; TONG Yumeng (2002—), female, from Hebei, master's degree candidate, main research interest: federated learning; DAI Lipeng (1999—), male, from Anhui, master's degree candidate, main research interests: federated learning and differential privacy
  • Funding:
    National Key Research and Development Program of China (62372173)



Abstract:

In federated learning, the large volume of parameter exchange required during training can expose the system to security threats from untrusted participating devices, so effective privacy protection measures must be taken to protect the training data and model parameters. Given the imbalanced nature of heterogeneous data, this paper proposed an adaptive differential privacy method to protect the security of federated learning over heterogeneous data. First, a different initial privacy budget was set for each client, and Gaussian noise was added to the gradient parameters of its local model. Second, during training, each client's privacy budget was dynamically adjusted according to the loss value of each iteration, accelerating convergence. Third, a trusted central node was set up to randomly exchange the layer-wise parameters of the local models across clients and then upload the shuffled local model parameters to the central server for aggregation. Finally, the central server aggregated the shuffled parameters uploaded by the trusted central node and, according to a pre-set global privacy budget threshold, added appropriate noise to the global model as a privacy correction, achieving server-level privacy protection. Experimental results show that, under the same heterogeneous data conditions, the proposed method converges faster and achieves better model performance than ordinary differential privacy methods.
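The three client-side mechanisms summarized above (Gaussian noise on clipped local gradients, loss-driven adjustment of each client's privacy budget, and layer-wise parameter shuffling at a trusted node) can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the noise scale uses the standard Gaussian-mechanism calibration, and the multiplicative budget-update rule with bounds `eps_min`/`eps_max` is an assumed stand-in for the paper's adjustment strategy.

```python
import numpy as np

def gaussian_sigma(epsilon, delta, clip_norm):
    """Noise scale of the (epsilon, delta) Gaussian mechanism
    for a query with L2 sensitivity clip_norm."""
    return np.sqrt(2.0 * np.log(1.25 / delta)) * clip_norm / epsilon

def privatize_gradient(grad, epsilon, delta=1e-5, clip_norm=1.0, rng=None):
    """Clip a local gradient to clip_norm, then add calibrated Gaussian noise."""
    if rng is None:
        rng = np.random.default_rng()
    clipped = grad / max(1.0, np.linalg.norm(grad) / clip_norm)
    sigma = gaussian_sigma(epsilon, delta, clip_norm)
    return clipped + rng.normal(0.0, sigma, size=grad.shape)

def adjust_budget(epsilon, prev_loss, curr_loss, step=0.1,
                  eps_min=0.5, eps_max=8.0):
    """Loosen the budget (less noise) while the loss is still falling,
    tighten it once training stagnates; assumed rule for illustration."""
    if curr_loss < prev_loss:
        return min(epsilon * (1.0 + step), eps_max)
    return max(epsilon * (1.0 - step), eps_min)

def shuffle_layers(client_params, rng=None):
    """For each layer, randomly permute which client's tensor occupies
    which slot, breaking the link between a client and its full update."""
    if rng is None:
        rng = np.random.default_rng()
    shuffled = [dict() for _ in client_params]
    for name in client_params[0]:
        perm = rng.permutation(len(client_params))
        for i, j in enumerate(perm):
            shuffled[i][name] = client_params[j][name]
    return shuffled
```

In this sketch, each client would start from its own initial `epsilon` (e.g. scaled by the size of its local dataset) and call `adjust_budget` after every round; the trusted node applies `shuffle_layers` before forwarding parameters to the aggregating server.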

Key words: federated learning, heterogeneous data, differential privacy, Gaussian noise

CLC Number: