Netinfo Security ›› 2024, Vol. 24 ›› Issue (1): 69-79. doi: 10.3969/j.issn.1671-1122.2024.01.007

• Privacy Protection •

Research on Centralized Differential Privacy Algorithm for Federated Learning

XU Ruzhi, DAI Lipeng(), XIA Diya, YANG Xin

  1. School of Control and Computer Engineering, North China Electric Power University, Beijing 102200, China
  • Received: 2023-08-20 Online: 2024-01-10 Published: 2024-01-24
  • Contact: DAI Lipeng E-mail: dlpdaniel1234@163.com
  • Author biographies: XU Ruzhi (b. 1966), female, from Jiangxi, professor, Ph.D.; main research interests: AI security and smart grid | DAI Lipeng (b. 1999), male, from Anhui, master's student; main research interest: federated learning | XIA Diya (b. 1999), female, from Xinjiang, master's student; main research interests: federated learning and data reconstruction attacks | YANG Xin (b. 1996), female, from Anhui, master's student; main research interest: differential privacy
  • Funding:
    National Natural Science Foundation of China (61972148)



Abstract:

Federated learning has received increasing attention in recent years for breaking down "data silos" with its distinctive training paradigm. However, while the global model is being trained, federated learning is vulnerable to inference attacks, which may reveal information about the participating members and create serious security risks. To counter differential attacks mounted by semi-honest or malicious clients during federated training, this paper proposed a centralized differential privacy federated learning algorithm, DP-FedAC. Firstly, the federated accelerated stochastic gradient descent algorithm was optimized and the server's aggregation scheme was improved: after computing the parameter-update differences, the global model was updated by gradient aggregation to promote stable convergence. Then, centralized differentially private Gaussian noise was added to the aggregated parameters to hide the contributions of individual training members, thereby protecting participants' private information; the moments accountant (MA) was also introduced to track the privacy loss and further balance model convergence against privacy cost. Finally, comparative experiments were conducted with FedAC, distributed MB-SGD, and distributed MB-AC-SGD to evaluate the overall performance of DP-FedAC. The experimental results show that, under infrequent communication, the linear speedup of DP-FedAC is closest to that of FedAC and far better than that of the other two algorithms, demonstrating good robustness. In addition, DP-FedAC achieves the same model accuracy as FedAC while preserving privacy, reflecting the superiority and practicality of the algorithm.
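The centralized noise-addition step described above — clipping each client's parameter-update difference, averaging, and perturbing the aggregate with Gaussian noise on the server — can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the names `dp_aggregate`, `clip_norm`, and `noise_multiplier` are hypothetical, and the moments-accountant bookkeeping and FedAC acceleration are omitted.

```python
import numpy as np

def dp_aggregate(update_deltas, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """Server-side centralized DP aggregation sketch: clip each client's
    parameter-update delta to clip_norm, average the clipped deltas, then
    add Gaussian noise scaled to the per-client sensitivity clip_norm/n."""
    rng = rng if rng is not None else np.random.default_rng()
    n = len(update_deltas)
    # Clip each delta so no single client can contribute more than clip_norm.
    clipped = [d * min(1.0, clip_norm / max(np.linalg.norm(d), 1e-12))
               for d in update_deltas]
    mean = np.mean(clipped, axis=0)
    # Noise std for the averaged update; noise_multiplier trades privacy
    # for accuracy and feeds the privacy accountant.
    sigma = noise_multiplier * clip_norm / n
    return mean + rng.normal(0.0, sigma, size=mean.shape)
```

With `noise_multiplier = 0` the function reduces to plain clipped averaging, which makes the clipping behavior easy to check in isolation.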

Key words: federated learning, privacy leakage, differential privacy, Gaussian noise, privacy tracking

CLC Number: