Netinfo Security ›› 2023, Vol. 23 ›› Issue (7): 98-110. doi: 10.3969/j.issn.1671-1122.2023.07.010

• Theoretical Research •

A Multi-Server Federated Learning Scheme Based on Differential Privacy and Secret Sharing

CHEN Jing1, PENG Changgen1,2, TAN Weijie1,2, XU Dequan1

  1. State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
    2. Guizhou Big Data Academy, Guizhou University, Guiyang 550025, China
  • Received: 2023-03-27 Online: 2023-07-10 Published: 2023-07-14
  • Corresponding author: PENG Changgen, cgpeng@gzu.edu.cn
  • About the authors: CHEN Jing (b. 1998), male, from Jiangsu, master's student, CCF member; his main research interests are secure multi-party computation and privacy protection. PENG Changgen (b. 1963), male, from Guizhou, professor, Ph.D., CCF distinguished member; his main research interests are privacy protection, cryptography and big data security. TAN Weijie (b. 1981), male, from Shaanxi, associate professor, Ph.D., CCF member; his main research interests are IoT security, AI security and communication network security. XU Dequan (b. 1989), male, from Guizhou, Ph.D. candidate; his main research interests are cryptography and data security.
  • Supported by:
    National Natural Science Foundation of China (62272124)

Abstract:

Federated learning relies on a central server for scheduling, which allows multiple users to train a model jointly without their data leaving the local domain. However, most current federated learning schemes and their related privacy protection schemes rely on a single central server to perform encryption, decryption and gradient computation. This tends to reduce the computational efficiency of the server, and once that server is subject to an external attack or internal malicious collusion, a large amount of private information is leaked. Therefore, this paper combined differential privacy and secret sharing techniques to propose a multi-server federated learning scheme. Noise satisfying (ε,δ)-approximate differential privacy was added to the models trained by local users to prevent multiple servers from colluding to obtain private data. The noised gradients were then distributed to multiple servers via a secret sharing protocol, which secures the transmitted gradients while balancing the computational load across servers and improving overall computing efficiency. Experiments on public datasets evaluating the model performance, training overhead and security of the scheme show that it offers high security, that its performance loss is only about 4% compared with the plaintext scheme, and that its overall computational overhead is reduced by nearly 53% compared with a single-server encryption scheme.
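To make the workflow described in the abstract concrete, the following is a minimal sketch, not the authors' implementation, of the client-side pipeline it suggests: clip a local gradient, add Gaussian noise calibrated to (ε,δ)-differential privacy, split the noised gradient into additive shares for the multiple servers, and let the servers' partial sums reconstruct the aggregate. The function names, clipping bound, number of servers, and the real-valued (modulus-free) sharing are all illustrative assumptions rather than details taken from the paper.

```python
# Sketch of: DP noise addition + additive secret sharing across servers.
# Illustrative only; parameters and the sharing domain are assumptions.
import numpy as np

def gaussian_sigma(eps, delta, sensitivity):
    """Standard Gaussian-mechanism noise scale for (eps, delta)-DP."""
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps

def clip_gradient(grad, clip_norm):
    """Bound the L2 sensitivity of one client's update."""
    norm = np.linalg.norm(grad)
    return grad * min(1.0, clip_norm / (norm + 1e-12))

def perturb_and_share(grad, eps, delta, clip_norm, n_servers, rng):
    """Add (eps, delta)-DP Gaussian noise, then produce additive shares."""
    clipped = clip_gradient(grad, clip_norm)
    sigma = gaussian_sigma(eps, delta, clip_norm)
    noised = clipped + rng.normal(0.0, sigma, size=clipped.shape)
    # Additive sharing: n_servers - 1 random masks; the last share makes
    # the shares sum to the noised gradient. Share i goes to server i.
    shares = [rng.normal(0.0, 1.0, size=noised.shape)
              for _ in range(n_servers - 1)]
    shares.append(noised - sum(shares))
    return shares

def aggregate(per_client_shares):
    """Each server sums the shares it received; adding the servers'
    partial results reconstructs the sum of all noised gradients."""
    n_servers = len(per_client_shares[0])
    server_sums = [sum(client[i] for client in per_client_shares)
                   for i in range(n_servers)]
    return sum(server_sums) / len(per_client_shares)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clients = [rng.normal(size=10) for _ in range(5)]
    shares = [perturb_and_share(g, eps=1.0, delta=1e-5,
                                clip_norm=1.0, n_servers=3, rng=rng)
              for g in clients]
    print("aggregated update:", aggregate(shares))
```

A production implementation would typically encode gradients in fixed point and share them over a finite ring (for example, integers modulo 2^32) so that individual shares leak nothing; the real-valued masking above only illustrates the data flow between clients and the multiple servers.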

Key words: federated learning, differential privacy, secret sharing, multi-server, privacy security

CLC number: