Netinfo Security ›› 2021, Vol. 21 ›› Issue (8): 26-34. DOI: 10.3969/j.issn.1671-1122.2021.08.004

• Technical Research •

A New Parameter Masking Federated Learning Privacy Preserving Scheme

LU Honglin1,2, WANG Liming1, YANG Jing1

  1. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China
    2. School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2021-03-19  Online: 2021-08-10  Published: 2021-09-01
  • Contact: WANG Liming  E-mail: wangliming@iie.ac.cn
  • About the authors: LU Honglin (1995—), female, from Shandong, master's degree candidate; her main research interest is data privacy protection in federated learning. WANG Liming (1978—), male, from Beijing, professor-level senior engineer, Ph.D.; his main research interests are cloud computing security and network security. YANG Jing (1984—), female, from Shanxi, senior engineer, Ph.D.; her main research interests are network security and data security analysis.
  • Supported by: National Key R&D Program of China (2017YFB1010004)



Abstract:

With the successive promulgation of data privacy protection laws and regulations, the exposure of private data in the traditional centralized learning paradigm has become an important factor restricting the development of artificial intelligence. Federated learning was proposed to solve this problem, but existing federated learning still suffers from issues such as model parameters leaking sensitive information and reliance on a trusted third-party server. This paper proposes a new parameter masking federated learning privacy preserving scheme that can resist attacks by the server, attacks by users, and collusion between the server and fewer than t users. The scheme consists of three protocols: key exchange, parameter masking, and dropout handling. Each user uploads masked model parameters after training the model locally; after aggregating the model parameters, the server obtains only the masked aggregation result. Experiments show that for 16-bit input values, 2^7 users, and 2^20-dimensional vectors, the proposed scheme incurs 1.44× communication expansion compared with sending data in the clear, a lower communication cost than existing schemes.
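To make the masking step concrete, below is a minimal Python sketch of pairwise parameter masking in the spirit of what the abstract describes. It is an illustration under stated assumptions, not the paper's actual protocol: the names (expand_mask, mask_update), the NumPy integer vectors, and the seed-expansion choice are hypothetical; the key exchange is replaced by a stand-in that hands each user pair a shared random seed; and the threshold-t secret sharing and dropout-handling protocol are omitted.

import secrets

import numpy as np

# Hypothetical sketch of pairwise parameter masking: every pair of users
# shares a seed (a stand-in for the key-exchange protocol) and expands it
# into a mask vector. User u adds the mask for each partner v with u < v
# and subtracts it for each partner v with u > v, so all masks cancel
# when the server sums the masked uploads.

DIM = 8  # length of the model-parameter vector (2^20 in the paper's experiments)

def expand_mask(seed: int, dim: int) -> np.ndarray:
    """Deterministically expand a shared seed into a pseudorandom mask."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, 2**16, size=dim, dtype=np.int64)

def mask_update(u: int, params: np.ndarray, peer_seeds: dict) -> np.ndarray:
    """Return user u's parameters blinded by signed pairwise masks."""
    masked = params.astype(np.int64)
    for v, seed in peer_seeds.items():
        sign = 1 if u < v else -1
        masked = masked + sign * expand_mask(seed, len(params))
    return masked

users = [0, 1, 2]
# Stand-in for key exchange: one shared random seed per user pair.
pair_seed = {(u, v): secrets.randbits(32) for u in users for v in users if u < v}
seeds = {u: {v: pair_seed[min(u, v), max(u, v)] for v in users if v != u}
         for u in users}

updates = {u: np.full(DIM, u + 1, dtype=np.int64) for u in users}  # toy parameters
aggregate = sum(mask_update(u, updates[u], seeds[u]) for u in users)
assert (aggregate == sum(updates.values())).all()  # masks cancel in the sum

Because each pairwise mask enters one upload with sign +1 and the paired upload with sign -1, the masks cancel in the server's sum: the server recovers the exact aggregate while every individual upload remains masked.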

Key words: federated learning, privacy preserving, parameter masking, user disconnection
