Netinfo Security ›› 2024, Vol. 24 ›› Issue (11): 1731-1738. doi: 10.3969/j.issn.1671-1122.2024.11.012

• Selected Papers •

Research on Data Noise Addition Method Based on Availability

GU Haiyan1, LIU Qi2, MA Zhuo1, ZHU Tao1, QIAN Hanwei1

  1. Department of Computer Information and Cyber Security, Jiangsu Police Institute, Nanjing 210031, China
  2. Nanjing Public Security Bureau Xuanwu Sub-Bureau, Nanjing 210058, China
  • Received: 2024-07-05 Online: 2024-11-10 Published: 2024-11-21
  • Corresponding author: GU Haiyan ghy388@126.com
  • About the authors: GU Haiyan (born 1970), female, Jiangsu, professor, master's degree, main research interest: information security | LIU Qi (born 2000), male, Jiangsu, bachelor's degree, main research interest: network security and law enforcement | MA Zhuo (born 1993), female, Shanxi, lecturer, Ph.D., CCF member, main research interests: information security and user privacy | ZHU Tao (born 1982), male, Sichuan, associate professor, Ph.D., main research interest: big data technology | QIAN Hanwei (born 1982), male, Jiangsu, senior engineer, master's degree, main research interest: information security
  • Supported by:
    National Natural Science Foundation of China (62202209); Jiangsu Province Higher Education Teaching Reform Project (2023JSJG364)

Abstract:

With the rapid development of information technology, data privacy protection has become a research hotspot. How to protect personal privacy effectively while making full use of data resources is an urgent problem. Adding noise to data is one of the most commonly used privacy protection methods, yet little research has examined how different noise addition methods affect data usability. This paper adds Laplace noise and Gaussian noise separately to “five-star” rating data that reflect user experience, and compares the changes in three statistical metrics of the noised data: mean absolute error, root mean square error, and variance growth rate. It then examines how these metrics are affected by combined noise, in which the proportion of the two noise types varies, and by different data volumes. The experimental results show that, when the data volume is large, the larger the proportion of data to which Gaussian noise is added, the closer the statistical properties of the noised data are to those of the original data, so that data usability is better preserved while personal privacy is protected.

Key words: privacy protection, availability, evaluation data, Laplace noise, Gaussian noise
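
The evaluation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `add_mixed_noise` and `utility_metrics`, the noise scales, the synthetic ratings, and the definition of variance growth rate as (Var(noisy) - Var(original)) / Var(original) are all assumptions, since the abstract does not give these details.

```python
import math
import random
import statistics


def laplace_sample(rng, scale):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def add_mixed_noise(ratings, gaussian_fraction, laplace_scale=1.0,
                    gaussian_sigma=1.0, seed=None):
    """Add Gaussian noise to roughly `gaussian_fraction` of the records
    and Laplace noise to the rest (hypothetical mixing scheme)."""
    rng = random.Random(seed)
    noisy = []
    for rating in ratings:
        if rng.random() < gaussian_fraction:
            noise = rng.gauss(0.0, gaussian_sigma)
        else:
            noise = laplace_sample(rng, laplace_scale)
        noisy.append(rating + noise)
    return noisy


def utility_metrics(original, noisy):
    """Return MAE, RMSE, and an assumed variance growth rate.

    Assumes the original data has non-zero variance.
    """
    errors = [n - o for o, n in zip(original, noisy)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    var_original = statistics.pvariance(original)
    # Assumed definition: relative increase of the population variance.
    growth = (statistics.pvariance(noisy) - var_original) / var_original
    return {"mae": mae, "rmse": rmse, "variance_growth_rate": growth}


if __name__ == "__main__":
    data_rng = random.Random(0)
    # Synthetic 1-5 star ratings stand in for the paper's rating data.
    ratings = [data_rng.randint(1, 5) for _ in range(10_000)]
    for fraction in (0.0, 0.5, 1.0):
        noisy = add_mixed_noise(ratings, fraction, seed=1)
        print(fraction, utility_metrics(ratings, noisy))
```

Note that the comparison depends on how the two noise scales are calibrated: a Laplace distribution with scale b has variance 2b², while a Gaussian with standard deviation σ has variance σ², so equal default parameters here do not imply equal noise strength, nor do they reproduce the paper's settings.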

CLC Number: