Netinfo Security ›› 2022, Vol. 22 ›› Issue (9): 11-20. doi: 10.3969/j.issn.1671-1122.2022.09.002

• Technical Research •

Research on Robust Watermarking Technology Based on k-Anonymous Datasets

于晶1,2, 袁曙光1,2, 袁煜琳1,2, 陈驰1,2

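  1. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093
  2. School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049
  • Received: 2022-05-29 Online: 2022-09-10 Published: 2022-11-14
  • Corresponding author: CHEN Chi E-mail: chenchi@iie.ac.cn
  • About the authors: YU Jing (born 1986), female, from Liaoning, senior engineer, Ph.D. candidate; main research interests: data security and cloud computing security | YUAN Shuguang (born 1994), male, from Shandong, Ph.D. candidate; main research interest: data security | YUAN Yulin (born 1998), female, from Henan, Ph.D. candidate; main research interest: data security | CHEN Chi (born 1978), male, from Shandong, professor-level senior engineer, Ph.D.; main research interests: data security and cloud computing security
  • Supported by:
    National Key Research and Development Program of China (2020AAA0107800)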

A Robust Watermarking Technology Based on k-Anonymity Dataset

YU Jing1,2, YUAN Shuguang1,2, YUAN Yulin1,2, CHEN Chi1,2

  1. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China
  2. School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2022-05-29 Online: 2022-09-10 Published: 2022-11-14
  • Contact: CHEN Chi E-mail: chenchi@iie.ac.cn

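Abstract:

In the era of big data, secure and controllable data publishing is becoming increasingly important. Before publishing data, data holders usually apply k-anonymity to the dataset to protect user privacy, and they need to add a watermark to the dataset to protect data copyright. Embedding watermark information in k-anonymous datasets is therefore of practical significance. Taking watermark embedding in k-anonymous datasets as its research goal, this paper addresses two problems, namely that such datasets lack a primary key and offer limited watermark space, and proposes a robust watermarking scheme based on k-anonymous datasets. The scheme uses quasi-identifier attributes in place of the primary key attribute as the seed of the watermark location function, embeds watermark information in the non-sensitive attributes of the k-anonymous dataset, and uses two rounds of majority voting in the watermark detection stage to correct watermark errors. Without affecting the k-anonymity privacy goal, the scheme embeds data copyright information and thereby protects both data privacy and data copyright. Experiments show that the proposed watermarking scheme has good robustness and execution efficiency.

Key words: database watermarking, copyright protection, k-anonymity, privacy protection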

Abstract:

In the era of big data, secure and controlled data publishing is becoming increasingly vital. Before publishing a dataset, data holders often anonymize user data with k-anonymity to protect privacy and embed a watermark in the published dataset to protect data copyright. Hence, there is a practical need for watermarking k-anonymous datasets. The main purpose of this paper is to embed a watermark in a k-anonymous dataset. Two important problems must be addressed: the lack of a primary key and the limited watermark space. To address these problems, we propose a robust watermarking scheme for k-anonymous datasets. The scheme uses quasi-identifier attributes instead of the primary key as the seed of the watermark location function, embeds the watermark information in non-sensitive attributes, and corrects watermark errors with two rounds of majority voting in the watermark detection phase. The scheme does not interfere with the k-anonymity privacy goal and achieves dual protection of privacy and copyright. Experiments show that the proposed watermarking scheme has good robustness and efficiency.

Key words: database watermarking, copyright protection, k-anonymity, privacy protection
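
The pipeline summarized in the abstract (a location function seeded by quasi-identifiers, embedding in a non-sensitive attribute, and two rounds of majority voting at detection) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names `locate`, `embed`, and `detect`, the use of HMAC-SHA256 as the keyed location function, least-significant-bit embedding in an integer column, and the reading of "two rounds of majority voting" as one vote inside each equivalence class followed by one vote across the classes that map to the same watermark bit.

```python
import hashlib
import hmac
from collections import Counter, defaultdict


def locate(key: bytes, qi: tuple, wm_len: int) -> int:
    """Map a quasi-identifier tuple to a watermark bit index (assumed keyed hash)."""
    msg = "|".join(map(str, qi)).encode()
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % wm_len


def embed(rows, key, bits, qi_cols, carrier):
    """Return copies of rows with one watermark bit written to the carrier's LSB.

    Only the non-sensitive integer column `carrier` changes; quasi-identifier
    and sensitive columns are copied unchanged.
    """
    marked = []
    for row in rows:
        qi = tuple(row[c] for c in qi_cols)
        bit = bits[locate(key, qi, len(bits))]
        new_row = dict(row)
        new_row[carrier] = (row[carrier] & ~1) | bit
        marked.append(new_row)
    return marked


def majority(values):
    """Most common value; ties go to the value seen first."""
    return Counter(values).most_common(1)[0][0]


def detect(rows, key, wm_len, qi_cols, carrier):
    """Recover watermark bits; positions that no tuple maps to stay None."""
    # Vote 1 (assumed): inside each equivalence class, over the carrier LSBs.
    by_class = defaultdict(list)
    for row in rows:
        qi = tuple(row[c] for c in qi_cols)
        by_class[qi].append(row[carrier] & 1)
    # Vote 2 (assumed): across all classes mapped to the same bit position.
    by_pos = defaultdict(list)
    for qi, lsbs in by_class.items():
        by_pos[locate(key, qi, wm_len)].append(majority(lsbs))
    return [majority(by_pos[i]) if i in by_pos else None for i in range(wm_len)]
```

Because every tuple in an equivalence class shares the same quasi-identifier values, all of them land on the same watermark bit, so a k-anonymous table gives at least k redundant copies of each embedded bit without requiring a primary key. In this sketch the carrier value changes by at most 1, and a watermark longer than the number of equivalence classes leaves some positions unrecoverable (`None`), which mirrors the limited watermark space mentioned in the abstract.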

CLC Number: