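Netinfo Security ›› 2024, Vol. 24 ›› Issue (12): 1855-1870. doi: 10.3969/j.issn.1671-1122.2024.12.005
YUAN Yulin1,2,3, YUAN Shuguang1,3, YU Jing1,2,3, CHEN Chi1,2,3
Received:
2024-08-20
Online:
2024-12-10
Published:
2025-01-10
Corresponding author:
CHEN Chi
About the authors:
YUAN Yulin (born 1998), female, from Henan, Ph.D. candidate; main research interest: data security | YUAN Shuguang (born 1994), male, from Shandong, assistant researcher, Ph.D., CCF member; main research interest: data security | YU Jing (born 1986), female, from Liaoning, senior engineer, Ph.D.; main research interest: data security | CHEN Chi (born 1978), male, from Shandong, professor-level senior engineer, Ph.D.; main research interests: cloud computing security theory and technology, data security theory and technology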
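Abstract:
Personal privacy leakage is a severe challenge for data security today. Anonymization techniques de-identify personal information to reduce the risk of privacy leakage, but an improper anonymization process degrades the anonymization result, and anonymized data still carries some risk of re-identification. As regulation of secure data circulation tightens in China, designing compliance-oriented anonymization processes and assessing data risk is important for sharing personal information. Previous anonymization risk assessments mostly judged security through attack models, overlooking both the risks within the anonymization process and the compliance of the anonymized data. This paper therefore proposes a general anonymization process and, on that basis, carries out risk assessment focused on the security and compliance of the data. The security assessment provides assessment methods and an indicator system covering process risk and data re-identification risk. The compliance assessment summarizes existing standards and proposes quantifiable compliance requirements, so that the compliance determination is completed alongside the security assessment. Simulation experiments on the anonymization process verify the feasibility of the general anonymization process, and simulations of different risk scenarios verify that the risk assessment method can effectively discover potential threats.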
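To make the notion of re-identification risk in the abstract concrete, the following minimal sketch computes a common textbook measure based on equivalence classes over quasi-identifiers. It is an illustration only, not the assessment method or indicator system proposed in the paper. The function name `reidentification_risk`, the sample records, and the choice of `age` and `zip` as quasi-identifiers are all assumptions made for this example.

```python
from collections import Counter


def reidentification_risk(records, quasi_identifiers):
    """Return (k, max_risk, avg_risk) for a list of dict records.

    Records sharing the same quasi-identifier values form an equivalence
    class. A record in a class of size s is assumed to be re-identified
    with probability 1/s (hypothetical attacker model for this sketch).
    """
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    class_sizes = Counter(keys)
    k = min(class_sizes.values())               # the dataset is k-anonymous
    max_risk = 1.0 / k                          # worst-case record risk
    avg_risk = len(class_sizes) / len(records)  # mean of 1/s over all records
    return k, max_risk, avg_risk


# Hypothetical, already generalized sample data.
records = [
    {"age": "30-39", "zip": "100**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "100**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "200**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "200**", "diagnosis": "diabetes"},
    {"age": "40-49", "zip": "200**", "diagnosis": "flu"},
]

k, max_risk, avg_risk = reidentification_risk(records, ["age", "zip"])
print(k, max_risk, avg_risk)
```

A quantifiable compliance requirement could then take the form of a threshold on such a value, for example requiring `max_risk` to stay below a chosen limit; any specific threshold would have to come from the applicable standard, not from this sketch.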
YUAN Yulin, YUAN Shuguang, YU Jing, CHEN Chi. Anonymization General Process and Risk Assessment Method for Data Compliance[J]. Netinfo Security, 2024, 24(12): 1855-1870.