信息网络安全 ›› 2021, Vol. 21 ›› Issue (2): 10-15.doi: 10.3969/j.issn.1671-1122.2021.02.002

• 技术研究 • 上一篇    下一篇

Geohash编码抗k近邻攻击的脆弱性分析

涂国庆1,2, 杨延浩1,2(), 刘树波3   

  1. 1.武汉大学国家网络安全学院,武汉 430072
    2.空天信息安全与可信计算教育部重点实验室,武汉 430072
    3.武汉大学计算机学院,武汉 430070
  • 收稿日期:2020-10-21 出版日期:2021-02-10 发布日期:2021-02-23
  • 通讯作者: 杨延浩 E-mail:yangyanhao@whu.edu.cn
  • 作者简介:涂国庆(1974—),男,湖北,副教授,博士,主要研究方向为物联网安全|杨延浩(1997—),男,山东,硕士研究生,主要研究方向为数据库安全|刘树波(1970—),男,黑龙江,教授,博士,主要研究方向为物联网安全
  • 基金资助:
    国家自然科学基金(41671443);武汉市应用基础前沿项目(2020020601012266)

Vulnerability Analysis of Geohash Code Against k-nearest Neighbor Attack

TU Guoqing1,2, YANG Yanhao1,2(), LIU Shubo3   

  1. 1. School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China
    2. Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, Wuhan 430072, China
    3. School of Computer Science,Wuhan University, Wuhan 430070, China
  • Received:2020-10-21 Online:2021-02-10 Published:2021-02-23
  • Contact: YANG Yanhao E-mail:yangyanhao@whu.edu.cn

摘要:

Geohash编码作为一种降维技术目前已应用于空间数据库和空间数据引擎中,但其安全性还有待进一步研究。文章关注Geohash编码存在的安全漏洞,从理论上分析了此种降维技术产生推理通道的原因,并提出一种基于k近邻查询的加密Geohash字段重构算法,通过观察大量k近邻查询响应中的明文信息进行统计推断并重构出加密Geohash的原始值。对加密兴趣点数据库进行重构实验,实验表明,观察到的查询响应数量越多,重构值的精确度越高。在Geohash编码精度为30 bit的情况下,当观察到100000到3000000次查询响应时,重构值与原始值平均误差为0.074%到0.015%。该实验揭示了Geohash编码在抵抗k近邻查询推理攻击方面的脆弱性及形成机理,将促进相关地理信息系统行业的安全应用与研究。

关键词: 空间数据库, Geohash编码, k近邻查询, 可搜索加密, 数据库推理攻击

Abstract:

As a dimensionality reduction technology, Geohash coding has been applied to many spatial databases and spatial data engines, but there is no research on its security. This paper focuses on the security vulnerabilities in Geohash encoding, theoretically analyzes the reason why this dimensionality reduction technology produces inference channels, and proposes an encrypted Geohash field reconstruction algorithm based on k nearest neighbor query, by observing a large number of k nearest neighbor query responses for plaintext information, perform statistical inference and reconstruct the original value of encrypted Geohash. Reconstruction experiments on the encrypted interest point database show that the more the number of query responses observed, the higher the accuracy of the reconstruction value. In the case of Geohash coding accuracy of 30bit, when 100000 to 3000000 query responses are observed, the average error between the reconstructed value and the original value is 0.074% to 0.015%. This work reveals the vulnerability and formation mechanism of Geohash coding in resisting k nearest neighbor query inference attacks, and will promote the security application and research of related geographic information system industries.

Key words: spatial database, Geohash coding, k-nearest neighbor query, searchable encryption, database reasoning attack

中图分类号: