信息网络安全 (Netinfo Security), 2019, Vol. 19, Issue (2): 53-59. doi: 10.3969/j.issn.1671-1122.2019.02.007

• Technical Research •

Research on a Disk Data Synchronization Method Based on Directory Hash Tree

Shuai LI1, Xiaojie LIU2, Bing XU1

  1. College of Computer Science, Sichuan University, Chengdu, Sichuan 610041, China
  2. College of Cybersecurity, Sichuan University, Chengdu, Sichuan 610041, China
  • Received: 2018-11-15 Online: 2019-02-10 Published: 2020-05-11
  • About the authors:

    Shuai LI (born 1992), male, from Shanxi, master's student; main research interest: data storage. Xiaojie LIU (born 1965), female, from Jiangsu, professor, master's degree; main research interests: network information confrontation and protection technology, and digital virtual asset protection technology. Bing XU (born 1993), male, from Hubei, master's student; main research interest: network and information security.

  • Funding:
    National Key R&D Program of China [2016YFB0800604, 2016YFB0800605]; National Natural Science Foundation of China [61572334, U1736212]; Sichuan Province Key R&D Project [2018GZ0183]

Abstract:

With the widespread adoption of cloud computing, cloud data security has become increasingly important, and disaster recovery backup of cloud data is one of its key areas. Most mainstream cloud platforms currently use the Rsync synchronization algorithm for disaster recovery backup. Rsync is an efficient file synchronization algorithm, but in complex cloud storage environments backup is mostly performed with the disk as the unit. When applied to disk data that is large in volume and has a complicated partition directory structure, Rsync is inefficient both at identifying unchanged files and at synchronizing newly added files. To address this problem, this paper proposes a disk data synchronization method based on a directory hash tree. The directory hash tree keeps the same topology as the original disk directory tree, which lets the method quickly determine which files are identical and which differ. Files that differ are synchronized with the Rsync algorithm, and newly added files are synchronized in full. Experimental results show that the proposed method synchronizes disk data more efficiently than using Rsync alone.

Key words: cloud storage, data disaster recovery backup, data synchronization, Rsync, directory hash tree
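
The abstract does not give the construction of the directory hash tree, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a Merkle-style tree in which each file node stores a SHA-256 digest of its content and each directory node stores a digest of its children's names and digests. The names `Node`, `build`, and `diff` are hypothetical, and the disk is modeled as nested Python dicts rather than a real file system. Files deleted on the source side are not handled here.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    digest: str
    children: dict = field(default_factory=dict)  # name -> Node; empty for files
    is_dir: bool = False


def build(entry):
    """Build a hash tree from nested dicts (directories) and bytes (file contents)."""
    if isinstance(entry, bytes):
        return Node(hashlib.sha256(entry).hexdigest())
    children = {name: build(child) for name, child in entry.items()}
    h = hashlib.sha256()
    for name in sorted(children):
        # A directory digest covers every child's name and digest,
        # so any change below the directory changes its digest too.
        h.update(name.encode("utf-8") + b"\0" + children[name].digest.encode("ascii"))
    return Node(h.hexdigest(), children, is_dir=True)


def diff(src, dst, path=""):
    """Return (new_files, changed_files) found in src relative to dst."""
    new, changed = [], []
    if dst is not None and src.is_dir == dst.is_dir and src.digest == dst.digest:
        return new, changed  # identical subtree: skipped without visiting its files
    if not src.is_dir:
        # A file with no file counterpart is new (full transfer);
        # otherwise its content differs (candidate for Rsync delta transfer).
        (new if dst is None or dst.is_dir else changed).append(path)
        return new, changed
    dst_children = dst.children if dst is not None and dst.is_dir else {}
    for name, child in sorted(src.children.items()):
        sub_new, sub_changed = diff(child, dst_children.get(name), f"{path}/{name}")
        new += sub_new
        changed += sub_changed
    return new, changed


# Hypothetical example data: "etc" is unchanged, "a.txt" is modified, "b.txt" is new.
source = {"etc": {"hosts": b"127.0.0.1"}, "home": {"a.txt": b"v2", "b.txt": b"new"}}
backup = {"etc": {"hosts": b"127.0.0.1"}, "home": {"a.txt": b"v1"}}
new_files, changed_files = diff(build(source), build(backup))
```

In this sketch an unchanged directory such as `etc` is dismissed with a single digest comparison, which is the kind of shortcut the abstract attributes to the directory hash tree. Only the paths in `changed_files` would then be handed to Rsync, while those in `new_files` would be copied whole.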

CLC number: