Netinfo Security ›› 2017, Vol. 17 ›› Issue (5): 28-36. doi: 10.3969/j.issn.1671-1122.2017.05.005

A Survey of Open Source Software for Big Data Governance and Security

Wenjie WANG, Baiqing HU, Chi LIU

  1. School of Software, Beijing Institute of Technology, Beijing 100081, China
  • Received: 2017-03-11 Online: 2017-05-20 Published: 2020-05-12
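  • About the authors:

    Wenjie WANG (born 1995), male, from Jiangxi, master's student; main research interest: big data security. Baiqing HU (born 1992), male, from Hubei, master's student; main research interest: big data security. Chi LIU (born 1984), male, from Beijing, professor, Ph.D.; main research interests: big data and Internet of Things technology.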

  • Supported by:
    National Natural Science Foundation of China [61300179]

A Survey of Open Source Software for Big Data Governance and Security

Wenjie WANG, Baiqing HU, Chi LIU

  1. School of Software, Beijing Institute of Technology, Beijing 100081, China
  • Received: 2017-03-11 Online: 2017-05-20 Published: 2020-05-12

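Abstract (translated from the Chinese original):

With the continuing development and spread of network and information technology, the volume of data produced by humanity is growing exponentially. Unlike in the era of traditional technology, data no longer stays within the control of its owner, so big data security and privacy have become issues of widespread concern. Big data security and governance has emerged as one of the most active research fields, aimed at the difficulty of guaranteeing data security and data privacy. This paper first introduces the basic concepts of big data security and governance, and then discusses open source frameworks for big data governance and security, including Apache Falcon, Apache Atlas, Apache Ranger, Apache Sentry and Kerberos. Apache Falcon and Apache Atlas can perform data lifecycle management on a big data platform, including data collection, data processing, data backup and data cleansing, and can also schedule the platform's various components. Apache Ranger and Apache Sentry provide fine-grained access control and log auditing for data access on a big data platform. Kerberos is mainly used to authenticate the frameworks running on a big data platform and to maintain their security.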

Key words: big data, security, governance, open source

Abstract:

With the development of Internet technology, the amount of data is increasing exponentially. Unlike in the era of traditional technology, this data can no longer be easily controlled by its owner. Therefore, big data security and privacy have become a hot issue. Big data security and governance is one of the most popular research fields addressing data security and data privacy. This paper first introduces the basic concepts of big data security and governance, and then discusses open source frameworks including Apache Falcon, Apache Atlas, Apache Ranger, Apache Sentry and Kerberos. Apache Falcon and Apache Atlas can perform data lifecycle management for big data platforms, including data collection, data processing, data backup and data cleansing, as well as scheduling of the platforms' components. Apache Ranger and Apache Sentry provide fine-grained authorization for specific actions or operations, along with a central audit server. Kerberos is mainly used to authenticate the frameworks on a big data platform and to maintain the platform's security.

Key words: big data, security, governance, open source

CLC number: