Netinfo Security ›› 2017, Vol. 17 ›› Issue (10): 81-85. doi: 10.3969/j.issn.1671-1122.2017.10.013

• Original Article •

Research on HDFS Small File Problem Based on Real-time Data of Cybersecurity

Shaojie WANG, Chun LONG, Wei WAN, Jing ZHAO

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2017-08-01 Online: 2017-10-10 Published: 2020-05-12

Abstract:

Cybersecurity awareness depends on real-time risk information: incoming data must be written and made searchable as quickly as possible. However, reading and writing the same storage unit at the same time causes conflicts and ultimately errors. Some data sources can transfer files at regular intervals, which avoids this conflict, but a short transfer interval produces a large number of small files and wastes a great deal of storage. To address this small file problem, this paper proposes a file transfer append strategy based on file size: an append function is added to the file write and transfer functions so that small files are merged. The strategy guarantees that file size exceeds a pre-set value. Simulation results show that it effectively reduces the number of files and the storage wasted.
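
The abstract does not give the algorithm itself, so the following is only a minimal sketch of one way a size-based append decision could work, not the authors' implementation. It models files as plain byte counts rather than real HDFS files, and the function name `transfer_with_append`, the parameter `min_size`, and the sample batch sizes are all hypothetical. At each transfer interval the new batch is appended to the most recent file if that file is still below the threshold; otherwise a new file is started.

```python
def transfer_with_append(batch_sizes, min_size):
    """Return the sizes of the files left at the destination.

    batch_sizes: bytes delivered at each transfer interval (hypothetical input).
    min_size:    hypothetical pre-set threshold; a file stops receiving
                 appends once it has reached this size.
    """
    files = []
    for nbytes in batch_sizes:
        if files and files[-1] < min_size:
            files[-1] += nbytes   # last file is still small: append to it
        else:
            files.append(nbytes)  # no file yet, or last one is large enough
    return files


batches = [40, 30, 50, 20, 90, 10, 10]   # made-up sizes, one batch per interval
naive = list(batches)                     # baseline: one file per interval
merged = transfer_with_append(batches, min_size=100)

print(len(naive), len(merged))  # 7 3
print(merged)                   # [120, 110, 20]
```

In this toy model every file except possibly the last one reaches `min_size`, and the total number of bytes is unchanged; only the file count drops. A real deployment would also have to decide when the final, still-small file becomes visible to readers, which this sketch does not address.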

Key words: big data, HDFS, cybersecurity

CLC Number: