信息网络安全 ›› 2019, Vol. 19 ›› Issue (4): 55-62.doi: 10.3969/j.issn.1671-1122.2019.04.007

• 技术研究 • 上一篇    下一篇

网络安全态势感知中Trie树关键词高速匹配算法研究

徐国天1(), 张铭2   

  1. 1.中国刑事警察学院网络犯罪侦查系,辽宁沈阳 110854
    2.哈尔滨医科大学附属第二医院,黑龙江哈尔滨 150086
  • 收稿日期:2018-12-10 出版日期:2019-04-10 发布日期:2020-05-11
  • 作者简介:

    作者简介:徐国天(1978—),男,辽宁,副教授,硕士,主要研究方向为网络安全、电子物证;张铭(1970—),男,黑龙江,高级工程师,硕士,主要研究方向为计算机软件。

  • 基金资助:
    辽宁省自然科学基金[20180550841,2015020091];公安部理论及软科学研究计划[2016LLYJXJXY013];公安部技术研究计划[2016JSYJB06];辽宁省经济社会发展研究重大课题[2018LSLKTZD-028]

Research on Trie Tree Keyword Fast Matching Algorithm in Network Security Situational Awareness

Guotian XU1(), Ming ZHANG2   

  1. 1. Cyber Crime Investigation Department, Criminal Investigation Police University of China, Shenyang Liaoning 110854, China
    2. The Second Affiliated Hospital of Harbin Medical University, Harbin Heilongjiang 150086, China
  • Received:2018-12-10 Online:2019-04-10 Published:2020-05-11

摘要:

海量数据中关键词高速检索对增强网络安全态势感知系统反应速度,提高系统整体效率和安全性具有重要意义。基于双数组Trie树的网络信息检索算法具有较高的查找效率,但其插入时间复杂度较高,同时叶子结点占用了大量存储空间。为此,文章提出一种基于叶子结点压缩存储的双数组Trie树构造方法,按层次遍历Trie树,将分枝结点存储在基本双数组中,对叶子结点进行压缩后以位图形式存储于压缩数组中。该方法在保留双数组Trie树查询性能的同时,一定程度上提高了插入效率,改善了存储空间利用效率。

关键词: 态势感知, 双数组, Trie树, 压缩, 信息检索

Abstract:

High-speed keyword retrieval in massive data is of great significance for enhancing the response speed of network security situational awareness system and improving the overall efficiency and security of the system.The network information retrieval algorithm based on the double-array Trie-tree has higher search efficiency, but its insertion time complexity is higher, and the leaf nodes consume a lot of storage space.For this reason, this paper proposes a double-array Trie-tree construction method based on leaf node compression storage, traverses the tree hierarchically, stores the branch nodes in the basic double array, compresses the leaf nodes, and stores them in the compressed array. This method not only preserves the query performance of double-array Trie-tree, but also improves the insertion efficiency and storage space utilization efficiency.

Key words: situational awareness, double-array, Trie-tree, compression, information retrieval

中图分类号: