
基于倒排列表的网流索引检索与压缩方法

陈震, 刘洪健

  • Funding:
    National Natural Science Foundation of China, A3 Key Fund [61161140320]; National Key Basic Research Program of China (973 Program) (2012CB315800)

A Method of Net Flow Index Retrieval and Compression based on Inverted List

CHEN Zhen, LIU Hong-jian

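  • About author: CHEN Zhen: Research Institute of Information Technology, Tsinghua University, Beijing 100084; Tsinghua National Laboratory for Information Science and Technology, Beijing 100084. LIU Hong-jian: School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876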

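Summary: With the widespread use of computers and the rapid development of the Internet, Internet traffic is growing explosively. To cope with increasingly serious network abuse and network security incidents, and to meet the needs of security forensics, Internet traffic must be collected, stored, and analyzed. Monitoring Internet traffic requires timely statistics on the source address, destination address, source port, destination port, protocol, timestamp, and other attributes of network flows, to support traffic statistics and comprehensive analysis. However, the volume of flow information is massive, and quickly retrieving the relevant flows is a challenging problem. In search engines, the inverted index is the key technique for fast retrieval over massive data. This paper applies the inverted index method and index compression algorithms from search engines to Internet flow information retrieval. Experimental tests and verification show that, in flow information retrieval, the inverted index and index compression algorithms can effectively improve retrieval speed.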

Abstract: With the pervasive use of computers and the Internet, the volume of Internet traffic is increasing dramatically. Traffic monitoring is essential for network security and forensic traffic analysis. To monitor traffic, one records per-flow information such as source IP address, destination IP address, source port, destination port, protocol, and timestamp. With this information, one can compile traffic statistics and conduct further analysis, for example of attack patterns. However, the amount of flow information grows very quickly, and searching for a specified IP address is inefficient if the flow information is not fully indexed. The inverted index is the key method behind practical search engines. This paper therefore applies the inverted index and index compression algorithms to net flow information retrieval. Analysis and experiments show that the inverted index method is feasible for flow information retrieval and improves query performance as expected.
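
To make the idea concrete, the following is a minimal sketch of an inverted index over flow records with compressed posting lists. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not name the compression algorithm, so delta encoding followed by variable-byte coding is assumed here, and the field names (`src_ip`, `dst_port`, and so on), the sample flows, and all function names are hypothetical.

```python
from collections import defaultdict

def build_index(flows):
    """Map each (field, value) pair to a sorted posting list of flow IDs."""
    index = defaultdict(list)
    for flow_id, flow in enumerate(flows):
        for field, value in flow.items():
            index[(field, value)].append(flow_id)  # IDs are appended in increasing order
    return index

def vbyte_encode(postings):
    """Delta-encode a sorted posting list, then variable-byte encode each gap.
    (Assumed scheme; the abstract does not specify the compression algorithm.)"""
    out = bytearray()
    prev = 0
    for flow_id in postings:
        gap = flow_id - prev
        prev = flow_id
        while gap >= 128:
            out.append(gap & 0x7F)  # low 7 bits; more bytes follow
            gap >>= 7
        out.append(gap | 0x80)      # high bit set marks the last byte of this gap
    return bytes(out)

def vbyte_decode(data):
    """Invert vbyte_encode: rebuild the gaps and accumulate them into flow IDs."""
    postings = []
    value, shift, prev = 0, 0, 0
    for byte in data:
        if byte & 0x80:             # final byte of the current gap
            value |= (byte & 0x7F) << shift
            prev += value
            postings.append(prev)
            value, shift = 0, 0
        else:
            value |= byte << shift
            shift += 7
    return postings

def query(compressed, **conditions):
    """Return IDs of flows matching every field=value condition (AND query)."""
    result = None
    for field, value in conditions.items():
        postings = set(vbyte_decode(compressed.get((field, value), b"")))
        result = postings if result is None else result & postings
    return sorted(result or [])

# Hypothetical sample flows, for illustration only.
flows = [
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.9", "src_port": 51000, "dst_port": 80, "protocol": "tcp"},
    {"src_ip": "10.0.0.2", "dst_ip": "10.0.0.9", "src_port": 51001, "dst_port": 443, "protocol": "tcp"},
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.8", "src_port": 51002, "dst_port": 80, "protocol": "tcp"},
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.9", "src_port": 51003, "dst_port": 53, "protocol": "udp"},
]
compressed = {key: vbyte_encode(ids) for key, ids in build_index(flows).items()}
matches = query(compressed, src_ip="10.0.0.1", dst_port=80)
```

A query such as `query(compressed, src_ip="10.0.0.1", dst_port=80)` decodes only the posting lists for the requested values and intersects them, rather than scanning every flow record. Storing gaps instead of absolute IDs keeps most encoded values small, which is why variable-byte coding tends to shrink long posting lists.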