Netinfo Security (信息网络安全) ›› 2016, Vol. 16 ›› Issue (11): 33-39. doi: 10.3969/j.issn.1671-1122.2016.11.006

Cluster Anomaly Detection Algorithm Based on Multi-windows Mechanism

Mingliang HE1, Zemao CHEN1, Jin ZUO2

  1. Information Security Department, Naval University of Engineering, Wuhan, Hubei 430033, China
  2. 91428 Troops of PLA, Ningbo, Zhejiang 315000, China
  • Received: 2016-07-01 Online: 2016-11-20 Published: 2020-05-13
  • About the authors:
    Mingliang HE (born 1988), male, from Hubei, master's student; main research interest: information security. Zemao CHEN (born 1975), male, from Fujian, professor, Ph.D.; main research interest: network security. Jin ZUO (born 1989), male, from Sichuan, master's degree; main research interest: information security.
  • Supported by:
    Natural Science Foundation of Hubei Province [2015CF867]

Abstract:

This paper analyzes the shortcomings of single-window clustering anomaly detection and designs a data stream anomaly detection algorithm based on a multi-window mechanism. The algorithm combines weights, similarity, and local density to perform affiliation search and anomaly merging on the potential anomalies found in each single window. First, within each window, an improved K-means clustering algorithm performs a preliminary clustering detection on the preprocessed data stream and divides the result into a set of normal clusters and a set of potential anomalies. The single-window result is then judged a second time: a similarity measure is used to search for a normal cluster to which each potential anomaly may belong, which removes false alarms, and local density is used to merge the remaining potential anomalies, which excludes further points that are likely to be normal. Finally, time weights are used to combine the detection results of several data stream windows into the final set of anomalies. Simulation experiments show that, compared with the single-window data stream anomaly detection algorithm, the proposed algorithm raises the detection rate and reduces false alarms, giving it an advantage in both detection rate and false alarm rate.
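The pipeline described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract gives no formulas, so plain K-means with farthest-point seeding stands in for the unspecified improved K-means, Euclidean distance stands in for the similarity measure, and every function name and threshold (`min_size`, `join_radius`, `eps`, `min_neighbors`, `decay`, `clear_at`) is hypothetical. In particular, treating densely grouped suspects as normal and clearing a suspect when earlier windows give it enough time-weighted support are guesses at how the anomaly merging and time weighting might work.

```python
import math

def kmeans(points, k, iters=20):
    """Plain K-means with farthest-point seeding (stand-in for the improved variant)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    groups = []
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its members; keep it if the group is empty.
        centers = [tuple(sum(x) / len(g) for x in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
    return centers, groups

def near_any(p, centers, radius):
    """True if p lies within radius of at least one cluster center."""
    return any(math.dist(p, c) <= radius for c in centers)

def detect_window(points, k=3, min_size=3, join_radius=1.5, eps=1.0, min_neighbors=2):
    """Single-window step: returns (normal cluster centers, potential anomalies)."""
    centers, groups = kmeans(points, k)
    # Assumption: clusters with fewer than min_size members are potential anomalies.
    normal_centers = [c for c, g in zip(centers, groups) if len(g) >= min_size]
    suspects = [p for g in groups if len(g) < min_size for p in g]
    # Affiliation search: a suspect close to a normal cluster is treated as normal.
    suspects = [p for p in suspects if not near_any(p, normal_centers, join_radius)]
    # Anomaly merging (assumed rule): suspects with enough suspect neighbors within
    # eps form a dense group and are excluded as probably normal.
    merged = {i for i, p in enumerate(suspects)
              if sum(math.dist(p, q) <= eps
                     for j, q in enumerate(suspects) if j != i) >= min_neighbors}
    suspects = [p for i, p in enumerate(suspects) if i not in merged]
    return normal_centers, suspects

def detect_stream(windows, decay=0.5, clear_at=0.5, join_radius=1.5, **kw):
    """Multi-window step: clear suspects that earlier windows support as normal."""
    results = [detect_window(w, join_radius=join_radius, **kw) for w in windows]
    anomalies = []
    for t, (_, suspects) in enumerate(results):
        for p in suspects:
            # Time weight (assumed): a window s steps back contributes decay ** s.
            support = sum(decay ** (t - s) for s in range(t)
                          if near_any(p, results[s][0], join_radius))
            if support < clear_at:
                anomalies.append((t, p))
    return anomalies

# Window 0 holds two well-populated clusters; window 1 holds one cluster, a lone
# point near the second cluster of window 0, and an isolated point.
w0 = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2),
      (10.0, 10.0), (10.2, 10.1), (10.1, 10.3), (9.9, 10.2)]
w1 = [(0.1, 0.1), (0.2, 0.3), (0.3, 0.1), (0.0, 0.2), (10.0, 10.1), (5.0, -6.0)]
print(detect_stream([w0, w1]))  # [(1, (5.0, -6.0))]
```

In this toy run, window 1 on its own flags both `(10.0, 10.1)` and `(5.0, -6.0)`. The first is cleared because window 0 contains a normal cluster nearby, which is the kind of false alarm the multi-window step is meant to remove.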

Key words: single window, multi-window, data stream, anomaly detection

CLC Number: