Netinfo Security (信息网络安全) ›› 2016, Vol. 16 ›› Issue (11): 45-51. doi: 10.3969/j.issn.1671-1122.2016.11.008


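Anomalous Traffic Detection Based on Traffic Behavior Characteristics

Yangrui HU, Xingshu CHEN, Junfeng WANG, Xiaoming YE

  1. College of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China
  • Received: 2016-07-01 Online: 2016-11-20 Published: 2020-05-13
  • About the authors:

    Yangrui HU (born 1991), male, from Sichuan, master's student; main research interests: information security and data analysis. Xingshu CHEN (born 1968), female, from Sichuan, professor, Ph.D.; main research interests: big data, cloud security, and network security. Junfeng WANG (born 1976), male, from Sichuan, professor, Ph.D.; main research interests: spatial information networks and intelligent transportation. Xiaoming YE (born 1981), female, from Sichuan, doctoral student; main research interests: network traffic analysis based on big data security.

  • Supported by:
    National Natural Science Foundation of China [61272447]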

Abstract:

Real network environments lack labeled data sets, so traditional anomalous traffic detection methods that depend on labeled data are unsuitable for actual large-scale networks. To address this, the paper proposes an unsupervised anomalous traffic detection method for unlabeled data, based on an improved k-means algorithm. First, traffic at the Sichuan University network egress is collected and stored in a distributed file system. Second, a user behavior feature set is constructed from an analysis of this traffic, and the relevant features are extracted on the Spark big data processing platform. Normal-behavior clusters in the actual traffic are defined by reference to group principles, and a normal traffic behavior model is built with an improved k-means++ cosine clustering method. Finally, the cosine distance between a user's actual traffic behavior and the normal behavior model is computed, and behavior that deviates from the model is identified as anomalous. Feature extraction, the improved k-means algorithm, and anomaly detection are all implemented on Spark. Attack experiments verify the feasibility and effectiveness of the method, and the results show that it detects anomalous traffic behavior with relatively high accuracy and sensitivity.

Key words: big data, anomaly traffic detection, k-means
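The detection idea summarized in the abstract (cluster normal behavior by cosine distance, then flag behavior that lies far from every cluster center) can be illustrated with a minimal single-machine sketch in Python. This is not the authors' Spark implementation, and it does not reproduce their specific improvement to k-means++, which the abstract does not describe; it uses plain k-means++ style seeding with cosine distance. The three features, the sample values, and the threshold of 0.1 are hypothetical and chosen only for illustration.

```python
import math
import random


def normalize(vector):
    """Scale a feature vector to unit length; dot products then equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("a zero vector has no direction")
    return [x / norm for x in vector]


def cosine_distance(a, b):
    """Cosine distance 1 - cos(a, b), assuming a and b are unit-length."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


def nearest_distance(point, centers):
    """Cosine distance from a unit-length point to its closest center."""
    return min(cosine_distance(point, c) for c in centers)


def seed_centers(points, k, rng):
    """k-means++ style seeding: favor points far from the centers chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [nearest_distance(p, centers) ** 2 for p in points]
        if sum(weights) == 0.0:  # every point coincides with a chosen center
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def fit_normal_model(samples, k, iterations=20, seed=0):
    """Cluster normal-behavior samples by cosine distance; return unit-length centers."""
    rng = random.Random(seed)
    points = [normalize(s) for s in samples]
    centers = seed_centers(points, k, rng)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: cosine_distance(p, centers[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                if any(mean):  # keep the old center if the mean degenerates to zero
                    centers[i] = normalize(mean)
    return centers


def deviation(sample, centers):
    """Deviation of a sample from the normal model: distance to the nearest center."""
    return nearest_distance(normalize(sample), centers)


def is_anomalous(sample, centers, threshold):
    """Flag a sample whose deviation from the normal model exceeds the threshold."""
    return deviation(sample, centers) > threshold


# Hypothetical per-host features:
# [flow count, distinct destination IPs, distinct destination ports]
normal_samples = [
    [100, 5, 2], [120, 6, 3], [90, 4, 2], [110, 5, 2],
    [20, 20, 3], [25, 22, 4], [18, 19, 3], [22, 21, 3],
]
centers = fit_normal_model(normal_samples, k=2)
THRESHOLD = 0.1  # hypothetical cut-off, not taken from the paper

for sample in ([110, 6, 2], [50, 1, 500]):
    print(sample, round(deviation(sample, centers), 4),
          is_anomalous(sample, centers, THRESHOLD))
```

Because cosine distance compares the direction of feature vectors rather than their magnitude, a host that sends more traffic with the same mix of behavior stays close to its cluster, while a host whose feature proportions change sharply (for example, many destination ports relative to flows) moves away from every center. In practice the threshold would need to be calibrated on real traffic.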

CLC Number: