Netinfo Security ›› 2021, Vol. 21 ›› Issue (8): 1-9. doi: 10.3969/j.issn.1671-1122.2021.08.001

• Classified Protection •

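Design and Implementation of an Abnormal IP Identification System Based on Traffic Feature Classification

WEN Weiping, HU Yezhou, ZHAO Guoliang, CHEN Xiarun

  1. School of Software and Microelectronics, Peking University, Beijing 100080, China
  • Received: 2021-04-12 Online: 2021-08-10 Published: 2021-09-01
  • Contact: WEN Weiping E-mail: weipingwen@ss.pku.edu.cn
  • About the authors: WEN Weiping (b. 1976), male, from Hunan, professor, Ph.D.; research interests include network attack and defense, software vulnerability analysis, malicious code research, reverse engineering of information systems, and trusted computing | HU Yezhou (b. 1995), male, from Henan, master's student; research interests include abnormal network traffic identification and blockchain security | ZHAO Guoliang (b. 1991), male, from Shandong, master's student; research interests include malicious code research and abnormal network traffic identification | CHEN Xiarun (b. 1997), male, from Jiangxi, master's student; research interests include software vulnerability analysis and malicious code research
  • Supported by:
    National Natural Science Foundation of China (61872011)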

Abstract:

Abnormal IP identification is an important means of tracking malicious hosts and an active topic in network security research. Existing machine-learning approaches to abnormal IP identification mostly rely on network-wide traffic, so they break down when only the traffic of a single server is available, and they also face the high cost of labeling data. To address these problems, this paper applies clustering and genetic algorithms to the identification and classification of abnormal peer IP hosts. Using multidimensional network traffic features together with the IP address feature data observable on a single host, it combines unsupervised and semi-supervised learning to identify and detect abnormal peer IPs, and implements the method as an abnormal IP identification system. In experiments, the system identifies 9 different types of malicious IPs in the UNSW-NB15 dataset, with a recognition accuracy of up to 98.84%. The method is effective for malicious IP classification, can identify previously unknown types of malicious IPs, is broadly applicable and robust, and has been deployed in the traffic identification system of a national network security center.

Key words: malicious hosts, classification algorithm, host identification, weight vector
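
As a rough illustration of how clustering, a genetic algorithm, and a feature weight vector could fit together, the sketch below aggregates flows seen on one host into per-peer-IP feature vectors, clusters them under a weighted distance, and uses a small genetic algorithm to tune the weights against a handful of labeled IPs. The abstract does not specify the paper's features, fitness function, or algorithm details, so every name here (`aggregate_by_ip`, `weighted_kmeans`, `evolve_weights`), the three features, and the toy flows are assumptions made for illustration only.

```python
import random
from collections import defaultdict

def aggregate_by_ip(flows):
    """Collapse (remote_ip, dst_port, n_bytes) flows into one vector per IP.
    Features (flow count, mean bytes, distinct ports) are illustrative."""
    acc = defaultdict(lambda: {"n": 0, "bytes": 0, "ports": set()})
    for ip, port, n_bytes in flows:
        acc[ip]["n"] += 1
        acc[ip]["bytes"] += n_bytes
        acc[ip]["ports"].add(port)
    return {ip: [a["n"], a["bytes"] / a["n"], len(a["ports"])] for ip, a in acc.items()}

def normalize(vectors):
    """Min-max scale each feature to [0, 1] so the weights are comparable."""
    cols = list(zip(*vectors))
    low = [min(c) for c in cols]
    span = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / s for v, lo, s in zip(vec, low, span)] for vec in vectors]

def wdist(a, b, w):
    """Squared Euclidean distance with one weight per feature."""
    return sum(wi * (x - y) ** 2 for wi, x, y in zip(w, a, b))

def assign(points, centers, w):
    return [min(range(len(centers)), key=lambda j: wdist(p, centers[j], w)) for p in points]

def weighted_kmeans(points, w, k, iters=20):
    """k-means under the weighted distance, with deterministic farthest-point init."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(wdist(p, c, w) for c in centers)))
    for _ in range(iters):
        labels = assign(points, centers, w)
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return assign(points, centers, w)

def fitness(w, points, known, k):
    """Share of labeled pairs whose same-cluster status matches their
    same-label status. Needs at least two labeled points."""
    labels = weighted_kmeans(points, w, k)
    idx = sorted(known)
    pairs = [(i, j) for n, i in enumerate(idx) for j in idx[n + 1:]]
    agree = sum((labels[i] == labels[j]) == (known[i] == known[j]) for i, j in pairs)
    return agree / len(pairs)

def evolve_weights(points, known, k, pop_size=12, generations=15, seed=0):
    """Tiny genetic algorithm over weight vectors (needs >= 2 features)."""
    rng = random.Random(seed)
    dim = len(points[0])
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: fitness(w, points, known, k), reverse=True)
        parents = pop[: pop_size // 2]            # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, dim)           # one-point crossover
            child = a[:cut] + b[cut:]
            g = rng.randrange(dim)                # mutate one gene, clamp to [0, 1]
            child[g] = min(1.0, max(0.0, child[g] + rng.gauss(0, 0.2)))
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda w: fitness(w, points, known, k))

# Toy traffic: three ordinary clients and two scanner-like hosts (made up).
flows = []
for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):
    flows += [(ip, 443, 1500), (ip, 443, 1800)]
for ip in ("10.0.9.1", "10.0.9.2"):
    flows += [(ip, port, 60) for port in range(20, 60)]

features = aggregate_by_ip(flows)
ips = sorted(features)
points = normalize([features[ip] for ip in ips])
# Only two IPs carry labels; the rest are grouped by clustering alone.
known = {ips.index("10.0.0.1"): "benign", ips.index("10.0.9.1"): "malicious"}
weights = evolve_weights(points, known, k=2)
clusters = dict(zip(ips, weighted_kmeans(points, weights, k=2)))
```

On this deliberately easy toy data the unlabeled IPs fall into the same clusters as the labeled IPs they resemble. Real traffic such as UNSW-NB15 would need many more features, more clusters, and a more careful fitness measure than this pairwise agreement score.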

CLC Number: