Netinfo Security, 2017, Vol. 17, Issue (3): 66-71. doi: 10.3969/j.issn.1671-1122.2017.03.011

• Original Article •

URL Classification Method Based on AdaBoost and Bayes Algorithm

Tengfei ZHANG, Qian ZHANG, Jiayong LIU

  1. College of Electronic and Information Engineering, Sichuan University, Chengdu, Sichuan 610065, China
  • Received: 2016-11-01 Online: 2017-03-20 Published: 2020-05-12

Abstract:

To analyze user behavior in HTTP data streams, the URLs that users actually access must first be identified. This paper proposes a method that combines rule filtering with a machine learning algorithm to quickly identify user-accessed URLs. First, the data packets are parsed, and packets that merely load page resources are filtered out according to the URL suffix. Second, browser traffic is identified by the User-Agent field, which is characteristic of requests issued by web browsers. Finally, a classifier trained with the AdaBoost and Bayes algorithms identifies the user-accessed URLs. Experimental results show that the method efficiently and accurately identifies user-accessed URLs in local area network data streams.
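The two rule-filtering stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the suffix list, the `Mozilla/` browser token, and the function names `is_resource_url`, `is_browser_agent`, and `candidate_urls` are all assumptions made for illustration, and the AdaBoost and Bayes classification stage is not shown.

```python
from urllib.parse import urlsplit

# Assumed list of suffixes that indicate embedded page resources;
# the abstract does not give the actual list.
RESOURCE_SUFFIXES = (".css", ".js", ".png", ".jpg", ".jpeg", ".gif", ".ico", ".svg")

# Assumed marker of a browser User-Agent string; the abstract does not
# specify which browser characteristics are checked.
BROWSER_TOKENS = ("Mozilla/",)


def is_resource_url(url):
    """Return True if the URL path ends with a known resource suffix."""
    path = urlsplit(url).path.lower()
    return path.endswith(RESOURCE_SUFFIXES)


def is_browser_agent(user_agent):
    """Return True if the User-Agent value looks like a web browser."""
    return any(token in user_agent for token in BROWSER_TOKENS)


def candidate_urls(requests):
    """Keep URLs that pass both rule filters.

    `requests` is an iterable of (url, user_agent) pairs. The surviving
    URLs would then be passed to a trained classifier.
    """
    return [
        url
        for url, user_agent in requests
        if not is_resource_url(url) and is_browser_agent(user_agent)
    ]
```

In this sketch the suffix check is applied to the URL path only, so a query string such as `?v=3` does not hide a resource suffix from the filter.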

Key words: rule filtering, machine learning algorithm, URL classification

CLC Number: