Netinfo Security, 2021, Vol. 21, Issue (12): 118-125. doi: 10.3969/j.issn.1671-1122.2021.12.016


Tor Anonymous Traffic Identification Method Based on Weighted Stacking Ensemble Learning

WANG Xirui, LU Tianliang, ZHANG Jianling, DING Meng

  1. College of Information Technology and Internet Security, People’s Public Security University of China, Beijing 100038, China
  • Received: 2021-08-16 Online: 2021-12-10 Published: 2022-01-11
  • Contact: LU Tianliang E-mail: lutianliang@ppsuc.edu.cn

Abstract:

The Tor network is often exploited by criminals for illegal activities, so identifying Tor traffic efficiently is important for network supervision and crime fighting. Drawing on ensemble learning, this paper proposes a weighted Stacking model for Tor traffic identification to address the scarcity of Tor traffic and the low recognition accuracy seen in real environments. Time-related features are extracted from each data flow, and the 14 features with the highest information gain are selected to form the input data set. KNN, SVM, and XGBoost are assigned different weights and used as base learners, and XGBoost is used as the meta-learner to build a two-layer Stacking model. In a comparison with 10 algorithms on a public data set, the proposed model achieves higher accuracy than most of them and a lower miss rate, which better suits the goal of Tor traffic identification in real network environments.
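
The feature selection step described in the abstract, ranking features by information gain and keeping the highest-scoring ones, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`entropy`, `information_gain`, `top_k_features`) and the toy feature names are hypothetical, and the sketch assumes each feature has already been discretized into a small number of bins, since continuous flow timing features would need binning first.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Information gain of one discretized feature with respect to the labels.

    IG = H(labels) - sum over feature values v of P(v) * H(labels | v)
    """
    total = len(labels)
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def top_k_features(columns, labels, k):
    """Return the names of the k features with the highest information gain.

    `columns` maps a (hypothetical) feature name to its list of binned values.
    """
    gains = {name: information_gain(vals, labels) for name, vals in columns.items()}
    return sorted(gains, key=gains.get, reverse=True)[:k]


if __name__ == "__main__":
    # Toy data: four flows, two labelled Tor and two labelled non-Tor.
    labels = ["tor", "tor", "non", "non"]
    columns = {
        "binned_feature_a": [0, 0, 1, 1],  # separates the classes perfectly
        "binned_feature_b": [0, 1, 0, 1],  # carries no class information
    }
    print(top_k_features(columns, labels, k=1))
```

With the paper's setting, `k` would be 14 and `columns` would hold the binned time-related flow features; the selected columns would then feed the base learners.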

Key words: anonymous network, Tor, unbalanced data, Stacking
