Netinfo Security, 2022, Vol. 22, Issue (1): 64-71. doi: 10.3969/j.issn.1671-1122.2022.01.008

Intrusion Detection Model Based on Extra Trees-recursive Feature Elimination and LightGBM

HE Hongyan1,2, HUANG Guoyan1,2, ZHANG Bing1,2, JIA Damiao1,2

  1. Department of Information Science and Engineering, Yanshan University, Qinhuangdao 066001, China
  2. Hebei Key Laboratory of Software Engineering, Qinhuangdao 066001, China
  • Received: 2021-08-24  Online: 2022-01-10  Published: 2022-02-16
  • Contact: HUANG Guoyan  E-mail: hgy@ysu.edu.cn

Abstract:

High data dimensionality, class imbalance, and large dispersion in intrusion detection datasets seriously degrade classification performance. This paper proposes an intrusion detection method based on extra trees-recursive feature elimination (ET-RFE) and LightGBM (LGBM). First, the network data are reconstructed with one-hot encoding, and the attack classes with few samples are balanced at the data level. Second, recursive feature elimination with extra trees (ET) as the base estimator is used for feature selection and dimensionality reduction of the traffic features, to find the optimal feature subset carrying the most information. Finally, the optimal feature subset is used as the input dataset for LGBM classification training, and a Bayesian optimization algorithm is used to tune the LGBM parameters. On the real network traffic dataset UNSW-NB15, comparisons with random forest (RF), XGBoost, and GALR-DT show that the proposed method effectively improves the detection rate and achieves effective recall on attack types with few samples.
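
The feature selection step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's `RFE` wrapped around `ExtraTreesClassifier`, uses synthetic data in place of UNSW-NB15, and picks the number of trees, the elimination step, and the target feature count arbitrarily. The helper name `et_rfe_select` is hypothetical. One-hot encoding, class balancing, LGBM training, and Bayesian tuning are omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import RFE


def et_rfe_select(X, y, n_features, seed=0):
    """Recursive feature elimination driven by extra-trees importances.

    Hypothetical helper: returns a boolean mask of kept features and
    the elimination ranking (1 = kept, larger = dropped earlier).
    """
    # Assumed settings: 50 trees, one feature removed per round.
    estimator = ExtraTreesClassifier(n_estimators=50, random_state=seed)
    selector = RFE(estimator, n_features_to_select=n_features, step=1)
    selector.fit(X, y)
    return selector.support_, selector.ranking_


# Synthetic stand-in for encoded traffic features (not UNSW-NB15).
X, y = make_classification(
    n_samples=300,
    n_features=12,
    n_informative=4,
    n_redundant=0,
    random_state=0,
)

mask, ranking = et_rfe_select(X, y, n_features=4)
X_selected = X[:, mask]  # reduced feature matrix for a downstream classifier
print("kept feature indices:", np.flatnonzero(mask))
```

In a pipeline like the one described, `X_selected` would then be passed to the classifier. The size of the feature subset would normally be chosen by validation rather than fixed in advance.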

Key words: class imbalance, intrusion detection, LightGBM, recursive feature elimination
