Netinfo Security ›› 2025, Vol. 25 ›› Issue (8): 1231-1239. doi: 10.3969/j.issn.1671-1122.2025.08.005


WebShell Detection Method Based on Multi-Dimensional Features and LightGBM-AdaBoost

GAO Jian1, HE Junpeng1,2, MIAO Qingqing1

  1. School of Information Network Security, People's Public Security University of China, Beijing 100038, China
  2. Zigong Municipal Public Security Bureau, Zigong 643002, China
  • Received: 2025-06-05 Online: 2025-08-10 Published: 2025-09-09

Abstract:

Traditional text-based detection methods identify WebShell files with low accuracy, and existing machine learning and deep learning approaches focus mainly on PHP WebShells and draw on a narrow set of features. To address these limitations, this paper constructs a high-dimensional feature space that combines file-intrinsic features, official standard features, and BERT-based semantic features. A LightGBM-AdaBoost ensemble detection model is then designed to distinguish benign files from WebShells in complex language scenarios where simple features fall short. The proposed method efficiently detects both PHP and JSP WebShells. Experimental results show detection accuracies of 99.81% for PHP WebShells and 98.93% for JSP WebShells. Compared with existing methods, the approach significantly improves detection accuracy and extends the range of WebShell types that can be detected.
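
As a rough illustration of how several feature groups can be joined into one feature vector, the sketch below concatenates three groups for a single file. It is not the paper's implementation: the abstract does not list the concrete features, so the statistics in `intrinsic_features`, the `SENSITIVE_CALLS` list (a hypothetical stand-in for the second feature group), and the precomputed `semantic_embedding` argument (standing in for a BERT output) are all assumptions. How the LightGBM and AdaBoost models are combined is likewise not stated in the abstract and is not shown here.

```python
import math
from collections import Counter
from typing import List, Sequence


def shannon_entropy(text: str) -> float:
    """Shannon entropy in bits per character; 0.0 for empty input."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def intrinsic_features(source: str) -> List[float]:
    """Assumed file-level statistics: size, longest line, character entropy."""
    lines = source.splitlines() or [""]
    return [
        float(len(source)),
        float(max(len(line) for line in lines)),
        shannon_entropy(source),
    ]


# Hypothetical list of call names; the paper's actual feature list is not given.
SENSITIVE_CALLS = ("eval", "system", "exec", "base64_decode")


def keyword_features(source: str) -> List[float]:
    """Count occurrences of each assumed call name followed by '('."""
    lowered = source.lower()
    return [float(lowered.count(name + "(")) for name in SENSITIVE_CALLS]


def build_feature_vector(source: str, semantic_embedding: Sequence[float]) -> List[float]:
    """Concatenate the three feature groups into one flat vector.

    semantic_embedding is assumed to be computed elsewhere (e.g. by a BERT model).
    """
    return intrinsic_features(source) + keyword_features(source) + list(semantic_embedding)


sample = '<?php eval(base64_decode($_POST["x"])); ?>'
vector = build_feature_vector(sample, [0.1, 0.2])
```

A vector built this way could then be passed to any tabular classifier. Keeping each feature group in its own function makes it straightforward to add or remove a group when comparing feature sets.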

Key words: WebShell detection, multi-dimensional features, LightGBM algorithm, AdaBoost algorithm

CLC Number: