Netinfo Security, 2024, Vol. 24, Issue (7): 1098-1109. doi: 10.3969/j.issn.1671-1122.2024.07.011

Large Language Model-Generated Text Detection Based on Linguistic Feature Ensemble Learning

XIANG Hui, XUE Yunhao, HAO Lingxin

  1. School of Cyberspace, Hangzhou Dianzi University, Hangzhou 310018, China
  • Received: 2024-02-01 Online: 2024-07-10 Published: 2024-08-02

Abstract:

The rapid development of large language models (LLMs) has brought great convenience to daily life and work, but it has also created challenges for individuals and society. Detectors that can identify LLM-generated text are therefore urgently needed. To achieve both strong detection performance and good generalization, this paper proposes EBF detection, an LLM-generated text detection method based on linguistic feature ensemble learning. EBF detection combines a fine-tuned pre-trained language model with higher-order natural language statistical features and uses a decision mechanism to classify text as LLM-generated or not. Experimental results show that EBF detection achieves an average detection accuracy of 98.72% on in-domain data and 96.79% on out-of-domain data.
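
The abstract does not specify how the decision mechanism combines the two components, so the following is only a minimal sketch of one common ensemble choice, a weighted soft vote. It assumes each component outputs a probability that the text is LLM-generated. The function name `ensemble_decide`, the weight `w_model`, and the `threshold` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str    # "llm" or "human"
    score: float  # combined probability that the text is LLM-generated


def ensemble_decide(p_model: float, p_stats: float,
                    w_model: float = 0.7, threshold: float = 0.5) -> Detection:
    """Weighted soft vote over two probabilities of 'LLM-generated'.

    p_model: probability from a fine-tuned pre-trained language model
             (assumed to be computed elsewhere).
    p_stats: probability from a classifier over statistical features
             (assumed to be computed elsewhere).
    w_model and threshold are illustrative values, not the paper's.
    """
    for p in (p_model, p_stats, w_model, threshold):
        if not 0.0 <= p <= 1.0:
            raise ValueError("all inputs must lie in [0, 1]")
    # Convex combination of the two component probabilities.
    score = w_model * p_model + (1.0 - w_model) * p_stats
    return Detection("llm" if score >= threshold else "human", score)


if __name__ == "__main__":
    # The language model is confident, the statistical classifier is unsure.
    print(ensemble_decide(0.9, 0.5).label)
    # Both components lean towards human-written text.
    print(ensemble_decide(0.1, 0.2).label)
```

A soft vote lets a confident component outweigh an uncertain one. Other decision rules, such as hard majority voting or a learned meta-classifier, would fit the same interface.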

Key words: large language model, LLM-generated text detection, ensemble learning, linguistic feature

CLC Number: