Netinfo Security ›› 2026, Vol. 26 ›› Issue (4): 615-625. doi: 10.3969/j.issn.1671-1122.2026.04.009


Research on Multi-Strategy Enhanced Chinese Network Threat Intelligence Entity Extraction Based on Large Language Model

HU Mianning1, LI Xin1,2,3, LI Mingfeng1, YUAN Deyu1,2,3

    1 School of Information and Network Security, People’s Public Security University of China, Beijing 100038, China
    2 Key Laboratory of Security Technology and Risk Assessment, Ministry of Public Security, Beijing 100038, China
    3 Public Security Big Data Strategy Research Center of the People’s Public Security University of China, Beijing 100038, China
  • Received: 2024-12-21  Online: 2026-04-10  Published: 2026-04-29

Abstract:

As the cyberspace environment grows more complex, network security defense driven by cyber threat intelligence is becoming increasingly important. This article addresses two problems in the field of Chinese cyber threat intelligence: the shortage of available data, and the inefficiency of Chinese word segmentation and entity extraction. It studies entity extraction for Chinese cyber threat intelligence enhanced by multiple strategies based on a large language model, with the aim of supporting knowledge graph construction for cyber threat intelligence and intelligence-driven defense. The article builds a self-annotated entity dataset of Chinese cyber threat intelligence and applies multi-strategy data augmentation to improve extraction accuracy. MECT is then run on the augmented datasets in horizontal and vertical comparative experiments against models such as LGN, LR_CNN, and Lattice_LSTM. The results show that named entity recognition performance improves by nearly 10%. The experiments validate the effectiveness of multi-strategy data augmentation based on large language models for Chinese cyber threat intelligence entity extraction, and demonstrate its reliability and practicality for this task.

Key words: entity extraction, data augmentation, Chinese cyber threat intelligence, large language model

CLC Number: