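Netinfo Security ›› 2026, Vol. 26 ›› Issue (3): 471-481. doi: 10.3969/j.issn.1671-1122.2026.03.013
Received: 2025-08-10
Online: 2026-03-10
Published: 2026-03-30
Corresponding author:
XIAO Wen
E-mail: shauven@126.com
About the authors: XIAO Wen (born 1976), male, from Jiangxi, lecturer, Ph.D.; main research interests: information content security and digital forensics | TU Min (born 1967), female, from Jiangxi, professor, bachelor's degree; main research interests: digital forensics and network information security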
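Abstract:
Identifying key case elements is a core task in the intelligent analysis of judicial texts and is valuable in scenarios such as similar-case retrieval and judgment assistance. However, annotated data in the judicial domain is scarce, and this low-resource setting limits the performance of conventional named entity recognition methods, which depend on large-scale annotated data. This article proposes a recognition model that incorporates label semantics: entity type labels are embedded into the text encoding process as prompt information, and an interaction mechanism between label anchor vectors and contextual text vectors explicitly models the semantic association between labels and text. This strengthens the model's understanding of element-type semantics and its ability to locate element boundaries in low-resource settings. Experimental results show that the method outperforms the compared baseline models on a low-resource case dataset, confirming that label semantics enhance key element recognition and offering a new approach to low-resource information extraction in the judicial domain.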
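The abstract describes an interaction between label anchor vectors and contextual text vectors but does not give its exact form. As a rough illustration only, the sketch below assumes one common choice: a scaled dot product between each token vector and each label anchor, followed by a softmax over labels. The toy vectors, the label names (`O`, `PERSON`, `AMOUNT`) and the function `label_token_interaction` are hypothetical and are not taken from the article; in the article's model the vectors would come from an encoder that receives the label text as a prompt.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_token_interaction(token_vecs, anchor_vecs):
    """For each token, return a distribution over labels.

    Assumption: similarity is a scaled dot product between the token
    vector and each label anchor vector (not confirmed by the article).
    """
    scale = math.sqrt(len(anchor_vecs[0]))
    return [
        softmax([dot(t, a) / scale for a in anchor_vecs])
        for t in token_vecs
    ]


# Hand-made 4-dimensional toy vectors standing in for encoder outputs.
labels = ["O", "PERSON", "AMOUNT"]
anchor_vecs = [
    [0.0, 0.0, 0.0, 0.0],  # anchor for "O"
    [2.0, 0.0, 0.0, 0.0],  # anchor for "PERSON"
    [0.0, 2.0, 0.0, 0.0],  # anchor for "AMOUNT"
]
token_vecs = [
    [3.0, 0.1, 0.0, 0.0],    # token close to the PERSON anchor
    [0.1, 3.0, 0.0, 0.0],    # token close to the AMOUNT anchor
    [-1.0, -1.0, 0.5, 0.0],  # token unlike either entity anchor
]

probs = label_token_interaction(token_vecs, anchor_vecs)
predicted = [labels[max(range(len(p)), key=p.__getitem__)] for p in probs]
print(predicted)  # ['PERSON', 'AMOUNT', 'O']
```

Because every token is scored against the same small set of label anchors, the label text itself carries part of the supervision, which is the intuition the abstract gives for why label semantics help when few annotated examples are available.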
Cite this article: XIAO Wen, TU Min. Key Element Identification of Low-Resource Cases with Label Semantic Enhancement[J]. Netinfo Security, 2026, 26(3): 471-481.
Table 3
Comparison of model metrics across training-set sizes
| Model / training samples | 30: P | 30: R | 30: F1 | 50: P | 50: R | 50: F1 | 100: P | 100: R | 100: F1 |
|---|---|---|---|---|---|---|---|---|---|
| BERT-CE | 67.79% | 70.94% | 69.33% | 73.42% | 76.40% | 74.88% | 77.09% | 81.64% | 79.30% |
| BERT-CRF | 66.37% | 70.87% | 68.54% | 74.25% | 78.69% | 76.41% | 79.61% | 81.49% | 80.54% |
| OVA-AUC | 64.21% | 72.29% | 68.01% | 69.86% | 79.12% | 74.21% | 76.31% | 81.72% | 78.92% |
| SMART-SPAN | 75.30% | 69.03% | 72.03% | 84.32% | 77.53% | 80.78% | 86.31% | 80.31% | 83.20% |
| PMRC | 77.53% | 66.99% | 71.88% | 82.80% | 75.97% | 79.24% | 88.10% | 80.83% | 84.30% |
| Proposed model | 73.02% | 71.60% | 72.30% | 84.79% | 79.85% | 82.25% | 85.96% | 84.71% | 85.33% |

| Model / training samples | 200: P | 200: R | 200: F1 | 300: P | 300: R | 300: F1 | 500: P | 500: R | 500: F1 |
|---|---|---|---|---|---|---|---|---|---|
| BERT-CE | 80.66% | 83.04% | 81.83% | 84.32% | 86.43% | 85.36% | 85.48% | 87.68% | 86.57% |
| BERT-CRF | 82.12% | 84.66% | 83.37% | 85.52% | 86.21% | 85.86% | 87.04% | 88.64% | 87.83% |
| OVA-AUC | 79.38% | 82.51% | 80.91% | 83.44% | 86.35% | 84.87% | 83.60% | 88.34% | 85.91% |
| SMART-SPAN | 88.94% | 82.50% | 85.59% | 88.61% | 82.58% | 85.49% | 89.65% | 86.41% | 88.00% |
| PMRC | 87.53% | 83.50% | 85.47% | 87.76% | 83.50% | 85.57% | 88.92% | 85.68% | 87.27% |
| Proposed model | 86.95% | 85.68% | 86.31% | 88.75% | 86.17% | 87.44% | 88.11% | 88.10% | 88.11% |
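
The F1 values in the table above appear to follow the standard harmonic mean of precision and recall, F1 = 2PR / (P + R). The short check below recomputes F1 for three rows copied from the table. The helper name `f1_score`, the row labels and the 0.01 tolerance (to allow for rounding to two decimals) are choices made for this sketch, not part of the article.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R, reported F1) copied from the table; labels are descriptive only.
rows = {
    "BERT-CE, 30 samples": (67.79, 70.94, 69.33),
    "Proposed model, 30 samples": (73.02, 71.60, 72.30),
    "Proposed model, 100 samples": (85.96, 84.71, 85.33),
}

# Difference between recomputed and reported F1 for each row.
deviations = {
    name: abs(f1_score(p, r) - reported)
    for name, (p, r, reported) in rows.items()
}

for name, dev in deviations.items():
    status = "consistent" if dev < 0.01 else "mismatch"
    print(f"{name}: {status}")
```

All three rows agree with the reported F1 to within rounding, which suggests the table reports F1 computed directly from the listed P and R rather than averaged separately.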