Netinfo Security, 2024, Vol. 24, Issue (11): 1773-1782. doi: 10.3969/j.issn.1671-1122.2024.11.016
Received: 2024-06-10
Online: 2024-11-10
Published: 2024-11-21
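Corresponding author: QIN Zhenkai
About the authors:
QIN Zhenkai (b. 1996), male, from Guangxi, senior engineer, master's degree, CCF member; main research interests: knowledge graphs and large language models.
XU Mingchao (b. 2002), male, from Guangxi, undergraduate; main research interest: knowledge graphs.
JIANG Ping (b. 1981), female, from Guangxi, professor, master's degree, CCF member; main research interest: natural language processing.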
QIN Zhenkai1,2, XU Mingchao1,3, JIANG Ping1,2
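Abstract:
To address the low efficiency and long turnaround of traditional case handling and analysis methods, this paper proposes a method for constructing a case knowledge graph. The aim is to improve case-handling efficiency, increase the depth and breadth of case analysis, and give public security officers more comprehensive and accurate case information. First, the CasePrompt prompt-learning method is integrated into the OneKE large language model to produce a case event extraction model. Next, a concept-layer schema for the knowledge graph is built from case-domain data, and the case event extraction model is used to perform entity extraction. Finally, the structured case data are converted into triples and stored in the Neo4j graph database, completing the construction of a prompt-learning-based case knowledge graph. Experimental results show that the large model fine-tuned with prompt learning achieves better event extraction performance than traditional deep learning models. It can effectively identify and extract event information from case text, which supports the construction of a high-quality case knowledge graph and thereby improves the efficiency of case analysis.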
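The final step of the pipeline, turning structured case data into triples for Neo4j, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the event record layout, the relation names (`HAS_TYPE`, `HAS_TRIGGER`, and the role names), the `Node` label, and the functions `event_to_triples` and `triple_to_cypher` are all hypothetical and chosen only for illustration. The sketch only builds parameterized Cypher `MERGE` statements and does not connect to a database.

```python
import re
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical event record, shaped like what an extraction model might return.
# The field names and values are illustrative and do not come from the paper.
EVENT = {
    "event_type": "Fraud",
    "trigger": "transferred",
    "arguments": {"SUSPECT": "Zhang San", "VICTIM": "Li Si", "LOCATION": "Nanning"},
}


def event_to_triples(event_id: str, event: Dict) -> List[Triple]:
    """Flatten one event record into (head, relation, tail) triples."""
    triples: List[Triple] = [
        (event_id, "HAS_TYPE", event["event_type"]),
        (event_id, "HAS_TRIGGER", event["trigger"]),
    ]
    # Each argument role becomes one relation from the event node to the argument.
    for role, value in event["arguments"].items():
        triples.append((event_id, role, value))
    return triples


def triple_to_cypher(triple: Triple) -> Tuple[str, Dict[str, str]]:
    """Build a parameterized Cypher MERGE statement for one triple.

    Node names are passed as query parameters. The relationship type is
    interpolated into the query text, so it is validated against a strict
    pattern first.
    """
    head, relation, tail = triple
    if not re.fullmatch(r"[A-Z][A-Z_]*", relation):
        raise ValueError(f"unsafe relationship type: {relation!r}")
    query = (
        "MERGE (h:Node {name: $head}) "
        "MERGE (t:Node {name: $tail}) "
        f"MERGE (h)-[:{relation}]->(t)"
    )
    return query, {"head": head, "tail": tail}


if __name__ == "__main__":
    for triple in event_to_triples("case-001", EVENT):
        query, params = triple_to_cypher(triple)
        print(query, params)
```

Using `MERGE` rather than `CREATE` keeps repeated imports from duplicating nodes or relationships. With the official `neo4j` Python driver, each query and parameter pair could then be passed to `session.run(query, params)`; connection details are omitted here because the source does not describe them.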
Cite this article:
QIN Zhenkai, XU Mingchao, JIANG Ping. Research on the Construction Method and Application of Case Knowledge Graph Based on Prompt Learning[J]. Netinfo Security, 2024, 24(11): 1773-1782.
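Related articles:
[1] 焦诗琴, 张贵杨, 李国旗. A Prompt-Focused Privacy Evaluation and Obfuscation Method for Large Language Models[J]. Netinfo Security, 2024, 24(9): 1396-1408.
[2] 陈昊然, 刘宇, 陈平. A Method for Generating Endogenous-Security Heterogeneous Variants Based on Large Language Models[J]. Netinfo Security, 2024, 24(8): 1231-1240.
[3] 项慧, 薛鋆豪, 郝玲昕. Detection of Large Language Model-Generated Text Based on Ensemble Learning of Linguistic Features[J]. Netinfo Security, 2024, 24(7): 1098-1109.
[4] 郭祥鑫, 林璟锵, 贾世杰, 李光正. Security Analysis of Cryptographic Application Code Generated by Large Language Models[J]. Netinfo Security, 2024, 24(6): 917-925.
[5] 张长琳, 仝鑫, 佟晖, 杨莹. A Survey of Large Language Model Technology for the Cybersecurity Domain[J]. Netinfo Security, 2024, 24(5): 778-793.
[6] 李娇, 张玉清, 吴亚飚. A Large Language Model Data Augmentation Method for Cybersecurity Relation Extraction[J]. Netinfo Security, 2024, 24(10): 1477-1483.
[7] 黄恺杰, 王剑, 陈炯峄. A Large Language Model-Based Method for Detecting SQL Injection Attacks[J]. Netinfo Security, 2023, 23(11): 84-93.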