Netinfo Security (信息网络安全), 2017, Vol. 17, Issue 10: 29-35. doi: 10.3969/j.issn.1671-1122.2017.10.005

Research on a Named Entity Recognition Method Based on Deep Neural Networks

GUL Khan Safi Qamas, YIN Jize, PAN Limin, LUO Senlin

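  1. Information System and Security & Countermeasures Experimental Center, Beijing Institute of Technology, Beijing 100081
  • Received: 2017-06-21 Online: 2017-10-10 Published: 2020-05-12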
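  • About the authors:

    GUL Khan Safi Qamas (born 1981), male, from Pakistan, doctoral student; main research interests: information security and text security. YIN Jize (born 1995), male, from Liaoning, master's student; main research interest: text security. PAN Limin (born 1968), female, from Heilongjiang, senior experimentalist, master's degree; main research interests: information security, data mining, text security, and media security. LUO Senlin (born 1968), male, from Hebei, professor, Ph.D.; main research interests: information security, data mining, text security, and media security.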

  • Funding:
    National 242 Information Security Program [2017A149]

Research on the Algorithm of Named Entity Recognition Based on Deep Neural Network

Khan Safi Qamas GUL, Jize YIN, Limin PAN, Senlin LUO

  1. Information System and Security & Countermeasures Experimental Center, Beijing Institute of Technology, Beijing 100081, China
  • Received: 2017-06-21 Online: 2017-10-10 Published: 2020-05-12

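Abstract:

To address insufficient feature extraction in named entity recognition for Chinese social media, this article proposes a named entity recognition method based on deep neural networks that combines long short-term memory (LSTM) with an attention model. A social media message is equivalent to a character sequence, so each character is first converted into its corresponding character vector. Next, an LSTM processes the resulting character vector sequence to extract global text features. Then, an attention model processes the global feature vector sequence output by the previous step to further extract local text features. Finally, a linear-chain conditional random field labels the named entities according to the global and local feature vector sequences, and the recognition results are obtained and output. Experimental results show that the proposed method achieves a higher F-measure than the baseline algorithm and the current strong algorithms it is compared against.

Keywords: named entity recognition, Chinese social media, deep neural network, attention mechanism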

Abstract:

To address the problem of insufficient feature extraction in named entity recognition for Chinese social media, this article proposes a named entity recognition method based on deep neural networks that combines a long short-term memory (LSTM) network with a soft attention model. A social media message is equivalent to a character sequence, so each character in the sequence is first converted into a corresponding character vector. Second, an LSTM extracts global text features from the converted character vector sequence. Third, a soft attention model extracts local text features from the global feature vector sequence output by the previous step. Finally, a linear-chain conditional random field tags the named entities according to the global and local feature vector sequences, and the named entity recognition results are obtained and output. The experimental results show that the proposed method achieves a higher F-measure than both the baseline algorithm and the state-of-the-art algorithm.

Key words: named entity recognition, Chinese social media, deep neural network, attention mechanism
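
The four steps described in the abstract (character vectors, LSTM, soft attention, linear-chain CRF tagging) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the abstract does not give layer sizes, the attention scoring function, the tag set, or any training procedure, so the dot-product attention scores, the unidirectional LSTM, the `O`/`B`/`I` tag set, all dimensions, and every function name below are assumptions chosen for illustration. The weights are random and untrained, so the predicted tags carry no meaning; only the data flow is shown.

```python
import math
import random

def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rng, rows, cols, scale=0.1):
    """Random, untrained weights (placeholder for learned parameters)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_forward(xs, W, hidden):
    """Unidirectional LSTM; W maps [x; h; 1] to the four stacked gate pre-activations."""
    h, c, outputs = [0.0] * hidden, [0.0] * hidden, []
    for x in xs:
        z = matvec(W, x + h + [1.0])
        i = [sigmoid(v) for v in z[:hidden]]                 # input gate
        f = [sigmoid(v) for v in z[hidden:2 * hidden]]       # forget gate
        o = [sigmoid(v) for v in z[2 * hidden:3 * hidden]]   # output gate
        g = [math.tanh(v) for v in z[3 * hidden:]]           # candidate cell state
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        outputs.append(h)
    return outputs

def soft_attention(hs):
    """For each position, a softmax-weighted sum of all vectors in hs.
    Dot-product scoring is an assumption; the abstract does not specify it."""
    contexts = []
    for q in hs:
        scores = [sum(a * b for a, b in zip(q, k)) for k in hs]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        contexts.append([sum(w * k[d] for w, k in zip(weights, hs))
                         for d in range(len(q))])
    return contexts

def viterbi(emissions, transitions):
    """Highest-scoring tag path for additive linear-chain CRF scores."""
    n_tags = len(transitions)
    score, backptrs = list(emissions[0]), []
    for emit in emissions[1:]:
        new_score, ptr = [], []
        for j in range(n_tags):
            best = max(range(n_tags), key=lambda i: score[i] + transitions[i][j])
            ptr.append(best)
            new_score.append(score[best] + transitions[best][j] + emit[j])
        score = new_score
        backptrs.append(ptr)
    tag = max(range(n_tags), key=lambda j: score[j])
    path = [tag]
    for ptr in reversed(backptrs):
        tag = ptr[tag]
        path.append(tag)
    return path[::-1]

def tag_text(text, tags=("O", "B", "I"), emb_dim=8, hidden=6, seed=0):
    rng = random.Random(seed)
    # Step 1: each character -> character vector (random lookup table here).
    table = {ch: [rng.uniform(-1, 1) for _ in range(emb_dim)] for ch in sorted(set(text))}
    xs = [table[ch] for ch in text]
    # Step 2: LSTM over the character vectors ("global" features).
    hs = lstm_forward(xs, rand_matrix(rng, 4 * hidden, emb_dim + hidden + 1), hidden)
    # Step 3: soft attention over the LSTM outputs ("local" features).
    cs = soft_attention(hs)
    # Step 4: CRF decoding over the concatenated global and local features.
    emissions = [matvec(rand_matrix(rng, len(tags), 2 * hidden), h + c)
                 for h, c in zip(hs, cs)]
    transitions = rand_matrix(rng, len(tags), len(tags))
    return [tags[t] for t in viterbi(emissions, transitions)]

print(tag_text("北京理工大学"))  # one (meaningless, untrained) tag per character
```

In a trained model, the lookup table, LSTM weights, output projection, and CRF transition scores would all be learned jointly from labeled data, and the CRF layer would typically be trained by maximizing the log-likelihood of the gold tag sequence; the Viterbi step shown here covers only decoding.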

CLC number: