Netinfo Security (信息网络安全), 2021, Vol. 21, Issue 12: 102-108. doi: 10.3969/j.issn.1671-1122.2021.12.014

• Selected Papers •

Research on Chinese Question Answering Matching Based on Mutual Attention Mechanism and BERT

DAI Xiang, SUN Haichun, NIU Shuo, ZHU Rongchen

  1. School of Information and Network Security, People’s Public Security University of China, Beijing 100038, China
  • Received: 2021-09-29  Online: 2021-12-10  Published: 2022-01-11
  • Contact: SUN Haichun  E-mail: sunhaichun@ppsuc.edu.cn
  • About the authors: DAI Xiang (b. 1996), male, from Henan, master's student; main research interest: natural language processing. SUN Haichun (b. 1985), female, lecturer, Ph.D.; main research interests: information services, machine learning, and service-oriented computing. NIU Shuo (b. 1999), male, undergraduate; main research interests: cybersecurity and natural language processing. ZHU Rongchen (b. 1996), male, master's student; main research interests: cybersecurity, video networks, and machine learning.
  • Funding:
    National Natural Science Foundation of China (41971367); National Key R&D Program of China (2017YFC0803700); Technology Research Program of the Ministry of Public Security (2020JSYJC22ok)



Abstract:

Question answering matching is one of the key technologies in a question answering system. To address the problems that traditional question answering matching models represent Chinese word vectors imprecisely and extract the interaction features between texts insufficiently, this paper proposes an attention-based bidirectional encoder representation question answering matching model. For Chinese vector representation, transfer learning is used to introduce the parameters of a pretrained Chinese BERT model, which are further fine-tuned on the training set to obtain optimal parameters; Chinese characters are then represented with BERT vectors, which resolves the insufficient representation ability of traditional word vector models for Chinese vocabulary. At the text interaction level, the mutual attention mechanism is first used to extract interaction features between questions and answers, and the generated interaction features are combined with the input vectors of the attention mechanism to form a feature combination; a bidirectional long short-term memory network (BiLSTM) is then used for inference composition, reducing the feature dimension and integrating contextual semantic information. Finally, the model is evaluated on a Chinese legal dataset. The experimental results show that the model outperforms several traditional models; compared with ESIM, it improves Top-1 accuracy by 3.55%, MAP by 5.21%, and MRR by 4.05%.
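The pipeline summarized in the abstract (BERT character-level encoding, mutual attention between question and answer, combination of the attention output with its input vectors, and BiLSTM composition) can be read as the minimal PyTorch sketch below. The checkpoint name bert-base-chinese, the hidden size, the difference/product feature enhancement borrowed from ESIM, the mean/max pooling, and the linear scoring head are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import BertModel, BertTokenizer


class MutualAttentionMatcher(nn.Module):
    def __init__(self, bert_name="bert-base-chinese", hidden=200):
        super().__init__()
        # Pretrained Chinese BERT; in the described setup it is fine-tuned
        # together with the layers below.
        self.bert = BertModel.from_pretrained(bert_name)
        d = self.bert.config.hidden_size  # 768 for bert-base
        # BiLSTM composition layer: fuses the combined features, reduces the
        # feature dimension, and injects contextual information.
        self.compose = nn.LSTM(4 * d, hidden, batch_first=True, bidirectional=True)
        self.score = nn.Linear(8 * hidden, 1)  # matching score from pooled features

    @staticmethod
    def mutual_attention(q, a, q_mask, a_mask):
        # Soft-align each sequence against the other: e[b, i, j] = <q_i, a_j>.
        e = torch.matmul(q, a.transpose(1, 2))  # (B, Lq, La)
        q_hat = torch.matmul(
            F.softmax(e.masked_fill(a_mask.unsqueeze(1) == 0, -1e9), dim=-1), a)
        a_hat = torch.matmul(
            F.softmax(e.transpose(1, 2).masked_fill(q_mask.unsqueeze(1) == 0, -1e9), dim=-1), q)
        return q_hat, a_hat  # answer-aware question, question-aware answer

    @staticmethod
    def enhance(x, x_hat):
        # Combine the attention output with the attention input; the difference
        # and element-wise product terms are an ESIM-style assumption.
        return torch.cat([x, x_hat, x - x_hat, x * x_hat], dim=-1)  # (B, L, 4d)

    def pool(self, feats, mask):
        out, _ = self.compose(feats)  # (B, L, 2*hidden)
        m = mask.unsqueeze(-1).float()
        avg = (out * m).sum(1) / m.sum(1).clamp(min=1.0)        # masked mean pooling
        mx = out.masked_fill(m == 0, -1e9).max(dim=1).values    # masked max pooling
        return torch.cat([avg, mx], dim=-1)  # (B, 4*hidden)

    def forward(self, q_ids, q_mask, a_ids, a_mask):
        q = self.bert(q_ids, attention_mask=q_mask).last_hidden_state
        a = self.bert(a_ids, attention_mask=a_mask).last_hidden_state
        q_hat, a_hat = self.mutual_attention(q, a, q_mask, a_mask)
        q_vec = self.pool(self.enhance(q, q_hat), q_mask)
        a_vec = self.pool(self.enhance(a, a_hat), a_mask)
        return self.score(torch.cat([q_vec, a_vec], dim=-1)).squeeze(-1)


# Illustrative usage; the texts are placeholders, not items from the legal dataset.
tok = BertTokenizer.from_pretrained("bert-base-chinese")
q = tok(["问题文本"], padding=True, return_tensors="pt")
a = tok(["候选答案文本"], padding=True, return_tensors="pt")
model = MutualAttentionMatcher()
score = model(q["input_ids"], q["attention_mask"], a["input_ids"], a["attention_mask"])

In this reading, the BiLSTM projects the widened combined features back down to a small contextualized dimension before pooling, which is one plausible interpretation of "inference composition and reducing the feature dimension"; candidate answers for a question would then be ranked by the resulting scores to compute Top-1 accuracy, MAP, and MRR.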

Key words: question answering matching, BERT (bidirectional encoder representations from transformers), mutual attention mechanism

CLC number: