Netinfo Security ›› 2021, Vol. 21 ›› Issue (7): 63-71. doi: 10.3969/j.issn.1671-1122.2021.07.008


Research on English-Chinese Machine Translation Based on Sentence Grouping

ZHAO Yuran, MENG Kui

  1. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
  • Received: 2021-04-15  Online: 2021-07-10  Published: 2021-07-23
  • Contact: MENG Kui  E-mail: mengkui@sjtu.edu.cn

Abstract:

Although neural machine translation models improve when trained on larger data sets, the information about sentence categories and structures in those data sets has not been fully exploited. This paper proposes a neural machine translation model based on sentence grouping, which adds an attention-based discriminator after the encoders. It also proposes a method for computing a structural information vector for each sentence; these vectors can be used to obtain group labels with an unsupervised method. Before training, the sentences in the data set are divided by content category and sentence structure to obtain group labels. The model is then trained on these labels and the parallel corpus at the same time, which helps it identify the group to which each sentence belongs. In this way, the information in the data set is used more fully. Extensive comparative experiments support the soundness of the grouping idea: the Transformer model built on the group architecture produces better translations, with a BLEU score up to 1.2 higher than that of the vanilla Transformer model.
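
The grouping step described above, obtaining group labels from sentence vectors with an unsupervised method, can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or say how the structural information vectors are computed, so everything here is an assumption: plain k-means with farthest-point initialization stands in for the unsupervised method, the two-dimensional vectors are hypothetical placeholders for real sentence vectors, and the function names are invented for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans_group_labels(vectors, k, iterations=20):
    """Assign each vector a group label in range(k) using k-means.

    Assumption: k-means is a stand-in; the paper's actual unsupervised
    method is not specified in the abstract.
    """
    # Deterministic farthest-point initialization: start from the first
    # vector, then repeatedly add the vector farthest from all centroids.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors,
            key=lambda v: min(squared_distance(v, c) for c in centroids),
        )
        centroids.append(list(farthest))

    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, new_labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels


# Hypothetical placeholder vectors: two visibly separated groups.
toy_vectors = [
    [0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
    [5.0, 5.1], [5.1, 5.0], [4.9, 5.0],
]
labels = kmeans_group_labels(toy_vectors, k=2)
print(labels)
```

In a setup like the one the abstract describes, labels of this kind would be attached to the sentence pairs before training and used as the discriminator's targets alongside the parallel corpus.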

Key words: machine translation, sentence grouping, structural information

CLC Number: