Netinfo Security ›› 2022, Vol. 22 ›› Issue (7): 46-54. doi: 10.3969/j.issn.1671-1122.2022.07.006

• Technical Research •

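Information Steganography Algorithm for Automatic Generation of Song Ci

YANG Wanxia1,2, CHEN Shuai3, GUAN Lei4,5, YANG Zhongliang2,4

  1. College of Mechanical and Electrical Engineering, Gansu Agricultural University, Lanzhou 730070, China
  2. Institute for Electronics and Information Technology in Tianjin, Tsinghua University, Tianjin 300467, China
  3. Network Security Bureau of the Ministry of Public Security, Beijing 100006, China
  4. Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
  5. The First Research Institute of the Ministry of Public Security, Beijing 100044, China
  • Received: 2022-02-26 Online: 2022-07-10 Published: 2022-08-17
  • Corresponding author: YANG Wanxia E-mail: yangwanxia@163.com
  • About the authors: YANG Wanxia (born 1979), female, from Gansu, associate professor, Ph.D.; main research interests: information hiding and natural language processing | CHEN Shuai (born 1989), male, from Henan, engineer, M.S.; main research interests: network security and big data application technology | GUAN Lei (born 1988), male, from Hebei, Ph.D. candidate; main research interests: cloud computing and data security | YANG Zhongliang (born 1993), male, from Hubei, postdoctoral researcher; main research interests: information security and natural language processing
  • Funding:
    National Natural Science Foundation of China (61862002)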

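Abstract:

Traditional text steganography schemes struggle to balance hiding capacity against imperceptibility. Exploiting the rich semantics and flexible syntax of Song Ci as a carrier, this paper proposes an algorithm that generates steganographic Song Ci with a Seq2Seq model combining BERT (Bidirectional Encoder Representations from Transformers) word embeddings and an attention mechanism. The algorithm uses BERT word embeddings as the semantic vector transformation component of the generation model; the rich word vector space keeps the generated sentences semantically coherent and improves the quality of the generated Song Ci. In addition, the algorithm uses a metrical template and a mutual-information-based word selection method to constrain the generation of steganographic sentences, which strengthens the security of the hiding algorithm. Comparative experiments on embedding rate against existing text hiding algorithms show that the proposed algorithm improves the embedding rate by more than 7% relative to Ci-stega, and that it also performs well in security and robustness.

Key words: text information hiding, Song Ci, metrical template, mutual information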
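
The abstract names mutual-information word selection but gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a common generation-based steganography scheme: candidates for the next token are ranked by pointwise mutual information (PMI) with a context token, the top 2^k are kept, and k secret bits select one of them. The toy corpus, the function names (`build_pmi`, `candidate_pool`, `embed`, `extract`), and the single-token context are all hypothetical. The Seq2Seq generator, BERT embeddings, and metrical template are not modeled.

```python
import math
from collections import Counter
from itertools import combinations


def build_pmi(lines):
    """Return a PMI scorer estimated from line-level co-occurrence counts."""
    single, pair = Counter(), Counter()
    for line in lines:
        tokens = sorted(set(line))
        single.update(tokens)
        pair.update(combinations(tokens, 2))  # pairs are stored in sorted order
    total = len(lines)

    def pmi(a, b):
        joint = pair[tuple(sorted((a, b)))]
        if a == b or joint == 0:
            return float("-inf")  # never co-occurred: rank last
        # log( P(a, b) / (P(a) * P(b)) ) with probabilities taken over lines
        return math.log(joint * total / (single[a] * single[b]))

    return pmi


def candidate_pool(context, candidates, pmi, bits_per_token):
    """Rank candidates by PMI with the context and keep the top 2**bits_per_token."""
    size = 2 ** bits_per_token
    distinct = set(candidates)
    if len(distinct) < size:
        raise ValueError("not enough distinct candidates for the requested capacity")
    # Ties are broken alphabetically so sender and receiver build the same pool.
    ranked = sorted(distinct, key=lambda c: (-pmi(context, c), c))
    return ranked[:size]


def embed(bits, context, candidates, pmi, bits_per_token=1):
    """Choose the pool entry whose index encodes the next secret bits."""
    pool = candidate_pool(context, candidates, pmi, bits_per_token)
    return pool[int(bits[:bits_per_token], 2)]


def extract(token, context, candidates, pmi, bits_per_token=1):
    """Recover the bits by locating the token in the same ranked pool."""
    pool = candidate_pool(context, candidates, pmi, bits_per_token)
    return format(pool.index(token), f"0{bits_per_token}b")


# Toy corpus standing in for tokenized lines of verse (hypothetical data).
corpus = [
    ["moon", "river", "night"],
    ["moon", "river", "wind"],
    ["moon", "wine", "night"],
    ["river", "wind", "boat"],
]
options = ["river", "night", "wind", "wine"]

pmi = build_pmi(corpus)
chosen = embed("10", "moon", options, pmi, bits_per_token=2)
recovered = extract(chosen, "moon", options, pmi, bits_per_token=2)
print(chosen, recovered)
```

In this sketch the receiver recovers the bits only because it rebuilds the identical ranked pool, so both sides must share the same statistics and the same tie-breaking rule. Raising `bits_per_token` increases capacity but admits lower-PMI words into the pool, which is the capacity versus imperceptibility trade-off the abstract refers to.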

CLC Number: