Netinfo Security ›› 2026, Vol. 26 ›› Issue (3): 462-470. doi: 10.3969/j.issn.1671-1122.2026.03.012

• Selected Papers •

Hidden Link Headline Detection Method Based on Multi-Modal Features

YIN Jie1, LIU Jiayin1, HUANG Xiaoyu1, LAN Haoliang1, XIE Wenwei2

  1. Department of Computer Information and Cyber Security, Jiangsu Police Institute, Nanjing 210031, China
  2. Trend Micro Incorporated, Nanjing 210012, China
  • Received: 2025-08-08 Online: 2026-03-10 Published: 2026-03-30
  • Corresponding author: LAN Haoliang E-mail: lanhaoliang@jspi.cn
  • About the authors: YIN Jie (born 1977), male, from Jiangsu, senior engineer, M.S.; main research interest: artificial intelligence | LIU Jiayin (born 1986), male, from Chongqing, associate professor, Ph.D.; main research interest: network security | HUANG Xiaoyu (born 2002), male, from Jiangsu, undergraduate; main research interest: natural language processing | LAN Haoliang (born 1986), male, from Shandong, lecturer, Ph.D.; main research interest: cyberspace security | XIE Wenwei (born 1978), male, from Jiangsu, engineer, M.S.; main research interests: artificial intelligence and machine vision
  • Supported by:
    National Natural Science Foundation of China (62272203)

Abstract:

As web page tampering to implant hidden links grows more prevalent and automated detection methods become widespread, hidden link headline implantation has become a significant threat to network security. Attackers now commonly disguise headlines with visually similar characters, interference symbols, and emoticons, which challenges detection techniques based on unimodal natural language processing. To address this problem, this paper proposes a multimodal detection method that combines text features with image features. The method first uses BERT and ResNet to extract semantic features and image features, respectively, from the headline text. It then deeply fuses these features through a gate function and multi-head attention, and uses the fused representation to classify hidden link headlines. Experimental results on the evaluation dataset show that the proposed method reaches a recognition accuracy of 0.966, about 1 percentage point higher than the baseline method. This indicates that image features can effectively compensate for the weakness of text features in handling headline disguise.
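
The gate step of the fusion described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the gate, so the element-wise sigmoid gate below, the function name `gated_fusion`, the toy three-dimensional vectors, and the random weights are all assumptions made for illustration. The BERT and ResNet encoders and the multi-head attention step are omitted.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Logistic function; adequate for the small toy values used here."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(text_feat, image_feat, weights, bias):
    """Blend two equal-length feature vectors with an element-wise gate.

    Assumed form (not taken from the paper):
        gate_i  = sigmoid(weights[i] . concat(text_feat, image_feat) + bias[i])
        fused_i = gate_i * text_feat[i] + (1 - gate_i) * image_feat[i]
    """
    if len(text_feat) != len(image_feat):
        raise ValueError("feature vectors must have the same length")
    joint = list(text_feat) + list(image_feat)
    fused = []
    for i, (t, v) in enumerate(zip(text_feat, image_feat)):
        z = sum(w * x for w, x in zip(weights[i], joint)) + bias[i]
        g = sigmoid(z)
        # The gate decides how much of each modality passes through.
        fused.append(g * t + (1.0 - g) * v)
    return fused


# Toy vectors standing in for projected text and image features (hypothetical).
text_feat = [0.2, -1.0, 0.5]
image_feat = [0.8, 0.4, -0.3]

# Random weights stand in for parameters that would normally be learned.
rng = random.Random(0)
dim = len(text_feat)
weights = [[rng.uniform(-1, 1) for _ in range(2 * dim)] for _ in range(dim)]
bias = [0.0] * dim

fused = gated_fusion(text_feat, image_feat, weights, bias)
print(fused)
```

Because the gate lies strictly between 0 and 1, each fused component falls between the corresponding text and image components. In a trained model the gate could therefore lean on the image feature when the text feature is unreliable, for example under character-level disguise.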

Key words: hidden link headline detection, BERT, ResNet, multi-modal feature fusion

CLC Number: