Netinfo Security ›› 2023, Vol. 23 ›› Issue (11): 69-83. doi: 10.3969/j.issn.1671-1122.2023.11.008

• Technology Research •

  • Corresponding author: WANG Chen, chenwang@hust.edu.cn
  • About the authors: LIU Gaoyang (1991-), male, Hubei, Ph.D., CCF member; main research interest: artificial intelligence security. WU Weiling (1998-), female, Hubei, master's student; main research interests: contrastive learning and adversarial machine learning. ZHANG Jinsheng (1986-), male, Hubei, engineer, master's degree; main research interests: system design and algorithm security. WANG Chen (1985-), male, Hubei, associate researcher, Ph.D., CCF senior member; main research interests: Internet of Things, data security, and trustworthy machine learning.
  • Supported by: National Natural Science Foundation of China (62272183); Hubei Provincial Key Research and Development Program (2021BAA026)

Targeted Poisoning Attacks against Multimodal Contrastive Learning

LIU Gaoyang1,2, WU Weiling1, ZHANG Jinsheng3, WANG Chen1,2()   

  1. School of Electronic Information and Communication, Huazhong University of Science and Technology, Wuhan 430074, China
    2. Hubei Key Laboratory of Smart Internet Technology, Wuhan 430074, China
    3. Wuhan Long’an Group Co., Ltd., Wuhan 430074, China
  • Received:2023-06-20 Online:2023-11-10 Published:2023-11-10


Abstract:

In recent years, pre-trained models built with contrastive learning techniques on large-scale unlabeled data have seen widespread adoption in applications such as lane detection and face recognition. However, the security and privacy issues of contrastive learning models have increasingly attracted researchers' attention. This paper focuses on poisoning attacks against multimodal contrastive learning models, in which carefully crafted data are injected into the training set to change a victim model's predictions on specific inputs. Existing attacks primarily target either the text or the image encoder individually and fail to exploit cross-modal information; to address this, this paper proposes a targeted poisoning attack that poisons both the text and image encoders simultaneously. First, watermark opacity values are automatically generated from a Beta distribution. Then, watermarked samples are produced at each opacity, and the number of samples to poison at that opacity is determined from the Euclidean distance between the watermarked samples and the target sample. Finally, the poisoned dataset is generated through a dedicated optimization procedure. Compared with state-of-the-art attacks, the proposed method achieves a lower poisoning rate while preserving the target model's accuracy.

Key words: contrastive learning, targeted poisoning attacks, downstream task poisoning, multimodal information
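The three-step pipeline summarized in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the Beta parameters, the linear distance-to-count rule, and all function names are assumptions made for exposition, and the final optimization step that crafts the actual poisoning instances is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_opacity(a=2.0, b=5.0):
    """Step 1: draw a watermark opacity in (0, 1) from a Beta(a, b)
    distribution (the shape parameters here are illustrative)."""
    return rng.beta(a, b)

def apply_watermark(image, watermark, opacity):
    """Step 2: alpha-blend a watermark onto an image at the given
    opacity; inputs are float arrays with values in [0, 1]."""
    return (1.0 - opacity) * image + opacity * watermark

def poison_count(watermarked_emb, target_emb, budget=100):
    """Step 2 (cont.): choose how many samples to poison at this
    opacity from the Euclidean distance between the watermarked and
    target embeddings -- a simple linear rule, capped by the budget."""
    dist = np.linalg.norm(watermarked_emb - target_emb)
    return min(budget, max(1, int(round(dist * 10))))

# Toy example with random stand-ins for images and encoder embeddings.
img = rng.random((32, 32, 3))
wm = rng.random((32, 32, 3))
alpha = sample_opacity()
poisoned = apply_watermark(img, wm, alpha)
n = poison_count(rng.random(128), rng.random(128))
```

In the paper's setting the embeddings would come from the victim's text and image encoders, and the selected watermarked samples would then be refined by the optimization step before injection into the training set.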

CLC number: