Netinfo Security, 2023, Vol. 23, Issue (11): 69-83. doi: 10.3969/j.issn.1671-1122.2023.11.008

Targeted Poisoning Attacks against Multimodal Contrastive Learning

LIU Gaoyang1,2, WU Weiling1, ZHANG Jinsheng3, WANG Chen1,2

  1. School of Electronic Information and Communication, Huazhong University of Science and Technology, Wuhan 430074, China
  2. Hubei Key Laboratory of Smart Internet Technology, Wuhan 430074, China
  3. Wuhan Long’an Group Co., Ltd., Wuhan 430074, China
  • Received: 2023-06-20 Online: 2023-11-10 Published: 2023-11-10

Abstract:

In recent years, pre-trained models built with contrastive learning on large-scale unlabeled data have been widely adopted in applications such as lane detection and face recognition. However, the security and privacy issues of contrastive learning models have increasingly attracted the attention of researchers. This paper focuses on poisoning attacks against multimodal contrastive learning models. A poisoning attack injects carefully crafted data into the training set to change the behavior of the victim model. Existing attacks primarily target either the text encoder or the image encoder individually and fail to fully leverage information from the other modality. To address this, this paper proposes a targeted poisoning attack that poisons the text and image encoders simultaneously. First, a generator based on the Beta distribution produces opacity values, which are used to watermark images automatically. Next, the number of instances to be collected is calculated from the Euclidean distance between the watermarked instance and the target instance. After watermarking, the instances are optimized to generate poisoning instances. Compared with state-of-the-art attacks, this method achieves a lower poisoning rate and better model accuracy.
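
The watermarking and distance steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Beta parameters `a` and `b`, the alpha-blending rule, the random stand-in images, and the use of raw pixel space for the distance are all assumptions made for illustration. The abstract does not say how the distance is mapped to the number of instances to collect, so that step is left out.

```python
import numpy as np


def sample_opacity(rng, a=2.0, b=5.0):
    """Draw one opacity value from Beta(a, b); a and b are assumed values."""
    return float(rng.beta(a, b))


def apply_watermark(image, watermark, opacity):
    """Alpha-blend a watermark onto an image (assumed blending rule).

    Both inputs are float arrays of the same shape with values in [0, 1].
    """
    blended = (1.0 - opacity) * image + opacity * watermark
    return np.clip(blended, 0.0, 1.0)


def euclidean_distance(x, y):
    """Euclidean (L2) distance between two arrays, computed on flattened values."""
    diff = np.asarray(x, dtype=float).ravel() - np.asarray(y, dtype=float).ravel()
    return float(np.linalg.norm(diff))


# Hypothetical stand-in data: random arrays in place of real images.
rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))
watermark = rng.random((32, 32, 3))
target = rng.random((32, 32, 3))

opacity = sample_opacity(rng)
watermarked = apply_watermark(image, watermark, opacity)
distance = euclidean_distance(watermarked, target)
print(f"opacity={opacity:.3f}, distance to target={distance:.3f}")
```

Sampling the opacity from a Beta distribution keeps it between 0 and 1, so the blend is always a valid convex combination of image and watermark. In a multimodal setting the distance would more plausibly be taken between encoder embeddings than between pixels, but the abstract does not specify this.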

Key words: contrastive learning, targeted poisoning attacks, downstream task poisoning, multimodal information
