A Training-Free Black-Box Attack against DeepFake Detectors via Frequency Distribution Alignment

Netinfo Security, 2026, Vol. 26, Issue (5): 684-698. doi: 10.3969/j.issn.1671-1122.2026.05.002

YU Chuer¹, WANG Hanyue¹, WU Jian²,³,⁴, DING Weijie³, CHEN Xianqian²,⁴, WANG Zonghui¹

¹ College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
² Cybersecurity Corps, Zhejiang Provincial Public Security Department, Hangzhou 310009, China
³ College of Information and Cyber Security, Zhejiang Police College, Hangzhou 310053, China
⁴ School of Cyber Science and Technology, Zhejiang University, Hangzhou 310027, China

Received: 2026-01-20 Online: 2026-05-10 Published: 2026-06-03
Corresponding author: WANG Zonghui, zhwang@zju.edu.cn
About the authors: YU Chuer (born 1995), female, from Zhejiang, Ph.D. candidate; research interest: multimedia content security | WANG Hanyue (born 2003), female, from Henan, master's student; research interest: artificial intelligence content security | WU Jian (born 1980), male, from Zhejiang, professor-level senior engineer, master's degree; research interests: artificial intelligence, cybersecurity, and digital forensics | DING Weijie (born 1980), male, from Henan, professor, Ph.D.; research interests: artificial intelligence content security, intelligent data analysis, and cyberspace governance technology | CHEN Xianqian (born 1984), male, from Zhejiang, professor-level senior engineer, master's degree; research interest: electronic data forensics | WANG Zonghui (born 1979), male, from Zhejiang, senior engineer, Ph.D.; research interests: content security, artificial intelligence security, and computer architecture
Funding:
National Key Research and Development Program of China (2024YFF0618800)

Abstract

As the quality of deepfake generation approaches that of real imagery, deepfake detection has become crucial for multimedia content security. However, most existing detectors discriminate on statistical discrepancies between real and fake samples, which leaves them adversarially vulnerable under black-box conditions. This paper investigates the security of deepfake detection systems in a training-free, zero-query black-box setting and introduces a new attack perspective: an adversarial attack is modeled as a targeted repair of the statistical fingerprints of fake images rather than as the injection of exogenous noise. Based on this insight, the paper proposes SpectralFusion, a training-free black-box attack based on frequency-domain distribution alignment. Leveraging the real-fake paired prior that arises naturally in deepfake generation, SpectralFusion uses frequency-domain analysis to locate statistical discrepancies between a fake image and its real reference, and applies controlled corrections only to the anomalous frequency bands. It requires no access to detector parameters or gradients, and no additional training or queries. Specifically, a difference-aware frequency-band mask localizes the anomalous bands, and an adaptive fusion strength mechanism dynamically regulates the correction energy. Combined with a frequency processing strategy over locally overlapping blocks, the method achieves fine-grained alignment and reconstruction of the forged frequency-domain features. Experimental results show that SpectralFusion consistently lowers the decision confidence of multiple deepfake detection models while preserving high visual fidelity, and that it generalizes well across model architectures and manipulation types. These findings reveal latent vulnerabilities of deepfake detectors at the level of frequency-domain statistics and offer a new perspective for evaluating the robustness of black-box detection systems in real-world scenarios.

Keywords: deepfake detection, black-box attack, frequency distribution alignment
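
The pipeline described in the abstract can be made more concrete with a small sketch. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract gives no formulas, so the sigmoid mask, the strength rule, and the function name `align_magnitude` are all hypothetical. The names `tau` and `kappa` echo the temperature and strength parameters reported in the paper's hyperparameter table, but their exact roles here are assumed. The sketch corrects the FFT magnitude of a fake image toward a paired real reference, mostly in the bins where the two differ, and keeps the fake image's phase.

```python
import numpy as np

def align_magnitude(fake, real, tau=1.0, kappa=2.0, eps=1e-8):
    """Pull the FFT magnitude of `fake` toward `real` where the two differ most.

    fake, real: 2-D float arrays of equal shape (a grayscale image or block).
    tau:   assumed mask temperature; larger values flatten the mask.
    kappa: assumed strength factor; larger values correct more aggressively.
    """
    F_f = np.fft.fft2(fake)
    F_r = np.fft.fft2(real)
    log_f = np.log(np.abs(F_f) + eps)
    log_r = np.log(np.abs(F_r) + eps)

    # Difference-aware mask (assumed form): standardize the log-magnitude gap
    # and squash it with a sigmoid, so strongly deviating bins get weights near 1.
    gap = np.abs(log_f - log_r)
    z = (gap - gap.mean()) / (gap.std() + eps)
    mask = 1.0 / (1.0 + np.exp(-z / tau))

    # Adaptive strength (assumed form): grows with the mean gap, capped at 1.
    g = gap.mean()
    lam = min(1.0, kappa * g / (1.0 + g))

    # Interpolate log-magnitudes bin by bin; keep the fake image's phase.
    log_new = log_f + lam * mask * (log_r - log_f)
    F_new = np.exp(log_new) * np.exp(1j * np.angle(F_f))
    return np.real(np.fft.ifft2(F_new))


rng = np.random.default_rng(0)
real_img = rng.random((32, 32))
fake_img = real_img + 0.1 * rng.standard_normal((32, 32))
attacked = align_magnitude(fake_img, real_img)
```

Because the interpolation weight `lam * mask` stays in [0, 1], each corrected magnitude lies between the fake and real values, so the correction cannot overshoot the reference.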

CLC number: TP309

Cite this article:

YU Chuer, WANG Hanyue, WU Jian, DING Weijie, CHEN Xianqian, WANG Zonghui. A Training-Free Black-Box Attack against DeepFake Detectors via Frequency Distribution Alignment[J]. Netinfo Security, 2026, 26(5): 684-698.

Figures and tables (10): Fig. 1, Table 1, Fig. 2, Table 2, Table 3, Table 4, Table 5, Fig. 3, Table 6, Table 7

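Input variant	Magnitude source (A)	Phase source (P)	Mean fake confidence ↓	Judged-fake rate (confidence > 0.5) ↓
Target real image I_r	Real (A_r)	Real (P_r)	0.02	1.0%
Original fake image I_f	Fake (A_f)	Fake (P_f)	0.95	96.6%
Phase-swapped image I_phase-swap	Fake (A_f)	Real (P_r)	0.41	37.7%
Magnitude-swapped image I_mag-swap	Real (A_r)	Fake (P_f)	0.63	67.9%
Random-phase image I_rand-phase	Fake (A_f)	Fake + noise	0.88	96.8%
Random-magnitude image I_rand-mag	Fake + noise	Fake (P_f)	0.87	92.1%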
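
The swapped variants in this table combine the Fourier magnitude of one image with the Fourier phase of another. A minimal sketch of that recombination follows, assuming a 2-D FFT over the spatial axes; the function name `recombine` and the use of NumPy are illustrative choices rather than details taken from the paper.

```python
import numpy as np

def recombine(mag_src, phase_src):
    """Build an image from the FFT magnitude of mag_src and the FFT phase of phase_src.

    Both inputs are float arrays of identical shape, either H x W or H x W x C;
    the transform runs over the first two (spatial) axes.
    """
    F_mag = np.fft.fft2(mag_src, axes=(0, 1))
    F_phase = np.fft.fft2(phase_src, axes=(0, 1))
    mixed = np.abs(F_mag) * np.exp(1j * np.angle(F_phase))
    # For real inputs the mixed spectrum stays conjugate-symmetric, so the
    # imaginary part of the inverse transform is only rounding noise.
    return np.real(np.fft.ifft2(mixed, axes=(0, 1)))


rng = np.random.default_rng(0)
real_img = rng.random((64, 64))
fake_img = rng.random((64, 64))
phase_swap = recombine(mag_src=fake_img, phase_src=real_img)  # A_f with P_r
mag_swap = recombine(mag_src=real_img, phase_src=fake_img)    # A_r with P_f
```

Passing the same image as both arguments returns that image up to floating-point error, which is a convenient sanity check before feeding the variants to a detector.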

Attack method	Xception ASR	Xception $\Delta p$	F3Net ASR	F3Net $\Delta p$	UIA-ViT ASR	UIA-ViT $\Delta p$	RECCE ASR	RECCE $\Delta p$	PSNR/dB	SSIM	LPIPS
Gaussian blur	0.3%	0.026	7.7%	0.163	0%	-0.010	0%	-0.058	32.7	0.927	0.179
Median filtering	2.9%	0.045	9.2%	0.149	0%	-0.102	0%	-0.064	32.1	0.909	0.203
PGD	100%	0.993	90.6%	0.892	79.8%	0.777	90.7%	0.853	30.8	0.706	0.424
DI-FGSM	100%	0.994	94.9%	0.930	87.1%	0.855	89.9%	0.840	31.2	0.728	0.434
C&W	97.0%	0.944	76.5%	0.730	33.9%	0.326	56.5%	0.513	41.5	0.962	0.149
SpectralFusion	58.4%	0.557	56.2%	0.515	35.3%	0.349	39.6%	0.370	32.0	0.971	0.049
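
The page does not define the metrics in this table, so the following sketch rests on assumed definitions: ASR is taken as the fraction of samples flagged as fake before the attack (confidence above 0.5) whose confidence drops to 0.5 or below afterwards, and $\Delta p$ as the mean drop in fake confidence over all samples. PSNR uses its standard definition. All function names are hypothetical.

```python
import numpy as np

def attack_success_rate(p_before, p_after, thr=0.5):
    """Assumed ASR: share of initially detected fakes that evade after the attack."""
    p_before = np.asarray(p_before, dtype=float)
    p_after = np.asarray(p_after, dtype=float)
    detected = p_before > thr
    if not detected.any():
        return 0.0
    return float(np.mean(p_after[detected] <= thr))

def mean_confidence_drop(p_before, p_after):
    """Assumed Delta p: mean of (confidence before - confidence after)."""
    p_before = np.asarray(p_before, dtype=float)
    p_after = np.asarray(p_after, dtype=float)
    return float(np.mean(p_before - p_after))

def psnr(reference, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    diff = np.asarray(reference, dtype=float) - np.asarray(test, dtype=float)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))


before = [0.9, 0.8, 0.4, 0.95]
after = [0.3, 0.7, 0.2, 0.1]
asr = attack_success_rate(before, after)    # 2 of the 3 detected samples evade
drop = mean_confidence_drop(before, after)  # 0.4375
```

Under these definitions a negative $\Delta p$, as in the blur and median-filter rows, means the detector became more confident that the image is fake.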

Forgery dataset	Parameter $\kappa$	Xception ASR	Xception $\Delta p$	F3Net ASR	F3Net $\Delta p$	UIA-ViT ASR	UIA-ViT $\Delta p$	RECCE ASR	RECCE $\Delta p$	PSNR/dB	SSIM	LPIPS
Deepfakes (source domain)	2.0	58.4%	0.557	56.2%	0.515	35.3%	0.349	39.6%	0.370	32.0	0.971	0.049
FaceSwap	3.0	28.1%	0.281	29.0%	0.287	39.7%	0.379	36.8%	0.352	32.4	0.970	0.051
Face2Face	6.0	33.6%	0.330	48.1%	0.452	61.3%	0.579	57.9%	0.477	34.7	0.965	0.069
NeuralTextures	6.0	56.3%	0.521	65.5%	0.600	62.9%	0.577	61.6%	0.467	36.1	0.972	0.055
DeeperForensics-1.0	2.0	61.7%	0.399	69.6%	0.410	60.7%	0.486	36.8%	0.207	33.7	0.981	0.029

Forgery dataset	Xception	F3Net	UIA-ViT	Average
Deepfakes	5.1%	3.5%	2.0%	3.5%
FaceSwap	10.2%	6.7%	7.6%	8.2%
Face2Face	9.8%	5.0%	2.5%	5.8%
NeuralTextures	7.7%	5.3%	3.9%	5.6%
DeeperForensics-1.0	2.6%	1.4%	3.0%	2.3%
Average	7.1%	4.4%	3.8%	5.1%

Variant	Adaptive frequency mask M	Adaptive fusion strength λ	Local overlapping fusion	ASR	$\Delta p$	PSNR/dB	SSIM	LPIPS
A1 (Global Linear)	-	-	-	19.5%	0.213	30.9	0.919	0.196
A2 (Mask Only)	√	-	√	70.9%	0.666	31.3	0.965	0.071
A3 (Intensity Only)	-	√	√	60.0%	0.573	31.1	0.965	0.059
A4 (Global Adaptive)	√	√	-	13.1%	0.148	32.7	0.937	0.159
B1 (Block Linear)	-	-	√	67.7%	0.638	29.0	0.967	0.069
B2 (Block Swap-LF)	-	-	√	39.5%	0.380	37.1	0.956	0.103
B3 (Block Swap-HF)	-	-	√	54.4%	0.513	31.2	0.960	0.084
A5 (SpectralFusion)	√	√	√	58.4%	0.560	31.3	0.967	0.057
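
The ablation separates global processing from local overlapping fusion. One common way to realize overlapping block processing is windowed overlap-add, sketched below. The half-block stride, the Hann-style window, the border handling, and the function name `process_overlapping_blocks` are assumptions made for illustration; the page does not state how the paper blends its blocks. The per-block operation is passed in as a callable, so any frequency-domain correction can be plugged in.

```python
import numpy as np

def process_overlapping_blocks(fake, real, block_fn, p=48, stride=None):
    """Apply block_fn(fake_block, real_block) on overlapping p x p blocks and blend.

    fake, real: 2-D arrays of equal shape, each side at least p pixels long.
    block_fn:   callable returning a p x p array for each pair of blocks.
    stride:     step between blocks; defaults to p // 2 and should not exceed p.
    """
    fake = np.asarray(fake, dtype=float)
    real = np.asarray(real, dtype=float)
    h, w = fake.shape
    if h < p or w < p:
        raise ValueError("image is smaller than the block size")
    stride = stride or max(1, p // 2)

    # Drop the zero endpoints of the Hann window so every weight is positive.
    win1d = np.hanning(p + 2)[1:-1]
    window = np.outer(win1d, win1d)

    ys = list(range(0, h - p + 1, stride))
    xs = list(range(0, w - p + 1, stride))
    # Add a final block flush with the border when the stride does not land on it.
    if ys[-1] != h - p:
        ys.append(h - p)
    if xs[-1] != w - p:
        xs.append(w - p)

    out = np.zeros((h, w))
    weight = np.zeros((h, w))
    for y in ys:
        for x in xs:
            block = block_fn(fake[y:y + p, x:x + p], real[y:y + p, x:x + p])
            out[y:y + p, x:x + p] += window * block
            weight[y:y + p, x:x + p] += window
    # Normalizing by the accumulated weights turns the sum into a weighted average.
    return out / weight


rng = np.random.default_rng(0)
fake_img = rng.random((100, 120))
real_img = rng.random((100, 120))
blended = process_overlapping_blocks(fake_img, real_img, lambda f, r: 0.5 * (f + r))
```

The tapered window gives block centers more weight than block edges, which is what suppresses visible seams where neighbouring blocks receive different corrections.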

Parameter	Value	ASR	$\Delta p$	PSNR/dB	SSIM	LPIPS
Temperature τ	1.0	58.9%	0.565	31.79	0.970	0.053
Temperature τ	2.0	58.3%	0.564	31.73	0.970	0.054
Temperature τ	5.0	58.6%	0.553	31.85	0.970	0.053
Temperature τ	10.0	58.4%	0.557	31.78	0.971	0.052
Temperature τ	15.0	56.9%	0.544	32.08	0.972	0.050
Temperature τ	20.0	56.6%	0.540	31.97	0.972	0.050
Temperature τ	25.0	55.4%	0.530	32.21	0.972	0.049
Temperature τ	50.0	51.6%	0.501	32.37	0.973	0.047
Temperature τ	100.0	48.6%	0.465	32.54	0.974	0.045
Temperature τ	200.0	40.7%	0.398	32.95	0.976	0.041
Attack strength κ	0.5	4.1%	0.046	43.20	0.996	0.008
Attack strength κ	1.0	20.3%	0.206	37.11	0.988	0.024
Attack strength κ	1.5	41.2%	0.409	33.90	0.979	0.038
Attack strength κ	2.0	58.4%	0.557	31.78	0.971	0.052
Attack strength κ	2.5	68.2%	0.650	30.65	0.964	0.063
Attack strength κ	3.0	74.9%	0.705	29.97	0.959	0.071
Attack strength κ	4.0	80.1%	0.746	29.30	0.954	0.082
Block size p	8	78.1%	0.742	27.63	0.939	0.075
Block size p	16	63.3%	0.598	29.96	0.960	0.051
Block size p	32	47.3%	0.459	31.54	0.970	0.046
Block size p	48	58.4%	0.557	31.78	0.971	0.052
Block size p	64	61.6%	0.584	32.06	0.970	0.058
Block size p	80	60.2%	0.566	31.95	0.968	0.064

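Scenario	Configuration	ASR	$\Delta p$	PSNR/dB
Pose Mismatch	Randomly sampled video frames (pose differences present)	52.8%	0.508	24.6
Jitter (Shift)	Random pixel shift	48.9%	0.474	27.1
Jitter (Scale)	Random scaling	41.0%	0.397	27.5
Hybrid	Combined shift and scale perturbation	42.3%	0.404	26.2
Baseline	Exact pixel-level alignment	58.4%	0.557	31.5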
