Netinfo Security ›› 2026, Vol. 26 ›› Issue (3): 432-441. doi: 10.3969/j.issn.1671-1122.2026.03.009


FEViT: A Frequency Domain Enhanced ViT for Deepfake Detection

CHEN Yuqi1, QIAN Hanwei1,2, XIA Lingling1, WANG Qun1

  1. Department of Computer Information and Cyber Security, Jiangsu Police Institute, Nanjing 210031, China
  2. State Key Laboratory for Novel Software Technology at Nanjing University, Nanjing 210093, China
  • Received: 2025-08-10 Online: 2026-03-10 Published: 2026-03-30

Abstract:

The rapid advancement of deepfake technology has raised growing social security concerns, including AI-based face swapping, identity forgery, portrait rights violations, and the spread of false information. Existing deepfake detection methods often depend heavily on specific datasets; the resulting data bias makes it difficult to capture forgery features that generalize across algorithms and scenarios. As a result, these methods tend to lose detection accuracy and generalize poorly when faced with novel forgery techniques. To address this, this study proposes FEViT, a deepfake detection method that integrates high-frequency artifact information with a vision transformer (ViT) to improve generalization across forgeries from diverse sources. The approach uses a multi-dimensional optimization strategy. First, high-frequency artifact features are extracted by combining the Fourier transform with high-pass filtering, which amplifies frequency-domain differences. Second, three optimizations are applied to the ViT architecture to improve sensitivity to local anomalies and strengthen the classification of complex features. Experimental results on multiple public datasets show that the proposed method outperforms existing detection techniques in accuracy, AUC, and F1 score, with an average accuracy increase of 8% to 16.4%, indicating strong detection performance and generalization ability.
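The frequency-domain step named in the abstract (Fourier transform followed by high-pass filtering) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the ideal hard-edged mask, the function name `high_pass_filter`, and the `cutoff` parameter are assumptions chosen for illustration, and the abstract does not specify the filter shape or cutoff used in FEViT.

```python
import numpy as np


def high_pass_filter(image: np.ndarray, cutoff: float = 0.1) -> np.ndarray:
    """Return the high-frequency residual of a 2-D grayscale image.

    `cutoff` is a hypothetical parameter: the radius of the suppressed
    low-frequency disc, as a fraction of the shorter image side.
    """
    h, w = image.shape

    # 2-D FFT, with the zero-frequency component shifted to the centre.
    spectrum = np.fft.fftshift(np.fft.fft2(image))

    # Ideal high-pass mask: False inside the central disc, True outside.
    yy, xx = np.ogrid[:h, :w]
    dist = np.hypot(yy - h // 2, xx - w // 2)
    mask = dist > cutoff * min(h, w)

    # Suppress low frequencies, undo the shift, and invert the transform.
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * mask))
    return filtered.real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    face = rng.random((64, 64))  # stand-in for a grayscale face crop
    residual = high_pass_filter(face, cutoff=0.1)
    print(residual.shape)
```

In a pipeline of the kind the abstract describes, a residual like this would be passed to the transformer alongside or instead of the raw pixels; how FEViT actually combines the two is not stated in the abstract.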

Key words: deepfake detection, vision transformer, high-frequency artifacts, Fourier transform
