Netinfo Security ›› 2026, Vol. 26 ›› Issue (3): 367-377. doi: 10.3969/j.issn.1671-1122.2026.03.003

• Selected Papers •

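A Review on the Authenticity Verification of Deepfake Speech

XU Yanwei¹,², TU Min¹,², ZHANG Liang¹,²

  1. School of Cyber Security, Jiangxi Police College, Nanchang 330100, China
  2. Jiangxi Provincial Key Laboratory of Electronic Data Control and Forensics, Nanchang 330100, China
  • Received: 2025-08-10 Online: 2026-03-10 Published: 2026-03-30
  • Corresponding author: XU Yanwei E-mail: xywlbq@qq.com
  • About the authors: XU Yanwei (born 1982), female, from Jiangxi, lecturer, master's degree; main research interests: network security and voiceprint identification | TU Min (born 1967), female, from Jiangxi, professor, bachelor's degree; main research interests: network security and computer forensics | ZHANG Liang (born 1976), female, from Jiangxi, associate professor, master's degree; main research interests: network security and forensic examination of audio-visual materials
  • Funding:
    Key Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ2202302); Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ2402204)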

Abstract:

With deepfake speech technology increasingly misused in telecom fraud and online disinformation, verifying the authenticity of high-fidelity synthetic speech poses severe challenges for forensic practice. This paper takes deepfake-oriented forensic speech authentication as its research subject and constructs a technical framework that combines originality verification, integrity verification, and deepfake detection. For originality verification, it analyzes the methods and applicable scope of consistency checks between the recording device and the system environment, and of logical verification of file attributes and metadata. For integrity verification, it systematically describes the technical approaches of auditory and visual examination, spectrographic examination, and other signal analyses. For deepfake detection, it summarizes detection methods, benchmark datasets, and evaluation metrics along two dimensions: global discrimination and local tampering localization. The review indicates that an integrated technical approach combining file attribute analysis, traditional acoustic examination, and deep learning detection helps ensure the interpretability, verifiability, and judicial applicability of forensic examination, providing a theoretical basis and technical support for speech authenticity verification in complex network environments.

Key words: audio authenticity verification, deepfake speech detection, deep learning, sonogram examination

CLC Number: