A review of research on face Deepfake detection methods
Pages: 1-21 (2024)
Published Online: 23 December 2024
DOI: 10.11834/jig.240586
Yao Wenda, Li Panchi, Zhao Ya, et al. A review of research on face Deepfake detection methods[J]. Journal of Image and Graphics,
Deepfake technology is a deep-learning-based synthesis technique aimed at generating highly realistic synthetic images, audio, or video, covering face forgery, voice imitation, human pose synthesis, and more. Among these, face Deepfakes can achieve highly realistic face swapping and are widely used in film and animation production. However, misuse of the technology has also led to the spread of indecent videos and fake news, causing serious social harm. To counter these negative effects, many researchers have proposed detection methods to effectively identify forged face images and videos. Because current detection methods are diverse in type, differ in strengths and weaknesses, and target different application scenarios, this paper systematically organizes the relevant research. First, it summarizes the datasets and evaluation metrics commonly used in face Deepfake detection. Second, starting from the two areas of image-level and video-level forged-face detection, and according to the features selected, it divides the former into spatial-domain and frequency-domain detection methods and the latter into methods based on spatiotemporal inconsistency, biological features, and multimodal features, summarizing in detail the principles, advantages, disadvantages, and development trends of each category. In particular, given the recent popularity of text-to-image/video generation and the marked progress of generative artificial intelligence in multimodal creation, it also reviews detection methods targeting text-generated images/videos and multimodal detection methods. Finally, it surveys the current state and bottlenecks of face Deepfake detection research and discusses future research directions.
Deepfake technology refers to the synthesis of images, audio, and videos using deep learning algorithms. This technology enables the precise mapping of facial features or other physical characteristics from one person onto a subject in another video, achieving highly realistic face-swapping effects. With advancements in algorithms and the increased accessibility of computational resources, the threshold for utilizing Deepfake technology has gradually lowered, which brings both convenience and numerous social and legal challenges. For example, Deepfake technology is used to bring deceased actors back to the screen, providing a novel experience for audiences. On the other hand, it is often exploited to impersonate citizens or leaders for fraudulent activities, produce pornographic content, or create fake news to influence public opinion. Consequently, the importance of Deepfake detection technology is growing, making it a significant focus of current research. To detect images and videos synthesized by Deepfake technology, researchers need to design models that can uncover subtle traces of manipulation within these media. However, accurately identifying these traces remains challenging due to several factors that complicate the detection process. Firstly, the rapid advancements in Deepfake technology have made it increasingly difficult to differentiate fake images and videos from authentic content. As techniques like generative adversarial networks (GANs) and diffusion models continue to evolve and improve, the texture, lighting, and motion within synthesized media become more seamlessly realistic, which imposes significant challenges on detection models that seek to recognize subtle cues of manipulation. Secondly, forgers can employ a variety of countermeasures to obscure these traces of manipulation, such as applying compression, cropping, or noise addition. 
Additionally, forgers may create adversarial samples specifically crafted to exploit and bypass detection models’ vulnerabilities, making the identification of Deepfakes even more complex. Thirdly, the generalizability of Deepfake detection methods remains a significant hurdle, as different generative techniques leave behind distinct forensic traces. For instance, GAN-generated images often exhibit prominent grid-like artifacts in the frequency domain, while images produced through diffusion models typically leave only subtle, less detectable traces in this domain. Therefore, it is crucial that detection models do not exclusively rely on low-level, technique-specific features but instead focus on capturing deep, generalized features that ensure robustness and applicability across diverse forgery types and detection scenarios. To address these multifaceted challenges, numerous scholars have proposed a variety of detection methods designed to capture the nuanced traces left by Deepfake manipulations. For instance, certain approaches concentrate on identifying subtle forgery artifacts within the frequency domain of images, capitalizing on the distinct spectral anomalies that forgeries often introduce. Other methods prioritize assessing the temporal consistency across video frames, as unnatural transitions or frame-level inconsistencies can indicate synthesized content. Additionally, some detection strategies focus on evaluating the synchronization between different modalities within videos—such as audio and visual elements—to detect inconsistencies that may reveal forgery. Currently, several review papers in academia have summarized key research and developments within this domain. However, due to the rapid advancements in generative artificial intelligence, fake faces created with diffusion models have recently gained popularity, yet there is scarcely any review that addresses the detection of such forgeries. 
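The frequency-domain traces mentioned above can be made concrete with a minimal sketch. The following illustrative Python snippet (the function names and the cutoff parameter are our own, not from any method surveyed here) computes the centered log-magnitude Fourier spectrum, in which GAN upsampling artifacts typically appear as periodic peaks, together with a crude high-frequency energy statistic; practical detectors learn far richer frequency features than this.

```python
import numpy as np

def log_spectrum(image: np.ndarray) -> np.ndarray:
    """Centered log-magnitude 2D Fourier spectrum of a grayscale image.

    GAN upsampling often leaves periodic, grid-like peaks in this
    spectrum, whereas diffusion-model images tend to show only weak
    spectral anomalies.
    """
    f = np.fft.fftshift(np.fft.fft2(image.astype(np.float64)))
    return np.log1p(np.abs(f))

def high_freq_energy_ratio(image: np.ndarray, cutoff: float = 0.25) -> float:
    """Fraction of spectral energy outside a central low-frequency square.

    A deliberately simple, illustrative statistic: unusually high values
    can hint at upsampling artifacts, but it is not a detector by itself.
    """
    spec = np.abs(np.fft.fftshift(np.fft.fft2(image.astype(np.float64)))) ** 2
    h, w = spec.shape
    ch, cw = int(h * cutoff), int(w * cutoff)
    low = spec[h // 2 - ch : h // 2 + ch, w // 2 - cw : w // 2 + cw].sum()
    return float(1.0 - low / spec.sum())
```

Such hand-crafted spectral statistics are exactly the kind of low-level, technique-specific cue that the generalization discussion above warns against relying on exclusively.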
Furthermore, as generative AI continues advancing toward multimodal integration, Deepfake detection methods are similarly evolving to incorporate features from multiple modalities. Nonetheless, the majority of existing reviews lack sufficient focus on multimodal detection approaches, underscoring a gap in the literature that this review seeks to address. To provide an up-to-date overview of face Deepfake detection, this review first organizes commonly used datasets and evaluation metrics in the field. Then, it divides the detection methods into image-level and video-level face Deepfake detection. Based on feature selection approaches, image-level methods are categorized into spatial-domain and frequency-domain methods, while video-level methods are categorized into approaches based on spatiotemporal inconsistencies, biological features, and multimodal features. Each category is thoroughly analyzed regarding its principles, strengths, weaknesses, and developmental trends. Finally, the current research status and challenges in face Deepfake detection are summarized, and future research directions are discussed. Compared with other related reviews, the novelty of this review lies in its summary of detection methods specifically targeting text-to-image/video generation and multimodal detection methods. This review aligns with the latest trends in generative artificial intelligence, offering a thorough and up-to-date summary of recent advancements in face Deepfake detection. By examining the latest methodologies, including those developed to address forgeries created through advanced techniques like diffusion models and multimodal integration, this review reflects the ongoing evolution of detection technology. It highlights both the progress made and the challenges that remain, positioning itself as a valuable resource for researchers aiming to navigate and contribute to the cutting-edge developments in this rapidly advancing field.
A comprehensive analysis of face Deepfake detection methods reveals that current techniques achieve nearly 100% accuracy within the training datasets, particularly those leveraging advanced models like Transformers. However, their performance often declines significantly in cross-dataset testing, especially for spatial-domain and frequency-domain detection methods. This decline suggests that these approaches may fail to capture essential, generalizable features that are robust across varying datasets. By contrast, biological feature-based methods demonstrate superior generalization capabilities, successfully adapting to different contexts, yet they require carefully tailored training data and specific application conditions to reach optimal performance. Meanwhile, multimodal detection methods, which integrate features across multiple modalities, offer enhanced robustness and adaptability due to their layered approach; however, this added complexity often results in higher computational costs and increased model intricacy. Given the diversity in feature selection, alongside the unique advantages and limitations inherent to each detection approach, no single method yet provides a fully comprehensive solution to the Deepfake detection challenge. This reality underscores the critical need for continued research in this evolving field and highlights the importance of this review in mapping current advancements and identifying future research directions.
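The in-dataset versus cross-dataset comparison above is usually reported with frame-level AUC rather than raw accuracy. As an illustrative aside (this is a generic, library-free sketch of the rank-based AUC definition, assuming binary labels where 1 means fake and higher scores mean "more likely fake"; it is not taken from any surveyed method):

```python
def roc_auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen fake
    (label 1) receives a higher score than a randomly chosen real
    (label 0), with ties counted as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Computing this separately on held-out splits of the training dataset and on an unseen dataset makes the generalization gap described above directly measurable.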
Deepfake detection; face forgery detection; face image; face video; spatial domain features; frequency domain features; temporal features; multimodal features
Afchar D, Nozick V, Yamagishi J and Echizen I. 2018. Mesonet: a compact facial video forgery detection network//Proceedings of 10th IEEE International Workshop on Information Forensics and Security (WIFS). New York: IEEE: 1-7 [DOI: 10.1109/WIFS.2018. 8630761http://dx.doi.org/10.1109/WIFS.2018.8630761]
Agarwal A, Agarwal A, Sinha S, Vatsa M and Singh R. 2021. MD-CSDNetwork: Multi-domain cross stitched network for deepfake detection//Proceedings of the 16th IEEE International Conference on Automatic Face and Gesture Recognition (FG). New York: IEEE: 1-8 [DOI: 10.1109/FG52635.2021.9666937http://dx.doi.org/10.1109/FG52635.2021.9666937]
Aghasanli A, Kangin D and Angelov P. 2023. Interpretable-through- prototypes deepfake detection for diffusion models//Proceedings of the 2023 IEEE/CVF International Conference on Computer Vision (ICCV). Los Alamitos: IEEE Computer Soc: 467-474 [DOI: 10.1109/ICCVW60793.2023.00053http://dx.doi.org/10.1109/ICCVW60793.2023.00053]
Amoroso R, Morelli D, Cornia M, Baraldi L, Del Bimbo A and Cucchiara R. 2024. Parents and children: Distinguishing multi- modal deepfakes from natural images. ACM Transactions on Multimedia Computing, Communications, and Applications: [DOI: 10.1145/3665497http://dx.doi.org/10.1145/3665497]
Aneja S and Nießner M. 2020. Generalized zero and few-shot transfer for facial forgery detection[EB/OL]. [2024-11-07]. https://arxiv. org/abs/2006.11863https://arxiv.org/abs/2006.11863
Aneja S, Markhasin L and Nießner M. 2022. TAFIM: Targeted adversarial attacks against facial image manipulations//Proceedings of the 17th European Conference on Computer Vision. Cham: Springer Nature Switzerland: 58-75 [DOI: 10.1007/978-3-031- 19781-9_4http://dx.doi.org/10.1007/978-3-031-19781-9_4]
AV A, Das S and Das A. 2024. Latent flow diffusion for deepfake video generation//Proceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition. Los Alamitos: IEEE: 3781-3790
Binh L M and Woo S. 2022. ADD: Frequency attention and multi-view based knowledge distillation to detect low-quality compressed deepfake images//Proceedings of the 36th AAAI Conference on Artificial Intelligence. Palo Alto: Assoc Advancement Artificial Intelligence: 122-130 [DOI: 10.1609/aaai.v36i1.19886http://dx.doi.org/10.1609/aaai.v36i1.19886]
Cai Z, Stefanov K, Dhall A and Hayat M. 2022. Do you really mean that? content driven audio-visual deepfake dataset and multimodal method for temporal forgery localization//Proceedings of the 2022 International Conference on Digital Image Computing: Techniques and Applications (DICTA). Los Alamitos: IEEE: 1-10 [DOI: 10.1109/DICTA56598.2022.10034605http://dx.doi.org/10.1109/DICTA56598.2022.10034605]
Cao J Y, Ma C, Yao T P, Chen S, Ding S H and Yang X K. 2022a. End-to-end reconstruction-classification learning for face forgery detection//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Computer Soc: 4103-4112 [DOI: 10.1109/CVPR52688.2022.00408http://dx.doi.org/10.1109/CVPR52688.2022.00408]
Cao S H, Liu X H, Mao X Q, and Zou Q. 2022b. A review of human face forgery and forgery-detection technologies. Journal of Image and Graphics, 27(04): 1023-1038
曹申豪, 刘晓辉, 毛秀青, 邹勤. 2022b. 人脸伪造及检测技术综述. 中国图象图形学报, 27(04): 1023-1038 [DOI:10.11834/jig.200466http://dx.doi.org/10.11834/jig.200466]
Chen H, Lin Y Z and Li B. 2022a. Exposing face forgery clues via retinex-based image enhancement//Proceedings of the Asian Conference on Computer Vision. Berlin: Springer-Verlag Berlin: 20-34 [DOI: 10.1007/978-3-031-26316-3_2http://dx.doi.org/10.1007/978-3-031-26316-3_2]
Chen L, Zhang Y, Song Y B, Liu L Q and Wang J. 2022b. Self-supervised learning of adversarial example: Towards good generalizations for deepfake detection//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Reco- gnition (CVPR). Los Alamitos: IEEE Computer Soc: 18689-18698 [DOI: 10.1109/CVPR52688.2022.01815http://dx.doi.org/10.1109/CVPR52688.2022.01815]
Chen P, Liu J, Liang T, Zhou G Z, Gao H C, Dai J and Han J Z. 2020. FSSpotter: Spotting face-swapped video by spatial and temporal clues//Proceedings of the 2020 IEEE International Conference on Multimedia and Expo (ICME). New York: IEEE: 1-6 [DOI: 10.1109/ICME46284.2020.9102914http://dx.doi.org/10.1109/ICME46284.2020.9102914]
Chen S, Yao T P, Chen Y, Ding S H, Li J L and Ji R R. 2021. Local relation learning for face forgery detection//Proceedings of the AAAI conference on artificial intelligence. Palo Alto: Assoc Advancement Artificial Intelligence: 1081-1088 [DOI: 10.1609/ aaai.v35i2.16193http://dx.doi.org/10.1609/aaai.v35i2.16193]
Chen W and McDuff D. 2018. Deepphys: Video-based physiological measurement using convolutional attention networks//Proceedings of the European Conference on Computer Vision (ECCV). Berlin: Springer: 349-365 [DOI: 10.1007/978-3-030-01216-8_22http://dx.doi.org/10.1007/978-3-030-01216-8_22]
Cheng H, Guo Y, Wang T, Nie L and Kankanhalli M. 2024a. Diffusion facial forgery detection//Proceedings of the 32nd ACM Internation- al Conference on Multimedia. New York: Association for Comput- ing Machinery: 5939-5948 [DOI: 10.1145/3664647.3680797http://dx.doi.org/10.1145/3664647.3680797]
Ciftci U A, Demir I and Yin L. 2020. Fakecatcher: Detection of synthetic portrait videos using biological signals. IEEE Transac- tions on Pattern Analysis and Machine Intelligence [DOI: 10.1109/ TPAMI.2020.3009287http://dx.doi.org/10.1109/TPAMI.2020.3009287]
Dai Y S, Fei J W, Xia Z H, Liu J N and Wong J. 2023. Local similarity anomaly for general face forgery detection. Journal of Image and Graphics, 28(11): 3453-3470
戴昀书, 费建伟, 夏志华, 刘家男, 翁健. 2023. 局部相似度异常的强泛化性伪造人脸检测. 中国图象图形学报, 28(11): 3453-3470 [DOI:10.11834/jig.221006http://dx.doi.org/10.11834/jig.221006]
Dolhansky B, Bitton J, Pflaum B, Lu J, Howes R, Wang M and Ferrer C C. 2020. The deepfake detection challenge (dfdc) dataset[EB/OL]. [2024-11-07]. https://arxiv.org/abs/2006.07397https://arxiv.org/abs/2006.07397
Durall R, Keuper M, Pfreundt F J and Keuper J. 2019. Unmasking deepfakes with simple features[EB/OL]. [2024-11-07]. https://arxiv. org/abs/1911.00686https://arxiv.org/abs/1911.00686
Feng C, Chen Z Y and Owens A. 2023. Self-supervised video forensics by audio-visual anomaly detection//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos: IEEE Computer Soc: 10491-10503 [DOI: 10.1109/ CVPR52729.2023.01011http://dx.doi.org/10.1109/CVPR52729.2023.01011]
Feng C, Liu C, Wang Y and Zhou Q. 2024. Face forgery detection with image patch comparison and residual map estimation. Journal of Image and Graphics, 29(2): 457-467
冯才博, 刘春晓, 王昱烨, 周其当. 2024. 结合图像块比较与残差图估计的人脸伪造检测. 中国图象图形学报, 29(2): 457-467 [DOI: 10.11834/jig.230149http://dx.doi.org/10.11834/jig.230149]
Frank J, Eisenhofer T, Schönherr L, Fischer A, Kolossa D and Holz T. 2019. Leveraging frequency analysis for deep fake image recognition//Proceedings of the 25th Americas Conference on Information Systems (AMCIS 2019). Atlanta: Assoc Information Systems: 3247-3258
Fung S O, Lu X Q, Zhang C and Li C S. 2021. DeepfakeUCL: Deepfake detection via unsupervised contrastive learning//Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN). New York: IEEE: 1-8 [DOI: 10.1109/IJCNN52387.2021.9534089http://dx.doi.org/10.1109/IJCNN52387.2021.9534089]
Gandhi A and Jain S. 2020. Adversarial perturbations fool deepfake detectors//Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN). New York: IEEE: 1-8 [DOI: 10.1109/IJCNN48605.2020.9207034http://dx.doi.org/10.1109/IJCNN48605.2020.9207034]
Gu Q Q, Chen S, Yao T P, Chen Y, Ding S H and Yi R. 2022a. Exploiting fine-grained face forgery clues via progressive enhancement learning//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto: Assoc Advancement Artificial Intelligence: 735-743 [DOI: 10.1609/aaai.v36i1.19954http://dx.doi.org/10.1609/aaai.v36i1.19954]
Gu Z H, Chen Y, Yao T P, Ding S H, Li J L and Ma L Z. 2022b. Delving into the local: Dynamic inconsistency learning for deepfake video detection//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto: Assoc Advancement Artificial Intelligence: 744-752 [DOI: 10.1609/aaai.v36i1.19955http://dx.doi.org/10.1609/aaai.v36i1.19955]
Güera D and Delp E J. 2018. Deepfake video detection using recurrent neural networks//Proceedings of the 15th IEEE international conf- erence on advanced video and signal based surveillance (AVSS). New York: IEEE: 127-132 [DOI:10.1109/AICCIT57614.2023. 10217956http://dx.doi.org/10.1109/AICCIT57614.2023.10217956]
Haliassos A, Mira R, Petridis S and Pantic M. 2022. Leveraging real talking faces via self-supervision for robust forgery detection//Pro- ceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos: IEEE Computer Soc: 14930-14942 [DOI: 10.1109/CVPR52688.2022.01453http://dx.doi.org/10.1109/CVPR52688.2022.01453]
Haliassos A, Vougioukas K, Petridis S and Pantic M. 2021. Lips don't lie: A generalisable and robust approach to face forgery detection//Pro- ceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Comp Soc: 5039-5049 [DOI: 10.1109/CVPR46437.2021.00500http://dx.doi.org/10.1109/CVPR46437.2021.00500]
Hashmi A, Shahzad S, Ahmad W, Lin C, Tsao Y and Wang H. 2023. Multimodal forgery detection using ensemble learning//Proceed- ings of the 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference. New York: IEEE: 1524-1532 [DOI: 10.23919/APSIPAASC55919.2022.9980255http://dx.doi.org/10.23919/APSIPAASC55919.2022.9980255]
Heo Y J, Yeo W H and Kim B G. 2023. DeepFake detection algorithm based on improved vision transformer. Applied Intelligence, 53(7): 7512-7527 [DOI: 10.1007/s10489-022-03867-9http://dx.doi.org/10.1007/s10489-022-03867-9]
Hernandez-Ortega J, Tolosana R, Fierrez J and Morales A. 2020. DeepFakesOn-Phys: Deepfakes detection based on heart rate estimation[EB/OL]. [2024-11-07]. https://arxiv.org/abs/2010.00400https://arxiv.org/abs/2010.00400
Hooda A, Mangaokar N, Feng R, Fawaz K, Jha S and Prakash A. 2024.
D4: Detection of adversarial diffusion deepfakes using disjoint ensembles//Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. New York: IEEE: 3812-3822 [DOI: 10.1109/WACV57701.2024.00377http://dx.doi.org/10.1109/WACV57701.2024.00377]
Hu S, Li Y Z and Lyu S W. 2021. Exposing GAN-generated faces using inconsistent corneal specular highlights//Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). New York: IEEE: 2500-2504 [DOI: 10.1109/ICASSP39728.2021.9414582http://dx.doi.org/10.1109/ICASSP39728.2021.9414582]
Jiang Y, Huang Z, Pan X, Loy C C and Liu Z. 2021. Talk-to-edit: Fine-grained facial editing via dialog//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE: 13799-13808 [DOI: 10.1109/ICCV48922.2021.01354http://dx.doi.org/10.1109/ICCV48922.2021.01354]
Karras T, Laine S and Aila T. 2019a. A style-based generator architecture for generative adversarial networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(12): 4217-4228 [DOI: 10.1109/TPAMI.2020.2970919http://dx.doi.org/10.1109/TPAMI.2020.2970919]
Khalid H, Tariq S, Kim M and Woo S S. 2021. FakeAVCeleb: A novel audio-video multimodal deepfake dataset[EB/OL]. [2024-11-07]. https://arxiv.org/abs/2108.05080https://arxiv.org/abs/2108.05080
Khan S A and Dai H. 2021. Video transformer for deepfake detection with incremental learning//Proceedings of the 29th ACM Interna- tional Conference on Multimedia. New Yrok: Association for Computing Machinery: 1821-1828 [DOI: 10.1145/3474085. 3475332http://dx.doi.org/10.1145/3474085.3475332]
Khormali A and Yuan J S. 2024. Self-supervised graph transformer for deepfake detection. IEEE Access, 12: 58114-58127 [DOI: 10.1109/ access.2024.3392512http://dx.doi.org/10.1109/access.2024.3392512]
Korshunov P and Marcel S. 2018. Deepfakes: A new threat to face recognition? assessment and detection[EB/OL]. [2024-11-07]. https://arxiv.org/abs/1812.08685https://arxiv.org/abs/1812.08685
Korshunova I, Shi W Z, Dambre J and Theis L. 2017. Fast face-swap using convolutional neural networks//Proceedings of the 16th IEEE International Conference on Computer Vision (ICCV). New York: IEEE: 3697-3705 [DOI: 10.1109/ICCV.2017.397http://dx.doi.org/10.1109/ICCV.2017.397]
Lee S, Tariq S, Kim J and Woo S S. 2021. TAR: Generalized forensic framework to detect deepfakes using weakly supervised learning. ICT Systems Security and Privacy Protection, 625: 351-366 [DOI: 10.1007/978-3-030-78120-0_23http://dx.doi.org/10.1007/978-3-030-78120-0_23]
Lewis J K, Toubal I E, Chen H, Sandesera V, Lomnitz M, Hampel-Arias Z, Prasad C and Palaniappan K. 2020. Deepfake video detection based on spatial, spectral, and temporal inconsistencies using multimodal deep learning//Proceedings of the 2020 IEEE Applied Imagery Pattern Recognition Workshop (AIPR). New York: IEEE: 1-9 [DOI: 10.1109/AIPR50011.2020.9425167http://dx.doi.org/10.1109/AIPR50011.2020.9425167]
Li H D, Li B, Tan S Q and Huang J W. 2020b. Identification of deep network generated images using disparities in color components. Signal Process, 174 [DOI: 10.1016/j.sigpro.2020.107616http://dx.doi.org/10.1016/j.sigpro.2020.107616]
Li J M, Xie H T, Yu L Y and Zhang Y D. 2022. Wavelet-enhanced weakly supervised local feature learning for face forgery detection//Proceedings of the 30th ACM International Conference on Multimedia. New York: Assoc Computing Machinery: 1299-1308 [DOI: 10.1145/3503161.3547832http://dx.doi.org/10.1145/3503161.3547832]
Li J, Xie H, Li J, Wang Z and Zhang Y. 2021. Frequency-aware discriminative feature learning supervised by single-center loss for face forgery detection//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE: 6458-6467 [DOI: 10.1109/CVPR46437.2021.00639http://dx.doi.org/10.1109/CVPR46437.2021.00639]
Li L Z, Bao J M, Zhang T, Yang H, Chen D, Wen F and Guo B N. 2020c. Face X-ray for more general face forgery detection//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New York: IEEE: 5000-5009 [DOI: 10.1109/ CVPR42600.2020.00505http://dx.doi.org/10.1109/CVPR42600.2020.00505]
Li X D, Lang Y N, Chen Y F, Mao X F, He Y, Wang S H, Xue H and Lu Q. 2020d. Sharp multiple instance learning for deepfake video detection//Proceedings of the 28th ACM International Conference on Multimedia. New York: Assoc Computing Machinery: 1864-1872 [DOI: 10.1145/3394171.3414034http://dx.doi.org/10.1145/3394171.3414034]
Li Y and Lyu S. 2018a. Exposing deepfake videos by detecting face warping artifacts[EB/OL]. [2024-11-07]. https://arxiv.org/abs/1811. 00656https://arxiv.org/abs/1811.00656
Li Y, Bian S, Wang C T and Lu W. 2023. CNN and Transformer-coord- inated deepfake detection. Journal of Image and Graphics, 28(03): 0804-0819
李颖, 边山, 王春桃, 卢伟. 2023. CNN结合Transformer的深度伪造高效检测. 中国图象图形学报, 28(03): 0804-0819 [DOI:10.11834/jig.220519http://dx.doi.org/10.11834/jig.220519]
Li Y, Chang M C and Lyu S. 2018b. In ictu oculi: Exposing ai created fake videos by detecting eye blinking//Proceedings of the 2018 IEEE International Workshop on Information Forensics and Security (WIFS). New York: IEEE: 1-7 [DOI:10.1109/WIFS.2018. 8630787http://dx.doi.org/10.1109/WIFS.2018.8630787]
Li Y, Yang X, Sun P, Qi H and Lyu S. 2020a. Celeb-DF: A large-scale challenging dataset for deepfake forensics//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recogni- tion. Los Alamitos: IEEE: 3207-3216 [DOI:10.1109/CVPR42600. 2020.00327http://dx.doi.org/10.1109/CVPR42600.2020.00327]
Liu A A, Su Y T, Wang L J, Li B, Qian Z X, Zhang W M, Zhou L N, Zhang X P, Zhang Y D, Huang J W and Yu N H. 2024. Review on the progress of the AIGC visual content generation and traceability. Journal of Image and Graphics, 29(06): 1535-1554
刘安安, 苏育挺, 王岚君, 李斌, 钱振兴, 张卫明, 周琳娜, 张新鹏, 张勇东, 黄继武, 俞能海. 2024b. AIGC视觉内容生成与溯源研究进展. 中国图象图形学报, 29(06): 1535-1554 [DOI:10.11834/jig. 240003http://dx.doi.org/10.11834/jig.240003]
Liu B, Liu B, Ding M and Zhu T. 2024a. Detection of diffusion model-generated faces by assessing smoothness and noise tolerance//Proceedings of the 2024 IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB). IEEE: 1-6 [DOI: 10.1109/BMSB62888.2024.10608232http://dx.doi.org/10.1109/BMSB62888.2024.10608232]
Lorenz P, Durall R L and Keuper J. 2023. Detecting images generated by deep diffusion models using their local intrinsic dimension- ality//Proceedings of 2023 IEEE/CVF International Conference on Computer Vision (ICCV). Los Alamitos: IEEE Computer Soc: 448-459 [DOI: 10.1109/ICCVW60793.2023.00051http://dx.doi.org/10.1109/ICCVW60793.2023.00051]
Lu W, Sun W and Lu H T. 2009. Robust watermarking based on DWT and nonnegative matrix factorization. Computers and Electrical Engineering, 35(1): 183-188 [DOI: 10.1016/j.compeleceng.2008. 09.004http://dx.doi.org/10.1016/j.compeleceng.2008.09.004]
Mandelli S, Bonettini N O, Bestagini P and Tubaro S. 2022. Detecting GAN-generated images by orthogonal training of multiple CNNs//Proceedings of the 2022 IEEE International Conference on Image Processing (ICIP). New York: IEEE: 3091-3095 [DOI: 10.1109/ICIP46576.2022.9897310http://dx.doi.org/10.1109/ICIP46576.2022.9897310]
Mao M and Yang J. 2021. Exposing deepfake with pixel-wise AR and PPG correlation from faint signals[EB/OL]. [2024-11-07]. https:// arxiv.org/abs/2110.15561https://arxiv.org/abs/2110.15561
Matern F, Riess C and Stamminger M. 2019. Exploiting visual artifacts to expose deepfakes and face manipulations//Proceedings of the 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW). New York: IEEE: 83-92 [DOI:10.1109/WACVW.2019. 00020http://dx.doi.org/10.1109/WACVW.2019.00020]
Mehta D, Mehta A and Narang P. 2024. LDFaceNet: Latent diffusion- based network for high-fidelity deepfake generation[EB/OL]. [2024-11-7]. https://arxiv.org/abs/2303.04226https://arxiv.org/abs/2303.04226
Miao Q, Kang S, Marsella S, DiPaola S, Wang C and Shapiro A. 2022. Study of detecting behavioral signatures within deepfake videos[EB/OL]. [2024-11-07]. https://arxiv.org/abs/2208.03561https://arxiv.org/abs/2208.03561
Nguyen H H, Yamagishi J and Echizen I. 2019. Capsule-forensics: Using capsule networks to detect forged images and videos//Proceedings of the 44th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). New York: IEEE: 2307-2311 [DOI: 10.1109/ICASSP.2019.8682602http://dx.doi.org/10.1109/ICASSP.2019.8682602]
Oorloff T, Koppisetti S, Bonettini N, Solanki D, Colman B, Yacoob Y, Shahriyari A and Bharaj G. 2024. AVFF: Audio-visual feature fusion for video deepfake detection//Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recogni- tion. Los Alamitos: IEEE: 27102-27112 [DOI: 10.1109/ CVPR52733.2024.02559http://dx.doi.org/10.1109/CVPR52733.2024.02559]
Papa L, Faiella L, Corvitto L, Maiano L and Amerini I. 2023. On the use of Stable Diffusion for creating realistic faces: from generation to detection//Proceedings of the 11th International Workshop on Bio- metrics and Forensics (IWBF). New York: IEEE: 1-6 [DOI: 10.1109/IWBF57495.2023.10156981http://dx.doi.org/10.1109/IWBF57495.2023.10156981]
Qian Y, Yin G, Sheng L, Chen Z and Shao J. 2020. Thinking in frequency: Face forgery detection by mining frequency-aware clues[EB/OL]. [2024-11-07]. https://arxiv.org/abs/2007.09355https://arxiv.org/abs/2007.09355
Qu Z M, Yin Q L, Sheng Z Q, Wu J Y, Zhang B L, Yu S R, and Lu W. 2024. Overview of Deepfake proactive defense techniques. Journal of Image and Graphics, 29(02): 0318-0342
瞿左珉, 殷琪林, 盛紫琦, 吴俊彦, 张博林, 余尚戎, 卢伟. 2024. 人脸深度伪造主动防御技术综述. 中国图象图形学报, 29(02): 0318-0342 [DOI:10.11834/jig.230128http://dx.doi.org/10.11834/jig.230128]
Ranjan P, Patil S and Kazi F. 2020. Improved generalizability of deep-fakes detection using transfer learning based CNN frame- work//Proceedings of the 3rd International Conference on Infor- mation and Computer Technologies (ICICT). Los Alamitos: IEEE Computer Soc: 86-90 [DOI: 10.1109/ICICT50521.2020.00021http://dx.doi.org/10.1109/ICICT50521.2020.00021]
Ren J Q, Qin J P, Ma Q L and Cao Y. 2024. FastFaceCLIP: A lightweight text-driven high-quality face image manipulation. IET Comput Vision, 18(7): 950-967 [DOI: 10.1049/cvi2.12295http://dx.doi.org/10.1049/cvi2.12295]
Ricker J, Damm S, Holz T and Fischer A. 2022. Towards the detection of diffusion model deepfakes[EB/OL]. [2024-11-07]. https://arxiv. org/abs/2210.14571https://arxiv.org/abs/2210.14571
Rossler A, Cozzolino D, Verdoliva L, Riess C, Thies J and Niessner M. 2019. Faceforensics++: Learning to detect manipulated facial images//Proceedings of the IEEE/CVF international conference on computer vision. Los Alamitos: IEEE: 1-11 [DOI: 10.1109/ICCV. 2019.00009http://dx.doi.org/10.1109/ICCV.2019.00009]
Rossler A, Cozzolino D, Verdoliva L, Riess C, Thies J and Niessner M. 2018. Faceforensics: A large-scale video dataset for forgery detection in human faces[EB/OL]. [2024-11-07]. https://arxiv.org/ abs/1803.09179https://arxiv.org/abs/1803.09179
Sabir E, Cheng J, Jaiswal A, AbdAlmageed W, Masi I and Natarajan P. 2019. Recurrent convolutional strategies for face manipulation detection in videos[EB/OL]. [2024-11-09]. https://arxiv.org/abs/ 1905.00582https://arxiv.org/abs/1905.00582
Saif S, Tehseen S and Ali S S. 2024. Fake news or real? Detecting deepfake videos using geometric facial structure and graph neural network. Technological Forecasting and Social Change, 205 [DOI: 10.1016/j.techfore.2024.123471http://dx.doi.org/10.1016/j.techfore.2024.123471]
Shahzad S A, Hashmi A, Khan S, Peng Y T, Tsao Y and Wang H M. 2022. Lip sync matters: A novel multimodal forgery detector//Pro- ceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). New York: IEEE: 1885-1892 [DOI: 10.23919/APSIPAASC55919.2022. 9980296http://dx.doi.org/10.23919/APSIPAASC55919.2022.9980296]
Song H K, Woo S H, Lee J, Yang S, Cho H, Lee Y, Choi D and Kim K W. 2022a. Talking face generation with multilingual tts//Pro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Comp Soc: 21425-21430 [DOI: 10.1109/CVPR52688.2022.02074http://dx.doi.org/10.1109/CVPR52688.2022.02074]
Song J W, Liu C X and Zhang X Y. 2024b. LDH: least dependent hiding for screen-shooting resilient watermarking. Journal of Image and Graphics, 29(02): 0408-0418
宋佳维, 刘春晓, 张心怡. 2024b. 最小依赖隐藏的屏摄鲁棒水印方法. 中国图象图形学报, 29(02): 0408-0418 [DOI:10.11834/jig.220811http://dx.doi.org/10.11834/jig.220811]
Song L, Fang Z, Li X, Dong X, Jin Z, Chen Y and Lyu S. 2022b. Adaptive face forgery detection in cross domain//Proceedings of the 17th European Conference on Computer Vision. Berlin: Springer-verlag Berlin: 467-484 [DOI: 10.1007/978-3-031-19830- 4_27http://dx.doi.org/10.1007/978-3-031-19830-4_27]
Song X, Guo X, Zhang J, Li Q, Bai L, Liu X, Zhai G, Liu X. 2024a. On learning multi-modal forgery representation for diffusion generated video detection[EB/OL]. [2024-11-07]. https://arxiv.org/abs/2410. 23623https://arxiv.org/abs/2410.23623
Stefanov K, Paliwal B and Dhall A. 2022. Visual representations of physiological signals for fake video detection[EB/OL]. [2024-11- 07]. https://arxiv.org/abs/2207.08380https://arxiv.org/abs/2207.08380
Sun K, Liu H, Ye Q, Gao Y, Liu J, Shao L and Ji R. 2021b. Domain general face forgery detection by learning to weight//Proceedings of the 35th AAAI Conference on Artificial Intelligence. Palo Alto: Association for the Advancement of Artificial Intelligence: 2638-2646 [DOI: 10.1609/aaai.v35i3.16367http://dx.doi.org/10.1609/aaai.v35i3.16367]
Sun K, Yao T P, Chen S, Ding S H, Li J L and Ji R R. 2022. Dual contrastive learning for general face forgery detection//Proceedings of the AAAI Conference on Artificial Intelligence. Palo Alto: Association for the Advancement of Artificial Intelligence: 2316-2324 [DOI: 10.1609/aaai.v36i2.20130]
Sun Z K, Han Y J, Hua Z Y, Ruan N and Jia W J. 2021a. Improving the efficiency and robustness of deepfakes detection through precise geometric features//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Los Alamitos: IEEE Computer Society: 3608-3617 [DOI: 10.1109/CVPR46437.2021.00361]
Țânțaru D-C, Oneață E and Oneață D. 2024. Weakly-supervised deepfake localization in diffusion-generated images//Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. New York: IEEE: 6258-6268 [DOI: 10.1109/WACV57701.2024.00614]
Thies J, Zollhofer M and Niessner M. 2019b. Deferred neural rendering: Image synthesis using neural textures. ACM Transactions on Graphics, 38(4): 1-12 [DOI: 10.1145/3306346.3323035]
Thies J, Zollhofer M, Stamminger M, Theobalt C and Niessner M. 2019a. Face2Face: Real-time face capture and reenactment of RGB videos. Communications of the ACM, 62(1): 96-104 [DOI: 10.1145/3292039]
Wang C and Deng W. 2021. Representative forgery mining for fake face detection//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Computer Society: 14923-14932 [DOI: 10.1109/CVPR46437.2021.01468]
Wang J K, Wu Z X, Ouyang W H, Han X T, Chen J J, Lim S N and Jiang Y G. 2022b. M2TR: Multi-modal multi-scale transformers for deepfake detection//Proceedings of the 2022 International Conference on Multimedia Retrieval. New York: ACM: 615-623 [DOI: 10.1145/3512527.3531415]
Wang R, Huang Z, Chen Z, Liu L, Chen J and Wang L. 2022. Anti-forgery: Towards a stealthy and robust deepfake disruption attack via adversarial perceptual-aware perturbations[EB/OL]. [2024-11-07]. https://arxiv.org/abs/2206.00477
Wang Z J, Montoya E, Munechika D, Yang H, Hoover B and Chau D H. 2022a. DiffusionDB: A large-scale prompt gallery dataset for text-to-image generative models[EB/OL]. [2024-11-07]. https://arxiv.org/abs/2210.14896
Wang Z, Bao J, Zhou W, Wang W, Hu H, Chen H and Li H. 2023. DIRE for diffusion-generated image detection//Proceedings of the IEEE/CVF International Conference on Computer Vision. Los Alamitos: IEEE Computer Society: 22445-22455 [DOI: 10.1109/ICCV51070.2023.02051]
Wodajo D, Atnafu S and Akhtar Z. 2023. Deepfake video detection using generative convolutional vision transformer[EB/OL]. [2024-11-07]. https://arxiv.org/abs/2307.07036
Xia W, Yang Y, Xue J-H and Wu B. 2021. TediGAN: Text-guided diverse face image generation and manipulation//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Computer Society: 2256-2265 [DOI: 10.1109/CVPR46437.2021.00229]
Xu C, Zhang J, Hua M, He Q, Yi Z and Liu Y. 2022b. Region-aware face swapping//Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Computer Society: 7632-7641 [DOI: 10.1109/CVPR52688.2022.00748]
Xu Y T, Liang J, Sheng L J and Zhang X Y. 2024. Learning spatiotemporal inconsistency via thumbnail layout for face deepfake detection. International Journal of Computer Vision, 132(12): 1-18 [DOI: 10.1007/s11263-024-02054-2]
Xu Y, Deng B, Wang J, Jing Y, Pan J and He S. 2022c. High-resolution face swapping via latent semantics disentanglement//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Computer Society: 7642-7651 [DOI: 10.1109/CVPR52688.2022.00749]
Xu Y, Yin Y, Jiang L, Wu Q, Zheng C, Loy C C, Dai B and Wu W. 2022a. TransEditor: Transformer-based dual-space GAN for highly controllable facial editing//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE Computer Society: 7683-7692 [DOI: 10.1109/CVPR52688.2022.00753]
Yang P N, Huang H B, Wang Z Y, Yu A J and He R. 2022. Confidence-calibrated face image forgery detection with contrastive representation distillation//Proceedings of the Asian Conference on Computer Vision. Cham: Springer: 3-19 [DOI: 10.1007/978-3-031-26316-3_1]
Yang W Y, Zhou X Y, Chen Z K, Guo B F, Ba Z J, Xia Z H, Cao X C and Ren K. 2023. AVoiD-DF: Audio-visual joint learning for detecting deepfake. IEEE Transactions on Information Forensics and Security, 18: 2015-2029 [DOI: 10.1109/tifs.2023.3262148]
Yin Q L, Lu W, Cao X C, Luo X Y, Zhou Y C and Huang J W. 2024. Fine-grained multimodal deepfake classification via heterogeneous graphs. International Journal of Computer Vision, 132(11): 5255-5269 [DOI: 10.1007/s11263-024-02128-1]
Yu Z, Cai R, Li Z, Yang W, Shi J and Kot A C. 2024. Benchmarking joint face spoofing and forgery detection with visual and physiological cues. IEEE Transactions on Dependable and Secure Computing, 21(5): 4327-4342 [DOI: 10.1109/TDSC.2024.3352049]
Zhang C, Zhao Y, Huang Y, Zeng M, Ni S, Budagavi M and Guo X. 2021. FACIAL: Synthesizing dynamic talking face with implicit attribute learning//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE: 3867-3876 [DOI: 10.1109/ICCV48922.2021.00384]
Zhang D C, Lin F Z, Hua Y Y, Wang P J, Zeng D and Ge S M. 2022. Deepfake video detection with spatiotemporal dropout transformer//Proceedings of the 30th ACM International Conference on Multimedia (MM). New York: ACM: 5833-5841 [DOI: 10.1145/3503161.3547913]
Zhang X, Karaman S and Chang S F. 2019. Detecting and simulating artifacts in GAN fake images//Proceedings of the 2019 IEEE International Workshop on Information Forensics and Security (WIFS). New York: IEEE: 1-6 [DOI: 10.1109/wifs47025.2019.9035107]
Zhao H Q, Wei T Y, Zhou W B, Zhang W M, Chen D D and Yu N H. 2021. Multi-attentional deepfake detection//Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Los Alamitos: IEEE: 2185-2194 [DOI: 10.1109/CVPR46437.2021.00222]
Zheng Y L, Bao J M, Chen D, Zeng M and Wen F. 2021. Exploring temporal coherence for more general video face forgery detection//Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision. New York: IEEE: 15024-15034 [DOI: 10.1109/ICCV48922.2021.01477]
Zhou P, Han X T, Morariu V I and Davis L S. 2017. Two-stream neural networks for tampered face detection//Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). New York: IEEE: 1831-1839 [DOI: 10.1109/CVPRW.2017.229]
Zhu X, Tang Y and Geng P. 2021. Detection algorithm of tamper and deepfake image based on feature fusion. Netinfo Security, 21(8): 70-81 [DOI: 10.3969/j.issn.1671-1122.2021.08.009]
Zhuang W Y, Chu Q, Tan Z T, Liu Q K, Yuan H J, Miao C T, Luo Z X and Yu N H. 2022. UIA-ViT: Unsupervised inconsistency-aware method based on vision transformer for face forgery detection//Proceedings of the 17th European Conference on Computer Vision. Cham: Springer: 391-407 [DOI: 10.1007/978-3-031-20065-6_23]
Zhuo W Q, Li D Z, Wang W and Dong J. 2023. Data-free model compression for light-weight DeepFake detection. Journal of Image and Graphics, 28(03): 0820-0835 [DOI: 10.11834/jig.220559]