Overview of digital media forensics technology
2021, Vol. 26, No. 6, pages 1216-1226
Print publication date: 2021-06-16
Accepted: 2021-03-26
DOI: 10.11834/jig.210081
Xiaolong Li, Nenghai Yu, Xinpeng Zhang, Weiming Zhang, Bin Li, Wei Lu, Wei Wang, Xiaolong Liu. Overview of digital media forensics technology[J]. Journal of Image and Graphics, 2021, 26(6): 1216-1226.
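Millions of multimedia items spread across the Internet every day. Which of them are authentic and trustworthy, and what manipulations lie behind the fake ones? Digital forensics technology provides the answer. Rather than embedding a watermark in advance, it analyzes the content of the multimedia data directly to judge authenticity. The inherent characteristics of original multimedia data are consistent and unique, so they can serve as the data's own "intrinsic fingerprint"; any tampering or forgery damages the integrity of these characteristics to some extent, which makes them usable for identifying tampered files. As the volume of tampered media grows day by day, social stability and even national security are seriously threatened. In particular, with the rapid development of deep learning, the perceptual gap between fake and real media keeps narrowing, which poses a major challenge to media forensics research and has made multimedia forensics an important research direction in information security. Technologies and tools that can detect fake multimedia content and prevent the spread of dangerous false information are therefore urgently needed. This paper summarizes the notable detection and forensics algorithms proposed in the multimedia forensics field. In addition to reviewing traditional media forensics methods, it introduces deep-learning-based methods. The paper covers the three mainstream targets of multimedia tampering today, namely images, video, and speech; for each media type it describes traditional tampering methods and AI (artificial intelligence)-generated tampering methods, presents the publicly available large-scale datasets and related applications, and discusses possible future directions for multimedia forensics.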
The Internet and social networks have become the main platforms on which people access and share digital media. Among these media, images, videos, and audio carry the most information and attract the most attention. With the rapid development of computer technology, image and video editing tools such as Photoshop, Adobe Premiere Pro, and VideoStudio have appeared one after another and make it faster and easier to modify media. Forged images look realistic, and edited or synthesized video appears natural and smooth. Image generation technology has also advanced greatly in recent years, and generated images can be realistic enough to pass for real ones. The problem of multimedia forgery has therefore attracted wide attention. Forgery may serve entertainment (such as beautifying images), malicious modification of image and video content (such as deliberately altering photos of political figures or exaggerating the severity of news events), or malicious copying. Image forgery incidents in recent years are a reminder to pay attention to the security of media content, whose authenticity is increasingly being questioned.
At present, millions of multimedia items are transmitted over the Internet every day. Which content is authentic, and what tampering lies behind the fake content? Digital forensics technology provides the answer. It does not embed a watermark in advance but analyzes the content of the multimedia data directly to determine authenticity. The basic principle is that the inherent characteristics of original multimedia data are consistent and unique and can serve as the data's own "intrinsic fingerprint"; any tampering or forgery damages the integrity of these characteristics to some extent. In recent years, media tampering has been increasing and has seriously threatened social stability and even national security. In particular, with the rapid development of deep learning, the perceptual gap between fake and real media keeps shrinking. This poses a serious challenge to media forensics research and has made multimedia forensics an important research direction in information security. Technologies and tools that can detect fake multimedia content and prevent the spread of dangerous false information are therefore urgently required.
This article summarizes the notable detection and forensics algorithms proposed in the multimedia forensics field. In addition to reviewing traditional media forensics methods, it introduces methods based on deep learning. It covers the current mainstream targets of multimedia tampering, namely images, videos, and audio, and for each media type it considers both traditional tampering methods and artificial intelligence (AI)-based tampering methods.
Video tampering is mainly divided into intraframe tampering and interframe tampering. Intraframe tampering operates on individual video frames, deleting objects from the scene or performing copy-move operations, whereas interframe tampering operates on the video sequence, adding or deleting frames. Traditional methods for detecting fake videos can be divided into encoding trace detection, content inconsistency detection, frame duplication detection, and copy-paste detection. AI-based fake video detection focuses on the artifacts left by the generative network, whose imaging process differs from that of a real camera.
Digital image forensics aims to verify the integrity and authenticity of digital images. Image forensic methods can be divided into active and passive methods. Active image forensics relies on watermarks or signatures embedded in digital images in advance. Passive (blind) forensics has no such requirement; it distinguishes images by detecting the traces that tampering leaves in them. Common forms of image forgery and tampering include enhancement, modification, region duplication, splicing, and synthesis. Detection of images with partially replaced content is divided into the following: 1) Region copy (copy-move) tampering detection, where part of an image is copied and pasted to another area of the same image; during copying, the copied region may undergo various geometric transformations and postprocessing. 2) Image processing fingerprint detection. The visual differences caused by simple region copying, splicing, and tampering are still evident, so the forger applies postprocessing such as scaling, rotation, and blurring to eliminate these traces, and this postprocessing in turn leaves fingerprints that can be detected. 3) Recompression fingerprint detection. Tampered images typically undergo recompression; thus, recompression detection can provide strong supporting evidence for digital image forensics. For source tracing of forged images, most images are captured by cameras. The general physical structure of a camera and the physical differences between individual cameras leave traces on the captured images. These traces (camera fingerprints) appear as a series of features in the image, and the acquisition device can be identified by examining the device fingerprint embedded in the image. Detection of images generated entirely by AI likewise focuses on the artifacts left by the generative network.
Over the past decades, digital audio forensics studies have focused on detecting various forms of audio tampering. These methods check the metadata of audio files. In addition, the publicly available large-scale datasets and related applications are introduced, and possible future development directions of the multimedia forensics field are discussed.
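The region copy (copy-move) detection described in the abstract can be illustrated with a minimal sketch. The Python code below is not the method of any surveyed paper; it assumes a grayscale image stored as a list of rows and finds only exact duplicates of fixed-size blocks, so it would miss copies that were rotated, scaled, blurred, or recompressed. The function name `find_copy_move` and the parameters `block` and `min_shift` are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def find_copy_move(image, block=4, min_shift=4):
    """Return pairs of top-left (row, col) coordinates of identical blocks.

    Assumption: `image` is a 2D list of grayscale integers. Only exact
    pixel-for-pixel matches are found; this is a toy illustration, not a
    robust forensic detector.
    """
    height, width = len(image), len(image[0])

    # Group every block position by the exact pixel content of the block.
    positions = defaultdict(list)
    for y in range(height - block + 1):
        for x in range(width - block + 1):
            key = tuple(
                tuple(image[y + dy][x:x + block]) for dy in range(block)
            )
            positions[key].append((y, x))

    pairs = []
    for key, coords in positions.items():
        # Skip flat blocks (all pixels equal), which match trivially
        # in uniform regions such as sky or walls.
        if len({value for row in key for value in row}) == 1:
            continue
        # Report every pair of matching blocks that are far enough apart.
        for i in range(len(coords)):
            for j in range(i + 1, len(coords)):
                (y1, x1), (y2, x2) = coords[i], coords[j]
                if max(abs(y1 - y2), abs(x1 - x2)) >= min_shift:
                    pairs.append((coords[i], coords[j]))
    return pairs
```

Each reported pair marks a candidate source and destination of a copied region. Practical detectors replace the exact block key with features that tolerate postprocessing, such as quantized DCT coefficients or keypoint descriptors, and then check that many neighboring blocks share the same displacement.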
Keywords: multimedia forensics; multimedia traceability; forgery detection; forgery localization; fake face