Large-scale image dataset for perceptual hashing

Zhou Yuanding; Fang Yaodong; Qin Chuan

doi:10.11834/jig.230397

Column: Digital Media Deepfakes and Countermeasures


Journal of Image and Graphics, 2024, Vol. 29, No. 2, pp. 343-354
Received: 2023-06-20; Revised: 2023-08-08; Published in print: 2024-02-16

Zhou Yuanding, Fang Yaodong, Qin Chuan. 2024. Large-scale image dataset for perceptual hashing. Journal of Image and Graphics, 29(02): 0343-0354. DOI: 10.11834/jig.230397.

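Abstract

Objective

Perceptual image hashing, also known as image digest or image fingerprinting, is an effective image authentication technique that has attracted wide attention in recent years. It converts the perceptually robust features of an image into a fixed-length hash sequence in order to authenticate image copyright. However, the field has long lacked a reasonably general-purpose dataset. The content-preserving manipulations used in existing datasets differ substantially from real-world scenarios, so the performance of neural networks trained on them degrades markedly when they face complex image editing operations.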
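As a rough illustration of the idea of a fixed-length hash compared by distance, the following minimal sketch uses a classical block-mean hash and a Hamming-distance check. It is a stand-in for illustration only and is not the CNN-based hashing evaluated in the paper; the function names, the 8 x 8 grid, and the `max_distance` threshold are all assumptions.

```python
from typing import List, Sequence


def block_mean_hash(gray: Sequence[Sequence[float]], grid: int = 8) -> List[int]:
    """Return a grid*grid-bit hash of a grayscale image given as rows of pixels.

    Each bit is 1 when the mean of its block exceeds the mean of all block means.
    """
    height, width = len(gray), len(gray[0])
    if height < grid or width < grid:
        raise ValueError("image must be at least grid x grid pixels")
    means = []
    for i in range(grid):
        r0, r1 = i * height // grid, (i + 1) * height // grid
        for j in range(grid):
            c0, c1 = j * width // grid, (j + 1) * width // grid
            block = [gray[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            means.append(sum(block) / len(block))
    threshold = sum(means) / len(means)
    return [1 if m > threshold else 0 for m in means]


def hamming_distance(hash_a: Sequence[int], hash_b: Sequence[int]) -> int:
    """Count the positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


def is_same_content(hash_a: Sequence[int], hash_b: Sequence[int],
                    max_distance: int = 10) -> bool:
    """Accept the pair as perceptually identical if the distance is small.

    max_distance is a hypothetical threshold chosen for this example.
    """
    return hamming_distance(hash_a, hash_b) <= max_distance


# Toy 16 x 16 gradient image, a brightened copy, and an inverted copy.
original = [[r * 16 + c for c in range(16)] for r in range(16)]
brightened = [[v + 20 for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]
```

A uniform brightness shift leaves this toy hash unchanged, while inverting the image flips its bits, which mirrors the robustness and discrimination requirements in miniature.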

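Method

For the perceptual image hashing task, a new dataset is constructed for practical image content authentication scenarios. First, the content-preserving manipulations commonly encountered in practice are summarized and categorized, and 48 single and combined content-preserving manipulations are designed to generate perceptually similar images. Then, following the definition of perceptual image hashing, images that are semantically similar to the image to be authenticated but differ in perceptual content are selected as perceptually dissimilar images, which makes the dataset harder to discriminate. Finally, a perceptual hashing image dataset containing 116 400 images is built.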

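Result

Because the content-preserving manipulations used in the proposed dataset are more complex and the dissimilar images are harder to distinguish, deep neural networks trained on this dataset generalize well: even without retraining or fine-tuning, they achieve good authentication performance on other datasets. In addition, networks trained on this dataset show only small performance differences across datasets, which reflects the stability of the proposed dataset.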

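Conclusion

An image dataset for perceptual hashing is designed. Extensive comparative experiments demonstrate its effectiveness, and this work can help advance the field of perceptual image hashing. Download link: https://pan.baidu.com/s/1uVnUVr5HqaSpoNifGElucw?pwd=8xwr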

Extended abstract

Objective

With the rapid development of social media, multimedia content on the internet is growing at an exponential rate. Digital images have become easy to obtain and transmit, which considerably increases the risk of malicious tampering and forgery. Accordingly, image authentication and content protection are receiving increasing attention. Many image authentication schemes have emerged, such as watermarking, digital signatures, and perceptual image hashing (PIH). PIH, also known as image digest or image fingerprinting, is an effective image authentication technique that has attracted widespread research attention in recent years. The goal of PIH is to authenticate an image by compressing its perceptually robust features into a compact, fixed-length hash sequence. However, the field lacks a general-purpose dataset, and the datasets constructed in existing work have two main problems. On the one hand, they use few types of content-preserving manipulations, and the attack intensity is relatively weak. On the other hand, their distinct images differ greatly from the images to be authenticated, so the two are easy to tell apart. Convolutional neural networks (CNNs) trained on these datasets generalize poorly and can hardly cope with the complex and diverse image editing operations encountered in practice. This has limited the development of the PIH field.

Method

To address these problems, we propose a dedicated dataset built from a wide variety of manipulations and intended for complex image authentication scenarios. The dataset is divided into three subsets: original, perceptually identical, and perceptually distinct images. The latter two correspond to the robustness and discrimination of PIH, respectively. Original images are selected from ImageNet1K, and each of them corresponds to one category. For identical images, we summarize the content-preserving manipulations commonly used in the PIH field and group them into four major categories: geometric, enhancement, filter, and editing manipulations. Each major category is subdivided into different types, for a total of 35 single content-preserving manipulations. To ensure diversity and reflect the randomness of real-world image editing, we set an intensity range for each type of manipulation and sample the attack intensity randomly within that range. In addition, we randomly combine multiple single manipulations to form combined manipulations. Because of this randomness, some combined manipulations in the test set never appear in the training set. This is consistent with practical application scenarios, where many unseen combinations of editing operations occur. For perceptually distinct images, one portion is unrelated to the original images, and the rest are selected from the same category as each original image, which increases the difficulty of the dataset and improves the generalizability of the trained CNNs. Compared with previously adopted datasets, our dataset conforms more closely to the actual application scenario of the PIH task. It contains 1 200 original images, and each original image is subjected to 48 content-preserving manipulations to generate 48 perceptually identical images.
To balance the number of perceptually identical and distinct images, we also select 48 perceptually distinct images for each original image: 24 are selected at random, and the other 24 are semantically similar to the original image. Each group therefore contains 1 original image, 48 perceptually identical images, and 48 perceptually distinct images, for a total of 97 images. With 1 200 original images, the dataset contains 116 400 images in total. This large amount of data supports the effective training of CNNs.
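The construction described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the four manipulation names, their intensity ranges, and all function names are hypothetical and are not taken from the paper; only the group composition (1 + 48 + 48), the 24/24 split of distinct images, and the 1 200 originals come from the text.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical manipulations and intensity ranges, for illustration only.
INTENSITY_RANGES: Dict[str, Tuple[float, float]] = {
    "rotation_degrees": (-10.0, 10.0),
    "gaussian_blur_sigma": (0.5, 3.0),
    "jpeg_quality": (30.0, 90.0),
    "brightness_factor": (0.7, 1.3),
}

Manipulation = Tuple[str, float]


def sample_single(rng: random.Random) -> List[Manipulation]:
    """Pick one manipulation and draw its intensity uniformly from its range."""
    name = rng.choice(sorted(INTENSITY_RANGES))
    low, high = INTENSITY_RANGES[name]
    return [(name, rng.uniform(low, high))]


def sample_combined(rng: random.Random, max_ops: int = 3) -> List[Manipulation]:
    """Randomly combine 2..max_ops different manipulations, each with a random intensity."""
    count = rng.randint(2, max_ops)
    names = rng.sample(sorted(INTENSITY_RANGES), count)
    return [(n, rng.uniform(*INTENSITY_RANGES[n])) for n in names]


def pick_distinct(same_category: List[str], unrelated: List[str],
                  rng: random.Random, per_source: int = 24) -> List[str]:
    """Select distinct images: half from the original's category, half unrelated."""
    return rng.sample(same_category, per_source) + rng.sample(unrelated, per_source)


def group_size(identical: int = 48, distinct: int = 48) -> int:
    """One original image plus its identical and distinct images."""
    return 1 + identical + distinct


def dataset_size(originals: int = 1200, identical: int = 48, distinct: int = 48) -> int:
    """Total number of images over all groups."""
    return originals * group_size(identical, distinct)
```

With the default arguments, `group_size()` gives 97 and `dataset_size()` gives 1 200 x 97 = 116 400, matching the totals stated in the abstract.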

Result

To validate the performance of the proposed dataset (PIHD), four CNNs were trained on five datasets, including PIHD, and tested on these datasets. The receiver operating characteristic (ROC) curves of the models are compared to judge their performance. Because the content-preserving manipulations used in PIHD are more complex and its distinct images are more difficult to distinguish, the CNNs trained on it provide better image authentication performance. Even without retraining or fine-tuning, they still obtain satisfactory authentication performance on other datasets, which demonstrates the generalizability of PIHD. In addition, we compare the area under the curve (AUC) of each model on different test sets. The performance of networks trained on the other datasets varies considerably across test sets, whereas the performance of networks trained on PIHD remains nearly constant, which reflects the stability of PIHD. Overall, the networks trained on our dataset are stable and generalize well, enabling them to cope with complex and diverse real-world editing operations.
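For readers unfamiliar with this kind of evaluation, the sketch below shows one common way to obtain an ROC curve and its AUC from hash distances. It assumes that each image pair is scored by a distance between hashes and that a pair is accepted as identical when the distance is at or below a threshold; the function names and this scoring convention are assumptions, not details taken from the paper.

```python
from typing import List, Sequence, Tuple


def roc_points(identical_distances: Sequence[float],
               distinct_distances: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) points.

    A pair is accepted as identical when its distance is <= the threshold, so
    accepted identical pairs are true positives and accepted distinct pairs
    are false positives.
    """
    if not identical_distances or not distinct_distances:
        raise ValueError("both distance lists must be non-empty")
    thresholds = sorted(set(identical_distances) | set(distinct_distances))
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(d <= t for d in identical_distances) / len(identical_distances)
        fpr = sum(d <= t for d in distinct_distances) / len(distinct_distances)
        points.append((fpr, tpr))
    return points


def auc_trapezoid(points: Sequence[Tuple[float, float]]) -> float:
    """Integrate the ROC curve with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy distances: identical pairs are mostly, but not always, closer than distinct pairs.
example_auc = auc_trapezoid(roc_points([1, 5], [3, 7]))
```

An AUC close to 1 means that identical pairs almost always have smaller distances than distinct pairs. Comparing the AUC of one trained model across several test sets is one way to quantify the kind of stability described above.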

Conclusion

In this study, we design a dataset for the PIH task that uses richer content-preserving manipulations and incorporates randomness to reproduce real application scenarios as closely as possible. In addition, images with the same semantic meaning as the original images are added to the distinct images, which increases the difficulty of the dataset in a way that is consistent with the PIH task. This enables the trained CNNs to cope with more realistic and complex application scenarios. We evaluate different models on various datasets, including the proposed one. Extensive experiments demonstrate the effectiveness, generalizability, and stability of this dataset, which can help promote the development of the PIH field.


