Dual-branch U-Net with hybrid attention for hyperspectral pansharpening
2024, pp. 1-13
Online publication date: 2024-09-18
DOI: 10.11834/jig.240410
Yang Yong, Wang Xiaozheng, Liu Xuan, et al. Dual-branch U-Net with hybrid attention for hyperspectral pansharpening [J]. Journal of Image and Graphics, 2024: 1-13.
Objective
Hyperspectral (HS) pansharpening aims to fuse a high-spatial-resolution panchromatic (PAN) image with a low-resolution hyperspectral (LRHS) image to generate a high-resolution hyperspectral (HRHS) image. Existing pansharpening algorithms often ignore the modality differences between PAN and HS images, which causes imprecise feature extraction and leads to spectral and spatial distortions in the fusion results. To address this problem, this paper proposes a dual-branch U-Net based on hybrid attention (DUNet-HA) to extract and fuse the multi-scale spatial-spectral features of PAN and HS images.
Method
In the network, a hybrid attention module (HAM) is designed to encode the features at each scale. Within the HAM, channel and spatial self-attention modules enhance the spectral and spatial features, and a double cross attention module (DCAM) guides the reconstruction of the two kinds of features by learning the spatial-spectral dependencies between the cross-modal features of the PAN and HS images. Compared with the classic hybrid Transformer structure, the designed DCAM corrects the two image features by computing cross-attention weights that are independent of the query position, which reduces the computational cost of the model while improving network performance.
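The query-independent weighting can be sketched briefly. The following PyTorch fragment is a minimal illustration assuming a GCNet-style global-context formulation (Cao et al., 2019, which the references include); the class name, reduction ratio, and residual fusion by addition are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of query-independent cross attention: the attention map is
# computed from the guiding modality alone, so no per-query attention matrix
# (no QK^T product) is ever formed. Names here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class QueryIndependentCrossAttention(nn.Module):
    """Corrects `x` (e.g., HS features) with global context pooled from `y` (e.g., PAN features)."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # A 1x1 conv scores every spatial position of the guiding modality.
        self.context_score = nn.Conv2d(channels, 1, kernel_size=1)
        # Channel transform of the pooled context before it modulates `x`.
        self.transform = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.LayerNorm([channels // reduction, 1, 1]),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
        )

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        b, c, h, w = y.shape
        # Softmax over all spatial positions of `y`: one shared attention map,
        # independent of the query position.
        attn = F.softmax(self.context_score(y).view(b, 1, h * w), dim=-1)  # (b, 1, hw)
        context = torch.bmm(y.view(b, c, h * w), attn.transpose(1, 2))     # (b, c, 1)
        context = context.view(b, c, 1, 1)
        # Broadcast the transformed cross-modal context onto every position of `x`.
        return x + self.transform(context)
```

Because the softmax yields one attention map shared by every query position, the quadratic query-key product of standard cross attention is avoided; computing and applying the pooled context is linear in the number of spatial positions.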
Result
The proposed method is compared with 11 state-of-the-art methods on three widely used HS image datasets. On the Pavia Center dataset, compared with the second-best method, HyperRefiner, the peak signal-to-noise ratio (PSNR) is improved by 1.10 dB and the spectral angle mapper (SAM) is reduced by 0.40; on the Botswana dataset, PSNR is improved by 1.29 dB and SAM is reduced by 0.14; on the Chikusei dataset, PSNR is improved by 0.39 dB and SAM is reduced by 0.12.
Conclusion
The results show that the proposed DUNet-HA fuses spectral-spatial information more effectively and significantly improves the quality of hyperspectral pansharpening results.
Objective
Hyperspectral (HS) images are obtained by sampling hundreds of contiguous narrow spectral bands with spectral imaging systems and therefore provide rich spectral information. However, because narrow spectral bands collect little energy, HS images typically have low spatial resolution. In contrast, single-band panchromatic (PAN) images from PAN imaging systems provide rich spatial information but low spectral resolution. In remote sensing applications that require high-resolution hyperspectral (HRHS) images with both high spectral and high spatial resolution, neither PAN nor HS images alone can meet the requirements. HS pansharpening therefore aims to fuse the spatial information of PAN images with the spectral information of HS images to obtain HRHS images. This technology has received considerable attention in the remote sensing field and is of great significance in various remote sensing tasks such as military surveillance, environmental monitoring, object identification, and classification. HS pansharpening methods fall mainly into two categories: traditional methods and deep learning (DL)-based methods. Traditional methods can be further divided into four classes: component substitution-based, multi-resolution analysis-based, Bayesian-based, and model-based methods. Although traditional methods are easy to implement and physically interpretable, they often suffer from spatial and spectral distortion due to inappropriate prior assumptions and imprecise handcrafted feature extraction. Owing to their powerful feature learning capability, DL-based methods have been widely applied to HS pansharpening tasks. Although these methods perform better than traditional methods, spectral and spatial distortions still exist in the fused images because they neglect the need to handle spectral and spatial features differently and struggle with the complex mapping relationships between multi-channel images. In recent years, since the introduction of the Transformer architecture, which can learn global correlations in images, some researchers have attempted to improve HS pansharpening performance by modeling the relationships between the two modal features with this architecture. However, the application of Transformer structures has been limited by their high computational cost and low parameter efficiency. To fuse PAN and HS images effectively, this paper proposes a dual-branch U-Net based on hybrid attention (DUNet-HA) for HS pansharpening to achieve multi-scale feature fusion. At each scale, spatial attention branches, spectral attention branches, and dual-cross attention module branches are constructed. These branches enhance the spatial and spectral features of PAN and HS images, respectively, and exchange complementary cross-modal features. The dual-cross attention module is designed to avoid the expensive query-matrix computation of Transformers.
Method
The proposed DUNet-HA includes two U-Net branches, one for the PAN image and the other for the upsampled HS image, to extract and complement texture and spectral features. At each scale, a hybrid attention module (HAM) is constructed to encode features from both types of images. The HAM comprises a spatial attention module (Spa-A), a spectral attention module (Spe-A), and a dual-cross attention module (DCAM). Spa-A and Spe-A enhance the texture features of the PAN image and the spectral features of the HS image, respectively, while DCAM corrects and complements these features. The enhanced and corrected features are integrated to obtain the encoded features at each scale. The decoder mainly performs feature fusion and reconstruction. In addition, DCAM is used to capture global contextual information, and the encoded features, decoded features, and corrected complementary features at the same scale are integrated directly to better handle high-level spatial and spectral features. The DCAM proposed in this paper is a novel cross-attention structure that replaces the attention computation of the Transformer architecture with a query-independent matrix computation, reducing the computational cost. DCAM maps the cross-feature space of the PAN and HS images to guide feature interaction for correction and supplementation.
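As a rough composition sketch of one encoder scale, the fragment below wires together a squeeze-and-excitation-style channel gate (standing in for Spe-A), a CBAM-style spatial gate (standing in for Spa-A; CBAM appears in the references), and the query-independent cross attention from the earlier sketch. The module names, gating choices, and the 1x1 fusion convolution are all hypothetical stand-ins, not the authors' code.

```python
# Sketch of one encoder scale of the HAM, reusing QueryIndependentCrossAttention
# from the previous sketch. All names are illustrative assumptions.
import torch
import torch.nn as nn

class SpectralAttention(nn.Module):
    """Spe-A stand-in: squeeze-and-excitation style channel gate."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.gate(x)

class SpatialAttention(nn.Module):
    """Spa-A stand-in: CBAM-style spatial gate from pooled channel statistics."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(pooled))

class HybridAttentionModule(nn.Module):
    """Enhance each modality, cross-correct with DCAM, then fuse at this scale."""
    def __init__(self, channels: int):
        super().__init__()
        self.spa = SpatialAttention()                     # texture branch (PAN)
        self.spe = SpectralAttention(channels)            # spectral branch (HS)
        self.cross_pan = QueryIndependentCrossAttention(channels)
        self.cross_hs = QueryIndependentCrossAttention(channels)
        self.fuse = nn.Conv2d(2 * channels, channels, 1)  # hypothetical 1x1 fusion

    def forward(self, pan_feat, hs_feat):
        pan_enh = self.spa(pan_feat)               # enhanced spatial features
        hs_enh = self.spe(hs_feat)                 # enhanced spectral features
        pan_cor = self.cross_pan(pan_enh, hs_enh)  # PAN corrected by HS context
        hs_cor = self.cross_hs(hs_enh, pan_enh)    # HS corrected by PAN context
        encoded = self.fuse(torch.cat([pan_cor, hs_cor], dim=1))
        return pan_cor, hs_cor, encoded
```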
Result
To validate the effectiveness of the proposed DUNet-HA, extensive experiments were conducted on three widely used HS datasets: Pavia Center, Botswana, and Chikusei. We compared DUNet-HA with several state-of-the-art (SOTA) methods, including five traditional methods (CNMF, CFPCA, SFIM, GSA, and MTF_GLP_HPM) and six DL-based methods (SSFCNN, HyperPNN, DHP-DARN, DIP-Hyperkite, Hyper-DSNet, and HyperRefiner). To evaluate the performance of all methods, we used five objective indicators: spectral cross correlation (SCC), spectral angle mapper (SAM), root mean square error (RMSE), erreur relative globale adimensionnelle de synthèse (ERGAS), and peak signal-to-noise ratio (PSNR). Experimental results with a scale factor of 4 demonstrate that the proposed method outperforms the other SOTA methods in both objective results and visual quality. Specifically, on the Pavia Center dataset, the PSNR, SAM, and ERGAS of the proposed method are improved by 1.10 dB, 0.40, and 0.28, respectively, compared with the second-best method, HyperRefiner. The objective results on the other two datasets also surpass those of HyperRefiner. The visual results indicate that the proposed method is superior in recovering fine-grained spatial textures and spectral details. Ablation studies further demonstrate that the DCAM structure significantly improves the fusion process.
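For reference, the fragment below gives the standard definitions of three of the five indicators (PSNR, SAM, and ERGAS); the band-wise averaging convention and the `ratio` argument (1/4 for the scale factor of 4 used here) are assumptions, as the paper's evaluation code is not reproduced.

```python
# Standard pansharpening quality metrics under common conventions; images are
# (H, W, B) NumPy arrays with B spectral bands. A minimal sketch, not the
# paper's exact implementation.
import numpy as np

def psnr(ref: np.ndarray, fused: np.ndarray, peak: float = 1.0) -> float:
    """Mean PSNR over bands, with intensities scaled to [0, peak]."""
    mse = np.mean((ref - fused) ** 2, axis=(0, 1))          # per-band MSE
    return float(np.mean(10.0 * np.log10(peak ** 2 / mse)))

def sam(ref: np.ndarray, fused: np.ndarray) -> float:
    """Mean spectral angle (degrees) between per-pixel spectra."""
    dot = np.sum(ref * fused, axis=2)
    norms = np.linalg.norm(ref, axis=2) * np.linalg.norm(fused, axis=2)
    angles = np.arccos(np.clip(dot / (norms + 1e-12), -1.0, 1.0))
    return float(np.degrees(angles.mean()))

def ergas(ref: np.ndarray, fused: np.ndarray, ratio: float = 1 / 4) -> float:
    """Relative dimensionless global error; `ratio` = LRHS/PAN resolution ratio."""
    rmse_b = np.sqrt(np.mean((ref - fused) ** 2, axis=(0, 1)))  # per-band RMSE
    mean_b = np.mean(ref, axis=(0, 1))                          # per-band mean
    return float(100.0 * ratio * np.sqrt(np.mean((rmse_b / mean_b) ** 2)))
```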
Conclusion
This paper proposes a dual-branch interactive U-Net named DUNet-HA for HS pansharpening. The network extracts and reconstructs spatial and spectral information from PAN and HS images through a parallel dual U-Net structure to achieve more accurate fusion results. At each scale of the encoder, a HAM is constructed to enhance the spatial features of the PAN image and the spectral features of the HS image using spatial attention and spectral attention, respectively. In addition, the DCAM is used to complement these features, which reduces the modality differences between the PAN and HS image features and enables their mutual supplementation to guide feature interaction. Compared with the classic hybrid Transformer attention structure, DCAM improves network performance while reducing the number of parameters and the computational cost. Extensive experimental results on three widely used HS datasets demonstrate that the proposed DUNet-HA outperforms several SOTA methods in both quantitative and qualitative evaluations.
Keywords: HS pansharpening; modality differences; hybrid attention module; double cross attention module; Transformer; spatial-spectral dependency relationship
Aburaed N, Alkhatib M Q, Marshall S, Zabalza J and Al Ahmad H. 2023. Review of spatial enhancement of hyperspectral remote sensing imaging techniques. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 16: 2275-2300 [DOI: 10.1109/JSTARS.2023.3242048]
Aiazzi B, Baronti S and Selva M. 2007. Improving component substitution pansharpening through multivariate regression of MS+Pan data. IEEE Transactions on Geoscience and Remote Sensing, 45(10): 3230-3239 [DOI: 10.1109/TGRS.2007.901007]
Aiazzi B, Alparone L, Baronti S, Garzelli A and Selva M. 2006. MTF-tailored multiscale fusion of high-resolution MS and Pan imagery. Photogrammetric Engineering & Remote Sensing, 72(5): 591-596 [DOI: 10.14358/PERS.72.5.591]
Zhang B, Wu Y F, Zhao B Y, Chanussot J, Hong D F, Yao J and Gao L R. 2022. Progress and challenges in intelligent remote sensing satellite systems. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 15: 1814-1822 [DOI: 10.1109/JSTARS.2022.3148139]
Bandara W G C and Patel V M. 2022. HyperTransformer: a textural and spectral feature fusion transformer for pansharpening//Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE: 1767-1777 [DOI: 10.1109/CVPR52688.2022.00181]
Bandara W G C, Valanarasu J M J and Patel V M. 2022. Hyperspectral pansharpening based on improved deep image prior and residual reconstruction. IEEE Transactions on Geoscience and Remote Sensing, 60: 1-16 [DOI: 10.1109/TGRS.2021.3139292]
Cao Y, Xu J, Lin S, Wei F and Hu H. 2019. GCNet: non-local networks meet squeeze-excitation networks and beyond//Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops. Seoul, Korea (South): IEEE: 1971-1980 [DOI: 10.1109/ICCVW.2019.00246]
Ding C, Chen J Y, Zheng M M, Zhang L, Wei W and Zhang Y N. 2024. Survey of hyperspectral image change detection method. Journal of Image and Graphics, 29(06): 1714-1729 [DOI: 10.11834/jig.240031]
Han X H, Shi B X and Zheng Y Q. 2018. SSF-CNN: spatial and spectral fusion with CNN for hyperspectral image super-resolution//Proceedings of the 2018 25th IEEE International Conference on Image Processing. Athens, Greece: IEEE: 2506-2510 [DOI: 10.1109/ICIP.2018.8451142]
He L, Zhu J W, Li J, Plaza A, Chanussot J and Li B. 2019. HyperPNN: hyperspectral pansharpening via spectrally predictive convolutional neural networks. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 12(8): 3092-3100 [DOI: 10.1109/JSTARS.2019.2917584]
Liu J G. 2000. Smoothing filter-based intensity modulation: a spectral preserve image fusion technique for improving spatial details. International Journal of Remote Sensing, 21(18): 3461-3472 [DOI: 10.1080/014311600750037499]
Lin B H and Guo Y L. 2023. Deep hyperspectral and multispectral image fusion via probabilistic matrix factorization. IEEE Transactions on Geoscience and Remote Sensing, 61: 1-14 [DOI: 10.1109/TGRS.2023.3244992]
Li J J, Cui R X, Li B, Song R, Li Y S, Dai Y C and Du Q. 2020. Hyperspectral image super-resolution by band attention through adversarial learning. IEEE Transactions on Geoscience and Remote Sensing, 58(6): 4304-4318 [DOI: 10.1109/TGRS.2019.2962713]
Nie J T, Zhang L, Wei W, Yan Q S, Ding C, Chen G C and Zhang Y N. 2023. A survey of hyperspectral image super-resolution method. Journal of Image and Graphics, 28(06): 1685-1697 [DOI: 10.11834/jig.230038]
Peng S R, Guo C H, Wu X and Deng L J. 2023. U2Net: a general framework with spatial-spectral-integrated double U-Net for image fusion//Proceedings of the 31st ACM International Conference on Multimedia. New York, USA: ACM: 3219-3227 [DOI: 10.1145/3581783.3612084]
Plaza A, Benediktsson J A, Boardman J W, Brazile J, Bruzzone L, Camps-Valls G, Chanussot J, Fauvel M, Gamba P, Gualtieri A, Marconcini M, Tilton J C and Trianni G. 2009. Recent advances in techniques for hyperspectral image processing. Remote Sensing of Environment, 113(S1): S110-S122 [DOI: 10.1016/j.rse.2007.07.028]
Shaban A, Albright K C, Boehme A K and Martin-Schild S. 2013. Circle of Willis variants: fetal PCA. Stroke Research and Treatment, 2013: 105937 [DOI: 10.1155/2013/105937]
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, Kaiser L and Polosukhin I. 2017. Attention is all you need//Proceedings of the 31st International Conference on Neural Information Processing Systems. Red Hook, USA: Curran Associates: 5998-6008 [DOI: 10.48550/arXiv.1706.03762]
Ungar S G, Pearlman J S, Mendenhall J A and Reuter D. 2003. Overview of the Earth Observing One (EO-1) mission. IEEE Transactions on Geoscience and Remote Sensing, 41(6): 1149-1159 [DOI: 10.1109/TGRS.2003.815999]
Wei Q, Bioucas-Dias J, Dobigeon N and Tourneret J Y. 2015. Hyperspectral and multispectral image fusion based on a sparse representation. IEEE Transactions on Geoscience and Remote Sensing, 53(7): 3658-3668 [DOI: 10.1109/TGRS.2014.2381272]
Wang X H, Wang X Y, Song R X, Zhao X Y and Zhao K Y. 2023. MCT-Net: multi-hierarchical cross transformer for hyperspectral and multispectral image fusion. Knowledge-Based Systems, 264: 110362 [DOI: 10.1016/j.knosys.2023.110362]
Woo S, Park J, Lee J Y and Kweon I S. 2018. CBAM: convolutional block attention module//Proceedings of the 2018 European Conference on Computer Vision. Munich, Germany: Springer: 3-19 [DOI: 10.1007/978-3-030-01234-2_1]
Wald L. 2000. Quality of high resolution synthesised images: is there a simple criterion?//Proceedings of the 3rd Conference Fusion of Earth Data: Merging Point Measurements, Raster Maps and Remotely Sensed Images. Sophia Antipolis, France: SEE/URISCA: 99-103 [DOI: 10.1109/IGARSS.2000.859923]
Yokoya N, Yairi T and Iwasaki A. 2012. Coupled nonnegative matrix factorization unmixing for hyperspectral and multispectral data fusion. IEEE Transactions on Geoscience and Remote Sensing, 50(2): 528-537 [DOI: 10.1109/TGRS.2011.2161320]
Yokoya N and Iwasaki A. 2016. Airborne hyperspectral data over Chikusei [EB/OL]. [2016-05-27]. https://www.researchgate.net/profile/Naoto-Yokoya/publication/304013716_Airborne_hyperspectral_data_over_Chikusei/links/5762f36808ae570d6e15c026/Airborne-hyperspectral-data-over-Chikusei.pdf
Zhang L, Nie J T, Wei W, Zhang Y N, Liao S C and Shao L. 2020. Unsupervised adaptation learning for hyperspectral imagery super-resolution//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE: 3073-3082 [DOI: 10.1109/CVPR42600.2020.00314]
Zhou B, Zhang X F, Chen X, Ren M and Feng Z Y. 2023. HyperRefiner: a refined hyperspectral pansharpening network based on the autoencoder and self-attention. International Journal of Digital Earth, 16(1): 3268-3294 [DOI: 10.1080/17538947.2023.2246944]
Zhuo Y W, Zhang T J, Hu J F, Dou H X, Huang T Z and Deng L J. 2022. A deep-shallow fusion network with multidetail extractor and spectral attention for hyperspectral pansharpening. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 15: 7539-7555 [DOI: 10.1109/JSTARS.2022.3202866]
Zheng Y X, Li J J, Li Y S, Guo J, Wu X Y and Chanussot J. 2020. Hyperspectral pansharpening using deep prior and dual attention residual network. IEEE Transactions on Geoscience and Remote Sensing, 58(11): 8059-8076 [DOI: 10.1109/TGRS.2020.2986313]
Ye Z, Bai L and He M Y. 2021. Review of spatial-spectral feature extraction for hyperspectral image. Journal of Image and Graphics, 26(8): 1737-1763 [DOI: 10.11834/jig.210198]
You X E, Su Y C, Jiang M Y, Li P F, Liu D S and Bai J Y. 2024. Deep embedded Transformer network with spatial-spectral information for unmixing of hyperspectral remote sensing images. Journal of Image and Graphics, 29(08): 2220-2235 [DOI: 10.11834/jig.230393]