深度嵌套式Transformer网络的高光谱图像空谱解混方法
Deep embedded Transformer network with spatial-spectral information for unmixing of hyperspectral remote sensing images
2024, Vol. 29, No. 8, Pages: 2220-2235
Print publication date: 2024-08-16
DOI: 10.11834/jig.230393
游雪儿, 苏远超, 蒋梦莹, 李朋飞, 刘东升, 白晋颖. 2024. 深度嵌套式Transformer网络的高光谱图像空谱解混方法. 中国图象图形学报, 29(08):2220-2235
You Xueer, Su Yuanchao, Jiang Mengying, Li Pengfei, Liu Dongsheng, Bai Jinying. 2024. Deep embedded Transformer network with spatial-spectral information for unmixing of hyperspectral remote sensing images. Journal of Image and Graphics, 29(08):2220-2235
Objective
Deep learning-based unmixing methods outperform traditional methods in information mining and generalization, but they focus mainly on spectral information, and their use of spatial information is still limited to surface-level processing such as filtering and convolution. As a consequence, unmixing networks have to stack multiple layers, which easily loses part of the image information and degrades unmixing accuracy. Transformer networks have been widely applied to hyperspectral image processing because of their powerful feature representation capability, but applying them directly to unmixing tends to lose local image details. This paper proposes an improved unmixing method built on the Transformer network.
Method
Based on the Transformer-in-Transformer (TNT) architecture, this paper proposes a deep embedded Transformer network (DETN) for unmixing. An inner-outer embedding strategy allows local and global spatial information to be shared within the encoder, which not only preserves the spatial details of the hyperspectral image but also requires only a few convolution operations in the encoder, greatly improving learning efficiency. In the decoder, a single convolution operation restores the data structure so that the endmembers and abundances can be generated, and a Softmax layer is applied at the end to guarantee the physical meaning of the abundances.
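The final Softmax layer guarantees the two physical constraints on abundances by construction. As a brief note (standard definitions, not quoted from the paper), for a pixel with pre-activation values z_1, ..., z_R over the R endmember channels:

```latex
a_k = \frac{\exp(z_k)}{\sum_{j=1}^{R} \exp(z_j)}, \qquad a_k \ge 0, \qquad \sum_{k=1}^{R} a_k = 1
```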
Result
Finally, comparative experiments are conducted on a simulated dataset and real hyperspectral datasets. On the simulated dataset with an SNR of 50 dB, the proposed method achieves the best mean spectral angle distance and root mean square error, 0.0386 and 0.0045, respectively; on the real hyperspectral datasets Samson and Jasper Ridge, it achieves the best mean spectral angle distances of 0.1194 and 0.1027, respectively.
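Behind these figures, the mean spectral angle distance compares estimated and reference endmembers, and the root mean square error compares estimated and reference abundances. Their standard forms are given below, with the averaging conventions assumed since the abstract does not state them (ê_k and e_k are the k-th estimated and reference endmembers, â and a the abundances, R the number of endmembers, and N the number of pixels):

```latex
\mathrm{SAD} = \frac{1}{R}\sum_{k=1}^{R} \arccos\!\left(
  \frac{\hat{\boldsymbol{e}}_k^{\top}\boldsymbol{e}_k}
       {\lVert \hat{\boldsymbol{e}}_k \rVert_2 \, \lVert \boldsymbol{e}_k \rVert_2}\right),
\qquad
\mathrm{RMSE} = \sqrt{\frac{1}{RN}\sum_{k=1}^{R}\sum_{i=1}^{N}\left(\hat{a}_{k,i}-a_{k,i}\right)^{2}}
```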
Conclusion
The experimental results verify the effectiveness and advantages of the DETN method, which can provide new technical support and theoretical references for deep learning-based unmixing.
Objective
In hyperspectral remote sensing, mixed pixels often exist because of the complex composition of natural surfaces and the limited spatial resolution of imaging instruments. A mixed pixel is a pixel in a hyperspectral image that contains the spectral signatures of multiple materials, which hinders the application of hyperspectral images in fields such as target detection, image classification, and environmental monitoring. Therefore, the decomposition (unmixing) of mixed pixels is a main concern in the processing of hyperspectral remote sensing images. Spectral unmixing aims to overcome the limitation of image spatial resolution by extracting the pure spectral signals (endmembers) that represent each land cover class and their respective proportions (abundances) within each pixel; it operates at the sub-pixel level on the basis of a spectral mixing model. The rise of deep learning has brought many advanced modeling theories and architectural tools to hyperspectral mixed pixel decomposition and has spawned many deep learning-based unmixing methods. Although these methods have advantages over traditional methods in information mining and generalization performance, deep networks often need to stack many layers to achieve optimal learning outcomes. As a result, they may damage the internal structure of the data during training, which leads to the loss of important information in hyperspectral data and affects the accuracy of unmixing. In addition, most existing deep learning-based unmixing methods focus only on spectral information, and the exploitation of spatial information is still limited to surface-level processing such as filtering and convolution. In recent years, autoencoders have been one of the research hotspots in deep learning, and many variant networks based on autoencoders have emerged. The Transformer is a deep learning network with an autoencoder-like structure. It has garnered considerable attention in fields such as natural language processing, computer vision, and time series analysis because of its powerful feature representation capability. As a neural network built primarily on the self-attention mechanism, the Transformer can better explore the underlying relationships among different features and more comprehensively aggregate the spectral and spatial correlations of pixels, which enhances abundance learning and improves unmixing accuracy. Although the Transformer network has recently been used to design unmixing methods, directly using unsupervised Transformer models to obtain features can lose many local details and makes it difficult to exploit the long-range dependency property of Transformers effectively.
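The sub-pixel spectral mixing model mentioned above is, in the most common setting, the linear mixing model. Assuming that this is the model underlying the discussion, an observed pixel spectrum y (B bands) is expressed through the endmember matrix E = [e_1, ..., e_R], the abundance vector a, and additive noise n as:

```latex
\boldsymbol{y} \;=\; \boldsymbol{E}\boldsymbol{a} + \boldsymbol{n}
  \;=\; \sum_{k=1}^{R} a_k \boldsymbol{e}_k + \boldsymbol{n},
\qquad a_k \ge 0, \qquad \sum_{k=1}^{R} a_k = 1
```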
Method
To address these limitations, this study proposes a deep embedded Transformer network (DETN) based on the Transformer-in-Transformer architecture. The network adopts an autoencoder framework that consists of two main parts: node embedding (NE) and blind signal separation. In the first part, the input hyperspectral image is uniformly divided twice, and the resulting image patches are mapped into sub-patch sequences and patch sequences through linear transformations. The sub-patch sequences are then processed by an internal Transformer structure to capture pixel spectral information and local spatial correlations, which are aggregated into the patch sequences for parameter and information sharing. Finally, with the local detail information retained in the patch sequences, the patch sequences are processed by an external Transformer structure to output pixel spectral information and global spatial correlations that incorporate the local information. In the second part, the NE output is first reconstructed into an abundance map and smoothed with a single 2D convolution layer to suppress noise, and a Softmax layer ensures the physical meaning of the abundances. Finally, another single 2D convolution layer reconstructs the image, and the endmembers are estimated and optimized within this convolution layer.
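To make the data flow concrete, below is a minimal PyTorch sketch of the encoder-decoder pipeline described above. It is an illustrative reading of the abstract, not the authors' implementation: the patch/sub-patch sizes, embedding widths, and the token-to-pixel mapping (`to_pixels` plus `fold`) are hypothetical choices, and the default band/endmember counts simply match the Samson dataset (156 bands, 3 endmembers).

```python
# Minimal sketch (not the authors' code): B = bands, R = endmembers, p/s = patch and sub-patch sizes.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DETNSketch(nn.Module):
    def __init__(self, B=156, R=3, p=4, s=2, dim=64, sub_dim=24):
        super().__init__()
        self.B, self.R, self.p, self.s = B, R, p, s
        m = (p // s) ** 2                               # sub-patches per patch
        self.sub_embed = nn.Linear(B * s * s, sub_dim)  # inner (sub-patch) tokens
        self.patch_embed = nn.Linear(B * p * p, dim)    # outer (patch) tokens
        self.inner = nn.TransformerEncoderLayer(sub_dim, nhead=4, dim_feedforward=2 * sub_dim,
                                                batch_first=True)
        self.outer = nn.TransformerEncoderLayer(dim, nhead=4, dim_feedforward=2 * dim,
                                                batch_first=True)
        self.fuse = nn.Linear(m * sub_dim, dim)         # share inner information with the outer token
        self.to_pixels = nn.Linear(dim, R * p * p)      # token -> per-pixel abundance logits (assumed mapping)
        self.smooth = nn.Conv2d(R, R, 3, padding=1)     # single 2D conv: smooth the abundance logits
        self.decoder = nn.Conv2d(R, B, 1, bias=False)   # single 2D conv: its weights act as the endmembers

    def forward(self, x):                               # x: (1, B, H, W), H and W divisible by p
        _, B, H, W = x.shape
        p, s, R = self.p, self.s, self.R
        patches = F.unfold(x, p, stride=p).transpose(1, 2)        # (1, N, B*p*p)
        N = patches.shape[1]
        outer = self.patch_embed(patches)                         # (1, N, dim)
        sub = F.unfold(patches.reshape(N, B, p, p), s, stride=s)  # (N, B*s*s, m)
        sub = self.inner(self.sub_embed(sub.transpose(1, 2)))     # (N, m, sub_dim): local detail
        outer = outer + self.fuse(sub.flatten(1)).unsqueeze(0)    # inject local detail into patch tokens
        tokens = self.outer(outer)                                # (1, N, dim): global context
        logits = F.fold(self.to_pixels(tokens).transpose(1, 2), (H, W), p, stride=p)  # (1, R, H, W)
        abund = torch.softmax(self.smooth(logits), dim=1)         # abundances: nonnegative, sum to one
        recon = self.decoder(abund)                               # (1, B, H, W) reconstruction
        endmembers = self.decoder.weight.squeeze(-1).squeeze(-1)  # (B, R) endmember matrix
        return recon, abund, endmembers

x = torch.rand(1, 156, 96, 96)   # toy cube: 156 bands as in Samson; spatial size chosen divisible by p
recon, abund, endmembers = DETNSketch()(x)
```

Training would then typically minimize a reconstruction loss between `recon` and the input cube, after which the decoder weights are read out as the endmember estimates.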
Result
To evaluate the effectiveness of the proposed method, experiments are conducted on simulated datasets and several real hyperspectral datasets, including the Samson dataset, the Jasper Ridge dataset, and part of a real hyperspectral farmland scene in Nanchang City, Jiangxi Province, acquired by the Gaofen-5 satellite and provided by Beijing Shengshi Huayao Technology Co., Ltd. In addition, data from the ZY1E satellite, also provided by Beijing Shengshi Huayao Technology Co., Ltd., are used to obtain partial hyperspectral imagery of the Port of Marseille in France for comparative experiments among the different methods. The experimental results are quantitatively analyzed using the spectral angle distance (SAD) and the root mean square error (RMSE). The proposed DETN is compared with the classical fully constrained least squares (FCLS) method and several state-of-the-art deep learning-based unmixing algorithms: deep autoencoder networks for hyperspectral unmixing (DAEN), the autoencoder network for hyperspectral unmixing with adaptive abundance smoothing (AAS), the untied denoising autoencoder with sparsity (uDAS), hyperspectral unmixing using deep image prior (UnDIP), and hyperspectral unmixing using Transformer network (DeepTrans-HSU). Results demonstrate that the proposed method outperforms the compared methods in terms of SAD, RMSE, and other evaluation metrics.
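For reference, a small NumPy sketch of how SAD and RMSE are typically computed in unmixing studies (the averaging conventions are assumed, and endmember columns are assumed to have already been matched to the references; the exact conventions used in the paper may differ):

```python
import numpy as np

def mean_sad(E_est, E_ref):
    """Mean spectral angle distance between (bands, R) endmember matrices, in radians."""
    cos = np.sum(E_est * E_ref, axis=0) / (
        np.linalg.norm(E_est, axis=0) * np.linalg.norm(E_ref, axis=0))
    return float(np.mean(np.arccos(np.clip(cos, -1.0, 1.0))))

def rmse(A_est, A_ref):
    """Root mean square error between (R, H, W) abundance maps."""
    return float(np.sqrt(np.mean((A_est - A_ref) ** 2)))
```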
Conclusion
The proposed method effectively captures and preserves the spectral information of pixels at both local and global levels, as well as the spatial correlations among pixels. As a result, it accurately extracts endmembers that match the ground-truth spectral signatures. Moreover, the method produces smooth abundance maps with high spatial consistency, even in regions of the image where details are difficult to discern. These findings confirm that the DETN method provides new technical support and theoretical references for addressing the challenges posed by mixed pixels in hyperspectral image unmixing.
遥感图像处理; 高光谱遥感; 混合像元分解; 深度学习; Transformer网络
remote sensing image processing; hyperspectral remote sensing; hyperspectral unmixing; deep learning; Transformer network
Boardman J W, Kruse F A and Green R O. 1995. Mapping target signatures via partial unmixing of AVIRIS data//Proceedings of the 5th Annual JPL Airborne Earth Science Workshop. [s.l.]: AVIRIS Workshop
Borsoi R A, Imbiriba T and Bermudez J C M. 2020. Deep generative endmember modeling: an application to unsupervised spectral unmixing. IEEE Transactions on Computational Imaging, 6: 374-384 [DOI: 10.1109/TCI.2019.2948726]
Eches O, Dobigeon N, Mailhes C and Tourneret J Y. 2010. Bayesian estimation of linear mixtures using the normal compositional model. Application to hyperspectral imagery. IEEE Transactions on Image Processing, 19(6): 1403-1413 [DOI: 10.1109/TIP.2010.2042993]
Fang Y, Wang Y X, Xu L L, Zhuo R M, Wong A and Clausi D A. 2022. BCUN: Bayesian fully convolutional neural network for hyperspectral spectral unmixing. IEEE Transactions on Geoscience and Remote Sensing, 60: #5523714 [DOI: 10.1109/TGRS.2022.3151004]
Gao L R, Han Z, Hong D F, Zhang B and Chanussot J. 2022a. CyCU-Net: cycle-consistency unmixing network by learning cascaded autoencoders. IEEE Transactions on Geoscience and Remote Sensing, 60: #5503914 [DOI: 10.1109/TGRS.2021.3064958]
Gao L R, Wang Z C, Zhuang L N, Yu H Y, Zhang B and Chanussot J. 2022b. Using low-rank representation of abundance maps and nonnegative tensor factorization for hyperspectral nonlinear unmixing. IEEE Transactions on Geoscience and Remote Sensing, 60: #5504017 [DOI: 10.1109/TGRS.2021.3065990]
Ghosh P, Roy S K, Koirala B, Rasti B and Scheunders P. 2022. Hyperspectral unmixing using Transformer network. IEEE Transactions on Geoscience and Remote Sensing, 60: #5535116 [DOI: 10.1109/TGRS.2022.3196057]
Guo R, Wang W and Qi H R. 2015. Hyperspectral image unmixing using autoencoder cascade//7th Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS). Tokyo, Japan: IEEE: 1-4 [DOI: 10.1109/WHISPERS.2015.8075378]
Han K, Xiao A, Wu E H, Guo J Y, Xu C J and Wang Y H. 2021. Transformer in Transformer//Proceedings of the 35th Conference on Neural Information Processing Systems. [s.l.]: [s.n.]: 15908-15919
Heinz D C and Chang C I. 2001. Fully constrained least squares linear spectral mixture analysis method for material quantification in hyperspectral imagery. IEEE Transactions on Geoscience and Remote Sensing, 39(3): 529-545 [DOI: 10.1109/36.911111]
Hua Z Q, Li X R, Qiu Q H and Zhao L Y. 2021. Autoencoder network for hyperspectral unmixing with adaptive abundance smoothing. IEEE Geoscience and Remote Sensing Letters, 18(9): 1640-1644 [DOI: 10.1109/LGRS.2020.3005999]
Jin Q W, Ma Y, Mei X G and Ma J Y. 2022. TANet: an unsupervised two-stream autoencoder network for hyperspectral unmixing. IEEE Transactions on Geoscience and Remote Sensing, 60: #5506215 [DOI: 10.1109/TGRS.2021.3094884]
Miao L and Qi H. 2007. Endmember extraction from highly mixed data using minimum volume constrained nonnegative matrix factorization. IEEE Transactions on Geoscience and Remote Sensing, 45(3): 765-777 [DOI: 10.1109/TGRS.2006.888466]
Nascimento J M P and Dias J M B. 2005. Vertex component analysis: a fast method to unmix hyperspectral data. IEEE Transactions on Geoscience and Remote Sensing, 43(4): 898-910 [DOI: 10.1109/TGRS.2005.844293]
Ozkan S, Kaya B and Akar G B. 2019. EndNet: sparse autoencoder network for endmember extraction and hyperspectral unmixing. IEEE Transactions on Geoscience and Remote Sensing, 57(1): 482-496 [DOI: 10.1109/TGRS.2018.2856929]
Palsson B, Ulfarsson M O and Sveinsson J R. 2021. Convolutional autoencoder for spectral-spatial hyperspectral unmixing. IEEE Transactions on Geoscience and Remote Sensing, 59(1): 535-549 [DOI: 10.1109/TGRS.2020.2992743]
Peng Q, Zhang B, Sun X, Gao L R and Yu W B. 2017. Hyperspectral unmixing based on spatial and spectral preprocessing prior and constrained non-negative matrix factorization. Journal of Image and Graphics, 22(4): 542-550
彭倩, 张兵, 孙旭, 高连如, 于文博. 2017. 结合空间光谱预处理和约束非负矩阵分解的高光谱图像混合像元分解. 中国图象图形学报, 22(4): 542-550 [DOI: 10.11834/jig.20170414]
Qi L, Chen Z W, Gao F, Dong J Y, Gao X B and Du Q. 2023. Multiview spatial-spectral two-stream network for hyperspectral image unmixing. IEEE Transactions on Geoscience and Remote Sensing, 61: #5502016 [DOI: 10.1109/TGRS.2023.3237556]
Qian Y T, Jia S, Zhou J and Robles-Kelly A. 2011. Hyperspectral unmixing via L1/2 sparsity-constrained nonnegative matrix factorization. IEEE Transactions on Geoscience and Remote Sensing, 49(11): 4282-4297 [DOI: 10.1109/TGRS.2011.2144605]
Qu H C, Ji R Q, Liu W J and Liang X J. 2015. Acceleration of hyperspectral image endmember extraction based on MapReduce pattern. Journal of Image and Graphics, 20(7): 973-980
曲海成, 籍瑞庆, 刘万军, 梁雪剑. 2015. MapReduce模式下高光谱图像端元提取算法加速. 中国图象图形学报, 20(7): 973-980 [DOI: 10.11834/jig.20150714]
Qu Y and Qi H R. 2019. uDAS: an untied denoising autoencoder with sparsity for spectral unmixing. IEEE Transactions on Geoscience and Remote Sensing, 57(3): 1698-1712 [DOI: 10.1109/TGRS.2018.2868690]
Rasti B, Koirala B, Scheunders P and Ghamisi P. 2022. UnDIP: hyperspectral unmixing using deep image prior. IEEE Transactions on Geoscience and Remote Sensing, 60: #5504615 [DOI: 10.1109/TGRS.2021.3067802]
Song X R, Wu L D and Meng X L. 2020. Unsupervised hyperspectral unmixing based on robust non-negative matrix factorization. Journal of Image and Graphics, 25(4): 801-812
宋晓瑞, 吴玲达, 孟祥利. 2020. 利用稳健非负矩阵分解实现无监督高光谱解混. 中国图象图形学报, 25(4): 801-812 [DOI: 10.11834/jig.190354]
Su Y C, Gao L R, Jiang M Y, Plaza A, Sun X and Zhang B. 2023. NSCKL: normalized spectral clustering with kernel-based learning for semisupervised hyperspectral image classification. IEEE Transactions on Cybernetics, 53(10): 6649-6662 [DOI: 10.1109/TCYB.2022.3219855]
Su Y C, Li J, Plaza A, Marinoni A, Gamba P and Chakravortty S. 2019. DAEN: deep autoencoder networks for hyperspectral unmixing. IEEE Transactions on Geoscience and Remote Sensing, 57(7): 4309-4321 [DOI: 10.1109/TGRS.2018.2890633]
Su Y C, Marinoni A, Li J, Plaza J and Gamba P. 2018. Stacked nonnegative sparse autoencoders for robust hyperspectral unmixing. IEEE Geoscience and Remote Sensing Letters, 15(9): 1427-1431 [DOI: 10.1109/LGRS.2018.2841400]
Su Y C, Xu X, Li J, Qi H R, Gamba P and Plaza A. 2021. Deep autoencoders with multitask learning for bilinear hyperspectral unmixing. IEEE Transactions on Geoscience and Remote Sensing, 59(10): 8615-8629 [DOI: 10.1109/TGRS.2020.3041157]
Sun L, Zhao G R, Zheng Y H and Wu Z B. 2022. Spectral-spatial feature tokenization Transformer for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing, 60: #5522214 [DOI: 10.1109/TGRS.2022.3144158]
Sun Y F, Liu B, Wang R R, Zhang P Q and Dai M F. 2023. Spectral-spatial MLP-like network with reciprocal points learning for open-set hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing, 61: #5513218 [DOI: 10.1109/TGRS.2023.3280183]
Tong Q X, Zhang B and Zhang L F. 2016. Current progress of hyperspectral remote sensing in China. Journal of Remote Sensing, 20(5): 689-707
童庆禧, 张兵, 张立福. 2016. 中国高光谱遥感的前沿进展. 遥感学报, 20(5): 689-707 [DOI: 10.11834/jrs.20166264]
Tu B, Liao X L, Li Q M, Peng Y S and Plaza A. 2022. Local semantic feature aggregation-based Transformer for hyperspectral image classification. IEEE Transactions on Geoscience and Remote Sensing, 60: #5536115 [DOI: 10.1109/TGRS.2022.3201145]
Tu B, Ren Q, Li Q M, He W Q and He W. 2023. Hyperspectral image classification using a superpixel-pixel-subpixel multilevel network. IEEE Transactions on Instrumentation and Measurement, 72: #5013616 [DOI: 10.1109/TIM.2023.3271713]
Wang J Y and Li C L. 2021. Development and prospect of hyperspectral imager and its application. Chinese Journal of Space Science, 41(1): 22-33
王建宇, 李春来. 2021. 高光谱遥感成像技术的发展与展望. 空间科学学报, 41(1): 22-33 [DOI: 10.11728/cjss2021.01.022]
Wang S Q, Yang J X, Shao Y T and Xiao L. 2023. Non-negative sparse component decomposition based modeling and robust unmixing for hyperspectral images. Journal of Image and Graphics, 28(2): 613-627
汪顺清, 杨劲翔, 邵远天, 肖亮. 2023. 高光谱图像非负稀疏分量分解建模与鲁棒性解混方法. 中国图象图形学报, 28(2): 613-627 [DOI: 10.11834/jig.211054]
Winter M E. 1999. N-FINDR: an algorithm for fast autonomous spectral end-member determination in hyperspectral data//Proceedings of SPIE 3753, Imaging Spectrometry V. Denver, USA: SPIE: 266-275 [DOI: 10.1117/12.366289]
Zhang B. 2016. Advancement of hyperspectral image processing and information extraction. Journal of Remote Sensing, 20(5): 1062-1090
张兵. 2016. 高光谱图像处理与信息提取前沿. 遥感学报, 20(5): 1062-1090 [DOI: 10.11834/jrs.20166179]