Application of guided set diffusion model algorithm in floor plan reconstruction
2024, pp. 1-11
Online publication date: 2024-11-29
DOI: 10.11834/jig.240508
Wang Jing,Xiong Haoran,Huang Hui.Application of guided set diffusion model algorithm in floor plan reconstruction[J].Journal of Image and Graphics,
Objective
Indoor floor plan vectorization is a sophisticated technique aimed at extracting precise structural information from raster images and converting them into vector representations. This process is essential in fields such as architectural renovation, interior design, and scene understanding, where accurate and efficient vectorization of floor plans can greatly enhance the quality and usability of spatial data. Traditionally, the vectorization process has been carried out using a two-stage pipeline. In the first stage, deep neural networks are used to segment the raster image, producing masks that define the room regions within the floor plan. These masks serve as the foundation for the subsequent vectorization process. In the second stage, post-processing algorithms are applied to these masks to extract vector information, with a focus on elements such as walls, doors, and other structural components. However, this process is not without its challenges. One of the main issues is error accumulation; inaccuracies in the initial mask generation can lead to compounded errors during vectorization. Moreover, post-processing algorithms often lack robustness, particularly when dealing with complex or degraded input images, leading to suboptimal vectorization results.
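The second, post-processing stage described above is commonly a polygon-simplification pass over the traced mask boundary, and its fragility is easy to see in code. The sketch below is illustrative, not the specific algorithm of any cited work: assuming boundary pixels have already been traced into an ordered point list, a classic Douglas-Peucker pass shows how a single hand-tuned tolerance `epsilon` decides which corners survive — one source of the error accumulation discussed above.

```python
import math

def perpendicular_distance(p, a, b):
    # Distance from point p to the infinite line through a and b.
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len = math.hypot(dx, dy)
    if seg_len == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dx * (ay - py) - dy * (ax - px)) / seg_len

def douglas_peucker(points, epsilon):
    # Keep the point farthest from the chord a-b whenever its
    # deviation exceeds epsilon; otherwise collapse to the chord.
    # (A closed contour would first be split at two extreme points.)
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    idx, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], a, b)
        if d > dmax:
            idx, dmax = i, d
    if dmax <= epsilon:
        return [a, b]
    left = douglas_peucker(points[:idx + 1], epsilon)
    right = douglas_peucker(points[idx:], epsilon)
    return left[:-1] + right  # drop duplicated split point

# A noisy traced boundary around one rectangular room corner.
boundary = [(0, 0), (1, 0.1), (2, -0.1), (3, 0.05), (4, 0),
            (4.1, 1), (3.9, 2), (4, 3)]
simplified = douglas_peucker(boundary, 0.2)
```

With `epsilon = 0.2` the jitter is absorbed and the corner at roughly (4, 0) survives; a slightly smaller tolerance keeps spurious vertices, a larger one erases true corners — exactly the robustness problem the mask-then-post-process pipeline inherits.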
Method
To address these challenges, we propose a novel approach based on diffusion models for vector reconstruction of indoor floor plans. Diffusion models, originally developed for generative tasks, have shown great promise in producing high-quality outputs by iteratively refining input data. Our method leverages this capability to enhance the precision of floor plan vectorization. Specifically, the algorithm starts with rough masks generated by object detection or instance segmentation models. While these masks provide a basic outline of the room regions, they may lack the accuracy needed for precise vectorization. The diffusion model is then employed to iteratively refine the contour points of these rough masks, gradually reconstructing the room contours with greater accuracy. This process involves multiple iterations, during which the model adjusts the contour points based on patterns learned from the training data, leading to a more accurate representation of the room boundaries. A key innovation in our approach is the introduction of a contour inclination loss function. This loss function is specifically designed to guide the diffusion model in generating more reasonable and structurally sound room layouts. By penalizing unrealistic or impractical contour inclinations, the model is encouraged to produce outputs that closely resemble real-world room configurations. This not only improves the visual accuracy of the room contours but also enhances the overall quality of the vectorized floor plan. The benefits of our diffusion model-based approach are manifold. Firstly, by refining room contours before the vectorization stage, we can significantly reduce the error accumulation that plagues traditional methods. The more accurate the room contours, the more precise the subsequent wall vector extraction will be. Secondly, the use of a diffusion model allows for more robust handling of complex or noisy input images. Unlike traditional post-processing algorithms, which may struggle with irregularities in the input data, the diffusion model's iterative refinement process is better equipped to address such challenges, leading to more reliable vectorization results.
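The shape of this inference loop can be caricatured as follows. This toy sketch is not the paper's trained network: `refine_step` is a hand-written stand-in for one reverse-diffusion update (a learned denoiser would predict the vertex offsets instead), and `inclination_loss` is only one plausible formalization of the contour inclination penalty, in which axis-aligned edges score zero. All names are illustrative.

```python
import math

def inclination_loss(polygon):
    # Penalize edges that are neither horizontal nor vertical:
    # for each edge take min(|sin|, |cos|) of its angle, so
    # axis-aligned edges contribute exactly zero.
    loss, n = 0.0, len(polygon)
    for i in range(n):
        (x0, y0), (x1, y1) = polygon[i], polygon[(i + 1) % n]
        ang = math.atan2(y1 - y0, x1 - x0)
        loss += min(abs(math.sin(ang)), abs(math.cos(ang)))
    return loss / n

def refine_step(polygon, step=0.5):
    # Stand-in for one iterative refinement update: nudge each
    # vertex coordinate toward the nearer of its two neighbours'
    # coordinates, which straightens nearly axis-aligned edges.
    n = len(polygon)
    out = []
    for i in range(n):
        x, y = polygon[i]
        (px, py), (nx, ny) = polygon[i - 1], polygon[(i + 1) % n]
        tx = px if abs(px - x) < abs(nx - x) else nx
        ty = py if abs(py - y) < abs(ny - y) else ny
        out.append((x + step * (tx - x), y + step * (ty - y)))
    return out

# A rough, slightly skewed quadrilateral from an upstream mask.
rough = [(0.1, -0.2), (4.2, 0.15), (3.9, 3.1), (-0.1, 2.9)]
poly = rough
for _ in range(20):  # the iterative reverse process
    poly = refine_step(poly)
```

After a handful of iterations the contour collapses onto an axis-aligned rectangle and the inclination penalty falls toward zero; in the actual method, this penalty is a training loss that pushes the learned denoiser toward such regular layouts rather than a hand-coded update rule.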
Result
We have thoroughly validated our method on the public CubiCasa5K dataset, a widely used benchmark in the field of floor plan vectorization. The results of our experiments demonstrate that our approach significantly outperforms existing methods in both accuracy and robustness. Notably, we observed a marked improvement in the precision of wall vector extraction, which is crucial for applications in architectural renovation and interior design. The diffusion model's ability to produce more accurate room contours directly translates into better vectorized representations of floor plans, making our approach a valuable tool for professionals in these fields.
Conclusion
In conclusion, the vectorization of indoor floor plans is a critical task with far-reaching applications, and the limitations of traditional methods have highlighted the need for more advanced techniques. Our diffusion model-based algorithm represents a significant step forward, offering a more accurate and reliable solution for the vector reconstruction of indoor floor plans. By refining room contours through iterative adjustments and incorporating a contour inclination loss function, our method not only addresses the issue of error accumulation but also enhances the overall quality of the vectorized output.

Furthermore, our method holds broad application prospects in the field of interior design. Modern interior designers increasingly rely on digital tools for design and planning, where accurate indoor space representation is essential. With our method, designers can quickly generate high-quality room outlines and apply them to various design scenarios, such as furniture placement, lighting design, and space optimization. This efficient and precise vectorization technology not only saves designers significant time but also improves the feasibility and practicality of design solutions.

In the field of building renovation, our approach offers significant advantages as well. During the renovation of old buildings, it is often necessary to redraw and optimize the original floor plan. This process is often constrained by the quality of the original drawings, especially in older buildings, where the original plans may have deteriorated or been lost. Through our diffusion model vectorization method, architects can quickly reconstruct digital versions of these old floor plans and use them as a foundation for renovation designs. This not only improves the efficiency of the renovation process but also helps preserve the historical character of the building.

Lastly, our method has important application potential in the field of architectural scene understanding. With the rise of smart homes and automated building management systems, accurate indoor floor plan data is critical to enabling these intelligent features. Our vectorization method can generate highly accurate representations of indoor spaces, providing reliable foundational data for intelligent systems. Overall, our approach promises to significantly improve the efficiency and effectiveness of floor plan vectorization, paving the way for more accurate architectural designs and enhanced spatial understanding.
deep learning; indoor floor plan; generative reconstruction; diffusion model; floor plan vector technology
Chen L C, Zhu Y K, Papandreou G, Schroff F and Adam H. 2018. Encoder-decoder with atrous separable convolution for semantic image segmentation//Proceedings of the European Conference on Computer Vision. ECCV: 801-818. [DOI: 10.1007/978-3-030-01234-2_49]
Chen J C, Qian Y M and Furukawa Y. 2022. HEAT: Holistic Edge Attention Transformer for Structured Reconstruction//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE: 3866-3875. [DOI: 10.1109/CVPR52688.2022.00384]
Chen J C, Deng R Z and Furukawa Y. 2023. PolyDiffuse: Polygonal Shape Reconstruction via Guided Set Diffusion Models [EB/OL]. [2023-06-02]. https://arxiv.org/abs/2306.01461
He K M, Gkioxari G, Dollar P and Girshick R. 2017. Mask R-CNN//Proceedings of the IEEE International Conference on Computer Vision. ICCV: 2961-2969. [DOI: 10.1109/ICCV.2017.322]
Ho J, Jain A and Abbeel P. 2020. Denoising diffusion probabilistic models [EB/OL]. [2020-06-19]. https://arxiv.org/abs/2006.11239
Kalervo A, Ylioinas J, Häikiö M, Karhu A and Kannala J. 2019. CubiCasa5K: a dataset and an improved multi-task model for floorplan image analysis//Image Analysis: 21st Scandinavian Conference. Springer: 28-40. [DOI: 10.1007/978-3-030-20205-7_3]
Kirillov A, Mintun E, Ravi N, Mao H, Rolland C, Gustafson L, Xiao T, Whitehead S, Berg A C, Lo W Y, Dollar P and Girshick R. 2023. Segment anything//Proceedings of the IEEE/CVF International Conference on Computer Vision. IEEE: 4015-4026. [DOI: 10.48550/arXiv.2304.02643]
Lazarow J, Xu W J and Tu Z W. 2022. Instance segmentation with mask-supervised polygonal boundary transformers//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE: 4382-4391. [DOI: 10.1109/CVPR52688.2022.00434]
Liu C, Wu J J, Kohli P and Furukawa Y. 2017. Raster-to-Vector: Revisiting Floorplan Transformation//Proceedings of the IEEE International Conference on Computer Vision. IEEE: 2195-2203. [DOI: 10.1109/ICCV.2017.241]
Liu C, Wu J Y and Furukawa Y. 2018. FloorNet: A Unified Framework for Floorplan Reconstruction from 3D Scans [EB/OL]. [2018-03-31]. https://arxiv.org/abs/1804.00090
Lv X L, Zhao S C, Yu X Y and Zhao B Q. 2021. Residential floor plan recognition and reconstruction//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE: 16717-16726. [DOI: 10.1109/CVPR46437.2021.01644]
Pizarro P N, Hitschfeld N, Sipiran I and Saavedra J M. 2022. Automatic floor plan analysis and recognition. Automation in Construction, 140: 104348. [DOI: 10.1016/j.autcon.2022.104348]
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, Kaiser L and Polosukhin I. 2017. Attention is all you need [EB/OL]. [2017-06-12]. https://arxiv.org/abs/1706.03762
Yue Y W, Kontogianni T, Schindler K and Engelmann F. 2023. Connecting the Dots: Floorplan Reconstruction Using Two-Level Queries//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. IEEE: 845-854. [DOI: 10.48550/arXiv.2211.15658]
Zhao Y A, Lv W Y, Xu S L, Wei J M, Wang G Z, Dang Q Q, Liu Y and Chen J. 2023. DETRs Beat YOLOs on Real-time Object Detection [EB/OL]. https://arxiv.org/abs/2304.08069
Zhu R Y, Shen J C, Deng X T, Wallden M and Ino F. 2020. Training Strategies for CNN-based Models to Parse Complex FloorPlans//Proceedings of the 2020 9th International Conference on Software and Computer Applications. 11-16. [DOI: 10.1145/3384544.3384566]