Dental model segmentation network with fine-grained receptive fields and multiscale fusion
2024, Vol. 29, No. 12, Pages 3786-3799
Print publication date: 2024-12-16
DOI: 10.11834/jig.230769
Zhou Xinwen, Zhu Yang, Ge Junyi, Pang Qianjia, Wei Ran, Gu Min. 2024. Dental model segmentation network with fine-grained receptive fields and multiscale fusion. Journal of Image and Graphics, 29(12):3786-3799
Objective
Accurately segmenting teeth from intra-oral scanned point cloud models is an important task in computer-aided dental treatment, but performing it manually is time-consuming and tedious. In recent years, several end-to-end methods for 3D shape segmentation have emerged in computer vision. However, most of these methods overlook the fact that dental model segmentation requires a network with a more fine-grained receptive field, so their segmentation accuracy remains limited. To solve this problem, we design TRNet, an end-to-end fully automatic tooth segmentation network with fine-grained receptive fields, which segments teeth directly on raw intra-oral scanned point cloud models.
Method
First, TRNet uses an encoder with fine-grained receptive fields. Based on multiscale fusion, the encoder extracts more comprehensive dental model features at different scales, and improves segmentation performance through a fine-grained grouping query radius better suited to dental model segmentation and a feature extraction layer with relative-coordinate normalization. Second, TRNet adopts a feature embedding scheme based on hierarchical connections, so the network learns key features of the dental model ranging from individual local regions up to larger spatial extents; feature extraction becomes more comprehensive and segmentation accuracy improves. TRNet also uses a feature fusion scheme based on a soft attention mechanism, allowing the network to better attend to the key information of the dental model within the fused features.
Result
TRNet was evaluated on a dataset of patients' intra-oral scanned point cloud models acquired with an intra-oral scanner. In 5-fold cross-validation, TRNet achieved an overall accuracy (OA) of 97.015% ± 0.096% and a mean intersection over union (mIoU) of 92.691% ± 0.454%, significantly outperforming existing methods.
Conclusion
The experimental results show that the proposed multiscale fusion dental model segmentation network with fine-grained receptive fields performs well on intra-oral scanned point cloud models, strengthening the network's ability to segment dental models and making point cloud segmentation results more accurate.
Objective
Dental computer-aided therapy relies on the use of dental models to aid dentists in their practice. One of the most fundamental tasks in dental computer-aided therapy is the automated segmentation of teeth using point cloud data obtained from intra-oral scanners (IOS). The precise segmentation of each individual tooth in this procedure provides vital information for a variety of subsequent tasks. These segmented dental models facilitate customized treatment planning and modeling, thus providing extensive assistance in carrying out further treatments. However, the automated segmentation of individual teeth from dental models faces three significant challenges. First, the indistinct boundary between teeth and gums poses difficulties for segmentation based solely on geometric features. Second, certain factors, such as occlusion during scanning, can lead to suboptimal results, particularly in posterior dental regions, thereby further complicating the segmentation process. Lastly, patients' teeth often exhibit complex anomalies, including crowding, missing teeth, and misalignment, which further complicate the task of accurate segmentation. To address these challenges, two classes of conventional methods have been proposed for segmenting teeth in images obtained from IOS. The first employs a projection-based approach, wherein a 3D dental scan image is initially projected into a 2D space, segmentation is performed in the 2D space, and the result is remapped back into the 3D space. The second adopts a geometry-based approach and typically utilizes geometric attributes, such as surface curvature, geodesic information, harmonic fields, and other geometric properties, to distinguish tooth structures. However, these methods are not fully automated and rely on domain-specific knowledge and experience. Moreover, the predefined low-level attributes used by these methods lack robustness when dealing with the complex appearance of patients' teeth.
Considering the impactful application of convolutional neural networks (CNN) in computer vision and medical image processing, several deep learning methods rooted in CNN have been introduced. Some of these methods directly extract translation-invariant depth geometric features from 3D point cloud data but suffer from a lack of necessary receptive field for fine-grained tasks, such as dental model segmentation. Moreover, the network structure exhibits redundancy and neglects the crucial details of dental models. To address these issues, a fully automatic tooth segmentation network called TRNet is proposed in this paper, which can automatically segment teeth on unprocessed intra-oral scanned point cloud models.
Method
In the proposed end-to-end 3D point cloud-based multiscale fusion dental model segmentation method, an encoder with a fine-grained receptive field is employed to address the challenges posed by the small size of each tooth relative to the entire dental model and the lack of distinct features at the boundaries between the teeth and gums. A fine-grained receptive field is therefore essential for extracting features from this model. The network adopts a small radius for querying the neighborhood, thus narrowing the receptive field and enabling the network to focus on detailed features. Additionally, downsampling can lead to uneven point cloud density, causing a network trained on sparse point clouds to struggle to recognize fine-grained local structures. Multiscale feature fusion coding is implemented to address this issue. Given that the encoder uses a small query radius to create a fine-grained receptive field, the relative coordinates become small, and the network must learn large weights to operate on them, which complicates optimization. TRNet normalizes the relative coordinates in the feature extraction layer to facilitate network optimization and enhance segmentation performance. The network also employs a highly efficient decoder. Previous segmentation methods often adopt the U-Net structure, which uses skip connections to aggregate multi-level features between the input features of the cascaded decoder and the outputs of the corresponding encoder layer. However, this top-down propagation is inefficient for feature aggregation.
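The fine-grained neighborhood query and relative-coordinate normalization described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the function names, the fixed neighbor count `k`, and the padding strategy are assumptions in the style of PointNet++-like ball-query grouping.

```python
import numpy as np

def ball_query(points, centers, radius, k):
    """For each center, gather up to k neighbor indices within `radius`.

    points:  (N, 3) point cloud coordinates
    centers: (M, 3) query centers (e.g. from farthest-point sampling)
    Returns: (M, k) integer indices; short groups are padded with the
    first neighbor found, as in PointNet++-style grouping.
    """
    # Pairwise distances between every center and every point: (M, N)
    dists = np.linalg.norm(points[None, :, :] - centers[:, None, :], axis=-1)
    groups = np.empty((len(centers), k), dtype=np.int64)
    for i, d in enumerate(dists):
        idx = np.flatnonzero(d <= radius)[:k]
        if idx.size == 0:                      # no neighbor in range: fall back to nearest point
            idx = np.array([np.argmin(d)])
        pad = np.full(k - idx.size, idx[0])    # pad short groups with the first neighbor
        groups[i] = np.concatenate([idx, pad])
    return groups

def normalized_relative_coords(points, centers, groups, radius):
    """Relative coordinates divided by the query radius, so a small
    (fine-grained) radius does not force the network to learn large weights."""
    rel = points[groups] - centers[:, None, :]   # (M, k, 3)
    return rel / radius
```

With a fine-grained radius such as 0.05 the raw relative coordinates would all be tiny; dividing by the radius rescales them into a unit ball regardless of how small the query radius is.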
The decoding approach used by TRNet directly combines the features output by all cascaded encoders, allowing the network to learn the importance of each cascade. Discrepancies in the scales or dimensions of the features being fused may also introduce unwanted bias during fusion. To address these issues and ensure that the network focuses on crucial information within the fused features, a soft attention mechanism is incorporated into the fusion process. Specifically, a soft attention operation is performed on the newly combined features after their concatenation, enabling the network to adaptively balance the discrepancies across scales or levels in the propagated features.
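One way such a soft-attention fusion over encoder levels can work is sketched below in NumPy. The tiny scoring layer (`w`, `b`) producing one scalar score per level, and the assumption that all levels have already been interpolated to the same points and channel width, are illustrative choices, not the paper's exact design.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def soft_attention_fuse(level_feats, w, b):
    """Fuse per-point features from L encoder levels with soft attention.

    level_feats: list of L arrays, each (N, C) — features from one encoder
                 level, already mapped to the same N points and C channels.
    w, b:        hypothetical scoring parameters, w: (C,), b: (L,).
    Returns:     (N, C) fused features, an attention-weighted sum over levels.
    """
    stacked = np.stack(level_feats, axis=1)        # (N, L, C)
    scores = stacked @ w + b                       # (N, L): one score per level
    attn = softmax(scores, axis=1)                 # weights sum to 1 over levels
    return (attn[..., None] * stacked).sum(axis=1) # (N, C)
```

Because the per-level weights are a softmax, the fused feature of each point is a convex combination of that point's features across levels, which is what lets the network adaptively down-weight levels whose scale or magnitude would otherwise bias the fusion.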
Result
A dataset comprising dental models taken from numerous patients with irregular tooth shapes, such as crowding, misalignment, and underdeveloped teeth, was compiled. To establish the ground-truth labels, an experienced dentist meticulously segmented and annotated these models. The dataset was then randomly divided into two subsets, with 146 models allocated for training and 20 models reserved for testing. Data augmentation techniques, such as random panning and scaling, were employed to enhance the diversity of the training set. In each iteration, intra-oral scan images were shifted by a randomly selected value within the range of [-0.1, 0.1] and scaled by a randomly chosen magnification within the range of [0.8, 1.25], thereby generating new training data. Experimental results from a 5-fold cross-validation reveal that TRNet achieved an overall accuracy (OA) of 97.015% ± 0.096% and a mean intersection over union (mIoU) of 92.691% ± 0.454%, significantly outperforming existing methods.
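The augmentation ranges stated above and the two reported metrics can be illustrated with a short NumPy sketch. The metric definitions are the standard ones for point-wise segmentation; the `augment` helper is a hypothetical name using the shift and scale ranges from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(points):
    """Random panning in [-0.1, 0.1] and scaling in [0.8, 1.25],
    matching the training augmentation described in the text."""
    shift = rng.uniform(-0.1, 0.1, size=3)
    scale = rng.uniform(0.8, 1.25)
    return points * scale + shift

def overall_accuracy(pred, gt):
    """OA: fraction of points whose predicted label matches the ground truth."""
    return (pred == gt).mean()

def mean_iou(pred, gt, num_classes):
    """mIoU: per-class intersection over union, averaged over the classes
    that appear in either the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))
```

mIoU is the stricter of the two metrics: a class that occupies few points (a single small tooth) contributes to the average with the same weight as the large gum region, which is why per-tooth segmentation papers report it alongside OA.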
Conclusion
An end-to-end deep learning network called TRNet is introduced in this paper for the automatic segmentation of teeth in 3D dental images acquired from intra-oral scanners. An encoder with fine-grained receptive fields was also implemented to enhance the local feature extraction capabilities essential for dental model segmentation. Additionally, a decoder based on hierarchical connections was employed to allow the network to decode efficiently by learning the significance of each level. This refinement significantly improves the precision of dental model segmentation. A soft attention mechanism was also integrated into the feature fusion process to enable the network to focus on key information within dental model features. Experimental results indicate that TRNet shows excellent performance on intra-oral scanned point cloud models and enhances the ability of the network to segment dental models, thereby improving the accuracy of point cloud segmentation results.
automatic dental model segmentation; point cloud; fine-grained receptive fields; multiscale feature fusion; coordinate normalization; soft attention mechanism