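Graduate student: 蔡悅承
Tsai, Yueh-Cheng
Thesis title: 少樣本全場景三維點雲切割
A Full-Scene Approach to Few-Shot 3D Point Cloud Segmentation
Advisor: 陳煥宗
Chen, Hwann-Tzong
Committee members: 李哲榮
劉庭祿
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2022
Academic year of graduation: 111
Language: Chinese
Pages: 27
Keywords (Chinese): 三維點雲語意切割, 少樣本語意切割
Keywords (English): 3D point cloud semantic segmentation, few-shot semantic segmentation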
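  • The 3D point cloud semantic segmentation task incurs high annotation costs. For typical deep learning methods that rely on large amounts of training data, a lack of sufficient annotated data is a major obstacle to training and makes it hard to reach the expected performance. To reduce the labor required for annotation and overcome this obstacle, we apply few-shot methods to 3D point cloud segmentation. In this thesis, we improve on previous few-shot 3D semantic segmentation work so that it can be applied in a more realistic full-scene setting rather than the original cropped-scene setting. This setting is challenging because, in a full scene, the background class covers more points and its ratio to the target classes is more imbalanced. To this end, we propose a "neighbor-attention module" to generate more discriminative features, and design a "support-query cross-attention module" that refines features by exchanging information between support prototypes and query features. We validate the proposed few-shot full-scene point cloud segmentation method on 3D-SIS and ScanNet, two commonly used 3D scene datasets.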


    3D point cloud semantic segmentation requires expensive annotation. Few-shot learning methods aim to reduce this annotation labor by generalizing segmentation to unseen classes from only a few training samples. In this thesis, we improve on previous few-shot 3D semantic segmentation work by adapting it to a more realistic full-scene inference setting instead of the original cropped-scene inference setting. This setting is challenging because, in a full-scene point cloud, the background label is more dominant and the class distribution is more imbalanced. To handle full-scene 3D segmentation, we propose a neighbor-attention module to generate discriminative features and a support-query cross-attention module that refines the features by exchanging information between support prototypes and query features. We evaluate the proposed method on the 3D-SIS and ScanNet benchmark datasets for full-scene 3D semantic segmentation.
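    The abstract gives no implementation details, so the following is only a generic sketch of how information might be exchanged between support prototypes and query features with dot-product attention. It is not the thesis's module: the function names (`masked_average`, `cross_attend`), the residual update, the scaling factor, and the toy 2-D features are all assumptions made for illustration.

```python
import math
from typing import List

Vector = List[float]


def masked_average(features: List[Vector], mask: List[int]) -> Vector:
    """Class prototype: mean of the support features whose mask value is 1."""
    selected = [f for f, m in zip(features, mask) if m == 1]
    if not selected:
        raise ValueError("mask selects no points")
    dim = len(selected[0])
    return [sum(f[d] for f in selected) / len(selected) for d in range(dim)]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def cross_attend(queries: List[Vector], keys: List[Vector]) -> List[Vector]:
    """Refine each query by adding an attention-weighted sum of the keys.

    Assumption: scaled dot-product attention with a residual connection
    and no learned projections.
    """
    scale = math.sqrt(len(keys[0]))
    refined = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        context = [
            sum(w * k[d] for w, k in zip(weights, keys)) for d in range(len(q))
        ]
        refined.append([qi + ci for qi, ci in zip(q, context)])
    return refined


# Toy data (hypothetical): four support points with 2-D features.
support = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
foreground = [1, 1, 0, 0]
background = [0, 0, 1, 1]
prototypes = [
    masked_average(support, foreground),
    masked_average(support, background),
]
queries = [[0.9, 0.0], [0.1, 1.0]]

# Exchange information in both directions.
refined_queries = cross_attend(queries, prototypes)
refined_prototypes = cross_attend(prototypes, queries)
print(refined_queries)
print(refined_prototypes)
```

    In this sketch each query point is pulled toward the prototype it most resembles, and each prototype is in turn adjusted by the query points it matches, which is one simple reading of "exchanging information" between the two sets.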

    List of Tables
    List of Figures
    Abstract (in Chinese)
    Abstract
    1 Introduction
    2 Related Work
    3 Approach
        3.1 Preliminaries
        3.2 Assumptions
        3.3 Full-Scene Inference
    4 Experiments
        4.1 Datasets
        4.2 Implementation Details
        4.3 Main Results
    5 Conclusion and Future Work
    Bibliography

