
Graduate student: 何元通
He, Yuan-Tong
Thesis title: 影像語意切割透過對中層特徵之迭代整合網路
Iterative Integration Network over Intermediate Features for Semantic Segmentation
Advisor: 張隆紋
Chang, Long-Wen
Oral defense committee: 陳朝欽
Chen, Chaur-Chin
邱瀞德
Chiu, Ching-Te
Degree: Master
Department:
Year of publication: 2018
Academic year of graduation: 106
Language: English
Number of pages: 52
Keywords (Chinese): 影像語意切割, 迭代, 類神經網路, RefineNet
Keywords (English): semantic segmentation, iterative, neural network, RefineNet
  • Recently, networks with a U-Net architecture have shown outstanding performance in semantic segmentation. The architecture consists of an encoder network and a decoder network; feature maps from the encoder are passed to the decoder to supply richer spatial information.
    We adopt RefineNet, which also follows the U-Net architecture, as our base model. We alter its encoder network and modify the sub-blocks inside the RefineNet blocks; these changes reduce memory cost. In addition, we apply a recurrent, iterative architecture in the decoder network, which reduces memory cost further.
    Finally, we run our experiments on the CamVid dataset and also obtain RefineNet results on it. Our network achieves a higher mean intersection-over-union score than RefineNet at a relatively lower memory cost.
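
    The abstract reports results as a mean intersection-over-union (mIoU) score. As a reading aid, the following is a minimal sketch of how such a score is commonly computed from predicted and ground-truth label maps. It is not taken from the thesis: the function name `mean_iou`, the `ignore_label` parameter, and the toy labels are assumptions made for illustration, and the thesis may average over classes or images differently.

```python
def mean_iou(pred, target, num_classes, ignore_label=None):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flat sequences of integer class labels,
    one entry per pixel. Hypothetical helper, for illustration only.
    """
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # skip pixels whose ground truth is unlabeled
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1
    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # classes absent from both maps are left out of the mean
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: a 2x3 label map flattened row by row, with 3 classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))  # prints 0.5
```

    In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so their mean is 0.5.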

    Chapter 1. Introduction
    Chapter 2. Related Work
        2.1 Feature Fusing
        2.2 Encoder Network
        2.3 Recurrent Structure
    Chapter 3. Method
        3.1 Network Overview
        3.2 Encoder Network
        3.3 RefineNet block to IINet block
            3.3.1 Residual Encode-Decode Unit
            3.3.2 Feature Fusing
            3.3.3 Chained Concatenated Pooling
            3.3.4 Supplemented Dense Refinement
        3.4 Iterative Integration
            3.4.1 Iterative Architecture
            3.4.2 Iterative State Weighting
        3.5 Other Network Details
    Chapter 4. Experiment Results
        Metrics
        Baselines
        Experiment details
        Experiment on CamVid
    Chapter 5. Conclusions and Discussion
    References

