Segmentation Confidence-Guided Semi-Supervised Semantic Segmentation Based on Adversarial Training


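Graduate student:	Chen, Sheng-Yen (陳聖諺)
Thesis title:	Segmentation Confidence-Guided Semi-Supervised Semantic Segmentation Based on Adversarial Training (基於對抗式訓練的信任引導之半監督式語意分割)
Advisor:	Lin, Chia-Wen (林嘉文)
Committee members:	Huang, Ching-Chun (黃敬群); Kang, Li-Wei (康立威); Cheng, Hsu-Yung (鄭旭詠)
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication:	2019
Academic year of graduation:	107
Language:	English
Number of pages:	33
Keywords (translated from Chinese):	semi-supervised training, semantic segmentation, generative adversarial networks
Keywords (English):	semi-supervision, semantic segmentation, generative adversarial learning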

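This thesis proposes a deep learning architecture that trains a semantic segmentation
model through a semi-supervised mechanism and generative adversarial learning, so that
its image segmentation predictions become more accurate and approach those of fully
supervised training.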
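Semi-supervised generative adversarial semantic segmentation methods all originate, in
their main architecture, from the typical adversarial network concept. The base network
must contain two main blocks: first, a semantic segmentation network or a classifier that
performs the image segmentation; second, a discriminator that strengthens the model's
segmentation ability.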
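Architectures developed from generative adversarial networks fall mainly into two
categories. The first uses an existing segmentation network as a reliable and stable
generative model and uses a discriminative model to strengthen the segmentation ability
of the generator. The second uses a generator network to produce realistic synthetic
images that enlarge and diversify the training samples, and adds a classifier to determine
the class, thereby achieving the effect of a segmentation network. The advantage of the
former lies in the stability of the segmentation architecture: even when the labeled
samples are greatly reduced, the model retains a certain degree of reproducibility and
semantic classification performance. The advantage of the latter lies in the enlarged
training set, with the classifier promoting both image realism and semantic segmentation
ability.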
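The proposed method uses an existing semantic segmentation model together with
augmentation by generating diverse segmentation maps, combining the advantages of both
approaches, and pairs them with selective training under a semi-supervised strategy: only
effective and reliable segmented regions are selected for the loss computation, so that
unlabeled samples are used stably and effectively. Beyond training only on the reliable
segmentations generated from unlabeled images, the method further exploits the unreliable
segmentation results, which are far more numerous than the reliable ones, so that the
network learns to recognize unreliable classifications and lower their probability. This
further reduces the chance of semantic misjudgment and improves the overall mean semantic
segmentation accuracy.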

We propose a deep learning architecture based on semi-supervision and generative
adversarial learning to train a semantic segmentation network, with the aim of segmenting
more precisely and approaching the precision of fully supervised training.
Semi-supervised semantic segmentation networks based on generative adversarial
learning are derived from the typical GAN concept. The architecture has two main
blocks: one is a segmentation network or a classifier that performs the segmentation;
the other is a discriminator that enhances the segmentation capability of the model.
There are mainly two kinds of approaches to this task based on adversarial learning.
The first takes an existing segmentation network as a reliable and stable generative
model and strengthens its segmentation ability with a discriminator. The second trains
a generator to augment the training examples in the dataset and uses a classifier as
the discriminator. The advantage of the former comes from the stability and reliability
of the segmentation network: even if the labeled data are limited, the network still
preserves a certain degree of model reproducibility. The advantage of the latter is the
amount of augmented data generated, with the classifier boosting both the realism of
the images and the ability to segment objects.
We propose a method that uses an existing segmentation model and a generator for
synthesizing variations of segmentation maps. We also combine the architecture with
a selective strategy for training the segmentation network. That is, we selectively
choose the trusted regions for training with the cross-entropy loss, and we filter out
the untrusted classes in untrusted regions so that the network lowers the probability
of classifying them as the ground truth. Our approach leverages the information in
every pixel of an image under careful supervision, which helps the segmentation network
avoid mistakes and improves the mean IoU.
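
The selective strategy described in the abstract above can be sketched roughly as
follows. This is a minimal illustration, not the implementation used in the thesis: the
abstract does not give the loss formulas, so the function name `confidence_guided_loss`,
the per-pixel `confidence` scores (assumed here to come from the discriminator), the
`threshold` of 0.5, and the -log(1 - p) penalty applied to untrusted pixels are all
assumptions made for the example.

```python
import math


def confidence_guided_loss(probs, confidence, threshold=0.5, eps=1e-12):
    """Return (trusted_loss, untrusted_loss) for one unlabeled image.

    probs      -- per-pixel class probabilities, e.g. [[0.9, 0.1], ...]
    confidence -- per-pixel trust scores in [0, 1] (assumed discriminator output)
    threshold  -- hypothetical cut-off separating trusted from untrusted pixels
    """
    trusted, untrusted = [], []
    for p, c in zip(probs, confidence):
        # Class the segmentation network currently predicts for this pixel.
        k = max(range(len(p)), key=lambda i: p[i])
        if c >= threshold:
            # Trusted pixel: cross-entropy against the predicted class,
            # which is treated as a pseudo-label.
            trusted.append(-math.log(p[k] + eps))
        else:
            # Untrusted pixel: penalize the probability of the predicted
            # class so the network learns to lower it (assumed form).
            untrusted.append(-math.log(1.0 - p[k] + eps))

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return mean(trusted), mean(untrusted)


# Two pixels, two classes: the first pixel is trusted, the second is not.
probs = [[0.9, 0.1], [0.6, 0.4]]
confidence = [0.8, 0.2]
trusted_loss, untrusted_loss = confidence_guided_loss(probs, confidence)
print(trusted_loss, untrusted_loss)
```

In this sketch every pixel contributes to one of the two terms, which mirrors the
abstract's point that untrusted regions are used rather than discarded. How the two
terms are weighted against each other is left open here.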

Chinese Abstract
Abstract
Contents
Chapter 1 Introduction
1.1 Research Background
Motivation and Objective
1.2 Thesis Organization
Chapter 2 Related Work
2.1 Segmentation Models
2.2 Adversarial Networks
2.3 Adversarial Learning Based Segmentation Models
Chapter 3 Proposed Method
3.1 Overview of Proposed Method
3.2 Semi Supervision
3.3 Loss Functions
Chapter 4 Experiments and Discussions
4.1 Data Set
4.2 Architecture Deformation
4.3 Performance Evaluation
Chapter 5 Conclusion
References
                                

