| Graduate Student | 曾景暐 Tseng, Ching-Wei |
|---|---|
| Thesis Title | 泛用輕量型條件生成對抗網路於影像補全 (General Deep Image Completion with Lightweight Conditional Generative Adversarial Networks) |
| Advisor | 賴尚宏 Lai, Shang-Hong |
| Committee Members | 劉庭祿; 陳煥宗 Chen, Hwann-Tzong; 許秋婷 Hsu, Chiu-Ting |
| Degree | Master |
| Department | |
| Publication Year | 2017 |
| Academic Year of Graduation | 105 (ROC calendar) |
| Language | English |
| Pages | 72 |
| Keywords (Chinese) | 影像補全 (image completion), 條件式生成對抗網路 (conditional generative adversarial networks), 深度學習 (deep learning) |
| Keywords (English) | Inpainting, GANs, DCNNs |
In recent years, the rise of deep generative adversarial networks has enabled image restoration results that are better and more realistic than those of traditional methods. However, typical deep learning approaches require a huge number of training parameters and cannot be applied to completing many different forms of image corruption or missing regions. Moreover, combining a deep autoencoder with an adversarial network often leads to an unstable training process, or to a model that learns only to map every input image to one particular output. In this thesis, we propose a lightweight conditional generative adversarial network, combined with a more stable adversarial training scheme, to handle a wide variety of image corruptions and restore them to more realistic, complete images. We also propose a new training strategy that encourages the deep model to learn representative image features so that it can repair many different kinds of corruption. Our experiments verify that the proposed deep model requires the fewest training parameters among the compared deep learning methods. Both quantitatively and visually, our method outperforms traditional and deep learning methods on various types of datasets. On the application side, we also demonstrate that our model can still complete images at different resolutions and with user-defined corruption masks.
Recent image completion research using deep neural networks has shown remarkable progress, particularly with generative adversarial networks (GANs). However, these approaches still suffer from large model sizes and a lack of generality across different types of corruption. In addition, conditional GANs often suffer from mode collapse and unstable training. In this thesis, we overcome these shortcomings of previous models by proposing a lightweight conditional GAN and integrating a stable adversarial training strategy. Moreover, we present a new training strategy that teaches the model to complete different types of corruptions or missing regions in images. Experimental results demonstrate, both qualitatively and quantitatively, that the proposed model provides significant improvement over state-of-the-art image completion methods on public datasets. In addition, we show that our model requires far fewer parameters to achieve superior results on different types of unseen corruption masks.
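As a rough illustration of the masking convention common to GAN-based image completion (a minimal sketch, not the thesis's actual model; the array shapes and helper names here are assumptions for the example), the corrupted input is formed by zeroing out the missing region, and the final result keeps the known pixels from the original while taking only the missing pixels from the generator's output:

```python
import numpy as np

def apply_corruption(image, mask):
    """Zero out the corrupted region indicated by mask (1 = missing pixel)."""
    return image * (1.0 - mask)

def blend_completion(original, generated, mask):
    """Keep known pixels from the original; fill missing pixels
    from the generator's output."""
    return original * (1.0 - mask) + generated * mask

# Toy example: a 4x4 grayscale "image" with a 2x2 user-defined missing block.
image = np.ones((4, 4))
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0               # corruption region

corrupted = apply_corruption(image, mask)
generated = np.full((4, 4), 0.5)   # stand-in for a generator's output

completed = blend_completion(image, generated, mask)
```

Because the blend only trusts the generator inside the mask, the same convention supports arbitrary, user-defined corruption shapes, which matches the generality the thesis emphasizes.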