
Graduate Student: Han, Lei (韓磊)
Thesis Title: Multi-scale GAN for Video Deblurring (多尺度型生成對抗網路於視頻解模糊)
Advisor: Lai, Shang-Hong (賴尚宏)
Committee Members: Chiu, Ching-Te (邱瀞德); Hsu, Chiu-Ting (許秋婷)
Degree: Master
Department:
Year of Publication: 2018
Academic Year of Graduation: 107
Language: English
Number of Pages: 37
Chinese Keywords: generative adversarial network, deblurring
Foreign Keywords: GAN, deblur
    With the rapid development of deep learning, more and more vision problems are being solved by deep models, and several deep models have already been proposed for the task of video deblurring. Meanwhile, generative adversarial networks have been widely applied in many scenarios because of their expressive power. In this thesis, we combine the classical multi-scale structure of traditional methods with a generative adversarial network to solve the video deblurring problem. The experimental results show that our model is more robust than several other deep-model algorithms, both in quantitative metrics and in visual quality. Finally, we also analyze the advantages of our model from different perspectives.


    With the rapid development of deep learning, more and more vision problems can be solved with deep neural network models, and several deep models have been proposed to handle the video deblurring task. At the same time, generative adversarial networks (GANs) have been widely applied to many kinds of problems because of their strength. In this thesis, we combine the classical multi-scale structure of traditional vision methods with a GAN for video deblurring. The quantitative and qualitative results of our experiments demonstrate that the proposed model restores frames more robustly than state-of-the-art deep video deblurring methods. We also justify the superiority of our model from different perspectives.
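The coarse-to-fine multi-scale structure the abstract refers to can be sketched independently of any particular network: the blurry frame is first restored at the coarsest scale, and each finer scale then receives the upsampled coarser estimate as guidance. The sketch below uses NumPy with a placeholder `restore` step standing in for the per-scale generator; all function names are illustrative assumptions, not the thesis's actual code.

```python
import numpy as np

def downsample(img, factor):
    # Naive box-average downsampling by an integer factor (illustrative only).
    h, w = img.shape
    return img[:h - h % factor, :w - w % factor] \
        .reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def upsample(img, factor):
    # Nearest-neighbour upsampling by an integer factor.
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

def restore(blurry, guidance):
    # Placeholder for the per-scale generator: a real model would map
    # (blurry frame, upsampled coarser estimate) -> sharper estimate.
    return 0.5 * (blurry + guidance)

def multiscale_deblur(frame, num_scales=3):
    # Build an image pyramid of the blurry input.
    pyramid = [frame]
    for _ in range(num_scales - 1):
        pyramid.append(downsample(pyramid[-1], 2))
    # Start from the coarsest scale and refine upward (coarse-to-fine).
    estimate = pyramid[-1]
    for level in reversed(range(num_scales - 1)):
        guidance = upsample(estimate, 2)
        estimate = restore(pyramid[level], guidance)
    return estimate

blurry = np.random.rand(64, 64)
sharp_estimate = multiscale_deblur(blurry)
print(sharp_estimate.shape)  # (64, 64)
```

In the thesis's setting the `restore` step at each scale is a learned generator trained adversarially against a discriminator; this sketch only shows the multi-scale control flow around it.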

    1 Introduction
    1.1 Motivation
    1.2 Problem Description
    1.3 Main Contributions
    2 Related Works
    2.1 Image/Video Deblurring
    2.2 Generative Adversarial Networks
    3 Proposed Model
    3.1 Network Architecture
    3.1.1 Generator
    3.1.2 Discriminator
    3.2 Loss Function
    3.3 Training Details
    4 Experiments
    4.1 Datasets
    4.2 Baselines
    4.3 Comparison
    4.4 Ablation Study
    4.4.1 Loss Function and Hyperparameters
    4.4.2 Network Structure
    5 Discussion
    References

