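An Implementation of Deep Learning Model for Image Compression | National Tsing Hua University Theses and Dissertations Repository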


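Graduate student:	Chen, Rou-Jing (陳若菁)
Thesis title:	An Implementation of Deep Learning Model for Image Compression (實作深度學習模型進行影像壓縮)
Advisor:	Chen, Chaur-Chin (陳朝欽)
Committee members:	Chang, Long-Wen (張隆紋); Huang, Chung-Lin (黃仲陵)
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication:	2019
Academic year of graduation:	107 (ROC calendar)
Language:	English
Number of pages:	26
Keywords (Chinese):	深度學習 (deep learning), 影像壓縮 (image compression)
Keywords (English):	Deep Learning, Image Compression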

Deep learning techniques have brought great progress to computer vision and digital image processing, and learning-based image compression has recently attracted considerable attention. In this thesis, we first introduce two neural networks commonly applied to image compression: auto-encoders and recurrent neural networks (RNNs). We focus on auto-encoders and follow [Ment2018] to implement our auto-encoder-based compression network. When training the network with multi-scale structural similarity (MS-SSIM), we encountered problems in restoring the colors of the decoded images. To overcome this, we propose a loss function that mixes MS-SSIM and mean square error (MSE) to train the auto-encoder. The results show that our method alleviates the problem and improves the quality of the reconstructed images.
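
As a rough illustration of the mixed loss mentioned in the abstract, one common way to combine a similarity score with a pixel-wise error is a weighted sum. The form below is only a sketch under assumptions: the abstract does not state how the two terms are combined, and the weight $\alpha$ and the symbol $\mathcal{L}_{\mathrm{mix}}$ are hypothetical names introduced here, not taken from the thesis.

```latex
% Hypothetical form of a mixed MS-SSIM / MSE loss (assumed, not from the thesis).
% x is the original image, \hat{x} the reconstruction, N the number of pixel values,
% and \alpha an assumed mixing weight.
\mathcal{L}_{\mathrm{mix}}(x, \hat{x})
  = \alpha \,\bigl(1 - \operatorname{MS\text{-}SSIM}(x, \hat{x})\bigr)
  + (1 - \alpha)\, \operatorname{MSE}(x, \hat{x}),
\qquad
\operatorname{MSE}(x, \hat{x}) = \frac{1}{N} \sum_{i=1}^{N} \bigl(x_i - \hat{x}_i\bigr)^2,
\qquad 0 \le \alpha \le 1 .
```

In this sketch, $\alpha = 1$ recovers a pure MS-SSIM objective and $\alpha = 0$ a pure MSE objective; the MSE term penalizes per-pixel intensity differences directly.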

Chapter 1 Introduction
Chapter 2 Deep Neural Networks for Image Compression
2.1 Auto-encoders
2.2 Recurrent Neural Networks
Chapter 3 Two Challenges of Implementing Auto-encoders for Image Compression
3.1 Quantization
3.2 Entropy estimation
Chapter 4 Model and Methodology
4.1 Architecture of Model
4.2 Importance map
4.3 Distortion Measure
Chapter 5 Proposed Methods
5.1 The problem of using MS-SSIM for loss function
5.2 Mix loss function
Chapter 6 Experiments
6.1 Training
6.2 Other codec
6.3 Results
Chapter 7 Conclusion
References

                                

[Ment2018] F. Mentzer, E. Agustsson, M. Tschannen, R. Timofte and L. V. Gool, "Conditional probability models for deep image compression", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018
[Thei2017] L. Theis, W. Shi, A. Cunningham and F. Huszár, "Lossy image compression with compressive autoencoders", International Conference on Learning Representations, 2017
[Tode2017] G. Toderici, D. Vincent, N. Johnston, S. J. Hwang, D. Minnen, J. Shor and M. Covell, "Full resolution image compression with recurrent neural networks", CoRR, vol. abs/1608.05148, 2016
[Li2017] M. Li, W. Zuo, S. Gu, D. Zhao and D. Zhang, "Learning Convolutional Networks for Content-weighted Image Compression", arXiv preprint arXiv:1703.10553, 2017
[Wall1991] G. K. Wallace, "The JPEG still picture compression standard", Communications of the ACM, 34(4):30–44, 1991
[Skod2001] A. Skodras, C. Christopoulos and T. Ebrahimi, "The JPEG 2000 still image compression standard", Signal Processing Magazine, 18(5):36–58, 2001
[Ball2017] J. Ballé, V. Laparra and E. P. Simoncelli, "End-to-end optimized image compression", International Conference on Learning Representations, 2017
[Tode2016] G. Toderici, S. M. O'Malley, S. J. Hwang, D. Vincent, D. Minnen, S. Baluja, M. Covell and R. Sukthankar, "Variable rate image compression with recurrent neural networks", International Conference on Learning Representations, 2016
[Oord2016] A. V. D. Oord, N. Kalchbrenner, L. Espeholt, O. Vinyals, A. Graves and K. Kavukcuoglu, "Conditional image generation with PixelCNN decoders", Advances in Neural Information Processing Systems, 2016
[Wang2004] Z. Wang, A. C. Bovik, H. R. Sheikh and E. P. Simoncelli, "Image Quality Assessment: From Error Visibility to Structural Similarity", IEEE Trans. Image Processing, vol. 13, Jan. 2004
[Wang2003] Z. Wang, E. P. Simoncelli and A. C. Bovik, "Multiscale structural similarity for image quality assessment", Asilomar Conference on Signals, Systems and Computers, vol. 2, pp. 1398–1402, Nov. 2003
[Hoch1997] S. Hochreiter and J. Schmidhuber, "Long short-term memory", Neural Computation, vol. 9, no. 8, pp. 1735–1780, Nov. 1997
