
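Graduate student: 陳若菁 (Chen, Rou-Jing)
Thesis title: 實作深度學習模型進行影像壓縮
An Implementation of Deep Learning Model for Image Compression
Advisor: 陳朝欽 (Chen, Chaur-Chin)
Committee members: 張隆紋 (Chang, Long-Wen)
黃仲陵 (Huang, Chung-Lin)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: English
Number of pages: 26
Chinese keywords: 深度學習, 影像壓縮
English keywords: Deep Learning, Image Compression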
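  • Deep learning has brought great progress to computer vision and image processing, and learning-based image compression has recently attracted considerable attention. In this thesis, we first introduce two neural networks commonly applied to image compression: the auto-encoder and the recurrent neural network (RNN). We focus on the auto-encoder approach and follow [Ment2018] to implement our auto-encoder-based image compression network. When training the network with multi-scale structural similarity (MS-SSIM), we encountered problems restoring image colors during decoding. To address this, we propose training the model with a loss function that mixes MS-SSIM and mean square error (MSE). The results show that this approach effectively alleviates the problem and improves the quality of the reconstructed images.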


    Deep learning techniques have made great progress in computer vision and digital image processing, and learning-based image compression has recently attracted attention. In this thesis, we first introduce two common neural networks that can be applied to image compression: auto-encoders and recurrent neural networks (RNNs). We focus on auto-encoders and follow [Ment2018] to construct our compression system. While training the auto-encoder with multi-scale structural similarity (MS-SSIM), we encountered problems restoring the colors of images. To overcome this, we propose a loss function that mixes MS-SSIM and mean square error (MSE) to train the auto-encoder. The results show that our method alleviates the problem and improves the quality of the reconstructed images.
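    The idea of mixing a structural-similarity term with MSE can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: this record does not say how the two terms are combined, so the convex weight `alpha` is hypothetical, and a single-scale SSIM computed from whole-image statistics (`global_ssim`) stands in for MS-SSIM to keep the example dependency-free. Images are flat sequences of pixel values in [0, 1].

```python
def mse(x, y):
    """Mean square error between two equal-length pixel sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def global_ssim(x, y, data_range=1.0):
    """Single-scale SSIM from whole-image statistics.

    Assumption: a simplified stand-in for MS-SSIM (no windows, no scales).
    """
    c1 = (0.01 * data_range) ** 2  # stabilizer for the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizer for the contrast/structure term
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def mixed_loss(x, y, alpha=0.5, similarity=global_ssim):
    """Hypothetical mix: alpha * (1 - similarity) + (1 - alpha) * MSE."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * (1.0 - similarity(x, y)) + (1.0 - alpha) * mse(x, y)


if __name__ == "__main__":
    original = [0.1, 0.5, 0.9, 0.3]
    shifted = [v + 0.2 for v in original]  # uniform intensity offset
    print(mixed_loss(original, shifted, alpha=0.5))
```

    Setting `alpha=1` recovers a pure similarity loss and `alpha=0` a pure MSE loss; any real MS-SSIM implementation could be passed in through the `similarity` parameter.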

    Chapter 1 Introduction
    Chapter 2 Deep Neural Networks for Image Compression
      2.1 Auto-encoders
      2.2 Recurrent Neural Networks
    Chapter 3 Two Challenges of Implementing Auto-encoders for Image Compression
      3.1 Quantization
      3.2 Entropy estimation
    Chapter 4 Model and Methodology
      4.1 Architecture of Model
      4.2 Importance map
      4.3 Distortion Measure
    Chapter 5 Proposed Methods
      5.1 The problem of using MS-SSIM for loss function
      5.2 Mix loss function
    Chapter 6 Experiments
      6.1 Training
      6.2 Other codec
      6.3 Results
    Chapter 7 Conclusion
    References

    [Ment2018] F. Mentzer, E. Agustsson, M. Tschannen, R. Timofte and L. V. Gool, "Conditional probability models for deep image compression", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018
    [Thei2017] L. Theis, W. Shi, A. Cunningham and F. Huszár, "Lossy image compression with compressive autoencoders", International Conference on Learning Representations, 2017
    [Tode2017] G. Toderici, D. Vincent, N. Johnston, S. J. Hwang, D. Minnen, J. Shor and M. Covell, "Full resolution image compression with recurrent neural networks", CoRR, vol. abs/1608.05148, 2016
    [Li2017] M. Li, W. Zuo, S. Gu, D. Zhao and D. Zhang, "Learning Convolutional Networks for Content-weighted Image Compression", arXiv preprint arXiv:1703.10553, 2017
    [Wall1991] G. K. Wallace, "The JPEG still picture compression standard", Communications of the ACM, 34(4): 30–44, 1991
    [Skod2001] A. Skodras, C. Christopoulos and T. Ebrahimi, "The JPEG 2000 still image compression standard", Signal Processing Magazine, 18(5): 36–58, 2001
    [Ball2017] J. Ballé, V. Laparra and E. P. Simoncelli, "End-to-end optimized image compression", International Conference on Learning Representations, 2017
    [Tode2016] G. Toderici, S. M. O’Malley, S. J. Hwang, D. Vincent, D. Minnen, S. Baluja, M. Covell and R. Sukthankar, "Variable rate image compression with recurrent neural networks", International Conference on Learning Representations, 2016
    [Oord2016] A. V. D. Oord, N. Kalchbrenner, L. Espeholt, O. Vinyals, A. Graves and K. Kavukcuoglu, "Conditional image generation with PixelCNN decoders", Advances in Neural Information Processing Systems, 2016
    [Wang2004] Z. Wang, A. C. Bovik, H. R. Sheikh and E. P. Simoncelli, "Image Quality Assessment: From Error Visibility to Structural Similarity", IEEE Trans. Image Processing, vol. 13, Jan. 2004
    [Wang2003] Z. Wang, E. P. Simoncelli and A. C. Bovik, "Multiscale structural similarity for image quality assessment", Asilomar Conference on Signals, Systems and Computers, vol. 2, pp. 1398–1402, Nov. 2003
    [Hoch1997] S. Hochreiter and J. Schmidhuber, "Long short-term memory", Neural Computation, vol. 9, no. 8, 1735–1780, Nov. 1997
