
Author: 黃冠諭 (Huang, Kuan-Yu)
Title: Pairwise Learning on Image Quality Assessment using Deep Siamese Neural Networks
Advisor: 林嘉文 (Lin, Chia-Wen)
Committee: 彭文孝 (Peng, Wen-Hsiao), 賴尚宏 (Lai, Shang-Hong), 劉宗榮 (Liu, Tsung-Jung)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of Publication: 2017
Graduation Academic Year: 106
Language: English
Pages: 43
Keywords: Subjective Quality Assessment, Siamese Network, Pairwise Learning, Image Retargeting
  • In image processing research, evaluating image quality has always been a crucial task. A well-designed image quality assessment method gives algorithm designers a common criterion for benchmarking performance: taking a higher assessment score as the shared goal, they can analyze the differences and relative strengths of competing methods and, in turn, devise corresponding improvement strategies.
    Assessment methods are designed specifically for the type of problem being evaluated. In the classic case of image super-resolution, for example, the peak signal-to-noise ratio (PSNR) measures the difference between the high-resolution image produced by an algorithm and the original image, expressing the distortion as a score. However, most assessment approaches rest on simple, concrete mathematical computations, so on more complex and abstract aesthetic evaluation problems their results often fail to match the actual human visual system. In recent years, many researchers have applied deep learning to subjective quality assessment: given large amounts of human-labeled evaluation data, neural network models can learn abstract assessment functions that are hard to define directly, and such approaches have indeed outperformed traditional quality metrics. Nevertheless, for problems where labeled data cannot be defined directly (e.g., image retargeting), obstacles in data collection make direct learning impossible.
    In this thesis, we propose a pairwise-learning approach to subjective image quality assessment. Built on a Siamese network architecture, our method allows a convolutional neural network (CNN) to be trained on simple pairwise annotations. This not only solves the subjective assessment problem for image retargeting, but also extends to other similar problems, and it eases the difficulty of collecting training data for mainstream subjective assessment databases. Our experiments show that on subjective assessment of image retargeting, the proposed approach outperforms previous work and generalizes better, while on general subjective assessment databases it approaches the performance of prior methods.
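The abstract above cites PSNR as the classic super-resolution metric. A minimal sketch of that standard computation (this is the textbook formula, not code from the thesis):

```python
import numpy as np

def psnr(reference, distorted, max_val=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE).
    Higher means the distorted image is closer to the reference."""
    err = reference.astype(np.float64) - distorted.astype(np.float64)
    mse = np.mean(err ** 2)
    if mse == 0:
        return float("inf")  # identical images: no distortion at all
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy 8-bit "images": a flat gray patch and a copy with a quarter of
# its pixels brightened by 16 levels.
ref = np.full((32, 32), 128, dtype=np.uint8)
dist = ref.copy()
dist[::2, ::2] += 16
score = psnr(ref, dist)  # MSE = 64, so roughly 30.07 dB
```

The abstract's point is precisely that a score like this, being a simple pixel-wise computation, cannot capture abstract perceptual judgments.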


    Image quality assessment is a very important component of image processing. A good image quality assessment method gives designers a criterion for comparing performance: every designer works toward increasing the score produced by the method, and the criterion helps them propose suitable strategies for improving their algorithms.
    Image quality assessment methods are designed specifically according to their task goals. However, many of them rely on simple, hand-tuned computations and are ill-suited to complicated assessment problems that demand closer agreement with the human visual system. Recently, many subjective quality assessment methods based on deep neural networks have been proposed; they require large amounts of labeled data and have achieved significant improvements. Nevertheless, in some cases (such as image retargeting) subjective quality scores are difficult to label directly, and such problems cannot be learned by existing methods because of the obstacle of collecting training data.
    In this thesis, we propose a pairwise training model for subjective image quality assessment. Our architecture is based on a Siamese network, which allows the model to be trained on paired-comparison data. This architecture lets us solve the image retargeting problem and extends easily to similar tasks; on general subjective quality assessment databases, our method also alleviates the difficulty of data collection. Experiments show that the proposed method produces better results than previous work on retargeted images, and comparable results on general quality assessment databases, even under limited labeling information.
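Both abstracts describe the same core idea: two images pass through one shared-weight branch, and training needs only "which of the pair is better" labels. A minimal numpy sketch of that pairwise objective, using random feature vectors and a linear scorer as placeholder stand-ins for the thesis's CNN branch (everything below is an illustrative assumption, not the thesis implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8                         # placeholder feature dimension
w_true = rng.normal(size=DIM)   # hidden "true quality" direction
w = np.zeros(DIM)               # learned shared-branch weights

# Pairwise annotations: label 1.0 means image A was judged better than B.
pairs = []
for _ in range(500):
    fa, fb = rng.normal(size=DIM), rng.normal(size=DIM)
    pairs.append((fa, fb, 1.0 if fa @ w_true > fb @ w_true else 0.0))

# Both images of a pair go through the SAME weights (the Siamese
# property), and the loss is cross-entropy on
#   P(A better than B) = sigmoid(score_A - score_B),
# whose gradient w.r.t. w is (p - label) * (f_a - f_b).
lr = 0.1
for _ in range(30):
    for fa, fb, y in pairs:
        p = 1.0 / (1.0 + np.exp(-((fa - fb) @ w)))
        w -= lr * (p - y) * (fa - fb)

# After training on comparisons alone, the shared scorer ranks pairs
# the way the annotations do.
accuracy = np.mean([((fa - fb) @ w > 0) == (y == 1.0) for fa, fb, y in pairs])
```

The key property is that a single absolute scoring function emerges even though supervision arrives only as relative comparisons, which is what makes weakly annotated data (as in image retargeting) usable at all.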

    Contents:
    Abstract (Chinese)
    Abstract
    Content
    Chapter 1 Introduction
    Chapter 2 Related Work
      2.1 RIQA
      2.2 CNN-based GIQA
    Chapter 3 Proposed Method
      3.1 Overview
      3.2 Initialization
      3.3 Network Architecture
      3.4 Optimization
    Chapter 4 Experiments and Discussions
      4.1 Benchmark Databases
      4.2 Performance Evaluations
      4.3 Predicted Quality Score Maps
      4.4 Limitations
    Chapter 5 Conclusion
    References

