
Author: 黃冠諭 (Huang, Kuan-Yu)
Title: Pairwise Learning on Image Quality Assessment using Deep Siamese Neural Networks
Advisor: 林嘉文 (Lin, Chia-Wen)
Committee: 彭文孝 (Peng, Wen-Hsiao), 賴尚宏 (Lai, Shang-Hong), 劉宗榮 (Liu, Tsung-Jung)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of Publication: 2017
Graduation Academic Year: 106
Language: English
Pages: 43
Keywords: Subjective Quality Assessment, Siamese Network, Pairwise Learning, Image Retargeting
  • In image processing research, evaluating image quality has always been a crucial task. A well-designed image quality assessment method gives algorithm designers a common criterion for benchmarking performance: taking a higher assessment score as the shared goal, they can analyze the differences and relative strengths of competing methods and, in turn, devise corresponding improvement strategies.
    Assessment methods are designed specifically for the type of problem being evaluated. In the classic case of image super-resolution, for example, the peak signal-to-noise ratio (PSNR) measures the difference between the high-resolution image produced by an algorithm and the original image, expressing the distortion as a score. However, most assessment approaches rest on simple, concrete mathematical computations, so on more complex and abstract aesthetic evaluation problems their results often fail to match the actual human visual system. In recent years, many researchers have applied deep learning to subjective quality assessment: given large amounts of human-labeled evaluation data, neural network models can learn abstract assessment functions that are hard to define directly, and such approaches have indeed outperformed traditional quality metrics. Nevertheless, for problems where labeled data cannot be defined directly (e.g., image retargeting), obstacles in data collection make direct learning impossible.
    In this thesis, we propose a pairwise-learning approach to subjective image quality assessment. Built on a Siamese network architecture, our method allows a convolutional neural network (CNN) to be trained on simple pairwise annotations. This not only solves the subjective assessment problem for image retargeting, but also extends to other similar problems, and it eases the difficulty of collecting training data for mainstream subjective assessment databases. Our experiments show that on subjective assessment of image retargeting, the proposed approach outperforms previous work and generalizes better, while on general subjective assessment databases it approaches the performance of prior methods.
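The abstract above cites PSNR as the classic super-resolution metric. A minimal sketch of that standard computation (this is the textbook formula, not code from the thesis):

```python
import numpy as np

def psnr(reference, distorted, max_val=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE).
    Higher means the distorted image is closer to the reference."""
    err = reference.astype(np.float64) - distorted.astype(np.float64)
    mse = np.mean(err ** 2)
    if mse == 0:
        return float("inf")  # identical images: no distortion at all
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy 8-bit "images": a flat gray patch and a copy with a quarter of
# its pixels brightened by 16 levels.
ref = np.full((32, 32), 128, dtype=np.uint8)
dist = ref.copy()
dist[::2, ::2] += 16
score = psnr(ref, dist)  # MSE = 64, so roughly 30.07 dB
```

The abstract's point is precisely that a score like this, being a simple pixel-wise computation, cannot capture abstract perceptual judgments.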


    Image quality assessment is a very important component of image processing. A good image quality assessment method gives designers a criterion for comparing performance: every designer works toward increasing the score produced by the method, and the criterion helps them propose suitable strategies for improving their algorithms.
    Image quality assessment methods are designed specifically according to their task goals. However, many of them rely on simple, hand-tuned computations and are ill-suited to complicated assessment problems that demand closer agreement with the human visual system. Recently, many subjective quality assessment methods based on deep neural networks have been proposed; they require large amounts of labeled data and have achieved significant improvements. Nevertheless, in some cases (such as image retargeting) subjective quality scores are difficult to label directly, and such problems cannot be learned by existing methods because of the obstacle of collecting training data.
    In this thesis, we propose a pairwise training model for subjective image quality assessment. Our architecture is based on a Siamese network, which allows the model to be trained on paired-comparison data. This architecture lets us solve the image retargeting problem and extends easily to similar tasks; on general subjective quality assessment databases, our method also alleviates the difficulty of data collection. Experiments show that the proposed method produces better results than previous work on retargeted images, and comparable results on general quality assessment databases, even under limited labeling information.
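Both abstracts describe the same core idea: two images pass through one shared-weight branch, and training needs only "which of the pair is better" labels. A minimal numpy sketch of that pairwise objective, using random feature vectors and a linear scorer as placeholder stand-ins for the thesis's CNN branch (everything below is an illustrative assumption, not the thesis implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8                         # placeholder feature dimension
w_true = rng.normal(size=DIM)   # hidden "true quality" direction
w = np.zeros(DIM)               # learned shared-branch weights

# Pairwise annotations: label 1.0 means image A was judged better than B.
pairs = []
for _ in range(500):
    fa, fb = rng.normal(size=DIM), rng.normal(size=DIM)
    pairs.append((fa, fb, 1.0 if fa @ w_true > fb @ w_true else 0.0))

# Both images of a pair go through the SAME weights (the Siamese
# property), and the loss is cross-entropy on
#   P(A better than B) = sigmoid(score_A - score_B),
# whose gradient w.r.t. w is (p - label) * (f_a - f_b).
lr = 0.1
for _ in range(30):
    for fa, fb, y in pairs:
        p = 1.0 / (1.0 + np.exp(-((fa - fb) @ w)))
        w -= lr * (p - y) * (fa - fb)

# After training on comparisons alone, the shared scorer ranks pairs
# the way the annotations do.
accuracy = np.mean([((fa - fb) @ w > 0) == (y == 1.0) for fa, fb, y in pairs])
```

The key property is that a single absolute scoring function emerges even though supervision arrives only as relative comparisons, which is what makes weakly annotated data (as in image retargeting) usable at all.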

    Contents:
    Abstract (Chinese)
    Abstract
    Content
    Chapter 1 Introduction
    Chapter 2 Related Work
      2.1 RIQA
      2.2 CNN-based GIQA
    Chapter 3 Proposed Method
      3.1 Overview
      3.2 Initialization
      3.3 Network Architecture
      3.4 Optimization
    Chapter 4 Experiments and Discussions
      4.1 Benchmark Databases
      4.2 Performance Evaluations
      4.3 Predicted Quality Score Maps
      4.4 Limitations
    Chapter 5 Conclusion
    References

