
Graduate Student: Hsu, Po-Hao (許博皓)
Thesis Title: Attention-Guided Low-light Image Enhancement with Scene Text Restoration (具場景文本還原功能的注意力引導低光度影像增強)
Advisor: Lai, Shang-Hong (賴尚宏)
Committee Members: Chiu, Ching-Te (邱瀞德); Chu, Hung-Kuo (朱宏國); Huang, Ching-Chun (黃敬群)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of Publication: 2021
Graduation Academic Year: 109
Language: English
Number of Pages: 36
Chinese Keywords: 低光度影像增強, 場景文本
Foreign Keywords: Low-light Image Enhancement, Scene Text
Abstract:
Recently, several deep-learning-based methods have been proposed for solving the low-light image enhancement problem and have achieved impressive success. Although the quality of the enhanced images is generally improved, most existing models cannot recover the image details and colors of extremely low-light images well, especially in scene-text regions. In this paper, we propose a novel image enhancement model with a self-regularized attention mechanism and an edge-awareness module. To better recover unclear scene texts, we introduce a text detection loss that guides the model to restore text regions effectively. Quantitative and qualitative experimental results show that the proposed model outperforms state-of-the-art methods in image restoration, text detection, and recognition on the SID and ICDAR15 datasets.
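The abstract names a self-regularized attention mechanism but this record does not include its formulation. As a rough illustration only: EnlightenGAN [13], which the thesis cites, builds such a map by inverting the illumination (max-RGB) channel of the input so that darker pixels receive higher attention. The function name, normalization, and toy image below are illustrative assumptions, not the thesis's actual method.

```python
import numpy as np

def self_regularized_attention(rgb):
    """Compute an attention map from a low-light RGB image in [0, 1].

    Illustrative sketch of EnlightenGAN-style self-regularized attention:
    the map is the inverted per-pixel illumination (max over R, G, B),
    so darker regions get larger weights and the enhancement network
    is guided to focus on them. No learned parameters are involved.
    """
    illumination = rgb.max(axis=-1)   # per-pixel max over the color channels
    attention = 1.0 - illumination    # invert: dark pixel -> high attention
    return attention

# Toy 2x2 image: a bright pixel, a dim pixel, a gray pixel, a black pixel.
img = np.array([[[0.9, 0.8, 0.7], [0.1, 0.05, 0.0]],
                [[0.5, 0.5, 0.5], [0.0, 0.0, 0.0]]])
att = self_regularized_attention(img)
```

The black pixel receives the maximum attention weight of 1.0, while the bright pixel receives the smallest; in an attention-guided enhancement network, such a map would typically be multiplied element-wise with intermediate feature maps.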

Table of Contents:
1 Introduction
  1.1 Motivation
  1.2 Problem Statement
  1.3 Contributions
  1.4 Thesis Organization
2 Related Work
  2.1 Low-Light Image Enhancement
    2.1.1 Traditional Approaches
    2.1.2 Deep Learning-Based Approaches
  2.2 Edge Detection
    2.2.1 Traditional Approaches
    2.2.2 Deep Learning-Based Approaches
  2.3 Scene Text Detection
3 Method
  3.1 Self-regularized Attention
  3.2 Edge-awareness
  3.3 Text Detection Loss
  3.4 Loss Function
  3.5 Network Detail
4 Experiments
  4.1 Dataset
  4.2 Evaluation Metrics
  4.3 Implementation Details
  4.4 Experimental Results
    4.4.1 Quantitative Comparison
    4.4.2 Qualitative Comparison
  4.5 Ablations
  4.6 Failure Case Study
5 Conclusions
References

    [1] LibRaw. https://www.libraw.org/docs.
    [2] Baek, J., Kim, G., Lee, J., Park, S., Han, D., Yun, S., Oh, S. J., and Lee, H. What is wrong with scene text recognition model comparisons? Dataset and model analysis. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (October 2019).
    [3] Baek, Y., Lee, B., Han, D., Yun, S., and Lee, H. Character region awareness for text detection. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019), 9357–9366.
    [4] Bertasius, G., Shi, J., and Torresani, L. DeepEdge: A multi-scale bifurcated deep network for top-down contour detection. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015), 4380–4389.
    [5] Canny, J. A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-8 (1986), 679–698.
    [6] Chen, C., Chen, Q., Xu, J., and Koltun, V. Learning to see in the dark. 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (2018), 3291–3300.
    [7] Dollár, P., and Zitnick, C. L. Fast edge detection using structured forests. IEEE Transactions on Pattern Analysis and Machine Intelligence 37 (2015), 1558–1570.
    [8] Epshtein, B., Ofek, E., and Wexler, Y. Detecting text in natural scenes with stroke width transform. 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2010), 2963–2970.
    [9] Fu, X., Zeng, D., Huang, Y., Zhang, X., and Ding, X. A weighted variational model for simultaneous reflectance and illumination estimation. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016), 2782–2790.
    [10] Gharbi, M., Chen, J., Barron, J., Hasinoff, S. W., and Durand, F. Deep bilateral learning for real-time image enhancement. ACM Trans. Graph. 36 (2017), 118:1–118:12.
    [11] Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. C., and Bengio, Y. Generative adversarial nets. In NIPS (2014).
    [12] Guo, X., Li, Y., and Ling, H. LIME: Low-light image enhancement via illumination map estimation. IEEE Transactions on Image Processing 26 (2017), 982–993.
    [13] Jiang, Y., Gong, X., Liu, D., Cheng, Y., Fang, C., Shen, X., Yang, J., Zhou, P., and Wang, Z. EnlightenGAN: Deep light enhancement without paired supervision. ArXiv abs/1906.06972 (2019).
    [14] Jobson, D., Rahman, Z., and Woodell, G. A. A multiscale retinex for bridging the gap between color images and the human observation of scenes. IEEE Transactions on Image Processing 6, 7 (1997), 965–976.
    [15] Jobson, D. J., Rahman, Z.-u., and Woodell, G. A. Properties and performance of a center/surround retinex. IEEE Transactions on Image Processing 6, 3 (1997), 451–462.
    [16] Karatzas, D., Bigorda, L. G. I., Nicolaou, A., Ghosh, S., Bagdanov, A. D., Iwamura, M., Matas, J., Neumann, L., Chandrasekhar, V., Lu, S., Shafait, F., Uchida, S., and Valveny, E. ICDAR 2015 competition on robust reading. 2015 13th International Conference on Document Analysis and Recognition (ICDAR) (2015), 1156–1160.
    [17] Kingma, D. P., and Ba, J. Adam: A method for stochastic optimization. CoRR abs/1412.6980 (2015).
    [18] Land, E. H. The retinex theory of color vision. Scientific American 237, 6 (1977), 108–129.
    [19] Lee, C., Lee, C., and Kim, C.-S. Contrast enhancement based on layered difference representation of 2D histograms. IEEE Transactions on Image Processing 22, 12 (2013), 5372–5384.
    [20] Lee, J., Park, S., Baek, J., Oh, S. J., Kim, S., and Lee, H. On recognizing texts of arbitrary shapes with 2D self-attention. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (2020), 2326–2335.
    [21] Liao, M., Shi, B., Bai, X., Wang, X., and Liu, W. TextBoxes: A fast text detector with a single deep neural network. In AAAI (2017).
    [22] Liu, Y., Cheng, M.-M., Hu, X., Bian, J., Zhang, L., Bai, X., and Tang, J. Richer convolutional features for edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence 41 (2019), 1939–1946.
    [23] Liu, Y., Jin, L., Xie, Z., Luo, C., Zhang, S., and Xie, L. Tightness-aware evaluation protocol for scene text detection. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019), 9604–9612.
    [24] Lore, K. G., Akintayo, A., and Sarkar, S. LLNet: A deep autoencoder approach to natural low-light image enhancement. Pattern Recognit. 61 (2017), 650–662.
    [25] Matas, J., Chum, O., Urban, M., and Pajdla, T. Robust wide baseline stereo from maximally stable extremal regions. In BMVC (2002).
    [26] Pizer, S., Amburn, E. P., Austin, J. D., Cromartie, R., Geselowitz, A., Greer, T., Romeny, B. T. H., and Zimmerman, J. B. Adaptive histogram equalization and its variations. Computer Vision, Graphics, and Image Processing 39 (1987), 355–368.
    [27] Ronneberger, O., Fischer, P., and Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In MICCAI (2015).
    [28] Shen, W., Wang, X., Wang, Y., Bai, X., and Zhang, Z. DeepContour: A deep convolutional feature learned by positive-sharing loss for contour detection. 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015), 3982–3991.
    [29] Shi, B., Yang, M., Wang, X., Lyu, P., Yao, C., and Bai, X. ASTER: An attentional scene text recognizer with flexible rectification. IEEE Transactions on Pattern Analysis and Machine Intelligence 41, 9 (2019), 2035–2048.
    [30] Shi, Y., Wu, X., and Zhu, M. Low-light image enhancement algorithm based on retinex and generative adversarial network. ArXiv abs/1906.06027 (2019).
    [31] Tao, L., Zhu, C., Song, J., Lu, T., Jia, H., and Xie, X. Low-light image enhancement using CNN and bright channel prior. 2017 IEEE International Conference on Image Processing (ICIP) (2017), 3215–3219.
    [32] Tao, L., Zhu, C., Xiang, G., Li, Y., Jia, H., and Xie, X. LLCNN: A convolutional neural network for low-light image enhancement. 2017 IEEE Visual Communications and Image Processing (VCIP) (2017), 1–4.
    [33] Wang, S., Zheng, J., Hu, H., and Li, B. Naturalness preserved enhancement algorithm for non-uniform illumination images. IEEE Transactions on Image Processing 22 (2013), 3538–3548.
    [34] Wang, W., Xie, E., Song, X., Zang, Y., Wang, W., Lu, T., Yu, G., and Shen, C. Efficient and accurate arbitrary-shaped text detection with pixel aggregation network. In Proceedings of the IEEE International Conference on Computer Vision (2019), pp. 8440–8449.
    [35] Wang, Z., Simoncelli, E. P., and Bovik, A. Multi-scale structural similarity for image quality assessment. In Asilomar Conference on Signals, Systems and Computers (2003).
    [36] Wei, C., Wang, W., Yang, W., and Liu, J. Deep retinex decomposition for low-light enhancement. ArXiv abs/1808.04560 (2018).
    [37] Xie, S., and Tu, Z. Holistically-nested edge detection. International Journal of Computer Vision 125 (2015), 3–18.
    [38] Xing, L., Tian, Z., Huang, W., and Scott, M. R. Convolutional character networks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV) (2019).
    [39] Ying, Z., Li, G., and Gao, W. A bio-inspired multi-exposure fusion framework for low-light image enhancement. ArXiv abs/1711.00591 (2017).
    [40] Zhou, X., Yao, C., Wen, H., Wang, Y., Zhou, S., He, W., and Liang, J. EAST: An efficient and accurate scene text detector. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017), 2642–2651.
    [41] Zhu, J.-Y., Park, T., Isola, P., and Efros, A. A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Computer Vision (ICCV), 2017 IEEE International Conference on (2017).
    [42] Zhu, M., Pan, P., Chen, W., and Yang, Y. EEMEFN: Low-light image enhancement via edge-enhanced multi-exposure fusion network. In AAAI (2020), pp. 13106–13113.
    [43] Çelik, T., and Tjahjadi, T. Contextual and variational contrast enhancement. IEEE Transactions on Image Processing 20 (2011), 3431–3441.
