
Author: 曹立元 (Tsao, Li-Yuan)
Thesis Title: Enhancing Arbitrary-Scale Image Super-Resolution via Local Implicit Normalizing Flow and Learned Prior
(透過局部隱式高斯化流與習得先驗增強任意倍率圖像超解析度模型)
Advisors: 陳煥宗 (Chen, Hwann-Tzong); 李濬屹 (Lee, Chun-Yi)
Committee Members: 孫民 (Sun, Min); 楊元福 (Yang, Yuan-Fu)
Degree: Master
Department: Department of Computer Science, College of Electrical Engineering and Computer Science
Year of Publication: 2025
Graduation Academic Year: 113 (ROC calendar)
Language: English
Number of Pages: 55
Chinese Keywords: 超解析度、高斯化流、習得先驗、任意倍率超解析度
Foreign Keywords: Image Super-Resolution, Normalizing Flow, Learned Prior, Arbitrary-Scale Super-Resolution
Abstract (Chinese):
    Flow-based super-resolution methods can generate high-quality images by learning the probability distribution of high-resolution images, effectively handling the inherent ill-posedness of the image super-resolution problem. However, such methods are typically limited to fixed-scale super-resolution, which restricts their flexibility in real-world applications. In recent years, arbitrary-scale super-resolution has attracted growing attention and made significant progress. Nevertheless, most existing arbitrary-scale methods overlook the ill-posed nature of the task and are trained solely with a pixel-wise L1 loss, which leads to blurry results. In addition, flow-based super-resolution models still face several challenges at inference time, such as grid artifacts, exploding inverses, and suboptimal results caused by a fixed sampling temperature. This thesis aims to resolve these inherent issues of flow-based models and to improve their ability to generate high-quality images at arbitrary upscaling factors.

    In the first part of this thesis, we propose Local Implicit Normalizing Flow (LINF), which combines normalizing flow with a local implicit neural representation to model the probability distribution of local texture patches. At inference time, LINF generates each patch independently and assembles them into a complete image. By leveraging the strong generative capability of normalizing flows, LINF can produce high-quality images at arbitrary resolutions while addressing the ill-posed nature of super-resolution.

    In the second part of this thesis, we further investigate the inference-time issues of the aforementioned models and propose BFSR, a learned-prior-based design that can be directly integrated into existing flow-based super-resolution models. BFSR replaces random sampling by predicting a conditional latent representation from the low-resolution input, thereby improving inference stability and perceptual quality without modifying the model architecture or pretrained weights.

    Both frameworks proposed in this thesis are validated on multiple super-resolution benchmarks, and the results show that they substantially improve the performance, flexibility, and inference stability of flow-based models.


Abstract (English):
    Flow-based super-resolution (SR) methods generate high-quality images by learning the distribution of high-resolution (HR) images, effectively tackling the ill-posed nature of SR. However, these methods typically support only fixed-scale upsampling, limiting their flexibility in real-world scenarios. Recently, arbitrary-scale SR has gained increasing attention and achieved notable progress. Nevertheless, most existing arbitrary-scale methods neglect the ill-posedness of SR and rely solely on a pixel-wise L1 loss, resulting in blurry outputs. In addition, flow-based SR models still encounter inference-time challenges such as grid artifacts, exploding inverses, and suboptimal results caused by a fixed sampling temperature. This thesis aims to address these inherent issues and enhance the capability of flow-based SR models to generate high-quality images at arbitrary scales.
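
    As a sketch of the idea behind this family of models, written in generic notation (f_theta, x, y, z are illustrative symbols, not necessarily those used in the thesis), a conditional normalizing flow maps an HR image to a latent code with a simple base density, so the exact likelihood follows from the change-of-variables formula:

```latex
% Generic conditional-flow formulation (illustrative notation, not necessarily
% the thesis's exact symbols): f_\theta is invertible in y for a fixed LR
% input x, and p_Z is a simple base density such as a standard Gaussian.
\[
  p_{Y \mid X}(y \mid x)
    = p_Z\bigl(f_\theta(y; x)\bigr)
      \left| \det \frac{\partial f_\theta(y; x)}{\partial y} \right|,
  \qquad
  \mathcal{L}(\theta)
    = -\log p_Z\bigl(f_\theta(y; x)\bigr)
      - \log \left| \det \frac{\partial f_\theta(y; x)}{\partial y} \right|.
\]
% At inference, one draws z \sim \mathcal{N}(0, \tau^2 I) and inverts the flow,
% y = f_\theta^{-1}(z; x); the temperature \tau trades fidelity for diversity,
% which is why a single fixed \tau can be suboptimal across images and scales.
```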

    In the first part of this thesis, we propose Local Implicit Normalizing Flow
    (LINF), which combines normalizing flow with a local implicit neural representation to model the probability distribution of local texture patches. At inference, LINF generates image patches independently and assembles them into an image. Leveraging the generative capability of normalizing flows, LINF produces high-quality images at arbitrary resolutions while handling the ill-posed nature of SR.
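
    The sketch below illustrates the patch-wise generation procedure described above under stated assumptions: the encoder, the coordinate-conditioned interface flow.inverse(z, cond), and all tensor shapes are hypothetical placeholders, not the released LINF implementation.

```python
# Hypothetical sketch of patch-wise arbitrary-scale generation in the spirit of
# LINF. The `encoder` and `flow` interfaces below are illustrative placeholders.
import torch
import torch.nn.functional as F

def sample_sr_image(encoder, flow, lr_img, scale, patch=3, tau=0.0):
    """Generate an HR image patch by patch with a coordinate-conditioned flow.

    encoder : maps the LR image (1, 3, h, w) to a feature map (1, C, h, w).
    flow    : conditional flow; flow.inverse(z, cond) returns flattened RGB
              patches of shape (N, 3 * patch * patch).
    tau     : sampling temperature; tau = 0 yields the deterministic sample.
    """
    feat = encoder(lr_img)                                    # (1, C, h, w)
    H = int(lr_img.shape[-2] * scale) // patch * patch        # crop to a
    W = int(lr_img.shape[-1] * scale) // patch * patch        # patch multiple

    # One query coordinate per output patch, in normalized [-1, 1] space.
    ys = torch.linspace(-1, 1, H // patch)
    xs = torch.linspace(-1, 1, W // patch)
    gy, gx = torch.meshgrid(ys, xs, indexing="ij")
    coords = torch.stack([gx, gy], dim=-1).view(-1, 2)        # (N, 2), (x, y)

    # Condition each patch on locally interpolated LR features.
    cond = F.grid_sample(feat, coords.view(1, 1, -1, 2),
                         align_corners=False)                 # (1, C, 1, N)
    cond = cond.squeeze(0).squeeze(1).t()                     # (N, C)

    # Independent latent per patch; invert the flow to obtain texture patches.
    z = tau * torch.randn(coords.shape[0], 3 * patch * patch)
    patches = flow.inverse(z, cond)                           # (N, 3*p*p)

    # Assemble the non-overlapping patches into the full HR image.
    hr = patches.view(H // patch, W // patch, 3, patch, patch)
    hr = hr.permute(2, 0, 3, 1, 4).reshape(3, H, W)
    return hr
```

    Generating each patch from its own latent code conditioned on a query coordinate is what lets the output resolution be chosen freely at inference time: the coordinate grid, not the network architecture, determines the scale.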

    In the second part of this thesis, we further investigate the inherent inference-time issues of
    flow-based SR models and propose BFSR, a learned-prior-based framework that can be seamlessly integrated with contemporary flow-based SR models. BFSR improves inference stability and perceptual quality by predicting conditional latent codes from the low-resolution input in place of random sampling, without modifying model architectures or pretrained weights.
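
    A minimal sketch of this learned-prior idea, assuming a generic frozen flow with an inverse(z, cond) interface; the LatentPredictor module below is an illustrative placeholder, not the actual BFSR architecture:

```python
# Illustrative sketch of replacing temperature sampling with a learned prior.
# `LatentPredictor` and the `flow.inverse(z, cond)` interface are assumptions
# made for illustration, not the BFSR modules described in the thesis.
import torch
import torch.nn as nn

class LatentPredictor(nn.Module):
    """Predicts a conditional latent code from LR-derived features."""

    def __init__(self, feat_dim: int, latent_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, cond_feat: torch.Tensor) -> torch.Tensor:
        return self.net(cond_feat)

@torch.no_grad()
def infer_with_learned_prior(flow, predictor, cond_feat):
    """The pretrained flow stays frozen; only the source of z changes:
    a predicted code replaces the random draw z ~ N(0, tau^2 I)."""
    z = predictor(cond_feat)             # learned prior instead of random z
    return flow.inverse(z, cond_feat)    # same inverse pass as before
```

    Because only the source of the latent code changes, such a predictor can be trained separately and attached to an existing flow-based SR model without retraining the flow itself.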

    This thesis presents two frameworks that enhance the flexibility and robustness
    of flow-based SR models. Extensive experiments on multiple SR benchmarks validate both frameworks, showing their ability to produce high-quality images with improved perceptual quality, flexibility, and inference stability.
