
Graduate Student: 趙貞豪 (Chao, Chen-Hao)
Thesis Title: 基於降噪概似計分匹配之條件式資料生成
Denoising Likelihood Score Matching for Conditional Score-based Data Generation
Advisor: 李濬屹 (Lee, Chun-Yi)
Committee Members: 陳煥宗 (Chen, Hwann-Tzong), 劉育綸 (Liu, Yu-Lun)
Degree: Master
Department: Department of Computer Science, College of Electrical Engineering and Computer Science
Year of Publication: 2024
Graduation Academic Year: 112
Language: English
Number of Pages: 33
Chinese Keywords: Data Generation, Denoising Score Matching, Conditional Data Generation
Foreign Keywords: Data Generation, Denoising Score Matching, Conditional Score-based Models
Abstract (Chinese):
    Score-based generative modeling is currently one of the generative techniques with the best sample quality. In such models, data are generated by following the gradient of the log probability density function. This thesis examines the difficulties that arise when a conditional generation framework is introduced into score-based generative models, and how to resolve them.
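    As a point of reference for the sampling principle just described (a standard formulation from the score-based modeling literature; the notation is ours, not the thesis's), annealed Langevin dynamics generates samples by repeatedly stepping along the estimated score:

    \[
    \mathbf{x}_{k+1} = \mathbf{x}_k + \frac{\epsilon}{2}\,\nabla_{\mathbf{x}}\log p(\mathbf{x}_k) + \sqrt{\epsilon}\,\mathbf{z}_k,
    \qquad \mathbf{z}_k \sim \mathcal{N}(\mathbf{0}, \mathbf{I}),
    \]

    where the score \(\nabla_{\mathbf{x}}\log p(\mathbf{x})\) is approximated by a learned score model \(s_\theta(\mathbf{x})\) and \(\epsilon\) is a small step size.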
    Many existing conditional score-based generative models use Bayes' theorem to decompose the gradient of the log posterior probability density function into a score function and a likelihood score function. These methods facilitate the training of conditional score-based models, because the gradient of the log posterior density can be estimated as the sum of the outputs of a score model and a classifier. However, the analysis in this thesis shows that the classifier training objectives used in these methods may lead to a serious score mismatch issue, in which the estimated likelihood score deviates from the true likelihood score. This causes samples to be misled by the erroneous scores during diffusion sampling, degrading sample quality. To resolve this problem, we propose a new training method, called Denoising Likelihood Score Matching, which matches the classifier's gradients to the true likelihood score. The experimental evidence in this thesis shows that the proposed method noticeably outperforms previous methods on the CIFAR-10 and CIFAR-100 datasets. This thesis therefore concludes that, by adopting denoising likelihood score matching, the classifier can estimate the likelihood score accurately and the impact of the score mismatch issue can be alleviated.
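    For illustration, the decomposition referred to above is the standard Bayes identity (notation ours): since \(p(\mathbf{x} \mid y) = p(y \mid \mathbf{x})\,p(\mathbf{x})/p(y)\) and \(p(y)\) does not depend on \(\mathbf{x}\),

    \[
    \nabla_{\mathbf{x}} \log p(\mathbf{x} \mid y)
    = \underbrace{\nabla_{\mathbf{x}} \log p(\mathbf{x})}_{\text{score model}}
    + \underbrace{\nabla_{\mathbf{x}} \log p(y \mid \mathbf{x})}_{\text{classifier gradient (likelihood score)}},
    \]

    which is why a classifier's input-gradient can serve as the likelihood score term during conditional sampling.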
    This research has been accepted for publication at the International Conference on Learning Representations (ICLR 2022).


Abstract (English):
    Many existing conditional score-based data generation methods utilize Bayes' theorem to decompose the gradients of a log posterior density into a mixture of scores. These methods facilitate the training procedure of conditional score models, as a mixture of scores can be separately estimated using a score model and a classifier. However, our analysis indicates that the training objectives for the classifier in these methods may lead to a serious score mismatch issue, which corresponds to the situation where the estimated scores deviate from the true ones. Such an issue causes the samples to be misled by the deviated scores during the diffusion process, resulting in degraded sampling quality. To resolve it, we theoretically formulate a novel training objective, called the Denoising Likelihood Score Matching (DLSM) loss, for the classifier to match the gradients of the true log likelihood density. Our experimental evidence shows that the proposed method noticeably outperforms the previous methods on both the CIFAR-10 and CIFAR-100 datasets in terms of several key evaluation metrics. We thus conclude that, by adopting DLSM, the conditional scores can be accurately modeled, and the effect of the score mismatch issue is alleviated.
    This research has been accepted for publication at the International Conference on Learning Representations (ICLR 2022).
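    To make the training objective concrete, below is a minimal PyTorch-style sketch of a DLSM-style classifier loss, reconstructed from the description above. The function name, the `classifier(x, sigma)` and `score_model(x, sigma)` signatures, and the single-noise-level setup are illustrative assumptions rather than the authors' released code; the published method also combines this term with a standard cross-entropy loss, which is omitted here.

```python
import torch
import torch.nn.functional as F

def dlsm_loss(classifier, score_model, x, y, sigma):
    """Sketch of a DLSM-style objective (assumptions noted in the text):
    push the classifier's input-gradient, i.e. the estimated likelihood
    score grad log p(y | x_tilde), toward the true likelihood score via
    the denoising identity grad log q(x_tilde | x) = (x - x_tilde) / sigma^2."""
    # Perturb the clean data x with Gaussian noise of scale sigma.
    noise = torch.randn_like(x)
    x_tilde = (x + sigma * noise).requires_grad_(True)

    # Estimated likelihood score: gradient of the classifier's
    # log-probability for the observed labels y w.r.t. the noisy input.
    log_prob = F.log_softmax(classifier(x_tilde, sigma), dim=1)
    selected = log_prob[torch.arange(x.shape[0]), y].sum()
    lik_score = torch.autograd.grad(selected, x_tilde, create_graph=True)[0]

    # Unconditional score from a frozen, pretrained score model.
    with torch.no_grad():
        prior_score = score_model(x_tilde, sigma)

    # Denoising target: grad_{x_tilde} log q(x_tilde | x).
    target = (x - x_tilde.detach()) / sigma ** 2

    # Match (likelihood score + unconditional score) to the target,
    # so that their sum approximates the posterior score.
    resid = lik_score + prior_score - target
    return 0.5 * resid.flatten(1).pow(2).sum(dim=1).mean()
```

    Only the classifier receives gradient updates here; the score model is treated as fixed, mirroring the two-part decomposition described in the abstract.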

Table of Contents:
    Abstract (Chinese) 1
    Acknowledgements (Chinese) 2
    Abstract 3
    Acknowledgements 4
    Contents 5
    List of Figures 7
    List of Tables 9
    1. Introduction 10
    2. Background 12
    3. Analysis on the Score Mismatch Issue 17
    4. Denoising Likelihood Score Matching 21
    5. Experiments 25
    6. Conclusion 30
    Bibliography 31

