
Graduate student: Ku, Hsiang-Yu (顧翔予)
Thesis title: CoMABO: Covariance Matrix Adaptation for Bayesian Optimization (自適應共變異數的貝葉斯優化)
Advisor: Lee, Che-Rung (李哲榮)
Committee members: Chen, Hwann-Tzong (陳煥宗); Wang, Sheng-Jyh (王聖智)
Degree: Master
Department: Department of Computer Science, College of Electrical Engineering and Computer Science
Year of publication: 2022
Graduation academic year: 110
Language: English
Number of pages: 39
Keywords (Chinese): 貝葉斯優化 (Bayesian optimization)
Keywords (English): Bayesian optimization
    Abstract (Chinese, translated): Bayesian optimization is an effective method for finding the optimal solution of black-box problems, but it usually performs poorly on high-dimensional problems. Although many methods, such as TuRBO (trust-region Bayesian optimization), address the excessive exploration in high-dimensional problems and the difficulty Gaussian processes have in approximating the true function globally in high-dimensional spaces, they may still converge slowly or stagnate. In this thesis we propose a new method, CoMABO (Covariance Matrix Adaptation for Bayesian Optimization), to enhance the convergence of TuRBO. Covariance matrix adaptation is a technique that builds a Gaussian sampling model from a covariance matrix constructed from sampled points; CoMABO uses it to explore better candidate points and to optimize the complex surrogate model. Experimental results show that CoMABO improves the convergence of TuRBO on various benchmark problems and real-world applications.


    Abstract (English): Bayesian optimization, an effective method for searching for the optimal solution of black-box functions, usually performs poorly on high-dimensional problems. Although many methods, such as TuRBO (trust-region Bayesian optimization), have been proposed to mitigate the over-emphasis on exploration in high-dimensional global acquisition, they may still converge slowly or stagnate. In this paper, we propose a new method, called CoMABO (Covariance Matrix Adaptation for Bayesian Optimization), to enhance the convergence of the TuRBO algorithm. Covariance matrix adaptation is a technique that builds Gaussian models based on a covariance matrix constructed from sampled points. CoMABO utilizes it to explore better candidate points and to optimize the complex surrogate model. Experimental results show that CoMABO improves the convergence of TuRBO on various benchmark problems and real-world applications.
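    To make the role of covariance matrix adaptation concrete, the sketch below illustrates one plausible reading of the abstract: CMA-ES (via the pycma package) used as the inner optimizer of an acquisition function inside a box-shaped trust region, the step that TuRBO-style methods perform at every iteration. This is only a minimal illustration under assumptions, not the thesis's actual implementation: the acquisition function, trust-region center, side length, and all option values here are hypothetical placeholders, and a real implementation would draw the acquisition from a Gaussian-process surrogate.

    # Hedged sketch: CMA-ES as the inner optimizer of a Bayesian-optimization
    # acquisition function restricted to a trust region. The acquisition below
    # is a toy placeholder; in practice it would come from a GP surrogate
    # (e.g. expected improvement or a Thompson sample).
    import numpy as np
    import cma  # pip install cma  (pycma)

    dim = 10
    center = np.zeros(dim)            # hypothetical trust-region center (current best point)
    half_length = 0.4                 # hypothetical trust-region half side length
    lower, upper = center - half_length, center + half_length

    def acquisition(x):
        # Placeholder acquisition value; CMA-ES minimizes, so return the
        # negative of the quantity we want to maximize.
        return -np.exp(-np.sum((x - 0.1) ** 2))

    es = cma.CMAEvolutionStrategy(
        center,                       # start the Gaussian sampling model at the center
        0.3 * half_length,            # initial step size (sigma); a hypothetical choice
        {"bounds": [lower.tolist(), upper.tolist()], "maxfevals": 2000, "verbose": -9},
    )
    while not es.stop():
        candidates = es.ask()                                        # sample from the adapted Gaussian
        es.tell(candidates, [acquisition(x) for x in candidates])    # update mean and covariance

    best_candidate = es.result.xbest  # next point to evaluate on the black-box objective
    print(best_candidate)

    The key design point this sketch tries to convey is that the sampling distribution's covariance is adapted from the evaluated candidates, so the search can elongate along promising directions of the acquisition surface rather than exploring isotropically inside the trust region.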

    Table of contents:
    Abstract (Chinese) 1
    Abstract 2
    Contents 3
    List of Figures 5
    List of Tables 6
    List of Algorithms 7
    1 Introduction 8
    2 Related Work 10
    3 CoMABO Algorithm 14
    3.1 Initial Data 15
    3.2 Acquisition Function (Inner Optimization Problem) 16
    3.3 Data Augmentation 18
    4 Experiments 19
    4.1 Implementation 19
    4.2 Test Problems 20
    4.2.1 14D Robot Pushing 20
    4.2.2 60D Rover Trajectory Planning 20
    4.2.3 200D Ackley Function 21
    4.2.4 100D Levy Function 21
    4.3 Experimental Results and Analysis 21
    4.4 Different Integration Methods 24
    5 Conclusion and Future Work 29
    Bibliography 30
    Appendix 35
    A Gaussian Process Regression 35
    B CMA-ES 39

