
Graduate student: Ku, Hsiang-Yu (顧翔予)
Thesis title: CoMABO: Covariance Matrix Adaptation for Bayesian Optimization (自適應共變異數的貝葉斯優化)
Advisor: Lee, Che-Rung (李哲榮)
Committee members: Chen, Hwann-Tzong (陳煥宗); Wang, Sheng-Jyh (王聖智)
Degree: Master
Department: Department of Computer Science, College of Electrical Engineering and Computer Science
Year of publication: 2022
Graduation academic year: 110
Language: English
Number of pages: 39
Keywords (Chinese): 貝葉斯優化 (Bayesian optimization)
Keywords (English): Bayesian optimization
    Abstract (Chinese, translated): Bayesian optimization is an effective method for finding the optimal solution of black-box problems, but it usually performs poorly on high-dimensional problems. Although many methods, such as TuRBO (trust-region Bayesian optimization), address the excessive exploration in high-dimensional problems and the difficulty Gaussian processes have in approximating the true function globally in high-dimensional spaces, they may still converge slowly or stagnate. In this thesis we propose a new method, CoMABO (Covariance Matrix Adaptation for Bayesian Optimization), to enhance the convergence of TuRBO. Covariance matrix adaptation is a technique that builds a Gaussian sampling model from a covariance matrix constructed from sampled points; CoMABO uses it to explore better candidate points and to optimize the complex surrogate model. Experimental results show that CoMABO improves the convergence of TuRBO on various benchmark problems and real-world applications.


    Abstract (English): Bayesian optimization, an effective method for searching for the optimal solution of black-box functions, usually performs poorly on high-dimensional problems. Although many methods, such as TuRBO (trust-region Bayesian optimization), have been proposed to mitigate the over-emphasis on exploration in high-dimensional global acquisition, they may still converge slowly or stagnate. In this paper, we propose a new method, called CoMABO (Covariance Matrix Adaptation for Bayesian Optimization), to enhance the convergence of the TuRBO algorithm. Covariance matrix adaptation is a technique that builds Gaussian models based on a covariance matrix constructed from sampled points. CoMABO utilizes it to explore better candidate points and to optimize the complex surrogate model. Experimental results show that CoMABO improves the convergence of TuRBO on various benchmark problems and real-world applications.
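    To make the role of covariance matrix adaptation concrete, the sketch below illustrates one plausible reading of the abstract: CMA-ES (via the pycma package) used as the inner optimizer of an acquisition function inside a box-shaped trust region, the step that TuRBO-style methods perform at every iteration. This is only a minimal illustration under assumptions, not the thesis's actual implementation: the acquisition function, trust-region center, side length, and all option values here are hypothetical placeholders, and a real implementation would draw the acquisition from a Gaussian-process surrogate.

    # Hedged sketch: CMA-ES as the inner optimizer of a Bayesian-optimization
    # acquisition function restricted to a trust region. The acquisition below
    # is a toy placeholder; in practice it would come from a GP surrogate
    # (e.g. expected improvement or a Thompson sample).
    import numpy as np
    import cma  # pip install cma  (pycma)

    dim = 10
    center = np.zeros(dim)            # hypothetical trust-region center (current best point)
    half_length = 0.4                 # hypothetical trust-region half side length
    lower, upper = center - half_length, center + half_length

    def acquisition(x):
        # Placeholder acquisition value; CMA-ES minimizes, so return the
        # negative of the quantity we want to maximize.
        return -np.exp(-np.sum((x - 0.1) ** 2))

    es = cma.CMAEvolutionStrategy(
        center,                       # start the Gaussian sampling model at the center
        0.3 * half_length,            # initial step size (sigma); a hypothetical choice
        {"bounds": [lower.tolist(), upper.tolist()], "maxfevals": 2000, "verbose": -9},
    )
    while not es.stop():
        candidates = es.ask()                                        # sample from the adapted Gaussian
        es.tell(candidates, [acquisition(x) for x in candidates])    # update mean and covariance

    best_candidate = es.result.xbest  # next point to evaluate on the black-box objective
    print(best_candidate)

    The key design point this sketch tries to convey is that the sampling distribution's covariance is adapted from the evaluated candidates, so the search can elongate along promising directions of the acquisition surface rather than exploring isotropically inside the trust region.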

    Table of contents:
    Abstract (Chinese) 1
    Abstract 2
    Contents 3
    List of Figures 5
    List of Tables 6
    List of Algorithms 7
    1 Introduction 8
    2 Related Work 10
    3 CoMABO Algorithm 14
    3.1 Initial Data 15
    3.2 Acquisition Function (Inner Optimization Problem) 16
    3.3 Data Augmentation 18
    4 Experiments 19
    4.1 Implementation 19
    4.2 Test Problems 20
    4.2.1 14D Robot Pushing 20
    4.2.2 60D Rover Trajectory Planning 20
    4.2.3 200D Ackley Function 21
    4.2.4 100D Levy Function 21
    4.3 Experimental Results and Analysis 21
    4.4 Different Integration Methods 24
    5 Conclusion and Future Work 29
    Bibliography 30
    Appendix 35
    A Gaussian Process Regression 35
    B CMA-ES 39

