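The Applications of Ensemble Learning and Deep Learning on the Exoplanet Researches | National Tsing Hua University Theses and Dissertations Repository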


Author:	曾淯湞 Tseng, Yu-Chen
Thesis title:	集成學習與深度學習在系外行星研究的應用 The Applications of Ensemble Learning and Deep Learning on the Exoplanet Researches
Advisor:	葉麗琴 Yeh, Li-Chin
Oral defense committee:	江瑛貴 Jiang, Ing-Guey; 陳賢修 Chen, Shyan-Shiou
Degree:	碩士 Master
Department:	College of Science - Institute of Computational and Modeling Science
Year of publication:	2024
Academic year of graduation:	112
Language:	English
Number of pages:	86
Keywords (translated from Chinese):	Ensemble Learning, Deep Learning, Transit Method, Exoplanet, Convolutional Neural Network, Artificial Intelligence, Feature Engineering, Machine Learning
Keywords (English):	Ensemble Learning, Deep Learning, TESS, Exoplanet, Convolutional Neural Network, Artificial Intelligence, TSfresh, Machine Learning

This study combines ensemble learning and deep learning, two branches of machine learning, with the transit method to search TESS [6] data for candidate exoplanets with transit periods of one to two days. Before model training, we applied TSfresh feature engineering and compared the predictive performance of Random Forest (RF), AdaBoost (Adab), XGBoost (XGB), LightGBM (LGBM), and CNN models. To build the training data, we processed the original light curves into noise and injected transit signals generated with the Mandel & Agol [5] transit model; the RF, Adab, XGB, LGBM, and CNN models were then built on these data. We selected appropriate sample sizes and trained the models with cross-validation. The better-performing models were then used for prediction to search for possible candidate exoplanets. In the end, we identified four candidate exoplanets.
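The workflow summarized in the abstract (noise light curves, injected transit signals, cross-validated training) can be illustrated with a minimal, standard-library-only Python sketch. It is not the thesis code, and it makes several simplifying assumptions: a box-shaped dip stands in for the Mandel & Agol [5] transit model, a single hand-made "folded depth" feature stands in for the TSfresh features, and a toy threshold classifier stands in for the RF, AdaBoost, XGBoost, LightGBM, and CNN models. All function names and numerical values (cadence, period, depth, noise level, sample count) are hypothetical.

```python
import random
import statistics


def make_noise_curve(n_points, sigma, rng):
    """Flat, normalized light curve with Gaussian noise (stand-in for a processed curve)."""
    return [1.0 + rng.gauss(0.0, sigma) for _ in range(n_points)]


def phase_of(t, period, t0):
    """Orbital phase in days, centred on 0 at mid-transit."""
    return ((t - t0 + 0.5 * period) % period) - 0.5 * period


def inject_box_transit(times, flux, period, t0, duration, depth):
    """Subtract a box-shaped dip inside each transit window (simplified transit shape)."""
    return [f - depth if abs(phase_of(t, period, t0)) < 0.5 * duration else f
            for t, f in zip(times, flux)]


def folded_depth(times, flux, period, t0, duration):
    """Hand-made feature: mean out-of-transit flux minus mean in-transit flux after folding."""
    inside, outside = [], []
    for t, f in zip(times, flux):
        (inside if abs(phase_of(t, period, t0)) < 0.5 * duration else outside).append(f)
    return statistics.fmean(outside) - statistics.fmean(inside)


def k_fold_indices(n_samples, k, rng):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    idx = list(range(n_samples))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_threshold(features, labels):
    """Toy classifier: threshold halfway between the two class means."""
    pos = [x for x, y in zip(features, labels) if y == 1]
    neg = [x for x, y in zip(features, labels) if y == 0]
    return 0.5 * (statistics.fmean(pos) + statistics.fmean(neg))


def cross_validate(features, labels, k, rng):
    """Train on k-1 folds, score on the held-out fold, and return one accuracy per fold."""
    scores = []
    for held_out in k_fold_indices(len(features), k, rng):
        held = set(held_out)
        train = [i for i in range(len(features)) if i not in held]
        thr = fit_threshold([features[i] for i in train], [labels[i] for i in train])
        hits = sum((features[i] > thr) == (labels[i] == 1) for i in held_out)
        scores.append(hits / len(held_out))
    return scores


rng = random.Random(0)
times = [i * 0.02 for i in range(1000)]             # 20 days at an assumed 0.02 d cadence
period, t0, duration, depth = 1.5, 0.3, 0.1, 0.01   # assumed values; period within 1-2 d

features, labels = [], []
for n in range(200):
    flux = make_noise_curve(len(times), 0.002, rng)
    label = n % 2                                   # every second sample receives a transit
    if label == 1:
        flux = inject_box_transit(times, flux, period, t0, duration, depth)
    features.append(folded_depth(times, flux, period, t0, duration))
    labels.append(label)

scores = cross_validate(features, labels, 5, rng)
print("fold accuracies:", scores)
```

In this toy setting the injected depth is far larger than the noise, so the folds are easy to separate; a real search would replace the box dip with a limb-darkened transit model and the threshold rule with the ensemble or CNN models named above.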

Abstract
Abstract (in Chinese)
Acknowledgements
Chapter 1: Introduction
Chapter 2: Dataset Introduction
2.1 Data Sources
2.2 Data Preparation
2.3 Data Preprocessing
2.3.1 Clustering And Standardization
2.3.2 Removing Outliers And Processing Scattered Data Points
2.3.3 Data Folding
2.3.4 Interpolation For Imputation

Chapter 3: Feature Engineering
3.1 Purpose And Effects Of Feature Engineering
3.2 Using TSfresh For Feature Extraction
3.3 Specific Feature Engineering Steps And Methods

Chapter 4: Model Selection
4.1 Development Of Artificial Intelligence
4.2 Introduction And Development Of Machine Learning
4.3 Decision Tree
4.4 Ensemble Learning Methods
4.4.1 Random Forest
4.4.2 AdaBoost
4.4.3 XGBoost (Extreme Gradient Boosting)
4.4.4 LightGBM
4.5 Development Of Deep Learning
4.6 Convolutional Neural Network

Chapter 5: Model Selection
5.1 Training, Validation, Testing Set
5.2 K-Fold Cross-Validation
5.3 Model Evaluation (Random Forest, AdaBoost, XGBoost, LightGBM)
5.4 Model Evaluation (Convolutional Neural Networks)

Chapter 6: Results and Discussions
6.1 RF, XGB, LGBM Model
6.2 Adab, CNN Model

Chapter 7: Conclusions
References
Python Code

                                

[1] Christ, M., et al. (2024). tsfresh documentation (Version 0.20.2.post0.dev4+g3da2360) [Software documentation]. Blue Yonder GmbH. Retrieved from https://tsfresh.readthedocs.io/en/latest/

[2] Huang, C.-S. (2018). 機器學習 Ensemble Learning之Bagging, Boosting與AdaBoost [Machine learning: Bagging, Boosting, and AdaBoost in ensemble learning]. Medium. Retrieved from https://chih-sheng-huang821.medium.com/機器學習-ensemble-learning之bagging-boosting與adaboost-af031229ebc3

[3] Chiu, L. (2018). 使用 TensorFlow 了解 Dropout [Understanding Dropout with TensorFlow]. Medium. Retrieved from https://medium.com/手寫筆記/使用-tensorflow-了解-dropout-bf64a6785431

[4] Malik, A., Moster, B. P., & Obermeier, C. (2021). Exoplanet Detection using Machine Learning. arXiv preprint arXiv:2011.14135. Retrieved from https://arxiv.org/abs/2011.14135

[5] Mandel, K., & Agol, E. (2002). Analytic Light Curves for Planetary Transit Searches. The Astrophysical Journal, 580(2), L171.

[6] TESS - FFI/TP/LC Bulk Downloads. Retrieved from https://archive.stsci.edu/tess/bulk_downloads/bulk_downloads_ffi-tp-lc-dv.html

[7] Yeh, L.-C., & Jiang, I.-G. (2020). Searching for Possible Exoplanet Transits from BRITE Data through a Machine Learning Technique. Publications of the Astronomical Society of the Pacific, 133(1019), 014401.
