| Field | Content |
|---|---|
| Graduate student | 陳岳榆 (Chen, Yueh-Yu) |
| Thesis title | 高維遷移學習之模型選擇與應用 (Model Selection for High-Dimensional Transfer Learning and Their Applications) |
| Advisor | 銀慶剛 (Ing, Ching-Kang) |
| Oral defense committee | 俞淑惠 (Yu, Shu-Hui), 邱海唐 (Chiou, Hai-Tang) |
| Degree | Master |
| Department | Institute of Statistics, College of Science |
| Year of publication | 2024 |
| Academic year of graduation | 112 (ROC calendar) |
| Language | Chinese |
| Number of pages | 39 |
| Keywords (Chinese) | 高維度, 模型選擇, 遷移學習, 純粹貪婪演算法, 柴比雪夫貪婪演算法 |
| Keywords (English) | High-Dimensional, Model Selection, Transfer Learning, Pure Greedy Algorithm, Chebyshev Greedy Algorithm |
High-dimensional data and transfer learning are highly practical topics in statistics and machine learning, respectively. This thesis reviews the model selection method and theoretical properties proposed by Lin, Ing, Dai, and Chen (2023) for nonlinear models, as well as the two-step estimation idea proposed by Bastani (2020) for transfer learning. Building on their ideas, and using two greedy algorithms, the pure greedy algorithm and the Chebyshev greedy algorithm, as the main tools, we propose a two-step model selection procedure for transfer learning data, called TSGA. Moreover, TSGA remains applicable even when the sparsity of the true proxy-data bias is unknown, and its results are not affected by that sparsity. Simulation results and an analysis of real data from a hotel booking website illustrate the practicality of our method.
We consider the problem of high-dimensional data and transfer learning. First, we review the model selection methods and theoretical properties of Lin, Ing, Dai, and Chen (2023) for nonlinear models. We then review the two-step estimation method of Bastani (2020) for transfer learning. Borrowing their ideas, we propose a model selection method for the transfer learning setting, called the two-step greedy algorithm (TSGA), which uses the pure greedy algorithm (PGA) and the Chebyshev greedy algorithm (CGA) as the main tools for selecting relevant variables. In addition, TSGA can still be used even when the sparsity of the proxy data bias is unknown. Simulation results and applications to booking website data are provided to shed light on the performance and usefulness of our approach.
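In its standard regression form, the pure greedy algorithm mentioned in the abstract coincides with matching pursuit (equivalently, L2-boosting with full step size): at each iteration it selects the predictor most correlated with the current residual and updates the fit along that single coordinate. The sketch below illustrates this basic building block only; it is not the thesis's full TSGA procedure, and the function name and fixed iteration count are illustrative assumptions.

```python
import numpy as np

def pure_greedy_algorithm(X, y, n_iter=10):
    """Matching-pursuit form of the pure greedy algorithm (PGA).

    At each step, pick the column of X most correlated with the
    current residual and take a least-squares step on that single
    coordinate. Returns the coefficient vector and the selection path.
    """
    n, p = X.shape
    coef = np.zeros(p)
    residual = y.astype(float).copy()
    path = []
    for _ in range(n_iter):
        corr = X.T @ residual                  # inner products with residual
        j = int(np.argmax(np.abs(corr)))       # most correlated predictor
        step = corr[j] / (X[:, j] @ X[:, j])   # 1-D least-squares step
        coef[j] += step
        residual -= step * X[:, j]
        path.append(j)
    return coef, path
```

In practice the number of iterations is not fixed in advance but chosen by a stopping rule such as a high-dimensional information criterion, in the spirit of Ing and Lai (2011); the fixed `n_iter` here is only for illustration.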
1. Lin, C.-T., Ing, C.-K., Dai, C.-S., and Chen, Y.-L. (2023). High-dimensional model selection via Chebyshev's greedy algorithm. Working paper.
2. Bastani, H. (2020). Predicting with proxies: transfer learning in high dimension. Management Science, 67(5).
3. Ing, C.-K. and Lai, T.-L. (2011). A stepwise regression method and consistent model selection for high-dimensional sparse linear models. Statistica Sinica, 21, 1473-1513.
4. Bühlmann, P. and Yu, B. (2003). Boosting with the L_2 loss: regression and classification. Journal of the American Statistical Association, 98(462), 324-339.