| Field | Value |
|---|---|
| Graduate Student | Chen, Tai-Yu (陳泰宇) |
| Thesis Title | The Bias-Variance Trade-off of the Testing Mean Squared Error by k-fold Cross-Validation: A Simulation Study (k折交叉驗證檢驗均方誤差的偏誤和變異數之間的抵換關係: 模擬研究) |
| Advisor | Yang, Jui-Chung (楊睿中) |
| Committee Members | Kuo, Chun-Hung (郭俊宏); Chuang, Hui-Ching (莊惠菁) |
| Degree | Master |
| Department | Department of Economics, College of Technology Management |
| Year of Publication | 2020 |
| Graduation Academic Year | 108 (ROC calendar) |
| Language | Chinese |
| Pages | 27 |
| Keywords | Cross-Validation, testing mean squared error |
This thesis studies the theoretical trade-off between the bias and the variance of the testing mean squared error (MSE) estimated by k-fold cross-validation. The theory predicts that as the number of folds increases, the variance of the estimated testing MSE grows while its bias shrinks. The thesis simulates a linear regression model and uses cross-validation to compute the testing MSE. The computer simulations show that the larger the sample size, the more closely the predicted bias-variance trade-off matches the empirical results: as the number of folds increases, the bias of the estimated testing MSE becomes smaller and its variance becomes larger.
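To fix ideas (these are the standard definitions of the two quantities; the thesis's own theorem is not reproduced here), write \(\widehat{\mathrm{MSE}}_k\) for the k-fold cross-validation estimate of the true testing MSE:

```latex
\mathrm{Bias}\left(\widehat{\mathrm{MSE}}_k\right)
  = \mathbb{E}\left[\widehat{\mathrm{MSE}}_k\right] - \mathrm{MSE},
\qquad
\mathrm{Var}\left(\widehat{\mathrm{MSE}}_k\right)
  = \mathbb{E}\left[\left(\widehat{\mathrm{MSE}}_k
      - \mathbb{E}\left[\widehat{\mathrm{MSE}}_k\right]\right)^{2}\right].
```

The usual intuition for the trade-off: as k grows, each training set contains a larger share of the sample, so every fitted model is closer to the one trained on all the data and the upward bias of the error estimate shrinks; at the same time the k training sets overlap more heavily, the fold-level errors become more correlated, and the variance of their average grows.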
Keywords: Cross-Validation, testing mean squared error
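For concreteness, here is a minimal Python/NumPy sketch of the kind of simulation the abstract describes; the data-generating process (the coefficients `beta`, the noise level `sigma`, the sample size `n`, and the number of replications) is hypothetical, since the abstract does not report the thesis's actual design:

```python
# Minimal sketch: how the bias and variance of the k-fold CV estimate
# of the testing MSE change with k. The DGP below is hypothetical.
import numpy as np

rng = np.random.default_rng(0)

def simulate_data(n, beta=(1.0, 2.0), sigma=1.0):
    """One sample from the linear model y = b0 + b1*x + e."""
    x = rng.normal(size=n)
    y = beta[0] + beta[1] * x + rng.normal(scale=sigma, size=n)
    return x, y

def kfold_cv_mse(x, y, k):
    """Estimate the testing MSE of OLS by k-fold cross-validation."""
    n = len(y)
    idx = rng.permutation(n)
    fold_mse = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)          # indices outside this fold
        X = np.column_stack([np.ones(train.size), x[train]])
        b, *_ = np.linalg.lstsq(X, y[train], rcond=None)  # OLS fit
        pred = b[0] + b[1] * x[fold]
        fold_mse.append(np.mean((y[fold] - pred) ** 2))
    return np.mean(fold_mse)

# Replicate the experiment many times to measure the bias and the
# variance of the CV estimate for several choices of k.
n, reps, sigma = 200, 500, 1.0
for k in (2, 5, 10, n):                          # k = n is leave-one-out
    est = [kfold_cv_mse(*simulate_data(n, sigma=sigma), k)
           for _ in range(reps)]
    m, v = np.mean(est), np.var(est)
    # With Gaussian noise, the true testing MSE is close to sigma**2.
    print(f"k={k:>3d}  mean={m:.4f}  bias~{m - sigma**2:+.4f}  var={v:.6f}")
```

Under the trade-off stated in the abstract, the printed bias column should shrink toward zero and the variance column should grow as k increases, with the pattern clearest at larger sample sizes.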