Graduate student: | 蔡忠廷 Tsai, Chung-Ting |
---|---|
Thesis title: | 多變量局部線性迴歸模型的變異數分析及檢定 Analysis of Variance and Hypothesis Testing for Multivariate Local Linear Regression Models |
Advisor: | 黃禮珊 Huang, Li-Shan |
Committee members: | 謝文萍 Hsieh, Wen-Ping; 江金倉 Chiang, Chin-Tsang; 謝叔蓉 Shieh, Shwu-Rong |
Degree: | Master |
Department: | College of Science - Institute of Statistics |
Year of publication: | 2017 |
Academic year of graduation: | 105 (ROC calendar) |
Language: | Chinese |
Number of pages: | 73 |
Keywords (Chinese): | 多變量 (multivariate), 變異數分析 (analysis of variance), 檢定 (test), 無母數 (nonparametric), 局部 (local), 線性 (linear) |
Keywords (English): | nonparametric, multivariate, local, linear, ANOVA, test |
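In linear models, the F-test, which uses the difference in error sums of squares to measure the gap in goodness of fit between a complex model and a simple one, is one of the most common tools for choosing an appropriate model. Huang and Chen (2008) [7] extended the idea of this test to local polynomial regression models (see Fan and Gijbels, 1996 [3]), established local and global analysis-of-variance decompositions, and proposed a new F-statistic to test whether the model function fitted by local polynomial regression is more appropriate than a constant function. Following the approach of Huang and Chen (2008) [7], this thesis extends the testing framework to multivariate local linear regression models (see Ruppert and Wand, 1994 [17]), establishes local and global analysis-of-variance decompositions, and defines two F-statistics for two testing problems of interest: (i) whether the model function is a constant function, and (ii) whether the d-dimensional model function can be reduced to a (d-1)-dimensional function. Simulations then examine the type I error and power of the two test statistics in the case of two explanatory variables, as the sample size, the correlation between the explanatory variables, the bandwidth, and the rejection signal vary. The thesis also gives computational methods for carrying out the tests in practice, including how to normalize the product kernel function. Finally, the two tests are applied to housing-price data from the Boston area in the United States.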
In linear models, two nested models are commonly compared with an F-test based on the difference of their error sums of squares. Huang and Chen (2008) [7] extended the structure of this F-test to local polynomial regression (LPR) models (see Fan and Gijbels, 1996 [3]), constructed local and global ANOVA decompositions for LPR models, and defined an F-statistic to test whether a model function fitted by LPR is significant, that is, whether it is preferable to a constant function. Following the framework of Huang and Chen (2008) [7], this thesis extends the F-test to multivariate local linear regression (MLLR) models (see Ruppert and Wand, 1994 [17]). We establish local and global ANOVA decompositions for MLLR models and define two F-statistics corresponding to the following two hypotheses: (i) whether a model function fitted by MLLR is significant, and (ii) whether a model function fitted by MLLR with the d covariates X_1,..., X_d can be reduced to one fitted with the d-1 covariates X_2,..., X_d. In the bivariate case (d = 2), the type I error and power of the two F-tests are investigated by simulation under different sample sizes, correlations between covariates, bandwidths, and rejection signals. Practical issues in implementing the two F-tests are also discussed, including normalization of the product kernel function. Finally, the two F-tests are applied to the Boston house-price data.
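As a rough illustration of the estimator underlying the abstract, the following minimal sketch computes a multivariate local linear fit at a single point using a product kernel. It is not the thesis's implementation: the Epanechnikov kernel, the function names `epanechnikov` and `local_linear_fit`, and the bandwidth argument `h` are assumptions made for this example, and the kernel normalization and ANOVA decompositions developed in the thesis are not reproduced here.

```python
import numpy as np


def epanechnikov(u):
    """Univariate Epanechnikov kernel, supported on [-1, 1] (assumed choice)."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)


def local_linear_fit(X, y, x0, h):
    """Local linear estimate of m(x0) = E[Y | X = x0].

    X  : (n, d) array of covariates
    y  : (n,) array of responses
    x0 : (d,) evaluation point
    h  : scalar bandwidth, or (d,) array with one bandwidth per coordinate
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    diff = X - np.asarray(x0, dtype=float)  # covariates centred at x0
    # Product kernel: multiply the scaled univariate kernels across coordinates.
    w = np.prod(epanechnikov(diff / h) / h, axis=1)
    # Local design matrix: intercept plus one linear term per covariate.
    Z = np.column_stack([np.ones(len(y)), diff])
    # Weighted least squares via ordinary least squares on sqrt-weighted rows.
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(Z * sw[:, None], y * sw, rcond=None)
    return beta[0]  # the fitted intercept is the estimate of m(x0)
```

Evaluating `local_linear_fit` over a grid of points gives the fitted surface. A test in the spirit of hypothesis (ii) would compare such a fit on all d covariates with a fit on d-1 of them, but the specific F-statistics are defined in the thesis itself.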
[1] Belsley, D. A., Kuh, E. and Welsch, R. E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. New York: Wiley.
[2] Breiman, L. and Friedman, J. H. (1985). Estimating optimal transformations for multiple regression and correlation. Journal of the American Statistical Association, 80, 580-619.
[3] Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. Chapman and Hall/CRC.
[4] Fan, J. and Jiang, J. (2005). Nonparametric inferences for additive models. Journal of the American Statistical Association, 100, 890-907.
[5] Gu, J., Li, Q. and Yang, J.-C. (2015). Multivariate local polynomial kernel estimators: leading bias and asymptotic distribution. Econometric Reviews, 34, 979-1010.
[6] Harrison, D. and Rubinfeld, D. L. (1978). Hedonic prices and the demand for clean air. Journal of Environmental Economics and Management, 5, 81-102.
[7] Huang, L.-S. and Chen, J. (2008). Analysis of variance, coefficient of determination and F-test for local polynomial regression. The Annals of Statistics, 36, 2085-2109.
[8] Huang, L.-S. and Davidson, P. W. (2010). Analysis of variance and F-tests for partial linear models with applications to environmental health data. Journal of the American Statistical Association, 105, 991-1004.
[9] Huang, L.-S. and Su, H. (2009). Nonparametric F-tests for nested global and local polynomial models. Journal of Statistical Planning and Inference, 139, 1372-1380.
[10] Lepage, G. P. (1978). A new algorithm for adaptive multidimensional integration. Journal of Computational Physics, 27, 192-203.
[11] Neumeyer, N. and Van Keilegom, I. (2010). Estimating the error distribution in nonparametric multiple regression with applications to model testing. Journal of Multivariate Analysis, 101, 1067-1078.
[12] Nielsen, J. P. and Sperlich, S. (2005). Smooth backfitting in practice. Journal of the Royal Statistical Society: Series B, 67, 43-61.
[13] Opsomer, J. D. and Ruppert, D. (1997). Fitting a bivariate additive model by local polynomial regression. The Annals of Statistics, 25, 186-211.
[14] Opsomer, J. D. and Ruppert, D. (1998). A fully automated bandwidth selection method for fitting additive models. Journal of the American Statistical Association, 93, 605-619.
[15] Rousseeuw, P. J. and Ruts, I. (1998). Constructing the bivariate Tukey median. Statistica Sinica, 8, 827-839.
[16] Rousseeuw, P. J., Ruts, I. and Tukey, J. W. (1999). The bagplot: a bivariate boxplot. The American Statistician, 53, 382-387.
[17] Ruppert, D. and Wand, M. P. (1994). Multivariate locally weighted least squares regression. The Annals of Statistics, 22, 1346-1370.
[18] Schumann, E. (2009). Generating correlated uniform variates. Retrieved from http://comisef.wikidot.com/tutorial:correlateduniformvariates/
[19] Yu, K., Park, B. U. and Mammen, E. (2008). Smooth backfitting in generalized additive models. The Annals of Statistics, 36, 228-260.