Analysis of Variance and Hypothesis Testing for Multivariate Local Linear Regression Models

Graduate student:	蔡忠廷 Tsai, Chung-Ting
Thesis title:	多變量局部線性迴歸模型的變異數分析及檢定 Analysis of Variance and Hypothesis Testing for Multivariate Local Linear Regression Models
Advisor:	黃禮珊 Huang, Li-Shan
Oral defense committee:	謝文萍 Hsieh, Wen-Ping; 江金倉 Chiang, Chin-Tsang; 謝叔蓉 Shieh, Shwu-Rong
Degree:	Master
Department:	College of Science - Institute of Statistics
Year of publication:	2017
Academic year of graduation:	105 (ROC calendar)
Language:	Chinese
Number of pages:	73
Chinese keywords (translated):	multivariate, analysis of variance, test, nonparametric, local, linear
English keywords:	nonparametric, multivariate, local, linear, ANOVA, test

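In linear models, the F-test, which uses the difference in error sums of squares to measure the gap in goodness of fit between a complex model and a simpler one, is one of the common methods for choosing an appropriate model. Huang and Chen (2008) [7] extended the idea of this test to local polynomial regression models (see Fan and Gijbels, 1996 [3]), established local and global ANOVA decompositions, and proposed a new F-test statistic to examine whether a model function fitted by local polynomial regression is more appropriate than a constant function. Following the approach of Huang and Chen (2008) [7], this thesis extends the testing framework to multivariate local linear regression models (see Ruppert and Wand, 1994 [17]), establishes local and global ANOVA decompositions, and defines two F-statistics to address two testing problems of interest: (i) whether the model function is a constant function, and (ii) whether a d-dimensional model function can be reduced to a (d-1)-dimensional function. Simulations then examine the type I error and power of the two test statistics in the case of two explanatory variables, and how these vary with the sample size, the correlation between the explanatory variables, the bandwidth, and the rejection signal. Computational methods for carrying out the tests in practice are also presented, including how to normalize the product kernel function. Finally, the two tests are applied to the analysis of housing price data from the Boston area in the United States.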

In linear models, two nested models are commonly compared with an F-test based on the difference of their error sums of squares. Huang and Chen (2008) [7] extended the structure of this F-test to local polynomial regression (LPR) models (see Fan and Gijbels, 1996 [3]), constructed local and global ANOVA decompositions for LPR models, and defined an F-statistic to test whether a model function fitted by LPR is significant. This thesis extends the F-test to multivariate local linear regression (MLLR) models (see Ruppert and Wand, 1994 [17]), following the framework of Huang and Chen (2008) [7]. We establish local and global ANOVA decompositions for MLLR models and define two F-statistics corresponding to the following two hypotheses: (i) whether a model function fitted by MLLR is significant, and (ii) whether a model function fitted by MLLR with covariates X_2,..., X_d alone is adequate relative to one fitted by MLLR with all covariates X_1,..., X_d. In the bivariate case (d = 2), the type I error and power of the two F-tests are investigated by simulation under different sample sizes, correlations between covariates, bandwidths, and rejection signals. Practical issues in implementing the two F-tests are also discussed, including normalization of the product kernel function. Finally, the two F-tests are applied to the analysis of the Boston house-price data.
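To make the estimator behind the abstract concrete, the following is a minimal sketch of a multivariate local linear fit with a product kernel, in the spirit of Ruppert and Wand (1994) [17]. It is not the thesis's implementation: the Epanechnikov kernel, the per-coordinate bandwidth vector `h`, the simple division by `h`, and the function names are assumptions made for illustration, and the sketch does not include the ANOVA decomposition, the kernel normalization, or the F-statistics developed in the thesis.

```python
import numpy as np

def epanechnikov(u):
    """Univariate Epanechnikov kernel, zero outside [-1, 1] (assumed kernel choice)."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def local_linear_fit(X, y, x0, h):
    """Local linear estimate of m(x0) using product-kernel weights.

    X: (n, d) covariates, y: (n,) responses, x0: (d,) evaluation point,
    h: (d,) bandwidths, one per coordinate (hypothetical parameterization).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    diff = X - x0                                      # covariates centered at x0
    w = np.prod(epanechnikov(diff / h) / h, axis=1)    # product kernel weights
    Z = np.column_stack([np.ones(len(y)), diff])       # local design: intercept + slopes
    sw = np.sqrt(w)
    # Weighted least squares via ordinary least squares on sqrt-weighted rows.
    beta, *_ = np.linalg.lstsq(Z * sw[:, None], y * sw, rcond=None)
    return beta[0]                                     # intercept estimates m(x0)

# Usage: an exactly linear surface is reproduced by the local linear fit.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]
est = local_linear_fit(X, y, np.array([0.5, 0.5]), np.array([0.3, 0.3]))
```

Because the local model is linear in the centered covariates, a truly linear regression surface is recovered up to floating-point error whenever enough points fall inside the kernel window, which makes a convenient sanity check.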

Introduction
Background
1 Local polynomial regression
2 Analysis of variance and testing for local polynomial regression
3 Multivariate local linear regression
4 Tukey median and bagplot
5 Further literature review
5.1 GLR test for additive models
5.2 Error distribution estimation and model structure testing for multivariate nonparametric regression models
Testing methods for multivariate local linear regression
1 ANOVA decomposition
2 Asymptotic projection matrix
3 Testing whether the model function is a constant function
4 Testing whether the model function is a function of the variables X_2,..., X_d
Simulations
1 Kernel function normalization method
2 Simulation 1
2.1 Simulation results for testing whether the model function is a constant function
2.2 Simulation results for testing whether the model function is a univariate function of X_1
3 Simulation 2
3.1 Simulation results for testing whether the model function is a constant function
3.2 Simulation results for testing whether the model function is a univariate function of X_1
Real data analysis
Conclusions and future research
Appendix
1 Proofs
1.1 Proof of Property 3.1
1.2 Proof of Theorem 3.2
1.3 Proof of Theorem 3.3
1.4 Proof of Theorem 3.4
1.5 Proof of Theorem 3.5
2 Simulation results
2.1 Simulation 1
2.2 Simulation 2
References

[1] Belsley, D. A., Kuh, E. and Welsch, R. E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. New York: Wiley.
[2] Breiman, L. and Friedman, J. H. (1985). Estimating optimal transformations for multiple regression and correlation. Journal of the American Statistical Association, 80, 580-619.
[3] Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications. Chapman and Hall/CRC.
[4] Fan, J. and Jiang, J. (2005). Nonparametric inferences for additive models. Journal of the American Statistical Association, 100, 890-907.
[5] Gu, J., Li, Q. and Yang, J.-C. (2015). Multivariate local polynomial kernel estimators: leading bias and asymptotic distribution. Econometric Reviews, 34, 979-1010.
[6] Harrison, D. and Rubinfeld, D. L. (1978). Hedonic prices and the demand for clean air. Journal of Environmental Economics and Management, 5, 81-102.
[7] Huang, L.-S. and Chen, J. (2008). Analysis of variance, coefficient of determination and F-test for local polynomial regression. The Annals of Statistics, 36, 2085-2109.
[8] Huang, L.-S. and Davidson, P. W. (2010). Analysis of variance and F-tests for partial linear models with applications to environmental health data. Journal of the American Statistical Association, 105, 991-1004.
[9] Huang, L.-S. and Su, H. (2009). Nonparametric F-tests for nested global and local polynomial models. Journal of Statistical Planning and Inference, 139, 1372-1380.
[10] Lepage, G. P. (1978). A new algorithm for adaptive multidimensional integration. Journal of Computational Physics, 27, 192-203.
[11] Neumeyer, N. and Van Keilegom, I. (2010). Estimating the error distribution in nonparametric multiple regression with applications to model testing. Journal of Multivariate Analysis, 101, 1067-1078.
[12] Nielsen, J. P. and Sperlich, S. (2005). Smooth backfitting in practice. Journal of the Royal Statistical Society: Series B, 67, 43-61.
[13] Opsomer, J. D. and Ruppert, D. (1997). Fitting a bivariate additive model by local polynomial regression. The Annals of Statistics, 25, 186-211.
[14] Opsomer, J. D. and Ruppert, D. (1998). A fully automated bandwidth selection method for fitting additive models. Journal of the American Statistical Association, 93, 605-619.
[15] Rousseeuw, P. J. and Ruts, I. (1998). Constructing the bivariate Tukey median. Statistica Sinica, 8, 827-839.
[16] Rousseeuw, P. J., Ruts, I. and Tukey, J. W. (1999). The bagplot: a bivariate boxplot. The American Statistician, 53, 382-387.
[17] Ruppert, D. and Wand, M. P. (1994). Multivariate locally weighted least squares regression. The Annals of Statistics, 22, 1346-1370.
[18] Schumann, E. (2009). Generating correlated uniform variates. Retrieved from http://comisef.wikidot.com/tutorial:correlateduniformvariates/
[19] Yu, K., Park, B. U. and Mammen, E. (2008). Smooth backfitting in generalized additive models. The Annals of Statistics, 36, 228-260.

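Full-text release date: 2022/06/18 (campus network)
Full-text release date: full text not authorized for public release (off-campus network)
Full-text release date: full text not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)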
