Graduate student: | 劉昱志 Liu, Yu Chih |
---|---|
Thesis title: | 辨識函數型應變數的重要隨機效應 Identifying important random effects for functional response data |
Advisor: | 鄭少為 Cheng, Shao-Wei |
Oral defense committee: | 曾勝滄 Tzeng, Sheng Tsang; 洪志真 Horng, Jyh Jen |
Degree: | 碩士 Master |
Department: | 理學院 - 統計學研究所 Institute of Statistics |
Year of publication: | 2016 |
Academic year of graduation: | 104 |
Language: | Chinese |
Number of pages: | 31 |
Keywords (Chinese): | 函數型基底, 函數型線性模型, 函數型主成分分析, 線性判別分析, 線性混合效應模型, 晶圓厚度剖面 |
Keywords (English): | functional basis, functional linear model, functional principal component analysis, linear discriminant analysis, linear mixed-effect model, wafer thickness profile |
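With the development of technology, collecting all kinds of data has become easier. Because variable values can be observed and recorded frequently, so-called functional data have emerged. Functional data refer to a variable $Y(t)$ whose value can be observed at a large number of different values of $t$ (where $t$ may be time or location). When the number of these $t$ values is large enough and $Y(t)$ is a smooth curve in $t$, one tends to analyze $Y(t)$ as a function of $t$; $Y(t)$ can then be called functional data, and data of this type usually have high dimensionality and high correlation. Besides being widely studied academically, functional data analysis has also been widely applied to data in many fields (such as industry, finance, and biology). This thesis discusses the functional linear model in which the response is functional and the explanatory variables are scalar. The effects of the explanatory variables on the response are treated as random effects, and each random effect corresponds to a known function of the explanatory variables and an unknown functional basis; the product of the three is one explanatory term in the functional linear model. The main research topic of this thesis is how to use data to identify the most important random effect(s) and to estimate the corresponding unknown functional bases. This thesis treats a functional basis as a projection direction and, using the idea of projection, first transforms the functional linear model into a mixed-effect model. For this mixed-effect model, based on the agreement between the test statistic for testing a random effect and the criterion for finding the projection direction in linear discriminant analysis, we propose a method for estimating the functional bases in the model. Because we assume that these bases must be mutually orthogonal, this thesis also proposes a sequential estimation method that starts from the most significant projection direction and, in order of the significance of the tests, identifies the functional bases of the corresponding random effects one by one.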
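This thesis also compares the functional bases obtained by this sequential estimation method with the bases obtained by functional principal component analysis and with the bases obtained by applying the concept of least squares under the functional linear model, examining the differences among the three. We also discuss how the model can be generalized and how this affects the estimation of the functional bases. Finally, we apply the method developed in this thesis to a functional data set of wafer thickness profiles to identify the most important random effects and estimate their corresponding functional bases.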
With recent advances in data-collection technology, it has become much easier to collect large amounts of data. When the values of a variable can be frequently observed and recorded at many different times or locations, so-called "functional data" emerge. Suppose that a variable $Y(t)$ can be observed at many different $t_i$'s (e.g., $t$ can be time or location). When the number of $t_i$'s is large and $Y(t)$ is a smooth curve in $t$, we prefer to regard the values $Y(t_i)$ as a realization of a random function of $t$ in the analysis. The collection of the values $Y(t_i)$ is then referred to as functional data. Functional data usually have high dimensionality and high correlations. Functional data analysis has been studied extensively in theory in recent decades and has been widely applied to data from many areas (such as industry, finance, and biology). In this thesis, we focus on the functional linear model with a functional response and several scalar predictors. In the model, the effects of the predictors on the response are treated as random effects, and every random effect corresponds to a known function of the predictors and an unknown functional basis. The product of the random effect, the known function of the predictors, and the unknown functional basis is an explanatory term in the model. We study the problems of identifying the most important random effects and estimating their corresponding functional bases. We treat a functional basis as a direction onto which the functional response can be projected. Using the idea of projection, we transform the functional linear model into several linear mixed-effect models (LMEMs), each corresponding to a random effect.
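As a reading aid, the model described above might be written as follows. The notation ($n$ curves indexed by $i$, $K$ random effects, and the symbols $b_{ik}$, $g_k$, $\phi_k$, $\varepsilon_i$) is assumed here for illustration and is not taken from the thesis; any mean function is omitted:

$$Y_i(t) = \sum_{k=1}^{K} b_{ik}\, g_k(\mathbf{x}_i)\, \phi_k(t) + \varepsilon_i(t), \qquad \int \phi_j(t)\,\phi_k(t)\,dt = \delta_{jk},$$

where $b_{ik}$ is a random effect, $g_k$ is a known function of the scalar predictors $\mathbf{x}_i$, $\phi_k$ is an unknown functional basis, and $\varepsilon_i(t)$ is an error process. Under the further assumption that the bases are normalized as well as orthogonal, projecting the response onto $\phi_k$ leaves a single random-effect term:

$$\int Y_i(t)\,\phi_k(t)\,dt = b_{ik}\, g_k(\mathbf{x}_i) + \int \varepsilon_i(t)\,\phi_k(t)\,dt,$$

which is consistent with obtaining one scalar mixed-effect model per random effect.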
Inspired by the connection between the statistic for testing a random effect in an LMEM and the criterion used in linear discriminant analysis to find the best discriminant direction, we propose a method for estimating the functional bases. Because the functional bases are assumed to be mutually orthogonal in the model, the method is modified to give a sequential procedure. In each step, the most significant of the remaining random effects is identified under the restriction that its functional basis must be orthogonal to the previously identified ones. We also compare the functional bases obtained by this procedure with the bases identified by functional principal component analysis and with the bases obtained by applying the concept of least squares to the functional linear model. We discuss a generalization of the functional linear model and its impact on the estimation of the functional bases. Finally, the method developed in the thesis is applied to a functional data set of wafer thickness profiles to identify the most important random effects and estimate their functional bases.
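The sequential idea can be illustrated with a small numerical sketch. It is not the thesis's estimator: it assumes the curves are discretized on a grid of $p$ points, that a "between" matrix `B` and a positive-definite "within" matrix `W` are already available, and that each step maximizes the discriminant-type ratio $v^\top B v / v^\top W v$ over directions orthogonal to those already chosen. The function name `sequential_orthogonal_directions` and the simulated matrices are hypothetical.

```python
import numpy as np


def sequential_orthogonal_directions(B, W, n_dirs):
    """Greedily pick unit directions maximizing v'Bv / v'Wv,
    each constrained to be orthogonal to the earlier ones."""
    p = B.shape[0]
    directions, ratios = [], []
    for _ in range(n_dirs):
        if directions:
            V = np.column_stack(directions)
            # Full QR: the trailing columns span the orthogonal complement of V.
            Q_full, _ = np.linalg.qr(V, mode="complete")
            Q = Q_full[:, V.shape[1]:]
        else:
            Q = np.eye(p)
        # Restrict both matrices to the admissible subspace.
        B_r = Q.T @ B @ Q
        W_r = Q.T @ W @ Q
        # Turn the generalized eigenproblem into a standard symmetric one
        # through the Cholesky factor W_r = L L'.
        L = np.linalg.cholesky(W_r)
        L_inv = np.linalg.inv(L)
        M = L_inv @ B_r @ L_inv.T
        M = (M + M.T) / 2.0  # remove round-off asymmetry
        _, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
        u = L_inv.T @ eigvecs[:, -1]  # maximizer in subspace coordinates
        v = Q @ u
        v = v / np.linalg.norm(v)
        directions.append(v)
        ratios.append(float(v @ B @ v) / float(v @ W @ v))
    return np.column_stack(directions), np.array(ratios)


# Simulated inputs (purely illustrative, not thesis data).
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
W = A @ A.T + 6.0 * np.eye(6)  # positive-definite "within" matrix
C = rng.standard_normal((6, 3))
B = C @ C.T                    # rank-3 "between" matrix
V, ratios = sequential_orthogonal_directions(B, W, 3)
```

The columns of `V` are mutually orthogonal by construction, and the ratios cannot increase from one step to the next because each step searches a smaller subspace. In a setting like the one described above, the ratio would be replaced by whatever test statistic is used to rank the random effects.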