Graduate student: | 溫邦淳 WEN, BANG CHUN |
---|---|
Thesis title: | 利用Fused LASSO對倖存資料進行分析 Analysis of survival data with Fused LASSO |
Advisor: | 鄭又仁 Cheng, Yu Jen |
Oral defense committee: | 邱燕楓 Chiu, Yen Feng; 趙蓮菊 Chao, Anne |
Degree: | Master |
Department: | College of Science - Institute of Statistics |
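Year of publication: | 2015 |
Academic year of graduation: | 103 (ROC calendar) |
Language: | Chinese |
Number of pages: | 38 |
Keywords (Chinese): | penalty function, variable selection, variable grouping, survival analysis |
Keywords (English): | penalty function, Fused LASSO, variable grouping |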
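In this study, our aim is to perform estimation, variable selection, and variable grouping simultaneously in the Cox proportional hazards model. Tibshirani (1996) added an L1-norm penalty function to the objective function so that the estimated parameters are sparse, which achieves estimation and variable selection effectively at the same time. In traditional variable grouping methods, variables are usually grouped according to prior knowledge, and this way of grouping is often regarded as too subjective. In this study, we apply the approach of Tibshirani et al. (2005) to the partial likelihood function of the Cox proportional hazards model. The Fused LASSO penalty function focuses on the L1 norm of the parameters and of the differences between parameters: the L1 penalty on the parameters shrinks the parameter estimates and yields sparsity, while the L1 penalty on the parameter differences shrinks the differences between neighboring parameters, so that neighboring parameters receive the same estimate and the variables are thereby grouped. This data-driven approach is more objective, and it lets us estimate, select variables, and group variables simultaneously. In the simulations, we compare four models: LASSO, Generalized LASSO, Fused LASSO, and the ordinary Cox model, using them to compare the effects of these penalty functions, and we apply the method to the analysis of gene-locus data from lung cancer patients after adjuvant chemotherapy.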
In this work, our aims are to perform model selection, coefficient estimation, and variable grouping simultaneously in Cox’s proportional hazards model. Tibshirani (1996) added an L1-norm penalty function to the objective function to obtain sparse coefficient estimates, which is an efficient way to do model selection and coefficient estimation at the same time. In traditional variable grouping methods, variables are grouped based on prior knowledge, which is often judged too subjective. In this work, we apply the approach of Tibshirani et al. (2005) to the partial likelihood of the Cox model. The Fused LASSO penalty combines the L1 norm of the coefficients with the L1 norm of the differences between neighboring coefficients: the L1 penalty on the coefficients shrinks them to ensure sparse coefficient estimates, while the L1 penalty on the differences shrinks the differences between neighboring coefficients, so that variables are grouped in the sense of sharing the same coefficient estimate. This data-adaptive approach is more objective, and we can estimate, select, and group variables simultaneously. In our simulation, we consider three different cases: LASSO, generalized LASSO, and Fused LASSO, to compare the effects of the L1 penalty and the L1 penalty on differences, and we apply the method to the analysis of the Gene Signature for Adjuvant Chemotherapy in Resected Non–Small-Cell Lung Cancer data.
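The penalized objective described in the abstract can be sketched in a few lines. The following is only an illustrative sketch, not the thesis's implementation: it assumes a Lagrangian form with two tuning parameters (here called `lam1` and `lam2`, names chosen for this example), no tied event times, and an unscaled negative log partial likelihood. The function names are hypothetical.

```python
import numpy as np

def neg_log_partial_likelihood(beta, X, time, event):
    """Negative Cox log partial likelihood, assuming no tied event times."""
    eta = X @ beta
    # Sort by decreasing time so a running sum covers the risk set {j : t_j >= t_i}.
    order = np.argsort(-time)
    eta_sorted = eta[order]
    event_sorted = event[order]
    # log of the cumulative sum of exp(eta) over each risk set, computed stably.
    log_risk = np.logaddexp.accumulate(eta_sorted)
    # Only subjects with an observed event contribute a term.
    return -np.sum(event_sorted * (eta_sorted - log_risk))

def fused_lasso_penalty(beta, lam1, lam2):
    """L1 penalty on coefficients plus L1 penalty on neighboring differences."""
    sparsity = lam1 * np.sum(np.abs(beta))          # shrinks coefficients toward 0
    fusion = lam2 * np.sum(np.abs(np.diff(beta)))   # shrinks neighbors toward each other
    return sparsity + fusion

def fused_lasso_cox_objective(beta, X, time, event, lam1, lam2):
    """Objective to minimize over beta (assumed Lagrangian form)."""
    return (neg_log_partial_likelihood(beta, X, time, event)
            + fused_lasso_penalty(beta, lam1, lam2))

# Toy usage with made-up numbers (illustration only).
X = np.array([[0.5, 1.0], [1.5, -0.2], [-1.0, 0.3]])
time = np.array([2.0, 5.0, 3.0])
event = np.array([1, 0, 1])
value = fused_lasso_cox_objective(np.array([0.1, 0.1]), X, time, event, 0.5, 1.0)
```

Setting `lam2 = 0` leaves only the LASSO penalty, and `lam1 = lam2 = 0` gives the unpenalized Cox objective, which mirrors the model comparison described in the abstract. Because the penalty is not differentiable at zero, minimizing this objective needs a method suited to nonsmooth problems; the sketch only evaluates it.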
References
Breiman, L. (1996). Heuristics of instability and stabilization in model selection.
The Annals of Statistics 24, 2350–2383.
Chaturvedi, N., de Menezes, R. X., and Goeman, J. J. (2014). Fused lasso algorithm
for Cox proportional hazards and binomial logit models with application
to copy number profiles. Biometrical Journal 56, 477–492.
Cox, D. R. (1975). Partial likelihood. Biometrika 62, 269–276.
Efron, B., Hastie, T., Johnstone, I., and Tibshirani, R. (2004). Least angle
regression. The Annals of Statistics 32, 407–499.
Fan, J. and Li, R. (2002). Variable selection for Cox’s proportional hazards model
and frailty model. The Annals of Statistics 30, 74–99.
Fan, J. and Lv, J. (2008). Sure independence screening for ultrahigh dimensional
feature space. Journal of the Royal Statistical Society: Series B (Statistical
Methodology) 70, 849–911.
Friedman, J., Hastie, T., and Tibshirani, R. (2010). Regularization paths for
generalized linear models via coordinate descent. Journal of Statistical Software
33, 1–22.
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics
6, 461–464.
Simon, N., Friedman, J., Hastie, T., and Tibshirani, R. (2011). Regularization
paths for Cox proportional hazards model via coordinate descent. Journal of
Statistical Software 39, 1–13.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal
of the Royal Statistical Society. Series B (Methodological) 58, 267–288.
Tibshirani, R. (1997). The lasso method for variable selection in the Cox model.
Statistics in Medicine 16, 385–395.
Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., and Knight, K. (2005). Sparsity
and smoothness via the fused lasso. Journal of the Royal Statistical Society:
Series B (Statistical Methodology) 67, 91–108.
Tibshirani, R. J. (2011). The solution path of the generalized lasso. Technical
report, Stanford University.
van Houwelingen, H. C., Bruinsma, T., Hart, A. A., van’t Veer, L. J., and Wessels,
L. F. (2006). Cross-validated Cox regression on microarray gene expression data.
Statistics in Medicine 25, 3201–3216.
Yamaoka, K., Nakagawa, T., and Uno, T. (1978). Application of Akaike’s information
criterion (AIC) in the evaluation of linear pharmacokinetic equations.
Journal of Pharmacokinetics and Biopharmaceutics 6, 165–175.
Zhang, H. H. and Lu, W. (2007). Adaptive lasso for Cox proportional hazards
model. Biometrika 94, 691–703.
Zhu, C.-Q., Ding, K., Strumpf, D., Weir, B. A., Meyerson, M., Pennell, N.,
Thomas, R. K., Naoki, K., Ladd-Acosta, C., Liu, N., et al. (2010). Prognostic
and predictive gene signature for adjuvant chemotherapy in resected non–small-cell
lung cancer. Journal of Clinical Oncology 28, 4417–4424.
Zou, H. (2006). The adaptive lasso and its oracle properties. Journal of the
American Statistical Association 101, 1418–1429.