Graduate student: | Yang, Sheng-Tao (楊聲濤) |
---|---|
Thesis title: | Variable selection in Cox proportional hazards model for prevalent survival data (在Cox比例風險模式之下對盛行倖存資料進行模式選擇) |
Advisor: | Cheng, Yu-Jen (鄭又仁) |
Oral defense committee: | Chiu, Yen-Feng (邱燕楓); Cheng, Yu-Jen (鄭又仁); Chao, Lien-Ju (趙蓮菊) |
Degree: | Master |
Department: | College of Science - Institute of Statistics |
Year of publication: | 2015 |
Academic year of graduation: | 103 |
Language: | Chinese |
Number of pages: | 52 |
Keywords (Chinese): | Cox比例風險模式, 模式選擇, 右設限, 左截斷 |
Keywords (English): | Cox proportional hazards model, variable selection, right censoring, left truncation |
We study variable selection in the Cox proportional hazards model for prevalent survival data. Two challenges arise. First, many covariates are collected, both continuous and discrete, and the goal is to select only a small number of important risk factors among them. Second, prevalent survival data are obtained through a biased sampling scheme. We propose a method that performs variable selection and parameter estimation simultaneously while also correcting the bias induced by the sampling scheme. In addition, the method accommodates different penalty functions and applies to both continuous and discrete covariates. Simulation results show that the proposed procedure is stable and selects the correct model. Finally, we apply the method to a real data set on female breast cancer.
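To make the setting concrete, the following is a minimal sketch of one standard formulation, not necessarily the estimator developed in the thesis. It assumes that left truncation is handled by delayed-entry risk sets (a subject is at risk at time t only if it entered before t and is still under observation at t), that the penalty is the lasso, and that the fit uses plain proximal gradient descent with a fixed step size on standardized covariates. All function names and the defaults `step` and `n_iter` are hypothetical choices made for illustration.

```python
import numpy as np


def risk_matrix(entry, time):
    # R[i, j] is True when subject j is at risk at subject i's observed time:
    # j has already entered (entry_j < time_i) and is still followed (time_j >= time_i).
    return (entry[None, :] < time[:, None]) & (time[None, :] >= time[:, None])


def neg_log_partial_likelihood(beta, X, entry, time, event):
    # Averaged negative log partial likelihood with delayed-entry risk sets.
    # Assumes entry_i < time_i for every subject, so each risk set is non-empty.
    eta = X @ beta
    eta = eta - eta.max()  # shift for numerical stability; the value is unchanged
    weights = risk_matrix(entry, time) * np.exp(eta)[None, :]
    log_denom = np.log(weights.sum(axis=1))
    return -np.sum(event * (eta - log_denom)) / len(time)


def gradient(beta, X, entry, time, event):
    # Gradient of neg_log_partial_likelihood with respect to beta.
    eta = X @ beta
    weights = risk_matrix(entry, time) * np.exp(eta - eta.max())[None, :]
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Row i of (weights @ X) is the weighted covariate mean over the risk set at time_i.
    return -(event @ (X - weights @ X)) / len(time)


def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1; sets small coordinates exactly to zero.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def fit_lasso_cox(X, entry, time, event, lam, step=0.1, n_iter=500):
    # Proximal gradient (ISTA) iterations; the fixed step size is an assumption
    # and may need to be reduced for poorly scaled covariates.
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        g = gradient(beta, X, entry, time, event)
        beta = soft_threshold(beta - step * g, step * lam)
    return beta
```

Here `entry` holds the truncation (study entry) times, `time` the observed failure or censoring times, and `event` a 0/1 float indicator of failure. Covariates whose coefficients come back exactly zero from `fit_lasso_cox` are the ones dropped from the model. Other penalties, such as SCAD, MCP, or group penalties for categorical covariates, would replace `soft_threshold` with the corresponding proximal step and are not shown.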