
Author: 游翔 (Yu, Hsiang)
Thesis title: 訊息設限與測量誤差下之復發事件分析
Recurrent event data analysis with informative censoring and measurement error
Advisors: 鄭又仁 (Cheng, Yu-Jen)
王清雲 (Wang, Ching-Yun)
Oral defense committee: 黃冠華 (Huang, Guan-Hua)
邱燕楓 (Chiu, Yen-Feng)
江金倉 (Chiang, Chin-Tsang)
黃禮珊 (Huang, Li-Shan)
Degree: Doctoral
Department: College of Science - Institute of Statistics
Year of publication: 2017
Academic year of graduation: 105 (ROC calendar)
Language: English
Number of pages: 94
Chinese keywords: 復發事件資料, 訊息設限, 測量誤差
English keywords: Recurrent event data, Informative censoring, Measurement error
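  • Recurrent events are a common data type in long-term follow-up studies and clinical trials. In recurrent event data analysis, investigators are usually interested in the effects of covariates on the rate function of the recurrent event. Many statistical methods in the literature can be used to estimate covariate effects on the rate function, but most require independent censoring and accurately measured covariates. In real data analysis, however, the recurrent event process may be stopped by another event (for example, death), which violates the independent censoring assumption; we call this informative censoring. In addition, covariate measurements may be subject to measurement errors and need to be corrected. This dissertation proposes semiparametric estimation methods for regression analysis of covariates in recurrent event data under informative censoring and covariate measurement error. The dissertation has two parts. The first part studies estimation for a univariate recurrent event. We use a shared frailty model to account for the association between informative censoring and the recurrent event, as well as the correlation among events occurring in the same subject. Specifically, given the frailty variable, the recurrent events are assumed to follow a Poisson process whose intensity function is a shared frailty model, and the distribution of the frailty variable is left unspecified. Under the assumption that the covariates and the measurement errors are both normally distributed, we propose a regression calibration approach and a moment corrected approach to correct the bias that measurement error induces in the regression parameter estimates. Both are parametric correction methods and require replicated data to estimate the measurement error variance. In the second part, we extend the methods of the first part to multivariate recurrent event data, in which investigators are interested in two or more types of recurrent events simultaneously. We consider the setting in which every subject has an unbiased surrogate measurement but only a subset of subjects has an instrumental variable; neither replicated data nor validation data are available. We assume that the rate functions of the different types of recurrent events follow different shared frailty models, where the frailty variable describes the association between informative censoring and the recurrent events and the correlation among the different types of recurrent events. To correct for measurement error, we propose two non-parametric correction approaches for estimating the regression parameters. The first uses only the subsample for which the instrumental variable is available. To improve estimation efficiency, the second also incorporates the remaining subjects into the estimation. Unlike the first part, the methods in the second part require neither the Poisson process assumption nor distributional assumptions on the covariates and measurement errors. The frailty distribution is also left unspecified during estimation. In both parts, we establish large-sample theory for the proposed estimators and examine their performance through simulation studies. Finally, we apply the proposed methods to data from the Nutritional Prevention of Cancer trial, a double-blind trial of selenium and cancer prevention, to estimate the effect of selenium supplementation on preventing recurrences of squamous cell carcinoma and basal cell carcinoma.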


    Recurrent event data are frequently observed in longitudinal and clinical studies. Various methods have been proposed in the literature to analyze covariate effects on the occurrence rate of a recurrent event, yet these methods usually require independent censoring and accurately measured covariates. In many real data applications, however, informative censoring occurs when the recurrent event process is stopped by a terminal event that is related to the recurrent event (e.g. death). Additionally, the covariates may be measured with error and need to be corrected. In this doctoral dissertation, we develop semiparametric estimation methods that handle informative censoring and measurement error in recurrent event data. The dissertation consists of two parts. In the first part, we propose two approaches to estimate regression parameters for univariate recurrent event data in the presence of informative censoring and measurement error. Specifically, we impose a shared frailty model on the intensity function of a Poisson process to characterize the informative censoring and the dependence among events within a subject, without specifying the frailty distribution. To estimate the regression parameters, a regression calibration method and a moment corrected method are proposed to adjust for measurement error. Both methods are referred to as parametric corrections because they assume that the underlying covariates and error terms are normally distributed. Moreover, replicated data are needed to estimate the measurement error variance. In the second part, we extend the first part to accommodate informative censoring and measurement error in multivariate recurrent event data, in which more than one type of event is of interest. We consider a situation in which a surrogate is available for all subjects but an instrumental variable is observed only for a fraction of subjects. Neither replicated data nor a validation set is available. To formulate the dependence of the informative censoring on the recurrent event processes, a shared frailty model is imposed on the rate function for each type of recurrent event, where the frailty distribution is unspecified. The shared frailty model also characterizes the association among different types of recurrent events. For regression parameter estimation, we first construct a simple correction approach, in which only subjects with an observed instrumental variable are involved in the estimation. To improve on the efficiency of the simple correction estimator, we further develop a new correction approach that incorporates information from the whole cohort. Distinct from the approaches in the first part, the approaches in the second part require neither the assumption of a Poisson process nor distributional assumptions on the underlying covariates and measurement errors. The asymptotic properties of the four proposed estimators are established, and the performance of all proposed methods is investigated through simulation studies. We illustrate the proposed methods with data from the Nutritional Prevention of Cancer trial to assess the effect of plasma selenium supplement on recurrences of squamous cell carcinoma and basal cell carcinoma.
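    The regression calibration idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the dissertation's estimator: it only shows the generic calibration step under an assumed classical additive error model W_ij = X_i + U_ij with normally distributed X and U, where replicates are used to estimate the error variance and each subject's averaged measurement is shrunk toward the overall mean. The function name `regression_calibration` and the simulated values are hypothetical.

```python
import numpy as np


def regression_calibration(w):
    """Calibrate error-prone replicates (illustrative sketch only).

    w: (n, m) array with m >= 2 replicate measurements per subject,
       assumed to follow W_ij = X_i + U_ij with normal X and U.
    Returns the calibrated covariate E[X | W-bar], the estimated
    measurement error variance, and the estimated reliability ratio.
    """
    w = np.asarray(w, dtype=float)
    n, m = w.shape
    w_bar = w.mean(axis=1)

    # Pooled within-subject variance estimates Var(U).
    sigma_u2 = ((w - w_bar[:, None]) ** 2).sum() / (n * (m - 1))

    # Var(W-bar) = Var(X) + Var(U) / m, so subtract the error part.
    mu = w_bar.mean()
    sigma_x2 = max(w_bar.var(ddof=1) - sigma_u2 / m, 0.0)

    total = sigma_x2 + sigma_u2 / m
    reliability = sigma_x2 / total if total > 0 else 0.0

    # Best linear predictor of X given W-bar under normality.
    x_hat = mu + reliability * (w_bar - mu)
    return x_hat, sigma_u2, reliability


# Hypothetical simulated data: 2000 subjects, 2 replicates each.
rng = np.random.default_rng(0)
x_true = rng.normal(1.0, 1.0, size=2000)
w_obs = x_true[:, None] + rng.normal(0.0, 0.8, size=(2000, 2))
x_hat, sigma_u2_hat, reliability_hat = regression_calibration(w_obs)
print(f"error variance ~ {sigma_u2_hat:.3f}, reliability ~ {reliability_hat:.3f}")
```

    In a calibration analysis of this kind, the calibrated values would replace the unobserved covariate in the regression step; how that step is carried out under informative censoring is specific to the dissertation and is not reproduced here.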

    1 Introduction
    2 Literature reviews
        2.1 Modelling and statistical methods for recurrent event data
            2.1.1 Conditional model
            2.1.2 Marginal model
            2.1.3 Frailty model
        2.2 Methods for measurement errors in various models
            2.2.1 Generalized linear model
            2.2.2 Cox model
            2.2.3 Recurrent event model
    3 Parametric corrections for univariate recurrent event model with informative censoring and measurement error
        3.1 Introduction
        3.2 Model illustration
            3.2.1 Recurrent event model
            3.2.2 Measurement model
        3.3 Estimating methods
            3.3.1 Regression Calibration
            3.3.2 Moment corrected approach
        3.4 Simulation study
        3.5 Real data application
        3.6 Discussion
    4 Non-Parametric corrections for multivariate recurrent event model with informative censoring and measurement error
        4.1 Introduction
        4.2 Model illustration
            4.2.1 Multivariate recurrent event model
            4.2.2 Measurement error model
        4.3 Estimating methods
            4.3.1 Simple non-parametric correction method
            4.3.2 GMM non-parametric correction method
        4.4 Simulation study
        4.5 Real data application
        4.6 Discussion
    5 Conclusions and future works
    Appendix A Asymptotic properties of the RC and MC estimators
    Appendix B Proof of RC = MC for regression parameters
    Appendix C Asymptotic properties of the SNC and GNC estimators

