
Student: 楊承翰 (Yang, Cheng-Han)
Title: 利用梯度提升決策樹分析倖存資料 (Gradient Boosting Tree with Survival Data)
Advisor: 鄭又仁 (Cheng, Yu-Jen)
Committee: 黃冠華 (Huang, Guan-Hua), 邱燕楓 (Chiu, Yen-Feng)
Degree: Master's
Department: Institute of Statistics, College of Science
Publication Year: 2017
Graduation Academic Year: 105
Language: Chinese
Pages: 54
Chinese Keywords: 梯度決策樹, 比例風險模型, 個人化醫療, 因果推論
English Keywords: boosting tree, survival analysis, personalized medicine, causal inference
  • This thesis has two main goals. The first is to identify the appropriate treatment for each patient; the second is to estimate each patient's survival function under the optimal treatment decision. To achieve both, we propose the RAINBOW algorithm, built on the gradient boosting machine framework. Simulation results show that, with a suitable penalty term, the survival function and the treatment decision can be estimated accurately even when the number of covariates exceeds the sample size.
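As a rough illustration of the boosting framework the abstract refers to, the sketch below boosts the Cox partial likelihood with regression stumps in plain NumPy. This is not the thesis's RAINBOW algorithm (which adds a penalty term and targets treatment decisions); the stump learner, learning rate, iteration count, and simulated data are all assumptions made for the example.

```python
import numpy as np

def cox_negative_gradient(time, event, f):
    """Negative gradient of the Cox partial log-likelihood w.r.t. scores f
    (Breslow convention, assuming no tied event times)."""
    order = np.argsort(time)                         # sort by follow-up time
    rev = np.argsort(order)                          # map back to input order
    d, fs = event[order], f[order]
    risk = np.cumsum(np.exp(fs)[::-1])[::-1]         # risk-set sums at each time
    cumhaz = np.cumsum(np.where(d == 1, 1.0 / risk, 0.0))
    return (d - np.exp(fs) * cumhaz)[rev]            # martingale-type residuals

def fit_stump(X, r):
    """Depth-1 regression tree (stump) fit to residuals r by least squares."""
    best = (np.inf, 0, 0.0, r.mean(), r.mean())
    for j in range(X.shape[1]):
        for s in np.quantile(X[:, j], [0.25, 0.5, 0.75]):
            left = X[:, j] <= s
            if left.all() or not left.any():
                continue
            lm, rm = r[left].mean(), r[~left].mean()
            sse = ((r[left] - lm) ** 2).sum() + ((r[~left] - rm) ** 2).sum()
            if sse < best[0]:
                best = (sse, j, s, lm, rm)
    _, j, s, lm, rm = best
    return lambda Z: np.where(Z[:, j] <= s, lm, rm)

def boost_cox(X, time, event, n_iter=100, lr=0.1):
    """Gradient boosting for the Cox model: repeatedly fit a stump to the
    negative gradient and take a small step in function space."""
    f, stumps = np.zeros(len(time)), []
    for _ in range(n_iter):
        h = fit_stump(X, cox_negative_gradient(time, event, f))
        stumps.append(h)
        f += lr * h(X)
    return lambda Z: lr * sum(h(Z) for h in stumps)

# small simulated example: hazard increases with the first covariate
rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.normal(size=(n, p))
lp = 1.5 * X[:, 0]                                   # true log relative hazard
T = rng.exponential(1.0 / np.exp(lp))                # event times
C = rng.exponential(2.0, size=n)                     # censoring times
time, event = np.minimum(T, C), (T <= C).astype(int)

f_hat = boost_cox(X, time, event)(X)                 # fitted risk scores
```

A useful sanity check on the gradient: the martingale-type residuals sum to zero exactly, since the expected event counts accumulated over risk sets equal the observed number of events.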


    We consider the problem of identifying which patients should receive a treatment. To address it, we use gradient boosting to train Cox models, which involves estimating a survival function under treatment and under control for each patient. The difference between these survival functions is then used to decide which patients should be treated and to estimate the survival function under the optimal treatment regime, a rule that maps observed patient characteristics to a recommended treatment. Our simulations show good performance even when the number of covariates exceeds the sample size. As an illustration, we apply the proposed method to survival data from non-small cell lung cancer patients.
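The decision step described above, comparing treatment-specific survival functions, can be sketched with a Breslow-type plug-in estimate: a patient is recommended treatment when the predicted survival under treatment exceeds that under control at a landmark time. In this hedged sketch the arm-specific scores are the true linear predictors of a simulated model, standing in for boosted fits; the landmark time `t0`, the data-generating design, and the qualitative interaction in `X[:, 0]` are assumptions made for illustration only.

```python
import numpy as np

def breslow_survival(time, event, score, t0, new_score):
    """Breslow plug-in estimate of S(t0 | x) = exp(-H0(t0) * exp(new_score))
    under a Cox model with the given per-subject scores."""
    order = np.argsort(time)
    t, d, f = time[order], event[order], score[order]
    risk = np.cumsum(np.exp(f)[::-1])[::-1]      # risk-set sums at each time
    h0 = np.where(d == 1, 1.0 / risk, 0.0)       # baseline hazard increments
    H0 = h0[t <= t0].sum()                       # baseline cumulative hazard at t0
    return np.exp(-H0 * np.exp(new_score))

rng = np.random.default_rng(1)
n, p = 200, 5
X = rng.normal(size=(n, p))
A = rng.integers(0, 2, size=n)                   # randomized treatment indicator
# treatment lowers the hazard only when X[:, 0] > 0 (qualitative interaction)
effect = np.where(X[:, 0] > 0, -1.0, 0.5)
lp = 0.5 * X[:, 1] + np.where(A == 1, effect, 0.0)
T = rng.exponential(1.0 / np.exp(lp))            # event times
C = rng.exponential(2.0, size=n)                 # censoring times
time, event = np.minimum(T, C), (T <= C).astype(int)

# arm-specific risk scores (true linear predictors standing in for boosted fits)
s1 = 0.5 * X[:, 1] + effect                      # score if treated
s0 = 0.5 * X[:, 1]                               # score if untreated
t0 = np.median(time)                             # landmark time for the comparison
S1 = breslow_survival(time[A == 1], event[A == 1], s1[A == 1], t0, s1)
S0 = breslow_survival(time[A == 0], event[A == 0], s0[A == 0], t0, s0)
recommend = (S1 > S0).astype(int)                # treat when predicted survival is higher
```

Because the baseline hazards of the two arms are estimated separately, the rule compares full survival curves rather than raw risk scores; in this design the recommendation should largely agree with the sign of `X[:, 0]`.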

    Chapter 1  Introduction
    Chapter 2  Literature Review
    Chapter 3  Methodology
    Chapter 4  Simulation
    Chapter 5  Real Data Analysis
    Chapter 6  Conclusion
    References
    Appendices

    Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2):123–140.
    Breiman, L. (2001). Random forests. Machine Learning, 45(1):5–32.
    Breiman, L., Friedman, J., Stone, C. J., and Olshen, R. A. (1984). Classification and regression trees. CRC Press, New York.
    Cox, D. R. (1972). Regression models and life-tables. In Breakthroughs in statistics, pages 527–541. Springer.
    Cox, D. R. (1975). Partial likelihood. Biometrika, 62(2):269–276.
    Deng, H. and Runger, G. (2012). Feature selection via regularized trees. In Neural Networks (IJCNN), The 2012 International Joint Conference on, pages 1–8. IEEE.
    Dusseldorp, E. and Van Mechelen, I. (2014). Qualitative interaction trees: a tool to identify qualitative treatment–subgroup interactions. Statistics in Medicine, 33(2):219–237.
    Foster, J. C., Taylor, J. M., and Ruberg, S. J. (2011). Subgroup identification from randomized clinical trial data. Statistics in Medicine, 30(24):2867–2880.
    Friedman, J. H. (2001). Greedy function approximation: a gradient boosting machine. Annals of Statistics, 29(5):1189–1232.
    Friedman, J. H. and Fisher, N. I. (1999). Bump hunting in high-dimensional data. Statistics and Computing, 9(2):123–143.
    Guyon, I. and Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of Machine Learning Research, 3(Mar):1157–1182.
    Guyon, I., Weston, J., Barnhill, S., and Vapnik, V. (2002). Gene selection for cancer classification using support vector machines. Machine Learning, 46(1):389–422.
    Ho, T. K. (1998). The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(8):832–844.
    Li, H. and Luan, Y. (2005). Boosting proportional hazards models using smoothing splines, with applications to high-dimensional microarray data. Bioinformatics, 21(10):2403–2409.
    Li, P., Burges, C. J., Wu, Q., Platt, J., Koller, D., Singer, Y., and Roweis, S. (2007). McRank: Learning to rank using multiple classification and gradient boosting. In NIPS, volume 7, pages 845–852.
    Liang, H. and Zou, G. (2008). Improved AIC selection strategy for survival analysis. Computational Statistics & Data Analysis, 52(5):2538–2548.
    Lipkovich, I., Dmitrienko, A., Denne, J., and Enas, G. (2011). Subgroup identification based on differential effect search—a recursive partitioning method for establishing response to treatment in patient subpopulations. Statistics in Medicine, 30(21):2601–2621.
    Su, X., Tsai, C.-L., Wang, H., Nickerson, D. M., and Li, B. (2009). Subgroup analysis via recursive partitioning. Journal of Machine Learning Research, 10(Feb):141–158.
    Su, X., Zhou, T., Yan, X., Fan, J., and Yang, S. (2008). Interaction trees with censored survival data. The International Journal of Biostatistics, 4(1).
    Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B (Methodological), 58(1):267–288.
    Tutz, G. and Binder, H. (2006). Generalized additive modeling with implicit variable selection by likelihood-based boosting. Biometrics, 62(4):961–971.
    Weisberg, S. (2005). Applied linear regression, volume 528. John Wiley & Sons.
    Zhang, B., Tsiatis, A. A., Davidian, M., Zhang, M., and Laber, E. (2012). Estimating optimal treatment regimes from a classification perspective. Stat, 1(1):103–114.
