
Author: Lu, Wen-Hsiang (呂文翔)
Thesis title: Analyzing Three Stage Principal Stratification Causal Effects via Greedy Variable Selection for High-Dimensional Logistic Regression Models (透過高維度邏輯斯迴歸模型的貪婪變數選取分析三階段主分層之因果效應)
Advisor: Ing, Ching-Kang (銀慶剛)
Committee members: Yu, Shu-Hui (俞淑惠); Chiou, Hai-Tang (邱海唐)
Degree: Master
Department: College of Science - Institute of Statistics
Year of publication: 2023
Academic year of graduation: 111
Language: Chinese
Number of pages: 36
Keywords: Model selection, Principal stratification causal inference, Multiply robust estimator, Information criterion
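  • In medicine and the social sciences, many studies consider not only the causal effect of the treatment on the outcome but also, increasingly, the causal effect of an intermediate variable on the outcome. Principal stratification is a framework for handling the causal effect of an intermediate variable: it defines the joint potential values of the intermediate variable, partitions the population into subgroups accordingly, and computes the causal effect within each subgroup.

    However, rapid advances in technology have left us facing high-dimensional data, which makes variable selection necessary. This thesis focuses on principal stratification causal analysis in high dimensions. For high-dimensional data, we first apply the three-stage model selection method CGA+HDIC+Trim to each of three models to select the relevant variables. We then let three indices, accuracy, G-mean, and the area under the curve, vote; a higher vote count indicates a better model selection result. The model chosen this way is used as the final model and plugged into the triply robust (multiply robust) estimator to estimate the principal stratification causal effect. To verify the effectiveness of the proposed method, we provide several simulations and one real data set.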


    In medicine and sociology, many studies are concerned not only with the average treatment effect on the outcome but also with the underlying mechanism that operates through an intermediate variable. Principal stratification handles an intermediate variable by targeting subgroup causal effects within the principal strata, which are defined by the joint potential values of that variable. With the rapid development of science and technology, however, high-dimensional data have become unavoidable. This thesis therefore develops a principal stratification causal inference procedure for high-dimensional data. We first use a three-stage model selection method (CGA+HDIC+Trim) to choose relevant variables. We then use three indices, accuracy, G-mean, and the area under the curve, to evaluate whether the model selection result is satisfactory. Next, we use a multiply robust estimator to estimate the principal stratification causal effect. Finally, to support the usefulness of the proposed method, we provide simulations and an application to longitudinal survey data.
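The abstract mentions scoring model selection results with accuracy, G-mean, and the area under the curve. The sketch below is one possible reading of that step, not the thesis's actual procedure: the record does not say which candidate models are compared, how ties are handled, or what classification threshold is used. The names `vote`, `model_a`, `model_b`, the 0.5 threshold, and the toy data are all hypothetical.

```python
from math import sqrt


def _classify(y_prob, threshold=0.5):
    # Assumption: a fixed 0.5 cut-off turns probabilities into class labels.
    return [1 if p >= threshold else 0 for p in y_prob]


def accuracy(y_true, y_prob):
    preds = _classify(y_prob)
    return sum(int(p == y) for p, y in zip(preds, y_true)) / len(y_true)


def g_mean(y_true, y_prob):
    # Geometric mean of sensitivity and specificity.
    # Assumes both classes are present in y_true.
    preds = _classify(y_prob)
    tp = sum(1 for p, y in zip(preds, y_true) if p == 1 and y == 1)
    tn = sum(1 for p, y in zip(preds, y_true) if p == 0 and y == 0)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    return sqrt((tp / n_pos) * (tn / n_neg))


def auc(y_true, y_prob):
    # Rank-based AUC: the share of (positive, negative) pairs in which the
    # positive case gets the higher score; ties count as one half.
    pos = [p for p, y in zip(y_prob, y_true) if y == 1]
    neg = [p for p, y in zip(y_prob, y_true) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def vote(y_true, candidates):
    # Hypothetical voting rule: each index gives one vote to every candidate
    # that attains its best score; the candidate with the most votes wins
    # (ties in the vote count go to the first candidate listed).
    votes = {name: 0 for name in candidates}
    for metric in (accuracy, g_mean, auc):
        scores = {name: metric(y_true, probs) for name, probs in candidates.items()}
        best = max(scores.values())
        for name, score in scores.items():
            if score == best:
                votes[name] += 1
    return max(votes, key=votes.get), votes


# Toy data: binary labels and fitted probabilities from two hypothetical models.
y_true = [1, 1, 1, 0, 0, 0]
candidates = {
    "model_a": [0.9, 0.8, 0.4, 0.3, 0.2, 0.1],
    "model_b": [0.9, 0.6, 0.3, 0.7, 0.4, 0.2],
}
winner, votes = vote(y_true, candidates)
print(winner, votes)  # model_a {'model_a': 3, 'model_b': 0}
```

In practice the three indices would be computed on held-out data for each fitted logistic regression model, and the winning model would then feed the multiply robust estimator.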

    1 Introduction (p. 1)
    2 Basic Assumptions of Principal Stratification Analysis (p. 3)
      2.1 Basic assumptions (p. 4)
    3 Principal Stratification Causal Effects and Model Selection (p. 6)
      3.1 Multiply robust estimator (p. 6)
      3.2 Three-stage Chebyshev greedy algorithm (p. 10)
      3.3 High-dimensional principal stratification causal effects (p. 12)
    4 Simulation Studies (p. 15)
      4.1 Simulation 1 (p. 16)
        4.1.1 (p. 16)
        4.1.2 (p. 17)
        4.1.3 (p. 20)
      4.2 Simulation 2 (p. 22)
        4.2.1 (p. 22)
        4.2.2 (p. 23)
        4.2.3 (p. 24)
      4.3 Simulation 3 (p. 27)
        4.3.1 (p. 27)
        4.3.2 (p. 29)
        4.3.3 (p. 30)
    5 Real Data Analysis (p. 33)
    6 Conclusions and Suggestions (p. 35)
    References (p. 36)

    Chen, Y.-L., Dai, C.-S., and Ing, C.-K. (2019). High-dimensional model selection
    via Chebyshev greedy algorithms. Working paper.
    Ding, P. and Lu, J. (2017). Principal stratification analysis using principal scores.
    Journal of the Royal Statistical Society: Series B (Statistical Methodology),
    79(3):757–777.
    Frangakis, C. E. and Rubin, D. B. (2002). Principal stratification in causal inference.
    Biometrics, 58(1):21–29.
    Gilbert, P. B. and Hudgens, M. G. (2008). Evaluating candidate principal surrogate
    endpoints. Biometrics, 64(4):1146–1154.
    Ing, C.-K. and Lai, T. L. (2011). A stepwise regression method and consistent
    model selection for high-dimensional sparse linear models. Statistica Sinica,
    pages 1473–1513.
    Jiang, Z., Ding, P., and Geng, Z. (2016). Principal causal effect identification and
    surrogate end point evaluation by multiple trials. Journal of the Royal Statistical
    Society: Series B (Statistical Methodology), 78(4):829–848.
    Jiang, Z., Yang, S., and Ding, P. (2020). Multiply robust estimation of causal
    effects under principal ignorability. arXiv preprint arXiv:2012.01615.
    Ning, Y., Peng, S., and Imai, K. (2020). Robust estimation of causal effects via a
    high-dimensional covariate balancing propensity score. Biometrika, 107(3):533–554.
    Rubin, D. B. (2006). Causal inference through potential outcomes and principal
    stratification: application to studies with “censoring” due to death. Statistical
    Science, pages 299–309.
