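Graduate student: 劉采渝
Liu, Tsai-Yu
Thesis title: 基於多中心電子病歷資料聯邦圖模型用於死亡預測
Federated Graph Learning for Mortality Prediction Using Multi-site EHR Data
Advisors: 郭柏志
Kuo, Po-Chih
陳博現
Chen, Bor-Sen
Oral defense committee: 周志遠
Chou, Zhi-Yuan
曾意儒
Tseng, Yi-Ju
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2023
Academic year of graduation: 111
Language: English
Number of pages: 30
Keywords (translated from Chinese): distributed learning systems, machine learning fairness, transformer, electronic health records, model privacy
Keywords (English): distributed learning, disparity, transformer, EHR data, model privacy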
    Many studies on distributed learning have appeared in recent years, driven by advances in hardware. An important issue in distributed learning is privacy. To the best of our knowledge, we are the first to use information distilled from local models to jointly train a distributed system. That is, during the joint training phase we exchange neither model weights nor gradients, which provides stronger privacy protection. Part 1 introduces our motivation and our work in detail; Part 2 reviews related work and compares it with ours; Part 3 describes our method in detail in mathematical form; Part 4 tests our method on real-world data.
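    The idea of sharing distilled information instead of weights or gradients can be illustrated with a minimal sketch. This is not the thesis's algorithm: the graph representation (edge strengths between medical codes), the sample-size weighting, and all names such as `aggregate_graphs` and the example codes are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: (source code, target code) -> edge strength in [0, 1].
Graph = Dict[Tuple[str, str], float]


def aggregate_graphs(site_graphs: List[Graph], site_sizes: List[int]) -> Graph:
    """Combine per-site graphs into one global graph.

    Each site contributes only its distilled edge strengths and its
    sample count; no model weights or gradients are passed in.
    An edge missing at a site is treated as strength 0 for that site.
    """
    if len(site_graphs) != len(site_sizes):
        raise ValueError("need exactly one sample count per site graph")
    total = sum(site_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    merged: Graph = {}
    for graph, size in zip(site_graphs, site_sizes):
        for edge, strength in graph.items():
            # Sample-size-weighted average (an assumed weighting scheme).
            merged[edge] = merged.get(edge, 0.0) + strength * size / total
    return merged


# Illustrative, made-up inputs for two hospitals.
site_a: Graph = {
    ("dx:sepsis", "rx:antibiotic"): 0.75,
    ("dx:sepsis", "lab:lactate"): 0.5,
}
site_b: Graph = {("dx:sepsis", "rx:antibiotic"): 0.25}

global_graph = aggregate_graphs([site_a, site_b], [100, 300])
for edge, strength in sorted(global_graph.items()):
    print(edge, strength)
```

    In such a scheme the global graph would be sent back to each site, which could then retrain its local model using it. Whether distilled graph information reveals less than weights or gradients depends on what is distilled, so this sketch shows only the communication pattern and is not a privacy guarantee.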

    Contents
      Abstract (Chinese)
      Acknowledgements (Chinese)
      Abstract
      Acknowledgements
      Contents
      List of Figures
      List of Tables
      List of Algorithms
      1 Introduction
      2 Related Work
        2.1 Decentralized data federated machine learning
        2.2 Differential privacy
        2.3 Homomorphic encryption
        2.4 Graph Model Privacy
        2.5 Graph on Medical
        2.6 Graph Convolutional Transformer
        2.7 Fairness and Debiasing
      3 Methodology
        3.1 Graph Convolutional Transformer
        3.2 Medical Graph Distillation
        3.3 Federated Graph Aggregation
        3.4 Retrain with Global Graph Information
      4 Experiment
        4.1 eICU Collaborative Research Dataset
        4.2 Model Types
        4.3 Prediction Tasks
        4.4 Model Fairness Test
      5 Results
        5.1 Discussion
        5.2 Attention Behavior Visualization and Discussion
        5.3 Conclusion
      Bibliography

    List of Figures
      3.1 Illustration of medical graph
      3.2 System Overview
      3.3 Illustration of GCT in detail
      3.4 Illustration of Prior conditional probability
      5.1 midwest train 102
      5.2 midwest train 33

    List of Tables
      5.1 Mortality Test Result: AUC-ROC
      5.2 Mortality Test Result: AUC-PR
      5.3 Replaced by Mask: Mortality Test, age TPR
      5.4 Train independently: Mortality Test, age TPR
      5.5 Replaced by Mask: Mortality Test, gender TPR
      5.6 Train independently: Mortality Test, gender TPR
      5.7 Replaced by Mask: Mortality Test, race TPR
      5.8 Train independently: Mortality Test, race TPR

    List of Algorithms
      1 medical graph distillation

