
Graduate Student: Hsu, Te-Cheng (許德丞)
Thesis Title: A Robust Deep Learning Cancer Prognosis Prediction Framework (強健深度學習癌症預後預測架構)
Advisors: Lin, Che (林澤); Ueng, Yeong-Luh (翁詠祿)
Committee Members: Chen, Chien-Yu (陳倩瑜); Wang, Wei-Chung (王偉仲); Guo, Po-Chih (郭柏志); Lee, Chi-Chun (李祈均)
Degree: Doctoral
Department: College of Electrical Engineering and Computer Science - Institute of Communications Engineering
Year of Publication: 2023
Graduating Academic Year: 111 (2022-2023)
Language: English
Pages: 149
Keywords: semi-supervised learning, multimodal learning, bioinformatics, cancer prognosis, deep learning
Chinese Abstract (translated):

    Cancer has long been one of the leading causes of death worldwide; its complex etiology and biological interaction mechanisms make fast and accurate cancer risk prediction models hard to build. Recent studies have shown that deep learning models, given large amounts of labeled patient data, can be applied to cancer prognosis prediction and thus help physicians design follow-up treatment for cancer patients. However, applying deep learning techniques to biomedical data raises many practical problems, because such data often has missing values, high dimensionality, and small sample sizes. The unstable predictions these problems cause often force researchers to fall back on simpler machine learning models; without a purpose-built feature selection algorithm, powerful deep learning models have no room to shine.

    To address these problems, we propose a robust Semi-supervised Cancer prognosis classifier with bAyesian Variational AutoeNcoder (SCAN). SCAN combines the strengths of several of the models above and uses semi-supervised learning to make full use of precious unlabeled patient data. Based on the cancer prognostic biomarkers selected by a systems biology feature selector, we improve the accuracy of five-year prognosis prediction models for breast cancer and non-small cell lung cancer. To avoid over-fitting, we incorporate a Bayesian deep learning variational autoencoder, allowing SCAN to freely learn representations from large amounts of unlabeled patient data to aid classification; we further apply variational dropout regularization to reduce model complexity. Moreover, SCAN's final prediction is produced by a majority vote of two subnetwork classifiers. These designs make SCAN extensible to more data types and able to exploit abundant unlabeled patient data to improve model stability.

    By introducing large amounts of unlabeled patient data through semi-supervised learning, SCAN greatly improves on earlier models. Experimental results show that its accuracy surpasses our previously proposed bimodal neural network classifier (Bimodal) and several machine learning baselines, including the support vector machine (SVM) and random forest (RF) classifiers. SCAN achieved AUROCs of 81.73% / 80.46% when predicting five-year breast cancer / non-small cell lung cancer prognosis, beating Bimodal (breast cancer: 77.71%; NSCLC: 78.67%). Results on the independent validation test sets further show that SCAN delivers the most robust predictions of all models: its AUROCs for breast cancer / NSCLC (74.74% / 72.80%) far exceed Bimodal's (64.13% / 67.07%) and also surpass SVM and RF. SCAN's lightweight and stable design is a major advantage for biomedical applications with relatively small data. Furthermore, SCAN is a highly flexible deep learning architecture that can in the future be applied to multiple cancer types and arbitrary combinations of data sources, training cancer prognosis prediction models with large amounts of unlabeled patient data. Given these design advantages, we believe SCAN can serve as an effective integrative model framework for early cancer screening and personalized medicine.


Abstract:

    Cancer is one of the leading causes of mortality worldwide. Given the complex biomedical interactions among heterogeneous data sources, fast and accurate cancer prognosis stratification models are hard to build yet essential for treatment design recommendations. Deep learning models have recently been shown to provide strong predictive power when supported by large amounts of labeled patient data. However, numerous challenges arise when applying them to biomedical data, such as missing data, high dimensionality, and limited data sizes. These problems make predictions non-robust and biased, preventing researchers from using more advanced deep learning approaches for cancer prognosis prediction; advanced feature selection approaches are thus necessary to avoid over-fitting.

    To address these problems, we propose a robust Semi-supervised Cancer prognosis classifier with bAyesian variational autoeNcoder (SCAN), a structured deep learning framework for cancer prognosis prediction. SCAN takes full advantage of large amounts of unlabeled patient data through semi-supervised learning techniques. Based on the small sets of prognostic biomarkers chosen by our systems biology feature selector, together with the corresponding clinical data, we can precisely infer five-year disease-specific survival (DSS) for breast cancer patients and overall survival (OS) for non-small cell lung cancer (NSCLC) patients. To avoid the over-fitting that comes with small data sizes, we leverage generative models from Bayesian deep learning (BDL) together with large unlabeled patient data and use a full variational Bayes variational autoencoder (Full-VB VAE) to learn data representations that significantly help SCAN's classification.
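The Full-VB VAE mentioned above builds on standard Gaussian VAE machinery. As an illustration only (not SCAN's actual implementation), the following minimal numpy sketch shows the two ingredients such a VAE optimizes: the reparameterization trick for sampling latent codes and the KL regularizer toward a standard-normal prior.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    # Sample z = mu + sigma * eps with eps ~ N(0, I): the reparameterization
    # trick that lets gradients flow through the sampling step.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    # KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior, per sample.
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)

# Toy encoder outputs: a batch of 4 patients, 2 latent dimensions.
mu = np.zeros((4, 2))
log_var = np.zeros((4, 2))  # log-variance 0 -> posterior equals the prior

z = reparameterize(mu, log_var)
kl = kl_to_standard_normal(mu, log_var)

print(z.shape)               # (4, 2)
print(np.allclose(kl, 0.0))  # True: zero KL when q(z|x) matches the prior
```

In a trained model the encoder would emit per-patient `mu` and `log_var` from microarray features, and the KL term would be added to the reconstruction loss; the zero-filled batch here is just a sanity check.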

    Supported by large unlabeled patient data through semi-supervised and ensemble learning with a majority vote, SCAN achieved significantly better and more robust predictions than all benchmark models, including our previously proposed bimodal neural network (Bimodal), support vector machine (SVM), and random forest (RF) classifiers. In terms of the area under the receiver operating characteristic curve (AUROC), SCAN (81.73% for breast cancer; 80.46% for NSCLC) outperformed Bimodal (77.71% for breast cancer; 78.67% for NSCLC), and SCAN showed further advantages over Bimodal in robustness and scalability across various analyses. Independent validation results further showed that SCAN achieves better AUROC scores (74.74% for breast cancer; 72.80% for NSCLC) than Bimodal (64.13% for breast cancer; 67.07% for NSCLC), SVM, and RF. Such robustness is essential when only small labeled patient datasets are available. We designed the model's final prediction as the majority vote of the microarray and clinical subnetwork classifiers, which makes SCAN a scalable and robust cancer prognosis classifier with the potential to be applied to other cancer types and additional data sources. With its remarkable capacity, SCAN is a promising tool for early cancer risk screening, laying a foundation for personalized medicine applications.
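The majority-vote combination of the microarray and clinical subnetwork outputs can be sketched as a two-voter scheme. Note this is a hedged illustration: the abstract does not say how a 1-1 tie between the two subnetworks is resolved, so falling back to the mean probability is an assumption, and the probability values below are made up.

```python
import numpy as np

def ensemble_predict(p_microarray, p_clinical, threshold=0.5):
    # Each subnetwork votes high-risk (1) when its probability reaches the
    # threshold. Two agreeing votes decide the class; a 1-1 tie falls back
    # to thresholding the mean probability (illustrative assumption).
    p_m = np.asarray(p_microarray)
    p_c = np.asarray(p_clinical)
    votes = (p_m >= threshold).astype(int) + (p_c >= threshold).astype(int)
    tie_break = ((p_m + p_c) / 2.0 >= threshold).astype(int)
    return np.where(votes == 2, 1, np.where(votes == 0, 0, tie_break))

p_micro = np.array([0.9, 0.2, 0.7])  # hypothetical microarray subnetwork outputs
p_clin = np.array([0.8, 0.1, 0.3])   # hypothetical clinical subnetwork outputs
preds = ensemble_predict(p_micro, p_clin)
print(preds)  # [1 0 1]
```

The third patient shows the tie case: the microarray subnetwork votes high-risk, the clinical one does not, and the mean probability of exactly 0.5 resolves the tie to the positive class.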

Table of Contents:

Acknowledgments
Chinese Abstract (摘要)
Abstract
1 Introduction
  1.1 Motivations
  1.2 Highlights
  1.3 Dissertation structure
2 Materials and methods
  2.1 Deep learning basics
  2.2 Performance evaluation
    2.2.1 Receiver operating characteristic curve (ROC)
    2.2.2 Youden index
    2.2.3 Precision-recall curve (PRC)
    2.2.4 F1-score
    2.2.5 Survival analysis
  2.3 Systems biology feature selector
  2.4 Bimodal neural network classifier
  2.5 Structured probabilistic modeling
    2.5.1 Probabilistic graphical models (PGMs) for model structure
    2.5.2 Inference on structured probabilistic models
    2.5.3 Classic Bayesian inference
    2.5.4 Variational (approximate) inference
    2.5.5 Variational autoencoders (VAEs)
    2.5.6 Generative adversarial networks (GANs)
  2.6 Bayesian deep learning (BDL)
    2.6.1 Key features of Bayesian deep learning
    2.6.2 Model evaluation
    2.6.3 Variational dropout
  2.7 Bootstrap confidence intervals
  2.8 Ensemble learning
  2.9 Explainable AI (XAI)
    2.9.1 Connection weights algorithm
    2.9.2 Partial dependence plots
3 Semi-supervised Cancer prognosis classifier with bAyesian variational autoeNcoder (SCAN)
  3.1 Motivations and design ideas for SCAN
  3.2 SCAN architecture
    3.2.1 Robust model design
    3.2.2 Four patient types
    3.2.3 Model configurations
    3.2.4 Subnetwork classifiers
    3.2.5 Semi-supervised learning variational autoencoder (SSL-VAE)
    3.2.6 Full variational Bayes variational autoencoder (Full-VB VAE)
  3.3 Overall framework
4 Case study: breast cancer disease-specific survival (DSS) prediction with SCAN
  4.1 Breast cancer cohort - the METABRIC cohort
    4.1.1 Disease-specific survival (DSS)
    4.1.2 Dataset summary
  4.2 Primary results
    4.2.1 Experimental setups
    4.2.2 Model prediction performance
    4.2.3 Receiver operating characteristics curves (ROCs) and precision-recall curves (PRCs)
    4.2.4 Survival analysis
    4.2.5 Predicted risk versus survival time and prognosis class
  4.3 External validations
    4.3.1 External validation test set (microarray)
    4.3.2 The cancer genome atlas breast invasive carcinoma (TCGA-BRCA) collection
  4.4 Feature importance
    4.4.1 Connection weights algorithm
    4.4.2 Partial dependence plot (PDP)
    4.4.3 Kolmogorov-Smirnov test (K-S test)
  4.5 Ablation studies
    4.5.1 Removing unlabeled data
    4.5.2 Less labeled data
    4.5.3 Duplicated unlabeled data
  4.6 Case study summary
5 Case study: non-small cell lung cancer (NSCLC) overall survival (OS) prediction with SCAN
  5.1 NSCLC cohort - NCBI GEO datasets
    5.1.1 Overall survival (OS)
    5.1.2 Dataset summary
  5.2 Primary results
    5.2.1 Experimental setups
    5.2.2 Model prediction performance
    5.2.3 Receiver operating characteristics curves (ROCs) and precision-recall curves (PRCs)
    5.2.4 Survival analysis
    5.2.5 Predicted risk versus survival time and prognosis class
  5.3 External validations
    5.3.1 External validation test set (microarray)
    5.3.2 The cancer genome atlas lung adenocarcinoma (TCGA-LUAD) collection
  5.4 Feature importance
    5.4.1 Connection weights algorithm
    5.4.2 Partial dependence plot (PDP)
    5.4.3 Kolmogorov-Smirnov test (K-S test)
  5.5 Ablation studies
    5.5.1 Removing unlabeled data
    5.5.2 Less labeled data
    5.5.3 Duplicated unlabeled data
    5.5.4 Limitation of SCAN
  5.6 Case study summary
6 Discussions and future works
  6.1 VAE latent representation superposition
  6.2 Label information is more important at decoder inputs
  6.3 SCAN is robust to model initialization
  6.4 SCAN improves with increasing unlabeled data
  6.5 Generalization of SCAN to multiple heterogeneous data sources
  6.6 Further analyses with the TCGA datasets
  6.7 Model design for combining predictions from two subnetwork classifiers
    6.7.1 The infeasibility of using two consecutive sigmoid activation layers
    6.7.2 Objective function design for learning weights
    6.7.3 Multimodal learning framework is powerful
  6.8 Comparison between different sets of biomarkers
  6.9 Federated learning
  6.10 Other future research directions
7 Conclusion
Appendix A Single biomarker area under the receiver operating characteristics curve (AUROC) score for biomarker selection
  A.1 HCC microarray datasets
  A.2 Single biomarker AUROC score
  A.3 Biomarker selection with single biomarker AUROC scores
  A.4 HCC prediction with selected biomarkers
    A.4.1 Experimental setups
    A.4.2 HCC prediction results
Appendix B Ensemble systems biology feature selection and bimodal deep neural network for breast cancer five-year DSS prediction
  B.1 Hybrid ensemble systems biology biomarker selection
    B.1.1 Overview and motivation for hybrid ensemble learning
    B.1.2 Function-perturbation ensemble approach
    B.1.3 Data-perturbation ensemble approach
    B.1.4 Hybrid ensemble approach
    B.1.5 Gene feature selection frequency curves
  B.2 Breast cancer five-year DSS prediction with bimodal NN
    B.2.1 Experimental setups
    B.2.2 Prediction results on the METABRIC cohort
    B.2.3 Independent validation test performance
Appendix C Five-year overall survival prediction of non-small cell lung cancer with bimodal neural network
  C.1 Experimental setups
  C.2 Learning curves of bimodal NN
  C.3 Prediction results on the combined NSCLC cohort
  C.4 Independent validation test set
  C.5 More detailed learning curve
Appendix D Bayesian neural networks for colon cancer OS prediction
  D.1 Colon cancer overall survival (OS)
  D.2 Colon cancer patient cohort
  D.3 Experimental setups
  D.4 Colon cancer overall survival prediction results
    D.4.1 NN-based models performed well with sufficient regularization
    D.4.2 Bimodal NNs excel in combining heterogeneous data types
    D.4.3 Bayesian NNs achieved more robust performance
    D.4.4 Bayesian NNs showed significant stratification in survival analysis
    D.4.5 Bayesian NNs were less sensitive to model hyper-parameter selection
  D.5 Advantages of Bayesian deep learning
  D.6 Comparing regularization approaches
  D.7 Performance with less labeled data and regularization approaches
Appendix E Wasserstein generative adversarial network-based Deep Adversarial Data Augmentation (wDADA)
  E.1 Breast cancer cohort and prognostic biomarkers
  E.2 Wasserstein generative adversarial network-based deep adversarial data augmentation (wDADA)
    E.2.1 DADA Phase I - generation training
    E.2.2 DADA Phase II - classification training
    E.2.3 wDADA - introducing Wasserstein GAN (WGAN)
  E.3 Breast cancer DSS prediction with wDADA
    E.3.1 Experimental setups
    E.3.2 Prognosis class stratification
    E.3.3 Survival analysis
  E.4 Comparing synthetic data generated from the augmenter with real data
References

    [1] R. L. Siegel, K. D. Miller, and A. Jemal, “Cancer statistics,2019,” CA: A Cancer Journal for Clinicians, vol. 69, no. 1, pp. 7–34, 2019.
    [2] C. M. Perou, T. Sørlie, M. B. Eisen, M. van de Rijn, S. S. Jeffrey, C. A. Rees, J. R. Pollack, D. T. Ross, H. Johnsen, L. A. Akslen, Ø. Fluge, A. Pergamenschikov, C. Williams, S. X. Zhu, P. E. Lønning, A.-L. Børresen-Dale, P. O. Brown, and D. Botstein, “Molecular portraits of human breast tumours,” Nature, vol. 406, Aug. 2000.
    [3] F. R. Hirsch, G. V. Scagliotti, J. L. Mulshine, R. Kwon, W. J. Curran Jr, Y.-L. Wu, and L. Paz-Ares, “Lung cancer: current therapies and new targeted treatments,” The Lancet, vol. 389, no. 10066, pp. 299–311, 2017.
    [4] J. Ferlay, I. Soerjomataram, R. Dikshit, S. Eser, C. Mathers, M. Rebelo, D. M. Parkin, D. Forman, and F. Bray, “Cancer incidence and mortality worldwide: sources, methods and major patterns in globocan 2012,” International journal of cancer, vol. 136, no. 5, pp. E359–E386, 2015.
    [5] C. A. Pope Iii, R. T. Burnett, M. J. Thun, E. E. Calle, D. Krewski, K. Ito, and G. D. Thurston, “Lung cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air pollution,” Jama, vol. 287, no. 9, pp. 1132–1141, 2002.
    [6] J.-P. Pignon, H. Tribodet,G. V. Scagliotti, J.-Y. Douillard, F. A. Shepherd, R. J. Stephens, A. Dunant, V. Torri, R. Rosell, L. Seymour, et al., “Lung adjuvant cisplatin evaluation: a pooled analysis by the lace collaborative group,” in Database of Abstracts of Reviews of Effects (DARE): Quality-Assessed Reviews [Internet], Centre for Reviews and Dissemination (UK), 2008.
    [7] K. Saika and T. Sobue, “Cancer statistics in the world,” Gan to kagaku ryoho. Cancer & chemotherapy, vol. 40, no. 13, pp. 2475–2480, 2013.
    [8] Y.J.ChuaandJ.R.Zalcberg, “Progress and challenges in the adjuvant treatment of stage ii and iii colon cancers,” Expert review of anticancer therapy, vol. 8, no. 4, pp. 595–604, 2008.
    [9] P.-H. Wen, C.-L. Lu, C. Strong, Y.-J. Lin, Y.-L. Chen, C.-Y. Li, and C.-C. Tsai, “Demographic and urbanization disparities of liver transplantation in Taiwan,” International Journal of environmental research and public health, vol. 15, no. 2, p. 177, 2018.
    [10] L. K. Dunnwald, M. A. Rossing, and C. I. Li, “Hormone receptor status, tumor characteristics, and prognosis: a prospective cohort of breast cancer patients,” Breast cancer research: BCR, vol. 9, no. 1, p. R6, 2007.
    [11] B. D. Lehmann, J. A. Bauer, X. Chen, M. E. Sanders, A. B. Chakravarthy, Y. Shyr, and J. A. Pietenpol, “Identification of human triple-negative breast cancer subtypes and preclinical models for selection of targeted therapies,” The Journal of Clinical Investigation, vol. 121, pp. 2750–2767, July 2011.
    [12] L. A. Carey, E. C. Dees, L. Sawyer, L. Gatti, D. T. Moore, F. Collichio, D. W. Ollila, C. I. Sartor, M. L. Graham, and C. M. Perou, “The Triple Negative Paradox: Primary Tumor Chemosensitivity of Breast Cancer Subtypes,” Clinical Cancer Research, vol. 13, pp. 2329–2334, Apr. 2007.
    [13] R. Dent, M. Trudeau, K. I. Pritchard, W. M. Hanna, H. K. Kahn, C. A. Sawka, L. A. Lickley, E. Rawlinson, P. Sun, and S. A. Narod, “Triple-negative breast cancer: clinical features and patterns of recurrence,” Clinical Cancer Research: An Official Journal of the American Association for Cancer Research, vol. 13, pp. 4429–4434, Aug. 2007.
    [14] M. Duffy, N. Harbeck, M. Nap, R. Molina, A. Nicolini, E. Senkus, and F.Cardoso, “Clinical use of biomarkers in breast cancer: Updated guidelines from the European group on tumor markers (egtm),” European Journal of cancer, vol. 75, pp. 284–298, 2017.
    [15] P. A. Baeuerle and O. Gires, “EpCAM(CD326)finding its role in cancer,” British Journal of Cancer, vol. 96, pp. 417–423, Feb. 2007.
    [16] S. K. Lau, P. C. Boutros, M. Pintilie, F. H. Blackhall, C.-Q. Zhu, D. Strumpf, M. R. Johnston, G. Darling, S. Keshavjee, T. K. Waddell, N. Liu, D. Lau, L. Z. Penn, F. A. Shepherd, I. Jurisica, S. D. Der, and M.-S. Tsao, “Three-gene prognostic classifier for early-stage non small-cell lung cancer,” Journal of Clinical Oncology: Official Journal of the American Society of Clinical Oncology, vol. 25, pp. 5562–5569, Dec. 2007.
    [17] C. Papadaki, M. Sfakianaki, E. Lagoudaki, G. Giagkas, G. Ioannidis, M. Trypaki, E. Tsakalaki, A. Voutsina, A. Koutsopoulos, D. Mavroudis, et al., “Pkm2 as a biomarker for chemosensitivity to front-line platinum-based chemotherapy in patients with metastatic non-small-cell lung cancer,” British journal of cancer, vol. 111, no. 9, pp. 1757–1764, 2014.
    [18] R. Chen, P. Khatri, P. K. Mazur, M. Polin, Y. Zheng, D. Vaka, C. D. Hoang, J. Shrager, Y. Xu, S. Vicent, A. J. Butte, and E. A. Sweet-Cordero, “A Meta-analysis of Lung Cancer Gene Expression Identifies PTK7 as a Survival Gene in Lung Adenocarcinoma,” Cancer Research, vol. 74, pp. 2892–2902, May 2014.
    [19] J. Münsterberg, D. Loreth, L. Brylka, S. Werner, J. Karbanova, M. Gandrass, S. Schnee- gans, K. Besler, F. Hamester, J. R. Robador, A. T. Bauer, S. W. Schneider, M. Wrage, K. Lamszus, J. Matschke, Y. Vashist, G. Uzunoglu, S. Steurer, A. K. Horst, L. Oliveira- Ferrer, M. Glatzel, T. Schinke, D. Corbeil, K. Pantel, C. Maire, and H. Wikman, “AL- CAM contributes to brain metastasis formation in non-small-cell lung cancer through interaction with the vascular endothelium,” Neuro-Oncology, vol. 22, pp. 955–966, July 2020.
    [20] D. Zeng, X. Wu, J. Zheng, Y. Zhuang, J. Chen, C. Hong, F. Zhang, M. Wu, and D. Lin, “Loss of cadm1/tslc1 expression is associated with poor clinical outcome in patients with esophageal squamous cell carcinoma,” Gastroenterology research and practice, vol. 2016, 2016.
    [21] C. C. Barron, P. J. Bilan, T. Tsakiridis, and E. Tsiani, “Facilitative glucose transporters: Implications for cancer detection, prognosis and treatment,” Metabolism: Clinical and Experimental, vol. 65, pp. 124–139, Feb. 2016.
    [22] C. Ding and H. Peng, “Minimum redundancy feature selection from microarray gene expression data,” Journal of bioinformatics and computational biology, vol. 3, no. 02, pp. 185–205, 2005.
    [23] W. Awada, T. M. Khoshgoftaar, D. Dittman, R. Wald, and A. Napolitano, “A review of the stability of feature selection techniques for bioinformatics data,” in 2012 IEEE 13th International Conference on Information Reuse & Integration (IRI), pp. 356–363, IEEE, 2012.
    [24] S. Wu, I.-C. Tseng, W.-C. Huang, C.-W. Su, Y.-H. Lai, C. Lin, A. Y.-L. Lee, C.-Y. Kuo, L.-Y. Su, M.-C. Lee, et al., “Establishment of an immunocompetent metastasis rat model with hepatocyte cancer stem cells,” Cancers, vol. 12, no. 12, p. 3721, 2020.
    [25] H.-Y. Wang, “Prognostic markers for relapse prediction in colon cancer: a deep learning approach,” Master’s thesis, National Tsing Hua University Electrical Engineering Department, 2017.
    [26] F.-Y. Liu, T.-C. Hsu, P. Choong, et al., “Uncovering the regeneration strategies of zebrafish organs: a comprehensive systems biology study on heart, cerebellum, fin, and retina regeneration,” BMC systems biology, vol. 12, no. 2, p. 29, 2018.
    [27] Y.-H. Lai, W.-N. Chen, T.-C. Hsu, C. Lin, Y. Tsao, and S. Wu, “Overall survival prediction of non-small cell lung cancer by integrating microarray and clinical data with deep learning,” Scientific Reports, vol. 10, p. 4679, Mar. 2020.
    [28] L.-H. Cheng, T.-C. Hsu, and C.Lin,“ Integrating ensemble systems biology feature selection and bimodal deep neural network for breast cancer prognosis prediction,” Scientific Reports, vol. 11, p. 14914, July 2021.
    [29] T.-C. Hsu and C. Lin, “Training with small medical data: Robust Bayesian neural networks for colon cancer overall survival prediction,” in 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), pp. 2030– 2033, IEEE, 2021.
    [30] Z.-Y. He and W.-C. Yu, “Stable feature selection for biomarker discovery,” Computational biology and chemistry, vol. 34, no. 4, pp. 215–225, 2010.
    [31] I. Goodfellow, Y. Bengio, and A. Courville, Deep learning. MIT Press, 2016.
    [32] A.Krizhevsky, I.Sutskever, and G. E. Hinton,“ Imagenet classification with deep convolutional neural networks,” Advances in neural information processing systems, vol. 25, 2012.
    [33] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, et al., “Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups,” IEEE Signal processing magazine, vol. 29, no. 6, pp. 82–97, 2012.
    [34] M. Bojarski, D. Del Testa, D. Dworakowski, B. Firner, B. Flepp, P. Goyal, L. D. Jackel, M. Monfort, U. Muller, J. Zhang, et al., “End to end learning for self-driving cars,” arXiv preprint arXiv:1604.07316, 2016.
    [35] M. K. Leung, H. Y. Xiong, L. J. Lee, and B. J. Frey,“ Deep learning of the tissue-regulated splicing code,” Bioinformatics, vol. 30, no. 12, pp. i121–i129, 2014.
    [36] J. Jumper, R. Evans, A. Pritzel, T. Green, M. Figurnov, O. Ronneberger, K. Tunyasuvunakool, R. Bates, A. Žídek, A. Potapenko, et al., “Highly accurate protein structure prediction with alphafold,” Nature, vol. 596, no. 7873, pp. 583–589, 2021.
    [37] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrit- twieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, et al., “Mastering the game of go with deep neural networks and tree search,” nature, vol. 529, no. 7587, pp. 484–489, 2016.
    [38] T. S. Furey, N. Cristianini, N. Duffy, D. W. Bednarski, M. Schummer, and D. Haussler, “Support vector machine classification and validation of cancer tissue samples using microarray expression data,” Bioinformatics, vol. 16, no. 10, pp. 906–914, 2000.
    [39] E. Martinez-Ledesma, R. G. Verhaak, and V. Treviño, “Identification of a multi-cancer gene expression biomarker for cancer clinical outcomes using a network-based algorithm,” Scientific reports, vol. 5, no. 1, pp. 1–14, 2015.
    [40] D. Sun, M. Wang, and A. Li,“ A multimodal deep neural network for human breast cancer prognosis prediction by integrating multi-dimensional data,” IEEE/ACM transactions on computational biology and bioinformatics, vol. 16, no. 3, pp. 841–850, 2018.
    [41] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” Advances in neural information processing systems, vol. 27, 2014.
    [42] M. W. Dusenberry, D. Tran, E. Choi, J. Kemp, J. Nixon, G. Jerfel, K. Heller, and A. M. Dai, “Analyzing the Role of Model Uncertainty for Electronic Health Records,” Proceedings of the ACM Conference on Health, Inference, and Learning, pp. 204–213, Apr. 2020. arXiv: 1906.03842.
    [43] T.-C. Hsu and C. Lin,“ Generative Adversarial Networks for Robust Breast Cancer Prognosis Prediction with Limited Data Size,” in 2020 42nd Annual International Conference of the IEEE Engineering in Medicine Biology Society (EMBC), pp. 5669–5672, July 2020. ISSN: 2694-0604.
    [44] E. AbuKhousa, N. Mohamed, and J. Al-Jaroodi, “e-health cloud: opportunities and challenges,” Future internet, vol. 4, no. 3, pp. 621–645, 2012.
    [45] P. Indyk and R. Motwani, “Approximate nearest neighbors: towards removing the curse of dimensionality,” in Proceedings of the thirtieth annual ACM symposium on Theory of computing, pp. 604–613, 1998.
    [46] X. Zhang, Z. Wang, D. Liu, and Q. Ling, “Dada: Deep adversarial data augmentation for extremely low data regime classification,” in ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2807– 2811, IEEE, 2019.
    [47] Z. Che, S. Purushotham, K. Cho, D. Sontag, and Y. Liu, “Recurrent neural networks for multivariate time series with missing values,” Scientific reports, vol. 8, no. 1, pp. 1–12, 2018.
    [48] V. Fortuin, D. Baranchuk, G. Rätsch, and S. Mandt, “Gp-vae: Deep probabilistic time series imputation,” in International conference on artificial intelligence and statistics, pp. 1651–1661, PMLR, 2020.
    [49] J. Futoma, S. Hariharan, K. Heller, M. Sendak, N. Brajer, M. Clement, A. Bedoya, and C. O'brien, “An improved multi-output gaussian process rnn with real-time validation for early sepsis detection,” in Machine Learning for Healthcare Conference, pp. 243–254, PMLR, 2017.
    [50] J. A. Saunders, N. Morrow-Howell, E. Spitznagel, P. Doré, E. K. Proctor, and R. Pescarino, “Imputing missing data: A comparison of methods for social work researchers,” Social work research, vol. 30, no. 1, pp. 19–31, 2006.
    [51] H. Kang, “The prevention and handling of the missing data,” Korean journal of anesthesiology, vol. 64, no. 5, p. 402, 2013.
    [52] B. K. Beaulieu-Jones, D. R. Lavage, J. W. Snyder, J. H. Moore, S. A. Pendergrass, and C. R. Bauer, “Characterizing and managing missing structured data in electronic health records: data analysis,” JMIR medical informatics, vol. 6, no. 1, p. e8960, 2018.
    [53] R. Wu, A. Zhang, I. Ilyas, and T. Rekatsinas, “Attention-based learning for missing data imputation in holoclean,” Proceedings of Machine Learning and Systems, vol. 2, pp. 307–325, 2020.
    [54] O. Chapelle, B. Scholkopf, and A. Zien, “Semi-supervised learning (chapelle, o. et al., eds.; 2006)[book reviews],” IEEE Transactions on Neural Networks, vol. 20, no. 3, pp. 542–542, 2009.
    [55] Y. Ouali, C. Hudelot, and M. Tami, “An overview of deep semi-supervised learning,” arXiv preprint arXiv:2006.05278, 2020.
    [56] H.Valpola,“ Fromneuralpcatodeepunsupervisedlearning,”inAdvancesinindependent component analysis and learning machines, pp. 143–171, Elsevier, 2015.
    [57] E. Bair, R. Tibshirani, and T. Golub,“ Semi-supervised methods to predict patient survival from gene expression data,” PLoS biology, vol. 2, no. 4, p. e108, 2004.
    [58] C.Rosenberg, M.Hebert, and H.Schneiderman,“ Semi-supervisedself-trainingofobject detection models,” KiltHub, 2005.
    [59] D. P. Kingma and M. Welling, “Auto-encoding variational bayes,” arXiv preprint arXiv:1312.6114, 2013.
    [60] D. P. Kingma, S. Mohamed, D. Jimenez Rezende, and M. Welling, “Semi-supervised learning with deep generative models,” Advances in neural information processing systems, vol. 27, 2014.
    [61] X. Zhu, Z. Ghahramani, and J. D. Lafferty, “Semi-supervised learning using gaussian fields and harmonic functions,” in Proceedings of the 20th International conference on Machine learning (ICML-03), pp. 912–919, 2003.
    [62] F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, and G. Monfardini, “The graph neural network model,” IEEE transactions on neural networks, vol. 20, no. 1, pp. 61–80, 2008.
    [63] C. Zhang and Y. Ma, Ensemble machine learning: methods and applications. Springer, 2012.
    [64] Y. Gal and Z. Ghahramani, “Dropout as a bayesian approximation: Representing model uncertainty in deep learning,” in international conference on machine learning, pp. 1050–1059, 2016.
    [65] G. Van Rossum and F. L. Drake, Python 3 Reference Manual. Scotts Valley, CA: CreateSpace, 2009.
    [66] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Is- ard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015. Software available from tensorflow.org.
    [67] J. Shi, J. Chen, J. Zhu, S. Sun, Y. Luo, Y. Gu, and Y. Zhou, “Zhusuan: A library for bayesian deep learning,” arXiv preprint arXiv:1709.05870, 2017.
    [68] E. Bingham, J. P. Chen, M. Jankowiak, F. Obermeyer, N. Pradhan, T. Karaletsos, R. Singh, P. A. Szerlip, P. Horsfall, and N. D. Goodman, “Pyro: Deep universal proba- bilistic programming,” J. Mach. Learn. Res., vol. 20, pp. 28:1–28:6, 2019.
    [69] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala, “Pytorch: An imperative style, high-performance deep learning library,” in Advances in Neural Information Processing Systems 32 (H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, eds.), pp. 8024–8035, Curran Associates, Inc., 2019.
    [70] T. Hastie, “Ridge regularization: an essential concept in data science,” Technometrics, vol. 62, no. 4, pp. 426–433, 2020.
    [71] R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society: Series B (Methodological), vol. 58, no. 1, pp. 267–288, 1996.
    [72] D. P. Kingma, T. Salimans, and M. Welling, “Variational dropout and the local reparameterization trick,” Advances in neural information processing systems, vol. 28, 2015.
    [73] T.-C. Hsu and C. Lin, “Learning from small medical data — robust semi-supervised cancer prognosis classifier with Bayesian variational autoencoder,” Bioinformatics Advances, vol. 3, p. vbac100, Jan. 2023.
    [74] J. Konečnỳ, B. McMahan, and D. Ramage, “Federated optimization: Distributed optimization beyond the datacenter,” arXiv preprint arXiv:1511.03575, 2015.
    [75] N. Rieke, J. Hancox, W. Li, F. Milletari, H. R. Roth, S. Albarqouni, S. Bakas, M. N. Galtier, B. A. Landman, K. Maier-Hein, et al., “The future of digital health with federated learning,” NPJ digital medicine, vol. 3, no. 1, pp. 1–7, 2020.
    [76] T. Fawcett, “An introduction to roc analysis,” Pattern recognition letters, vol. 27, no. 8, pp. 861–874, 2006.
    [77] F. E. Harrell, “Regression modeling strategies,” BIOS, vol. 330, p. 2018, 2017.
    [78] T. G. Clark, M. J. Bradburn, S. B. Love, et al., “Survival analysis part i: basic concepts and first analyses,” British journal of cancer, vol. 89, no. 2, p. 232, 2003.
    [79] T. Saito and M. Rehmsmeier, “The precision-recall plot is more informative than the roc plot when evaluating binary classifiers on imbalanced datasets,” PloS one, vol. 10, no. 3, p. e0118432, 2015.
    [80] C. Huang, X. Huang, Y. Fang, J. Xu, Y. Qu, P. Zhai, L. Fan, H. Yin, Y. Xu, and J. Li, “Sample imbalance disease classification model based on association rule feature selection,” Pattern Recognition Letters, vol. 133, pp. 280–286, 2020.
    [81] T.-C. Hsu, S.-T. Liou, Y.-P. Wang, Y.-S. Huang, et al., “Enhanced recurrent neural network for combining static and dynamic features for credit card default prediction,” in ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1572–1576, IEEE, 2019.
    [82] R. Fluss, D. Faraggi, and B. Reiser, “Estimation of the youden index and its associated cutoff point,” Biometrical Journal: Journal of Mathematical Methods in Biosciences, vol. 47, no. 4, pp. 458–472, 2005.
    [83] Y. Yang and X. Liu, “A re-examination of text categorization methods,” in Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, pp. 42–49, 1999.
    [84] M. J. Bradburn, T. G. Clark, S. B. Love, et al., “Survival analysis part ii: multivariate data analysis – an introduction to concepts and methods,” British journal of cancer, vol. 89, no. 3, p. 431, 2003.
    [85] M. J. Bradburn, T. G. Clark, S. B. Love, and D. G. Altman, “Survival analysis part iii: multivariate data analysis–choosing a model and assessing its adequacy and fit,” British journal of cancer, vol. 89, no. 4, pp. 605–611, 2003.
    [86] T. G. Clark, M. J. Bradburn, S. B. Love, and D. Altman, “Survival analysis part iv: further concepts and methods in survival analysis,” British journal of cancer, vol. 89, no. 5, pp. 781–786, 2003.
    [87] J. Fox and S. Weisberg, “Cox proportional-hazards regression for survival data,” An R and S-PLUS companion to applied regression, vol. 2002, 2002.
    [88] R. Peto, M. C. Pike, P. Armitage, et al., “Design and analysis of randomized clinical trials requiring prolonged observation of each patient. ii. analysis and examples,” British journal of cancer, vol. 35, no. 1, p. 1, 1977.
    [89] L. Tian and R. Olshen, “Survival analysis: Logrank test,” 2016.
    [90] H. Uno, T. Cai, M. J. Pencina, R. B. D’Agostino, and L.-J. Wei, “On the c-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data,” Statistics in medicine, vol. 30, no. 10, pp. 1105–1117, 2011.
    [91] F. E. Harrell, K. L. Lee, and D. B. Mark, “Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors,” Statistics in Medicine, vol. 15, pp. 361–387, Feb. 1996.
    [92] D. Sahoo, D. L. Dill, R. Tibshirani, and S. K. Plevritis, “Extracting binary signals from microarray time-course data,” Nucleic Acids Research, vol. 35, pp. 3705–3712, June 2007.
    [93] C. Stark, B.-J. Breitkreutz, T. Reguly, L. Boucher, A. Breitkreutz, and M. Tyers, “BioGRID: a general repository for interaction datasets,” Nucleic Acids Research, vol. 34, pp. D535–D539, Jan. 2006.
    [94] J. Ngiam, A. Khosla, M. Kim, J. Nam, H. Lee, and A. Y. Ng, “Multimodal deep learning,” in ICML, 2011.
    [95] A. Eitel, J. T. Springenberg, L. Spinello, M. Riedmiller, and W. Burgard, “Multimodal deep learning for robust rgb-d object recognition,” in 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 681–687, IEEE, 2015.
    [96] L. V. Jospin, H. Laga, F. Boussaid, W. Buntine, and M. Bennamoun, “Hands-on bayesian neural networks—a tutorial for deep learning users,” IEEE Computational Intelligence Magazine, vol. 17, no. 2, pp. 29–48, 2022.
    [97] G. E. Hinton, “Boltzmann machine,” Scholarpedia, vol. 2, no. 5, p. 1668, 2007.
    [98] D. Koller and N. Friedman, Probabilistic graphical models: principles and techniques. MIT press, 2009.
    [99] T. Minka et al., “Divergence measures and message passing,” tech. rep., Citeseer, 2005.
    [100] C. Jarzynski, “Nonequilibrium equality for free energy differences,” Physical Review Letters, vol. 78, no. 14, p. 2690, 1997.
    [101] R. M. Neal, “Annealed importance sampling,” Statistics and computing, vol. 11, no. 2, pp. 125–139, 2001.
    [102] C. H. Bennett, “Efficient estimation of free energy differences from monte carlo data,” Journal of Computational Physics, vol. 22, no. 2, pp. 245–268, 1976.
    [103] Z. Cao, T. Qin, T.-Y. Liu, M.-F. Tsai, and H. Li, “Learning to rank: from pairwise approach to listwise approach,” in Proceedings of the 24th international conference on Machine learning, pp. 129–136, 2007.
    [104] G. E. Hinton, “Deep belief networks,” Scholarpedia, vol. 4, no. 5, p. 5947, 2009.
    [105] M. Arjovsky, S. Chintala, and L. Bottou, “Wasserstein generative adversarial networks,” in International conference on machine learning, pp. 214–223, PMLR, 2017.
    [106] M. J. Johnson, D. K. Duvenaud, A. Wiltschko, R. P. Adams, and S. R. Datta, “Composing graphical models with neural networks for structured representations and fast inference,” Advances in neural information processing systems, vol. 29, 2016.
    [107] R. Bardenet, A. Doucet, and C. C. Holmes, “On markov chain monte carlo methods for tall data,” Journal of Machine Learning Research, vol. 18, no. 47, 2017.
    [108] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” arXiv preprint arXiv:1502.03167, 2015.
    [109] R. Longadge and S. Dongre, “Class imbalance problem in data mining review,” arXiv preprint arXiv:1305.1707, 2013.
    [110] J. Orloff and J. Bloom, “Bootstrap confidence intervals,” Class 24, 18.05.
    [111] G. Shmueli, P. C. Bruce, I. Yahav, N. R. Patel, and K. C. Lichtendahl Jr, Data mining for business analytics: concepts, techniques, and applications in R. John Wiley & Sons, 2017.
    [112] L. Breiman, “Random Forests,” Machine Learning, vol. 45, pp. 5–32, Oct. 2001.
    [113] L. Breiman, “Bagging predictors,” Machine learning, vol. 24, no. 2, pp. 123–140, 1996.
    [114] R. E. Schapire, “The strength of weak learnability,” Machine learning, vol. 5, no. 2, pp. 197–227, 1990.
    [115] Y. Hu, D. Niu, J. Yang, and S. Zhou, “Fdml: A collaborative machine learning framework for distributed features,” in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 2232–2240, 2019.
    [116] M. Hügle, G. Kalweit, T. Hügle, and J. Boedecker, “A dynamic deep neural network for multimodal clinical data analysis,” in Explainable AI in healthcare and medicine, pp. 79–92, Springer, 2021.
    [117] J. D. Olden, M. K. Joy, and R. G. Death, “An accurate comparison of methods for quantifying variable importance in artificial neural networks using simulated data,” Ecological modelling, vol. 178, no. 3-4, pp. 389–397, 2004.
    [118] J. Olden and D. Jackson, “Illuminating the “black box”: A randomization approach for understanding variable contributions in artificial neural networks,” Ecological Modelling, vol. 154, pp. 135–150, Aug. 2002.
    [119] A. Goldstein, A. Kapelner, J. Bleich, and E. Pitkin, “Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation,” Journal of Computational and Graphical Statistics, vol. 24, no. 1, pp. 44–65, 2015.
    [120] R. Caruana, “Multitask learning,” Machine learning, vol. 28, no. 1, pp. 41–75, 1997.
    [121] C. Curtis, S. P. Shah, S.-F. Chin, G. Turashvili, O. M. Rueda, M. J. Dunning, D. Speed, A. G. Lynch, S. Samarajiwa, Y. Yuan, S. Gräf, G. Ha, G. Haffari, A. Bashashati, R. Russell, S. McKinney, METABRIC Group, A. Langerød, A. Green, E. Provenzano, G. Wishart, S. Pinder, P. Watson, F. Markowetz, L. Murphy, I. Ellis, A. Purushotham, A.-L. Børresen-Dale, J. D. Brenton, S. Tavaré, C. Caldas, and S. Aparicio, “The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups,” Nature, vol. 486, pp. 346–352, Apr. 2012.
    [122] T. Tieleman and G. Hinton, “Lecture 6.5 - rmsprop: Divide the gradient by a running average of its recent magnitude,” COURSERA: Neural networks for machine learning, vol. 4, no. 2, pp. 26–31, 2012.
    [123] D.-A. Clevert, T. Unterthiner, and S. Hochreiter, “Fast and accurate deep network learning by exponential linear units (elus),” arXiv preprint arXiv:1511.07289, 2015.
    [124] C. Dugas, Y. Bengio, F. Bélisle, C. Nadeau, and R. Garcia, “Incorporating second-order functional knowledge for better option pricing,” in Proceedings of the 13th International Conference on Neural Information Processing Systems, NIPS’00, (Cambridge, MA, USA), pp. 451–457, MIT Press, Jan. 2000.
    [125] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay, “Scikit-learn: Machine learning in Python,” Journal of Machine Learning Research, vol. 12, pp. 2825–2830, 2011.
    [126] D. Santos, “Breast cancer survival prediction using machine learning and gene expression profiles,” medRxiv, pp. 2022–01, 2022.
    [127] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, “Focal loss for dense object detection,” in Proceedings of the IEEE international conference on computer vision, pp. 2980–2988, 2017.
    [128] A. Zundelevich, M. Dadiani, S. Kahana-Edwin, A. Itay, T. Sella, M. Gadot, K. Cesarkas, S. Farage-Barhom, E. G. Saar, E. Eyal, et al., “Esr1 mutations are frequent in newly diagnosed metastatic and loco-regional recurrence of endocrine-treated breast cancer and carry worse prognosis,” Breast Cancer Research, vol. 22, no. 1, pp. 1–11, 2020.
    [129] S. Kurozumi, H. Matsumoto, Y. Hayashi, K. Tozuka, K. Inoue, J. Horiguchi, I. Takeyoshi, T. Oyama, and M. Kurosumi, “Power of pgr expression as a prognostic factor for er- positive/her2-negative breast cancer patients at intermediate risk classified by the ki67 labeling index,” BMC cancer, vol. 17, no. 1, pp. 1–9, 2017.
    [130] B. Zhang, Z. Zhang, L. Li, Y.-R. Qin, H. Liu, C. Jiang, T.-T. Zeng, M.-Q. Li, D. Xie, Y. Li, X.-Y. Guan, and Y.-H. Zhu, “TSPAN15 interacts with BTRC to promote oesophageal squamous cell carcinoma metastasis via activating NF-κB signaling,” Nature Communications, vol. 9, p. 1423, Apr. 2018.
    [131] H. Hou, Y. Lyu, J. Jiang, M. Wang, R. Zhang, C.-C. Liew, B. Wang, and C. Cheng, “Peripheral blood transcriptome identifies high-risk benign and malignant breast lesions,” PloS one, vol. 15, no. 6, p. e0233713, 2020.
    [132] Y.-C. Chang, Y. Ding, L. Dong, L.-J. Zhu, R. V. Jensen, and L.-L. Hsiao, “Differential expression patterns of housekeeping genes increase diagnostic and prognostic value in lung cancer,” PeerJ, vol. 6, p. e4719, 2018.
    [133] Z. Liu, Q. Sun, and X. Wang, “PLK1, A Potential Target for Cancer Therapy,” Translational Oncology, vol. 10, pp. 22–32, Feb. 2017.
    [134] R. Rouzier, C. M. Perou, W. F. Symmans, N. Ibrahim, M. Cristofanilli, K. Anderson, K. R. Hess, J. Stec, M. Ayers, P. Wagner, et al., “Breast cancer molecular subtypes respond differently to preoperative chemotherapy,” Clinical cancer research, vol. 11, no. 16, pp. 5678–5685, 2005.
    [135] Y.-C. Lee, Y.-J. Chen, C.-C. Wu, S. Lo, M.-F. Hou, and S.-S. F. Yuan, “Resistin expression in breast cancer tissue as a marker of prognosis and hormone therapy stratification,” Gynecologic oncology, vol. 125, no. 3, pp. 742–750, 2012.
    [136] K. Wei, T. Li, F. Huang, J. Chen, and Z. He, “Cancer classification with data augmentation based on generative adversarial networks,” Frontiers of Computer Science, vol. 16, no. 2, pp. 1–11, 2022.
    [137] A. Jahanian, X. Puig, Y. Tian, and P. Isola, “Generative models as a data source for multiview representation learning,” arXiv preprint arXiv:2106.05258, 2021.
    [138] A. J. Gentles, S. V. Bratman, L. J. Lee, J. P. Harris, W. Feng, R. V. Nair, D. B. Shultz, V. S. Nair, C. D. Hoang, R. B. West, et al., “Integrating tumor and stromal gene expression signatures with clinical indices for survival stratification of early-stage non–small cell lung cancer,” JNCI: Journal of the National Cancer Institute, vol. 107, no. 10, 2015.
    [139] B.-R. Wu, “Multi-cancer prognosis prediction with multi-task learning integrating rna sequencing and clinical data,” Master’s thesis, National Taiwan University, Graduate Institute of Communication Engineering, 2022.
    [140] M. Grinberg, D. Djureinovic, H. R. Brunnström, J. S. Mattsson, K. Edlund, J. G. Hengstler, L. La Fleur, S. Ekman, H. Koyi, E. Branden, et al., “Reaching the limits of prognostication in non-small cell lung cancer: an optimized biomarker panel fails to outperform clinical parameters,” Modern Pathology, vol. 30, no. 7, pp. 964–977, 2017.
    [141] J.-P. Sculier, K. Chansky, J. J. Crowley, J. Van Meerbeeck, P. Goldstraw, et al., “The impact of additional prognostic factors on survival and their relationship with the anatomical extent of disease expressed by the 6th edition of the tnm classification of malignant tumors and the proposals for the 7th edition,” Journal of Thoracic Oncology, vol. 3, no. 5, pp. 457–466, 2008.
    [142] C. Doersch, “Tutorial on variational autoencoders,” arXiv preprint arXiv:1606.05908, 2016.
    [143] C. M. Bishop, “Pattern recognition,” Machine learning, vol. 128, no. 9, 2006.
    [144] B. Lakshminarayanan, A. Pritzel, and C. Blundell, “Simple and scalable predictive uncertainty estimation using deep ensembles,” Advances in neural information processing systems, vol. 30, 2017.
    [145] E. Oostwal, M. Straat, and M. Biehl, “Hidden unit specialization in layered neural networks: Relu vs. sigmoidal activation,” Physica A: Statistical Mechanics and its Applications, vol. 564, p. 125517, 2021.
    [146] J. So, B. Güler, and A. S. Avestimehr, “Byzantine-resilient secure federated learning,” IEEE Journal on Selected Areas in Communications, vol. 39, no. 7, pp. 2168–2181, 2020.
    [147] A. Kline, H. Wang, Y. Li, S. Dennis, M. Hutch, Z. Xu, F. Wang, F. Cheng, and Y. Luo, “Multimodal machine learning in precision health: A scoping review,” npj Digital Medicine, vol. 5, no. 1, p. 171, 2022.
    [148] V. Smith, S. Forte, M. Chenxin, M. Takáč, M. I. Jordan, and M. Jaggi, “CoCoA: A general framework for communication-efficient distributed optimization,” Journal of Machine Learning Research, vol. 18, p. 230, 2018.
    [149] S. Teerapittayanon, B. McDanel, and H.-T. Kung, “Distributed deep neural networks over the cloud, the edge and end devices,” in 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), pp. 328–339, IEEE, 2017.
    [150] W. Li, F. Milletarì, D. Xu, N. Rieke, J. Hancox, W. Zhu, M. Baust, Y. Cheng, S. Ourselin, M. J. Cardoso, et al., “Privacy-preserving federated brain tumour segmentation,” arXiv preprint, pp. 133–141, 2019.
    [151] J. Xu, B. S. Glicksberg, C. Su, P. Walker, J. Bian, and F. Wang, “Federated learning for healthcare informatics,” Journal of Healthcare Informatics Research, vol. 5, no. 1, pp. 1–19, 2021.
    [152] P. Kairouz, H. B. McMahan, B. Avent, A. Bellet, M. Bennis, A. N. Bhagoji, K. Bonawitz, Z. Charles, G. Cormode, R. Cummings, et al., “Advances and open problems in federated learning,” Foundations and Trends® in Machine Learning, vol. 14, no. 1–2, pp. 1–210, 2021.
    [153] S. Chenthara, K. Ahmed, H. Wang, F. Whittaker, and Z. Chen, “Healthchain: A novel framework on privacy preservation of electronic health records using blockchain technology,” 2020.
    [154] L. Chen, N. Huang, C. Mu, H. S. Helm, K. Lytvynets, W. Yang, and C. E. Priebe, “Deep learning with label noise: A hierarchical approach,” arXiv preprint arXiv:2205.14299, 2022.
    [155] K. Tomczak, P. Czerwińska, and M. Wiznerowicz, “The cancer genome atlas (tcga): an immeasurable source of knowledge,” Contemporary oncology, vol. 19, no. 1A, p. A68, 2015.
    [156] J. L. Katzman, U. Shaham, A. Cloninger, J. Bates, T. Jiang, and Y. Kluger, “Deepsurv: personalized treatment recommender system using a cox proportional hazards deep neural network,” BMC medical research methodology, vol. 18, no. 1, pp. 1–12, 2018.
    [157] K. Weiss, T. M. Khoshgoftaar, and D. Wang, “A survey of transfer learning,” Journal of Big data, vol. 3, no. 1, pp. 1–40, 2016.
    [158] C. Wu, F. Zhou, J. Ren, X. Li, Y. Jiang, and S. Ma, “A selective review of multi-level omics data integration using variable selection,” High-throughput, vol. 8, no. 1, p. 4, 2019.
    [159] Q. Zhao, X. Shi, Y. Xie, J. Huang, B. Shia, and S. Ma, “Combining multidimensional genomic measurements for predicting cancer prognosis: observations from tcga,” Briefings in bioinformatics, vol. 16, no. 2, pp. 291–303, 2015.
    [160] P. Zhang, X. Liu, Z. Chu, J. Ye, K. Li, W. Zhuang, D. Yang, and Y. Jiang, “Detection of interleukin-33 in serum and carcinoma tissue from patients with hepatocellular carcinoma and its clinical implications,” Journal of International Medical Research, vol. 40, no. 5, pp. 1654–1661, 2012.
    [161] W. Zhu, H. Li, Y. Yu, J. Chen, X. Chen, F. Ren, Z. Ren, and G. Cui, “Enolase-1 serves as a biomarker of diagnosis and prognosis in hepatocellular carcinoma patients,” Cancer management and research, vol. 10, p. 5735, 2018.
    [162] G. Housman, S. Byler, S. Heerboth, K. Lapinska, M. Longacre, N. Snyder, and S. Sarkar, “Drug resistance in cancer: an overview,” Cancers, vol. 6, no. 3, pp. 1769–1792, 2014.
    [163] M. Sehhati, A. Mehridehnavi, H. Rabbani, and M. Pourhossein, “Stable gene signature selection for prediction of breast cancer recurrence using joint mutual information,” IEEE/ACM transactions on computational biology and bioinformatics, vol. 12, no. 6, pp. 1440–1448, 2015.
    [164] Z. Yan, Q. Wang, X. Sun, B. Ban, Z. Lu, Y. Dang, L. Xie, L. Zhang, Y. Li, W. Zhu, et al., “Osbrca: a web server for breast cancer prognostic biomarker investigation with massive data from tens of cohorts,” Frontiers in Oncology, vol. 9, p. 1349, 2019.
    [165] T. Dozat, “Incorporating nesterov momentum into adam,” in Workshop Track, 2016 International Conference on Learning Representations (ICLR), 2016.
    [166] G. Jongeneel, T. Klausch, F. N. van Erning, G. R. Vink, M. Koopman, C. J. Punt, M. J. Greuter, and V. M. Coupé, “Estimating adjuvant treatment effects in stage ii colon cancer: Comparing the synthesis of randomized clinical trial data to real-world data,” International Journal of Cancer, vol. 146, no. 11, pp. 2968–2978, 2020.
    [167] A. G. Vaiopoulos, I. D. Kostakis, M. Koutsilieris, and A. G. Papavassiliou, “Colorectal cancer stem cells,” Stem cells, vol. 30, no. 3, pp. 363–371, 2012.
    [168] S. Depeweg, J.-M. Hernandez-Lobato, F. Doshi-Velez, and S. Udluft, “Decomposition of uncertainty in bayesian deep learning for efficient and risk-sensitive learning,” in International Conference on Machine Learning, pp. 1184–1193, PMLR, 2018.
