
Author: 江致勳 (Chiang, Chih-Hsun)
Thesis title: Applying State-of-the-art NLP Models with Explainable Artificial Intelligence for Patient Disposition Prediction in Emergency Department
Original title: 應用先進自然語言模型與可解釋人工智慧於急診病患動向預測
Advisors: 陳建良 (Chen, James C.); 王俊程 (Wang, Jyun-Cheng)
Committee members: 陳子立 (Chen, Tzu-Li); 陳盈彥 (Chen, Yin-Yann)
Degree: Master
Department: College of Engineering - Department of Industrial Engineering and Engineering Management
Year of publication: 2024
Academic year of graduation: 112 (ROC calendar)
Language: English
Number of pages: 69
Keywords (translated from Chinese): emergency department admission prediction, Transformer, BERT, explainable artificial intelligence, natural language processing
Keywords (English): Admission Prediction in Emergency Department, Transformer, BERT, XAI, NLP
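  • As emergency department visits and medical expenses rise sharply, emergency department overcrowding is becoming increasingly severe. The surge in demand has produced a long-term imbalance between patient needs and available medical resources, which may impair the efficiency of emergency services and increase medical disputes. This study applies advanced natural language processing models and explainable artificial intelligence techniques to predict whether emergency patients need hospitalization, in order to ease the supply-demand imbalance in emergency care and to offer insights from a data science perspective. Emergency department data from a hospital in Taiwan, covering 2014 to 2019, serve as the empirical data. For prediction, the study uses the Transformer-based models BERT, RoBERTa, and DeBERTa V3 to estimate the probability that an emergency patient is hospitalized. Among them, DeBERTa V3 performed well on every evaluation metric, including accuracy, precision, and recall, which underlines its effectiveness in practical use. For interpretability, the study applies the explainable AI technique LIME, which identifies the features in patient complaints that most influence the model's predictions and thereby improves the transparency and trustworthiness of the AI system. This helps medical staff make better-informed decisions and manage emergency resources more effectively.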


    Emergency department (ED) visits and medical expenses have risen significantly, and ED overcrowding has become increasingly severe. This surge in demand has led to a long-term imbalance between patient needs and available medical resources, which can compromise the efficiency of emergency services and increase medical disputes. This study addresses the problem by applying advanced natural language processing (NLP) models and explainable artificial intelligence (XAI) techniques to predict the hospitalization needs of ED patients, with the aim of easing the imbalance between emergency medical supply and demand and providing insights from a data science perspective. The empirical data are emergency and outpatient records from a hospital in Taiwan, spanning 2014 to 2019. To predict the probability of ED patient hospitalization, the study employs the transformer-based models BERT, RoBERTa, and DeBERTa V3. Among these, DeBERTa V3 performed best across the evaluation metrics, including accuracy, precision, and recall, which indicates its effectiveness in practical applications. For model interpretability, the study applies the explainable AI technique LIME, which identifies the features in patient complaints that most strongly affect the model's predictions, thereby enhancing the transparency and trustworthiness of the AI system. This helps medical staff make more informed decisions and manage emergency resources more effectively.
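    To make concrete the idea of finding the words in a complaint that drive a prediction, the following is a minimal sketch under stated assumptions. It is not the thesis's method and it is not LIME: LIME fits a local surrogate model on many perturbed inputs, whereas this sketch uses a simpler leave-one-token-out occlusion. The scoring function, the keyword weights, and the English sample complaint are all hypothetical stand-ins for a fine-tuned model and real triage text.

```python
import math

# Hypothetical keyword weights standing in for a fine-tuned classifier.
# The values are invented for illustration and carry no clinical meaning.
TOY_WEIGHTS = {"chest": 1.2, "pain": 0.8, "dyspnea": 1.5, "mild": -1.0}
TOY_BIAS = -1.0


def toy_admission_probability(tokens):
    """Return a made-up admission probability for a list of tokens."""
    score = TOY_BIAS + sum(TOY_WEIGHTS.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-score))


def occlusion_attributions(tokens, predict):
    """Score each token by how much the prediction changes when it is removed."""
    baseline = predict(tokens)
    attributions = []
    for i, token in enumerate(tokens):
        reduced = tokens[:i] + tokens[i + 1:]
        # Positive value: the token pushed the prediction toward admission.
        attributions.append((token, baseline - predict(reduced)))
    # Largest absolute effect first.
    return sorted(attributions, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical complaint, split on whitespace for simplicity.
complaint = "mild chest pain since morning".split()
ranked = occlusion_attributions(complaint, toy_admission_probability)
for token, delta in ranked:
    print(f"{token:>8s}  {delta:+.3f}")
```

    In this toy setup, tokens with a positive score raise the predicted admission probability and tokens with a negative score lower it. With a real model, `predict` would wrap the classifier's probability output, and Chinese text would need a word segmenter rather than whitespace splitting.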

    Abstract (Chinese)
    Abstract
    Acknowledgements
    Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
        1.1 Background and Motivation
        1.2 Research Objective
        1.3 Organization of Thesis
    Chapter 2 Literature Review
        2.1 Predicting Hospital Admission for Emergency Department
        2.2 Transformer-based Models Applications in Medical Field
        2.3 XAI in NLP Applications for Medical Field
    Chapter 3 Methodology
        3.1 Data Preprocessing Methods
            3.1.1 Data Labeling
            3.1.2 Data Transformation
            3.1.3 Data Segmentation
            3.1.4 Tokenization
            3.1.5 Word Embedding
        3.2 Transformer-based Prediction Model
            3.2.1 BERT
            3.2.2 RoBERTa
            3.2.3 DeBERTa V3
            3.2.4 Summary of prediction models
        3.3 Validation and Evaluation
            3.3.1 Validation
            3.3.2 Evaluation
        3.4 XAI methods in NLP
            3.4.1 XAI Research Framework
            3.4.2 LIME
            3.4.3 JIEBA Text Segmentation
    Chapter 4 Empirical Study
        4.1 Data Preprocessing
            4.1.1 Data Description
            4.1.2 Data Labeling
            4.1.3 Tokenization and Word Embedding
        4.2 Modeling
        4.3 Prediction Model Results
            4.3.1 Complaint Only
            4.3.2 Extended Complaint
            4.3.3 Summary of prediction model results
        4.4 XAI Results
            4.4.1 Complaint Only
            4.4.2 Extended Complaint
            4.4.3 Summary of XAI results
    Chapter 5 Conclusion
    Reference
    Appendix
        Appendix A Prediction model results

