Author: | 陳奐宇 Chen, Huan-Yu |
---|---|
Thesis title: | 電子健保資料在臨床應用中的深度學習數據分析 Clinical AI with Deep Learning Data Analytics for Electronic Claims Records |
Advisor: | 李祈均 Lee, Chi-Chun |
Committee members: | 馬席彬 Ma, Hsi-Pin; 郭柏志 Kuo, Po-Chih; 曾意儒 Tseng, Yi-Ju; 鍾佳儒 Chung, Chia-Ru |
Degree: | 博士 Doctor |
Department: | 電機資訊學院 - 電機工程學系 Department of Electrical Engineering |
Year of publication: | 2024 |
Academic year of graduation: | 113 |
Language: | English |
Number of pages: | 71 |
Keywords (Chinese): | 電子健保資料 、臨床醫療人工智慧 、深度學習 |
Keywords (English): | Electronic Claims Records, Clinical Artificial Intelligence, Deep Learning |
電子健保資料(Electronic claims records, ECRs) 的運用在醫療保健領域呈現獨特的機會與挑戰。這些記錄龐大且標準化,然而往往提供的數據資訊含糊,需要複雜的分析工具來提取可行的見解。本文探討了深度學習技術在改善臨床AI應用方面的應用,並在三個相互關聯但各有不同的領域中做出了重要貢獻。每項研究都在前一項的基礎上進一步深入,以解決更廣泛的醫療場景。
首先,本研究的第一部分是開發一個新的預測型健康指數,這個指數利用深度學習從大規模人口基礎的健保資料中生成見解。此健康指數旨在提供一個全面的人口健康趨勢概況,反映了通過常規臨床數據捕捉到的各種健康關鍵指標的綜合影響,並且建立一個基本流程去展示如何利用數據來影響更廣泛的醫療策略。
在第二個研究中,我們通過實施基於Transformer的模型,提高了新發肺癌預測的準確性。該模型顯著提高了新發肺癌的篩檢能力。它採用嚴格的納入和排除標準來確保模型預測的強健性,從而應對臨床診斷中的一個關鍵挑戰——早期和準確的篩檢。基於第一項研究開發的資料分析方法,進一步深化將其應用於特定具體疾病的臨床篩檢。
本研究的第三部分調查了醫療資源使用中的消費不平等。通過檢查不同人群在醫療資源配置和消費上的差異,這項研究確定了預測模型中的顯著偏見。我們提出了減少這些偏見的策略,旨在促進醫療保健訪問和資源配置的公平性。它在前兩項研究的基礎上,利用從子群體分析獲得的見解來增強醫療處置的公平性。
總之,這些研究構成了一個大框架去利用ECRs提高醫療的準確性和公平性,展示了如何分析群體和個體數據去影響醫療策略和運作,最終改善臨床成果和資源分配。
Electronic claims records (ECRs) present both unique opportunities and specific challenges for healthcare: they are vast and standardized, yet the information they carry is often only loosely informative, so sophisticated analysis is needed to extract actionable insights. This dissertation applies deep learning techniques to improve clinical AI applications and makes substantial contributions in three interconnected areas, with each study building on the insights of its predecessor to address a broader range of healthcare scenarios.
The first study develops a novel predictive health index utilizing deep learning to distill insights from large-scale, population-based ECRs. This index provides a comprehensive view of population health trends, reflecting the impact of various health indicators captured through routine clinical data. It serves as a foundational pipeline that demonstrates how aggregated data can influence broader healthcare strategies.
The second study moves to individual patient screening, improving the accuracy of new-onset lung cancer prediction with a transformer-based model. The model improves the detection of new lung cancer cases and applies rigorous inclusion and exclusion criteria to keep its predictions robust, addressing a critical challenge in clinical diagnostics: early and accurate screening. It builds directly on the analytical methods developed in the first study by applying them to screening for a specific disease.
The third study examines consumption inequality in medical resource utilization. By analyzing how resource allocation and consumption differ across economic subgroups, it identifies significant biases in predictive models. We propose strategies to mitigate these biases, with the aim of promoting equity in healthcare access and resource allocation. This study extends the previous two by applying the insights gained from population-level and individual-level analyses to improve fairness in medical interventions.
Together, these studies form a cohesive framework that uses ECRs to enhance medical accuracy and fairness, demonstrating how aggregated and individual data insights can inform healthcare strategies and operations, ultimately improving clinical outcomes and resource distribution.