
Author: 林君衡 (Lin, Chun-Heng)
Thesis Title: 基於檢驗數據預測出院報告之語句出現 (Sentence Appearance Prediction in Discharge Summaries Based on Lab Test Data)
Advisor: 蘇豐文 (Soo, Von-Wun)
Committee Members: 邱瀞德 (Chiu, Ching-Te), 沈之涯 (Shen, Chih-Ya)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of Publication: 2019
Graduation Academic Year: 108
Language: English
Number of Pages: 38
Keywords (Chinese and English): sentence clustering, discharge report, deep learning, missing value problem
  • Deep learning is flourishing in the medical domain. Most medical research focuses on diagnosis and treatment; few studies address post-discharge care. Our research focuses on the discharge summary, an important document in post-discharge care, and in particular on predicting whether individual sentences appear in it. We use several recurrent neural networks to address the missing-value problem and evaluate their performance on predicting sentence appearance. In addition, we feed a flag indicating whether each measurement is abnormal into the networks to improve their performance. On a dataset generated by a clustering algorithm we reach an AUC score of 0.712; on a dataset generated by keyword selection we reach an AUC score of 0.753, suggesting that more data yields better results. Although our study covers only hematology-related sentence prediction, the method can also be applied to other departments.
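The AUC scores reported above can be computed without any machine-learning library. A minimal rank-based (Mann-Whitney) sketch, not the thesis's actual evaluation code:

```python
def auc_score(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen positive
    example receives a higher score than a randomly chosen negative one.
    labels: 0/1 ground truth; scores: model outputs, higher = more positive."""
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Group tied scores and give every member the average rank (1-based).
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            if pairs[k][1] == 1:
                rank_sum_pos += avg_rank
        i = j
    # Mann-Whitney U statistic normalized by the number of pos/neg pairs.
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)
```

For example, `auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` evaluates to 0.75.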


    Deep learning has prospered in the medical domain. Most existing studies focus on diagnosis and treatment rather than post-discharge care. Our research targets the discharge summary of a patient, an important element of post-discharge health care; we focus on predicting the appearance of sentences in the discharge summary that are related to lab test data. We use several RNNs to cope with missing values in the medical data and evaluate their performance on predicting the appearance of clustered sentences. In addition, we introduce a flag indicating abnormal values as an input to improve the performance of the RNN models. We reach an average AUC score of 0.712 when predicting clustered sentences. Moreover, on a keyword-selected sentence dataset, which contains more data than the clustered-sentence dataset, we obtain an AUC score of 0.753, suggesting that feeding more data during training may yield better performance. Although we only conduct experiments on hematology-department sentences, our methods can be generalized to prediction on the reports of other departments.
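The "Gated Recurrent Unit with Decay Factor" listed in the methodology follows the input-decay idea of GRU-D (Che et al., 2018): when a lab value is missing, the network's input decays exponentially from the last observed value toward the variable's training-set mean as the gap since the last observation grows. A minimal sketch of that imputation step only (the decay parameters `w_gamma` and `b_gamma` are illustrative, not the thesis's learned values):

```python
import math

def decay_impute(x_seq, m_seq, x_mean, w_gamma=1.0, b_gamma=0.0):
    """Input-decay imputation in the spirit of GRU-D.

    x_seq:  observed values (entry ignored where the mask is 0)
    m_seq:  masks, 1 = observed, 0 = missing
    x_mean: empirical mean of this variable over the training set
    Returns the imputed sequence: a missing value decays from the last
    observation toward the mean as time since that observation grows.
    """
    imputed = []
    x_last = x_mean   # before any observation, fall back to the mean
    delta = 0.0       # steps elapsed since the last observation
    for x, m in zip(x_seq, m_seq):
        if m == 1:
            imputed.append(x)
            x_last, delta = x, 0.0
        else:
            delta += 1.0
            gamma = math.exp(-max(0.0, w_gamma * delta + b_gamma))
            imputed.append(gamma * x_last + (1.0 - gamma) * x_mean)
    return imputed
```

In GRU-D proper, `w_gamma` and `b_gamma` are trained jointly with the GRU, and a matching decay is also applied to the hidden state; the abnormal-value flag mentioned in the abstract would simply be concatenated to this imputed input at each time step.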

    摘要 (Abstract in Chinese)
    Abstract
    Acknowledgement
    List of Tables
    List of Figures
    1 Introduction
    2 Related Work
    3 Methodology
      3.1 Clustering
        3.1.1 Canopy Clustering
        3.1.2 DBScan
      3.2 Gated Recurrent Unit with Decay Factor
    4 Experiments and Results
      4.1 Dataset
        4.1.1 Data preprocessing
      4.2 Implementation details
      4.3 Experiment result
      4.4 Discussion
    5 Conclusion and Future Work
      5.1 Conclusion
      5.2 Future Work
    References
    Appendix
      Appendix A

