
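Graduate student: 陳俊儒
Chen, Chun-Ju
Thesis title: 利用環境因子及資料強化策略預測自殺行為
Suicidal Risk Prediction by Learning Environment with Data Augmentation Strategy
Advisor: 陳宜欣
Chen, Yi-Shin
Oral defense committee: 陳朝欽
Chen, Chaur-Chin
吳書儀
Wu, Shu-I
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: Chinese
Number of pages: 35
Keywords (Chinese, translated): suicide prediction, environmental factors, data augmentation, big data, machine learning
Keywords (English): suicidal risk prediction, environment, data augmentation, big data, machine learning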
    Suicide is a significant health issue worldwide. Links between suicide and psychiatric diseases are often theorized and verified by researchers; for instance, large electronic health records have been combined with machine learning techniques to detect individuals who intend to commit suicide. Viewed from a broader perspective, however, the environment is likely to affect both psychiatric diseases and suicide, so environmental conditions should be among the primary suicide risk factors. In this work, we analyze how environmental factors such as climate and air pollution influence suicide. Because of policy restrictions, the suicide statistics we can obtain are imbalanced and partial. We therefore introduce a novel data augmentation method that uses a time-series-based approach to scale and create both suicide-class and non-suicide-class data, so that our machine learning models can be trained and used for prediction. In essence, this time-series-based augmentation provides a well-balanced dataset. Our experimental results indicate that suicide is strongly associated with weather and air pollution.
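    This record does not give the details of the augmentation procedure, so the following is only a generic sketch of one building block that a time-series-based approach to non-suicide class data might use: locating the longest run of consecutive days with no registered case. The function name `longest_case_free_run`, its parameters, and the sample dates are hypothetical and are not taken from the thesis.

```python
from datetime import date, timedelta


def longest_case_free_run(start, end, case_days):
    """Return (first_day, last_day) of the longest run of consecutive days
    in [start, end] that have no registered case, or None if every day
    in the range has a case. Ties are resolved in favor of the earliest run.

    Hypothetical helper for illustration only; not the thesis's method.
    """
    case_days = set(case_days)
    best = None        # (run_start, run_end) of the longest run found so far
    run_start = None   # first day of the run currently being scanned
    day = start
    while day <= end:
        if day in case_days:
            run_start = None  # a registered case ends the current run
        else:
            if run_start is None:
                run_start = day
            # Keep the current run only if it is strictly longer than the best one.
            if best is None or (day - run_start) > (best[1] - best[0]):
                best = (run_start, day)
        day += timedelta(days=1)
    return best


# Made-up example dates: cases registered on 3 and 8 January.
cases = [date(2007, 1, 3), date(2007, 1, 8)]
run = longest_case_free_run(date(2007, 1, 1), date(2007, 1, 10), cases)
print(run)
```

    With these made-up dates the longest case-free run spans 4 to 7 January. Days inside such a run could then be labeled as non-suicide class samples, which is one possible way to balance the two classes.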

    Contents
    1 Introduction
    2 Related Work
        2.1 People Health Records
        2.2 Environment (Weather and Air Pollution)
    3 Data Sources
        3.1 Environment Components
            3.1.1 Meteorological Data
            3.1.2 Air Pollution Data
            3.1.3 Suicide Class Data
        3.2 Pre-processing
    4 Methodology
        4.1 Overview
        4.2 Timeline Distribution Conversion
        4.3 Non-Suicide Class Data Augmentation
            4.3.1 Baseline Approach
            4.3.2 IDEA
        4.4 Suicide Class Data Enhancement
    5 Modeling
        5.1 Baseline Approach
        5.2 IDEA
    6 Experimental Results
        6.1 Predictive Accuracy
        6.2 Feature Importances
    7 Conclusions and Future Work
    References

    List of Tables
    6.1 Registered Suicide Cases Distribution
    6.2 Results: Baseline Approach
    6.3 Results: IDEA Approach
    6.4 Results: Feature Importances

    List of Figures
    4.1 Framework
    4.2 Period with 12% of suicide class data
    4.3 Period with 100% of suicide class data
    4.4 Date Distribution of registered suicide cases at New Taipei City in 2007
    4.5 Baseline Approach
    4.6 Extract Suicide Days
    4.7 Clustering Suicide Days
    4.8 Form Periods
    4.9 Find Intensive Suicide Period and Discard the Un-Intensive Period
    4.10 Construct Longest Non-Suicide Consecutive Sequence
    4.11 Non-Suicide Class Data Augmentation
    4.12 Suicide Class Data Enhancement
    5.1 IDEA Modeling
    6.1 Suicide Class Data Enhancement

