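Recognition of Patients with Chronic Obstructive Pulmonary Disease by Applying Continuous Restricted Boltzmann Machine and Data-Mining Methods to Sensory Data of E-Nose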

Author:	Yu, Kuan-Chih (余觀至)
Thesis title:	Recognition of Patients with Chronic Obstructive Pulmonary Disease by Applying Continuous Restricted Boltzmann Machine and Data-Mining Methods to Sensory Data of E-Nose
Advisor:	Chen, Hsin (陳新)
Committee members:	Tang, Kea-Tiong (鄭桂忠); Lee, Chi-Chun (李祈均)
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication:	2017
Academic year of graduation:	106 (ROC calendar)
Language:	Chinese
Number of pages:	169
Keywords:	Machine learning, Continuous Restricted Boltzmann Machine, Chronic Obstructive Pulmonary Disease, Pattern recognition

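The purpose of this thesis is to use machine-learning algorithms to recognize Chronic Obstructive Pulmonary Disease (COPD). Previous literature shows that patients with COPD exhale specific organic compounds. An electronic nose (e-Nose) system can recognize COPD because it combines gas-sensing sensors with a microprocessor that stores the recognition method. Fitted with different sensors, an e-Nose system can be used to recognize different odors.
To achieve recognition, this thesis follows a two-stage procedure: (1) data preprocessing and (2) recognition algorithms. For data preprocessing, the thesis proposes a procedure consisting of (1) baseline manipulation, (2) receiver operating characteristic (ROC) curve analysis, and (3) normalization. For recognition, the thesis uses the following methods: (1) the support vector machine, (2) linear discriminant analysis, (3) linear programming, and (4) a probabilistic classifier. In addition, because the Continuous Restricted Boltzmann Machine (CRBM) is capable of learning and of nonlinear computation, the thesis examines the effect of the CRBM on linear classifiers. Once a CRBM has learned the distribution of one class of data, the learned distribution is reconstructed after repeated sampling, regardless of the distribution of the input data. Based on this property, the thesis develops a CRBM probabilistic model that serves as a probabilistic classifier.
The results show that none of the recognition methods can effectively recognize unknown data, because the preprocessed COPD data overlap heavily and the algorithms cannot separate them. Examination of the preprocessing methods shows that the ROC curve step eliminates important features; without the ROC curve step, the distribution of the COPD data improves.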
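The three preprocessing steps named in the abstract (baseline manipulation, ROC-based sensor selection, normalization) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the fractional form of baseline manipulation, the AUC threshold of 0.7, min-max scaling, and all function names and toy data are hypothetical choices made for the example.

```python
import numpy as np

def baseline_manipulation(response, baseline):
    """Fractional baseline manipulation (assumed form): (R - R0) / R0."""
    return (response - baseline) / baseline

def sensor_auc(feature, labels):
    """Area under the ROC curve when one sensor alone is used as a binary classifier."""
    pos = feature[labels == 1]
    neg = feature[labels == 0]
    # Rank (Mann-Whitney) form of the AUC: fraction of positive/negative pairs
    # in which the positive sample has the larger value; ties count half.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def select_sensors(features, labels, threshold=0.7):
    """Keep sensors whose single-sensor AUC reaches the (hypothetical) threshold."""
    aucs = np.array([sensor_auc(features[:, j], labels)
                     for j in range(features.shape[1])])
    # A sensor is informative if it separates the classes in either direction.
    keep = np.maximum(aucs, 1.0 - aucs) >= threshold
    return keep, aucs

def min_max_normalize(features):
    """Scale every sensor column to the range [0, 1]."""
    lo = features.min(axis=0)
    span = features.max(axis=0) - lo
    span[span == 0] = 1.0  # leave constant columns at zero instead of dividing by zero
    return (features - lo) / span

# Toy data: 20 samples, 3 sensors; only sensor 0 carries class information.
rng = np.random.default_rng(0)
labels = np.array([0] * 10 + [1] * 10)
baseline = np.full((20, 3), 2.0)
response = baseline + rng.normal(0.0, 0.1, size=(20, 3))
response[labels == 1, 0] += 1.0

x = baseline_manipulation(response, baseline)
keep, aucs = select_sensors(x, labels)
x_norm = min_max_normalize(x[:, keep])
```

Because each sensor is scored on its own, a sensor that is uninformative alone but useful in combination with others would be discarded by a filter of this kind, which is one way a selection step can remove important dimensions.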

The purpose of this thesis is to recognize Chronic Obstructive Pulmonary Disease (COPD) by applying machine-learning algorithms. Previous literature has confirmed that most patients suffering from COPD exhale specific organic compounds. COPD could thus be diagnosed by using machine-learning algorithms to classify the sensory data of an electronic nose. An electronic nose (e-Nose) consists of an array of diverse neuromorphic sensors, each of which exhibits its own characteristic response to different odorants. Therefore, this study aims to identify a machine-learning algorithm able to detect COPD by classifying the sensory data of an e-Nose.
To ease data classification, the following methods are employed to preprocess the e-Nose data: (1) baseline manipulation, (2) sensor selection based on the receiver operating characteristic (ROC) curve, and (3) normalization. For data classification, the performance of the following three linear classifiers is compared: (1) the support vector machine, (2) linear discriminant analysis, and (3) linear programming. In addition, the Continuous Restricted Boltzmann Machine (CRBM) is employed as a nonlinear, probabilistic classifier, and the thesis further explores how the CRBM could improve the classification task. Based on the fact that the CRBM learns to regenerate its training data, an algorithm for estimating the likelihood of unknown data under a CRBM model is developed. This estimation algorithm enables the CRBM to function reliably as a probabilistic classifier.
However, our experimental results indicate that none of the algorithms can recognize unknown data, because the different classes of preprocessed COPD data overlap significantly with one another. Further analysis indicates that sensor selection based on the ROC curve filters out some important dimensions. Therefore, better classification results are achieved without the sensor selection.
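The reconstruction property that the CRBM-based classifier relies on can be illustrated with a small sketch. It assumes the commonly published CRBM formulation, in which each unit applies a sigmoid with a trainable gain to its noisy total input and training uses one-step contrastive divergence; the thesis's likelihood-estimation algorithm itself is not reproduced here. The class name, hyperparameters, gain floor, and toy data are all hypothetical.

```python
import numpy as np

class CRBM:
    """Minimal continuous restricted Boltzmann machine (illustrative sketch)."""

    def __init__(self, n_visible, n_hidden, sigma=0.2, lo=-1.0, hi=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        self.w = self.rng.normal(0.0, 0.1, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # biases (weights from an always-on unit)
        self.b_h = np.zeros(n_hidden)
        self.a_v = np.ones(n_visible)    # trainable sigmoid gains ("noise control")
        self.a_h = np.ones(n_hidden)
        self.sigma, self.lo, self.hi = sigma, lo, hi

    def _unit(self, total_input, gain):
        # s = lo + (hi - lo) * sigmoid(gain * (input + sigma * N(0, 1)))
        noisy = total_input + self.sigma * self.rng.standard_normal(total_input.shape)
        return self.lo + (self.hi - self.lo) / (1.0 + np.exp(-gain * noisy))

    def sample_hidden(self, v):
        return self._unit(v @ self.w + self.b_h, self.a_h)

    def sample_visible(self, h):
        return self._unit(h @ self.w.T + self.b_v, self.a_v)

    def train_step(self, v0, lr_w=0.05, lr_a=0.05):
        """One-step contrastive-divergence update on a batch of training vectors."""
        h0 = self.sample_hidden(v0)
        v1 = self.sample_visible(h0)
        h1 = self.sample_hidden(v1)
        n = v0.shape[0]
        self.w += lr_w * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += lr_w * (v0 - v1).mean(axis=0)
        self.b_h += lr_w * (h0 - h1).mean(axis=0)
        # Gain update; the 0.1 floor is a safeguard added for this sketch only.
        self.a_v = np.maximum(
            self.a_v + lr_a / self.a_v**2 * (v0**2 - v1**2).mean(axis=0), 0.1)
        self.a_h = np.maximum(
            self.a_h + lr_a / self.a_h**2 * (h0**2 - h1**2).mean(axis=0), 0.1)

    def reconstruct(self, v, n_steps=20):
        """Run repeated Gibbs sampling starting from arbitrary visible states."""
        for _ in range(n_steps):
            v = self.sample_visible(self.sample_hidden(v))
        return v

# Toy data: one class clustered around (0.5, -0.5) inside the unit range [-1, 1].
rng = np.random.default_rng(1)
train = np.clip(rng.normal([0.5, -0.5], 0.1, size=(200, 2)), -1.0, 1.0)

model = CRBM(n_visible=2, n_hidden=4)
for _ in range(200):
    model.train_step(train)

# Start from points unrelated to the training data and sample repeatedly.
start = rng.uniform(-1.0, 1.0, size=(50, 2))
recon = model.reconstruct(start)
```

Comparing `recon` with `train` (for example, by plotting both) shows how far the sampled points have moved toward the learned distribution. A classifier built on this idea would train one model per class and compare how well each model accounts for an unknown sample, but the specific estimator used in the thesis is not given in this record.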

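Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1    Introduction
1.1    Research Motivation
1.2    Contributions
1.3    Thesis Outline
Chapter 2    Literature Review
2.1    Introduction to Chronic Obstructive Pulmonary Disease (COPD)
2.2    Introduction to Electronic Nose Systems
2.3    Linear Discriminant Analysis
2.4    Gaussian Mixture Model
2.5    Continuous Restricted Boltzmann Machine
2.6    Summary
Chapter 3    Data Preprocessing and Statistical Analysis
3.1    Data Preprocessing
3.1.1    Baseline Manipulation
3.1.2    Receiver Operating Characteristic (ROC) Curve
3.1.3    Data Normalization
3.2    Statistical Analysis of Linear Classifiers
3.2.1    Verifying the Linear Separability of the Data
3.2.2    Performance Evaluation of Linear Classifiers
3.3    Summary
Chapter 4    Applying the Continuous Restricted Boltzmann Machine to Preprocessed COPD Data
4.1    Effect of the CRBM on the Distribution of Preprocessed COPD Data
4.1.1    Fewer Hidden Neurons than Visible Neurons
4.1.2    More Hidden Neurons than Visible Neurons
4.1.3    Effect of the Data Distribution on the CRBM
4.1.4    Conclusion
4.2    Linear Classifier Combining the CRBM with Linear Programming
4.2.1    Discussion of the Fallacy of Using Linear Discriminant Analysis
4.2.2    Method for Validating the CRBM-plus-Linear-Programming Classifier
4.2.3    System Characterization and Validation Results
4.3    Summary
Chapter 5    Applying Probabilistic Models to Preprocessed COPD Data
5.1    Gaussian Mixture Model Applied to LDA as a Bayesian Classifier
5.1.1    Bayesian Classifier Based on the Gaussian Mixture Model
5.1.2    System Architecture and Validation Results
5.2    Applying the Reconstruction Property of the CRBM to Preprocessed COPD Data
5.2.1    Relationship between CRBM Reconstruction Points and Initial Points
5.2.2    CRBM-Based Probabilistic Classifier and Its Simulation Results
5.3    Summary
Chapter 6    Discussion of Data Preprocessing Methods and Grouping
6.1    Discussion of Problems with the Data Preprocessing Method
6.1.1    Parameter Settings of the Support Vector Machine
6.1.2    Analysis Results of the (Revised) Preprocessing Method
6.2    Three-Class Classification
6.2.1    Comparison of Three-Class Classification Methods
6.2.2    Statistical Analysis of Linear Classifiers
6.3    Summary
Chapter 7    Conclusions and Future Work
7.1    Conclusions
7.2    Future Work
References
Appendices
A.    ROC Curve Analysis with Each Sensor as a Binary Classification Model
B.    Table of Preprocessed COPD Data versus FEV1
C.    Leave-One-Out Cross-Validation of SVM on Preprocessed COPD Data
D.    Leave-One-Out Cross-Validation of LDA on Preprocessed COPD Data
E.    Leave-One-Out Cross-Validation of Linear Programming on Preprocessed COPD Data
F.    Learning Process of the CRBM Dimension-Reduction Simulation on Preprocessed COPD Data
G.    Learning Process of the CRBM High-Dimensional Simulation on Preprocessed COPD Data
H.    Classification Results of the "LDA+GMM Bayesian Classifier" on Preprocessed COPD Data (All Data Used as Training Data)
I.    Leave-One-Out Cross-Validation of the "LDA+GMM Bayesian Classifier" on Preprocessed COPD Data
J.    Validation Results of the CRBM Probabilistic Model on Preprocessed COPD Data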


                                

