
Author: Huang, Teng-Sheng (黃騰陞)
Thesis Title: A Study on Unsupervised Learning Algorithm SOM (初探非監督式學習SOM)
Advisor: Chen, Chaur-Chin (陳朝欽)
Committee Members: Huang, Chung-Lin (黃仲陵); Chang, Long-Wen (張隆紋)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science
Year of Publication: 2021
Academic Year of Graduation: 109 (2020–2021)
Language: Chinese
Number of Pages: 26
Keywords (Chinese): Unsupervised Learning, Supervised Learning, Data Compression
Keywords (English): SOM, Unsupervised Learning, Unsupervised Learning Algorithm
Abstract: This thesis studies the unsupervised-learning Self-Organizing Map algorithm, hereinafter referred to as SOM. Like other artificial neural networks (ANNs), SOM is a mathematical model that imitates the human neural network, but it differs from other ANN models in its design. The SOM algorithm makes neurons that are similar to an input vector cluster together, and this clustering is how learning takes place; similarity is computed from the Euclidean distance. The neuron closest to the input vector is called the winner neuron, or best matching unit (BMU). The BMU is adjusted toward the input vector, and training continues until SOM can no longer find any neuron to update; the details are discussed in the following chapters. Experimental results on the 8OX, Iris, Breast Cancer Wisconsin (Diagnostic), and HIV-1 protease cleavage data sets are compared with traditional clustering algorithms such as K-means and Agglomerative Hierarchical Clustering, and with the supervised-learning methods AdaBoost and Random Forest.
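The BMU search and neighbourhood update described in the abstract can be sketched in plain Python. This is a generic Kohonen-style SOM for illustration only, not the thesis's MiniSOM-based implementation; the grid size, linear decay schedules, and Gaussian neighbourhood are assumptions chosen for brevity.

```python
import math
import random

def train_som(data, rows, cols, n_iter=500, lr0=0.5, sigma0=None, seed=0):
    """Train a small Self-Organizing Map on `data` (a list of feature vectors).

    For each randomly drawn sample: find the best matching unit (BMU) by
    Euclidean distance, then pull the BMU and its grid neighbours toward the
    sample with a decaying learning rate and a shrinking neighbourhood radius.
    """
    dim = len(data[0])
    rng = random.Random(seed)
    # One weight vector per neuron on the rows x cols grid.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0

    def bmu(x):
        # Best matching unit: the neuron whose weights are closest to x.
        return min(weights, key=lambda p: sum((w - xi) ** 2
                                              for w, xi in zip(weights[p], x)))

    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)               # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 1e-9  # shrinking neighbourhood radius
        x = data[rng.randrange(len(data))]
        br, bc = bmu(x)
        for (r, c), w in weights.items():
            grid_d2 = (r - br) ** 2 + (c - bc) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
            for i in range(dim):
                w[i] += lr * h * (x[i] - w[i])
    return weights, bmu
```

With two well-separated clusters of 2-D points, the trained map sends samples from each cluster to different grid units, which is the clustering behaviour the abstract describes.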

    Chapter 1 Introduction .............................................1
    Chapter 2 Background Review ........................................3
    2.1 The Self-Organizing Map ........................................3
    2.2 Learning Algorithm .............................................3
    2.2.1 Perceptron Learning Algorithm ................................4
    2.2.2 SOM's Learning Algorithm .....................................5
    Chapter 3 Data Description and Experimental Results ...............11
    3.1 Training Data Sets ............................................11
    3.1.1 HIV-1 Protease Cleavage Data Set ............................11
    3.1.2 Breast Cancer Wisconsin Diagnostic Data Set .................12
    3.1.3 8OX Character Data Set ......................................13
    3.1.4 Iris Flower Data Set ........................................13
    3.2 MiniSOM .......................................................13
    3.3 One-Hot Encoding ..............................................14
    3.4 Experimental Results ..........................................15
    3.4.1 Results on HIV-1 and Breast Cancer Data Sets ................16
    3.4.2 Results on 8OX and Iris Data Sets ...........................20
    Chapter 4 Conclusion ..............................................24
    References ........................................................25
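Section 3.3 of the outline covers one-hot encoding, which the thesis applies before training (e.g. to turn HIV-1 octamer sequences into numeric vectors). A minimal sketch of the idea; the three-letter alphabet below is a hypothetical example, not the thesis's actual amino-acid alphabet:

```python
def one_hot(sequence, alphabet):
    """Encode each symbol of `sequence` as a one-hot vector over `alphabet`,
    concatenating the vectors into a single flat feature list."""
    index = {s: i for i, s in enumerate(alphabet)}
    encoded = []
    for symbol in sequence:
        vec = [0] * len(alphabet)
        vec[index[symbol]] = 1  # exactly one position set per symbol
        encoded.extend(vec)
    return encoded
```

For example, `one_hot("AC", "ACD")` yields `[1, 0, 0, 0, 1, 0]`: one three-element block per symbol.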

    [Hopf1982] J. J. Hopfield, “Neural Networks and Physical Systems with Emergent Collective Computational Abilities,” Proceedings of the National Academy of Sciences, USA, Vol. 79, 2554–2558, April 1982.
    [Koho1990] T. Kohonen, “The Self-Organizing Map,” Proceedings of the IEEE, 78(9), 1464–1480, 1990.
    [Rogn1990] T. Rögnvaldsson, L. You, and D. Garwicz, “State of the Art Prediction of HIV-1 Protease Cleavage Sites,” Bioinformatics, 31(8), 1204–1210, 2015.
    [Rume1985] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning Internal Representations by Error Propagation,” Technical Report, Mar–Sep 1985.
    [Shyu2008] A. A. Akinduko, E. M. Mirkes, and A. N. Gorban, “SOM: Stochastic Initialization versus Principal Components,” Information Sciences, 364–365, 213–221, doi:10.1016/j.ins.2015.10.013, 2016.
    [Web01] HIV-1 Protease Cleavage Data Set, Thorsteinn Rögnvaldsson, “https://archive.ics.uci.edu/ml/datasets/HIV-1+protease+cleavage”, last access on June 30, 2021.
    [Web02] Breast Cancer Wisconsin (Diagnostic) Data Set, William H. Wolberg, W. Nick Street, and Olvi L. Mangasarian, “https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)”, last access on June 30, 2021.
    [Web03] Iris Flower Data Set, Wikipedia, “https://en.wikipedia.org/wiki/Iris_flower_data_set”, last access on June 30, 2021.
    [Web04] MiniSom, Giuseppe Vettigli, “https://github.com/JustGlowing/minisom”, last access on June 30, 2021.
    [Web05] Kohonen's Self Organizing Feature Maps, Ai-junkie, “http://www.ai-junkie.com/ann/som/som1.html”, last access on June 30, 2021.
    [Web06] MiniSom 2.2.9, Giuseppe Vettigli, “https://pypi.org/project/MiniSom”, last access on June 30, 2021.
    [Web07] OneHotEncoder, Scikit-learn, “https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html”, last access on June 30, 2021.
    [Web08] Hierarchical Clustering, Wikipedia, “https://en.wikipedia.org/wiki/Hierarchical_clustering”, last access on June 30, 2021.
    [Web09] K-means Clustering, Wikipedia, “https://en.wikipedia.org/wiki/K-means_clustering”, last access on June 30, 2021.
    [Web10] AdaBoost, Wikipedia, “https://en.wikipedia.org/wiki/AdaBoost”, last access on June 30, 2021.
    [Web11] Random Forest, Wikipedia, “https://en.wikipedia.org/wiki/Random_forest”, last access on June 30, 2021.
    [Web12] Decision Tree Learning, Wikipedia, “https://en.wikipedia.org/wiki/Decision_tree_learning”, last access on June 30, 2021.
