
Graduate student: 王琮郁 (Tsong-Yuh Wang)
Thesis title: A Study on Analyzing Microarray Data using SVM and SOM (微晶片在SVM與SOM上的分析)
Advisor: 陳朝欽 (Chaur-Chin Chen)
Oral defense committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications
Year of publication: 2007
Academic year of graduation: 95 (ROC calendar)
Language: English
Number of pages: 30
Keywords: microarray, Support Vector Machines, Self-Organizing Map, gene expression, gene selection
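  • The discovery of genes and the completion of gene sequencing have driven the rapid development of microarrays. Researchers believe that repairing genes will help treat diseases. Once a microarray has been constructed, gene expression values can be computed from its scanned image. Our experiments focus on analyzing these numerical gene expression values, and the experimental results give us a better understanding of the properties of genes.
    Microarrays are expensive, so it is important to extract significant information from a limited number of them. We first obtain microarray gene expression data sets from six different sources and then use gene selection methods to choose the important genes. We then apply support vector machines and self-organizing maps to classify these six data sets, and we analyze the results to examine the effectiveness of the gene selection methods and the properties of the genes. The small number of genes chosen by the gene selection methods will be helpful for research in biology and medicine.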


    Researchers in biology and medical science believe that many kinds of diseases could be cured if the genes that cause them were repaired. Microarray technology has developed rapidly in recent years as a way to make human genes readable. Once a microarray image is obtained, the gene expression values can be computed with segmentation methods.

    After the gene expression values are computed from the microarray image, the next important task is to analyze them. Because the range of the gene expression values is too large to compute with directly, we first normalize the values. We then use smoothly clipped absolute deviation (SCAD) SVM and weighted punishment on overlap (WEPO) to screen for the important genes. Once these important genes (features) are found, they are used in two classification methods, support vector machines (SVM) and the self-organizing map (SOM). Finally, the experimental results of the gene selection and classification methods help us understand more of the properties of microarray data.
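    The normalization and SCAD steps mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which normalization the thesis uses, so per-gene z-scoring is an assumption here, and the function names `zscore_per_gene` and `scad_penalty` are hypothetical. The penalty follows the SCAD form as it is commonly stated in the statistics literature, with a = 3.7 as a frequently used default; this is not necessarily the exact formulation used in the thesis.

```python
import math


def zscore_per_gene(matrix):
    """Standardize each gene (column) to zero mean and unit variance.

    `matrix` is a list of samples, each a list of expression values.
    Assumption: z-scoring is used only for illustration; the thesis
    abstract does not name its normalization method.
    Genes with zero variance are mapped to 0.0.
    """
    n_samples = len(matrix)
    n_genes = len(matrix[0])
    normalized = [[0.0] * n_genes for _ in range(n_samples)]
    for g in range(n_genes):
        column = [row[g] for row in matrix]
        mean = sum(column) / n_samples
        var = sum((v - mean) ** 2 for v in column) / n_samples
        std = math.sqrt(var)
        for s in range(n_samples):
            normalized[s][g] = (column[s] - mean) / std if std > 0 else 0.0
    return normalized


def scad_penalty(w, lam, a=3.7):
    """SCAD penalty for a single weight w (requires lam > 0 and a > 2).

    Linear (like L1) for small weights, quadratic in the middle range,
    and constant for large weights, so large weights are not shrunk
    further while small ones are pushed toward zero.
    """
    t = abs(w)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return -(t * t - 2 * a * lam * t + lam * lam) / (2 * (a - 1))
    return (a + 1) * lam * lam / 2
```

    In a SCAD-penalized SVM the sum of `scad_penalty` over all gene weights replaces the usual squared-norm regularizer, and genes whose weights are driven to zero are the ones discarded by the selection step.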

    Contents
    Chapter 1 Introduction
    Chapter 2 Gene Expression Acquired from Microarray Images
    Chapter 3 A Review of SVM and SOM
      3.1 Support Vector Machines (SVM)
      3.2 Self-Organizing Map (SOM)
        3.2.1 Algorithm for Kohonen’s Self-Organizing Map
        3.2.2 Efficient Initialization Schemes for SOM
        3.2.3 Apply SOM for Classification
    Chapter 4 Gene Selection Methods
      4.1 Smoothly Clipped Absolute Deviation (SCAD) SVM
      4.2 Weighted Punishment on Overlap (WEPO)
    Chapter 5 Databases and Experimental Results
      5.1 Description of Databases
      5.2 Experimental Results
        5.2.1 Results of classification using SVM
        5.2.2 Results of classification using SOM
        5.2.3 Comparison of classification using SVM and SOM
    Chapter 6 Conclusion
    References

