Graduate student: 戴維 (José David de la Bastida Castillo)
Thesis title: Software for Gene Expression Data Analysis
Advisor: 陳朝欽 (Chen, Chaur-Chin)
Oral defense committee: 朱學亭 (Chu, Hsueh-Ting); 高成炎 (Kao, Cheng-Yan)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Institute of Information Systems and Applications
Year of publication: 2013
Academic year of graduation: 101 (Republic of China calendar, 2012-2013)
Language: English
Number of pages: 30
Keywords: clustering, classification, microarray, K-means, gene expression
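  • Experimental studies on DNA, RNA, and protein microarrays typically involve thousands of genes and can generate large volumes of information. Once processed, this information becomes the input data for research that assesses the overall state of a cell or an organism. The purpose of this work is to provide software tools that process any microarray dataset (given the corresponding gene-versus-patient matrix) quickly and accurately and, at the same time, to lay out the basic processing steps needed to obtain useful results.
    Our approach is summarized as follows: first, we use the Fisher linear ratio to find the k individually best representative features; we then apply classification and clustering techniques to obtain sound results, and display those results as one-dimensional and two-dimensional dendrograms.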


    ABSTRACT

    Software for Gene Expression Data Analysis
    By
    José David de la Bastida Castillo

    Experiments on DNA, RNA, and protein microarrays, which normally include thousands of genes, can generate large volumes of information. Once processed, this information becomes the input data for research that aims to assess the overall state of a cell or an organism. The purpose of the present work is to provide software tools for processing any microarray dataset (given the corresponding gene-versus-patient matrix) quickly and accurately and, at the same time, to state the minimum steps required to obtain usable results from such processing.
    The methodology is as follows: first, we use the Fisher linear ratio to find the k individually best representative features; we then apply classification and clustering to these features in order to verify the results. Finally, we present a graphical interpretation of the outcome via dendrograms.
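The feature-selection step described above can be illustrated with a small sketch. It is not the thesis software: it assumes the common two-class form of the Fisher ratio, (m0 - m1)^2 / (v0 + v1), computed independently for each gene, and the function names (`fisher_ratio`, `best_k_genes`) and the toy expression values are hypothetical.

```python
from statistics import mean, pvariance


def fisher_ratio(values, labels):
    """Two-class Fisher ratio of one gene: (m0 - m1)^2 / (v0 + v1).

    Assumption: labels are 0 or 1 and both classes are non-empty.
    """
    group0 = [v for v, y in zip(values, labels) if y == 0]
    group1 = [v for v, y in zip(values, labels) if y == 1]
    between = (mean(group0) - mean(group1)) ** 2
    within = pvariance(group0) + pvariance(group1)
    # A gene that is constant within both classes has zero within-class variance.
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def best_k_genes(matrix, labels, k):
    """Return the row indices of the k genes with the largest Fisher ratio."""
    scores = [fisher_ratio(row, labels) for row in matrix]
    ranked = sorted(range(len(matrix)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy gene-versus-patient matrix (hypothetical values): 4 genes x 6 patients.
expression = [
    [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],  # gene 0: strongly separates the classes
    [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # gene 1: nearly identical class means
    [3.0, 3.5, 2.5, 4.0, 4.5, 3.5],  # gene 2: weakly separates the classes
    [0.5, 0.4, 0.6, 0.5, 0.6, 0.4],  # gene 3: nearly identical class means
]
patient_labels = [0, 0, 0, 1, 1, 1]

top_two = best_k_genes(expression, patient_labels, k=2)
print(top_two)
```

Each gene is scored on its own, which matches the "individually best" wording of the abstract; correlations between genes are ignored. The selected rows could then be passed to a classifier or a clustering routine, as the methodology describes.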

    Chapter 1 Introduction
    Chapter 2 Overview of Microarray Image Processing
      2.1 Spot Detection
      2.2 Determination of the Spot Area
      2.3 Feature Computation
    Chapter 3 Overview of Determining Outlier Genes
      3.1 MA Plot
      3.2 Normalization
      3.3 Outliers Discovery
    Chapter 4 The Experiment
      4.1 The Input Datasets
      4.2 Best k Genes
      4.3 Classification
        4.3.1 Fisher’s Linear Classifier
        4.3.2 Naive Bayes Classifier
      4.4 Clustering
        4.4.1 K-means Clustering
        4.4.2 Hierarchical Clustering
    Chapter 5 Results
      5.1 Classification Results
      5.2 K-means Results
      5.3 Graphic Representation Using Dendrograms
      5.4 Graphic Representation Using 2D-Dendrograms
      5.5 The Most Differentially k Expressed Genes
      5.6 Gene Network Example
        5.6.1 Example 1
        5.6.2 Example 2
    Chapter 6 Conclusion
    References

    Full-text availability: the full text is not authorized for public release on the campus network or on off-campus networks.