Graduate student: | 彭涵琪 |
---|---|
Thesis title: | 利用非負矩陣分解從大量生物晶片資料萃取白血球細胞基因表現譜 Meta-Expression Profile Retrieval for White Blood Cells with Non-Negative Matrix Factorization |
Advisor: | 謝文萍 |
Oral defense committee: | 莊永仁, 盧鴻興 |
Degree: | Master |
Department: | College of Science - Institute of Statistics |
Year of publication: | 2012 |
Academic year of graduation: | 100 (ROC calendar) |
Language: | Chinese |
Number of pages: | 38 |
Keywords: | NMF, leukocyte, microarray, meta-profiles |
Abstract
The immune system protects the human body against bacteria and viruses. It consists of various types of cells in the blood, and a major group among them is the white blood cells, which comprise several distinct types. These cell types have been studied extensively, but individually, with gene expression arrays. Each study assesses tens of thousands of genes with only a small number of samples. Combining all of these data and comparing the expression profiles in parallel makes it possible to explore the similarities and differences across white blood cell types.
Non-negative matrix factorization (NMF) is one of the most popular tools in multivariate analysis for decomposing high-dimensional data. This study aims to retrieve white blood cell type-specific meta-profiles from a large dataset collected across different platforms and experiments. We applied NMF to explore the meta-profiles of four major types of leukocytes: T cells, B cells, monocytes, and neutrophils. Array data were collected from the Gene Expression Omnibus (GEO). The meta-profiles derived with NMF carry information that is robust across the two commercial platforms, Affymetrix and Illumina. This robustness can be explained by the fact that the differences in expression patterns among the cell types considered are large relative to the differences across platforms or experiments. The minimal restrictions and assumptions of NMF also contribute to the accurate mapping between the meta-profiles and the mean profiles.
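As a rough illustration of the decomposition described in the abstract, the sketch below factors a small non-negative genes-by-samples matrix V into W (meta-profiles) and H (sample weights) using the multiplicative update rules of Lee and Seung (2001) for the squared-error objective. The toy matrix, the rank of 2, the iteration count, and all function names are assumptions made for illustration only; the thesis itself may have used a different algorithm, rank, or software.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor V (genes x samples) into W (genes x rank) and H (rank x samples)."""
    rng = random.Random(seed)
    n_genes, n_samples = len(v), len(v[0])
    # Random strictly positive starting values (an assumed initialization).
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n_genes)]
    h = [[rng.random() + eps for _ in range(n_samples)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n_samples)]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(n_genes)]
    return w, h


# Hypothetical toy data: 6 genes x 4 samples, two samples per "cell type".
toy = [
    [9.0, 8.0, 1.0, 1.0],
    [8.0, 9.0, 1.0, 2.0],
    [7.0, 8.0, 2.0, 1.0],
    [1.0, 1.0, 9.0, 8.0],
    [2.0, 1.0, 8.0, 9.0],
    [1.0, 2.0, 7.0, 8.0],
]
w, h = nmf(toy, rank=2)
```

In this reading, each column of `w` plays the role of a meta-profile over genes and each column of `h` gives the weights of those meta-profiles in one sample. Because the updates only multiply by non-negative ratios, both factors stay non-negative without any explicit constraint handling.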
References
Anttila, P., P. Paatero, et al. (1995). "Source identification of bulk wet deposition in Finland by positive matrix factorization." Atmospheric Environment 29(14): 1705–1718.
Beissbarth, T. and T. P. Speed (2004). "GOstat: Find statistically overrepresented Gene Ontologies within a group of genes." Bioinformatics 20(9): 1464–1465.
Ben-Israel, A. and T. N. E. Greville (2003). Generalized Inverses: Theory and Applications, 2nd edition. New York, Springer.
Brunet, J.-P., P. Tamayo, et al. (2004). "Metagenes and molecular pattern discovery using matrix factorization." PNAS 101(12): 4164–4169.
Devarajan, K. (2008). "Nonnegative Matrix Factorization: An Analytical and Interpretive Tool in Computational Biology." PLoS Computational Biology 4(7): e1000029.
Frigyesi, A. and M. Hoglund (2008). "Non-Negative Matrix Factorization for the Analysis of Complex Gene Expression Data: identification of Clinically Relevant Tumor Subtypes." Cancer Informatics 2008(6): 275–292.
Gaujoux, R. and C. Seoighe (2010). "A flexible R package for nonnegative matrix factorization." BMC Bioinformatics 11: 367.
GEO. "GEO website." from http://www.ncbi.nlm.nih.gov/geo/.
Lawton, W. H. and E. A. Sylvestre (1971). "Self modeling curve resolution." Technometrics 13(3): 617+
Lee, D. D. and H. S. Seung (1999). "Learning the parts of objects by non-negative matrix factorization." Nature 401(6755): 788–791.
Lee, D. D. and H. S. Seung (2001). "Algorithms for Non-negative Matrix Factorization." Advances in Neural Information Processing Systems 13: Proceedings of the 2000 Conference. MIT Press: pp. 556–562.
Liebermeister, W. (2002). "Linear modes of gene expression determined by independent component analysis." Bioinformatics 18(1): 51–60.
Paatero, P. and U. Tapper (1994). "Positive matrix factorization: A non-negative factor model with optimal utilization of error estimates of data values." Environmetrics 5: 111–126.
Raychaudhuri, S., J. M. Stuart, et al. (2000). "Principal components analysis to summarize microarray experiments: application to sporulation time series." Pacific Symposium on Biocomputing 5: 452–463.
Tamayo, P., D. Scanfeld, et al. (2007). "Metagene projection for cross-platform, cross-species characterization of global transcriptional states." PNAS 104(14): 5959-5964.