Blood Cell Deconvolution with Linear Mixed Model | National Tsing Hua University Theses and Dissertations Repository

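Graduate student:	Wu, Chun-Liang (吳俊良)
Thesis title:	Blood Cell Deconvolution with Linear Mixed Model (使用線性混和模型分解血球細胞組成)
Advisor:	Hsieh, Wen-Ping (謝文萍)
Committee members:	黃冠華, 徐南蓉
Degree:	Master
Department:	College of Science - Institute of Statistics
Year of publication:	2018
Academic year of graduation:	106 (ROC calendar)
Language:	English
Number of pages:	54
Keywords (Chinese):	反卷積 (deconvolution), mRNA微陣列 (mRNA microarray), 線性 (linear)
Keywords (English):	Deconvolution, microarray, Linear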

Blood is a mixture of different cell types, so the gene expression measured in a blood sample is a mixture as well. Each cell type has its own specific expression profile, and the profiles of different cell types may be correlated. Decomposing mixed expression profiles into cell type-specific expression profiles and their respective cellular proportions is therefore a difficult problem.

Previous studies usually first build a model on reference data that provide cell-specific profiles and then use it to estimate the cellular proportions in whole blood. We propose a Linear Mixed Model for Deconvolution (LMMD) that models the reference profiles and the mixture together in the same framework, with shared parameters, to estimate the cell-specific expression levels in the mixture. The unknown cellular proportions are obtained at the same time. We also establish signature gene selection criteria specifically for LMMD and compare the LMMD analysis pipeline with four other models. LMMD performs better when the reference data and the mixture data come from different experiments.
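The linear mixing assumption behind reference-based deconvolution can be illustrated with a short sketch. This is not LMMD, whose formulation is not given on this page. It is a plain least-squares baseline of the kind the thesis compares against. The toy signature matrix, the true proportions, the function names, and the clip-and-renormalize step are all assumptions made here for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def estimate_proportions(signature, mixture):
    """Least-squares estimate of cell-type proportions.

    signature: one row per gene, one column per cell type (reference profiles).
    mixture: one mixed expression value per gene.
    """
    k = len(signature[0])
    # Normal equations: (S^T S) p = S^T m.
    sts = [[sum(row[i] * row[j] for row in signature) for j in range(k)]
           for i in range(k)]
    stm = [sum(row[i] * m for row, m in zip(signature, mixture))
           for i in range(k)]
    raw = solve_linear_system(sts, stm)
    # Assumed post-processing: clip negatives to zero and rescale to sum to 1.
    clipped = [max(v, 0.0) for v in raw]
    total = sum(clipped)
    return [v / total for v in clipped]


# Hypothetical toy data: 4 signature genes, 3 cell types.
signature = [
    [10.0, 1.0, 2.0],
    [1.0, 8.0, 1.0],
    [2.0, 1.0, 9.0],
    [5.0, 5.0, 5.0],
]
true_proportions = [0.5, 0.3, 0.2]
# Noise-free mixture: each gene is a proportion-weighted sum of the profiles.
mixture = [sum(s * p for s, p in zip(row, true_proportions))
           for row in signature]

estimated = estimate_proportions(signature, mixture)
print(estimated)
```

With noise-free data from the same reference profiles, the baseline recovers the proportions up to floating-point error. The case the abstract highlights, where reference and mixture come from different experiments, violates this idealized setup.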

Introduction
Methods
  1 The algorithm of other deconvolution models compared in this study
    1.1 Linear least squares regression (LLSR)
    1.2 Non-negative least squares model (NNLS)
    1.3 Digital Sorting Algorithm (DSA)
    1.4 CIBERSORT
  2 Linear Mixed Deconvolution Model
    2.1 Main model
    2.2 Evaluation Criterion
  3 Signature gene selection strategy
Result
  1 Evaluation through simulation
  2 Real data analysis
    2.1 Data description
    2.2 Effect of log transformation
    2.3 Comparison of models with three benchmark data sets
    2.4 Effect of gene selection
    2.5 When the cell-specific profiles are from different experiments
Conclusion and Discussion
References
                                
