
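Graduate student: 林后賢 (Lin, Hou-Hsien)
Thesis title: A web tool for visual summary of mutations in cancer cohorts
Thesis title (Chinese): 癌症基因體變異視覺化整合分析工具
Advisor: 呂平江 (Lyu, Ping-Chiang)
Committee members: 黃柏榕 (Huang, Po-Jung); 李季青 (Lee, Chi-Ching)
Degree: Master
Department: College of Life Sciences and Medicine - Institute of Bioinformatics and Structural Biology
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar, 2018-2019)
Language: English
Number of pages: 60
Keywords (translated from Chinese): cancer genome, whole-exome sequencing, gene annotation, mutational signature
Keywords (English): CoMut plot, WES, Annotation, Mutational Signature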
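  • The CoMut plot is widely used in cancer genomics research. Bar charts rank the genes with the highest mutation frequency across the cohort from high to low, and also make it easy to spot the individuals carrying the most mutations in their genomes. In addition, a heatmap shows, for each individual, the extent of mutation and the variant types in each selected gene. These panels are then stitched together programmatically so that a single figure presents the landscape of genomic variation across the individuals in the study cohort. The complete analysis involves file format conversion, variant site annotation, prediction of significantly mutated genes, statistical analysis of variant types, and mutational signature analysis. A few software tools can already analyze the data and draw such a combined figure, but they have several shortcomings: (1) they do not support mainstream file formats such as VCF; (2) they lack prediction of significantly mutated genes and of mutational signatures; and (3) they lack comparison across cancer cohorts. Some also require the user to write code to run the analysis and draw the plot, which is a barrier for biological researchers without a programming background.
    We therefore developed a web tool, CoMutPlotter, which users can operate on their own without a bioinformatics background: they upload the cancer genome mutation data of their study cohort, and the tool runs a fully automated analysis and produces the CoMut plot. CoMutPlotter supports several genomic variant formats (TSV, MAF, VCF). Variant sites pass through gene function annotation, cancer driver gene identification, and mutational signature recognition, and all results are finally integrated into a CoMut plot. Users can also compare their own data against existing cancer genome databases (TCGA/ICGC), which lets them examine differences in cancer data between countries, and every chart produced by the analysis is available for download.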
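    The ranking step behind a CoMut plot (genes ordered by mutation frequency, samples ordered by mutation burden, and one heatmap cell per gene and sample) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not CoMutPlotter's actual code: the record layout, the function name `summarize_cohort`, the example data, and the "Multi_Hit" label for a gene hit more than once in the same sample are all hypothetical choices made for the sketch.

```python
from collections import Counter, defaultdict

# Hypothetical MAF-like records: (sample_id, gene_symbol, variant_class).
mutations = [
    ("S1", "TP53", "Missense_Mutation"),
    ("S1", "TP53", "Nonsense_Mutation"),
    ("S1", "KRAS", "Missense_Mutation"),
    ("S2", "TP53", "Missense_Mutation"),
    ("S3", "PIK3CA", "Missense_Mutation"),
]


def summarize_cohort(records):
    """Return ranked genes, ranked samples, and the gene-by-sample cell matrix."""
    sample_ids = {sample for sample, _, _ in records}
    cells = defaultdict(dict)  # gene -> sample -> label shown in the heatmap cell
    burden = Counter()         # sample -> total number of mutations

    for sample, gene, variant_class in records:
        burden[sample] += 1
        # Assumption: several hits in one gene of one sample collapse to "Multi_Hit".
        cells[gene][sample] = "Multi_Hit" if sample in cells[gene] else variant_class

    # Fraction of samples carrying at least one mutation in each gene.
    frequency = {gene: len(hit) / len(sample_ids) for gene, hit in cells.items()}

    # Highest frequency first; ties are broken alphabetically for stable output.
    ranked_genes = sorted(frequency.items(), key=lambda item: (-item[1], item[0]))
    # Highest mutation burden first.
    ranked_samples = sorted(burden.items(), key=lambda item: (-item[1], item[0]))
    return ranked_genes, ranked_samples, {gene: dict(hit) for gene, hit in cells.items()}


ranked_genes, ranked_samples, cell_matrix = summarize_cohort(mutations)
for gene, fraction in ranked_genes:
    print(f"{gene}\t{fraction:.0%}")
```

    The two ranked lists correspond to the bar charts along the edges of the plot, and `cell_matrix` holds the labels that would be colored in the central heatmap.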


    A CoMut plot is a visual summary of mutational patterns in a cancer cohort and is widely used in cancer research. The plot summarizes the mutation rate of each gene and the mutation burden of each sample, along with relevant clinical details. To date, two web-based tools, cBioPortal and iCoMut, create such intricate visualizations, but they only allow users to select TCGA and ICGC data. For custom data, only a few command-line packages are available, and each is limited to specific file formats. It is therefore difficult for researchers without a bioinformatics background to generate a CoMut plot from their own data. To meet the need for CoMut plots of custom data, and to let users compare their data with TCGA/ICGC data, we created CoMutPlotter, an easy-to-use, automated web-based tool that produces publication-quality graphs.
    CoMutPlotter accepts unannotated input in various file formats and converts it into a CoMut plot and an annotation report. It also compares mutation patterns between custom data and TCGA/ICGC projects, detects the top driver genes in the cohort, and estimates the contributions of COSMIC mutational signatures in individual samples.
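    As background for the mutational signature step, the sketch below shows the usual first stage of such an analysis: collapsing each single-nucleotide variant onto its pyrimidine reference base, which yields the six substitution classes C>A, C>G, C>T, T>A, T>C, and T>G. A full signature analysis goes further, counting 96 trinucleotide contexts and fitting them against the COSMIC reference signatures; that part is not shown here. The function names and example variants are hypothetical, and nothing here is taken from the thesis's own code.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
SUBSTITUTION_CLASSES = ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")


def substitution_class(ref, alt):
    """Map a single-nucleotide variant to its pyrimidine-based class."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-nucleotide substitution: {ref}>{alt}")
    # A purine reference (A or G) is reported on the opposite strand,
    # so both alleles are complemented.
    if ref in ("A", "G"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def substitution_spectrum(variants):
    """Return the proportion of each of the six classes for one sample."""
    counts = Counter(substitution_class(ref, alt) for ref, alt in variants)
    total = sum(counts.values())
    if total == 0:
        return {name: 0.0 for name in SUBSTITUTION_CLASSES}
    return {name: counts[name] / total for name in SUBSTITUTION_CLASSES}


# Hypothetical (ref, alt) pairs for one sample.
example_variants = [("C", "T"), ("G", "A"), ("T", "G"), ("A", "C")]
example_spectrum = substitution_spectrum(example_variants)
print(example_spectrum)
```

    In this example G>A is counted together with C>T, and A>C together with T>G, because each pair describes the same change seen from opposite strands.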

    Table of contents
    Chinese Abstract
    Abstract
    Acknowledgements
    List of abbreviations
    Chapter 1. Introduction
    Chapter 2. Materials & Methods
      2.1 Whole exome sequencing analysis
      2.2 Data source
      2.3 Functional consequence annotation
      2.4 Cancer driver gene identification
      2.5 Mutational signature recognition
      2.6 Website interface
    Chapter 3. Result & Discussion
      3.1 CoMutPlotter Framework
      3.2 Example of use
      3.3 Web interface
      3.4 Comparison of the features across similar tools
      3.5 Pan-cancer and cancer-specific immuno-peptide database
    Chapter 4. Conclusion
    Chapter 5. Figure & Table
      Figure 1. Framework of CoMutPlotter
      Figure 2. Overview of the CoMutPlotter web interface
      Figure 3.1 Interactive filters of CoMutPlotter
      Figure 3.2 Interactive filters of CoMutPlotter
      Figure 3.3 Interactive filters of CoMutPlotter
      Figure 4. Cross-project comparison
      Figure 5. Download & Report Generation
      Table 1. Comparison of the features of similar tools for CoMut-like plot generation
    Reference
    Appendix
      Paper
      The code of CoMut plot

