Author: 沈君輝 (Shen, Jun-Hui)
Thesis title: Inferring natural selection of SARS-CoV-2 from temporal sequence data
Original title: 從時間序列資料推測新冠病毒的自然選擇
Advisor: 張筱涵 (Chang, Hsiao-Han)
Committee members: 黃貞祥 (Ng, Chen-Siang); 林勇欣 (Lin, Yeong-Shin)
Degree: Master
Department: College of Life Sciences and Medicine, Institute of Bioinformatics and Structural Biology
Year of publication: 2022
Academic year of graduation: 110 (ROC calendar)
Language: Chinese
Number of pages: 41
Keywords: SARS-CoV-2, synonymous substitution, nonsynonymous substitution, evolution, selection
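  • Coronavirus disease 2019 (COVID-19), caused by the novel coronavirus SARS-CoV-2, has developed into a sustained worldwide epidemic since patient zero was diagnosed in December 2019. While causing large numbers of illnesses and deaths, the virus has also given rise to multiple variants. Previous studies have defined a number of viral strains together with their characteristic mutations. However, a systematic understanding is still lacking of the selective pressure acting on each segment of the SARS-CoV-2 genome and of the evolutionary similarities and differences among variants. To understand the evolution and extent of variation of this virus more systematically, I analyzed 55,418 SARS-CoV-2 genome sequences collected in the United States between January 2020 and August 2021. This study calculated the non-synonymous substitution rate (dN) and the synonymous substitution rate (dS) of 12 protein-coding sequences of SARS-CoV-2 and found that the dN/dS ratio of ORF1a, ORF1b, ORF3a, ORF7a, matrix, nucleocapsid, and spike tended to increase, suggesting that these genes may be under positive selection. Although the dN/dS analysis helped me identify genes under positive selection, it cannot identify the individual sites under positive selection. Therefore, to find sites more likely to be affected by positive selection, I identified mutations that increased significantly over time within each variant and marked their positions on the spike protein structure. Through these analyses, I identified a total of 100 sites that are more likely to harbor important mutations, comprising 125 distinct mutations. These include sites mentioned in other studies as well as sites that have not yet received attention. My research provides a comprehensive exploration of, and insight into, the evolution of SARS-CoV-2, particularly natural selection.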


    COVID-19, caused by SARS-CoV-2, has developed into a global pandemic that has led to about four hundred million cases and over five million deaths worldwide since the first case was reported in December 2019. Previous studies have identified several strains of SARS-CoV-2 by their mutations. However, a comprehensive understanding of the effect of selective pressure on each protein of SARS-CoV-2 is still lacking. In this study, I analyzed 55,418 SARS-CoV-2 genome sequences collected from the United States between January 2020 and August 2021. I calculated the dN and dS of 12 protein-coding sequences. The dN/dS ratio tends to increase in ORF1a, ORF1b, ORF3a, ORF7a, matrix, nucleocapsid, and spike, indicating that these proteins may be affected by positive selection. To find sites with a greater chance of being affected by positive selection, I identified 125 mutations at 100 sites that increased significantly over time within each variant and colored these sites in the spike protein structure. This research provides comprehensive discussion of and insight into the evolution and selection of SARS-CoV-2.
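The dN/dS comparison described in the abstract rests on classifying each codon change as synonymous or nonsynonymous and normalizing by the number of sites of each kind. The following is a minimal sketch of that idea, not the thesis's actual pipeline. It assumes two aligned, in-frame coding sequences of equal length and the standard genetic code, counts sites on the reference sequence only, skips codons that differ at more than one position, and reports uncorrected proportions (pN, pS) rather than model-based dN and dS. The function names `synonymous_sites`, `count_differences`, and `pn_ps` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_sites(codon):
    """Expected number of synonymous sites in one codon (between 0 and 3)."""
    aa = CODON_TABLE[codon]
    total = 0.0
    for pos in range(3):
        syn = 0
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == aa:
                syn += 1
        total += syn / 3  # fraction of the 3 possible changes that are silent
    return total


def count_differences(ref, alt):
    """Count (synonymous, nonsynonymous) single-nucleotide codon differences."""
    syn = nonsyn = 0
    for i in range(0, len(ref) - 2, 3):
        a, b = ref[i:i + 3], alt[i:i + 3]
        if a == b or a not in CODON_TABLE or b not in CODON_TABLE:
            continue  # identical codon, or contains gaps/ambiguous bases
        if sum(x != y for x, y in zip(a, b)) != 1:
            continue  # simplification: multi-change codons are skipped
        if CODON_TABLE[a] == CODON_TABLE[b]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


def pn_ps(ref, alt):
    """Return (pN, pS): differences per nonsynonymous / synonymous site."""
    codons = [ref[i:i + 3] for i in range(0, len(ref) - 2, 3)]
    codons = [c for c in codons if c in CODON_TABLE]
    s_sites = sum(synonymous_sites(c) for c in codons)
    n_sites = 3 * len(codons) - s_sites
    syn, nonsyn = count_differences(ref, alt)
    p_n = nonsyn / n_sites if n_sites else float("nan")
    p_s = syn / s_sites if s_sites else float("nan")
    return p_n, p_s


if __name__ == "__main__":
    # GCT -> GTT changes alanine to valine: one nonsynonymous difference.
    print(pn_ps("ATGGCT", "ATGGTT"))
```

A ratio pN/pS that rises across successive sampling windows would correspond to the increasing tendency reported in the abstract. Full analyses of this kind normally add a multiple-substitution correction and handle codons with several changes, both of which this sketch omits.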

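    Abstract (Chinese)
    Abstract (English)
    Preface
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Background
    Chapter 2 Materials and Methods
      2.1 Acquisition and processing of sequence data
      2.2 Nextstrain clade assignment and within-clade evolutionary analysis
      2.3 Calculation of dN and dS
    Chapter 3 Results and Discussion
      3.1 Changes in variants
      3.2 Natural selection analysis
      3.3 Site analysis
    Chapter 4 Conclusion
    Chapter 5 Figures and Tables
    Chapter 6 References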

