doi: 10.1016/j.virs.2024.01.006
Citation: Changqiao You, Shuai Jiang, Yunyun Ding, Shunxing Ye, Xiaoxiao Zou, Hongming Zhang, Zeqi Li, Fenglin Chen, Yongliang Li, Xingyi Ge, Xinhong Guo. RNA barcode segments for SARS-CoV-2 identification from HCoVs and SARSr-CoV-2 lineages. VIROLOGICA SINICA, 2024, 39(1): 156-168. http://dx.doi.org/10.1016/j.virs.2024.01.006

RNA barcode segments for SARS-CoV-2 identification from HCoVs and SARSr-CoV-2 lineages

  • Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the pathogen responsible for coronavirus disease 2019 (COVID-19), continues to evolve, giving rise to new variants and global reinfections. Previous research has demonstrated that barcode segments can identify specific species within closely related populations efficiently and at low cost. In this study, we designed and tested RNA barcode segments based on genetic evolutionary relationships to support efficient and accurate identification of SARS-CoV-2 among large collections of virus samples, including human coronaviruses (HCoVs) and SARSr-CoV-2 lineages. Nucleotide sequences from NCBI and GISAID were selected and curated to construct training sets comprising 1733 complete genome sequences of HCoVs and SARSr-CoV-2 lineages. Through genetic-level species testing, we validated the accuracy and reliability of the barcode segments for identifying SARS-CoV-2. Subsequently, 75 main and subordinate species-specific barcode segments for SARS-CoV-2, located in the ORF1ab, S, E, ORF7a, and N coding sequences, were extracted and screened based on single-nucleotide polymorphism sites and weighted scores. In testing, these segments showed high recall (nearly 100%), specificity (almost 30% at the nucleotide level), and precision (100%) for identification. They were then visualized as one- and two-dimensional combined barcodes and deposited in an online database (http://virusbarcodedatabase.top/). The successful application of barcoding technology to SARS-CoV-2 identification provides valuable insights for future studies involving polymorphism analysis of complete genome sequences. Moreover, this cost-effective and efficient identification approach provides a useful reference for future research on virus surveillance.
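
The recall, specificity, and precision figures quoted in the abstract can be read against the standard confusion-matrix definitions. The sketch below is only an illustration under stated assumptions: it classifies a genome as the target species when any barcode segment occurs in it as an exact substring, and the barcode strings, toy genomes, and function names are all hypothetical. The abstract does not describe the authors' matching procedure, and their "nucleotide-level" specificity may be defined differently from the per-genome specificity computed here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Counts:
    tp: int = 0  # target genomes correctly flagged
    fp: int = 0  # non-target genomes wrongly flagged
    tn: int = 0  # non-target genomes correctly rejected
    fn: int = 0  # target genomes missed


def contains_barcode(genome: str, barcodes: list[str]) -> bool:
    """Assumed rule: a genome is flagged if any barcode occurs exactly in it."""
    return any(segment in genome for segment in barcodes)


def evaluate(samples: list[tuple[str, bool]], barcodes: list[str]) -> Counts:
    """Tally the confusion matrix over (genome, is_target) pairs."""
    counts = Counts()
    for genome, is_target in samples:
        predicted = contains_barcode(genome, barcodes)
        if predicted and is_target:
            counts.tp += 1
        elif predicted:
            counts.fp += 1
        elif is_target:
            counts.fn += 1
        else:
            counts.tn += 1
    return counts


def ratio(numerator: int, denominator: int) -> float:
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


def recall(c: Counts) -> float:
    return ratio(c.tp, c.tp + c.fn)


def specificity(c: Counts) -> float:
    return ratio(c.tn, c.tn + c.fp)


def precision(c: Counts) -> float:
    return ratio(c.tp, c.tp + c.fp)


# Hypothetical toy data, not real viral sequences.
barcodes = ["ACGTTGCA", "GGATCCTA"]
samples = [
    ("TTACGTTGCAAA", True),   # target, carries the first barcode
    ("CCGGATCCTAGG", True),   # target, carries the second barcode
    ("TTTTTTTTTTTT", True),   # target, carries no barcode
    ("ACGTACGTACGT", False),  # non-target, carries no barcode
    ("GGGGCCCCAAAA", False),  # non-target, carries no barcode
    ("AAACGTTGCATT", False),  # non-target, carries the first barcode
]

counts = evaluate(samples, barcodes)
print(counts)
print(recall(counts), specificity(counts), precision(counts))
```

A precision of 100% alongside near-100% recall, as reported, would correspond in this framing to no false positives and almost no false negatives among the tested genomes.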

      Corresponding author: Yongliang Li, lyl13618481357@hnu.edu.cn
      Corresponding author: Xingyi Ge, xyge@hnu.edu.cn
      Corresponding author: Xinhong Guo, gxh@hnu.edu.cn
    • a. College of Biology, Hunan University, Changsha, 410082, China;
    • b. College of Bioscience and Biotechnology, Hunan Agricultural University, Changsha, 410128, China
