The emergence of COVID-19 in December 2019 has attracted great attention around the world and is a reminder of the powerful pathogenic potential of viruses (Zhou et al. 2020). The subfamily Orthocoronavirinae of the family Coronaviridae contains four genera, Alphacoronavirus, Betacoronavirus, Gammacoronavirus, and Deltacoronavirus, which are the largest group of positive-sense, non-segmented, single-stranded, enveloped RNA viruses (Woo et al. 2012; Shi et al. 2016). The genus Gammacoronavirus currently has five species (https://talk.ictvonline.org/taxonomy/), which spread primarily through birds (Woo et al. 2012). Infectious bronchitis virus (IBV), the best-studied gammacoronavirus, causes avian infectious bronchitis, a highly contagious viral respiratory disease that leads to economic losses in the global poultry industry (Jackwood and de Wit 2020). Turkey coronavirus (TCoV), also a member of the genus Gammacoronavirus, causes acute infectious diarrhea in domestic turkeys (Lin et al. 2002). Recently, a novel species of the genus Gammacoronavirus was identified in a mass die-off of Canada and snow geese (Papineau et al. 2019).
Birds, with more than 10,000 living species, are among the most widely distributed animals worldwide (Prum et al. 2015) and are reservoirs of many emerging and re-emerging viruses, such as avian influenza viruses and West Nile virus (Reed et al. 2003; Olsen et al. 2006). With their capacity to fly long distances, birds play an important role in disseminating these emerging viruses to other animals and/or humans (Woo et al. 2012). In a previous study, we identified a new deltacoronavirus in birds from the Qinghai-Tibetan Plateau in China (Zhu et al. 2020).
In February-March 2019, migratory birds were live-captured and kept in cages by professionals of a migratory bird protection station with permission from the Jiangxi Province Department of Forestry. Feces were collected non-invasively from each bird, and the birds were then released. A total of 48 fecal samples were collected from two regions (Nanchang City and Jiujiang City) around Poyang Lake in Jiangxi Province, China. Total RNA was extracted from these samples using the QIAamp Viral RNA Mini Kit. Fifteen of the 48 total RNA samples were then randomly selected and pooled in equal amounts. The pooled RNA was used to construct an RNA sequencing library with the TruSeq Stranded Total RNA Library Prep Gold kit and was sequenced on the Illumina HiSeq 2000 platform. After quality control, a total of 75,448,386 clean reads were obtained. De novo assembled contigs were obtained using Trinity v2.10.0 (Grabherr et al. 2011) and annotated against the NCBI nr database with an E-value cutoff of 1 × 10⁻⁵.
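The E-value filtering step can be illustrated with a minimal sketch. It assumes the annotation search produced BLAST-style 12-column tabular output, in which the E-value is the 11th field; the source does not name the search tool or its output format, so this layout, the function name, and the example rows are all hypothetical.

```python
from typing import Iterable, List, Tuple

EVALUE_CUTOFF = 1e-5  # cutoff stated in the text


def filter_hits(
    lines: Iterable[str], cutoff: float = EVALUE_CUTOFF
) -> List[Tuple[str, str, float]]:
    """Return (query, subject, evalue) for tabular hits at or below the cutoff.

    Assumption: 12 tab-separated columns in the standard BLAST tabular
    order, so the E-value sits at index 10.
    """
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        evalue = float(fields[10])
        if evalue <= cutoff:
            kept.append((fields[0], fields[1], evalue))
    return kept


# Hypothetical rows: contig names, subject names, and scores are made up.
example = [
    "contig_1\tsubj_A\t86.0\t500\t70\t0\t1\t500\t1\t500\t1e-120\t800",
    "contig_2\tsubj_B\t35.0\t60\t39\t0\t1\t60\t1\t60\t0.5\t30",
]
hits = filter_hits(example)
```

Whether a hit exactly at the cutoff is kept (as it is here) is a choice of this sketch, not something the text specifies.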
We identified a 28,466-nucleotide (nt) contig annotated as coronavirus (CoV). The contig was verified, and a total of 1,346 reads mapped to the CoV genome. The full-length genome was named Anser fabalis coronavirus (AFCoV) NCN2 after one of its hosts and was submitted to the NCBI database (GenBank accession number MW436465). BLASTn analysis showed that AFCoV shares 86.0% nt identity with a recently identified gammacoronavirus, Canada goose coronavirus (CGCoV) CB17, and lower nucleotide identity (< 78.8%) with other coronaviruses. The seven concatenated conserved domains (ADRP, 3CLpro, RdRp, Hel, ExoN, NendoU and O-MT) of AFCoV and CGCoV show greater than 90% amino acid (aa) identity (Table 1). These analyses identify AFCoV as a new member of the species Goose coronavirus CB17.
AFCoV protein    Amino acid identity (%)
                 CGCoV    IBV     TCoV    SW1
ADRP             88.9     51.6    50.4    37.5
3CLpro           86.9     57.7    59.0    49.5
RdRp             97.4     77.1    76.9    65.7
Hel              100      88.5    88.3    75.0
ExoN             99.8     78.2    76.6    57.4
NendoU           99.7     59.0    58.7    41.8
O-MT             99.3     73.5    70.9    62.8
S                68.5     52.0    31.8    26.5
E                100      62.4    68.0    29.7
M                97.8     65.8    64.9    30.5
N                94.9     60.8    60.6    36.4
CGCoV: Canada goose coronavirus (NC_046965).
IBV: Avian infectious bronchitis virus (NC_001451).
TCoV: Turkey coronavirus (NC_010800).
SW1: Beluga Whale coronavirus SW1 (NC_010646).
Table 1. Comparison of amino acid identities between AFCoV and closely related CoV genome sequences.
The genome of AFCoV has a G + C content of 38.5% and a typical CoV genomic organization consisting of the 5′ UTR (untranslated region), replicase ORF1ab, spike (S), envelope (E), membrane (M), nucleocapsid (N), several accessory proteins, and the 3′ UTR (Fig. 1A). The 5′ UTR of AFCoV is 519 nt in length and shares the highest nucleotide identity (94.9%) with that of CGCoV. As in other coronaviruses, the ORF1ab gene occupies a large portion of the AFCoV genome and has a frameshift site at position 11,940, based on the conserved heptamer 'UUUAAAC'. The ORF1ab polyprotein of AFCoV is cleaved into 16 non-structural proteins (NSPs) by two viral proteases, with almost the same putative protease cleavage sites as those of CGCoV (Supplementary Table S1). The AFCoV genome contains nine predicted accessory proteins, one fewer than in CGCoV (Papineau et al. 2019) (Supplementary Table S2). The 3′ UTR (excluding the poly(A) tail) of AFCoV is 255 nt in length and shows 98.8% nt identity with that of CGCoV.
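The two sequence-level quantities mentioned above, G + C content and the location of the 'UUUAAAC' heptamer, can be computed with a short sketch. The function names and the toy sequence are hypothetical, and the sketch does not reproduce the authors' annotation procedure. It reports 1-based start positions of every occurrence of the heptamer, whereas the text does not state which nucleotide of the site the reported position refers to.

```python
from typing import List


def gc_content(seq: str) -> float:
    """Percentage of G and C among unambiguous bases (A, C, G, T/U)."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    total = sum(seq.count(base) for base in "ACGTU")
    return 100.0 * gc / total if total else 0.0


def find_heptamer(seq: str, motif: str = "UUUAAAC") -> List[int]:
    """Return 1-based start positions of every occurrence of the motif.

    DNA input is converted to RNA letters first, so a cDNA-style genome
    sequence written with T can be searched directly.
    """
    rna = seq.upper().replace("T", "U")
    positions = []
    start = rna.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = rna.find(motif, start + 1)
    return positions


# Toy sequence for illustration only; it is not taken from the AFCoV genome.
toy = "ATGGCGTTTAAACGGGTTTAAACTAA"
toy_sites = find_heptamer(toy)
toy_gc = gc_content(toy)
```

In a genome of this size the heptamer may occur more than once by chance, so a candidate site would still need to be checked against the ORF1a/ORF1b overlap.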
Figure 1. A Genome organization of AFCoV. B The phylogenetic tree was constructed using genome sequences of different coronaviruses. The maximum-likelihood tree was constructed using PhyML 3.0, and only bootstrap values > 70% are shown. The viral sequence obtained in this study is marked in bold type.
The ORF1ab polyprotein of AFCoV (6500 aa) shares 90.0% aa identity with that of CGCoV. The deduced S protein (1186 aa) of AFCoV shares low identity (68.5%) with the S protein of CGCoV (1184 aa) and even lower identities with those of other coronaviruses (< 58.1%). The deduced E, M, and N proteins of AFCoV have 100%, 97.8%, and 94.9% aa identities with those of CGCoV, respectively, and relatively low identities with the proteins of other gammacoronaviruses, ranging from 29.7% to 62.4% for the E protein, 30.5% to 65.8% for the M protein, and 36.4% to 60.8% for the N protein.
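Identity values of this kind depend on how the aligned sequences are compared. The following minimal sketch shows one common definition, identical residues divided by alignment columns in which neither sequence has a gap. The source does not state which definition or software was used, so the function name and the gap-handling rule are assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over gap-free columns of a pairwise alignment.

    Both inputs must come from the same alignment, so they have equal
    length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # columns with a gap are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned peptides for illustration only: 4 matches in 5 gap-free columns.
toy_identity = percent_identity("MKT-LV", "MRTALV")
```

Other conventions, such as dividing by the full alignment length or by the shorter sequence length, give slightly different percentages for the same alignment.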
To assess the phylogenetic relationship of AFCoV with other coronaviruses, the full-length genome sequences of alphacoronaviruses, betacoronaviruses, gammacoronaviruses, and deltacoronaviruses were aligned with MAFFT and used to reconstruct a phylogenetic tree by the maximum-likelihood method with the JC best-fit model. The phylogenetic tree (Fig. 1B) reveals that AFCoV clusters within the genus Gammacoronavirus and groups closely with CGCoV. Phylogenetic trees based on the ORF1ab (LG model) and S (Dayhoff model) proteins were also constructed and show topologies similar to that of the genome tree, indicating that AFCoV is closely related to CGCoV (Supplementary Fig. S1-S2).
To determine the prevalence of the new gammacoronavirus in bird samples, we designed primers (F: 5′-CTTTTTGGTCTCTACCCTGTTC-3′, R: 5′-CCATAAAAATCCAGGACTTGTT-3′) to specifically amplify an 836 bp segment located in the replicase ORF1ab of the AFCoV genome. All fecal samples were tested using the One-Step RT-PCR Kit Ver.2 (Takara, Japan). Five of the 48 samples (10.4%) were positive for coronavirus, and the results were verified by Sanger sequencing. Alignment showed that four of the five sequences were identical, and the remaining sequence had seven nucleotide differences from the consensus sequence. The five hosts of the coronavirus-positive samples belong to four species, all from the family Anatidae: Cygnus columbianus (one host) and Anser albifrons (one host) from Nanchang City, and Anas zonorhyncha (two hosts) and Anser fabalis (one host) from Jiujiang City.
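How a primer pair defines an amplicon can be sketched as a simple in-silico check: locate the forward primer on the template, locate the reverse complement of the reverse primer downstream of it, and measure the span. The primer sequences are those given above; the function names and the toy template are hypothetical, and exact matching is a simplification, since real primers tolerate some mismatches.

```python
from typing import Optional

FORWARD = "CTTTTTGGTCTCTACCCTGTTC"  # forward primer from the text
REVERSE = "CCATAAAAATCCAGGACTTGTT"  # reverse primer from the text


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(
    template: str, fwd: str = FORWARD, rev: str = REVERSE
) -> Optional[int]:
    """Length of the product spanned by both primers, or None if either is absent.

    Only exact matches on the given strand are considered.
    """
    template = template.upper().replace("U", "T")
    start = template.find(fwd)
    if start == -1:
        return None
    rev_site = reverse_complement(rev)
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return end + len(rev_site) - start


# Toy template for illustration only: two 22-nt primer sites around a 10-nt spacer.
toy_template = "GG" + FORWARD + "A" * 10 + reverse_complement(REVERSE) + "CC"
toy_length = amplicon_length(toy_template)

# Detection rate reported in the text: 5 positive samples out of 48.
detection_rate = round(100 * 5 / 48, 1)
```

Applied to the deposited AFCoV genome sequence, the same function would be expected to return the 836 bp product size reported in the text, provided both primer sites match exactly.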
Identifying viruses that have the ability to spill over from wildlife into other animals and/or humans improves our understanding of the viral families most likely to yield new pathogens in the future (Anthony et al. 2015). Here, we detected a new gammacoronavirus (AFCoV) in four different bird species from two sites with a high detection rate (10.4%). The ORF1ab, E, M, and N proteins of AFCoV had at least 90% aa identity to those of CGCoV, which caused a die-off of Canada and snow geese at a site near the Arctic, indicating that AFCoV may also be pathogenic to animals. Birds of the family Anatidae may be the natural hosts of the species Goose coronavirus CB17 (including AFCoV and CGCoV), because all the reservoirs of AFCoV and CGCoV belong to the family Anatidae (Papineau et al. 2019). The S protein of AFCoV has a markedly low aa identity (68.5%) with that of CGCoV, which may explain the broader diversity of AFCoV reservoirs (Cygnus columbianus, Anser albifrons, Anas zonorhyncha, and Anser fabalis) compared with CGCoV reservoirs (Branta canadensis and Anser caerulescens) (Papineau et al. 2019). The sampling dates and sites of the five coronavirus-positive birds varied: Cygnus columbianus (February 27, 2019; Nanchang City), Anser albifrons (March 5, 2019; Nanchang City), Anas zonorhyncha (March 10, 2019; Jiujiang City), and Anser fabalis (March 10, 2019; Jiujiang City). This suggests that AFCoV spreads among birds of the family Anatidae, as different migratory birds share the same habitat. Bootscan analysis using genomes of avian coronavirus, IBV, CGCoV, and AFCoV (as the query sequence) indicated the absence of any obvious recombination event (Supplementary Fig. S3). To summarize, this study describes a new member of the species Goose coronavirus CB17 in migratory birds from China and expands our understanding of the diversity of gammacoronaviruses. We did not have permission to collect tissue samples, and virus isolation was unsuccessful. Therefore, the specific pathogenicity of AFCoV still needs further study.
Genomic Characterization of a New Coronavirus from Migratory Birds in Jiangxi Province of China
- Received Date: 25 January 2021
- Accepted Date: 29 March 2021
- Published Date: 08 July 2021