One full-length genome (31,270 bp) related to beta-CoV was obtained from the rat pool. To verify the assembled contig, reads were mapped back to the beta-CoV genome; a total of 64,612 reads (0.62%) mapped. The identified genome (named Apodemus peninsulae CoV; accession number MT430884) shared 94.5%–95.3% nucleotide (nt) identity with members of the species Coronavirus HKU24. The putative structural and non-structural proteins of Apodemus peninsulae CoV shared over 90% amino acid similarity with those of closely related CoVs. To estimate the prevalence of Apodemus peninsulae CoV in individual samples, we amplified a 440 bp fragment; three of fifty (6.0%) rat samples were positive (Table 1). Of the positive samples, two were from Apodemus peninsulae and one was from Microtus gregalis.
| Name | Accession number | Sample type | No. of positive samples/no. of tested samples (% positive) | Detected coronavirus |
|---|---|---|---|---|
| Apodemus peninsulae CoV | MT430884 | Rats | 3/50 (6.0) | Betacoronavirus |
| MtCoV N | MT215336 | Birds | 4/25 (16.0) | Deltacoronavirus |
| MtCoV HM | MT215337 | Marmots | 1/29 (3.4) | Deltacoronavirus |
Table 1. Prevalence of beta- and delta-CoV in animal samples collected on the Qinghai-Tibetan Plateau in July 2019.
Two nearly identical genomes related to delta-CoV were identified, one from the marmot pool and one from the bird pool; they were named Montifringilla taczanowskii CoV (MtCoV) HM (MT215337) and MtCoV N (MT215336), respectively. The numbers of reads mapped to the MtCoV HM and MtCoV N genomes were 18,676 (0.01%) and 10,854 (0.01%), respectively. The prevalence of MtCoV in marmots and birds was 3.4% (1/29) and 16.0% (4/25), respectively (Table 1). Among the positive samples, the four birds were Montifringilla taczanowskii and the marmot was Marmota himalayana.
The two delta-CoV genomes had the same length (25,896 nt) and G + C content (41.3%) and differed at only four nucleotide positions: 2,254, 5,761, 10,826 and 19,787. The first two changes (2,254 and 5,761) were synonymous, the third was in a noncoding region, and the last (19,787) was a nonsynonymous change in the spike protein (proline in MtCoV HM and serine in MtCoV N). The genome organization of MtCoV HM and MtCoV N (Fig. 1) closely resembled that of PDCoV and SpCoV HKU17 (Woo et al. 2012), comprising the 5′ UTR (untranslated region), replicase ORF1ab, spike (S), envelope (E), membrane (M), nonstructural protein 6 (NS6), nucleocapsid (N), NS7a, NS7b, NS7c, and 3′ UTR. Putative transcription regulatory sequences (TRSs) were identified based on the motif 5′-ACACCA-3′ (Table 2). Interestingly, the distance between the TRS and the first base of the initiation codon of ORF NS7a is 101 bp, longer than in the six members of the genus Delta-CoV that contain an NS7a gene (Woo et al. 2012), in which this distance ranges from 4 to 80 bp.
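The distinction drawn above between synonymous and nonsynonymous changes can be illustrated with a minimal Python sketch. It translates two codons with the standard genetic code and reports whether the encoded amino acid changes. The codons used in the example (`CCT`, `TCT`, `CCC`) are illustrative assumptions chosen only because they encode proline and serine; the source does not report the actual codons at position 19,787, and `classify_substitution` is a hypothetical helper, not the authors' method.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon_a, codon_b):
    """Return (aa_a, aa_b, kind) for two codons; kind is 'synonymous'
    when both codons encode the same amino acid, else 'nonsynonymous'."""
    aa_a = CODON_TABLE[codon_a.upper().replace("U", "T")]
    aa_b = CODON_TABLE[codon_b.upper().replace("U", "T")]
    kind = "synonymous" if aa_a == aa_b else "nonsynonymous"
    return aa_a, aa_b, kind


# Illustrative codons only (not taken from the MtCoV genomes):
print(classify_substitution("CCT", "TCT"))  # ('P', 'S', 'nonsynonymous')
print(classify_substitution("CCT", "CCC"))  # ('P', 'P', 'synonymous')
```

A change in a noncoding region, such as the third difference described above, falls outside this classification because no codon is affected.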
| MtCoV ORF | Location (nt) | Length (aa) | Frame(s) | TRS location | TRS sequence (distance in bases to AUG) |
|---|---|---|---|---|---|
| ORF1ab | 497–19,227 | 6,242 | +2, +1 | 37 | ACACCA(453)AUG |
| S | 19,209–22,808 | 1,199 | +3 | 19,062 | ACACCA(140)AUG |
| E | 22,802–23,053 | 83 | +2 | 22,775 | ACCCCA(20)AUG |
| M | 23,046–23,699 | 217 | +3 | 23,019 | ACACCA(20)AUG |
| NS6 | 23,699–23,980 | 93 | +2 | 23,646 | ACACCA(46)AUG |
| N | 24,005–25,033 | 342 | +2 | 23,991 | ACACCA(7)AUG |
| NS7a | 24,099–24,695 | 198 | +3 | 23,991 | ACACCA(101)AUG |
| NS7b | 25,044–25,472 | 142 | +3 | 25,033 | ACACCA(4)AUG |
| NS7c | 25,394–25,588 | 64 | +2 | 25,348 | ACACGA(39)AUG |
Table 2. Potential coding regions and predicted transcription regulatory sequences of the MtCoV genome.
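The coordinates in Table 2 can be cross-checked with simple arithmetic. The sketch below, using only values transcribed from the table, tests two relationships: that each single-frame ORF length in nucleotides corresponds to the stated amino acid length plus a stop codon, and that the TRS location, the 6-nt motif length and the stated distance are consistent with the ORF start. The source does not state the coordinate convention for the TRS location, so the constant offset computed here is an observation from the tabulated numbers, not a documented rule. ORF1ab is excluded from the length check because the table lists two reading frames for it.

```python
TRS_LENGTH = 6  # length of the ACACCA-type hexamer motif

# (ORF, start, end, length in aa, TRS location, distance to AUG), from Table 2
ORFS = [
    ("ORF1ab", 497, 19227, 6242, 37, 453),
    ("S", 19209, 22808, 1199, 19062, 140),
    ("E", 22802, 23053, 83, 22775, 20),
    ("M", 23046, 23699, 217, 23019, 20),
    ("NS6", 23699, 23980, 93, 23646, 46),
    ("N", 24005, 25033, 342, 23991, 7),
    ("NS7a", 24099, 24695, 198, 23991, 101),
    ("NS7b", 25044, 25472, 142, 25033, 4),
    ("NS7c", 25394, 25588, 64, 25348, 39),
]


def aa_length(start, end):
    """Amino acid count for a single-frame ORF: codons minus the stop codon."""
    nt = end - start + 1
    if nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return nt // 3 - 1


def trs_offset(start, trs_location, distance):
    """Residual between the ORF start and TRS location + motif + distance."""
    return start - (trs_location + TRS_LENGTH + distance)


for name, start, end, aa, trs, dist in ORFS:
    length_ok = name == "ORF1ab" or aa_length(start, end) == aa
    print(f"{name:7s} length consistent: {length_ok}, "
          f"TRS offset: {trs_offset(start, trs, dist)}")
```

With the values above, every row yields the same offset of 1, which suggests that the table uses one consistent convention throughout, including for the 101-base NS7a distance.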
MtCoV was identified as a novel member of the species Coronavirus HKU15. Pairwise nucleotide sequence alignment of the novel MtCoV genome showed the highest identity to SpCoV ISU690-4 (83.3%), followed by SpCoV HKU17 (83.0%), PorCoV HKU15 (82.8%) and QuaCoV UAE-HKU30 (78.5%). The amino acid identities of ADRP, 3CLpro, RdRp, Hel, ExoN, NendoU and O-MT between MtCoV and its closely related strains are summarized in Table 3. The seven concatenated replicase domains shared more than 90% amino acid identity with members of the species Coronavirus HKU15 (Table 3), which suggests that MtCoV belongs to this species (Lau et al. 2018). However, the structural proteins E, M and N of MtCoV showed lower identities (83.1%–84.3%, 86.6%–87.1% and 88.6%–90.4%, respectively) to SpCoV HKU17, SpCoV ISU690-4, QuaCoV UAE-HKU30 and PorCoV HKU15. In particular, the S protein of MtCoV shared the highest amino acid identity with SpCoV HKU17 (73.1%), followed by Houbara coronavirus (HouCoV) UAE-HKU28 (72.2%), Pigeon coronavirus (PiCoV) UAE-HKU29 (72.1%) and Falcon coronavirus (FalCoV) UAE-HKU27 (72.1%), but very low identities with the other members of the same species (44.8%–45.4%). Overall, the lower identities of the structural proteins between MtCoV and other members of the species Coronavirus HKU15 indicate that MtCoV represents a novel member of that species.
| Domain (amino acid identity with MtCoV, %) | SpCoV HKU17 | PorCoV HKU15 | SpCoV ISU690-4 | QuaCoV UAE-HKU30 |
|---|---|---|---|---|
| ADRP | 94.5 | 95.3 | 93.8 | 89.8 |
| 3CLpro | 90.8 | 88.6 | 91.5 | 89.9 |
| RdRp | 95.0 | 94.9 | 94.7 | 93.2 |
| Hel | 97.8 | 97.3 | 97.7 | 97.5 |
| ExoN | 95.9 | 94.8 | 95.4 | 94.6 |
| NendoU | 90.5 | 88.4 | 89.6 | 89.9 |
| O-MT | 91.4 | 90.3 | 92.5 | 92.8 |
| Concatenated | 94.5 | 93.7 | 94.3 | 93.4 |
| S | 73.1 | 44.8 | 45.3 | 45.4 |
| E | 84.3 | 83.1 | 83.1 | 83.1 |
| M | 87.1 | 87.1 | 86.6 | 87.1 |
| N | 90.4 | 88.9 | 89.8 | 88.6 |
Table 3. Comparison of amino acid identities between MtCoV and closely related CoVs.
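The species assignment rests on comparing the Table 3 identities against a 90% cut-off for the concatenated replicase domains, and the identity ranges quoted in the text follow directly from the table. A minimal Python sketch of that bookkeeping is given below. The values are transcribed from Table 3; the names `IDENTITY`, `SPECIES_THRESHOLD` and `identity_range` are hypothetical, and the threshold simply encodes the "more than 90%" wording of the text rather than a fuller statement of the demarcation criteria.

```python
STRAINS = ["SpCoV HKU17", "PorCoV HKU15", "SpCoV ISU690-4", "QuaCoV UAE-HKU30"]

# Amino acid identity (%) between MtCoV and each strain, from Table 3
IDENTITY = {
    "Concatenated": [94.5, 93.7, 94.3, 93.4],
    "S": [73.1, 44.8, 45.3, 45.4],
    "E": [84.3, 83.1, 83.1, 83.1],
    "M": [87.1, 87.1, 86.6, 87.1],
    "N": [90.4, 88.9, 89.8, 88.6],
}

SPECIES_THRESHOLD = 90.0  # assumed cut-off, from the "more than 90%" wording


def identity_range(protein):
    """Return (lowest, highest) identity for one row of the table."""
    values = IDENTITY[protein]
    return min(values), max(values)


def same_species(strain):
    """True if the concatenated replicase identity exceeds the threshold."""
    value = IDENTITY["Concatenated"][STRAINS.index(strain)]
    return value > SPECIES_THRESHOLD


for strain in STRAINS:
    print(strain, same_species(strain))
for protein in ("E", "M", "N"):
    print(protein, identity_range(protein))
```

All four strains pass the concatenated-domain check, while the S row shows the contrast discussed in the text: 73.1% to SpCoV HKU17 but at most 45.4% to the other three.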
Phylogenetic analysis of whole-genome nucleotide sequences showed that Apodemus peninsulae CoV was closely related to the cluster containing members of the species Coronavirus HKU24 (Fig. 2). The analysis also confirmed that MtCoV belongs to the genus Delta-CoV and forms an independent lineage closely related to PorCoV HKU15 and SpCoV HKU17 (Fig. 2). Furthermore, trees based on the amino acid sequences of ORF1ab, M and N were identical to those based on nucleotide sequences, and both showed that MtCoV HM and MtCoV N clustered with members of the species Coronavirus HKU15 while remaining distinct from them (Fig. 3). The S protein-based phylogenetic tree showed that the MtCoVs grouped with SpCoV HKU17 and were more closely related to FalCoV UAE-HKU27, PiCoV UAE-HKU29, HouCoV UAE-HKU28 and magpie robin coronavirus (MRCoV) HKU18 (Fig. 3). This grouping reflects the high identities of the S protein and is consistent with previous reports (Woo et al. 2012; Chen et al. 2018).
Figure 2. Phylogenetic analysis of genome sequences of coronaviruses. Bootstrap values (≥70%) are shown along branches. The scale bar indicates nucleotide substitutions per site.
The genomes of ThCoV HKU12, MunCoV HKU13, SpCoV ISU690-4 and MtCoV HM (as the query sequence) were aligned for recombination analysis using Bootscan. The result indicated a potential long recombinant segment spanning aligned positions 19,500 to 23,300, located mainly in the S gene of MtCoV (Fig. 4A). The recombinant segment was likely derived from ThCoV HKU12. Because the receptor-binding domain (RBD) is located in the S protein, the recombined sequence might alter receptor binding and thus facilitate cross-species transmission. Two short potential recombinant segments were also found at aligned positions 11,300 to 11,700 and 15,500 to 16,100 of MtCoV (Fig. 4A). The genome sequence of MtCoV HM (as the query sequence) was also compared with that of ThCoV HKU12 using Simplot analysis (Fig. 4B). Global alignment in BLAST indicated that the nucleotide identity of the S gene between MtCoV HM (nt 19,209 to 22,808) and ThCoV HKU12 (nt 19,433 to 23,011) was 58.5%, with part of the MtCoV S gene showing more than 70% nucleotide identity to ThCoV HKU12 (Fig. 4B). By comparison, the S gene of the MtCoVs shared higher nucleotide identity with PiCoV UAE-HKU29 (73.3%), HouCoV UAE-HKU28 (73.3%), FalCoV UAE-HKU27 (73.2%), SpCoV HKU17 (71.0%), Magpie robin coronavirus HKU18 (67.4%), Night heron coronavirus HKU19 (58.7%) and Wigeon coronavirus HKU20 (58.6%).
Figure 4. Potential recombination event detected using Bootscan analysis. A The genome of MtCoV HM was used as the query sequence and compared with ThCoV HKU12, MunCoV HKU13 and SpCoV ISU690-4. Red lines indicate the recombination sites. B MtCoV HM was used as the query sequence and compared with the genome of ThCoV HKU12.
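The idea behind a query-versus-reference similarity plot such as Fig. 4B can be sketched as a sliding-window percent identity over a pairwise alignment. The Python example below is a simplified illustration, not a reimplementation of Simplot or Bootscan: it uses raw identity with no distance correction and no bootstrapped trees. The function name `window_identity`, the default window and step sizes, and the toy sequences are all assumptions and do not reflect the settings or data used in the study.

```python
def window_identity(seq_a, seq_b, window=200, step=20):
    """Percent identity in sliding windows over two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    Returns a list of (window midpoint, percent identity) tuples.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        matches = 0
        compared = 0
        for a, b in zip(seq_a[start:start + window], seq_b[start:start + window]):
            if a == "-" or b == "-":
                continue
            compared += 1
            if a == b:
                matches += 1
        identity = 100.0 * matches / compared if compared else float("nan")
        results.append((start + window // 2, identity))
    return results


# Toy aligned sequences: identical first half, divergent second half.
query = "ACGT" * 10
reference = "ACGT" * 5 + "AAAA" * 5
for midpoint, identity in window_identity(query, reference, window=20, step=10):
    print(midpoint, identity)
```

A region in which identity to one reference rises sharply relative to the rest of the genome, as reported for part of the S gene, is the kind of signal that motivates a formal recombination test.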
To estimate the divergence time of MtCoV, the RdRp nucleotide sequences of members of the genus Delta-CoV were selected (Lau et al. 2018) and aligned, and the time to the most recent common ancestor (tMRCA) was calculated using the GTR + G + I substitution model and an uncorrelated log-normal relaxed clock. The molecular clock analysis estimated that the most recent common ancestor of the MtCoVs and the closest members of the species Coronavirus HKU15 existed 289 years ago (95% HPD = 60–1140) (Fig. 5).