Of 951 bats tested, 50 intestinal specimens (5.3%) were CoV positive but, surprisingly, all respiratory specimens were negative on amplification. As shown in Table 1, among 181 bats from 6 species in 3 families in Guangdong province, 16.2% (6/37) of Rousettus leschenaulti and 27.5% (14/51) of Cynopterus sphinx were CoV positive. Among 599 bats from 17 species in 5 families in Yunnan province, 14.0% (14/100) of Rousettus leschenaulti, 2.4% (1/41) of Megaerops kusnotei, 9.0% (7/78) of Rhinolophus sinicus and 5.3% (5/95) of Myotis daubentonii were CoV positive. In the first study of this kind in the Tibet Autonomous Region, fifteen Hipposideros cineraceus and five Rhinolophus hipposideros collected in south Tibet were tested, and only one Hipposideros cineraceus (6.7%, 1/15) showed positive amplification. In northeast China, 2 of 97 (2.1%) bats in Jilin province were positive: one Murina leucogaster and one Rhinolophus ferrumequinum. In contrast, all 16 Rhinolophus ferrumequinum and 38 Myotis ricketti in Liaoning province showed negative amplification. These results revealed a higher CoV incidence in three fruit bat species of the family Pteropodidae than in the four insectivorous bat families, suggesting that fruit bats are more likely to harbor CoVs.
| Family | Species | Guangdong (2005) | Yunnan (2012, 2013) | Tibet (2013) | Liaoning (2013) | Jilin (2013) |
|---|---|---|---|---|---|---|
| Pteropodidae | Rousettus leschenaulti | 6/37 (16.2); β4 | 14/100 (14.0); β4 | | | |
| | Cynopterus sphinx | 14/51 (27.5); β4 | | | | |
| | Megaerops kusnotei | | 1/41 (2.4); β4 | | | |
| Hipposideridae | Hipposideros cineraceus | 0/9 | | 1/15 (6.7); α | | |
| | Hipposideros pomona | | 0/84 | | | |
| | Hipposideros larvatus | 0/68 | 0/2 | | | |
| | Hipposideros armiger | 0/11 | 0/18 | | | |
| | Aselliscus stoliczkanus | | 0/33 | | | |
| Rhinolophidae | Rhinolophus ferrumequinum | | 0/42 | | 0/16 | 1/30 (3.3); β2 |
| | Rhinolophus sinicus | | 7/78 (9.0); α, β2 | | | |
| | Rhinolophus pusillus | 0/5 | 0/6 | | | |
| | Rhinolophus affinis | | 0/3 | | | |
| | Rhinolophus hipposideros | | 0/37 | 0/5 | | |
| | Rhinolophus macrotis | | | | | |
| Vespertilionidae | Myotis daubentonii | | 5/95 (5.3); β3 | | | |
| | Myotis laniger | | 0/8 | | | |
| | Myotis chinensis | | 0/3 | | | |
| | Myotis capaccinii | | 0/40 | | | |
| | Myotis ricketti | | | | 0/38 | 0/27 |
| | Miniopterus schreibersi | | 0/8 | | | |
| | Murina leucogaster | | | | | 1/40 (2.5); α |
| Megadermatidae | Megaderma lyra | | 0/1 | | | |

Note: cells give positive/total bats; numbers in brackets indicate the coronavirus-positive percentage, followed by the CoV type detected. α: αCoV; β: unclassified βCoV; β2: βCoV lineage 2; β3: βCoV lineage 3; β4: βCoV lineage 4.
Table 1. Bat sample collection and coronavirus detection.
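The comparison between fruit bats and insectivorous bats can be checked with a few lines of arithmetic. The sketch below uses the positive/tested counts quoted in the text. The grouping, the variable names, and the derivation of the insectivorous totals by subtraction from the overall figures are illustrative assumptions, not the authors' analysis.

```python
# (positive, tested) counts for the three Pteropodidae species, as quoted
# in the text; the dictionary layout is illustrative.
FRUIT_BATS = {
    "Rousettus leschenaulti (Guangdong)": (6, 37),
    "Cynopterus sphinx (Guangdong)": (14, 51),
    "Rousettus leschenaulti (Yunnan)": (14, 100),
    "Megaerops kusnotei (Yunnan)": (1, 41),
}
TOTAL_POSITIVE = 50   # all CoV-positive intestinal specimens
TOTAL_TESTED = 951    # all bats tested


def rate(positive: int, tested: int) -> float:
    """Positivity rate as a percentage."""
    return 100.0 * positive / tested


fruit_pos = sum(p for p, _ in FRUIT_BATS.values())
fruit_n = sum(n for _, n in FRUIT_BATS.values())

# Assumption: every bat outside Pteropodidae belongs to the four
# families that the text groups as insectivorous.
insect_pos = TOTAL_POSITIVE - fruit_pos
insect_n = TOTAL_TESTED - fruit_n

print(f"Pteropodidae:  {fruit_pos}/{fruit_n} = {rate(fruit_pos, fruit_n):.1f}%")
print(f"Insectivorous: {insect_pos}/{insect_n} = {rate(insect_pos, insect_n):.1f}%")
print(f"Overall:       {TOTAL_POSITIVE}/{TOTAL_TESTED} = {rate(TOTAL_POSITIVE, TOTAL_TESTED):.1f}%")
```

Under these assumptions the pooled rates are 35/229 (15.3%) for Pteropodidae and 15/722 (2.1%) for the other families, in line with the contrast described above.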
To describe the genetic relationships between the 50 sequences obtained in this study and previously known CoVs, 400-nt RdRP sequences were obtained by trimming the primer regions from the 440-nt sequences and were analyzed phylogenetically. Eight sequences grouped into 3 clusters within the genus αCoV (Figure 1A). YDB5C, the first bat-borne CoV reported in Tibet (from Hipposideros cineraceus), clustered closely with MLHJC4, a CoV from Rhinolophus sinicus in Yunnan, both sharing 94% nt identity with the previously reported strain HKU2/GD/430/2006 from Guangdong (Lau et al, 2007). JTAC2, identified in Murina leucogaster in Jilin province, diverged considerably from known CoVs, with a highest nt identity of only 83% to the bat-borne coronaviruses Neixiang-14 and Neixiang-52, also detected in Murina leucogaster, followed by 78% nt identity with some pandemic porcine epidemic diarrhea virus (PEDV) strains that have emerged recently in China, the USA and Japan (Vlasova et al, 2014; Sun et al, 2015; Suzuki et al, 2015). Five other αCoVs (MLHJC1, MLHJC6, MLHJC8, MLHJC22, MLHJC34) identified from Rhinolophus sinicus in Yunnan formed a new group, with MLHJC8 being slightly more divergent, showing highest nt identities (75%–89%) with the previously reported BtCoV/860/2005 (Tang et al, 2006). The remaining 42 bat CoV sequences were classified as βCoV and fell into 5 clusters (Figure 1B). Twenty sequences identified in Guangdong fell into lineage β4, which reflected a geographical relationship, and were further divided into two distinct clusters: one with 6 sequences sharing a highest nt identity of 99% with HKU9-10-1 (Lau et al, 2010), and another with 14 sequences most closely related to BtCoV/BRT55630/H.lek/CK/Tha/05/2012 detected in Hipposideros lekaguli in Thailand (Wacharapluesadee et al, 2015). The 21 βCoVs identified in Yunnan province exhibited considerable genetic diversity and were distributed among lineages β2, β3 and β4.
Fifteen fell into β4 and were further divided into 2 lineages: fourteen sequences showed the closest relationship to the previously reported BtCoV/BRT55629/H.lek/CK/Tha/05/2012 (Wacharapluesadee et al, 2015), while the other (ML92C) grouped with the above Guangdong sequences. Five sequences detected in Myotis daubentonii clustered within lineage β3, sharing > 91% nt identity with the previously reported HKU4-4 from Tylonycteris pachypus (Woo et al, 2007). This group showed about 80% nt identity with MERS-CoVs recently identified in China (Lu et al, 2015) and Korea (Kim et al, 2015). The remaining Yunnan bat CoV sequence, MLHJC35, detected in Rhinolophus sinicus, and the only Jilin province sequence, JTMC15, identified in Rhinolophus ferrumequinum, clustered into β2 and showed highest nt identities to SARS-related bat-borne CoVs (SARSr-BatCoVs). MLHJC35 was 97% identical to SARSr-BatCoV Cp/Yunnan2011, previously isolated in Yunnan province (Yang et al, 2011), while JTMC15 shared 99% identity with SARSr-BatCoV Rf1, found in Rhinolophus ferrumequinum in Hubei province (Li et al, 2005).
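The nt identities quoted above are pairwise comparisons of aligned RdRP fragments. The text does not state which software or gap convention was used, so the following is only a minimal sketch of one common convention: identical columns divided by columns where neither sequence has a gap. The function name and the toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Columns with a gap ('-') in either sequence are skipped; this is one
    of several conventions and is an assumption of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy 10-nt example with a single mismatch (not real RdRP data).
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))  # 90.0
```

On a 400-nt fragment, an identity of 94% under this convention corresponds to 24 mismatched positions.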
Figure 1. Phylogenetic analysis, based on the 400-nt RdRP gene fragment, of the 50 bat CoV sequences obtained in this study (8 αCoV sequences in (A) and 42 βCoV sequences in (B)) in comparison with other representative strains retrieved from GenBank. The fifty sequences from this study are marked by triangles (the 13 sequences with complete RdRP gene sequencing are marked by solid triangles). The scale bar indicates the estimated number of substitutions per 10 nucleotides.
For a more precise analysis, representative specimens of the 8 phylogenetic clusters were subjected to full RdRP gene amplification. Complete RdRP sequences were obtained from 13 specimens belonging to 6 clusters, comprising 3 αCoVs and 10 βCoVs. Phylogenetic analysis based on the full RdRP gene sequences was highly consistent with Figure 1 (the phylogenetic tree of the full RdRP gene sequences is not shown).
Full genomic sequencing was successful in 2 of the above 13 specimens: JTMC15 from Rhinolophus ferrumequinum, Jilin province, and JPDB144 from Myotis daubentonii, Yunnan province; a nearly complete genome sequence was also obtained for JTAC2 from Murina leucogaster. The full genomes of JTMC15 and JPDB144 (including the complete terminal sequences of the 5′ end and the 3′ poly(A) tail) and the near-complete genome of JTAC2 were 28,761 nt, 30,321 nt and 25,719 nt in size respectively, with G+C contents of 38.1%, 41.0% and 43.4%. It has been shown that two proteinases, the papain-like proteinase (PLPro) encoded by the nsp3 gene and the main proteinase (MPro) encoded by the nsp5 gene in ORF1a of CoVs, cleave the replicase polyprotein encoded by ORF1a and ORF1b into 16 mature nonstructural proteins (nsps) (Neuman et al, 2008). Our analysis of ORF1ab revealed that all three bat CoV genomic sequences contain 16 nsps (nsp1–nsp16), but the cleavage sites for nsp3 or nsp5 differ among CoVs. The lengths of the deduced amino acid sequences of the putative nsps, their first and last residues and their positions in the replicase are shown in Supplementary Table S2.
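The G+C contents reported above are simple base-composition percentages. The text does not say how ambiguous bases were handled, so the sketch below assumes they are excluded from both numerator and denominator. The function name and example string are hypothetical.

```python
def gc_content(seq: str) -> float:
    """G+C content (%) of a nucleotide sequence.

    Only unambiguous bases (A, C, G, T, U) are counted; ambiguity codes
    such as N are ignored, which is an assumption of this sketch.
    """
    counted = [base for base in seq.upper() if base in "ACGTU"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Toy example (not a real genome): 2 of 4 unambiguous bases are G or C.
print(gc_content("ATGCNN"))  # 50.0
```

Applied to a full genome record, the same function would be called on the concatenated sequence string read from a FASTA file.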
Based on the nearly complete genomic sequence obtained, JTAC2 possesses the same genome structure as PEDVs, with 7 genes in the order: 5′-ORF1a, 1b, spike (S), 3a, envelope (E), membrane (M) and nucleocapsid (N)-3′ (Figure 2A). JTAC2 showed the closest relationship (87.9% in ORF1a and 92.8% in ORF1b) to the Lushi MI bat CoV isolates Neixiang-14 and Neixiang-52, but only very limited sequence is available for these two isolates for further analysis. The recent PEDV-1C, isolated from a piglet with diarrhea and vomiting (Sun et al, 2015), was therefore used for sequence comparison and genomic organization analysis, since it has been fully sequenced and shares high identity with JTAC2 (Figure 1A and Supplementary Table S3). The aa identity comparison shown in Supplementary Table S3 suggests that JTAC2 is a novel αCoV.
Figure 2. Predicted genome organizations of JTAC2, JTMC15 and JPDB144. (A) Nonstructural proteins are represented by open boxes, structural proteins by filled boxes. Apostrophes in JTAC2 identify unsequenced regions. (B) Sequence comparison showing the ORF shift of gene 7b of JTMC15 caused by discontinuous deletions (represented by dots), resulting in the elimination of gene 8 as compared to other SARS- and SARSr-CoVs. Nucleotide positions are given with reference to strain BJ01. Stop codons and start codons are in bold. Hu: human SARS-CoV; Ci: civet SARS-CoV; Bt: bat SARS-CoV.
JTMC15 is a SARSr-BatCoV with the same genome organization as other SARSr-BatCoVs (e.g., Rf1), but sequence deletions were observed in ORF1a and N, and between genes 7b and 8. The 579-nt deletion in ORF1a of JTMC15 was also observed in SARSr-BatCoV Rs672 from a Rhinolophus sinicus bat (Yuan et al, 2010) and in a human SARS-CoV, ShanghaiQXC2, from the late phase of the 2003 epidemic (GenBank #AY463060). This 579-nt deletion results in a 193-aa deletion in nsp3 of ORF1a, from residues 1059 to 1251 in the nucleic acid-binding (NAB) domain (Serrano et al, 2009). A second deletion, in the N gene of JTMC15 (nt 1156–1158, one residue, Q368), was also found in 3 SARSr-BatCoV strains: Rp/Shaanxi2011 (Yang et al, 2013), Rm1 (Li et al, 2005) and 279/2005 (Tang et al, 2006). Interestingly, four discontinuous deletions were identified between genes 7b and 8; these are unique to JTMC15 and result in an ORF shift and the elimination of gene 8 (Figure 2B). As in known CoVs, extensive S gene variation was also observed in JTMC15, resulting in lower aa identities with other SARSr-BatCoV strains (the highest being 86.1% to Rf1) than for other gene fragments in the genome. The receptor-binding motif (RBM) is an extended loop on the surface of the receptor-binding domain (RBD) of the spike protein, and is the most important domain for SARSr-BatCoVs to recognize the host receptor, angiotensin-converting enzyme 2 (ACE2) (Ren et al, 2008; Baez-Santos et al, 2015). Further alignment of the deduced amino acid sequences of the RBM (55 aa) showed a closer relationship of JTMC15 to SARSr-BatCoVs than to human or civet SARS-CoVs (Supplementary Figure S1). Taken together, as shown in Figure 2A and Table 2, there are 13 genes predicted in JTMC15: 5′-ORF1a, 1b, S, 3a, 3b, E, M, 6, 7a, 7b, N, 9a, 9b-3′. Apart from gene 7b (83.0%) and S (86.1%), all ORFs of JTMC15 had high aa identities to Rf1, ranging from 94.4% (9b gene) to 99.1% (M gene), indicating that JTMC15 is a new variant within the SARSr-BatCoV Rf1 species.
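| ORF | JTMC15 length | Rf1 length | Rf1 % identity | Rs672 length | Rs672 % identity | BJ01 length | BJ01 % identity |
|---|---|---|---|---|---|---|---|
| 1a | 4185 | 4378 | 98.0 | 4190 | 93.8 | 4383 | 93.5 |
| 1b | 2704 | 2704 | 98.1 | 2704 | 98.1 | 2704 | 98.0 |
| S | 1236 | 1241 | 86.1 | 1255 | 81.9 | 1241 | 76.7 |
| 3a | 274 | 274 | 98.2 | 274 | 92.0 | 274 | 86.2 |
| 3b | 114 | 114 | 97.4 | 114 | 91.3 | 114 | 90.4 |
| E | 76 | 76 | 94.8 | 76 | 96.1 | 76 | 96.1 |
| M | 221 | 221 | 99.1 | 221 | 98.2 | 221 | 97.7 |
| 6 | 63 | 63 | 96.9 | 63 | 92.2 | 63 | 89.1 |
| 7a | 122 | 122 | 98.4 | 122 | 93.5 | 122 | 91.9 |
| 7b | 52 | 44 | 83.0 | 44 | 84.9 | 44 | 79.2 |
| 8 (8a) | - | 122 | - | 121 | - | 39 | - |
| - (8b) | - | - | - | - | - | 84 | - |
| N | 420 | 421 | 98.1 | 422 | 96.7 | 422 | 96.2 |
| 9a | 97 | 97 | 94.9 | 98 | 79.6 | 98 | 79.6 |
| 9b | 70 | 70 | 94.4 | 70 | 83.1 | 70 | 84.5 |

Note: # Abbreviations and accession numbers: Rf1, DQ412042; Rs672, FJ588686; BJ01, AY278488. Gene 8 in SARSr-CoVs is described as 8a and 8b in SARS-CoVs. Lengths are in amino acids.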
Table 2. Comparison of ORF amino acid identities of JTMC15 and other SARS- and SARSr- CoVs#
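The deletion sizes reported for JTMC15 can be cross-checked against each other and against the ORF1a lengths in Table 2. The sketch below does this with two small helpers. The helper names are hypothetical, and the check assumes that the 193-aa deletion accounts for the whole ORF1a length difference between Rf1 and JTMC15.

```python
def deleted_residues(deleted_nt: int) -> int:
    """Residues removed by an in-frame deletion of `deleted_nt` nucleotides."""
    if deleted_nt % 3 != 0:
        raise ValueError("deletion is not a multiple of 3 nt and would shift the frame")
    return deleted_nt // 3


def span(first: int, last: int) -> int:
    """Number of positions in an inclusive range first..last."""
    return last - first + 1


# 579-nt deletion in ORF1a corresponds to residues 1059-1251 of nsp3.
assert deleted_residues(579) == 193
assert span(1059, 1251) == 193

# N gene deletion at nt 1156-1158 removes a single residue (Q368).
assert deleted_residues(span(1156, 1158)) == 1

# ORF1a lengths from Table 2: Rf1 (4378 aa) minus 193 aa gives JTMC15 (4185 aa).
assert 4378 - deleted_residues(579) == 4185
print("deletion sizes are mutually consistent")
```

A deletion whose length is not a multiple of three raises an error here, which mirrors the frame shift described for the discontinuous deletions between genes 7b and 8.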
For JPDB144, the genome organization is almost the same as that of HKU4-4, with 10 genes in the order: 5′-ORF1a, 1b, S, 3a, 3b, 3c, 3d, E, M, N-3′ (Figure 2A). However, two differences were observed in nsp2 of JPDB144: a 12-nt insertion (residues 1143 to 1146 of 1a) and a 3-nt deletion (residue 1155 of 1a). The other JPDB144 ORFs were the same length as those of HKU4-4, sharing aa identities of between 88.8% (3c gene) and 98.8% (E gene); however, an aa sequence comparison of JPDB144 ORFs with those of HKU5 and MERS-CoV strains in Betacoronavirus lineage 3 showed rather low similarities (Supplementary Table S4).
Detection of CoVs
Full genomic sequences characterization
CT conceived the study and LX carried it out with BH’s guidance. FZ, WY, TJ, GL, TH, GC, YF, YZ, QF, JF and HZ were responsible for field investigation and bat sampling. TJ and GL identified bat species morphologically. XL took part in sample screening and CoV detection. LX wrote the paper; CT and BH then revised it. All authors read and approved the final manuscript.
Supplementary figures/tables are available on the website of Virologica Sinica: www.virosin.org; link.springer.com/journal/12250.
| Group | Primer | Position of first nucleotide (nt) | Sequence (5′→3′) |
|---|---|---|---|
| Pan-coronavirus nested primers | CoVOF | 14615 | ATGGGWTGGGAYTAYCCIAARTG |
| | CoVOR | 15200 | TGYTGIGARCAAAAYTCRTG |
| | CoVF | 14618 | GGITGGGAYTAYCCIAARTGYGA |
| | CoVR | 15035 | CCRTCATCWGAIARWATCATCAT |
| JTMC15 | F3 | 2811 | GCGTGTAGAYAARGTGCTTAA |
| | R3 | 4799 | CCACGCTTRAGAAATTCAA |
| | F4 | 4648 | GTKTCAGTDTCWTCACCAGA |
| | R4 | 6919 | AATRCTTAACAAYAAYAGCCACAT |
| | F5 | 6806 | CACTWCCTACRACTATWGCTAAAAAT |
| | R5 | 8939 | GCAGARGTRGMAAARTCACTATACT |
| | F6 | 8827 | CCTGGHTTACCDGGTACTGT |
| | R6 | 11329 | CGYCTAGCAGCATCATCATA |
| | F10 | 17350 | TGAGTGTYGTCAATGCTAGAC |
| | R10 | 19665 | CTACYTTDGTGTAAACAGCATTATT |
| | F12 | 21283 | GCTATACCATGCATGCTAACT |
| | R12 | 22615 | CGAAAAAGARGTTGAGTTGTAG |
| | 5OR | 252 | ATTGGCTGAAACGACACCACTTC |
| | 5IR | 161 | GTCGATTAAAGCACTTGGCTCCA |
| | 3OF | 27341 | GACATCCCAGAGTGGAGGAG |
| | 3IF | 27448 | AGGTGTTGATGCCTCAGGCTAT |
| JPDB144 | F1 | 1 | GATTTAAGWGAATAGCYTRGCTATC |
| | R1 | 1749 | GTVGTWCCAGAVAGWARTGC |
| | F2 | 1572 | GGTACTATGYACTTTRTKCCT |
| | R2 | 3846 | CWGCDATRCCACCRCCAT |
| | F3 | 3759 | GTKACHHTAGTHTTWGGTGA |
| | R3 | 5978 | ACTAATAGYATCACYGCCA |
| | F4 | 5940 | TAYWCTAATAGYTGCCTTG |
| | R4 | 8244 | ACATCAGAYTCCACACC |
| | F5 | 796 | TGGCCAGGAAARTTTAGC |
| | R5 | 10071 | TCACTACCAGTYTCRCTGTA |
| | F6 | 9854 | TACTGATGGTAARCTKAATTGTAG |
| | R6 | 12438 | CATAGTTTGCATAGCACT |
| | F7 | 12559 | TCWATGTATAAGCAAGCACGT |
| | R7 | 14645 | GGATCWGCKGCATACATCAT |
| | F8 | 14556 | TATCTTGTGGTTATCACTAC |
| | R8 | 17048 | ATACCTCTCTTGATTCAC |
| | F9 | 16931 | CGYATWGAYTATAGTGATGCTG |
| | R9 | 18983 | ATCCCAMTCMACACGTTC |
| | F10 | 18791 | TATGCCTGCTGGASTCATTC |
| | R10 | 20854 | ATACTGRCACAATTGCATATATT |
| | F11 | 20639 | CCTATTGAYTTAACWATGATTG |
| | R11 | 22365 | GARWAGAGRTGAACRCCTTG |
| | F12 | 22338 | GAGTGGTTYGGYATTACMCA |
| | R12 | 24688 | GAAATAGCACCRAAAGTRTTAG |
| | F13 | 24267 | GCWGATCCYGGYTATATGC |
| | R13 | 26250 | CATAACGRTTKTGYYCGAAG |
| | F14 | 26140 | ACTAAAGYATYAGCAAAACAAGA |
| | R14 | 27969 | CGTTAAACCCASTCSTCAG |
| | F15 | 27842 | GCTAYTMGATTATGTGTGC |
| | R15 | 30232 | GCCTAATCTAATTGAATAATAGC |
| | 5OR | 268 | GTCACACTAGCCTTGGAAAGCA |
| | 5IR | 83 | CAGACCACAACACAACACGCACACAACA |
| | 3OF | 30061 | ATCATGTTARACTTACAGTGCAAG |
| | 3IF | 30151 | AAAGACTGTCACCTCTGCGTGATT |
| JTAC2 | F1 | 4010 | CCACTATGTSACCAATWTYTATGAT |
| | R1 | 6107 | CTTATCAATAAGCTTAGTAGCGTCT |
| | F2 | 5961 | TGTYGGMCAYTATACTGTTTTTGA |
| | R2 | 8460 | ACACGGCAATARGTCATAGC |
| | F3 | 8172 | TGGTAAAACWCTTGTKTTTGC |
| | R3 | 9704 | ACAAGCGCCATTAATGAA |
| | F4 | 9615 | TTAAYATTYTGGCRTGCTATGAT |
| | R4 | 11457 | CYTGTTCMGCCATTCTATCAA |
| | F5 | 11364 | GTTCTCCACCTCAGTTGGT |
| | R5 | 13349 | TCCTCACCAAAWATATCACTCTT |
| | F6 | 13199 | GATAAYCAGGATCTTAATGGTGA |
| | R6 | 15458 | TGACATGRTCATAAGCRCACTT |
| | F7 | 15309 | ATTCWACTGCTAARTTTTGGGA |
| | R7 | 17318 | CCATAAASGAKATWACATGCTCATA |
| | F8 | 17174 | GAKGGTTGYGGTCTYTTTAAAG |
| | R8 | 18608 | GGTGTTGTARGCATTARCATAGC |
| | F9 | 18470 | TGCCMTTYTTYTTCTATGATG |
| | R9 | 20451 | TCRAGCACACTRTTGTAAGACATAG |
| | F10 | 20301 | GGACAATGTTYTGTACCAGTG |
| | R10 | 22672 | ACATTCTTRAAGGCKARCAACTG |
| | F11 | 22567 | AAYGTGTGCACCCAGTATACTAT |
| | R11 | 24958 | TGAMGCTTTAAACAGTGCAA |
| | F12 | 24352 | ATCCCAGAKTATGTYGATGTTAA |
| | R12 | 26127 | ACCTTATAGCCYTCKACAAGCA |
| | F13 | 25921 | CAGCATCCTTATGGCTTG |
| | R13 | 27429 | ACTTTGGCACAGTCATYTTATAG |
| Genome walking | R1 | 4161 | TGGCTGTAAAGTTGGCTGAGGT |
| | R2 | 4274 | GCCACCACCATGAGACAAATTCT |
| | R3 | 4368 | CAGAGCCAACCTTAAGTTTGCCA |
| | R4 | 2642 | ACTTACARCTAACACCGGCCAGT |
| | R5 | 2745 | TAGTCAAACCGTTCTCTACWGGAAT |
| | R6 | 2889 | CGTCATAGAATGCATAACCATCAAC |
Supplementary Table S1. Primers used in this study
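Many of the primers listed are degenerate: they contain IUPAC ambiguity codes (R, Y, W, K, M, S, D, H, V) and, in the pan-coronavirus set, the letter I. The sketch below counts and enumerates the concrete oligonucleotides a degenerate primer stands for. It assumes that I denotes inosine and treats it as a single fixed base rather than a four-fold ambiguity; that reading is an assumption, since the table does not define the symbol. Function names are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
    "I": "I",  # assumption: inosine, kept as a single literal base
}


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence encoded by a degenerate primer."""
    options = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*options)]


print(degeneracy("ATGGGWTGGGAYTAYCCIAARTG"))  # CoVOF: W, Y, Y, R -> 16
print(expand("CGYCTAGCAGCATCATCATA"))         # JTMC15 R6: two variants (C or T at Y)
```

Counting degeneracy before expanding is useful because highly degenerate primers can encode thousands of variants, and listing them all is rarely needed.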
| nsp | JTAC2$ length (aa) | JTAC2$ first–last residue position | JTMC15 length (aa) | JTMC15 first–last residue position | JPDB144 length (aa) | JPDB144 first–last residue position |
|---|---|---|---|---|---|---|
| 1 | - | - | 179 | M1–G179 | 731 | M1–G731 |
| 2 | 467 | Y1–G467 | 639 | G180–G818 | 489 | R732–G1220 |
| 3 (ADRP/PLPro) | 1637 | G468–A2104 | 1724 | A819–G2542 | 1572 | G1221–A2792 |
| 4 | 480 | G2105–Q2584 | 500 | K2543–Q3042 | 512 | T2793–Q3304 |
| 5 (3CLPro) | 302 | S2585–Q2886 | 306 | S3043–Q3348 | 306 | S3305–Q3610 |
| 6 | 276 | S2887–Q3162 | 290 | G3349–Q3638 | 292 | S3611–Q3902 |
| 7 | 83 | S3163–Q3245 | 83 | S3639–Q3721 | 83 | S3903–Q3985 |
| 8 | 195 | T3246–Q3440 | 198 | A3722–Q3919 | 199 | A3986–Q4184 |
| 9 | 108 | N3441–Q3548 | 113 | N3920–Q4032 | 110 | N4185–Q4294 |
| 10 | 135 | A3549–Q3683 | 139 | A4033–Q4171 | 139 | A4295–Q4433 |
| 11 | 17 | S3684–D3700 | 13 | S4172–V4184 | 14 | S4434–V4447 |
| 12 (RdRP) | 927 | S3684–Q4610 | 932 | S4172–Q5103 | 934 | S4434–Q5367 |
| 13 (Hel) | 597 | S4611–Q5207 | 601 | A5104–Q5704 | 598 | A5368–Q5965 |
| 14 (ExoN) | 517 | A5208–Q5724 | 527 | A5705–Q6231 | 523 | S5966–Q6488 |
| 15 (XendoU) | 339 | N5725–Q6063 | 346 | S6232–Q6577 | 342 | G6489–Q6830 |
| 16 (2′-O-MT) | 301 | S6064–K6364 | 298 | A6578–N6875 | 302 | A6831–L7132 |

Note: $ The nsp2 of JTAC2 is a partial sequence, lacking the 5′ terminus.
Supplementary Table S2. Putative nonstructural proteins (nsps) of ORF1a and ORF1b (replicase) in BatCoV JTAC2, JTMC15 and JPDB144.
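Each nsp entry gives a length and a first-to-last residue range, so the two can be checked against each other: the length should equal last minus first plus one. The sketch below parses a range written in the table's residue-plus-position style with an ASCII hyphen (for example "G180 -G818") and recomputes the length. The parser and its names are hypothetical and are not part of the original analysis.

```python
import re

# One-letter residue, position, hyphen (optionally spaced), residue, position.
RANGE = re.compile(r"^([A-Z])(\d+)\s*-\s*([A-Z])(\d+)$")


def parse_range(text: str) -> tuple[str, int, str, int]:
    """Split a range such as 'G180 -G818' into (first_aa, first, last_aa, last)."""
    match = RANGE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised residue range: {text!r}")
    first_aa, first, last_aa, last = match.groups()
    return first_aa, int(first), last_aa, int(last)


def range_length(text: str) -> int:
    """Number of residues covered by an inclusive first-last range."""
    _, first, _, last = parse_range(text)
    return last - first + 1


# Values taken from the nsp table: JTMC15 nsp2 and nsp12, JPDB144 nsp2.
for entry, reported in [("G180 -G818", 639), ("S4172- Q5103", 932), ("R732 -G1220", 489)]:
    assert range_length(entry) == reported
print("reported nsp lengths match their residue ranges")
```

The same check also shows that nsp11 and nsp12 start at the same residue in each genome, which is how the table records them.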
| ORF | JTAC2 length | Neixiang-14 length | Neixiang-14 % identity | 512 length | 512 % identity | PEDV length | PEDV % identity |
|---|---|---|---|---|---|---|---|
| 1a | 3700 | 2030 | 87.9 | 4128 | 68.9 | 4117 | 76.4 |
| 1b | 2680 | 2679 | 92.8 | 2681 | 85.3 | 2680 | 88.8 |
| S | 1365 | - | - | 1371 | 58.6 | 1386 | 57.1 |
| 3a | 224 | - | - | 224 | 56.9 | 224 | 63.1 |
| E | 76 | - | - | 76 | 83.1 | 76 | 83.1 |
| M | 226 | - | - | 227 | 79.7 | 226 | 84.6 |
| N | 307 | - | - | 394 | 62.6 | 441 | 58.8 |

Note: # Abbreviations and accession numbers: Neixiang-14: MIBtCoV Neixiang-14, KF294377; 512: BatCoV/512/2005, NC_009657; PEDV: PEDV-1C, KM609203. § incomplete sequences.
Supplementary Table S3. Comparison of ORF amino acid identities of JTAC2 with three other representative Alphacoronavirus strains#
| ORF | JPDB144 length | HKU4 length | HKU4 % identity | HKU5 length | HKU5 % identity | MERS length | MERS % identity |
|---|---|---|---|---|---|---|---|
| 1a | 4447 | 4445 | 93.8 | 4481 | 71.1 | 4391 | 64.7 |
| 1b | 2699 | 2699 | 97.7 | 2715 | 89.4 | 2701 | 87.4 |
| S | 1352 | 1352 | 94.5 | 1352 | 69.6 | 1353 | 67.1 |
| 3a (3) | 91 | 91 | 89.3 | 121 | 44.8 | 103 | 46.8 |
| 3b (4a) | 119 | 119 | 92.5 | 119 | 53.0 | 109 | 38.3 |
| 3c (4b) | 285 | 285 | 88.8 | 256 | 39.2 | 246 | 27.8 |
| 3d (5) | 227 | 227 | 93.0 | 223 | 46.6 | 224 | 46.9 |
| E | 82 | 82 | 98.8 | 82 | 80.7 | 82 | 69.9 |
| M | 219 | 219 | 97.3 | 220 | 82.3 | 219 | 84.2 |
| N | 423 | 423 | 97.9 | 427 | 74.4 | 413 | 70.8 |

Note: # Abbreviations and accession numbers: HKU4: BatCoV HKU4-4, EF065508; HKU5: BatCoV HKU5-1, EF065509; MERS: MERS-CoV ChinaGD01, KT006149. § ORF3a, 3b, 3c and 3d in HKU4 and HKU5 are described in MERS-CoV as ORF3, 4a, 4b and 5, respectively.
Supplementary Table S4. Comparison of ORF amino acid identities of JPDB144 with other two representative strains of Betacoronavirus lineage 3 (β3)#.