Rabies is a zoonotic infectious disease caused by rabies virus, with a mortality rate of nearly 100% in infected individuals. Rabies is widely distributed across the globe. More than 55,000 people die of rabies each year. About 95% of human deaths occur in Asia and Africa. India is the most severely affected country, followed by China. From 1996, human rabies cases in China rose continuously until it became one of the countries with the largest number of cases. Since 2007, although the number of human cases in China has declined overall, some provinces and regions have continued to experience a rise, and recently some regions have even seen a re-emergence of cases. Thus, rabies prevention and control in China continues to be a matter of great importance [8, 24].
Rabies virus is a negative-sense, single-stranded RNA virus belonging to the genus Lyssavirus, family Rhabdoviridae. The viral genome is about 12 kb long, and its genes, from the 3' to the 5' terminus, encode N (nucleoprotein), P (phosphoprotein), M (matrix protein), G (glycoprotein) and L (RNA-dependent RNA polymerase). Of these, the glycoprotein encoded by the G gene is located on the surface of the viral envelope and can effectively stimulate the production of virus-neutralizing antibodies. In addition, G is the site at which the virus binds host cell receptors; it determines viral tissue tropism and is related to viral virulence and pathogenicity. Thus, the G protein is a good choice for examining the relationship between different viral isolates.
As part of a national surveillance program of the rabies epidemic from 2007 to 2011, this study collected rabies specimens in emerging and re-emerging areas and carried out G gene sequencing of positive samples. To investigate the epidemic spread of rabies in China, we analyzed the molecular characteristics of these sequences together with sequences from high-incidence provinces in which the epidemic has receded over the last five years. Our findings provide data that should contribute towards determining rabies prevention and control strategies in the future.
Investigation of the geographical distribution of rabies cases in recent years (Fig. 1) shows that most cases occurred in the southeast regions of China (Fig. 2). After 2007, the epidemic in the southeast and the overall number of cases at the national level peaked and subsequently declined year by year. However, this reduction was accompanied by the emergence of cases in regions that were previously free of rabies and re-emergence in regions that previously had a low incidence of rabies cases. Specifically, new rabies cases were recorded in Shaanxi (XiAn) province, Gansu province, Liaoning province, Ningxia Autonomous Region and Xinjiang Autonomous Region (Table 1). In six other low-incidence provinces and municipalities (Beijing, Shanxi, Inner Mongolia, Shanghai, Yunnan and Hainan), the number of cases was significantly reduced (Fig. 3).
Figure 1. Geographical distribution of human rabies cases in China from 2005 to 2010. Geographical expansion is shown by the changing distribution of dots.
Figure 2. The rabies epidemic status of each province in China, 2007-2011. Different kinds of epidemic region are marked with the corresponding colors.
Table 1. Rabies cases in five emerging provinces in China from 2007 to 2011
Experimental testing identified 43 specimens positive for rabies virus. Of these newly collected samples, 9 were from emerging and re-emerging regions (Shaanxi (XiAn), Shanxi, Ningxia, Inner Mongolia and Shanghai) and the rest were from epidemic regions. For all 43 positive specimens, the full 1575 bp G gene coding region was obtained. Background information on the specimens is shown in Table 2.
Table 2. Background information of 43 positive specimens obtained in this study
The nucleotide sequence homology of the G gene among the 43 specimens obtained in this study ranged from 87% to 100%, and the homology of the deduced amino acid sequences ranged from 93.7% to 100%. Nucleotide variation consisted mainly of synonymous mutations, and the G gene nucleotide sequences of some specimens from the same province were identical.
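Ranges like those reported above can be obtained by comparing every pair of aligned sequences position by position. The following is a minimal sketch of that calculation, not the method used in this study: the function names and the toy sequences are hypothetical, the input is assumed to be already aligned to equal length, and alignment columns containing a gap in either sequence are assumed to be skipped.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def identity_range(aligned):
    """Lowest and highest pairwise identity over a dict of name -> sequence."""
    values = [
        percent_identity(aligned[x], aligned[y])
        for x, y in combinations(aligned, 2)
    ]
    return min(values), max(values)


# Toy aligned fragments (hypothetical, not data from this study).
toy = {
    "s1": "ATGGTTCCTC",
    "s2": "ATGGTTCCTC",
    "s3": "ATGATTCCGC",
}
low, high = identity_range(toy)
print(f"pairwise identity: {low:.1f}% to {high:.1f}%")
```

The same function works for nucleotide and deduced amino acid alignments, since it only compares characters column by column.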
The G gene of the rabies virus encodes 524 amino acids, of which the first 19 correspond to a hydrophobic signal peptide. During protein maturation the signal peptide is cleaved, leaving the mature glycoprotein of 505 amino acids. The glycoprotein is a crucial protein of the rabies virus in that it stimulates the generation of virus-neutralizing antibodies, and it carries at least three neutralizing antigenic sites (GI, GII and GIII). GII is a typical conformation-dependent antigenic site located in the amino acid segment spanning sites 34 to 200. Within this region, two main epitopes are located at sites 34 to 42 and sites 198 to 200. Other important sites causing antigenic changes include site 147 and site 184. GIII is located in the amino acid segment extending from site 330 to site 357 and is also a conformation-dependent antigenic site. Within this segment, sites 333 to 340 are the main binding sites of neutralizing antibodies, and the amino acids at sites 333, 336 and 339 appear to be the key residues. For most street strains, the amino acid at site 333 is Arg (R), which is possibly related to the virulence of the virus; R was present at this site in the G protein of all 43 specimens obtained in this study. Compared with domestic and international isolates, the G protein sequences of 31 virus specimens (including all the cases from emerging and re-emerging provinces) varied at amino acid sites 90, 168 and 332 (Table 3). In particular, the hydrophobic amino acid Tyr (Y) at site 168 was replaced by the hydrophilic amino acid Cys (C). These sites might be associated with the pathogenicity of rabies virus.
Table 3. Comparisons of amino acids sequences in the GII and GIII segments of the rabies virus G gene
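The site numbers discussed above depend on how the 19-residue signal peptide is counted. The sketch below illustrates the arithmetic under two stated assumptions that the text does not confirm: that site numbers refer to the mature glycoprotein (so site 1 is the first residue after the signal peptide), and that the 1575 bp coding region includes the stop codon. The helper names and the toy sequence are hypothetical.

```python
SIGNAL_PEPTIDE_LEN = 19  # from the text
PRECURSOR_LEN = 524      # from the text
MATURE_LEN = PRECURSOR_LEN - SIGNAL_PEPTIDE_LEN  # 505 residues

# Assumption: the coding region counts the stop codon, so
# (524 codons + 1 stop codon) * 3 nucleotides = 1575 bp.
CODING_REGION_BP = (PRECURSOR_LEN + 1) * 3


def mature_to_precursor_index(site):
    """0-based index in the precursor for a 1-based mature-protein site."""
    if not 1 <= site <= MATURE_LEN:
        raise ValueError(f"site {site} is outside 1..{MATURE_LEN}")
    return SIGNAL_PEPTIDE_LEN + site - 1


def residues_at(precursor, sites):
    """Map each mature-protein site to the residue found in the precursor."""
    if len(precursor) != PRECURSOR_LEN:
        raise ValueError("expected a full-length precursor sequence")
    return {site: precursor[mature_to_precursor_index(site)] for site in sites}


# Toy precursor (hypothetical): all Ala except Arg at mature site 333.
toy = ["A"] * PRECURSOR_LEN
toy[mature_to_precursor_index(333)] = "R"
toy_precursor = "".join(toy)

key_residues = residues_at(toy_precursor, [333, 336, 339])
print(MATURE_LEN, CODING_REGION_BP, key_residues)
```

If the site numbers were instead counted from the first residue of the precursor, `SIGNAL_PEPTIDE_LEN` would simply be dropped from the index calculation.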
Previous results have shown that the acetylcholine receptor (AchR) acts as the cell receptor of rabies virus in vivo and that the amino acids of the glycoprotein at sites 189 to 214 are related to virus/nerve cell tropism. Rustici et al. showed that, within the spatial structure of the glycoprotein, the amino acid segment from site 194 to site 197 (Asn-Ser-Arg-Gly) was important, with the spatial configuration of the Asn (site 194) and Arg (site 196) residues consistent with the AchR configuration. For the 31 G gene sequences (including all cases from emerging and re-emerging provinces) obtained in this study, only the amino acid at site 204 varied, where Ser (S) was replaced by Gly (G). This indicates that rabies virus nerve cell tropism did not change significantly.
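A comparison of this kind amounts to scanning a fixed window of the aligned glycoprotein sequences for sites where any specimen differs from a reference. The following is a minimal sketch under stated assumptions: the sequences are mature-protein sequences of equal length, sites are 1-based, and the function name, the filler residues and the sample names are hypothetical. Only the Asn-Ser-Arg-Gly segment at sites 194 to 197 and the Ser to Gly change at site 204 are taken from the text.

```python
def variable_sites(reference, samples, start, end):
    """Find sites in [start, end] (1-based) where a sample differs.

    Returns {site: (reference_residue, sorted list of alternative residues)}.
    """
    changes = {}
    for site in range(start, end + 1):
        ref_residue = reference[site - 1]
        observed = {seq[site - 1] for seq in samples.values()}
        alternatives = sorted(r for r in observed if r != ref_residue)
        if alternatives:
            changes[site] = (ref_residue, alternatives)
    return changes


def build_toy(site_204_residue):
    """Toy sequence: Ala filler (hypothetical) plus the residues named in the text."""
    seq = ["A"] * 214
    for offset, residue in enumerate("NSRG"):  # sites 194-197: Asn-Ser-Arg-Gly
        seq[194 - 1 + offset] = residue
    seq[204 - 1] = site_204_residue
    return "".join(seq)


reference = build_toy("S")
samples = {"specimen_a": build_toy("G"), "specimen_b": build_toy("G")}

changes = variable_sites(reference, samples, 189, 214)
print(changes)
```

With real data, the reference would be a chosen comparison strain and `samples` would hold the aligned specimen sequences.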
The glycosylation signal of the G protein is the Asn-X-Thr or Asn-X-Ser motif. Potential glycosylation sites have been predicted at sites 37, 70, 247, 319, 465 and 480. In this study, the virus specimens obtained had only two potential glycosylation sites, at sites 37 and 319.
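Potential sites of this kind can be located by scanning a protein sequence for the Asn-X-Thr/Ser motif. The sketch below is one simple way to do so and is not the prediction method used in this study. The function name and the toy peptide are hypothetical. Excluding proline at the X position is a commonly applied refinement that the text does not state, so it is exposed as an optional flag.

```python
import re


def find_sequons(protein, exclude_proline=True):
    """Return 1-based positions of Asn residues in Asn-X-Thr/Ser motifs.

    Assumption: positions are relative to the sequence passed in, so a
    precursor sequence gives precursor numbering.
    """
    middle = "[^P]" if exclude_proline else "."
    # A lookahead is used so that overlapping motifs are all reported.
    pattern = re.compile("(?=N" + middle + "[ST])")
    return [match.start() + 1 for match in pattern.finditer(protein.upper())]


# Toy peptide (hypothetical): motifs at positions 2, 11 and 15,
# plus an Asn-Pro-Ser at position 7 that the proline rule rejects.
toy_peptide = "MNKTAANPSQNGSANAT"

strict_sites = find_sequons(toy_peptide)
loose_sites = find_sequons(toy_peptide, exclude_proline=False)
print(strict_sites, loose_sites)
```

A motif match only marks a potential site; whether the residue is actually glycosylated cannot be decided from the sequence alone.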
The G gene sequences of the 43 positive specimens were aligned with representative sequences of Chinese epidemic strains and one American strain as the outgroup, all downloaded from GenBank, and a phylogenetic analysis was carried out (Fig. 4). According to the G gene grouping of strains from the current epidemic, the China strains can be divided into 6 groups (clade Ⅰ-Ⅵ). The phylogenetic tree showed that the 43 specimens were assigned to clades Ⅰ and Ⅱ, with all specimens from emerging and re-emerging provinces located in clade IA.
From 2007 to 2011, cases were recorded in five provinces and autonomous regions (Shaanxi, Gansu, Ningxia, Xinjiang and Liaoning) that were previously rabies free (Table 1) and which have therefore been classified as emerging areas of rabies. No virus specimens were obtained from Gansu and Liaoning provinces or Xinjiang Autonomous Region. Four canine specimens provided by Shaanxi (XiAn) in 2009 (3 from this study and 1 previously obtained sample) were from Hanzhong; they showed very high homology to one another and belonged to the clade IA subgroup. They also showed high homology with Sichuan specimens (Fig. 4). One human specimen isolated in Ningxia in 2011 had the closest homology with strains from Tianjin and Beijing, whereas two other human specimens from Ningxia, isolated in 1985 and 1986, were assigned to a different group (clade Ⅴ).
Figure 4. Phylogenetic tree of rabies street virus G gene sequences isolated in China. The tree is based on the 1575 nt (nt1–nt1575) glycoprotein (G) gene of rabies virus and rooted with the Pasteur strain of rabies virus (PV). Numbers at each node indicate the degree of bootstrap support; only those with support > 70% are indicated. Newly collected sequences are shown in red; sequences from GenBank are shown in black. Province, host, year of sample collection and GenBank accession no. are included for each taxon.
The tree showed a number of interesting associations (Fig. 4). (ⅰ) Only one Beijing sample, BeijingHu1, has been released at present; it was assigned to the clade IA subgroup and had the closest homology to the Tianjin strain; (ⅱ) The human specimen from Shanxi and the human specimens from Hebei in 2011 also formed a branch within the clade IA subgroup; (ⅲ) Canine samples from Shanxi (a low-incidence region) collected in 2007 and 2008 and samples from Hunan (a high-incidence region) were all assigned to the clade IC subgroup; (ⅳ) High homology existed between the canine and milk cow specimens from Inner Mongolia in 2011, which was consistent with the infection of the cow by a dog bite; they had high homology with Hebei samples but differed more from local wildlife samples collected from raccoon dogs in 2007 (clade IA); (ⅴ) Samples from Shanghai had a wide range of collection dates spanning 15 years (from 1992 to 2006) but were all placed within clade IA; (ⅵ) The three samples from Yunnan showed significant diversity: two were closer to a Guangxi strain and an Anhui strain in the clade IC subgroup, whereas the third was in clade Ⅵ and similar to another Guangxi strain.