Shotgun sequencing of the complete CyunNPV genome was conducted using a Roche 454 system. Scaffolds were assembled from high-quality reads using the Roche GS De Novo Assembler software. The genome was sequenced at 3439 coverage with 138,528 reads and an average read length of 331 bp. Gaps in the genome scaffolds were filled via PCR and Sanger sequencing. The genome sequence was 142,900 bp in length and had a G + C content of 45%. Genome analysis identified 147 methionine-initiated ORFs containing at least 150 nt with minimal overlaps (Supplementary Table S1). The polyhedrin gene was set as the first ORF (ORF 1) in the circular genome (Fig. 1). The 38 core genes (red), the 24 genes conserved in lepidopteran baculoviruses (blue), and the nine Group Ⅰ unique genes (green) are illustrated on the circular map (Fig. 1). Two of the previously identified Group Ⅰ unique genes, namely, the protein tyrosine phosphatase gene (ptp) and ac30, were not detected in CyunNPV. The remaining 71 common genes are shown in gray, and the five genes unique to CyunNPV are shown as open arrows (Fig. 1).
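The ORF criterion described above (methionine-initiated, at least 150 nt) and the G + C calculation can be illustrated with a minimal Python sketch. This is not the annotation pipeline used in the study: the function names are hypothetical, the scan treats the sequence as linear (it does not wrap around the origin of the circular genome), and the "minimal overlaps" filter is not implemented.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_orfs(seq: str, min_nt: int = 150) -> List[Tuple[str, int, int]]:
    """Return (strand, start, end) for ATG-initiated ORFs of at least min_nt.

    Coordinates are 0-based and end-exclusive on the scanned strand, and the
    length includes the stop codon. For each stop codon only the first
    in-frame ATG is used, so the longest candidate ORF is reported.
    Assumption: linear scan, no wrap-around for circular genomes.
    """
    orfs = []
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if i + 3 - start >= min_nt:
                        orfs.append((strand, start, i + 3))
                    start = None
    return orfs
```

On a real genome, the candidate list would still need the overlap filtering and homology checks described in the text before ORFs are numbered.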
Figure 1. Gene organization of CyunNPV. Arrows indicate ORFs and their transcriptional directions. The colors represent gene types: red indicates core genes, blue indicates genes conserved in lepidopteran baculoviruses, green indicates genes unique to members of Group Ⅰ Alphabaculovirus, gray indicates common genes, and open arrows indicate genes unique to CyunNPV. Pink squares represent repeat structures. The inner circle marks genome position at 20-kb intervals. The collinearly conserved region of lepidopteran baculoviruses is also indicated.
Among the 147 ORFs, 118 genes were categorized as follows based on function: 13 replication-associated ORFs, 11 transcription-related ORFs, ten ORFs involved in oral infection, 34 ORFs related to structure, and 50 auxiliary genes (Table 1). The functions of the remaining 29 ORFs remain to be determined.
Table 1. Gene content of CyunNPV.
Most baculoviral genomes contain homologous repeat sequences (hrs), which typically consist of a few repeated sequences with imperfect palindromes that are interspersed within the genome. The hrs of different baculoviruses are highly variable (Ferrelli et al. 2012), and previous studies have suggested that hrs can act as origins of DNA replication and enhancers of gene expression (Guarino et al. 1986; Kool et al. 1993; Rodems and Friesen 1995; Theilmann and Stewart 1992). A total of 13 hrs were identified in the CyunNPV genome (Fig. 2A). A cAMP response element (CRE)-like motif and its palindrome sequences were found in the hrs of CyunNPV (Fig. 2A). Additionally, an AT-rich region and a GC-rich region were detected in the hrs of CyunNPV (Fig. 2A). Transient assays have shown that the CRE-like motif functions as a transcriptional activator in AcMNPV (Landais et al. 2006). Non-hr replication origins have been reported in some baculoviruses (Habib and Hasnain 2000; Kool et al. 1993; Pearson et al. 1993; Wu and Carstens 1996). The sequence features of non-hr origins are similar to those of hrs, but a non-hr origin appears only once within a genome. A direct repeat structure with two imperfect palindromes was identified in the CyunNPV genome and is likely to act as a non-hr origin of DNA replication (Fig. 2B).
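The notion of an imperfect palindrome can be made concrete with a short Python sketch: a DNA palindrome equals its own reverse complement, and an imperfect one does so up to a small number of mismatched base pairs. The function names, window length, and mismatch threshold below are illustrative assumptions, not the parameters used to identify the hrs in this study.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def palindrome_mismatches(seq: str) -> int:
    """Count base pairs that break the palindrome.

    Position i pairs with position n-1-i, so comparing the first half of the
    sequence with the first half of its reverse complement counts each pair
    once. The central base of an odd-length window is ignored.
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    half = len(seq) // 2
    return sum(1 for a, b in zip(seq[:half], rc[:half]) if a != b)


def find_imperfect_palindromes(
    seq: str, length: int, max_mismatches: int
) -> List[Tuple[int, str, int]]:
    """Slide a fixed-length window and keep near-palindromic windows.

    Returns (0-based start, window, mismatch count) for each hit.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - length + 1):
        window = seq[i:i + length]
        mismatches = palindrome_mismatches(window)
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits
```

For example, `palindrome_mismatches("GAATTC")` is 0 (a perfect palindrome), whereas `palindrome_mismatches("GAATTA")` is 1 (an imperfect one).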
Figure 2. Repeat sequences of CyunNPV. A Sequence comparisons of hrs. The four bases, namely, adenine (A), cytosine (C), guanine (G), and thymine (T), are marked as green, blue, purple, and red, respectively. The CRE-like motif and its palindromes are indicated below the alignment. B Sequence comparison of direct repeats. Black background shows 100% identity among the compared regions, while dark gray and light gray colors indicate 75% and 50% identity, respectively. The palindromes are indicated above the alignment. The arrows show the direction of each palindrome.
CyunNPV contains the gp64 gene, indicating that it is a member of the Group Ⅰ alphabaculoviruses. A phylogenetic tree was constructed using the combined 38 core genes from CyunNPV, SujuNPV (Liu et al. 2014), ClasGV-B (Yin et al. 2015), and other sequenced baculoviruses in the NCBI database (Fig. 3A). CyunNPV was clustered under clade "a" of Group Ⅰ and appeared to have diverged after the radiation of clade "a" (Fig. 3A).
Figure 3. Phylogenetic tree construction and gene parity plot analysis. A Phylogenetic tree. The unrooted tree was constructed based on the 38 core proteins from all sequenced baculoviruses in the NCBI database using the maximum likelihood method. Only Group Ⅰ alphabaculoviruses are shown in the phylogenetic tree. The NCBI accession numbers of the sequences of Group Ⅰ alphabaculoviruses used in this study are indicated after the name of each species. The reliability of the tree was tested via the bootstrap method with 500 replicates. CyunNPV is indicated in red. B Gene parity plot analysis. The gene order of CyunNPV is compared with that of the Group Ⅰ clade "a" representative species, AcMNPV, and the Group Ⅰ clade "b" representative species, OpMNPV, based on ORF order. The red line indicates the collinearly conserved region.
Gene parity plots are useful for comparing gene organization between any two viral genomes (Hu et al. 1998a). CyunNPV was plotted against six representative baculoviruses (Supplementary Table S1), namely Autographa californica MNPV (AcMNPV, Group Ⅰ clade "a"), Orgyia pseudotsugata MNPV (OpMNPV, Group Ⅰ clade "b"), Helicoverpa armigera NPV-G4 (HearNPV-G4, Group Ⅱ), Cydia pomonella granulovirus (CpGV, Betabaculovirus), Neodiprion lecontei nucleopolyhedrovirus (NeleNPV, Gammabaculovirus), and Culex nigripalpus nucleopolyhedrovirus (CuniNPV, Deltabaculovirus). The plots indicated that the CyunNPV gene organization was highly collinear with that of Group Ⅰ alphabaculoviruses (Fig. 3B). Previously, we reported a collinearly conserved region in lepidopteran baculoviruses comprising 20 core genes and five lepidopteran baculoviral conserved genes (Zhu et al. 2014). The CyunNPV genome contains a similar collinearly conserved region (red line, Fig. 3B).
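The idea behind a gene parity plot can be sketched in a few lines of Python: each homologous gene contributes one point whose coordinates are its ORF positions in the two genomes, and collinear regions appear as monotonic runs of points. The gene labels and orders used with this sketch are toy placeholders, and the run-finding heuristic is an assumption made for illustration; it is not the procedure used to define the conserved region in Fig. 3B.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def parity_points(order_a: List[str], order_b: List[str]) -> List[Point]:
    """Pair the 1-based ORF position of each shared gene in genomes A and B.

    Genes present in only one genome are skipped. Assumes each gene label
    occurs at most once per genome.
    """
    pos_b = {gene: j for j, gene in enumerate(order_b, start=1)}
    return [
        (i, pos_b[gene])
        for i, gene in enumerate(order_a, start=1)
        if gene in pos_b
    ]


def longest_collinear_run(points: List[Point]) -> List[Point]:
    """Longest stretch of consecutive points whose B positions move in one
    direction (increasing, or decreasing for an inverted block)."""
    best: List[Point] = []
    run: List[Point] = []
    direction = 0
    for p in points:
        if not run:
            run = [p]
        else:
            step = 1 if p[1] > run[-1][1] else -1
            if len(run) == 1 or step == direction:
                run.append(p)
            else:
                # Direction changed: start a new run from the previous point.
                run = [run[-1], p]
            direction = step
        if len(run) > len(best):
            best = list(run)
    return best
```

Plotting the output of `parity_points` as a scatter plot gives the parity plot itself; an inverted block shows up as a run with decreasing B positions.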
Table S1. Genome annotation of CyunNPV.
Previous studies have identified 11 genes that are specific to Group Ⅰ viruses and are absent from all other baculoviruses (Herniou et al. 2003; Jiang et al. 2009). These genes could have played important evolutionary roles in the divergence of Group Ⅰ and Group Ⅱ alphabaculoviruses (Herniou et al. 2003; Jiang et al. 2009).
Although CyunNPV belongs to Group Ⅰ and contains nine of the 11 genes unique to Group Ⅰ, it lacks the protein tyrosine phosphatase gene (ptp) and the ac30 gene. The protein encoded by ptp has dual enzymatic activity, acting as a protein tyrosine phosphatase (Sheng and Charbonneau 1993) and as an RNA 5'-triphosphatase (Gross and Shuman 1998; Takagi et al. 1998). Deletion of ptp from AcMNPV is not deleterious to BV synthesis; however, it impairs ODV production in Sf-21 cells but not in TN-368 cells (Li and Miller 1995). PTP appears to enhance wandering behavior in infected insects, as BmNPV with a ptp deletion is unable to induce such behavior (Kamita et al. 2005). The ac30 gene appears to be non-essential for BmNPV because its deletion does not affect viral production or the median lethal dose (LD50) but does appear to prolong the median survival time (LT50) of the host (Huang et al. 2008).
Two hypotheses can explain the lack of the ptp and ac30 genes in CyunNPV. The first is that CyunNPV lost these two genes during evolution; the second is that ptp and ac30 were acquired by other Group Ⅰ viruses after CyunNPV diverged from the ancestor of Group Ⅰ viruses. The phylogenetic tree based on the 38 core genes grouped CyunNPV under clade "a" of Group Ⅰ (Fig. 3A), suggesting that the virus emerged after the split between clades "a" and "b". To verify this finding, a phylogenetic tree was constructed using the conserved gp64 genes specific to Group Ⅰ (Fig. 4). This tree likewise grouped CyunNPV under clade "a" (Fig. 4). Under the second hypothesis, the ancestors of both clades would have had to acquire ptp and ac30 independently after their divergence, which is unlikely. Consequently, the ptp and ac30 genes most likely existed in the ancestor of Group Ⅰ viruses, and CyunNPV lost these two non-essential genes during evolution.
Figure 4. Phylogenetic tree of GP64 shared among alphabaculoviruses. The unrooted tree was constructed based on the GP64 protein sequences using the maximum likelihood method. The genome accession number is indicated after each baculovirus. Bootstrap analysis was performed with 500 replicates. Bootstrap values over 50% are shown in front of each node.
Baculoviral F protein belongs to class Ⅰ viral envelope fusion proteins, which mediate the entry of BVs into permissive cells (Pearson et al. 2000). Previous studies have demonstrated that the furin cleavage site is essential for protein processing and for virus/host membrane fusion (Westenberg et al. 2002; Ijkel et al. 2000). Similar to the F-like proteins found in other Group Ⅰ viruses, the F-like protein of CyunNPV lacks the furin cleavage site and therefore has lost part of the fusion function (Pearson et al. 2000) (Fig. 5). In addition, a string of continuous polar amino acids was inserted into the fusion peptide region. Similar observations were obtained in Thysanoplusia orichalcea NPV (ThorNPV), in which five continuous glycines were inserted into the fusion peptide region (Fig. 5). The fusion peptide of F proteins contains highly conserved hydrophobic amino acids that form an amphiphilic structure and play a central role in facilitating membrane fusion (Tan et al. 2008). These insertions are likely to inactivate the fusion peptide function of the F-like proteins of CyunNPV and ThorNPV.
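Screening a protein sequence for a candidate furin cleavage site can be sketched as a simple motif search. The sketch below assumes the commonly cited furin consensus R-X-K/R-R, which is not stated in the text above, and the peptide strings used with it are hypothetical toy sequences rather than the F or F-like proteins shown in Fig. 5.

```python
import re
from typing import List, Tuple

# Assumption: commonly cited furin consensus R-X-[K/R]-R, where X is any
# residue. The lookahead lets overlapping matches be reported as well.
FURIN_MOTIF = re.compile(r"(?=(R.[KR]R))")


def find_furin_sites(protein: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched tetrapeptide) for each
    candidate furin recognition motif in a one-letter protein sequence."""
    return [
        (m.start() + 1, m.group(1))
        for m in FURIN_MOTIF.finditer(protein.upper())
    ]
```

An empty result for a given sequence only means that this particular consensus is absent; it does not by itself demonstrate loss of processing, which in the cited studies was established experimentally.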
Figure 5. Sequence alignment of F and F-like proteins. Alignment was performed using ClustalW. A schematic figure of SeMNPV F protein (Westenberg et al. 2004) with sequence alignments of two enlarged regions is shown at the bottom. The predicted regions of the furin cleavage site, fusion peptide, and TM domains are indicated below the alignments. The red and blue arrows indicate the regions of CyunNPV in which continuous polar amino acids are inserted in the fusion peptide region and the pre-transmembrane domain, respectively.
Many viral fusion proteins contain a short sequence that is rich in aromatic amino acids and precedes the transmembrane domain, known as the pre-transmembrane (preTM) domain (Lorizate et al. 2008). The preTM domains, particularly their conserved aromatic amino acids, play important roles in membrane fusion (Sainz et al. 2005; Salzwedel et al. 1999). Interestingly, the preTM regions of Group Ⅰ viruses were found to contain fewer aromatic amino acids than those of other baculoviruses (Fig. 5). Strikingly, the insertion of a continuous chain of polar amino acids in and near the preTM domain of the CyunNPV F-like protein disrupted its hydrophobicity (Fig. 5).
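The effect of a polar insertion on a hydrophobic region can be illustrated numerically. The following Python sketch uses the Kyte-Doolittle hydropathy scale and counts F, W, and Y as aromatic residues; both choices are assumptions made for illustration, since the text does not state how hydrophobicity was assessed, and the peptides used with it are hypothetical.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumption: histidine is not counted as aromatic here.
AROMATIC = set("FWY")


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle hydropathy over the peptide."""
    peptide = peptide.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


def aromatic_fraction(peptide: str) -> float:
    """Fraction of residues that are F, W, or Y."""
    peptide = peptide.upper()
    return sum(aa in AROMATIC for aa in peptide) / len(peptide)


def with_insertion(peptide: str, position: int, insertion: str) -> str:
    """Return the peptide with `insertion` placed before 0-based `position`."""
    return peptide[:position] + insertion + peptide[position:]
```

For a toy hydrophobic segment such as `"GLFGAI"`, inserting a polar stretch such as `"SSNNS"` in the middle lowers the mean hydropathy from about 2.0 to about 0.25, which is the kind of shift the alignment in Fig. 5 suggests qualitatively.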
One of the major features differentiating Group Ⅰ viruses from other baculoviruses is the presence of gp64 and the lack of the fusion function of the F protein (i.e., its conversion to an F-like protein). A previous study of selection pressure indicated that Group Ⅰ viruses emerged from an ancient member of Group Ⅱ and that the acquisition of a new fusion protein gene, gp64, played an important role in their emergence (Jiang et al. 2009). Group Ⅱ F proteins are functional envelope fusion proteins that are essential for the viral fusion process (Pearson et al. 2000; Ijkel et al. 2000). Acquisition of gp64 led to reduced selection pressure on the F protein, and the F-like proteins in Group Ⅰ lost part of their function (Jiang et al. 2009; Pearson et al. 2001; Rohrmann and Karplus 2001; Wang et al. 2014). The gene encoding the F-like protein in AcMNPV is not essential for viral infection and BV production but acts as a pathogenic factor in cellular and insect infection (Lung et al. 2003; Wang et al. 2008; Yu et al. 2009). It has been suggested that F-like proteins have functions other than membrane fusion, which could explain why they are retained in alphabaculoviruses under evolutionary selective pressure (Wang et al. 2008). In this study, we identified continuous stretches of polar amino acids inserted into the predicted fusion peptide and the preTM region of the CyunNPV F-like protein. The fusion peptide and preTM regions are essential for the viral fusion function. The amino acid insertions interfere with hydrophobicity in these regions and are likely to affect fusogenicity. A similar insertion in the fusion peptide region has been identified in ThorNPV and could provide insights into the evolution of the F-like proteins. Our findings suggest that different Group Ⅰ alphabaculoviruses followed different routes to inactivate the fusion function of F, targeting the furin cleavage site, the fusion peptide, or the preTM domain.
The acquisition of higher fusogenicity by GP64 and the inactivation of the fusion capacity of F proteins are considered critical events in the emergence of Group Ⅰ viruses. Our results provide new insights into the process of F inactivation during evolution.
CyunNPV is a member of Group Ⅰ alphabaculoviruses and exhibits certain unique features. Phylogenetic analysis shows that CyunNPV occupies a distinct branch in the alphabaculoviral clades. The F-like protein contains two insertions that are different from those found in other alphabaculoviruses and could reflect the process underlying the inactivation of the protein. The abovementioned unique features indicate that CyunNPV is a distinct species in the genus Alphabaculovirus.
This work was supported by the National Key R&D Program of China (Grant No. 2017YFD0200400) and the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB11030400). We thank the core facility of the Wuhan Institute of Virology for technical assistance.
Conceived and designed the experiments: ZZ, MW, ZH, FD, HW, and FY. Viral nucleic acid extraction and 454 sequencing: ZZ, DH, FY, XL, and ZL. Genome assembly, annotation and data analysis: ZZ, XL, JW, QW, and HL. Wrote the paper: ZZ, BA, ZH, and MW.