A large virus is defined as one that possesses either a large particle (> 100 nm) or a large genome (> 100 kb) (84). All viruses with large genomes also form large particles that are complex in both composition and structure. Furthermore, all large virus genomes known to date are double-stranded DNA (84) and include several animal viral families such as the Herpesviridae, Poxviridae, Iridoviridae, Asfarviridae, Phycodnaviridae, Baculoviridae and Nimaviridae, as well as some large dsDNA bacteriophages such as the Myoviridae (84). Their morphology varies from brick-shaped or ovoid virions (Poxviridae) and cylindrical virions (Baculoviridae) to a number of icosahedrally based forms infecting both eukaryotes (Herpesviridae and Iridoviridae) and prokaryotes (Myoviridae) (84).
Viral proteomics encompasses several areas, including virion proteomics, viral structural proteomics, viral interactomics and virus-induced cellular differential proteomics (64). It provides fundamental information on viral assembly mechanisms, which are important for understanding viral infection pathways. Recent developments in proteomic approaches provide powerful tools for studying the proteomics of large viruses. In this review, we summarize the advances in virion proteomics of large DNA viruses. In the first part of the review, the approaches used for virion proteomics are introduced and the recent achievements in proteomic studies of herpesviruses, poxviruses and nimaviruses are reviewed. In the second part, the virion proteomics of baculoviruses is summarized in detail.
The development of proteomic techniques combining mass spectrometry with database searches of sequenced genomes provided a comprehensive approach for virologists to identify protein components of large DNA viruses. Virion proteomic investigations are particularly amenable to this approach, since a large number of fully sequenced genomes are available, and viral particles consist of a relatively narrow range of proteins that have constant, stable profiles in comparison to whole cells (64).
In recent years, two mass spectrometry identification approaches have been widely used to analyze the protein components of purified virions, leading to the identification of previously unknown components of virus particles: peptide mass fingerprinting (PMF) using matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry, and one-dimensional or two-dimensional liquid chromatography linked to tandem mass spectrometry (LC-MS/MS or 2D-LC-MS/MS, with tandem MS usually performed on Qtrap, Q-TOF or MALDI-TOF/TOF instruments). Two proteomic strategies often used for virion proteomic studies are illustrated in Fig. 1. (ⅰ) Purified virions can be separated by 1D SDS-PAGE or 2D gel electrophoresis. Protein bands or spots are excised, digested with trypsin and subjected to PMF (usually using MALDI-TOF) or LC-MS/MS identification. (ⅱ) Purified virions can be digested directly with trypsin in solution and subjected to LC-MS/MS, which offers high sensitivity. Alternatively, a powerful shotgun proteomics approach, the so-called Multi-dimensional Protein Identification Technology (MudPIT), which usually applies strong cation exchange (SCX) column chromatography before the ordinary reversed phase (RP) column chromatography (also called 2D-LC), gives better peptide separation and reduces the complexity of samples for subsequent tandem MS detection. Each approach has its own technical advantages and limitations, and the proteins identified by different approaches usually complement one another. Therefore, to achieve a comprehensive analysis of a virion proteome, two or more proteomic approaches should be applied. Here, the virion proteomic studies of three kinds of complex enveloped large DNA viruses, namely herpesviruses, poxviruses and nimaviruses, are summarized in detail (Table 1).
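The trypsin digestion step shared by both strategies can be illustrated in silico. The following is a minimal sketch, not the software used in any of the cited studies. It assumes the commonly used rule that trypsin cleaves C-terminal to lysine (K) or arginine (R) unless the next residue is proline (P), and it uses standard monoisotopic residue masses. The function names and the example sequence are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when followed by P.

    Missed cleavages are not modelled in this sketch.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


# Hypothetical sequence, for illustration only.
for peptide in tryptic_digest("MKRPAAKG"):
    print(peptide, round(peptide_mass(peptide), 4))
```

In a PMF experiment, a list of such theoretical masses computed for every predicted ORF of a sequenced genome is what the measured MALDI-TOF peak list is matched against.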
Table 1. Summary of the virion proteomic studies mentioned in this review.
All infectious herpesvirus virions are composed of an icosahedral nucleocapsid (100 to 110 nm in diameter, T=16) surrounded by a structurally asymmetric matrix layer called the tegument, which is in turn surrounded by a lipid envelope. The multiple structural components and large size (120 to 200 nm in diameter) of the virions give the potential for packaging numerous viral and host proteins (64). So far, virion proteomics has been performed on several herpesviruses, including Herpes simplex virus type 1 (HSV-1) (59) of the alphaherpesviruses, human cytomegalovirus (HCMV) (2, 105) and murine cytomegalovirus (MCMV) (45) of the betaherpesviruses, and Epstein-Barr virus (EBV) (43), Kaposi's sarcoma-associated herpesvirus (KSHV) (3, 126), rhesus monkey rhadinovirus (RRV) (71) and murine gammaherpesvirus 68 (MHV68) (6) of the gammaherpesviruses (Table 1). The virion proteomic analyses of HCMV, a prototypic herpesvirus, are described here in detail.
HCMV is the largest and most complex member of the family of human herpesviruses and contains a linear 230 kb dsDNA genome encoding over 200 predicted ORFs (15). The first proteomic study of HCMV virions was conducted by Baldick and Shenk (2) in 1996, in which purified virions were separated by SDS-PAGE followed by N-terminal sequencing. In this way, eleven viral proteins (six of which had not been reported to be packaged in virions) and one actin-like cellular protein were identified.
Later, in 2004, Varnum et al. (105) applied two tandem MS-based proteomic techniques, LC-Qtrap (quadrupole ion trap) and LC-FTICR (Fourier transform ion cyclotron resonance), after first-dimension SCX-HPLC separation to determine the composition of the HCMV proteome. In total, 71 HCMV viral proteins and 71 host proteins were identified from purified HCMV virions (105). The LC-FTICR MS approach allowed relative quantification of the virion proteins from the average intensity of the spectra of the most abundant peptides for each protein, indicating that the tegument protein pp65 (UL83) was the most abundant protein in the virion and that gM (UL100) was the most abundant glycoprotein. The virion comprises approximately 50% tegument proteins, 30% capsid proteins, 13% envelope proteins and 7% undefined proteins. Host proteins identified from HCMV include many cytoskeletal proteins, chaperones, enzymes, transport proteins, signal transduction proteins and transcription or translation proteins. The host proteins may be picked up with the tegument matrix as the capsids move through the cytoplasm, or may localize to the envelope as the virions bud from the plasma membrane.
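The idea of estimating relative abundance from the average intensity of each protein's most abundant peptides can be sketched as follows. This is an illustration under stated assumptions, not the procedure of Varnum et al.: the number of peptides averaged (`top_n = 3`), the function name and the intensity values are all hypothetical.

```python
from statistics import mean


def relative_abundance(peptide_intensities, top_n=3):
    """Return protein -> fraction of the total signal.

    The abundance proxy for each protein is the mean intensity of its
    top_n most intense peptides (top_n is an assumed parameter).
    Proteins without any peptide intensity are skipped.
    """
    proxy = {
        protein: mean(sorted(values, reverse=True)[:top_n])
        for protein, values in peptide_intensities.items()
        if values
    }
    total = sum(proxy.values())
    return {protein: value / total for protein, value in proxy.items()}


# Hypothetical intensities, for illustration only.
example = {
    "pp65": [900, 800, 700, 10],
    "gM": [300, 200, 100],
}
print(relative_abundance(example))
```

Averaging only the strongest peptides makes the proxy less sensitive to poorly ionizing peptides, which would otherwise pull down the estimate for proteins with many detected peptides.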
Poxviruses are among the largest and most structurally complex of all known viruses (122). Virions appear as ovoid or brick-shaped membrane-bound particles (250-450 nm in length and 140-260 nm in height) with a complex internal structure featuring a walled, bi-concave core (which contains the genome) and one or two lateral bodies inside an external coat containing lipid and tubular or globular protein structures (20). The virions exist in several forms: intracellular mature virions (IMV), intracellular enveloped virions (IEV), cell-associated enveloped virions (CEV) and extracellular enveloped virions (EEV) (19). The IMV, which has a single membrane, is the most abundant virion form in the cytoplasm. A portion of the IMV is subsequently wrapped with two layers of Golgi membrane to form an IEV, which is transported along microtubules to the cell periphery and loses one membrane during virion egress to become a CEV, which is associated with the cell surface. The CEV is released into the medium by host cell Src/Abl kinases to become an EEV (19).
Vaccinia virus, the best-characterized member of the Poxviridae, contains more than 200 ORFs in a 190 kbp dsDNA genome (42). Several virion proteomic analyses have been performed on vaccinia virus. Initially, Takahashi et al. (94) identified 17 virally encoded components by N-terminal sequencing following the separation of purified IMV proteins by SDS-PAGE. Jensen et al. (42) identified 13 viral and 5 cellular proteins in IMV using 2D electrophoresis (2DE) coupled with PMF MS identification. Later, Yoder et al. (121) used PMF and LC-MS/MS (tandem MS by Qtrap and Q-TOF) identification approaches after separation of purified IMV proteins by SDS-PAGE or RP-HPLC, resulting in the identification of 63 virally encoded proteins. Around the same time, Chung et al. (19) analyzed the IMV proteome by trypsin and Lys-C digestion followed by LC-MS/MS (using LC-Q-TOF), identifying 75 viral proteins (74 with trypsin digestion; 67 with Lys-C digestion) and 23 host proteins. An improved approach using 2D-LC fractionation prior to MS/MS (also called MudPIT) was also applied in their study (19). The cellular proteins identified were mostly chaperone proteins, cytoskeletal proteins and protein transport/vesicular trafficking proteins. They also determined the relative abundances of viral proteins in IMV using the exponentially modified protein abundance index (emPAI) (39) estimated from MS/MS data. According to the calculated molar percentage (mp) and weight percentage (wp), the 75 viral proteins were classified into four groups by their abundance in virions. The most abundant viral-protein group (wp > 5) consisted of four core proteins (A4L, A10L, F17R and A3L) (19).
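The emPAI-based estimate can be made concrete with a short sketch. It assumes the index as it is commonly defined: emPAI = 10^(N_observed / N_observable) − 1, with the molar percentage given by emPAI / Σ emPAI × 100 and the weight percentage by (emPAI × Mr) / Σ (emPAI × Mr) × 100. The protein names, peptide counts and molecular weights below are hypothetical and are not taken from Chung et al. (19).

```python
def empai(observed_peptides, observable_peptides):
    """Exponentially modified protein abundance index for one protein."""
    return 10 ** (observed_peptides / observable_peptides) - 1


def abundance_percentages(proteins):
    """Compute (molar %, weight %) for each protein.

    proteins maps a name to a tuple:
    (observed peptides, observable peptides, molecular weight in Da).
    """
    scores = {
        name: empai(observed, observable)
        for name, (observed, observable, _) in proteins.items()
    }
    total_molar = sum(scores.values())
    total_weight = sum(scores[name] * proteins[name][2] for name in proteins)
    return {
        name: (
            100 * scores[name] / total_molar,
            100 * scores[name] * proteins[name][2] / total_weight,
        )
        for name in proteins
    }


# Hypothetical input, for illustration only.
example = {
    "protein_A": (10, 10, 20000),
    "protein_B": (10, 10, 60000),
}
for name, (mp, wp) in abundance_percentages(example).items():
    print(f"{name}: mp = {mp:.1f}, wp = {wp:.1f}")
```

Two proteins with the same emPAI contribute equally in molar terms, but the heavier one dominates the weight percentage, which is why the two measures can rank virion components differently.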
In addition to vaccinia virus, IMV particles from the myxoma leporipoxvirus have been subjected to proteomic analysis, leading to the identification of 17 viral proteins. Most of these are homologues of viral proteins found in vaccinia virions; the exceptions are one protein whose homologue has not been found in vaccinia virions and another that lacks a vaccinia virus homologue (122).
White spot syndrome virus (WSSV), one of the most devastating viral pathogens of cultured shrimp worldwide, is the type species of the genus Whispovirus, family Nimaviridae (99). WSSV is a large, non-occluded, bacilliform dsDNA virus (genome of about 300 kb) with an envelope and sometimes a tail-like appendage (37). Many investigations of the WSSV proteome have been carried out since 2002.
Huang et al. (37) identified 18 viral proteins (isolated from Penaeus japonicus) using a combination of SDS-PAGE with PMF or nano-ESI (electrospray ionization)-MS/MS. Zhang et al. (125) further reported 12 viral proteins using 2-DE with PMF or nano-ESI-MS/MS. In 2004, Tsai et al. (99) identified 33 WSSV viral proteins using SDS-PAGE and LC-MS/MS. Four crustacean proteins (sarco/endoplasmic reticulum Ca2+-ATPase, vitellogenin, hemocyanin and actin) were also identified in this study (99). Later, the same group reported further studies on the localization of WSSV virion proteins (100). Triton X-100 combined with various concentrations of NaCl was used to separate intact WSSV virions into distinct fractions, such as fractions containing envelope and tegument proteins, tegument and nucleocapsid proteins, or nucleocapsid proteins only. The localization of 15 known structural proteins was determined mainly by Western blots, and some assignments were confirmed by immunogold electron microscopy (IEM). Four capsid proteins, four tegument proteins and seven envelope proteins were identified. Furthermore, the purified nucleocapsid proteins were separated by SDS-PAGE and the minor bands were excised, digested with trypsin and subjected to LC-MS/MS analysis, identifying two newly reported virion-associated proteins (VP160A and VP160B). Xie et al. (118) reported a different strategy for localizing WSSV virion proteins. The purified WSSV virions were treated with Triton X-100 and separated into envelope and nucleocapsid fractions, which were then analyzed by SDS-PAGE and PMF. In total, 22 proteins were detected in the envelope fraction, 7 in the nucleocapsid fraction and 1 in both fractions (tegument).
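The logic of the two-fraction strategy of Xie et al. (118) amounts to simple set operations, sketched below. The sketch assumes, following the summary above, that a protein detected in both the envelope and nucleocapsid fractions is assigned to the tegument. The function name and the protein labels are hypothetical.

```python
def localize(envelope_fraction, nucleocapsid_fraction):
    """Assign proteins to a virion compartment from two detergent fractions.

    Proteins found only in one fraction are assigned to that compartment.
    Proteins found in both fractions are assigned to the tegument
    (an assumption that mirrors the summary in the text).
    """
    envelope = set(envelope_fraction)
    nucleocapsid = set(nucleocapsid_fraction)
    return {
        "envelope": sorted(envelope - nucleocapsid),
        "nucleocapsid": sorted(nucleocapsid - envelope),
        "tegument": sorted(envelope & nucleocapsid),
    }


# Hypothetical protein labels, for illustration only.
print(localize(["P1", "P2", "P3"], ["P3", "P4"]))
```

The salt-titration strategy of Tsai et al. (100) yields more than two fractions, so a direct extension would compare membership across each of those fractions instead of only two.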
Li et al. (57) applied a new quantitative MS approach, isobaric tags for relative and absolute quantification (iTRAQ), to study WSSV, and 23 envelope proteins and 6 nucleocapsid proteins were successfully identified. Shotgun proteomics using offline coupling of an LC system with MALDI-TOF/TOF was also applied as a complementary and comprehensive approach to investigate the WSSV virion proteome, identifying 45 viral proteins. By combining these results with other WSSV proteomic studies, Li et al. (57) proposed that WSSV is assembled from at least 58 structural proteins, including 28 proteins localized in the envelope, 9 proteins in the capsid structure and 5 proteins in the tegument (57).
The Baculoviridae, a diverse family of more than 600 viruses, infects mainly insects of the orders Lepidoptera, Hymenoptera and Diptera and encompasses four genera: alphabaculovirus (lepidopteran-specific NPV), betabaculovirus (lepidopteran-specific GV), gammabaculovirus (hymenopteran-specific NPV) and deltabaculovirus (dipteran-specific NPV) (40). NPVs are designated as either MNPV or SNPV, referring to whether the ODV particles contain multiple (M) or single (S) nucleocapsids (28). Based on phylogeny, alphabaculoviruses are divided into Group Ⅰ (such as AcMNPV) and Group Ⅱ (such as HearNPV) (35).
Two progeny phenotypes are produced in the baculovirus replication cycle, the occlusion-derived virus (ODV) and the budded virus (BV) (28). ODVs are encapsulated in a protein matrix composed predominantly of polyhedrin (or granulin in GVs) that allows the virus to persist in the environment with relatively stable viability for a long time. In larvae, ODVs initiate primary infections in midgut epithelial cells of susceptible hosts, while BVs are responsible for spreading the virus from cell to cell within the larval host (28). Although the two phenotypes are genotypically identical, each has characteristic structural components to accommodate its respective functions (13). At the early stage of the infection, the nucleocapsids bud from the plasma membrane to form BVs, while in the late stages of the infection the nucleocapsids are enveloped within the nucleus with a lipid bilayer from microvesicles that are considered to be derived from the inner nuclear membrane modified by virus-encoded proteins (9, 13, 36). Therefore, it is generally believed that BV and ODV share the same nucleocapsid components but differ in their envelopes. Moreover, in ODV the space between the nucleocapsid (NC) and the envelope (E) is filled with a tegument structure (112).
To date, 50 genome sequences of baculoviruses have been reported (as of July 7, 2009), including 36 alphabaculoviruses, 10 betabaculoviruses, 3 gammabaculoviruses and 1 deltabaculovirus. The availability of genome sequences has facilitated proteomic analysis of baculoviruses, especially with mass spectrometry-based techniques. Since Braunagel et al. (2003) first reported the ODV components of Autographa californica multiple nucleopolyhedrovirus (AcMNPV) (12), the ODV components of three other viruses, Culex nigripalpus nucleopolyhedrovirus (CuniNPV) (81), Helicoverpa armigera nucleopolyhedrovirus (HearNPV) (21) and Bombyx mori nucleopolyhedrovirus (BmNPV) (58), have also been reported. A summary of the combined data, representing all the proteins associated with ODVs, is given in Table 2. This information sheds light on ODV structure and assembly.
Table 2. Viral proteins associated with the baculoviral ODVs
AcMNPV is the type species of the baculoviruses and also the typical representative of the Group Ⅰ alphabaculoviruses and of the MNPVs. In 2003, Braunagel et al. (12) analysed AcMNPV ODV using multiple proteomic approaches: expression library screening (isolating colonies positive only to ODV antiserum), SDS-PAGE followed by PMF (using MALDI-TOF) or LC-MS/MS (using LC-Q-TOF), and MudPIT (2D-LC-MS/MS). In addition to polyhedrin, which cannot be completely removed during the purification of ODV, forty proteins were identified by at least one of these techniques. Among these proteins, eight were considered to be envelope proteins (ODV-E18, ODV-E25, ODV-E56, ODV-E66, P74, PIF-2, F-protein and VP91), one was a tegument protein (GP41), nine were nucleocapsid-specific or nucleocapsid-associated proteins (P6.9, VP39, VLF-1, 49K, ODV-EC27, P78/83, BV/ODV-C42, VP80 and FP25K) and five were proteins required for replication (DNA-pol, Helicase, IE-1, LEF-1 and LEF-3). The remaining seventeen proteins, for which no localization has been reported, are listed as others in Table 2 (Ac66, P33, Ac109, Alk-Exo, Ac5, Ac30, Ac39, Ac58, Ac59, HCF-1, Ac74, Ac79, PNK/PNL, CG30, Ac102, Ac114 and Ac132). There are some conflicting reports on the localization of VP91. Immuno-EM images showed association of VP91 with both envelopes and nucleocapsids. Since VP91 could not be dissociated completely from ODV virions by NP-40 detergent, it was thought to be nucleocapsid associated (88). However, the predicted highly hydrophobic N-terminal transmembrane region suggests that VP91 is a membrane protein, which might interact strongly with the nucleocapsid proteins (90). In this review, VP91 is listed as a predicted envelope protein in Table 2.
One of the most important discoveries of the proteomic research on AcMNPV ODV is that five of the six proteins essential for DNA replication were identified, namely DNA polymerase, Helicase, IE-1, LEF-1 and LEF-3. Packaging an almost complete set of DNA replication proteins into ODV virions may help the virus to replicate immediately after ODV entry into midgut cells.
Bombyx mori nucleopolyhedrovirus (BmNPV) is a Group Ⅰ NPV and is quite similar to AcMNPV in both gene sequence and gene order. Using SDS-PAGE or 2DE to separate virion proteins, followed by PMF, 16 viral proteins (Table 2) and 4 host proteins were identified in BmNPV ODV (58). Apart from polyhedrin, the viral proteins include four envelope proteins, one tegument protein, four nucleocapsid proteins and six other proteins. The four host proteins are heat shock cognate protein, heat shock protein hsp21.4, glutathione S-transferase 2 and cyclophilin A. This is the first report of host proteins associated with baculovirus ODV virions, but the potential functions of these host proteins need further investigation.
HearNPV (also called HaSNPV), first isolated in 1975 in Hubei Province of the People's Republic of China, has been used extensively in China for over 25 years to control H. armigera on cotton. Phylogenetic analysis indicated that HearNPV belongs to the Group Ⅱ NPVs (41). Its DNA genome is 131 kb and contains 135 ORFs potentially encoding proteins of 50 aa or larger (18).
Initially, several published methods were tried for purifying HearNPV ODV virions. Pure and intact HearNPV ODV virions were finally obtained using methods similar to those used for AcMNPV (12, 13), with minor modifications. First, polyhedra were pretreated with 0.01 mol/L HgCl2 in 0.1 mol/L Tris-HCl (pH 7.8) at room temperature for 30 min and washed to remove HgCl2, then resuspended in water and placed in a 70 °C water bath for 20 min. The pretreatments with HgCl2 and heat were intended to inactivate the alkaline protease, which helps to dissolve polyhedra in the larval midgut (92). ODV was released from the pretreated polyhedra by alkaline treatment (DAS solution containing 0.1 mol/L Na2CO3, 0.5 mol/L NaCl and 10 mmol/L EDTA, pH 10.9) and purified on continuous sucrose gradients. EM of negatively stained preparations showed that some of the ODVs from non-pretreated polyhedra had an incomplete, loose envelope (Fig. 2 B), while ODV virions from pretreated polyhedra appeared more intact and compact (Fig. 2 A). The SDS-PAGE profiles obtained with the two purification methods also differed (Fig. 2 C). As a result, purified ODVs from pretreated polyhedra were used for further proteomic research (21).
Figure 2. A: Negatively stained EM images of purified ODV virions from pretreated polyhedra. B: Negatively stained EM images of purified ODV virions from non-pretreated polyhedra. C: Coomassie brilliant blue stained (left) and silver stained (right) SDS-PAGE profiles of purified ODVs from pretreated (lane 1) and non-pretreated (lane 2) polyhedra, respectively.
SDS-PAGE combined with PMF mass spectrometry was applied to analyze purified ODVs of HearNPV and identified 23 viral proteins (Table 2) in addition to polyhedrin. Among these proteins, six envelope proteins (ODV-E18, ODV-E25, ODV-E56, ODV-E66, P74 and PIF-1), one tegument protein (GP41), eight nucleocapsid proteins (P6.9, VP39, VLF-1, 49K, ODV-EC27, P78/83, ODV/BV-C42 and VP80), three proteins involved in DNA replication and transcription (Helicase, LEF-3 and DNA-polymerase), and three other proteins, Ha66 (Ac66), P33 and Ha94 (Ac109), had been reported as ODV-associated proteins in the earlier AcMNPV ODV proteomic study (12). Ha44 and Ha100 (without homologues in AcMNPV) were identified for the first time as ODV-associated proteins. Ha44 is conserved in Group Ⅱ NPVs and GVs but does not exist in Group Ⅰ NPVs, while Ha100 is conserved only in Group Ⅱ NPVs and is a homologue of poly (ADP-ribose) glycohydrolase (parg). Western blot analysis revealed that Ha44 and Ha100 were present in both BV and ODV (21).
Baculoviruses that infect mosquitoes are of growing interest, as they are currently believed to represent a separate branch within the Baculoviridae that existed prior to the split of lepidopteran nucleopolyhedroviruses (NPVs) and granuloviruses (GVs) (81). CuniNPV is pathogenic to Culex spp. mosquitoes, which are important vectors of West Nile virus and St. Louis and Eastern encephalitis viruses. CuniNPV development is restricted to the nuclei of midgut epithelial cells in the gastric caeca and posterior stomach. The genome of CuniNPV contains 109 ORFs, only 36 of which show similarity to ORFs of other nucleopolyhedroviruses (1).
Multiple proteomic approaches, including Edman sequencing, PMF and LC-MS/MS (by LC-Q-TOF), were used to identify proteins associated with CuniNPV ODV (81). In addition to the major occlusion body (OB) protein (cun085), a total of 44 proteins were identified. Of the 44 proteins, 20 were encoded by ORFs unique to the CuniNPV genome, and they are not included in Table 2. The other 24 proteins were conserved among all sequenced baculovirus genomes except F-proteins and Bro (102). These comprise nine ODV envelope proteins (ODV-E18, ODV-E28 (Ac96) (14), ODV-E56, P74, PIF-1, PIF-2, PIF-3, F-protein and VP91), one tegument protein (GP41), seven essential nucleocapsid proteins (P6.9, VP39, VP1054, VLF-1, 38K, 49K and ODV-EC27) and a late gene expression factor (LEF-9). In addition, Cun58 (Ac68) and Cun106 (Ac81) were classified as predicted envelope proteins because of their predicted highly hydrophobic transmembrane regions, while the remaining four proteins, Cun92 (Ac66), Cun14 (P33), Cun69 (Ac109) and Cun109 (Bro), are listed as others in Table 2.
In Table 2, we summarize the results to date on baculovirus ODV structurally associated proteins identified either by proteomic investigations or by classical Western blot and IEM studies (the unique ORFs of CuniNPV are not included). The proteins associated with the ODV structure can be classified into several groups: envelope proteins, predicted envelope proteins, a tegument protein, nucleocapsid essential proteins, nucleocapsid associated proteins, replication and transcription proteins, and others. Polyhedrin was identified in the ODVs of AcMNPV, HearNPV and BmNPV, and the major OB protein was identified in CuniNPV. As it is difficult to avoid contamination by the major matrix protein of the OBs during the purification of ODV, neither polyhedrin nor the major OB protein of CuniNPV is included in Table 2.
The transmembrane domains of the proteins were analyzed on the Toppred website (http://mobyle.pasteur.fr/cgi-bin/MobylePortal/portal.py?form=toppred), and proteins with a high-scoring predicted transmembrane region are marked "TM" in the "TM prediction" column of Table 2. Almost all the ODV envelope proteins have predicted TMs except BV/ODV-E26, which is palmitoylated to associate with membranes (16). VP91, Ac68 and Ac81 are considered predicted envelope proteins in this review because their predicted TMs score highly and are conserved in their homologues in other baculoviruses. Ten of these ODV (predicted) envelope proteins are conserved in all baculoviruses: ODV-E18, ODV-E28, ODV-E56, VP91, Ac68, Ac81 and four oral infectivity proteins (P74, PIF-1, PIF-2 and PIF-3) (40, 66). The tegument protein GP41 and seven nucleocapsid essential proteins, namely the DNA core protein P6.9, the major capsid protein VP39 and the nucleocapsid assembly essential proteins VP1054, VLF-1, 38K, 49K and ODV-EC27, are conserved in all baculoviruses (40). In addition, four replication and transcription related proteins (DNA-pol, Helicase, LEF-1 and LEF-9) and four other proteins (Ac66, P33, Ac109 and Alk-Exo) are also conserved core proteins of all baculoviruses (40, 85). In conclusion, 26 of the 31 conserved baculovirus proteins have been identified in ODV virions (40, 66, 85). The association of these highly conserved proteins with the virion suggests that they play important roles in virion packaging, the infection pathway, and the subsequent uncoating and initiation of replication.
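The count of 26 conserved proteins can be checked by tallying the groups named in this paragraph. The sketch below simply encodes those lists; the group labels and the function name are chosen here for illustration.

```python
# Conserved (core) baculovirus proteins reported in ODV, grouped as in the text.
CORE_IN_ODV = {
    "envelope or predicted envelope": [
        "ODV-E18", "ODV-E28", "ODV-E56", "VP91", "Ac68", "Ac81",
        "P74", "PIF-1", "PIF-2", "PIF-3",
    ],
    "tegument": ["GP41"],
    "nucleocapsid essential": [
        "P6.9", "VP39", "VP1054", "VLF-1", "38K", "49K", "ODV-EC27",
    ],
    "replication and transcription": ["DNA-pol", "Helicase", "LEF-1", "LEF-9"],
    "others": ["Ac66", "P33", "Ac109", "Alk-Exo"],
}
TOTAL_CORE_PROTEINS = 31  # total number of conserved proteins given in the text


def count_core_in_odv(groups):
    """Count distinct proteins across all groups, rejecting duplicates."""
    names = [name for members in groups.values() for name in members]
    if len(names) != len(set(names)):
        raise ValueError("a protein is listed in more than one group")
    return len(names)


found = count_core_in_odv(CORE_IN_ODV)
print(f"{found} of {TOTAL_CORE_PROTEINS} conserved proteins found in ODV")
```

The group sizes are 10, 1, 7, 4 and 4, which sum to the 26 proteins stated in the text.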
Ha44 and Ha100 do not have homologues in Group Ⅰ alphabaculoviruses and both proteins are conserved in Group Ⅱ alphabaculoviruses. In contrast, BV/ODV-E26, PTP, Ac5, Ac30, Ac114 and Ac132 are unique Group Ⅰ proteins. These proteins may represent significant differences in ODV structure and/or assembly between Group Ⅰ and Group Ⅱ alphabaculoviruses.