Coccolithoviruses are a group of viruses that infect Emiliania huxleyi, a coccolithophorid alga with a global distribution in temperate and sub-temperate oceanic regions, and therefore play a crucial role in biogeochemical cycling and primary productivity (van Rijssel M, et al., 2002). So far, coccolithoviruses have been studied in mesocosm systems (Martinez J M, et al., 2007), natural open ocean blooms (Rowe J M, et al., 2011; Wilson W H, et al., 2002), and laboratory-based experiments (Allen M J, et al., 2006; Wilson W H, et al., 2005). With the recent sequencing of laboratory isolates, glimpses into the natural biodiversity of these viruses at the genetic level have been obtained (Allen M J, et al., 2006; Nissimov J I, et al., 2011; Nissimov J I, et al., 2011; Nissimov J I, et al., 2012; Nissimov J I, et al., 2012; Pagarete A, et al., 2012; Wilson W H, et al., 2005). EhV-86, the model strain, harbours a 407,339 bp genome which encodes 472 genes, including the core nucleo-cytoplasmic large DNA virus (NCLDV) genes for DNA polymerase, major capsid protein and RNA polymerase (Allen M J, et al., 2006). Although the majority of the gene content is designated as being of unknown or putative function, the genetic machinery for a near complete sphingolipid biosynthesis pathway (acquired through horizontal gene transfer from the host, E. huxleyi) has been identified in every coccolithovirus isolate to date (Allen M J, et al., 2006; Monier A, et al., 2009). Genomic analysis of coccolithoviruses isolated from different geographical locations has shown that, despite displaying similar genome sizes, they differ at a number of genomic loci; however, the functional relevance of this has yet to be determined experimentally (Allen M J, et al., 2007). Problems have arisen when extrapolating the diversity characterised in the laboratory to naturally occurring environmental virus communities.
Viral isolates studied in the laboratory often represent the most abundant virus strains present at the time of isolation (which in turn often depends on the most abundant host at that time). They are also biased towards isolates capable of infecting established laboratory strains of E. huxleyi, and may or may not be environmentally relevant. Indeed, under natural conditions in the oceans, many different viral strains compete with each other for infection and replication, and some will be more successful than others. The success of a virus is determined by a plethora of transient environmental conditions, which create an ever-changing landscape to which the virus must adapt. It is this intense selection pressure that contributes to the diverse pool of genes observed in the handful of 'model' strains characterised to date. However, reliance on a limited number of strains to infer ecological functional relevance often ignores the diversity and variation found in the natural environment.
Traditionally, the DNA polymerase gene is used to study the diversity and phylogeny of phycodnaviruses (algal viruses) (Chen F, et al., 1996). In recent years the gene encoding the major capsid protein (MCP) has also been used as an alternative marker, capable of distinguishing phylogenetic differences at the strain level (Larsen J B, et al., 2008; Rowe J M, et al., 2011). Several cruises have used these markers to observe the diversity of coccolithoviruses in natural blooms. The first looked at the temporal succession of E. huxleyi and its viruses during the propagation of a natural bloom in the North Sea in 1999 (Martinez J M, et al., 2012), whilst another cruise in the North Atlantic between Iceland and the UK in 2005 focused mainly on the distribution of coccolithoviruses, their location-specific distinctions and their clustering, using the MCP marker gene (Rowe J M, et al., 2011).
Despite these efforts, many questions remain unanswered. For example, it is not currently known how the coccolithoviruses persist during non-bloom periods. With the exception of coccolithovirus sequences extracted from Black Sea sediments (Coolen M J, 2011), the diversity of these viruses in non-bloom conditions is poorly understood. Given the harsh conditions to which viruses are exposed in their natural environment, it is somewhat surprising that infection, and the resulting bloom termination, occurs regularly and reliably on a yearly basis. Yet perhaps the most important question left unanswered (in this and the majority of virus systems under laboratory study) is: what is the functional relevance of the observed biodiversity, and how does it affect the ecology of the virus community and its function?
Here, we aim to investigate both the biogeographic and temporal distribution of coccolithoviruses and their diversity with the established MCP marker, whilst also targeting a gene whose protein is of known metabolic function during infection, serine palmitoyltransferase (SPT). SPT is the first and rate-limiting enzyme in the de novo sphingolipid biosynthesis pathway, and homologues are encoded by both the virus and host genomes (Han G, et al., 2006; Monier A, et al., 2009; Wilson W H, et al., 2005). It has been implicated in the formation of lipid rafts and virus release during infection (Pagarete A, et al., 2009), and is even considered to be involved in the mass termination of coccolithophore blooms via the propagation of programmed cell death (PCD) of its host (Bidle K D, et al., 2011; Vardi A, et al., 2009; Vardi A, et al., 2012). SPT gene expression has been observed during infection of E. huxleyi under both laboratory and natural conditions, and the enzyme's activity has been characterised (Allen M J, et al., 2006; Han G, et al., 2006; Pagarete A, et al., 2009; Pagarete A, et al., 2012). We use the two genes as markers for phylogeny and functionality in a study attempting to assess both spatial and temporal variability, using an archive of DNA samples collected during a cruise in the Atlantic Ocean, a coccolithophore bloom cruise in the North Sea in 1999, and samples collected weekly during a seven-year period from the Western Channel Observatory in the English Channel near Plymouth, UK. By obtaining samples from a variety of locations and time points, and by using the two marker genes, we hoped to improve the current understanding of the classification of these viruses and their distribution, and to gain an insight into their functional biodiversity and ecological relevance.
Seawater was collected twice a day (before dawn and at solar noon) using a conductivity, temperature and depth (CTD) instrument at 65 stations along the Atlantic Meridional Transect-20 (AMT-20) cruise track (www.amt-uk.org) (Supplementary material Fig. S1 and S2). Samples (10 L) were collected from five depths at each station, corresponding to 97%, 55%, 33%, 14% and 1% light penetration, filtered through a 0.2 μm Millipore nitrocellulose membrane filter (47 mm), snap-frozen in liquid nitrogen and stored at -80 ℃. Two 1 mL samples from each depth were fixed in 1% glutaraldehyde for Analytical Flow Cytometry (AFC) of coccolithophores and coccolithoviruses. AFC data are available from http://www.bodc.ac.uk/projects/uk/amt/.
The Western Channel Observatory is located 10 km south of Plymouth Sound in the English Channel (50° 15.00' N, 4° 13.02' W). Weekly samples (1 L) were taken between 2001 and 2007 from the L4 station, filtered onto a 0.45 μm Millipore nitrocellulose membrane filter (47 mm), snap-frozen in liquid nitrogen and stored at -80 ℃. Two 1 mL samples were fixed in 1% glutaraldehyde for Analytical Flow Cytometry (AFC) of coccolithophores and coccolithoviruses.
Extracted DNA samples from the DISCO cruise (Dimethyl Sulphide Biogeochemistry within a Coccolithophore Bloom) were obtained from the Plymouth Marine Laboratory DNA archive. Samples were originally collected in June 1999 on board the RRS Discovery during a phytoplankton bloom located in the North Sea (East to West from -2.0° to 4.0° and North to South from 61.0° to 51.0°). Further details on the methodology used can be obtained from Martinez et al. (Martinez J M, et al., 2012).
All samples were subjected to a total genomic DNA extraction following an adapted phenol-chloroform protocol as described by Schroeder et al. (Schroeder D C, et al., 2005). Extracted DNA samples were subjected to a two-step nested PCR. Primers and reaction conditions for the detection of the coccolithovirus Major Capsid Protein (MCP) by DGGE have been described previously (Martinez J M, et al., 2012; Martinez J M, et al., 2007). Primers for the detection of the coccolithovirus serine palmitoyltransferase (SPT) gene were designed manually following the multiple DNA alignment of SPT from nine fully sequenced coccolithovirus genomes (EhV-84, EhV-86, EhV-88, EhV-99B1, EhV-201, EhV-202, EhV-203, EhV-207 and EhV-208). All PCR reactions were conducted in a VWR JENCONS Uno Thermal Cycler in a 25 μL final volume (cycle conditions in Supplementary material Table S1). For the first step of the nested PCR reaction, 1 μL of DNA template (typically ~50 ng of extracted DNA) was mixed with 5 μL of 5 × PCR reaction buffer (Promega), 1.5 μL of 25 mmol/L MgCl2, 0.1 μL of Taq DNA polymerase (Promega), 2 μL of each 10 μmol/L primer (MCP-F1/MCP-R1 or SPT-F1/SPT-R1, see Table 1), 1.25 μL of 2 mmol/L dNTPs and DNA-free molecular biology grade water (Sigma-Aldrich) up to a final volume of 25 μL. Only samples that gave a band when visualised by agarose electrophoresis after the first step were subjected to the second PCR step. The second reaction of the nested PCR was performed under the same conditions as the first round, except that 2 μL of product from the first reaction was used as template and the primers used were MCP-F2-GC/MCP-R2 (generating a 135 bp fragment) or SPT-F2-GC/SPT-R2 (generating a 335 bp fragment), see Table 1.
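The first-round reaction mix described above can be checked with a short calculation. The following is a minimal sketch, not the authors' procedure: it reads all listed volumes as microlitres and the primer stocks as 10 μmol/L (both assumptions), and the function names are hypothetical. It computes the volume of water needed to reach 25 μL and the final concentration of each reagent from the dilution rule C1·V1 = C2·V2.

```python
# Sketch of the first-round PCR mix arithmetic.
# Assumptions: all volumes are in microlitres; primer stocks are 10 umol/L.
FINAL_VOLUME_UL = 25.0

COMPONENTS_UL = {
    "DNA template": 1.0,
    "5x PCR buffer": 5.0,
    "25 mmol/L MgCl2": 1.5,
    "Taq DNA polymerase": 0.1,
    "forward primer": 2.0,
    "reverse primer": 2.0,
    "2 mmol/L dNTPs": 1.25,
}


def water_volume(components, final_volume):
    """Volume of water needed to bring the mix up to the final volume."""
    used = sum(components.values())
    if used > final_volume:
        raise ValueError("components exceed the final reaction volume")
    return final_volume - used


def final_concentration(stock, volume, final_volume):
    """Dilution rule C1*V1 = C2*V2 solved for C2 (same unit as the stock)."""
    return stock * volume / final_volume


if __name__ == "__main__":
    print("water (uL):", round(water_volume(COMPONENTS_UL, FINAL_VOLUME_UL), 2))
    print("buffer (x):", final_concentration(5.0, 5.0, FINAL_VOLUME_UL))
    print("MgCl2 (mmol/L):", final_concentration(25.0, 1.5, FINAL_VOLUME_UL))
    print("each primer (umol/L):", final_concentration(10.0, 2.0, FINAL_VOLUME_UL))
    print("dNTPs (mmol/L):", final_concentration(2.0, 1.25, FINAL_VOLUME_UL))
```

Under these assumptions the listed components occupy 12.85 μL, so roughly 12.15 μL of water completes each reaction.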
Table 1. The primers used in this study
Primer       Sequence (5' to 3')
MCP-F1       GTCTTCGTACCAGAAGCACTCGCT
MCP-R1       ACGCCTCGGTGTACGCACCCTCA
MCP-F2-GC    CGCCCGGGGCGCGCCCCGGGCGGGGCGGGGGCACGGGGGGTTCGCGCTCGAGTCGATC
MCP-R2       GACCTTTAGGCCAGGGAG
SPT-F1       GTTGGATATCCCGCAACACC
SPT-R1       CAATGTCGCCAATGTTGGC
SPT-F2-GC    CGCCCGGGGCGCGCCCCGGGCGGGGCGGGGGCACGGGGGGGAATCTCGCGCGCG
SPT-R2       CGCGGTCCACATGTACC
SPT-F2       GGAATCTCGCGCGCG
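The two "-GC" primers in Table 1 share a long GC-rich 5' extension, presumably a GC clamp of the kind commonly added for DGGE (an interpretation, not stated in the text). A minimal sketch for separating that shared extension from the gene-specific part and reporting GC content is given below; the helper names are hypothetical and the sequences are copied from Table 1.

```python
# Sketch: inspect the shared 5' extension of the two "-GC" primers in Table 1.
PRIMERS = {
    "MCP-F2-GC": "CGCCCGGGGCGCGCCCCGGGCGGGGCGGGGGCACGGGGGGTTCGCGCTCGAGTCGATC",
    "SPT-F2-GC": "CGCCCGGGGCGCGCCCCGGGCGGGGCGGGGGCACGGGGGGGAATCTCGCGCGCG",
}


def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def common_prefix(a, b):
    """Longest prefix shared by two sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]


if __name__ == "__main__":
    shared = common_prefix(PRIMERS["MCP-F2-GC"], PRIMERS["SPT-F2-GC"])
    print("shared 5' extension length:", len(shared))
    print("GC content of shared extension:", round(gc_content(shared), 2))
    for name, seq in PRIMERS.items():
        # Whatever follows the shared extension is treated as gene-specific.
        print(name, "gene-specific part:", seq[len(shared):])
```

Because the shared prefix is found by simple string comparison, a base that happens to match at the start of both gene-specific regions would be counted as part of the extension.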
DGGE was performed using an Ingeny PhorU-2 system. 15 μL of nested PCR product was applied directly onto an 8% w/v polyacrylamide gel (acrylamide/N,N'-methylene bisacrylamide, 37:1, w/w) in 1 × TAE buffer (40 mmol/L Tris pH 7.4, 20 mmol/L sodium acetate, 1 mmol/L Na2EDTA). A 30 to 60% linear denaturing gradient was formed using 20% and 80% denaturants (100% denaturant being 7 mol/L urea and 40% v/v formamide). 20 μL of marker (composed of the single product from the nested PCR of nine laboratory strains) was loaded in the first well of each gel. For SPT samples, electrophoresis was performed at a constant voltage of 100 V and a temperature of 60 ℃ for 17 h. For MCP samples, a constant voltage of 200 V was used for 3.5 h. Following electrophoresis, the gels were stained for 1 h in Milli-Q water containing 1 μg/mL ethidium bromide, de-stained in Milli-Q water for 1 h, visualised on a UV transilluminator (Syngene GeneGenius) and photographed using the Syngene GeneSnap software. Bands of interest were excised and incubated in 30 μL of DNA-free water at 4 ℃ overnight, and 2 μL was then used as template for a final third-step PCR following all the conditions of the second step (using MCP-F2-GC/MCP-R2 or SPT-F2/SPT-R2, Table 1), prior to sequencing. Sanger sequencing was performed by the LGC Genomics Sequencing Centre in Germany (www.lgcgenomics.com). The LifeTech (formerly Applied Biosystems) BigDye version 3.1 sequencing mix was used for cycle sequencing. After purification of the sequencing reactions by gel filtration (Centri–Pur 96, EMP biotech Berlin), the samples were run on an ABI 3730 XL instrument using POP7 polymer and standard run conditions.
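The denaturant percentages above translate into urea and formamide concentrations by simple proportion. The sketch below works through that arithmetic for the 30% and 60% ends of the gradient. It also estimates how the 20% and 80% stocks would be blended to give those end points, which assumes linear mixing of the two stocks; the text does not describe how the gradient was actually poured, and the function names are hypothetical.

```python
# Sketch of the denaturant arithmetic for the DGGE gradient.
UREA_AT_100 = 7.0         # mol/L urea in 100% denaturant (from the text)
FORMAMIDE_AT_100 = 40.0   # % v/v formamide in 100% denaturant (from the text)


def composition(percent_denaturant):
    """Return (urea in mol/L, formamide in % v/v) at a given denaturant %."""
    scale = percent_denaturant / 100.0
    return UREA_AT_100 * scale, FORMAMIDE_AT_100 * scale


def high_stock_fraction(target, low=20.0, high=80.0):
    """Volume fraction of the high stock needed to reach the target %.

    Assumes the low and high stocks mix linearly (an assumption).
    """
    if not low <= target <= high:
        raise ValueError("target must lie between the two stocks")
    return (target - low) / (high - low)


if __name__ == "__main__":
    for target in (30.0, 60.0):
        urea, formamide = composition(target)
        frac = high_stock_fraction(target)
        print(f"{target:.0f}%: {urea:.1f} mol/L urea, {formamide:.0f}% formamide, "
              f"{frac:.3f} parts 80% stock + {1 - frac:.3f} parts 20% stock")
```

Under these assumptions the gradient runs from about 2.1 mol/L urea with 12% formamide at the top to 4.2 mol/L urea with 24% formamide at the bottom.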
Coccolithovirus isolate MCP and SPT sequences retrieved from GenBank (JF974290, AJ890364, JF974310, FN429076, JF974311, EhV-202, JF974291, JF974317 and JF974318) and the new environmental sequences (submitted to GenBank under accession numbers: AB738836, AB738837, AB738838, AB738839, AB738840, AB738841, AB738842, AB738843, AB738844, HE970437, HE970438, HE970439 and HE970440) were aligned using the MEGA4 (version 4.0.2) multiple sequence alignment software (Tamura K, et al., 2007). Prior to CLUSTALW alignment, the primer sequences were removed and the sequences were trimmed to the same length in order to decrease the bias that can arise from gaps and from sequences of different lengths. The evolutionary history was inferred using the Neighbor-Joining method (Saitou N, et al., 1987). Phylogenetic analyses were conducted in MEGA4 (Tamura K, et al., 2007). Translated MCP and SPT gene sequences from EhV-86 and EhV-99B1 were modelled in order to determine their predicted secondary and tertiary protein structures. Sequences were uploaded to the Phyre2 protein fold recognition server (Kelley L A, et al., 2009). The final models of SPT and MCP were built by Phyre2 using seven and six templates, respectively, in order to maximise confidence, percentage identity and alignment coverage. The resulting theoretical models were loaded into the 3D protein structure analysis software Jmol (Hanson R, 2010), Swiss-PdbViewer (Guex N, et al., 1997), and Astex (Hartshorn M J, 2002).
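The pre-alignment clean-up described above (removing primer sequences and bringing all sequences to the same length) can be illustrated with a short sketch. This is not the authors' workflow, which was carried out in MEGA4: the function names are hypothetical, primers are matched only as exact strings at the sequence ends, and truncating from the 3' end to the shortest length is an assumption, since the text does not say how the lengths were equalised.

```python
# Sketch: strip primers and trim sequences to a common length before alignment.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def strip_primers(seq, forward, reverse):
    """Remove an exact forward primer at the 5' end and the reverse
    complement of the reverse primer at the 3' end, if present."""
    seq = seq.upper()
    forward = forward.upper()
    if forward and seq.startswith(forward):
        seq = seq[len(forward):]
    rc = reverse_complement(reverse)
    if rc and seq.endswith(rc):
        seq = seq[:-len(rc)]
    return seq


def trim_to_common_length(seqs):
    """Truncate every sequence to the length of the shortest one.
    Assumption: extra bases are dropped from the 3' end."""
    shortest = min(len(s) for s in seqs.values())
    return {name: s[:shortest] for name, s in seqs.items()}


if __name__ == "__main__":
    # Toy sequences and primers, for illustration only.
    raw = {
        "band_1": "AAGGCATCATCAGGAA",
        "band_2": "AAGGCATCTTGGAA",
    }
    cleaned = {n: strip_primers(s, "AAGG", "TTCC") for n, s in raw.items()}
    print(trim_to_common_length(cleaned))
```

Sequences whose ends do not match a primer exactly are returned unchanged by `strip_primers`, so real data with sequencing errors near the ends would need a more tolerant match.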
Host and virus abundances were determined using a FACScan Flow Cytometer (Becton Dickinson, Oxford, UK) as described by Brussaard et al. (Brussaard C P, et al., 2000). Data files were analysed using the WinMDI 2.9 and CellQuest Pro software.