The full-length SARS-CoV-2 nsp2 gene (nsp2full-length), encoding 638 amino acid residues, was obtained from Prof. Bo Zhang (Wuhan Institute of Virology, Chinese Academy of Sciences). The nsp2full-length construct was subcloned into the EcoR I/Kpn I sites of a modified pPICZ expression vector with a C-terminal GFP-6 × His tag under the AOX1 promoter for eukaryotic expression. The plasmids were transformed into P. pastoris X-33 cells. Expression was detected by GFP fluorescence on a small scale. Strains with high expression were selected and cultured in BMM medium (100 mmol/L potassium phosphate pH 6.0, 1.34% YNB, 1% methanol) at 28 ℃ for 72 h, with methanol added every 24 h.
Using nsp2full-length as a template, the region encoding the N-terminal 276 amino acid residues (nsp21-276) was inserted into the pGEX expression vector (Novagen). The mutant of nsp21-276 was created by site-directed mutagenesis and verified by DNA sequencing. The GST fusion tag can be removed from the nsp21-276 constructs at a tobacco etch virus (TEV) protease cleavage site. The plasmids were transformed into Escherichia coli BL21 (DE3) cells. Cells were grown in Luria–Bertani medium at 37 ℃ until the OD600 reached 0.8–1.0, then supplemented with 100 μmol/L ZnCl2 and induced at the same time with 0.2 mmol/L isopropyl β-D-1-thiogalactopyranoside (IPTG) for an additional 12 h at 18 ℃.
Both yeast and E. coli cultures were harvested by centrifugation at 4,000 × g for 10 min at 4 ℃ and resuspended in lysis buffer (20 mmol/L Tris pH 7.5, 500 mmol/L NaCl, 2 mmol/L β-mercaptoethanol, 5% glycerol, 30 mmol/L imidazole) supplemented with 0.1% (v/v) Triton X-100 and 1 mmol/L PMSF (Invitrogen). Cells were disrupted by high pressure (1,800 bar, P. pastoris) or sonication (E. coli), and the lysate was clarified by centrifugation at 18,300 × g for 45 min to remove cell debris. The supernatant was applied to a Ni2+-chelating column or a GST affinity column (GE Healthcare). The GFP or GST fusion tag was cleaved with TEV protease at 4 ℃ for 6–8 h. The protein sample was then loaded onto a Q Sepharose column (GE Healthcare) and eluted with a NaCl gradient. The proteins were further purified by size exclusion chromatography on a Superdex 200 10/300 column (GE Healthcare) in SEC buffer (20 mmol/L Tris pH 7.5, 150 mmol/L NaCl, 2 mmol/L TCEP). The peak fractions were collected and checked by SDS-PAGE (Fig. 1A). Protein with a purity of 95% was concentrated to 5.5 mg/mL and stored at 4 ℃ for further use.
Figure 1. The overall structure of nsp2. A The nsp21-276, nsp2full-length, and nsp21-276 mutant 1 (K111A/K112A/K113A) samples after purification by size exclusion chromatography. B The overall structure of nsp21-276, with four chains in the asymmetric unit. Chains A, B, C, and D are shown in pink, yellow-orange, cyan, and green, respectively. C The stereo structure of nsp21-276. The α-helices and β-sheets are colored cyan and magenta, respectively, and the three zinc atoms are shown as red, blue, and yellow spheres. A diagram of the Zn2+ binding sites is shown below the stereo structure of nsp21-276.
Initial crystal screening was performed at 4 ℃ with multiple commercial screens (Hampton Research). Using the sitting-drop vapor diffusion method, 1 μL of purified protein was mixed with 1 μL of reservoir solution in 48-well plates. Initial nsp21-276 crystals grew in a reservoir solution containing 20% PEG 8000 and 0.1 mol/L HEPES pH 7.0 at 4 ℃. Crystals of nsp2full-length could not be obtained despite many attempts. Crystals of nsp21-276 were further optimized by the hanging-drop vapor diffusion method, with the protein mixed with reservoir solution at different volume ratios.
After optimization of the precipitant concentration and pH, well-diffracting crystals were finally obtained in 18% PEG 8000, 0.1 mol/L HEPES pH 7.5 at a 2.5:2 ratio (v/v, protein/reservoir solution) at 4 ℃ (Supplementary Fig. S2A, S2B). The crystals were flash-cooled in liquid nitrogen in mother liquor containing 20% glycerol as a cryoprotectant. Data were collected on beamline BL17U1 at the Shanghai Synchrotron Radiation Facility (SSRF). Data were indexed, integrated, and scaled with autoPROC and XDS (Yu et al. 2019). The statistics are summarized in Table 1.
Table 1. Data collection and refinement statistics of SARS-CoV-2 nsp21-276 (PDB: 7EXM)

Parameters                      Value
Data collection
  Wavelength (Å)                0.9793
  Space group                   P21
  Cell dimensions
    a, b, c (Å)                 57.93, 159.6, 63.55
    α, β, γ (°)                 90, 91.2, 90
  Resolution (Å) [a]            50.0–1.96 (2.06–1.96)
  I/σ                           17.9 (2.7)
  Rmerge                        0.062 (0.673)
  Completeness (%)              99.8 (99.7)
  Total No. of reflections      555,159
  Unique reflections            82,763
  Redundancy                    6.7 (5.8)
Refinement
  Resolution (Å)                50.0–1.96
  No. of reflections            75,133
  Rwork/Rfree (%)               19.97/22.56
  No. of atoms
    Protein                     8406
    Ligand/ion                  18
    Water                       925
  B-factors (Å2)
    Protein                     34.26
    Ligand/ion                  39.12
    Water                       42.08
  r.m.s. deviations [b]
    Bond lengths (Å)            0.003
    Bond angles (°)             1.29
  Ramachandran plot (%) [c]     96.8/3.2/0.0

[a] Statistics for the highest-resolution shell are given in parentheses.
[b] Root mean square deviations.
[c] Residues in favored, allowed, and outlier regions of the Ramachandran plot.
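As a quick sanity check on Table 1, the sketch below recomputes two derived quantities from the reported values: the overall redundancy (total reflections divided by unique reflections) and the unit-cell volume of the monoclinic P21 cell, V = a·b·c·sin β. The function names are illustrative only and are not part of any program used in this work.

```python
import math

def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume (Å^3) of a monoclinic cell, where alpha = gamma = 90 degrees."""
    return a * b * c * math.sin(math.radians(beta_deg))

def overall_redundancy(total_reflections, unique_reflections):
    """Average number of observations per unique reflection."""
    return total_reflections / unique_reflections

# Values copied from Table 1.
volume = monoclinic_cell_volume(57.93, 159.6, 63.55, 91.2)
redundancy = overall_redundancy(555159, 82763)

print(f"Unit-cell volume: {volume:.0f} Å^3")
print(f"Overall redundancy: {redundancy:.1f}")
```

The recomputed redundancy rounds to the tabulated 6.7, and the cell volume comes out at roughly 5.9 × 10^5 Å3.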
The structure of nsp21-276 was solved by the single-wavelength anomalous diffraction (SAD) method. Analysis with SHELXC (Sheldrick 2008) showed strong anomalous signals in the data, indicating the presence of zinc atoms. The initial Zn sites were located with SHELXD, with a CCweak/CCall of 20.8/28.9 in space group P21. Twelve initial Zn sites were found and the phases were generated. A crude partial model with 19 β-sheets and 27 α-helices in 787 residues was built by SHELXE, and the figure of merit reached 0.626 (Sheldrick 2008). The initial model was further adjusted by manual model building in COOT (Emsley et al. 2010) and refined with REFMAC5 (Murshudov et al. 2011). The structure was finally refined to 1.96 Å resolution with an Rwork of 19.97% and an Rfree of 22.56%.
Small-angle X-ray scattering (SAXS) data were collected at beamline BL19U2 of the SSRF. Briefly, proteins were subjected to size exclusion chromatography in buffer (20 mmol/L HEPES pH 7.5, 150 mmol/L NaCl). Protein samples were tested at various concentrations, and the data were collected at a wavelength of 1.03 Å with 20 frames and at a distance of 1 m from the detector. Individual data sets were processed with the software RAW (Nielsen et al. 2009). The scattering of the buffer alone was measured before and after each sample, and the average of these two buffer measurements was used for background subtraction. The scattering data and the structure PDB file for fitting were submitted to the FoXS online server (http://modbase.compbio.ucsf.edu/foxs/) (Pelikan et al. 2009).
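The background subtraction described above can be sketched as follows. This is a minimal illustration that assumes all three curves are sampled on the same q-grid; it is not the implementation used by RAW, and the function name and the intensities in the example are made up.

```python
def subtract_buffer(sample, buffer_before, buffer_after):
    """Subtract the mean of two buffer curves from a sample curve, point by point.

    All three inputs are sequences of intensities on an identical q-grid
    (an assumption of this sketch).
    """
    if not (len(sample) == len(buffer_before) == len(buffer_after)):
        raise ValueError("all curves must have the same number of points")
    return [
        s - 0.5 * (b0 + b1)
        for s, b0, b1 in zip(sample, buffer_before, buffer_after)
    ]

# Hypothetical intensities, for illustration only.
sample_curve = [10.0, 8.0, 5.0]
buffer_pre = [1.0, 1.0, 1.0]
buffer_post = [1.2, 1.0, 0.8]

corrected = subtract_buffer(sample_curve, buffer_pre, buffer_post)
print(corrected)
```

Averaging the buffer measured before and after each sample compensates, to first order, for slow drift in the background over the course of the measurement.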
Single- or double-stranded DNAs (ssDNA or dsDNA) were used for EMSA. The ssDNA sequence is 5'-GATGTGATTTTAATAGCTTCTTAGGAGAATGACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA-3', which is similar to the 3'-UTR RNA sequence of SARS-CoV-2. Double-stranded DNAs were prepared by slowly annealing two oligonucleotides (the above ssDNA and its complementary DNA). In short, the mixture of the two complementary oligos was incubated at 95 ℃ for 5 min and then cooled at 2 ℃ per minute to 4 ℃. Different amounts of nsp2full-length and nsp21-276 were incubated with 5 μmol/L ssDNA or dsDNA in a 20 μL reaction volume for 2 h at 25 ℃, and the mixtures were then separated on 6% native polyacrylamide gels in 1 × TG buffer (45 mmol/L Tris pH 8.0, 45 mmol/L glycine pH 8.9) and 1 × TB buffer (45 mmol/L Tris pH 8.0, 45 mmol/L boric acid, pH 8.0), respectively, at 180 V for about 45 min. The molar ratios of nsp2full-length and nsp21-276 to nucleic acid used for incubation are shown in the figures. To investigate whether the zinc finger structures are involved in interacting with nucleic acids, the zinc ions of nsp21-276 were chelated with a gradient of EDTA concentrations (0–200 mmol/L). The nsp21-276 mutant 1 (K111A/K112A/K113A) was incubated with DNA at a 15:1 molar ratio to study the effect of residues K111, K112, and K113 on nucleic acid binding. The DNAs were visualized by staining with GelRed.
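For illustration, the complementary oligonucleotide needed for dsDNA annealing and the amount of protein required for a given protein:DNA molar ratio can be derived as in the sketch below. The helper names are hypothetical; the sequence, DNA concentration, reaction volume, and the 15:1 ratio are taken from the text above.

```python
# Watson-Crick pairing for an unmodified DNA alphabet.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def protein_pmol(dna_umol_per_l, volume_ul, molar_ratio):
    """Picomoles of protein for a given protein:DNA molar ratio.

    umol/L multiplied by uL gives pmol, so no extra conversion factor is needed.
    """
    return dna_umol_per_l * volume_ul * molar_ratio

ssdna = "GATGTGATTTTAATAGCTTCTTAGGAGAATGACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
complement = reverse_complement(ssdna)
print("5'-" + complement + "-3'")

# 5 umol/L DNA in a 20 uL reaction at a 15:1 protein:DNA ratio.
print(protein_pmol(5, 20, 15), "pmol protein")
```

With 5 μmol/L DNA in 20 μL, each reaction contains 100 pmol of DNA, so a 15:1 ratio corresponds to 1,500 pmol of protein.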