Chemical Characterization and Anti-HIV-1 Activity Assessment of Iridoids and Flavonols from Scrophularia trifoliata

Plants are an enduring source of a wide spectrum of specialized metabolites, characterized by wide variability in terms of chemical structures and by different biological properties, such as antiviral activity. In the search for novel antiviral agents against Human Immunodeficiency Virus type 1 (HIV-1) from plants, the phytochemical investigation of Scrophularia trifoliata L. led us to isolate and characterize four flavonol glycosides along with nine iridoid glycosides, two of which, 5 and 13, are described for the first time. In the present study, we investigated, for the first time, the contents of a methanol extract of S. trifoliata leaves in order to explore its potential antiviral activity against HIV-1. The antiviral activity was evaluated in biochemical assays for the inhibition of HIV-1 Reverse Transcriptase (RT)-associated Ribonuclease H (RNase H) activity and HIV-1 Integrase (IN). Three isolated flavonoids, rutin, kaempferol-7-O-rhamnosyl-3-O-glucopyranoside, and kaempferol-3-O-glucopyranoside (8–10), specifically inhibited HIV-1 IN activity at submicromolar concentrations, with the latter being the most potent, showing an IC50 value of 24 nM.


Introduction
Scrophularia trifoliata L., belonging to the family Scrophulariaceae, is an endemic plant of Sardinia, Corsica, and Gorgona Islands [1]. It is a herbaceous perennial plant, with a woody base, growing up to 1.5-2.0 m in height. It is found predominantly in fresh and shady places. The plant is characterized by opposite and lanceolate leaves and tetragonal, fistulous multistems. The irregular flowers have a green or red bilabiate corolla, which has the unique characteristic of two reddish or reddish-purple spots, surrounded by a broad black stripe [2]. These features act as nectar guides for animal and insect pollinators involved in the so-called "bird and mixed vertebrate-insect pollination system" (MVI); in addition, their peculiar colorations are indicative of a high anthocyanin content [3].
The plant is included in the large genus Scrophularia, which comprises about 60 species from which different classes of specialized metabolites have been isolated. Among them, iridoids and flavonoids have been shown to exert several biological activities [4,5].
For example, scropolioside B isolated from S. dentata "Ye-Xin-Ban" had an inhibitory effect against nuclear factor kappa-light-chain-enhancer of activated B cells [6]; harpagide from S. buergeriana showed a good protective effect against glutamate-induced oxidative

Anti-HIV Activity of Methanolic Crude Extract
With the aim of finding new agents that inhibit both HIV-1 RNase H and IN activities, we tested the crude extract of S. trifoliata, first for its ability to inhibit the RT-associated RNase H activity, for which no drug is currently available. Then, it was also assayed for its effects on HIV-1 IN in biochemical assays. Results were expressed as IC50 values against HIV-1 RNase H activity and IN LEDGF-dependent integration. The extract showed an evident inhibitory activity, with IC50 values of 9.90 ± 0.93 and 2.5 ± 0.4 µg/mL on HIV-1 RNase H activity and HIV-1 IN strand transfer activity, respectively.

2D-NMR Investigation of Crude Extract
In the attempt to characterize the metabolites potentially responsible for the highlighted anti-HIV activities, an extensive 2D-NMR study of the crude extract was carried out. The 1H-NMR spectrum of S. trifoliata (Figure 1) was dominated by resonances of iridoids, with a characteristic doublet at δH 5.04 as well as two doublets of doublets at δH 6.38 (dd, J = 1.2 and 5.7 Hz) and 5.08 (dd, J = 4.2 and 5.7 Hz) related to the dihydropyran ring. Several overlapped signals were also detected in the region of protons geminal to oxygen as well as in the upfield region of the 1H-NMR spectrum. The doublet of doublets at δH 6.38 (δC 141.4) showed long-range heterocorrelations in the CIGAR-HMBC with a methine carbon at δC 36.7 (δH 2.33), an olefinic carbon at δC 103.7 (δH 5.08), and an acetal carbon at δC 95.1 bonded to a proton resonating as a doublet at δH 5.04. This signal was, in turn, correlated with three methines at δC 141.4, δC 34.7, and, finally, an anomeric carbon at δC 99.3. The latter suggested the presence of a glycosidic moiety in this compound, identified as glucose based on spectroscopic evidence. The 13C-NMR values at 99.3, 77.4, 77.1, 74.4, 71.0, and 62.1 and the coupling constant value of 8.1 Hz are in good agreement with a β-glucopyranosyl moiety. The acetal proton at δH 5.04 showed correlations, in the HSQC-TOCSY experiment, with two carbinolic carbons at δC 95.1 and δC 87.4, and two aliphatic methines at δC 42.5 and 36.7. The latter, in the same experiment (Figure 2), showed correlations with all the protons belonging to the same spin system: 6.38 (H-3), 5.08 (H-4), 2.58 (H-9), 2.33 (H-5), and 3.74 (H-6). This latter proton, linked to the methine carbon at 87.4, also showed long-range correlations, in the CIGAR-HMBC, with a proton at 3.77, linked to the carbon at 58.0, in good accordance with the presence of an oxirane ring.
Thanks to the long-range correlations of both epoxide carbons (δC 58.0 and 66.0) with H-9 and with the methylene protons of a hydroxyethyl at 4.20 and 3.80, the oxirane ring was located at the C-7 and C-8 carbons of the iridoid skeleton.
All data were in agreement with those reported for catalpol (1, Figure 1), already reported as a constituent of S. trifoliata [17].
Moreover, signals belonging to other iridoid glycosides, present in low quantities, were evident, along with signals in the range of 6.0-8.0 ppm attributable to flavonoids and other metabolites. Nevertheless, the compounds could not be identified directly in the mixture; hence, the extract was subjected to a phytochemical study in order to identify the compounds responsible for the inhibition of HIV-1 RNase H and IN activities.

Phytochemical Study of Scrophularia trifoliata
The crude methanolic extract of S. trifoliata was purified on Amberlite XAD-4, eluting first with water and then with MeOH, in order to eliminate sugars and other water-soluble metabolites. The alcoholic eluate was then fractionated through RP-18 column chromatography, obtaining seven fractions, A-G (Figure 3). Fractions were preliminarily analyzed by thin-layer chromatography (TLC), eluting with the lower phase of a CHCl3/MeOH/H2O (13:7:2) solution. Fraction G showed a more complex TLC profile, and the NMR spectrum suggested the presence of flavonoids and iridoids.

The combination of different chromatographic procedures led to the isolation, from fractions A-F, of a new iridoid glycoside (5), along with six known compounds (1-4, 6-7).
The downfield shift of the C-6 carbon at δC 93.5 with respect to the same signal of cymdahoside A (δC 77.6) [22] suggested the linkage of a methoxy group to the C-6 carbon (Table 1). The heterocorrelation in the CIGAR-HMBC experiment between the methyl singlet at δH 3.46 and the carbon at δC 93.5 (C-6) confirmed this hypothesis. Similar differences were registered between methylcatalpol (4) and catalpol (1). Finally, the correlations observed between the anomeric proton at δH 4.65 (H-1′) and the carbon at δC 91.6 (C-1), and vice versa, in the CIGAR-HMBC spectra allowed us to link a β-glucopyranosyl unit to the C-1 of the iridoid structure. The coupling constant of the anomeric proton (7.8 Hz) suggested a β-configuration for the anomeric carbon.
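The reasoning that links an anomeric coupling constant to the configuration of the sugar can be illustrated with a minimal sketch. It assumes commonly quoted textbook windows for glucopyranosides (a large trans-diaxial coupling of roughly 7-9 Hz for β, a small coupling of roughly 2-4.5 Hz for α); the function name and the numeric thresholds are illustrative assumptions, not values taken from this study.

```python
def anomeric_configuration(j_hz: float) -> str:
    """Classify a glucopyranosyl anomeric centre from J(H-1', H-2') in Hz.

    The numeric windows below are illustrative assumptions (typical
    textbook ranges), not thresholds reported in the paper.
    """
    if 7.0 <= j_hz <= 9.0:
        return "beta"       # large coupling: H-1' and H-2' trans-diaxial
    if 2.0 <= j_hz <= 4.5:
        return "alpha"      # small coupling: equatorial-axial arrangement
    return "ambiguous"      # outside both windows: do not assign


# Coupling constants quoted in the text for the glucose units of 1 and 5.
for label, j in [("catalpol (1)", 8.1), ("compound 5", 7.8)]:
    print(f"{label}: J = {j} Hz -> {anomeric_configuration(j)}")
```

Both values quoted in the text fall inside the assumed β window, which is consistent with the assignment made by the authors; a value between the two windows would be left unassigned rather than forced into either class.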
The stereochemistry of the ring fusion was confirmed by the coupling constant J = 10.2 Hz between the coupled protons H-9 (δH 2.45) and H-5 (δH 2.77). The NOESY experiment (Supplementary Materials, Figure S2) enabled the relative configuration at the chiral carbons to be defined. In fact, the NOE observed between the β-oriented H-9 proton and the H-5 and H-7 protons agreed with a β-orientation of both the C-7 and C-5 methines, corresponding to an R configuration for the C-5, C-7, and C-9 carbons.
Thus, compound 5 (Figure 5) was characterized for the first time and named trifoliatoside A.

Anti-HIV Activity of Pure Compounds and Fraction G
Pure metabolites 1-7 and fraction G were evaluated for their anti-HIV properties in enzymatic assays. Among the isolated metabolites, only compounds 5 and 7 were able to inhibit IN selectively, even though at high concentrations (Table 2), while the enriched fraction was active on both RNase H and IN activities, with IC50 values of 26.6 ± 1.3 µg/mL and 6.1 ± 0.92 µg/mL, respectively. Even though these concentrations were higher than those found for the alcoholic extract, these results prompted a phytochemical study of the enriched fraction, aimed at identifying the pure compounds responsible for this effect.

However, the presence in the aromatic region of the 1H-NMR spectrum of additional signals, along with HSQC data, suggested the presence of a cinnamoyl moiety. The coupling constant value of 12.6 Hz for olefinic protons H-7 (δC 142.4) and H-8 (δC 122.3) is in good agreement with a Z-geometry for the double bond of the cinnamoyl moiety linked to C-8 of the iridoid. In fact, in the HMBC experiment, the protons of the methyl group (δH 1.58) showed heterocorrelation with C-8 (δC 122.3), C-9 (δC 49.4), and C-8 (δC 90.3). Furthermore, the downfield shift of the C-8 carbon, which bears the cinnamoyl unit, is in good agreement with the presence of an acyl moiety.
Thus, the structure of compound 13 was elucidated and the compound was named trifoliatoside B. Laterioside (12) [31] was also isolated from the leaves of S. trifoliata.

Anti-HIV Activity of Pure Compounds Isolated from Fraction G
Pure compounds exhibited selective inhibition of IN-LEDGF-dependent activity (Table 4). In particular, the most active were compounds 11 and 9, which showed IC50 values of 0.11 and 0.024 μM, respectively. In addition, 8, 10, and the new compound 13 inhibited HIV-1 IN with IC50 values ranging between 0.33 and 5.96 μM. In contrast, compound 12 was found to be inactive on both enzyme activities.

It was interesting to note that compounds 9, 10, and 11, even though they share the same aglycone moiety, exhibited different activities. In fact, the best activity was reported for 11, which has a lower degree of glycosylation than the other flavonoids tested, which bear two sugar units. In addition, we can hypothesize that this compound has two different modes of action against HIV-1, based on the fact that its anti-HIV-1 protease activity has already been reported in the literature [32]. This is the first report of HIV-1 IN-LEDGF-dependent inhibition by compound 13.

Discussion
Scrophularia trifoliata L. is an endemic plant of Sardinia, which is known to be used in the traditional medical practices of the island for the treatment of different diseases, such as skin disorders and rheumatism [16]. Despite these traditional uses, to the best of our knowledge, only one phytochemical study on S. trifoliata has been reported to date [17]. That study, which analyzed the monoterpenoid fractions, led to the isolation of two C9 iridoid glycosides, catalpol and aucubin, which were also detected in the present investigation. Aucubin, bartioside, and harpagoside from S. scorodonia leaves [33], 6-O-methylcatalpol and harpagide from S. ningpoensis roots [34], and 8-O-acetylharpagide and scropolioside B from S. saharae [35] are just examples of iridoids that have been isolated over the years from the small percentage (4%) of Scrophularia species investigated [4]. Recently, Venditti et al. (2015) [36] also reported some glycosidic iridoids from S. canina growing in the Calabria region (Italy). Thus, C9 iridoids in glycosidic and non-glycosidic forms could be considered chemotaxonomic markers of the Scrophulariaceae family [4,37]. In this way, the present study furnishes important information in terms of the phytochemical composition of another endemic plant of Sardinia (Italy), highlighting the presence of iridoid glycosides also in S. trifoliata. Furthermore, this study contributes to increasing the percentage of investigated Scrophulariaceae species [4] relative to the total that remain uninvestigated (only 17 of the approx. 350 species). Besides iridoids, flavonoids and phenylethanoid glycosides have also been isolated from different species belonging to the Scrophularia genus [4].
All these compounds, amply studied for their antioxidant, antibacterial, and anti-inflammatory activities, have not been sufficiently investigated for antiviral properties [33,38].
Furthermore, in this study, we tested S. trifoliata pure compounds on two HIV-1 viral enzyme functions, trying to identify potential anti-HIV therapeutic agents: the HIV-1 Reverse Transcriptase (RT)-associated Ribonuclease H (RNase H) activity, a promising target for which no drug is currently available in therapy [39-41], and the HIV-1 integrase (IN) activity in the presence of the LEDGF/p75 cellular cofactor [42], a protein that is able to bind IN and promote its catalytic activities. In fact, HIV-1 RNase H and IN are virally encoded enzymes that belong to the nucleotidyl transferase superfamily and, therefore, share structural homologies; in this context, we discovered N'-acylhydrazones that inhibit one or both of the HIV-1 RT-associated RNase H and IN enzyme activities [43].
Today, the search for anti-HIV-1 agents is looking to plant-derived compounds, which can be an essential natural source of antiretrovirals with low toxicity and a wide spectrum of actions. Among them, flavonoids were reported for their ability to inhibit HIV-1 IN [39]. In addition, kaempferol from Securigera securidaca [38] and baicalin from Scutellaria baicalensis [44] inhibited the HIV-1 RT with IC50 values of 50 µg/mL and 0.2 µg/mL, respectively; taxifolin from Juglans mandshurica inhibited the enzymes protease and RT [45]. By contrast, among iridoids, only the glycoside 2′-O-(4-methoxycinnamoyl) mussaenosidic acid, isolated from Avicennia marina, has been reported for its ability to prevent viral infection, acting on the co-receptors CCR5 and CXCR4 [38]. In this study, a bio-guided phytochemical approach led us to the purification of iridoid glycosides and flavonoid compounds from the methanol extract of S. trifoliata leaves, some of which showed potential anti-HIV activity. In fact, the newly identified iridoid glycoside cis-laterioside and all isolated flavonoids exhibited a significant inhibition of integrase. In the future, we plan to complete the study of the main active compounds with molecular docking analysis, providing some information about their interaction mechanism, and to develop proper pharmaceutical formulations for medical applications.

Plant Material
Leaves of S. trifoliata were collected at the flowering stage (April 2016) at the site of Seui (Sardinia, Italy, 39°49′52.3″ N, 9°20′31″ E; 693 m a.s.l.). The plant was identified and a voucher specimen (Herbarium CAG 1011) was deposited at the General Herbarium of the Department of Life and Environmental Sciences, University of Cagliari (Cagliari, Italy). S. trifoliata, even if endemic, is not protected by local or international regulations; therefore, no specific permission was required for its collection. The plant raw materials were dried in a ventilated stove at 40 °C to constant weight, powdered with liquid nitrogen, and stored at −20 °C until the next analysis.

Preparation of Crude Extract for Bioassay
The powder obtained from the leaves of S. trifoliata (427 g) was extracted with MeOH (3 × 1 L, 8 h each), filtered, and evaporated to obtain crude extract (22 g) that was submitted for biological evaluation.
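The masses reported for the extractions can be turned into a weight-by-weight yield with a one-line calculation. The sketch below is only an illustration of that arithmetic; the function name is hypothetical, and the second pair of values is taken from the hydroalcoholic extraction described later in the Methods (40 g of leaves giving 16.6 g of extract).

```python
def extraction_yield_percent(extract_g: float, plant_material_g: float) -> float:
    """Return the extract mass as a percentage (w/w) of the starting plant mass."""
    if plant_material_g <= 0:
        raise ValueError("plant material mass must be positive")
    return 100.0 * extract_g / plant_material_g


# Masses quoted in the text: MeOH extract for bioassay, H2O/MeOH extract for purification.
methanol_yield = extraction_yield_percent(22.0, 427.0)
hydroalcoholic_yield = extraction_yield_percent(16.6, 40.0)

print(f"MeOH extract: {methanol_yield:.1f}% w/w")
print(f"H2O/MeOH extract: {hydroalcoholic_yield:.1f}% w/w")
```

Expressing both extractions on the same w/w basis makes them directly comparable even though the starting masses differ by an order of magnitude.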

Extraction Procedure for NMR Analysis
Powdered air-dried leaf material of S. trifoliata (600 mg) underwent ultrasound-assisted extraction (Branson 3800 MH, Milan, Italy) with a H2O/MeOH (1:1) solution (18 mL) for 40 min. Subsequently, the mixture was centrifuged (Beckman Coulter Allegra™ 64R centrifuge; rotor F1202, r = 3.5 cm) at 13,000 rpm for 10 min. After filtration and solvent removal, a crude extract was obtained and stored at −20 °C until analysis. To obtain fractions enriched in specialized metabolites, potentially responsible for the biological activities, the crude extract was filtered through a Sep-Pak C18 classic short body cartridge (Waters, Milford, MA, USA), previously conditioned with MeOH (10 mL), followed by H2O (10 mL). The resulting residue was dissolved in 1.5 mL of phosphate buffer in D2O and methanol-d4 (1:1). An aliquot of 0.6 mL was transferred to an NMR tube and analyzed by NMR [46].
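Because the centrifugation conditions are given as rotor speed and radius, the corresponding relative centrifugal force (RCF) can be estimated from first principles as ω²r/g. The sketch below is a minimal illustration; the function name is hypothetical, and it assumes that the quoted r = 3.5 cm is the effective radius at which the force is evaluated.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2, standard acceleration of gravity


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (multiples of g) for a given speed and radius."""
    omega = 2.0 * math.pi * rpm / 60.0        # angular velocity in rad/s
    radius_m = radius_cm / 100.0              # convert cm to m
    return omega ** 2 * radius_m / STANDARD_GRAVITY


# Speeds quoted in the Methods, both with r = 3.5 cm.
for rpm in (13_000, 4_800):
    print(f"{rpm} rpm at r = 3.5 cm -> about {rcf_from_rpm(rpm, 3.5):.0f} x g")
```

Reporting the force in multiples of g rather than rpm makes the step reproducible on a rotor with a different radius.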

NMR Experiments
NMR spectra were recorded at 25 °C, at 300.03 MHz for 1H and 75.45 MHz for 13C, on an AVANCE II 300 MHz Fourier transform NMR spectrometer in CD3OD or CD3OD:phosphate buffer solutions. Chemical shifts are reported in δ (ppm) and referenced to the residual solvent signal; coupling constants (J) are given in Hz. 1H-NMR spectra were acquired over a spectral window from 14 to −2 ppm, with 1.0 s relaxation delay, 1.70 s acquisition time (AQ), and 90° pulse width = 13.8 µs. The initial matrix was zero-filled to 64 K. 13C-NMR spectra were recorded in 1H broadband decoupling mode, over a spectral window from 235 to −15 ppm, with 1.5 s relaxation delay, 90° pulse width = 9.50 µs, and AQ = 0.9 s. The number of scans for both 1H- and 13C-NMR experiments was chosen depending on the concentration of the samples. With regard to the homonuclear and heteronuclear 2D-NMR experiments, the data points, number of scans, and increments were adjusted according to the sample concentrations. Correlation spectroscopy (COSY) and double quantum filtered COSY (DQF-COSY) spectra were recorded with a gradient-enhanced sequence at spectral widths of 3000 Hz in both f2 and f1 domains; the relaxation delays were 1.0 s. The total correlation spectroscopy (TOCSY) experiments were performed in the phase-sensitive mode with a mixing time of 90 ms. The spectral width was 3000 Hz. Nuclear Overhauser effect spectroscopy (NOESY) experiments were performed in the phase-sensitive mode. The mixing time was 500 ms, and the spectral width was 3000 Hz. For all the homonuclear experiments, the initial matrix of 512 × 512 data points was zero-filled to give a final matrix of 1 k × 1 k points. Proton-detected heteronuclear correlations were also measured. Heteronuclear single-quantum coherence (HSQC) experiments (optimized for 1J(H,C) = 140 Hz) were performed in the phase-sensitive mode with field gradient. The spectral width was 12,000 Hz in f1 (13C) and 3000 Hz in f2 (1H), with a 1.0 s relaxation delay; the matrix of 1 k × 1 k data points was zero-filled to give a final matrix of 2 k × 2 k points. Heteronuclear 2 bond correlation (H2BC) spectra were obtained with T = 30.0 ms and a relaxation delay of 1.0 s; the third-order low-pass filter was set for 130 < 1J(C,H) < 165 Hz. A heteronuclear multiple bond coherence (HMBC) experiment (optimized for nJ(H,C) = 8 Hz) was performed in the absolute value mode with field gradient; typically, 1H-13C gHMBC spectra were acquired with a spectral width of 18,000 Hz in f1 (13C) and 3000 Hz in f2 (1H) and a 1.0 s relaxation delay; the matrix of 1 k × 1 k data points was zero-filled to give a final matrix of 4 k × 4 k points. Constant time inverse-detected gradient accordion rescaled heteronuclear multiple bond correlation spectroscopy (CIGAR-HMBC) spectra (8 > nJ(H,C) > 5) were acquired with the same spectral width used for HMBC. Heteronuclear single quantum coherence-total correlation spectroscopy (HSQC-TOCSY) experiments were optimized for nJ(H,C) = 8 Hz, with a mixing time of 90 ms.
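The acquisition parameters mix spectral windows expressed in ppm (1D spectra) with widths expressed in Hz (2D spectra). The two are related through the observation frequency of each nucleus, since 1 ppm corresponds to a number of Hz equal to the frequency in MHz. The sketch below illustrates the conversion with the frequencies quoted in the text; the function and dictionary names are hypothetical.

```python
# Observation frequencies quoted in the Methods, in MHz.
FREQ_MHZ = {"1H": 300.03, "13C": 75.45}


def ppm_to_hz(width_ppm: float, nucleus: str) -> float:
    """Convert a spectral width from ppm to Hz for the given nucleus."""
    return width_ppm * FREQ_MHZ[nucleus]


def hz_to_ppm(width_hz: float, nucleus: str) -> float:
    """Convert a spectral width from Hz to ppm for the given nucleus."""
    return width_hz / FREQ_MHZ[nucleus]


# 1D windows: 14 to -2 ppm (1H) and 235 to -15 ppm (13C).
print(f"1H 1D window:  {ppm_to_hz(14 - (-2), '1H'):.1f} Hz")
print(f"13C 1D window: {ppm_to_hz(235 - (-15), '13C'):.1f} Hz")

# 2D widths given in Hz, expressed in ppm for comparison.
print(f"3000 Hz in 1H:    {hz_to_ppm(3000, '1H'):.2f} ppm")
print(f"12,000 Hz in 13C: {hz_to_ppm(12000, '13C'):.1f} ppm")
print(f"18,000 Hz in 13C: {hz_to_ppm(18000, '13C'):.1f} ppm")
```

Seen this way, the 3000 Hz proton dimension of the 2D experiments covers about 10 ppm, narrower than the 16 ppm window used for the 1D proton spectra.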

MALDI-TOF MS Analyses
Mass spectrometry analyses of pure compounds were performed with a matrix assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometer equipped with a pulsed nitrogen laser (λ = 337 nm). Prior to the acquisition of spectra, 1 µL of sample solution (100 pmol/µL) was mixed with 1 µL of saturated α-cyano-4-hydroxycinnamic acid matrix solution (10 mg/mL in acetonitrile/trifluoroacetic acid 0.1%, 1:1, v:v) and a droplet of the resulting mixture (1 µL) placed on the mass spectrometer's sample target. The droplet was dried at room temperature. Once the liquid was completely evaporated, the sample was loaded into the mass spectrometer and analyzed in positive reflectron mode. The instrument was externally calibrated using a 50 fmol/µL tryptic alcohol dehydrogenase digest. The instrument source voltage was set at 12 kV.
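The amount of analyte actually deposited on the target follows from the dilution with the matrix solution and the volume spotted. The sketch below works through that arithmetic with the volumes and concentration given above; the function name is hypothetical, and it assumes that the sample and matrix volumes are simply additive on mixing.

```python
def on_target_amount_pmol(sample_conc_pmol_per_ul: float,
                          sample_ul: float,
                          matrix_ul: float,
                          spotted_ul: float) -> float:
    """Analyte amount (pmol) in the droplet placed on the MALDI target."""
    # Concentration after mixing sample and matrix (volumes assumed additive).
    mixed_conc = sample_conc_pmol_per_ul * sample_ul / (sample_ul + matrix_ul)
    return mixed_conc * spotted_ul


# 1 uL of 100 pmol/uL sample + 1 uL of matrix, 1 uL of the mixture spotted.
amount = on_target_amount_pmol(100.0, 1.0, 1.0, 1.0)
print(f"Analyte on target: {amount:.0f} pmol")
```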

Hydroalcoholic Extraction of S. trifoliata Leaves and Compounds Purification
Dried leaf material of S. trifoliata (40 g) was powdered and extracted by ultrasound-assisted extraction (Branson 3800 MH) for three cycles of 40 min each with a H2O/MeOH (1:1) solution (1.2 L). Subsequently, after centrifugation at 4800 rpm (Beckman GS-15R centrifuge; rotor S418; r = 3.5 cm) for 10 min at 22 °C, the extract was filtered on Whatman paper and concentrated under vacuum, furnishing a dried crude extract (16.6 g).
The dried crude extract, dissolved in H2O, was purified on an Amberlite XAD-4 column, first with water in order to eliminate sugars and other water-soluble compounds. Subsequent methanol elution furnished 1.2 g of residual material, which was purified on RP-18 CC, eluting with solutions of decreasing polarity (CH3
HIV-1 RT group M subtype B. Heterodimeric RT was expressed and purified essentially as described previously [47]. Briefly, the protein was expressed in E. coli strain M15 containing the p6HRT-prot vector, induced with 1.7 mM isopropyl β-D-1-thiogalactopyranoside for 4 h. Protein purification was carried out with a BioLogic LP system (Biorad), using a combination of immobilized metal affinity and ion exchange chromatography. First, the crude bacterial extract was clarified by centrifugation and loaded onto a Ni2+-NTA-Sepharose column pre-equilibrated with a loading buffer (50 mM sodium phosphate buffer pH 7.8, containing 0.3 M NaCl, 10% glycerol, and 10 mM imidazole); RT was eluted with an imidazole gradient in wash buffer (0-0.5 M). Fractions were collected, and protein purity was checked by SDS-PAGE and found to be higher than 90%. The 1:1 ratio between the p66/p51 subunits was also verified. Enzyme-containing fractions were pooled and diluted 1:1 with 50 mM sodium phosphate buffer pH 7.0, containing 10% glycerol, and then loaded onto a HiTrap Heparin HP column (GE Healthcare Life Sciences) using a loading buffer (50 mM sodium phosphate buffer pH 7.0, containing 10% glycerol and 150 mM NaCl). RT was eluted with Elute Buffer 2 (50 mM sodium phosphate pH 7.0, 10% glycerol, 1 M NaCl). Fractions were collected, and the protein was dialyzed and stored in a buffer containing 50 mM Tris-HCl pH 7.0, 25 mM NaCl, 1 mM EDTA, and 50% glycerol. Catalytic activities and protein concentrations were determined. Enzyme-containing fractions were pooled and aliquots were stored at −80 °C.
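The 1:1 dilution before the heparin column can be checked with a simple mixing calculation. The sketch below assumes that the pooled fractions still carry the 0.3 M NaCl of the affinity-column buffer, that "1:1" means equal volumes, and that the dilution buffer contains no NaCl, as listed; the function name is hypothetical.

```python
def mix_concentration(c1: float, v1: float, c2: float, v2: float) -> float:
    """Concentration after mixing two solutions (volumes assumed additive)."""
    return (c1 * v1 + c2 * v2) / (v1 + v2)


# Pooled fractions (assumed 0.3 M NaCl) diluted with an equal volume of NaCl-free buffer.
nacl_molar = mix_concentration(0.3, 1.0, 0.0, 1.0)
print(f"NaCl after 1:1 dilution: {nacl_molar * 1000:.0f} mM")
```

Under these assumptions, the diluted sample reaches the same NaCl concentration as the heparin loading buffer (150 mM), which is presumably the purpose of the dilution step.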

Expression and Purification of Recombinant HIV-1 IN and LEDGF
Recombinant 6xHis tagged IN protein was expressed and purified as described previously [48]. Briefly, IN was expressed in Escherichia coli strain BL21 (DE3). Initial