Model

Selection of enzymatic candidates

Materials and methods

Search for orthologous Enzymes


The search for orthologous enzymes was conducted using BLAST (Basic Local Alignment Search Tool) on the NCBI server (https://blast.ncbi.nlm.nih.gov/Blast.cgi). BLAST is a widely used local alignment tool that compares nucleotide or protein sequences against those stored in databases. The tool calculates the level of similarity between sequences and provides an output with assigned scores. Users can modify several search parameters, including database selection, taxonomic restrictions, and algorithm parameters like the E-value and substitution matrix, through the NCBI graphical interface.

In our analyses, we kept the default algorithm parameters and applied taxonomic restrictions only when necessary. For the BLASTp runs, we used taxid identifiers to narrow the search and identify orthologous proteins within specific taxa: Pseudomonas (taxid:286), E. coli (taxid:562), and Synechocystis sp. PCC 6803 (taxid:1148).

Docking Simulations


Docking simulations were employed to predict how various PFAS molecules might bind to the active sites of dehalogenases and laccases, our target enzymes. We utilised AutoDock Vina (https://vina.scripps.edu/) within UCSF Chimera (https://www.rbvi.ucsf.edu/chimerax/) and CB-Dock2 (https://cadd.labshare.cn/cb-dock2/index.php) for these simulations. Although these tools use different scoring algorithms, their results are comparable, allowing for robust predictions of enzyme-ligand interactions.

The primary aim of these docking simulations was to evaluate the specificity and affinity of enzyme-ligand interactions. By calculating the binding free energy (ΔG), the software ranks potential binding orientations from most to least stable. This approach enables us to predict the specificity of enzyme-ligand interactions, infer potential catalytic and binding sites, and identify amino acid residues involved in the ligand-protein interface. However, it's important to note that docking simulations do not confirm binding; they merely suggest how it might occur, assuming the target and ligand do indeed interact.

Functional Annotation with InterProScan


To further analyse and annotate the protein sequences, we used InterProScan (https://www.ebi.ac.uk/interpro/), a tool that integrates data from various databases to identify features such as domains, protein families, and conserved sites. This comprehensive approach provides a deeper understanding of the functional aspects of the enzymes under study.

Dehalogenases research

For the selection of dehalogenases, we decided to focus on one of the five dehalogenases from Delftia acidovorans. As our first candidate, we selected the DeHa2 enzyme, which was previously isolated by the USAFA iGEM 2020 team. Among the five dehalogenases isolated from D. acidovorans, only DeHa2 and DeHa4 showed actual defluorination capability (1). Before implementing DeHa2, we further investigated its structure and ability to interact with PFAS through docking simulations. DeHa2 (WP_011137954.1) achieved the highest score, which is consistent with previous evidence suggesting that DeHa2 might be the most efficient among the dehalogenases from D. acidovorans.

Lists of fluoroacetate dehalogenases from D. acidovorans



For our second dehalogenase, we sought orthologs with potential defluorinating activity. Using protein BLAST (BLASTp), we searched the Synechocystis proteome for alignments with the five D. acidovorans dehalogenases. Results with a minimum identity of 25% and a minimum coverage of 70% were considered valid. No significant similarities were found for DeHa1, DeHa2, DeHa3, or DeHa5; only DeHa4 yielded significant results, leading us to identify two promising orthologs: WP_010872272.1 and WP_010872152.1. Additional docking simulations on all orthologs identified so far indicated the DeHa4 ortholog in Synechocystis (WP_010872272.1 / UPI00000C10BF) as the best hit. As D. acidovorans DeHa4 is considered to be functional, we expect that its orthologs could also achieve defluorination.
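The identity and coverage thresholds described above can be sketched as a small filter over tabular BLAST hits. This is only an illustration: the searches were run through the NCBI web interface, and the four-column layout assumed here (query, subject, percent identity, percent query coverage), the function names and the placeholder identifiers are all hypothetical.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str       # query identifier, e.g. one of the D. acidovorans dehalogenases
    subject: str     # subject accession in the target proteome
    identity: float  # percent identity
    coverage: float  # percent query coverage


def parse_hits(tabular: str) -> list[Hit]:
    """Parse tab-separated lines: query, subject, %identity, %coverage (assumed layout)."""
    hits = []
    for line in tabular.strip().splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        query, subject, identity, coverage = line.split("\t")[:4]
        hits.append(Hit(query, subject, float(identity), float(coverage)))
    return hits


def filter_hits(hits, min_identity=25.0, min_coverage=70.0):
    """Keep hits meeting both thresholds, best identity first."""
    kept = [h for h in hits
            if h.identity >= min_identity and h.coverage >= min_coverage]
    return sorted(kept, key=lambda h: (h.identity, h.coverage), reverse=True)


if __name__ == "__main__":
    # Placeholder rows, not real BLAST results.
    example = "queryA\tsubject1\t27.3\t95\nqueryA\tsubject2\t31.0\t40\n"
    for hit in filter_hits(parse_hits(example)):
        print(hit)
```

With the default arguments, only rows with at least 25% identity and at least 70% coverage are retained, mirroring the validity criteria stated in the text.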


Further research was conducted on enzymes found in supplementary materials from research by the University of Padua (2). Their work identified some alpha/beta fold hydrolase enzymes in Synechocystis with potential PFAS resistance or degradation abilities. In particular, there were 19 enzyme candidates with a similarity of 25-39% to Burkholderia sp. FAc-DEX and another 53 candidates with a similarity of 22-40% (2). A multiFASTA file with all 72 enzymes was created and used as query sequences for BLAST alignment against the five dehalogenases from D. acidovorans, those from E. coli, and the FAc-DEX from Burkholderia. For D. acidovorans, DeHa4 was again the only one that produced significant results. This encouraging finding suggests that the orthologous enzymes might share similar defluorination capabilities with DeHa4, consistent with the experimental results from the UniPadua research team. We then filtered the results, retaining only those from Synechocystis PCC6803, which resulted in 11 candidates. Of these, the two best hits were WP_010872272.1 and WP_010872160.1.

The overall top candidate, UPI00000C10BF, an alpha/beta hydrolase fold enzyme from Synechocystis PCC 6803, emerged as a promising enzyme for PFAS interaction, with a coverage of 95% and an identity of 27.34%. A 3D model of this enzyme was built through homology modeling, and docking simulations with various PFAS molecules gave encouraging results. The simulations revealed a specific binding pocket within the enzyme, suggesting a potential catalytic site where PFAS could be degraded.

Figure - Synechocystis dehalogenase 3D model


Docking scores showed that PFAS binding strength to the enzyme’s cavity increased with the carbon chain length. The more negative the Vina score, the stronger the PFAS-cavity binding. For instance, the Vina score for PFOA (8 carbons) was -8.3, compared to -6.3 for PFBA (4 carbons) and -3.6 for fluoroacetic acid (2 carbons). While these results suggest strong binding interactions, it is crucial to keep in mind that binding affinity does not directly equate to enzymatic activity: these binding predictions must be validated experimentally to confirm that the enzyme is active on PFAS.
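To give a rough feeling for what these score differences mean, the sketch below converts a score into an estimated dissociation constant through the thermodynamic relation ΔG = RT ln(Kd). It assumes that the Vina score can be read as a binding free energy in kcal/mol and that the temperature is 298.15 K; both are assumptions made for illustration, and docking scores are too approximate for the resulting values to be taken as measured affinities.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def estimated_kd(delta_g_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Return Kd in mol/L from delta_G = R * T * ln(Kd)."""
    return math.exp(delta_g_kcal_per_mol / (R_KCAL * temperature_k))


if __name__ == "__main__":
    # Scores quoted in the text for the Synechocystis dehalogenase model.
    scores = {"PFOA": -8.3, "PFBA": -6.3, "FA": -3.6}
    for ligand, score in scores.items():
        print(f"{ligand}: score {score} -> estimated Kd {estimated_kd(score):.1e} M")
```

Because the relation is exponential, a difference of a few score units corresponds to orders of magnitude in the estimated Kd, which is why the trend with chain length stands out.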

gif - Synechocystis Dehalogenase 3D structure with binding pocket for PFAS highlighted. Hydrophobic residues in green, hydrophilic residues in red.



Table The table shows the docking results for the selected enzymes: DeHaS, an ortholog of D. acidovorans DeHa4; DeHa2 from D. acidovorans; and EIQ7173218, an E. coli dehalogenase orthologous to D. acidovorans DeHa2, which was one of the many results from BLAST analyses. The lengths of the carbon chains are indicated in brackets. FA - Fluoroacetic Acid; TFA - Trifluoroacetic Acid; PFPA - Perfluoropropionic Acid; PFBA - Perfluorobutanoic Acid; PFBS - Perfluorobutane Sulfonate; PFHxA - Perfluorohexanoic Acid; PFHxS - Perfluorohexane Sulfonate; PFOA - Perfluorooctanoic Acid; PFOS - Perfluorooctane Sulfonate.

These bioinformatic findings indicate strong compatibility between PFAS compounds and the dehalogenase enzyme, but binding alone is not proof of the desired defluorination activity. Further experimental research is required to confirm the degradation activity of the enzymes when cloned in E. coli.

Laccases research

Exploring potential laccase candidates presented more challenges than dehalogenases, largely due to the limited available data and the complex nature of these enzymes. Laccases are still largely uncharacterised, exhibiting significant sequence and structural divergence, which made our initial BLAST searches across various bacterial families particularly difficult. The sequences often displayed considerable variation, including unexpected gaps and additional segments in typically conserved regions, making clustering impossible.

We decided to start again with a well-characterised E. coli laccase (P36649), previously implemented in iGEM projects for various bioremediation purposes, and we tested it for PFAS bioremediation as well.
The starting point for the laccase research was ECOL (https://parts.igem.org/Part:BBa_K863006), an E. coli laccase (EKK0468176.1). However, we discovered that this sequence was incomplete, leading us to find the full sequence (P36649). After investigating a wide range of different bacterial laccases, we were able to identify three domains that are characteristic of the family: CuRO 1, 2 and 3. These are crucial for the enzyme’s functionality. The complete ECOL sequence contains all three domains. This enzyme became our first laccase candidate, given its extensive characterization and broad implementation.

To identify a second candidate, we turned to Synechocystis. Our starting point for Synechocystis was a laccase sequence identified as WP_194014464.1, described as a multicopper oxidase domain-containing protein (2). This laccase shares 24% similarity with the Pleurotus ostreatus laccase POX1. In previous studies, Pleurotus ostreatus has been observed to degrade PFOA effectively, and it may also be applicable to the degradation of other PFAS compounds (3). Using BLASTp, we searched for orthologs in Synechocystis, filtering for sequences with more than 50% coverage. This search yielded four results.

Figure Laccase orthologs in Synechocystis sp.

In parallel, we conducted another BLAST search using the previously mentioned E. coli laccase while using the taxid identifier for Synechocystis. This search returned eight results, many overlapping with the previous BLAST search.

Figure Laccases in Synechocystis sp. orthologous to E. coli laccase

To broaden our search, we also used laccases from Pycnoporus sp. and Pleurotus ostreatus, both suggested for PFAS fragmentation (4), as queries in BLASTp against the Synechocystis proteome. The Pleurotus ostreatus laccases (lac1 Q12729 and lac2 Q12739) both returned two similar results, while the Pycnoporus sp. laccase (O59896) yielded five results, including those from the previous searches.


Figure On top, the results from lac1; on bottom, the results from lac2

Figure Results from Pycnoporus sp.

After merging these results, we selected a Synechocystis laccase with a high similarity score to the E. coli, Pycnoporus sp., and Pleurotus ostreatus laccases, identified as MEB3229074.1. This laccase stood out as the best candidate due to its high scores across all BLAST searches.

Next, we used InterProScan to verify the presence of the three copper-binding domains (CuRO 1, 2, and 3) characteristic of laccases. While the E. coli laccase (P36649) contained all three domains, the Synechocystis laccase (MEB3229074.1) was missing the CuRO_2 domain.

We then proceeded to study the 3D structures of both ECOL and Synechocystis laccase.

Figure – 3D model of E. coli laccase

Figure – 3D model of Synechocystis laccase

To better understand which cavities of the proteins could bind different types of PFAS and assess their binding potential, we performed a series of blind docking simulations using CB-Dock2. These analyses yielded different results than those previously performed for the dehalogenases, as the different types of PFAS did not all preferentially bind to the same cavity.

Table The table shows the docking results for E. coli laccase (P36649) with different PFAS molecules, which were obtained using CB-Dock2. The top 3 scores for each cavity are highlighted in yellow, with the highest score in bold. The PDB model chosen for the simulations is 5B7E and was selected after research in UniProt.
FA - Fluoroacetic Acid; TFA - Trifluoroacetic Acid; PFPA - Perfluoropropionic Acid; PFBA - Perfluorobutanoic Acid; PFBS - Perfluorobutane Sulfonate; PFHxA - Perfluorohexanoic Acid; PFHxS - Perfluorohexane Sulfonate; PFOA - Perfluorooctanoic Acid; PFOS - Perfluorooctane Sulfonate.

This result shows that E. coli laccase can bind PFAS, but with lower specificity for the active site than the dehalogenases.
In addition, docking scores showed that PFAS binding strength to the enzyme’s cavities generally increased with carbon chain length.
To better understand which cavity hosts the catalytic activity, we also performed blind docking simulations using molecules that are known to be oxidised by laccases (5). We tested the affinity of the proteins for ABTS and syringaldazine, and both showed high affinity for cavity 2.

Docking simulations with E. coli laccase (P36649)

The same analysis was also conducted for the Synechocystis laccase, and the following results were obtained.

Table The table shows the docking results for Synechocystis’ laccase with different PFAS molecules, which were obtained using CB-Dock2. The top 3 scores for each cavity are highlighted in yellow, with the highest score in bold.

Docking simulations with Synechocystis laccase

The results are comparable to those obtained with E. coli laccase (P36649), indicating that Synechocystis's laccase does not possess a single cavity capable of binding specifically to PFAS molecules. Instead, it exhibits a series of binding sites.

Codon optimization

Before expression, the enzymes need to undergo codon optimisation. This process is necessary because different organisms prefer certain codons over others to encode the same amino acid, which can affect the efficiency of protein translation.

First, we obtained the nucleotide sequences of our four enzymes by searching genomic databases. For each codon in the sequence, we compared the codon usage of the source organism to that of E. coli, the host organism. Where necessary, we replaced codons that are rare in E. coli (frequency < 10%) with more common synonymous ones.

table - E. coli codon usage

figure - example of codon optimisation in UPI00000C10BF.

Next, we checked the optimised sequences for the presence of restriction enzyme sites (PstI - CTGCAG, SpeI - ACTAGT, EcoRI - GAATTC, XbaI - TCTAGA, BsaI - GGTCTC) and made additional codon substitutions to remove these sites when present.

Finally, we added a His tag (CATCACCATCACCATCAC) at the end of each sequence and inserted restriction enzyme sites at the sequence extremities: NdeI (CATATG) at the 5’ end and BamHI (GGATCC) at the 3’ end. In this way, the four enzymes were ready to be ordered and inserted in E. coli.
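The three steps of this section (codon substitution, restriction site check, addition of tag and flanking sites) can be sketched as follows. The restriction sites, His tag and flanking sites are the ones listed in the text; everything else is an assumption made for illustration: the small substitution table is a toy example rather than the codon usage table actually used, the terminal stop codon handling and the reuse of the ATG inside the NdeI site are design choices that the text does not describe, and all function names are hypothetical.

```python
# Toy table: codon assumed rare in E. coli -> synonymous, more common codon.
# Illustrative only; not the usage table used in the project.
RARE_TO_COMMON = {
    "AGG": "CGT", "AGA": "CGT", "CGA": "CGT",  # Arg
    "CTA": "CTG",                              # Leu
    "ATA": "ATT",                              # Ile
    "CCC": "CCG",                              # Pro
}
# Sites to avoid inside the coding sequence (from the text).
SITES = {"PstI": "CTGCAG", "SpeI": "ACTAGT", "EcoRI": "GAATTC",
         "XbaI": "TCTAGA", "BsaI": "GGTCTC"}
HIS_TAG = "CATCACCATCACCATCAC"
NDEI, BAMHI = "CATATG", "GGATCC"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def swap_rare_codons(cds: str) -> str:
    """Replace each codon listed in RARE_TO_COMMON, reading in frame."""
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return "".join(RARE_TO_COMMON.get(codon, codon) for codon in codons)


def find_sites(seq: str) -> list[str]:
    """Names of the listed enzymes whose site occurs on either strand."""
    return [name for name, site in SITES.items()
            if site in seq or reverse_complement(site) in seq]


def build_insert(cds: str, stop: str = "TAA") -> str:
    """Optimise codons, append His tag and stop, add NdeI/BamHI flanks."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if cds[-3:] in STOP_CODONS:
        cds = cds[:-3]  # assumption: the tag is placed before the stop codon
    tagged = swap_rare_codons(cds) + HIS_TAG + stop
    # Assumption: the ATG contained in the NdeI site serves as start codon.
    body = tagged[3:] if tagged.startswith("ATG") else tagged
    return NDEI + body + BAMHI


if __name__ == "__main__":
    construct = build_insert("ATGAGGCTAATATAA")  # made-up mini sequence
    print(construct, find_sites(construct))
```

In this sketch `find_sites` only reports the sites; removing them still requires choosing an alternative synonymous codon at the affected position, as described above.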

Growth Tests for E. coli

To characterize the survival of E. coli when exposed to substrates that would be present in the planned bioreactor (i.e. PFAS pollutants and filter desorption buffers), we performed several growth tests on chassis cultures grown in 96-well plates.
Kinetic measurements of the OD at 600 nm were taken with a plate reader (Tecan Spark multimode microplate reader), sampling every 5 minutes for 14 hours overnight. Plates were shaken for 15 seconds before each measurement.
The kinetic protocol adopted had these settings:


This was the kinetic cycle set up:

  • Shaking
  • Absorbance
  • Wait
At the end of the roughly 14-hour measurement, the plate reader software outputs a matrix with all the OD measurements taken for each of the 96 wells.
For an initial analysis of this data, we used Excel to plot the overnight measurements. These graphs allowed a visual evaluation of the growth curves to verify the absence of spikes and outliers, and to confirm that the negative controls were not contaminated.

Then, we calculated the average of all the negative control wells with the same concentration and subtracted it from the OD values of each colony (at each concentration tested). In this way, we obtained a matrix of absorbance values representing bacterial growth in the medium.
This matrix of values was later entered into the command “ODs: []” in the Matlab code.
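The blank subtraction described above can be sketched as follows, in Python rather than in the Excel and MATLAB workflow actually used. The function name and data layout are hypothetical, and the sketch assumes that the negative controls are averaged separately at each time point, which the text does not state explicitly.

```python
from statistics import mean


def subtract_blank(sample_curves, blank_curves):
    """Subtract the mean negative-control curve from every sample curve.

    sample_curves: dict mapping a well name to its list of OD readings.
    blank_curves: list of OD reading lists, one per negative-control well
                  at the same concentration as the samples.
    """
    # Average the blanks time point by time point (assumption).
    blank_mean = [mean(values) for values in zip(*blank_curves)]
    return {
        well: [od - blank for od, blank in zip(curve, blank_mean)]
        for well, curve in sample_curves.items()
    }


if __name__ == "__main__":
    # Made-up readings for two time points.
    blanks = [[0.10, 0.10], [0.12, 0.14]]
    samples = {"A1": [0.21, 0.52], "A2": [0.20, 0.48]}
    print(subtract_blank(samples, blanks))
```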

Matlab code:


The Matlab code used for our data analysis calculates the growth rate for each colony at each condition using this algorithm:

  1. Identify data in the exponential phase: 0.05 < OD600 < 0.2
  2. Calculate log(OD600) for these points
  3. Fit the linear regression log(OD) = m·t + q
    Slope m = growth rate (GR)
    The regression is plotted in a graph with time in hours on the x axis and log(OD) on the y axis
  4. GR per minute = GR/60

The program considers OD values between 0.05 and 0.2 and calculates the growth rates for each of the tested conditions. The choice of those numbers as extremes of the range was made in order to consider what should be the exponential phase of the growth curve.
After computing, the output for each condition consists of the growth rate per minute and a graph containing the linear regression line and the log(OD) points. The closer the log(OD) points are to the regression line, the closer R^2 is to 1.
In this code, cyan was assigned to fitting lines with R^2 less than 0.90 and blue to fitting lines with R^2 between 0.90 and 1.
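The algorithm above can be sketched in a few lines. This is a Python illustration, not the team's MATLAB code: the function and key names are hypothetical, and the logarithm is assumed to be the natural logarithm.

```python
import math


def fit_growth_rate(times_h, ods, od_min=0.05, od_max=0.2):
    """Least-squares fit of ln(OD) = m*t + q on points with od_min < OD < od_max."""
    points = [(t, math.log(od)) for t, od in zip(times_h, ods)
              if od_min < od < od_max]
    if len(points) < 2:
        raise ValueError("need at least two points in the exponential window")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = sxy / sxx                      # growth rate per hour
    intercept = mean_y - slope * mean_t
    ss_res = sum((y - (slope * t + intercept)) ** 2 for t, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {
        "gr_per_hour": slope,
        "gr_per_min": slope / 60.0,
        "intercept": intercept,
        "r_squared": r_squared,
        # Same colour rule as in the text: blue for R^2 >= 0.90, cyan otherwise.
        "colour": "blue" if r_squared >= 0.90 else "cyan",
    }


if __name__ == "__main__":
    # Synthetic curve sampled every 5 minutes for 14 hours (made-up data).
    times = [i * 5 / 60 for i in range(169)]
    ods = [0.01 * math.exp(0.6 * t) for t in times]
    print(fit_growth_rate(times, ods))
```

Restricting the fit to the 0.05 to 0.2 window keeps only the points where ln(OD) is expected to grow linearly with time, so the slope can be read directly as the growth rate.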

In addition to an analysis of the exponential region of the growth curves, we decided to do an evaluation of the final OD values. This was done in the following way:
  1. We considered five OD measurements around the fourteenth hour
  2. We calculated the average of these measurements for each sample
  3. For each condition, we calculated the average between the 3 values calculated in the previous step (corresponding to the three colonies)
  4. For each of those values we calculated the standard error


This process allowed a comparison of the growth conditions while keeping time constant.
Overall, this allowed us to analyze the growth curves both in the exponential region and in the final region, where growth should have reached a plateau.
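The final-OD evaluation can be sketched as below. The function names are hypothetical, and two details are assumptions: the five measurements are taken as the five readings closest in time to the fourteenth hour, and the standard error is computed across the colony averages of each condition.

```python
import math
from statistics import mean, stdev


def final_od(times_h, ods, target_h=14.0, n_points=5):
    """Mean of the n_points OD readings closest in time to target_h."""
    nearest = sorted(range(len(times_h)),
                     key=lambda i: abs(times_h[i] - target_h))[:n_points]
    return mean(ods[i] for i in nearest)


def summarise_condition(colony_finals):
    """Return (mean, standard error) across the colonies of one condition."""
    n = len(colony_finals)
    average = mean(colony_finals)
    std_error = stdev(colony_finals) / math.sqrt(n) if n > 1 else float("nan")
    return average, std_error


if __name__ == "__main__":
    # Made-up final OD values for three colonies of one condition.
    print(summarise_condition([0.80, 0.90, 1.00]))
```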

Here is a graphical representation of our work process:

Filtering

To determine which concentrations of PFAS E. coli would encounter inside the bioreactor, we assumed a column filter that had reached saturation and had groundwater as its inlet.
We focused on a resin filter because in our design the filter will be a resin, not granular activated carbon (GAC).
From our studies we selected Purolite A860 as a filter model.
In particular, from [1] we know that the resin can be described using the Freundlich isotherm:
$$q_e = k_f \, C_e^{1/n}$$
where qe represents the mass of PFAS adsorbed per filter mass, kf is the Freundlich constant, n is an empirically determined parameter, and Ce is the equilibrium concentration.
kf and n are taken from [1] and [2]; the equilibrium concentration is an average of the data taken from [3].
In particular, the kf reported in the paper is expressed in μeq/(meq·(μeq/L)^(1/n)) (an equivalent is the quantity of a substance that produces one mole of elementary charge upon dissociation).
We therefore had to convert it into a universal unit of measurement for our calculations.


To do so we used the charge density of each PFAS:


As for the resin, the paper reports that 221 mg corresponds to 0.8 meq.

We assumed a saturated filter so that the equilibrium concentration can be considered equal to the PFAS concentration of an aquifer.
We then calculated qe using the Freundlich isotherm.
Based on studies on the regeneration of AER (anion exchange resin) [2], the volume of the desorption solution that we will use is equal to 10 times the volume occupied by the resin.
Therefore, the final concentration Cf is the mass of PFAS in the solution after regeneration (mPFAS·r, where mPFAS = qe·b is the mass of PFAS adsorbed on a resin mass b and r is the regeneration yield) divided by the volume of the solution, V = 10·b/dr, where dr is the density of the resin.
So:
$$C_f = \frac{m_{PFAS}\, r}{V} = \frac{q_e\, b\, r}{10\, b / d_r} = \frac{q_e\, r\, d_r}{10}$$
Therefore, the absolute mass of PFAS adsorbed and the mass of filter used do not affect the result: the relevant parameters are the quantity of PFAS adsorbed per mass of filter and the density of the resin.
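The calculation chain (Freundlich isotherm, then concentration in the regeneration solution) can be sketched numerically. The function names and all numbers are hypothetical placeholders, not the values taken from the cited papers, and the sketch assumes that kf has already been converted to mass-based units, so that qe is in mg of PFAS per g of resin, the resin density in g/L and Cf in mg/L.

```python
def freundlich_qe(kf: float, ce: float, n: float) -> float:
    """Adsorbed amount per resin mass: qe = kf * Ce**(1/n)."""
    return kf * ce ** (1.0 / n)


def regenerant_concentration(qe, regeneration_yield, resin_density,
                             volume_factor=10.0, resin_mass=1.0):
    """Concentration of PFAS in the regeneration solution.

    The desorption volume is volume_factor times the resin volume
    (resin_mass / resin_density), so resin_mass cancels out.
    """
    m_pfas = qe * resin_mass * regeneration_yield       # mass desorbed
    volume = volume_factor * resin_mass / resin_density  # solution volume
    return m_pfas / volume


if __name__ == "__main__":
    # Placeholder parameters, for illustration only.
    qe = freundlich_qe(kf=2.0, ce=4.0, n=2.0)
    for mass in (1.0, 50.0):
        cf = regenerant_concentration(qe, 0.5, 700.0, resin_mass=mass)
        print(f"resin mass {mass} g -> Cf = {cf:.1f} mg/L")
```

Running the loop with different resin masses gives the same Cf, which is the numerical counterpart of the cancellation of b in the derivation.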

The following table summarizes the results obtained:


We thus obtained estimates of the maximum concentration values that the resin filtration process could reach. It should be noted that we started from hypotheses that will never be achieved in a real case scenario: the filter will be regenerated before reaching saturation in order to avoid the loss of filtration efficiency.
We also want to highlight that our calculations are based on regeneration yields obtained through batch experiments, while we are more interested in the yields obtained through continuous regeneration.
However, we were interested in having an estimate of these concentrations to know if E. coli could survive in these conditions through growth tests (for further information regarding growth test results visit Growth tests-results).

Sensor

The electrode-solution interface is the contact region between two different phases, such as metal and electrolyte solution, and plays a crucial role in the operation of electrochemical sensors. This interface presents a discontinuity in physicochemical properties, which causes variations in charge distribution and the generation of an electric field.

In an electrochemical transducer, the interface is formed by a metal, which contains free electrons as mobile charges, and an electrolyte solution, where the mobile charges are ions. The portion of the solution closest to the discontinuity, called the interphase, presents anisotropic and heterogeneous conditions, generating an electric field. This field causes charge transfer across the interface, triggering a redox process. The free electrons in the metal can cross the interface to reduce ions in solution, while during oxidation, the electrons leave the solution and return to the metal. The reduction and oxidation currents balance at the standard redox potential characteristic of each electrolyte.

To get a more quantitative idea of the data obtained during our measurements (although it remains a highly preliminary result), we used MATLAB to fit the impedimetric data to an equivalent electrical circuit composed as follows:


In the circuit, we can identify two distinct elements: Rs, which represents the resistive contribution of the electrochemical cell, and the CPE (constant phase element), which represents its capacitive contribution. The fitting returns three parameters: Rs, Q0, and n.

This last value allows us to characterize the non-linearity of the real response and is used together with Q0 to calculate the CPE using the formula:


It's easy to notice that in the case of n=1, our element would exhibit ideal capacitive behavior.
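For reference, the standard expression for the impedance of a constant phase element is Z_CPE = 1 / (Q0·(jω)^n), where ω is the angular frequency. The sketch below evaluates it and the total impedance of the equivalent circuit, assuming that Rs and the CPE are connected in series (the circuit drawing is not reproduced in this text, so the topology is an assumption). It is written in Python with hypothetical function names and is not the MATLAB fitting code used for the measurements.

```python
import cmath
import math


def z_cpe(omega: float, q0: float, n: float) -> complex:
    """Impedance of a constant phase element: 1 / (Q0 * (j*omega)**n)."""
    return 1.0 / (q0 * (1j * omega) ** n)


def z_model(omega: float, rs: float, q0: float, n: float) -> complex:
    """Total impedance of Rs in series with the CPE (assumed topology)."""
    return rs + z_cpe(omega, q0, n)


if __name__ == "__main__":
    # Placeholder parameters, not fitted values.
    rs, q0, n = 100.0, 1e-6, 0.9
    for freq_hz in (1.0, 100.0, 10_000.0):
        z = z_model(2 * math.pi * freq_hz, rs, q0, n)
        print(f"{freq_hz:>8.0f} Hz  |Z| = {abs(z):.1f} ohm  "
              f"phase = {math.degrees(cmath.phase(z)):.1f} deg")
```

For n = 1 the CPE term reduces to 1/(jωQ0), the impedance of an ideal capacitor of capacitance Q0; for n < 1 its phase stays constant at -n·90 degrees.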

The double-layer capacitance (Cdl) is not returned directly by the fitting process, but it can be derived from the CPE through the equation:
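The equation used in the original page is not reproduced in this text. As a point of reference only, for a circuit made of Rs in series with a CPE, one widely used conversion is Brug's relation; whether this is the expression that was actually applied here is an assumption.

$$C_{dl} = Q_0^{1/n} \, R_s^{(1-n)/n}$$

With n = 1 this reduces to C_dl = Q_0, consistent with the ideal capacitive behavior mentioned above.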



From the fitting of the measurements with PFOA, we obtained the values shown in the following table:


In this picture we can see how the fitting is performed, with the response from the equivalent circuit (*) that tries to match the impedance modulus data obtained from our measurements.


To better appreciate the Cdl variations we can look at this graph:


As we can see from the graph, we were able to achieve a satisfactory decrease in Cdl, which we can confidently attribute to the interaction between the functionalization and the PFOA molecules.

We will now analyze the subsequent measurements carried out using a new set of electrodes and the compound PFOS:




As visible in the Cdl graph, the variation in this case was not as consistent or noticeable as for PFOA, so we conclude that our sensor does not interact with PFOS molecules.

The last compound that we tested was PFBA:




This measurement does not show as strong a drop in the Cdl value as the PFOA measurement did, but a clear trend is noticeable.

We believe that, although the variation in this case is less evident, there is still a good chance that our sensor is working correctly. It is highly likely that, since PFOA and PFBA share the same functional group, they can similarly interact with the surface. Additionally, since PFBA has a shorter carbon chain, it creates less obstruction at the interface, which results in a smaller variation in double-layer capacitance.

References

  1. Steel JJ. iGEM. 2021 [cited 2024 Jun 16]. USAFA iGEM 2021 - Results! Available from: https://2021.igem.org/Team:USAFA/Results
  2. Marchetto F, Roverso M, Righetti D, Bogialli S, Filippini F, Bergantino E, et al. Bioremediation of Per- and Poly-Fluoroalkyl Substances (PFAS) by Synechocystis sp. PCC 6803: A Chassis for a Synthetic Biology Approach. Life. 2021;11(12).
  3. Luo Q, Lu J, Zhang H, Wang Z, Feng M, Chiang SYD, et al. Laccase-Catalyzed Degradation of Perfluorooctanoic Acid. Environ Sci Technol Lett. 2015 Jul 14;2(7):198–203.
  4. Scott C, Hu M. Toward the development of a molecular toolkit for the microbial remediation of per-and polyfluoroalkyl substances. Appl Environ Microbiol. 2024 Mar 13;90(4):e00157-24.
  5. Martin E, Dubessay P, Record E, Audonnet F, Michaud P. Recent advances in laccase activity assays: A crucial challenge for applications on complex substrates. Enzyme Microb Technol. 2024 Feb;173:110373.
  6. Dixit F, Barbeau B, Mostafavi SG, Mohseni M. PFOA and PFOS removal by ion exchange for water reuse and drinking applications: role of organic matter characteristics. The Royal Society of Chemistry; 2019.
  7. Dixit F, Barbeau B, Mostafavi SG, Mohseni M. Removal of legacy PFAS and other fluorotelomers: Optimized regeneration strategies in DOM-rich waters. Water Res. 2020;183.
  8. Monitoraggio delle sostanze perfluoroalchiliche nella rete di sorveglianza delle acque sotterranee. Available from: https://www.researchgate.net/publication/320041381_Monitoraggio_delle_sostanze_perfluoroalchiliche_nella_rete_di_sorveglianza_delle_acque_sotterranee
