Model | Manchester

Overview

The overarching aim of this project was to engineer recombinant skin-specific nanowires capable of transmitting electrical signals from the residual limb of a myoelectric prosthetics user to the prosthetic’s receiving electrode, even when movement occurs. Our design focuses on addressing the issue of motion artifacts, which arise when the electrode shifts from its affixed contact point — the spot where it usually detects signals from muscle contractions in the residual limb [1]. This disconnection is a primary cause of lagging limb movement. These nanowires are created by engineering a skin-specific binding tag onto the carboxy-terminal end of a pilin monomer from Geobacter sulfurreducens, an electroactive bacterium whose pili can conduct electricity (e-pili) [2]. The dry lab experiments not only aimed to inform the team on the safety of the wet lab procedures but also sought to uncover new information about the conductivity of other potential electrically conductive pili from similar proteins.

Background: Phylogenetic tree of organisms with type IV pili

During our human practices meetings, stakeholders raised concerns about patients and clinicians using a biological product made of type IV pili (see Human Practices). In Geobacter, type IV pili are involved only in direct electron transfer, but in highly virulent bacteria such as Pseudomonas aeruginosa, Vibrio cholerae and Neisseria meningitidis, which are responsible for a large number of hospital infections, they also mediate cell motility and pathogenicity[3-5]. For example, the type IV pili of P. aeruginosa act as surface adhesins that mechanochemically regulate virulence factors by triggering a signal cascade, resulting in the expression of hundreds of genes associated with pathogenicity[3]. Further concerns were raised within the team, as members of the wet lab could be at risk depending on the choice of chassis organism. Two chassis organisms have been shown experimentally to produce e-pili[6,7]. The established protocol for producing e-pili in E. coli relies on the type IV pilus machinery from enterohaemorrhagic Escherichia coli (EHEC), a highly pathogenic bacterium[6]. The alternative would be to use Vibrio natriegens as the expression host, as was done by iGEM Tongji China in 2023[7]. V. natriegens was dismissed because of the growing antibiotic resistance of its strains[8]. Furthermore, all species of Vibrio have recently been classed as Human Pathogen Group 2 by the UK Health and Safety Executive, meaning it would not be possible to work with V. natriegens within our lab[9]. E. coli therefore remained the only candidate chassis, but replicating the existing protocol would not be feasible without a thorough safety investigation[6]. In particular, any potential for these e-pili to acquire pathogenic traits from the EHEC-derived genes would need to be assessed.

It was therefore proposed to construct a phylogenetic tree to infer the genetic distance between pili from Geobacter and related species, and related pili from pathogenic bacteria. The analysis had two aims: to determine the level of risk that e-pili pose when used on humans, even when composed of protein monomers derived from pathogenic bacteria, and to investigate the differences in size and structure between the respective pilin proteins. This dry lab study complemented the wet lab efforts to avoid using the EHEC machinery entirely, as part of the team’s Safety work, and sought to provide reassurance to clinicians, stakeholders and future iGEM teams. The results also began to reveal patterns that may lead to the discovery of novel e-pili, with the potential to widen the range of synthetic e-pili that can be produced.

Before the phylogenetic tree was generated, the homemade Python script below was used to remove redundant proteins from the dataset.


from Levenshtein import ratio

# Settings to edit before running. The values below are examples only.
# Path to the FASTA "seqdump" file downloaded from BLAST, for example
# r'C:\Users\yourname\Documents\iGEM\DryLab\yourseqdumpfilename'.
# Remove spaces from file and folder names, or replace them with "_".
INPUT_PATH = r'yourseqdumpfilename'
# Destination and name of the new, filtered file.
OUTPUT_PATH = r'yourfilteredfilename'
# Lower and upper limits of the amino acid sequence length (both exclusive).
MIN_LENGTH = 50
MAX_LENGTH = 100
# Sequences more similar than this to an already kept sequence are discarded.
SIMILARITY_CUTOFF = 0.9
# Header keywords that mark sequences not traceable to a specific strain.
EXCLUDED_KEYWORDS = ("uncultured", "sp.", "unclassified", "MULTISPECIES")

# Read the FASTA file into a list of [header, sequence] pairs.
Processedseq = []
Header = None
Seqbuilder = []
with open(INPUT_PATH, "r") as fasta:
    for line in fasta:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if Header is not None:
                Processedseq.append([Header, ''.join(Seqbuilder)])
            Header = line
            Seqbuilder = []
        elif Header is not None:
            Seqbuilder.append(line)
if Header is not None:
    Processedseq.append([Header, ''.join(Seqbuilder)])

# Filter by sequence length, residue codes and header keywords.
Finalseqs = []     # every [header, sequence] pair that passes the filters
Finalmsaid = []    # headers of the non-redundant sequences that are kept
Finalmsaseqs = []  # the matching non-redundant sequences
Count = 0
for Header, Seq in Processedseq:
    if not (MIN_LENGTH < len(Seq) < MAX_LENGTH):
        continue
    if "U" in Seq or "X" in Seq:
        continue
    if "WP" in Header and not any(word in Header for word in EXCLUDED_KEYWORDS):
        Finalseqs.append([Header, Seq])
        # Seed the kept list when the very first sequence to pass the length
        # and residue filters also passes the header filter.
        if Count == 0:
            Finalmsaid.append(Header)
            Finalmsaseqs.append(Seq)
    Count = Count + 1

# Keep a sequence only if it is not too similar to any sequence already kept.
for Header, Seq in Finalseqs:
    Redundant = any(ratio(Seq, Kept) > SIMILARITY_CUTOFF for Kept in Finalmsaseqs)
    if not Redundant:
        # Compare the last 20 characters of the header (typically the
        # organism name) so that the same organism is not added twice.
        Filtername = Header[-20:]
        Alreadykept = any(Filtername in Keptid for Keptid in Finalmsaid)
        if not Alreadykept and "prepilin" not in Header:
            Finalmsaid.append(Header)
            Finalmsaseqs.append(Seq)

print(Finalmsaid)
print(len(Finalmsaid))

# Write the kept sequences as FASTA, removing spaces and square brackets
# from the headers.
with open(OUTPUT_PATH, "w") as f:
    for Header, Seq in zip(Finalmsaid, Finalmsaseqs):
        Strready = Header.replace(" ", "_").replace("[", "").replace("]", "_")
        f.write(Strready + '\n')
        f.write(Seq + '\n')


Methodology

More than 15,000 putative type IV pilin sequences exist in the NCBI database, so the protein sequence data needed for phylogenetic tree construction were gathered using two reference sequences. The conductive type IV pilin protein from Geobacter sulfurreducens, PilA, was used as a BLAST query to collate a list of PilA proteins with close homology to this sequence[10]. The same was then done for the type IV pilin protein from the pathogenic species Pseudomonas aeruginosa, to obtain a list of PilA homologues that may share its pathogenic traits, such as surface adhesion[3]. These two organisms were selected because their PilA monomers make up two of the most well-studied type IV pili, each with a specific function[11, 12]. This method yielded a dataset of more than 10,000 PilA protein sequences, of which thousands were either redundant or highly similar to each other. The dataset was therefore filtered and curated to fewer than 1,000 non-redundant PilA sequences to generate a more accurate phylogenetic tree[13].

Before tree construction, a homemade Python script was used to remove redundant proteins from the dataset. The script first filtered the PilA sequences by their NCBI sequence ID, removing any sequence whose name contained the keywords “uncultured”, “sp.”, “unclassified” or “multispecies”. This ensured that every sequence could be traced back to a specific strain and eliminated uncertain metagenomic sequences. Next, the script removed any sequences containing the “Z” and “X” amino acid letters, given that these are used to represent undetermined amino acids in a protein sequence. The dataset was then filtered by sequence similarity: any sequence that was more than 90% similar to a sequence already retained, as measured by the Levenshtein distance, was discarded[13]. Finally, the dataset was filtered by sequence length to remove truncated sequences. G. sulfurreducens PilA homologues were required to be between 50 and 100 amino acids long, and P. aeruginosa PilA homologues between 140 and 170 amino acids long. The script produced a curated dataset of 494 proteins, which were then aligned in a multiple sequence alignment (MSA) using MAFFT[14]. The resulting MSA was used to construct a phylogenetic tree of PilA proteins with IQ-Tree[15] on the CIPRES gateway server[16], using the model LG + C20 + I + F + G and 1000 bootstrap replicates.
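
The similarity filter described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the edit distance is rescaled by the length of the longer sequence, which is one common normalisation and is not necessarily the value returned by the `ratio` function of the Levenshtein package, and the toy sequences are hypothetical rather than real pilins.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions and substitutions cost 1."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Edit distance rescaled to the range 0..1 (1.0 means identical)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def drop_redundant(sequences, cutoff=0.9):
    """Greedy filter: keep a sequence only if it is no more than `cutoff`
    similar to every sequence that has already been kept."""
    kept = []
    for seq in sequences:
        if all(similarity(seq, other) <= cutoff for other in kept):
            kept.append(seq)
    return kept


# Hypothetical toy sequences (not real pilins).
toy = ["MKTAYIAKQRQISFVKSHFS", "MKTAYIAKQRQISFVKSHFT", "GGGGGGGGGGWWWWWWWWWW"]
print(drop_redundant(toy, cutoff=0.9))
```

Because the filter is greedy, the result depends on the order of the input: the first member of a group of similar sequences is the one that survives. Lowering `cutoff` makes the filter stricter and removes more sequences.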

Results

The tree created from this dataset showed a low log-likelihood (LogL of about -95000) and low bootstrap values (< 30 on average), an indication of an inaccurate tree topology[16] (Figure 1a). This was most likely caused by overly lenient filtering of homologous sequences in the script. A large phylogenetic tree without strong bootstrap support is difficult to analyse, and because its branch topologies are unreliable, it cannot show whether the pilin from G. sulfurreducens is closely or distantly related to the pilins from pathogenic bacteria[16]. Within this tree, the PilA proteins from G. sulfurreducens and Klebsiella pneumoniae, a pathogenic species, appeared closely related, which was surprising given that most other pathogenic bacteria showed a much more distant phylogenetic relationship to G. sulfurreducens (Figure 2). Found naturally in the intestine, K. pneumoniae is ordinarily harmless, but can become a severe health risk if it spreads anywhere else in or on the human body[17]. The pili of K. pneumoniae mediate pathogenicity outside the intestines by promoting bacterial adhesion to epithelial and immune cells[18]. It was therefore important to determine whether these two pilins are truly closely related, to ensure that pili from G. sulfurreducens can safely be used on the human body. Starting from their common ancestral node, the bootstrap values of the successive separation events were 53 then 49 for G. sulfurreducens, and 39, 70 then 91 for K. pneumoniae, so the degree of phylogenetic separation implied by this topology is highly uncertain[16].

Because the tree had a low LogL and low bootstrap values, it was unclear whether these two sequences were truly close phylogenetically. To test this topology, the original dataset was filtered with more stringent parameters to reduce sequence redundancy and improve the accuracy of the branch topology. The script was applied to the original dataset with a 60% similarity cut-off instead of the 90% used previously. Because this method could remove strains of interest indiscriminately, seven important PilA sequences from the literature were manually added back into the dataset. These were from Geobacter metallireducens, Geotalea uraniireducens, Neisseria meningitidis, Neisseria gonorrhoeae, Myxococcus xanthus, Salmonella enterica subsp. enterica serovar Typhimurium, and Escherichia coli O157:H7 (EHEC)[19-22]. This approach reduced the dataset to 156 sequences, and the resulting tree had a LogL of -36776, roughly 2.5-fold smaller in magnitude than before, along with substantially higher bootstrap values (> 60 on average) (Figure 1b). In this tree, the evolutionary distance between G. sulfurreducens and K. pneumoniae is much greater than the tree built from the larger dataset implied: it spans more than 10 separation events, with bootstrap values averaging over 60. This second topology also placed K. pneumoniae PilA much closer to the other pathogenic pili in the tree. Overall, the second tree (Figure 1b) had better LogL and bootstrap values, indicative of a more accurate phylogenetic topology. After rooting, this tree displayed a clear, high-confidence phylogenetic separation of shorter pilins, such as those from G. sulfurreducens and G. metallireducens, from longer pilins, such as those from P. aeruginosa. The bootstrap value at the root is 99, with internal bootstrap values above 60. This tree was therefore used in the further analyses.

Image displays two phylogenetic trees. The first attempt has nodes overly clustered together, and is hard to interpret, which is highlighted by how three proteins of interest are annotated (using stars) very close together. The second attempt with less data is much easier to read and annotate, with the proteins of interest being easier to pinpoint.

Figure 1. Complete phylogenetic trees based on sequence divergence for pilin proteins similar to those of G. sulfurreducens and P. aeruginosa, generated using BLAST data. The trees were inferred with IQ-Tree on CIPRES and visualised using FigTree. Cyan and purple denote proteins with an amino acid length (l) of 50-100 residues and 140-170 residues respectively. Bootstrap values are based on 1000 replicates and are expressed as a percentage confidence in the displayed topology. Proteins of interest are indicated with stars. 1a) Phylogenetic tree of the complete dataset, excluding redundant proteins. Bootstrap values in this tree were mostly poor (< 30), and it showed a close relationship between G. sulfurreducens and K. pneumoniae. 1b) Phylogenetic tree of the trimmed dataset. The separation is significantly greater, and the bootstrap values (> 50) and LogL are much better than those of Figure 1a, indicating a more accurate tree.

Data Analysis

Using the improved tree (Figure 1b), the phylogenetic relationship between the pilin of G. sulfurreducens and the pilins of other bacteria, pathogenic and non-pathogenic, could be determined more easily. To explore whether there were distinct patterns in the phylogenetic separation of pili from pathogenic bacteria compared with G. sulfurreducens, each sequence in the dataset was labelled as either pathogenic or non-pathogenic (Figure 2). This made it possible not only to visualise the data, but also to address the key question of whether the pili of G. sulfurreducens could mediate pathogenicity if expressed using a pathogenic chassis, genes, or machinery. This was done by pinpointing where the pathogenic bacteria sat on the tree in relation to each other and to G. sulfurreducens. On average, G. sulfurreducens was around 20 or more evolutionary events away from any pilin from a pathogenic bacterium.
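
One way to make "evolutionary events away" concrete is to count the internal nodes on the path that joins two tips of a rooted tree. The sketch below assumes that definition, which may differ from how the events were counted on Figure 2, and uses a small hypothetical tree stored as a child-to-parent mapping; the tip and node names are invented for illustration.

```python
def path_to_root(parent, tip):
    """Return the nodes visited when walking from a tip up to the root."""
    path = [tip]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def nodes_between(parent, tip_a, tip_b):
    """Count the internal nodes on the path joining two tips."""
    if tip_a == tip_b:
        return 0
    path_a = path_to_root(parent, tip_a)
    path_b = path_to_root(parent, tip_b)
    position_in_b = {node: steps for steps, node in enumerate(path_b)}
    for steps_a, node in enumerate(path_a):
        if node in position_in_b:  # first shared node is the common ancestor
            # The common ancestor lies on both half-paths, so count it once.
            return steps_a + position_in_b[node] - 1
    raise ValueError("the two tips are not in the same tree")


def mean_distance(parent, focal_tip, other_tips):
    """Average node count between one tip and a group of other tips."""
    counts = [nodes_between(parent, focal_tip, tip) for tip in other_tips]
    return sum(counts) / len(counts)


# Hypothetical tree: ((tip_A, tip_B)n3, tip_C)n2, tip_D)n1, with n1 as root.
toy_parent = {
    "tip_A": "n3", "tip_B": "n3",
    "n3": "n2", "tip_C": "n2",
    "n2": "n1", "tip_D": "n1",
}
print(nodes_between(toy_parent, "tip_A", "tip_D"))
print(mean_distance(toy_parent, "tip_A", ["tip_C", "tip_D"]))
```

In practice the mapping would be built from the Newick file exported by the tree-building software, and the group of "other tips" would be the sequences labelled as pathogenic.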

It was also observed that all of the pili from pathogenic bacteria are phylogenetically distant from G. sulfurreducens, and that each lies in close phylogenetic proximity to at least one other pilus from a pathogenic bacterium. This could indicate evolutionary events in which these organisms acquired a gene or motif in their pili that is associated with pathogenicity. Furthermore, of the 156 pilin monomers in this dataset, only 30 were identified as coming from pathogenic bacteria. This suggests that, in general, type IV pili do not typically mediate pathogenicity and likely serve other, diverse functions.

With the exception of K. pneumoniae, the data suggest that all pathogenic bacteria in the dataset have longer pilin proteins (> 140 amino acid residues). As noted above, this may be because these longer pilins contain a specific motif required to mediate pathogenicity, in which case closely related non-pathogenic pili might also acquire this function through certain mutations or under specific conditions. Nonetheless, these findings suggest that truncated pilin monomers, such as those from G. sulfurreducens and G. metallireducens, are unlikely to mediate pathogenicity, because they lack this longer sequence and share no close phylogenetic relationship with any pathogenic bacterium. It may therefore be the case that non-pathogenic truncated pilins share other traits with each other, perhaps including those of G. sulfurreducens and G. metallireducens.

Image displays the same phylogenetic tree as in Figure 1B, except coloured differently to instead show the relative phylogenetic positions of all the pili from pathogenic and non pathogenic bacteria

Figure 2: Phylogenetic tree based on sequence divergence of pilin proteins closely related to G. sulfurreducens and P. aeruginosa, inferred from scraped BLAST data. The tree displays a distant relationship between G. sulfurreducens and any pathogen within the entire dataset.

If the majority of type IV pili play no role in pathogenicity, they may serve other roles, such as conductivity. Indeed, conductivity has been tested in pili other than those of G. sulfurreducens and G. metallireducens, including in bacteria outside the Geobacter genus, but the reported values indicate poor conductivity[23-25]. The species tested included Shewanella oneidensis[23, 24] and Pseudomonas aeruginosa[25]. Regardless of how efficiently they conduct electricity, the confirmation of conductive pili in genera other than Geobacter suggests that e-pili with comparable or even better conductivity may be yet to be discovered, as conductivity may depend on structural factors. Moreover, type IV pilin subunits share common structural features, such as N-terminal sequence similarity and a C-terminal disulfide bond, suggesting a degree of common filament architecture across all type IV pili[26].

Indeed, there is a clear structure–function correlation linking the length of the main pilin subunit and its aromatic amino acid content to conductivity[27]. Studies have reported that strains of G. sulfurreducens carrying mutations that introduce more aromatic amino acids show higher conductivity, suggesting that electron transfer occurs through pi-pi stacking interactions[28].

Furthermore, pi-pi stacking interactions are thought to be influenced by pH, which affects the electrostatic interactions that mediate stacking[29]. This has been demonstrated experimentally for G. sulfurreducens, whose pili conductivity is highly dependent on pH, increasing from 51 ± 19 mS cm⁻¹ at pH 7 to 188 ± 34 mS cm⁻¹ at pH 2[30]. This was found to be associated with proton doping, which increases stacking of aromatic amino acids through stronger electrostatic interactions. This strengthens the argument that aromatic amino acid content and density, pH, and protonation all play a role in the conductivity of a pilus.

To generate a list of candidate e-pili, the dataset needed to be narrowed down using factors known to contribute to conductivity. The first step was to examine the isoelectric point (pI) of every PilA protein in the dataset, as studies demonstrate that proton doping increases stacking of aromatic amino acids, thereby increasing the conductivity of a pilus[30]. The isoelectric point is the pH at which a protein carries no net charge[31]. At pH values above the pI, the protein becomes increasingly negatively charged, while at pH values below it, the protein becomes more positively charged. The pI is therefore a convenient proxy for the degree of protonation at low pH: at a given acidic pH, a protein with a higher pI would be more protonated than one with a lower pI, and would thus be expected to show more aromatic amino acid stacking and higher conductivity.
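
As a rough illustration of how a pI can be estimated from sequence alone, the sketch below sums Henderson-Hasselbalch charge terms for the ionisable groups and searches for the pH at which the net charge is zero. The pKa values are illustrative assumptions, and different tools use different sets; the tool and pKa set behind the values in Table 1 are not stated here, so this sketch should not be expected to reproduce them. The peptides are hypothetical.

```python
# Illustrative pKa values (an assumption; published pKa sets differ).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6


def net_charge(sequence: str, ph: float) -> float:
    """Estimate the net charge of a protein at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))    # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # deprotonated C-terminus
    for residue in sequence:
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence: str, tolerance: float = 1e-4) -> float:
    """Find the pH of zero net charge by bisection between pH 0 and 14.

    The net charge only decreases as the pH rises, so bisection converges.
    """
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid    # still positively charged: the pI is higher
        else:
            high = mid   # negatively charged: the pI is lower
    return (low + high) / 2


# Hypothetical peptides: one lysine-rich, one aspartate-rich.
for peptide in ("KKKK", "DDDD"):
    print(peptide, round(isoelectric_point(peptide), 2),
          round(net_charge(peptide, 4.75), 2))
```

The lysine-rich peptide has a high pI and stays positively charged at an acidic pH of 4.75, while the aspartate-rich peptide has a low pI and is negatively charged at the same pH, which is the relationship the paragraph above relies on.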

We then investigated whether these pili would conduct well when engineered as skin-specific nanowires. As the pH of skin is slightly acidic and can range from 4.5 to 7, a pilin with a pI above 7 would be expected to remain protonated, and therefore to conduct sufficiently, when on the skin[31,32]. To explore this, the pI was calculated for each protein sequence on the tree (Figure 1b) and annotated separately (Figure 3). As the pI values of G. sulfurreducens and G. metallireducens are approximately 9.46 and 9.40 respectively, this study considered a pI above 9 to be high (Table 1). Based on their pI, the proteins were sorted into three categories: unlikely to be conductive, poor conductors, or good conductors, labelled red, orange, and blue respectively. Interestingly, the majority of proteins with a high pI cluster around G. sulfurreducens and G. metallireducens. This suggests that structural characteristics associated with high conductivity may be an evolutionary trait shared by G. sulfurreducens and G. metallireducens, and that other closely related pilins with similar pI values may be as conductive as the pili from these two species.

Image displays the same phylogenetic tree as in Figure 1B, except coloured differently to instead show the approximate isoelectric points of all the pilin proteins. A snapshot of the tree is zoomed in on so that the reader can see the section of interest more clearly

Figure 3: Phylogenetic tree of pilin proteins similar to those of G. sulfurreducens and P. aeruginosa, inferred from BLAST data and coloured by isoelectric point (pI). As e-pili function best in acidic conditions, and the pH of the skin barrier is approximately 4.75, a higher pI suggests a pilin that is protonated, and can therefore conduct electricity, at the pH of the skin barrier.

All proteins with a pI above 9 were then compiled into a list, as these would be the most likely candidates for conductivity testing (Table 1). This list was used to identify pilins closely related to those of G. sulfurreducens and G. metallireducens that share traits contributing to conductivity, such as a shorter pilin, a tight alpha helix fold, and a higher proportion of aromatic amino acid residues[28, 33]. The proteins with comparable or higher %aromaticity were then modelled with AlphaFold on the ColabFold server[34] (Table 3). Of these, only V. hepatarius was excluded from further study, as the length of the protein and the widely spaced distribution of its aromatic amino acids meant that its pili would probably be poor conductors, as was seen when conductivity was tested in G. uraniireducens by Tan et al., 2016[35]. The candidates were compared with proteins selected for characteristics otherwise associated with poor conductivity, namely low pI and low %aromaticity, as well as with a pilin subunit from a pathogenic bacterium[27, 28, 30].
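
The screening step above can be sketched as follows. The sketch assumes that %aromaticity means the percentage of phenylalanine, tryptophan and tyrosine residues in the sequence (some definitions also count histidine), and the thresholds, names and sequences in the example are hypothetical placeholders rather than the values or data used for Table 1.

```python
AROMATIC_RESIDUES = frozenset("FWY")  # assumption: Phe, Trp and Tyr only


def percent_aromaticity(sequence: str) -> float:
    """Aromatic residues as a percentage of the total sequence length."""
    if not sequence:
        raise ValueError("sequence is empty")
    aromatic = sum(1 for residue in sequence.upper()
                   if residue in AROMATIC_RESIDUES)
    return 100.0 * aromatic / len(sequence)


def shortlist(records, min_pi=9.0, min_aromaticity=9.0):
    """Return the names of records that pass both thresholds.

    `records` is an iterable of (name, pI, sequence) tuples. Both default
    thresholds are illustrative choices, not values taken from the study.
    """
    return [name for name, pi, sequence in records
            if pi > min_pi and percent_aromaticity(sequence) >= min_aromaticity]


# Hypothetical records: (name, pI, sequence).
toy_records = [
    ("candidate_1", 9.5, "MFAYKKAAWL"),   # high pI, 30% aromatic
    ("candidate_2", 9.3, "MAAAKKAAAL"),   # high pI, no aromatic residues
    ("candidate_3", 4.1, "MFAYDDAAWL"),   # aromatic, but low pI
]
print(shortlist(toy_records))
```

A percentage like this says nothing about how the aromatic residues are spaced along the sequence, which is why a pilin can pass such a screen and still be excluded on structural grounds.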

For this comparison, three proteins were selected: the pilin from Variovorax beijingensis, which has a high pI (9.34) but low %aromaticity (3); the pilin from Endozoicomonas numazuensis, which has a low approximated pI (4.1); and the pilin from Moraxella catarrhalis, a pathogenic bacterium. These larger pilin monomers all contained a beta sheet motif at the C-terminal end that was absent from all of the potential e-pili candidates (Table 2). It is therefore likely that this beta sheet insert is an evolutionary trait that gives the pili another function at the expense of reduced or absent conductivity. When the structures were aligned using PyMOL[36], the RMSD between the G. sulfurreducens and M. catarrhalis pilins was 1.4 Å, a low value indicating that the folds are very similar[37] apart from the beta sheet. This implies that if the pilins of M. catarrhalis or other bacteria with longer pilins were truncated to only the alpha-helical region, the resulting pili might gain conductivity relative to the wild type. From this analysis, it appears that the absence of this beta sheet motif may be what sets the pili of G. sulfurreducens and G. metallireducens, and their conductive properties, apart from other type IV pili.
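
For readers unfamiliar with the metric, RMSD is the square root of the mean squared distance between paired atoms of two structures. The sketch below computes only that final number for coordinates that are assumed to be already paired and superposed; structure alignment tools such as PyMOL also choose the atom pairs and find the best superposition first, which is not attempted here. The coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates for three paired atoms.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in reference]  # every atom moved by 1 along x
print(rmsd(reference, reference))
print(rmsd(reference, shifted))
```

Because every atom contributes equally, a low overall RMSD can coexist with one region that differs strongly if that region is left out of the paired atoms, as is the case for a domain present in only one of the two structures.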

Table 1: Type IV pilins with an approximated isoelectric point similar to those of G. sulfurreducens and G. metallireducens. For each pilin, the aromatic amino acid content was calculated as a percentage of the total amino acid length. The pilins with %aromaticity close to that of G. sulfurreducens and G. metallireducens were selected for structural analysis using AlphaFold. * Displays the most similar properties to G. sulfurreducens and G. metallireducens. ** Displays similar traits, but an uneven distribution of aromatic amino acids means it would not be viable for conductivity testing.

Strain name	Isoelectric point (pI)	Length (amino acids)	Aromaticity (%)	Accession number
Geobacter metallireducens*	9.40	69	13	WP_004511668.1
Geobacter sulfurreducens*	9.46	73	9.6	WP_119332462.1
Geotalea uraniireducens	9.22	72	8.4	WP_281999251.1
Variovorax beijingensis	9.34	163	3	WP_125965373.1
Vibrio hepatarius**	9.03	147	10.8	WP_216599407.1
Vibrio anguillarum	9.10	151	6.6	WP_340480490.1
Vibrio nigripulchritudo	9.33	144	8	WP_052704144.1
Comamonas thiooxydans	9.26	144	3.5	WP_081034056.
Cupriavidus plantarum	9.06	157	8.3	WP_115953299.1
Allofranklinella schreckenbergeri	9.23	163	7.4	WP_122254379.1
Herminiimonas arsenitoxidans	9.40	157	5	WP_076593492.
Trichlorobacter lovleyi	10.11	70	8.6	WP_012470151.1
Geomonas terrae*	9.22	70	11.8	WP_135869753.1
Oryzomonas sagensis	9.82	72	7	WP_151157262.1
Geomonas paludis*	9.40	75	9.4	WP_183347385.1
Desulfocapsa sulfexigens*	10.09	76	10.5	WP_015404503.1
Citrifermentans bemidjiense*	9.10	76	11.9	WP_012531019.1
Atopomonas hussainii	9.22	60	6.6	WP_074869426.1
Geoalkalibacter subterraneus	9.16	79	14	WP_040199521.1
Geomobilimonas luticola*	9.39	71	12.6	WP_214174994.1
Stutzerimonas xanthomarina	9.60	61	4.9	WP_200636514.1
Desulfuromonas thiophila*	9.40	70	12.8	WP_092080546.1
Geoalkalibacter halelectricus*	9.99	75	13.3	WP_302503965.1
Desulfolithobacter dissulfuricans*	9.99	75	13.3	WP_326491544.1
Pseudomonas toyotomiensis	10.25	59	6.8	WP_279836652.1
Thiohalorhabdus denitrificans	9.30	65	6.1	WP_054965004.1
Geobacter argillaceus	9.36	75	8	WP_145021489.1
Desulfosediminicola ganghwensis	10.20	76	7.8	WP_275942934.1
Desulfosediminicola flagellatus	9.87	76	9.1	WP_136808425.1
Deferribacter desulfuricans*	9.52	67	12	WP_013007984.1

Table 2: AlphaFold structures of pilins intended for structural comparison of pili with the following: Low isoelectric point (pI), high pI but low %aromaticity, and pathogenicity.

Strain Name	%Aromaticity/pI	AlphaFold Structure
Variovorax beijingensis	High pI (9.34), but low %aromaticity (3)
Endozoicomonas numazuensis	Low pI (4.1)
Moraxella catarrhalis	Pathogenic bacteria

Table 3: AlphaFold structures of selected pilins that exhibit most similarity to existing e-pili.

Strain Name	%Aromaticity	AlphaFold Structure
Geobacter metallireducens	13
Geobacter sulfurreducens	9.6
Geomonas terrae	11.8
Geomonas paludis	9.4
Desulfocapsa sulfexigens	10.5
Citrifermentans bemidjiense	11.9
Geoalkalibacter subterraneus	14
Geomobilimonas luticola	12.6
Desulfuromonas thiophila	12.8
Geoalkalibacter halelectricus	14
Desulfolithobacter dissulfuricans	13.3
Desulfosediminicola flagellatus	9.1
Deferribacter desulfuricans	12

Conclusion

The phylogenetic analysis showed that the main pilin subunits of G. sulfurreducens and G. metallireducens are more than 10 evolutionary events away from the nearest pilin subunit from a pathogenic bacterium, supporting the hypothesis that pili from either species are unlikely to be able to mediate pathogenicity. Furthermore, structural analysis uncovered a similar fold in the truncated pilins closely related to those of Geobacter species and in the longer pilins related to that of P. aeruginosa, suggesting a common structure across all type IV pili. The longer pilins are, however, easily distinguished by the presence of a beta sheet motif at the C-terminal end. Together, the phylogenetic and structural analyses suggest that e-pili from G. sulfurreducens or related pili pose minimal risk in a lab or potential clinical setting in future applications, even if expressed using genes or machinery from pathogenic bacteria[6]. Further study also indicated that e-pili from G. sulfurreducens would likely be sufficiently conductive on or around the skin barrier, opening the possibility of further exploration into their application in wearable electronics. The structural analysis also indicated that the structure of a pilus has to be very specific in order to be conductive. As aromatic amino acids are negatively scored for alpha helix formation, there must exist a highly specific conformation that allows both pi-pi stacking and a densely packed alpha helix fold[28]. Ideally, the most conductive pili would have not only a very high aromatic amino acid content, but also a very tight alpha helix fold. These factors could be used to optimise conductivity when designing a synthetic nanowire with greater conductive capabilities than the most conductive biological nanowire currently known, that of G. metallireducens, opening up the possibility of electronics made from sustainable materials.

References

Roland T. Motion Artifact Suppression for Insulated EMG to Control Myoelectric Prostheses. Sensors. 2020 Feb 14;20(4):1031.
Lovley DR, Walker DJF. Geobacter Protein Nanowires. Frontiers in Microbiology. 2019 Sep 24;10.
Persat A, Inclan YF, Engel JN, Stone HA, Gitai Z. Type IV pili mechanochemically regulate virulence factors in Pseudomonas aeruginosa. Proceedings of the National Academy of Sciences. 2015 Jun 3;112(24):7563–8.
Krebs SJ, Taylor RK. Protection and Attachment of Vibrio cholerae Mediated by the Toxin-Coregulated Pilus in the Infant Mouse Model. Journal of Bacteriology. 2011 Oct 1;193(19):5260–70. Available from: https://jb.asm.org/content/193/19/5260
Virji M. Pathogenic neisseriae: surface modulation, pathogenesis and infection control. Nature Reviews Microbiology. 2009 Apr;7(4):274–86. Available from: https://www.nature.com/articles/nrmicro2097
Ueki T, Walker DJF, Woodard TL, Nevin KP, Nonnenmann SS, Lovley DR. An Escherichia coli Chassis for Production of Electrically Conductive Protein Nanowires. ACS Synthetic Biology. 2020 Mar 3;9(3):647–54.
iGEM 2023 Tongji China. Project Description | Tongji-China - iGEM 2023. Igem.wiki. 2023. Available from: https://2023.igem.wiki/tongji-china/description
Li X, Liang Y, Wang Z, Yao Y, Chen X, Shao A, et al. Isolation and Characterization of a Novel Vibrio natriegens—Infecting Phage and Its Potential Therapeutic Application in Abalone Aquaculture. Biology. 2022 Nov 17;11(11):1670.
HSE. The Approved List of biological agents - MISC208(rev2). Hse.gov.uk. 2023. Available from: https://www.hse.gov.uk/pubns/misc208.htm
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10(1):421.
Reguera G, McCarthy KD, Mehta T, Nicoll JS, Tuominen MT, Lovley DR. Extracellular electron transfer via microbial nanowires. Nature. 2005 Jun 1;435(7045):1098–101. Available from: https://www.nature.com/articles/nature03661
Singh PK, Little J, Donnenberg MS. Landmark Discoveries and Recent Advances in Type IV Pilus Research. Microbiology and Molecular Biology Reviews. 2022 Sep 21;86(3).
Reddy S, Kimball RT, Pandey A, Hosner PA, Braun MJ, Hackett SJ, et al. Why Do Phylogenomic Data Sets Yield Conflicting Trees? Data Type Influences the Avian Tree of Life more than Taxon Sampling. Systematic Biology. 2017 Mar 27;66(5):857–79.
Madeira F, Madhusoodanan N, Lee J, Eusebi A, Niewielska A, Tivey ARN, et al. The EMBL-EBI Job Dispatcher sequence analysis tools framework in 2024. Nucleic acids research. 2024 Apr 10;52(W1).
Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, et al. IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Teeling E, editor. Molecular Biology and Evolution. 2020 Feb 3;37(5):1530–4. Available from: https://academic.oup.com/mbe/article/37/5/1530/5721363
Katsura Y, Stanley CE, Kumar S, Nei M. The Reliability and Stability of an Inferred Phylogenetic Tree from Empirical Data. Molecular Biology and Evolution. 2017 Jan 18;34(3):msw272.
Abbas R, Chakkour M, Zein H, Eseiwi FO, Obeid ST, Jezzini A, et al. General Overview of Klebsiella pneumonia: Epidemiology and the Role of Siderophores in Its Pathogenicity. Biology. 2024 Jan 27;13(2):78–8. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10886558/
Wang G, Zhao G, Chao X, Xie L, Wang H. The Characteristic of Virulence, Biofilm and Antibiotic Resistance of Klebsiella pneumoniae. International Journal of Environmental Research and Public Health. 2020 Aug 28;17(17):6278. Available from: https://www.mdpi.com/1660-4601/17/17/6278
Craig L, Forest KT, Maier B. Type IV pili: dynamics, biophysics and functional consequences. Nature Reviews Microbiology. 2019 Apr 15;17.
Xicohtencatl-Cortes J, Monteiro-Neto V, Saldaña Z, Ledesma MA, Puente JL, Girón JA. The Type 4 Pili of Enterohemorrhagic Escherichia coli O157:H7 Are Multipurpose Structures with Pathogenic Attributes. Journal of Bacteriology. 2009 Jan;191(1):411–21.
Fabrega A, Vila J. Salmonella enterica Serovar Typhimurium Skills To Succeed in the Host: Virulence and Regulation. Clinical Microbiology Reviews. 2013 Apr 1;26(2):308–41. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3623383/
Walker DJ, Adhikari RY, Holmes DE, Ward JE, Woodard TL, Nevin KP, et al. Electrically conductive pili from pilin genes of phylogenetically diverse microorganisms. The ISME Journal. 2017 Sep 5;12(1):48–58.
Reguera G, Nevin KP, Nicoll JS, Covalla SF, Woodard TL, Lovley DR. Biofilm and Nanowire Production Leads to Increased Current in Geobacter sulfurreducens Fuel Cells. Applied and Environmental Microbiology. 2006 Nov 1;72(11):7345–8. Available from: https://aem.asm.org/content/72/11/7345
Gorby YA, Yanina S, McLean JS, Rosso KM, Moyles D, Dohnalkova A, et al. Electrically conductive bacterial nanowires produced by Shewanella oneidensis strain MR-1 and other microorganisms. Proceedings of the National Academy of Sciences. 2006 Jul 18;103(30):11358–63.
Liu X, Tremblay PL, Malvankar NS, Nevin KP, Lovley DR, Vargas M. A Geobacter sulfurreducens Strain Expressing Pseudomonas aeruginosa Type IV Pili Localizes OmcS on Pili but Is Deficient in Fe(III) Oxide Reduction and Current Production. Applied and Environmental Microbiology. 2013 Dec 2;80(3):1219–24.
Craig L, Taylor RK, Pique ME, Adair BD, Arvai AS, Singh M, et al. Type IV Pilin Structure and Assembly: X-Ray and EM Analyses of Vibrio cholerae Toxin-Coregulated Pilus and Pseudomonas aeruginosa PAK Pilin. Molecular Cell. 2003 May 1 [cited 2021 Sep 28];11(5):1139–50. Available from: https://www.sciencedirect.com/science/article/pii/S1097276503001709#BIB47
Malvankar NS, Vargas M, Nevin K, Tremblay PL, Evans-Lutterodt K, Nykypanchuk D, et al. Structural Basis for Metallic-Like Conductivity in Microbial Nanowires. Brennan RG, editor. mBio. 2015 May;6(2).
Tan Y, Adhikari RY, Malvankar NS, Ward JE, Woodard TL, Nevin KP, et al. Expressing the Geobacter metallireducens PilA in Geobacter sulfurreducens Yields Pili with Exceptional Conductivity. Papoutsakis ET, editor. mBio. 2017 Jan 17;8(1).
Carter-Fenk K, Liu M, Pujal L, Loipersberger M, Tsanai M, Vernon RM, et al. The Energetic Origins of Pi–Pi Contacts in Proteins. Journal of the American Chemical Society. 2023 Nov 2.
Adhikari RY, Malvankar NS, Tuominen MT, Lovley DR. Conductivity of individual Geobacter pili. RSC Advances. 2016 Jan 19 [cited 2023 Jun 12];6(10):8354–7. Available from: https://pubs.rsc.org/en/content/articlelanding/2016/ra/c5ra28092c
Tokmakov AA, Kurotani A, Sato KI. Protein pI and Intracellular Localization. Frontiers in Molecular Biosciences. 2021 Nov 29;8(775736). Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8667598
Lambers H, Piessens S, Bloem A, Pronk H, Finkel P. Natural skin surface pH is on average below 5, which is beneficial for its resident flora. International Journal of Cosmetic Science. 2006 Oct;28(5):359–70. Available from: https://onlinelibrary.wiley.com/doi/full/10.1111/j.1467-2494.2006.00344.x
Holmes DE, Dang Y, Walker DJF, Lovley DR. The electrically conductive pili of Geobacter species are a recently evolved feature for extracellular electron transfer. Microbial Genomics. 2016 Aug 25;2(8).
Mirdita M, Schütze K, Moriwaki Y, Heo L, Ovchinnikov S, Steinegger M. ColabFold: making protein folding accessible to all. Nature Methods. 2022 May 30;19(6):1–4. Available from: https://www.nature.com/articles/s41592-022-01488-1
Tan Y, Adhikari RY, Malvankar NS, Ward JE, Nevin KP, Woodard TL, et al. The Low Conductivity of Geobacter uraniireducens Pili Suggests a Diversity of Extracellular Electron Transfer Mechanisms in the Genus Geobacter. Frontiers in Microbiology. 2016;7:980. Available from: https://pubmed.ncbi.nlm.nih.gov/27446021/
Schrodinger, L. (2010) The PyMOL Molecular Graphics System, Version 1.3r1.
Carugo O, Pongor S. A normalized root-mean-square distance for comparing protein three-dimensional structures. Protein Science. 2008 Dec 31;10(7):1470–3.

Python Code Embed