Team Heidelberg

Results

PICasSO Enables 3D Genome Engineering

Our PICasSO toolbox (Plasmid-Integrated Cas Stapled Origami) can be used to selectively induce proximity between two custom DNA loci or strands, enabling future scientists to investigate the effects of spatial genome perturbations on cell development and pathogenesis. We established a range of novel DNA-linking proteins that we call protein staples. These range from minimal, easy-to-express staples to larger Cas-based staples that allow for flexible target design. Furthermore, we established different functionalization modules, allowing for cell-specific staple creation, depletion and control. We successfully validated our Cas staples in mammalian cells and programmed functional DNA interactions in trans. To facilitate efficient delivery of large constructs, we engineered bacteria capable of inter-kingdom conjugation.

CRISPR/Cas Staples

Actively utilizing the 3D DNA conformation to engineer gene expression requires a tool that is highly specific while still allowing for targeting any location within the genome. Recently, CRISPR/Cas-based systems that manipulate three-dimensional DNA conformation to regulate cellular processes have been developed. However, these systems are often complex and have a limited range of applications. Here we present the design and application of fusion guide RNAs (fgRNAs) and fused Cas-nucleases to create a simple, adjustable system for the precise programming of DNA-DNA contacts. We successfully demonstrated the use of these Cas staples and their application in gene expression control by artificially inducing spatial proximity of otherwise separate regulatory elements. Taken together, our results establish Cas staples as an innovative and powerful tool for remodeling the three-dimensional landscape of the genome.

Introduction

The CRISPR/Cas System as a Gene Editing Tool

In 2012, Jinek et al. described the use of the Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/Cas system to induce DNA double-strand breaks. Since their seminal work, gene editing with CRISPR/Cas has come a long way, as exemplified by a multitude of applications in all areas of the life sciences and even the first CRISPR-based gene therapies (Sheridan, 2023). At their core, CRISPR/Cas gene editors consist of a ribonucleoprotein complex made up of a Cas nuclease and one or two RNA molecules responsible for guiding the nuclease to a specific genomic site. The most widely applied CRISPR nuclease is Cas9 from Streptococcus pyogenes. It binds a CRISPR RNA (crRNA) and a trans-activating CRISPR RNA (tracrRNA) that can be combined into a single guide RNA (sgRNA) (see Fig. 1A) (Pacesa et al., 2024). The sgRNA includes a scaffold required for its binding to the Cas nuclease and an interchangeable 20 nucleotide (nt) spacer sequence that defines the DNA target via complementary base pairing. Once Cas9 binds to a matching DNA strand, it efficiently cleaves the target DNA (Cong et al., 2013). Furthermore, a specific three-nucleotide sequence (NGG) directly 3’ of the target sequence is required for binding and cleavage. This is referred to as the protospacer adjacent motif (PAM) (Sternberg et al., 2014).

Figure 1: The CRISPR/Cas system. A and B, schematic structure of Cas9 and Cas12a with their sgRNA/crRNA, sitting on a DNA strand with the PAM. The spacer sequence forms base pairs with the dsDNA. In the case of Cas9, the spacer is located at the 5’ end of the gRNA; for Cas12a, it is located at the 3’ end. The scaffold of the gRNA forms a specific secondary structure enabling it to bind to the Cas protein. The cut sites of the cleaving domains, RuvC and HNH, are symbolized by the scissors.

Over the following years, additional Cas proteins with different functional properties have been discovered, including Cas12a from Acidaminococcus sp. (AsCas12a) and Moraxella bovoculi (MbCas12a) (Zetsche et al., 2015). In contrast to Cas9, the gRNAs of Cas12a carry a 5’-scaffold (see Fig. 1B). Similarly, the PAM (TTTN) is also on the 5’ side of the spacer (Pacesa et al., 2024). Importantly for our work, Cas12a is capable of processing arrays of multiple consecutive crRNA repeats into individual crRNAs/gRNAs, enabling the expression of multiple gRNAs from a single expression cassette (Paul and Montoya, 2020). The introduction of specific mutations allows the generation of catalytically dead Cas variants (dCas) (Koonin et al., 2023; Kleinstiver et al., 2019). By fusing dCas proteins to transactivator domains and targeting them to an endogenous promoter region via a specific gRNA, they can be harnessed for programmable transcriptional activation or “CRISPR activation” (CRISPRa) (Kampmann, 2017).
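The PAM rules described above (NGG on the 3’ side of the target for Cas9, TTTN on the 5’ side for Cas12a) can be illustrated with a small search over a DNA sequence. The following is only a minimal sketch: the function names are hypothetical, only the given strand is scanned, and using a 20 nt protospacer for Cas12a as well is an assumption made for simplicity.

```python
import re


def find_cas9_sites(seq: str, spacer_len: int = 20):
    """Return (start, protospacer, pam) for targets followed by a 3' NGG PAM.

    Only the given strand is scanned (simplification for illustration).
    """
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(seq.upper())]


def find_cas12a_sites(seq: str, spacer_len: int = 20):
    """Return (start, protospacer, pam) for targets preceded by a 5' TTTN PAM.

    The 20 nt default length is an assumption, not taken from the text.
    """
    pattern = re.compile(r"(?=(TTT[ACGT])([ACGT]{%d}))" % spacer_len)
    # The protospacer starts right after the 4 nt PAM.
    return [(m.start() + 4, m.group(2), m.group(1))
            for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Arbitrary demo sequence: TTTA PAM, 20 nt target, TGG PAM.
    demo = "TTTA" + "ACGTACGTACGTACGTACGT" + "TGG"
    print(find_cas9_sites(demo))
    print(find_cas12a_sites(demo))
```

In this toy sequence the same 20 nt stretch is flanked by a Cas12a PAM on its 5’ side and a Cas9 PAM on its 3’ side, so both functions report it as a candidate target.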

Strategies to Program DNA Interactions

The three-dimensional conformation of genomes is crucial for the correct interpretation of the genetic code and the phenotypic plasticity of cells. Interacting genetic elements often occur in close proximity in cis, but it was shown that long-range interactions of genomic loci separated by thousands of bases or even located on different chromosomes are similarly important (Cramer, 2019). Here, the three-dimensional (3D) conformation of the genome comes into play and directly influences biological functions (Lieberman-Aiden et al., 2009; Dixon et al., 2012). Being able to artificially influence DNA interactions would allow us to study and program the 3D genome organization and potentially cure pathophysiological rearrangements. One complex approach to achieve this makes use of the CRISPR/Cas system in the form of light-activated dynamic looping for endogenous gene expression control (LADL) by Kim et al. (2019). In this system, two dCas9 proteins bind to specific endogenous loci through their respective sgRNAs. The dCas9 proteins are fused to photoreceptors that cluster upon activation by blue light, resulting in the inducible bridging of two genomic loci (see Fig. 2).

Figure 2: Light-activated dynamic looping for endogenous gene expression control. The Cas proteins bind to the DNA via the sgRNA. They are fused to CIBN, which, upon exposure to blue light, forms a heterodimer with CRY2. CRY2 also forms oligomers with other CRY2 proteins, resulting in the bridging between two different Cas9 protein complexes. This induces the looping of DNA segments (adapted from Kim et al. (2019)).

Aim of This Subproject

Although systems for the manipulation of 3D DNA conformations exist in principle, they require several components and multiple gRNAs, resulting in very complex experimental setups. Arguably the most prominent tool, LADL, relies on the unspecific clustering of dCas9, rendering multiplexing impossible. Related, less complex approaches that are based on smaller proteins like zinc finger domains remain difficult to adapt to any genomic locus of choice (Kim and Kini, 2017).
Here we present the usage of CRISPR/Cas proteins to reliably connect or "staple" two DNA strands. We created fusion guide RNAs (fgRNAs) that are composed of Cas12a and Cas9 gRNAs, connected via their spacers (Kweon et al., 2017). These Cas staples solely require the two Cas proteins dCas9 and dCas12a and an fgRNA to bridge two separate DNA strands. This allows for precise and adjustable genomic targeting while keeping the system's complexity to a minimum. Cas12a's ability to process concatenated crRNAs will allow for simple multiplexing by expressing several fgRNAs in repeat from a single promoter (Gonatopoulos-Pournatzis et al., 2020).
In the first part of this project we established functional fgRNAs. We then continued to show that these fgRNAs can be used in combination with Cas12a and Cas9 to form complete Cas staples that bring two separate DNA loci into proximity.

Results

To successfully induce proximity of two DNA strands, we had to connect two different DNA-binding elements. We selected SpCas9 and MbCas12a as DNA binders due to their programmability. To physically connect both proteins, we decided to link their gRNAs. In contrast to fusing the Cas proteins, this has the advantage that the system can easily be multiplexed while still guaranteeing that specific pairs of genomic loci are connected. Specifically, the 3'-end of the Cas12a gRNA was fused to the 5'-end of the Cas9 gRNA. With this approach, the two spacer sequences are linked directly, ensuring a minimal distance between the two DNA strands. The design also facilitates interchangeability, as the central portion of the fgRNA construct, which includes both spacers (for Cas9 and Cas12a) together with an optional linker, can be easily replaced in one cloning step. The scaffold components remain integrated within the backbone of the construct. To implement this compact composition and ensure an efficient cloning procedure, we engineered an entry vector for the fgRNAs. We inserted a gene block encoding ccdB between the scaffolds of the Cas12a and Cas9 guide RNAs (gRNAs). SapI cut sites were incorporated between the scaffolds and the ccdB gene. Therefore, all fgRNAs can be created by a simple Golden Gate assembly in which the spacer sequences are added to the entry vector as annealed oligonucleotides (see Fig. 3). Because ccdB is toxic, bacteria transformed with the unchanged plasmid die, so that only clones carrying an insert grow. This concept was confirmed by our own clonings throughout the project, which had a very high success rate. Picking one colony for sequencing was sufficient in all cases but one.

Figure 3: Construction process of fgRNAs using the entry vector. The ccdB gene can be excised using SapI in a Golden Gate assembly. By inserting oligonucleotides with the desired spacer sequences and matching overhangs, the complete fgRNA can be assembled into the entry vector. Due to the cytotoxic nature of ccdB, only cells with the oligonucleotides as inserts survive.
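The cloning logic described above can be sketched in a few lines: the exchangeable insert is the Cas12a spacer, an optional linker and the Cas9 spacer, and it is ordered as two oligonucleotides that anneal to leave overhangs matching the SapI-digested entry vector. This is a hedged illustration only. The function names are hypothetical, and the 3 nt overhang sequences in the example are placeholders, not the overhangs of the actual entry vector.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def build_fgrna_insert(cas12a_spacer: str, cas9_spacer: str, linker: str = "") -> str:
    """Central fgRNA portion: Cas12a spacer, optional linker, Cas9 spacer (5' to 3')."""
    parts = [cas12a_spacer.upper(), linker.upper(), cas9_spacer.upper()]
    for part in parts:
        if set(part) - set("ACGT"):
            raise ValueError(f"non-DNA character in {part!r}")
    return "".join(parts)


def design_annealed_oligos(insert: str, left_overhang: str, right_overhang: str):
    """Return (top, bottom) oligos, both written 5' to 3'.

    After annealing, the duplex carries a 5' overhang on each side.
    SapI leaves 3 nt overhangs, so both overhangs must be 3 nt long;
    their sequences depend on the vector and are placeholders here.
    """
    if len(left_overhang) != 3 or len(right_overhang) != 3:
        raise ValueError("SapI overhangs are 3 nt long")
    top = left_overhang.upper() + insert
    bottom = reverse_complement(insert + right_overhang)
    return top, bottom


if __name__ == "__main__":
    # Dummy spacers and overhangs, purely for illustration.
    insert = build_fgrna_insert("A" * 20, "C" * 20, linker="T" * 20)
    top, bottom = design_annealed_oligos(insert, "AAA", "GGG")
    print(top)
    print(bottom)
```

Keeping the scaffolds in the vector backbone means that only these two short oligonucleotides change between fgRNA designs, which is what makes the one-step exchange practical.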

Editing Endogenous Loci With fgRNAs

To develop and test functional fgRNA Cas staples, we first set out to establish two essential applications of fgRNAs: multiplex genome editing and CRISPRa (see Fig. 4) (Kweon et al., 2017).

Figure 4: Applications of the Fusion Guide RNA. Fusion Guide RNAs can be used for multiplex genome editing by guiding active Cas12a and Cas9 to two distinct loci. Similarly, fgRNAs allow for CRISPRa, by guiding the Cas9-VP64 transcriptional activator towards a target locus.

To prove that our fusion gRNAs can still form functional ribonucleoprotein complexes together with Cas9 and Cas12a, a series of different fgRNAs was created, each carrying spacers specific to the VEGFA and FANCF genes. HEK293T cells were transfected with constructs encoding the catalytically active Cas proteins and gRNAs. The gene editing rates were evaluated 72 h post transfection by measuring indel frequencies with a T7 endonuclease I (T7EI) assay.
Initially, we used AsCas12a and SpCas9 targeting the endogenous VEGFA (AsCas12a) and FANCF (SpCas9) loci. As controls, we included samples containing separate single gRNAs co-transfected with a plasmid encoding the corresponding Cas protein, and compared them to samples including fgRNAs in combination with either one of the Cas proteins or both Cas proteins combined in a single sample (see Fig. 5). We detected high editing rates (45% for VEGFA and 15% for FANCF) when using individual sgRNAs. Of note, editing of FANCF with fgRNAs resulted in noticeable indel frequencies of about 10%, both with SpCas9 alone and when both Cas orthologs were co-transfected. Similar results were observed for VEGFA. In this case, indel frequencies as high as 40% were measured for the sample in which both Cas variants were combined with the fgRNA. These initial results confirmed our engineering approach and show that fgRNAs achieve editing efficiencies comparable to commonly used gRNAs.

Figure 5: FgRNAs Enable Efficient Editing of Endogenous Loci. Editing efficiencies were measured by assessing indel rates 72h post transfection via T7EI assay. Editing % was calculated through band intensities. The schematic at the top shows the composition of the fgRNA. Below each spacer is the targeted gene. The symbols below indicate which parts are included in each sample.
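The captions state that editing percentages were calculated from band intensities. One commonly used way to do this for T7EI assays is sketched below. It is an assumption that this exact formula was applied here, and the function name and example intensities are purely illustrative. The formula estimates the indel frequency as 100 × (1 − sqrt(1 − f)), where f is the cleaved fraction of the total band intensity. The square root accounts for the fact that T7EI only cuts heteroduplexes formed between edited and unedited strands.

```python
import math


def t7ei_indel_percent(uncut_intensity: float, cut_intensities) -> float:
    """Estimate indel frequency (%) from gel band intensities of a T7EI digest.

    uncut_intensity: intensity of the full-length (uncleaved) PCR band.
    cut_intensities: intensities of the cleavage product bands.
    """
    cut_total = sum(cut_intensities)
    total = uncut_intensity + cut_total
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    fraction_cleaved = cut_total / total
    return 100.0 * (1.0 - math.sqrt(1.0 - fraction_cleaved))


if __name__ == "__main__":
    # Illustrative values only, not measured data.
    print(round(t7ei_indel_percent(64.0, [18.0, 18.0]), 2))
```

With these illustrative intensities, 36% of the signal is in the cleavage bands, which corresponds to an estimated indel frequency of 20%.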

Functional Fusion Guide RNAs Can Be Designed for Different Cas Orthologs

To further evaluate the capabilities of the fgRNAs, we tested them in combination with different Cas12a orthologs. After some initial testing, we decided on using MbCas12a together with SpCas9. Additionally, to test whether the differences in editing rates in the preliminary assay resulted from the targeted loci or from the different Cas orthologs, the spacers were tested in both arrangements: once with Cas12a targeting FANCF and SpCas9 targeting VEGFA, and once vice versa. To better assess the impact that the use of an fgRNA has on the editing rates, the sgRNAs were tested both in separate samples and together in one sample.
Combining the sgRNAs in the same sample resulted in no clear difference in the indel frequencies compared to testing them separately (see Fig. 6A and Fig. 6B). The fusion of the gRNAs resulted in a lower editing rate overall. While the editing for VEGFA stayed at about 20% in all cases, the editing for FANCF dropped markedly. When targeting the same gene under the same conditions, the editing rates for MbCas12a were overall lower than those for SpCas9.

Figure 6: Fusion gRNA Editing Rates In Combination with MbCas12a. A and B, Editing efficiencies were measured by assessing indel rates 72h post transfection via T7EI assay. Editing % was calculated through band intensities. The schematic at the top shows the composition of the fgRNA. Below each spacer is the targeted gene. The symbols below indicate which parts are included in each sample. A and B display both orientations of the two spacers for VEGFA and FANCF.

fgRNAs Are Compatible With Linkers of Various Lengths

To further assess the effect of the genomic locus on the editing rate, we included CCR5 as an additional gene target. For this assay, a fgRNA with a 20 nt long linker was included between the two spacers. The editing rate for VEGFA was again relatively consistent throughout the samples (see Fig. 7). For CCR5, the editing rate with sgRNAs was approximately the same at about 30%. However, it dropped below 10% for the fgRNA. The addition of the 20 nt linker had no effect on the editing rates compared to no linker.

Figure 7: Fusion gRNA Editing Rates for Multiplexing CCR5 and VEGFA. Editing efficiencies were measured by assessing indel rates 72h post transfection via T7EI assay. Editing % was calculated through band intensities. The schematic at the top shows the composition of the fgRNA. Below each spacer is the targeted gene. The symbols below indicate which parts are included in each sample. Cas12a targets VEGFA and Cas9 targets CCR5.

fgRNAs Enable Efficient Activation of Gene Expression via CRISPRa

To establish the foundation for their use as protein scaffolds, the next step was to demonstrate the use of fgRNAs for CRISPR activation. For this, we recruited the transcriptional activator VP64 to a firefly luciferase gene to induce expression. VP64 is fused to the catalytically inactive Cas9 protein, which is then guided by gRNAs to the luciferase gene. The gRNAs target a TetO sequence, which is positioned upstream of the luciferase gene in multiple repeats. The firefly luciferase activity was then quantified as photon counts and normalized against Renilla luciferase, which is expressed from a separate plasmid under a ubiquitous promoter. In two biological replicates, we saw similar relative luciferase activity with an fgRNA as the guide compared to an sgRNA (see Fig. 8). For further insight into the engineering behind these findings, we recommend taking a look at the engineering cycle fgRNA iteration 4 and 5.

Figure 8: CRISPRa Induced Luciferase Expression for sgRNAs and fgRNAs. Firefly luciferase activity was measured 48h post transfection and normalized against ubiquitously expressed Renilla luciferase. The tetO repeats were targeted by Cas9-VP64, once with a sgRNA and once with a fgRNA that had a non-targeting sequence for the Cas12a spacer. The schematic at the top shows the composition of the fgRNA. Below each spacer is the targeted gene. The symbols below indicate which parts are included in each sample.
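The normalization used in this dual-luciferase readout can be expressed as a simple ratio: firefly photon counts divided by Renilla photon counts from the same well. The sketch below illustrates this under stated assumptions. The helper names are hypothetical, the numbers are made up for demonstration, and the fold change relative to a control is an additional convenience that the text does not explicitly describe.

```python
from statistics import mean


def relative_luciferase(firefly_counts: float, renilla_counts: float) -> float:
    """Firefly signal normalized to the Renilla signal of the same well."""
    if renilla_counts <= 0:
        raise ValueError("Renilla counts must be positive for normalization")
    return firefly_counts / renilla_counts


def fold_change(sample_wells, control_wells) -> float:
    """Mean relative activity of the sample divided by that of the control.

    Each argument is a list of (firefly, renilla) count pairs,
    one pair per replicate well.
    """
    sample = mean(relative_luciferase(f, r) for f, r in sample_wells)
    control = mean(relative_luciferase(f, r) for f, r in control_wells)
    return sample / control


if __name__ == "__main__":
    # Made-up photon counts for illustration only.
    fgrna_wells = [(200.0, 100.0), (400.0, 100.0)]
    control_wells = [(100.0, 100.0), (100.0, 100.0)]
    print(fold_change(fgrna_wells, control_wells))
```

Normalizing each well to its own Renilla signal corrects for differences in transfection efficiency and cell number before samples are compared.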

"Stapling" Two DNA Strands Together Using fgRNAs

After showing the general capability of the fgRNA to work for editing and for CRISPR activation, the next step was to use it to staple two DNA loci together and thereby induce proximity between two separate functional elements. For this, an enhancer plasmid and a reporter plasmid were used. The reporter plasmid has firefly luciferase behind several repeats of a Cas9-targeted sequence. The enhancer plasmid has a Gal4 binding site behind several repeats of a Cas12a-targeted sequence. By introducing an fgRNA staple and Gal4-VP64, expression of the luciferase is induced (see Fig. 9A). Different linker lengths were tested. Firefly luciferase activity was again normalized against ubiquitous Renilla expression. Further information on our learnings from this assay can be found in the Cas staple engineering cycle iteration 2.
Using no linker between the two spacers showed similar relative luciferase activity to the baseline control (see Fig. 9B). An extension of the linker from 20 nt up to 40 nt resulted in an increasingly higher expression of the reporter gene.

Figure 9: Applying Fusion Guide RNAs for Cas staples. A, schematic overview of the assay. An enhancer plasmid and a reporter plasmid are brought into proximity by a fgRNA Cas staple complex binding both plasmids. Target sequences were included in multiple repeats prior to the functional elements. Firefly luciferase serves as the reporter gene, the enhancer is constituted by multiple Gal4 repeats that are bound by a Gal4-VP64 fusion. B, results of using a fgRNA Cas staple for trans activation of firefly luciferase. Firefly luciferase activity was measured 48h post transfection and normalized against ubiquitously expressed Renilla luciferase. Statistical significance was calculated with ordinary One-way ANOVA with Dunn's method for multiple comparisons (*p < 0.05; **p < 0.01; ***p < 0.001; mean +/- SD). The assay included sgRNAs and fgRNAs with linker lengths from 0 nt to 40 nt.

To further improve the efficiency, we introduced a GSG-linker between the Cas proteins. Similar to the initial tests for the fgRNAs, the capability of our fusion Cas constructs was tested by assessing the indel frequencies via a T7EI assay. For this, the same target sequences as before were used, namely FANCF and VEGFA in both configurations. We included biological duplicates in this assay.
The fusion Cas proteins allowed for editing in general, with single gRNAs and fgRNAs (see Fig. 10A and 10B). The editing rate with Cas9 was higher overall. Especially for the fgRNA fusion Cas combinations, the Cas12a editing rates were significantly lower, dropping to about 1%. At the same time, targeting VEGFA resulted in a higher editing efficiency than FANCF, as one can see in our fgRNA engineering cycle iteration 3.

Figure 10: Editing rates for fusion guide RNAs with fusion Cas proteins. A and B, editing efficiencies were measured by assessing indel rates 72h post transfection via T7EI assay. Editing % was calculated through band intensities. The schematic at the top shows the composition of the fgRNA. Below each spacer is the targeted gene. The symbols below indicate which parts are included in each sample. Cas proteins linked by a dash ("–") were fused to each other. Biological replicates are marked as individual dots.

To further investigate the characteristics of the fusion Cas system, CCR5 was again included as a different target site, as well as a 20 nt linker between the two spacers. In this case, while the combination of fusion Cas proteins with sgRNAs allowed for a high editing rate of 15% to 25% at both target sites, VEGFA in combination with Cas12a was much more consistent at around 20% than CCR5 with Cas9 at about 2% (see Fig. 11). The inclusion of a linker had no significant impact on the editing rates with a fgRNA.

Figure 11: Targeting VEGFA and CCR5 with fusion Cas proteins and fgRNAs. Editing efficiencies were measured by assessing indel rates 72h post transfection via T7EI assay. Editing % was calculated through band intensities. The schematic at the top shows the composition of the fgRNA. Below each spacer is the targeted gene. The symbols below indicate which parts are included in each sample. Cas proteins linked by a dash ("–") were fused to each other.

Continuing in a similar manner as for the fgRNAs, we next focused on inducing proximity between genetic loci. The same assay was used, with one enhancer plasmid and one reporter plasmid. Though less distinct than the results obtained with fgRNAs alone, the fusion Cas proteins can be used to increase expression levels of the reporter firefly luciferase (see Fig. 12). While using sgRNAs resulted in relative luciferase activity similar to the negative control, between 0.1 and 0.2, using an fgRNA with a 20 to 30 nt linker consistently resulted in activities at 0.25. Fusion guide RNAs without a linker and with a 40 nt linker had on average about the same activity, but with a higher spread over the biological replicates. Further learnings from this assay and how we want to continue in the future are laid out in our Cas staples engineering cycle iteration 2.

Figure 12: Results of Implementing Fusion Cas Proteins in Trans Activation of a Reporter. Firefly luciferase activity was measured 48h post transfection and normalized against ubiquitously expressed Renilla luciferase. Statistical significance was calculated with ordinary One-way ANOVA with Dunn's method for multiple comparisons (*p < 0.05; **p < 0.01; ***p < 0.001; mean +/- SD). Fusion Cas proteins were paired with sgRNAs and fgRNAs with linker lengths from 0 nt to 40 nt.

Discussion

The use of fgRNAs for multiplex gene editing with Cas9 and Cas12a was shown to be effective. Though the fgRNAs have proven to allow for similar editing rates compared to sgRNAs in some cases, a number of factors have been identified that have a higher impact on the efficiency of fgRNAs, including the targeted genomic locus and the Cas ortholog. The editing rate varies considerably for different genes. In the various assays conducted in this subproject, VEGFA showed a relatively consistent and high editing rate. The editing rate of FANCF was observed to be fluctuating and lower in most cases.

This is likely due to differences in chromatin accessibility, which allows the Cas proteins to reach some parts of the DNA more effectively than others (Klemm et al., 2019). Cas9 and Cas12a appear to not only have a varying editing rate overall, but also show different responses to the fusion of the gRNAs or the Cas proteins themselves. Comparing editing rates on individual genes indicates an overall increased performance of Cas9 with fgRNAs.

One potential explanation for this observation might be a higher tolerance of Cas9 to modifications made to the gRNA. In contrast, the addition of a linker appears to have no impact on the editing rates. A reason for that might be that either both Cas proteins are not able to bind together in general, or that 20 nt are not enough space between the spacers to fit both Cas proteins.

While the fusion Cas protein constructs also worked in combination with fgRNAs, their overall performance was better in the presence of sgRNAs. Having two individual connections in the Cas staple complex might result in a more rigid assembly that would require precise coordination of the discrete linkers. To achieve this, a vast range of protein and gRNA linkers would need to be tested in different combinations, allowing for a better assessment of the exact effect they have on the system and of how these linkers would need to be combined to allow for effective formation of the Cas staple complex.

In comparison to the results presented by Kweon et al., Cas9 was also observed to have a higher editing rate in general. However, the difference in editing rates for different genes was not significant and the results show the editing rates for FANCF were in fact higher. To further assess the actual difference the targeted gene makes, a large scale screen of multiple different genes that are targeted under the same conditions would be necessary.

The main objective of this project was to establish fgRNA-based Cas staples as an effective tool to bring genetic loci of choice into proximity. We first showed the fgRNAs' ability to bind DNA in conjunction with a dead Cas protein by applying CRISPRa. In the subsequent assays, we were able to show that our fusion Cas construct is able to bring and hold two genomic loci in proximity in a way that allows for gene activation. By this we not only confirm that changes in the 3D genome structure can determine whether a gene is activated, but also show that our Cas staple system is capable of exactly that.

Although the fold changes in expression are considerably lower than with simple CRISPRa, this may allow for much more precise manipulation of gene expression. By expanding the system to endogenous loci and enhancers, it could easily be linked to complex regulatory systems within the cell, permitting precisely engineered gene expression for the locus of choice.

Introducing fusion Cas proteins into the Cas staple system showed first promising results, though the effect was not significant enough to be considered a clear increase in gene expression. This again shows that further assays are required to understand the characteristics of fusion Cas proteins, especially in the context of binding DNA.

The use of CRISPRa with fgRNAs aligns with previously published results, which also showed an increase in gene expression through the recruitment of an activator (Kweon et al., 2017). Changing the 3D structure to induce expression with fgRNA staples is to our knowledge a completely new approach. In a similar way, this has been done with the LADL system (Kim et al., 2019). Other approaches that use a more complex, modified version of Cas9 also showed that gene expression can be altered by chemically inducing rearrangements on a 3D level (Morgan et al., 2017). These publications also show that hijacking genomic loci for this procedure is feasible.

Outlook

To further prove our Cas staples to be applicable in shaping the human genome, a few new assays were designed to be conducted in the future. We are already in the process of designing constructs for the next experiments, which would focus on using human genome enhancers to increase expression of target genes. This extension of the system would further show how the introduction of Cas staples changing the arrangement of DNA is enough to influence expression levels at will.

In a very recent publication, the usage of so-called double guide RNAs (dgRNAs) for a similar application was proposed (Yang et al., 2024). These dgRNAs consist of two consecutive Cas9 gRNAs rather than a Cas12a and a Cas9 gRNA. While this publication only showed application in a bacterial context, we want to properly assess and compare both systems in the eukaryotic environment. We are currently in the cloning phase of dgRNAs that are usable in HEK293T cells and plan to perform the same assays as we did with our fgRNA system to thoroughly compare these two ways of using gRNA fusions as the basis of DNA stapling.

On the one hand, the fgRNA system would be able to address complex systems by multiplexing different target sites, and we already have an experiment running to confirm this. On the other hand, the dgRNA needs only one Cas ortholog to form the staple complex, lowering the amount of protein-coding DNA that needs to be introduced into the host cell. This could make it easier to package in delivery systems like adeno-associated viruses, thereby enabling its use in gene therapy. Having both of these systems available in our toolbox further increases the range of applications that we can provide to researchers.

Furthermore, we plan on improving the fusion Cas staple to reach expression levels similar to those seen for the fgRNAs. First thoughts on new approaches are described in our Cas staples engineering cycle iteration 2. This would again include the screening of different protein linkers combined with different fgRNAs. The establishment of a stable protein linker is the basis for making it responsive to outside stimuli like certain chemicals or proteases. In the context of staple extension, we show the successful cleavage of a peptide linker by cathepsin B, allowing for the functionalization of a Cas staple construct. We would thereby enable our system to adapt to different conditions that can be specific to certain types of cancer, such as tumor microenvironments or the overexpression of distinct proteins (Anderson & Simon, 2020).

Staple Functionalization

To enable the functionalization of our PICasSO toolbox for a wide range of therapeutic and synthetic biology applications, we designed cathepsin B-responsive peptide linkers to selectively control the connection of our protein staples. Cathepsin B is a lysosomal protease enriched in various cancer types. We overexpressed cathepsin B in HEK293T cells to investigate its ability to cleave different peptide linkers using a fluorescence readout assay. We successfully demonstrated that the GFLG linker, which can be incorporated into our staple constructs, was efficiently cleaved by cathepsin B in vivo when cells were treated with doxorubicin. Furthermore, we showed that wild-type cathepsin B matured into its active forms under these conditions.

Introduction

Cathepsin B is a cysteine protease typically located in lysosomes or secreted outside the cell, where it degrades proteins of the extracellular matrix. It plays a critical role in apoptosis and is significantly overexpressed in various cancer types, including breast and colorectal cancer (Ruan et al., 2015). The overexpression of cathepsin B is associated with tumor invasion and metastasis. Various stimuli, such as ischemia, bile acids and TNFα, can induce cathepsin B-mediated apoptosis. In this process, lysosomes release large amounts of cathepsin B into the cytosol, where it cleaves anti-apoptotic factors like Bcl-2 and XIAP, leading to an increase in apoptotic proteases, such as caspase-3 (Bien et al., 2010).
The significance of cathepsin B in cancer progression is well-documented, with studies showing elevated cathepsin B levels in cancerous tissues compared with non-cancerous tissues (Ruan et al., 2015). Research has shown a connection between elevated levels of cathepsin B and enhanced angiogenesis, invasion and metastasis (Ruan et al., 2015). Given its important role in tumor progression, cathepsin B is considered a potential therapeutic target (Ruan et al., 2015) or prodrug-activating enzyme (Zhong et al., 2013). Proteolytic cleavage of pro-biologics allows for the precise temporal and spatial regulation of biopharmaceutical activity in therapeutic strategies (Bleuez et al., 2022).

Aim of This Subproject

We wanted to explore the potential of our PICasSO platform approach for therapeutic applications by designing protein-based DNA staples that are cleaved in response to the overexpression of cathepsin B in cancerous tissues. Furthermore, we demonstrated doxorubicin-dependent cathepsin B cleavage of one out of five documented linkers (Jin et al., 2022; Shim et al., 2022; Wang et al., 2024) in HEK293T cells.
Additionally, we developed a construct consisting of a dead Cas9 (dCas9) connected to two SV40 nuclear localization sequences and two caged intein fragments (NpuN, NpuC) (Gramespacher et al., 2017). The cages were connected to the split intein via cathepsin B-responsive linkers, preventing fragment association and consequent protein trans-splicing. In cancer cells overexpressing cathepsin B, the linkers would be cleaved, thus uncaging the inteins and resulting in association of NpuN and NpuC. In the subsequent protein trans-splicing reaction NpuN and NpuC cleave themselves out of the construct, linking dCas9 proteins to each other.
With these approaches, we aimed to induce structural and functional changes in protein-stapled DNA selectively in cancerous tissue.

Results

To achieve cathepsin B cleavage-induced Cas stapling, catalytically active cathepsin B needs to be expressed in the cytosol. Therefore, we investigated the expression of different cathepsin B constructs under different conditions in HEK293T cells. In addition to wild-type (wt) cathepsin B, we also cloned a truncated and mutated version of cathepsin B (Δ1-20, D22A, H110A, R116A) and compared protein expression of both constructs in doxorubicin-treated and untreated conditions.
Figure 13 shows a Western blot of the wild-type (wt) version of cathepsin B as well as the truncated and mutated version of cathepsin B (Δ1-20, D22A, H110A, R116A). The truncated and mutated version lacks the N-terminal signal peptide responsible for co-translational targeting to the rough endoplasmic reticulum, which should lead to cytoplasmic expression of cathepsin B (Müntener et al., 2005). Additionally, the three point mutations should disrupt the conformation of an occluding loop, increasing cathepsin B activity in the cytoplasm (Nägler et al., 1997). Cells expressing either cathepsin B version were treated with 500 nM doxorubicin (dox) 24 hours post-transfection and incubated for an additional 24 hours. For each condition, three replicates were blotted. We observed no differences in protein expression levels between the dox-treated and untreated wt versions of cathepsin B. For the truncated and mutated version of cathepsin B, however, only the untreated samples showed the corresponding band at approximately 36 kDa expected for this version of cathepsin B. Additionally, the bands of the truncated and mutated version appeared much weaker than those of the wt, indicating poorer protein expression. The housekeeping protein β-tubulin is visible in all samples at approximately 55 kDa. The wt cathepsin B additionally showed bands for pro-cathepsin B at approximately 42 kDa, a mature single-chain version of cathepsin B at approximately 33 kDa and a mature double-chain version at approximately 26 kDa.

Figure 13: Western Blot of Two Versions of Cathepsin B With and Without Doxorubicin. From left to right: protein ladder, wild-type (wt) cathepsin B with (+) and without (-) doxorubicin, truncated and mutated version of cathepsin B with (+) and without (-) doxorubicin. The housekeeping protein β-tubulin is visible in all samples at 55 kDa. The wt cathepsin B also shows bands for pro-cathepsin B at 42 kDa, mature single-chain cathepsin B at 33 kDa and mature double-chain cathepsin B at 26 kDa. The band for the truncated and mutated version of cathepsin B can be seen in the samples without doxorubicin at 36 kDa.

Based on the results of the Western blot, we decided to use wt cathepsin B to investigate the proteolytic cleavage of different peptide linkers. To this end, we used a fluorescence readout assay based on VP64-induced mCherry expression. We successfully identified one linker that could be cleaved efficiently by cathepsin B in vivo.
To investigate cathepsin B cleavage of different linkers, we incorporated five peptide linkers from literature (GFLG, FFRG, FRRL, VA, FK) (Jin et al., 2022; Shim et al., 2022; Wang et al., 2024) in between the DNA binding domain (DBD) of Gal4 and the transactivator domain VP64 (as previously described in Muench et al. (2023)). Binding of Gal4-DBD upstream of a gene encoding the fluorescence protein mCherry induces overexpression of mCherry by VP64. Consequently, separation of Gal4-DBD and VP64 by cathepsin B cleavage of the peptide linker reduces mCherry expression (see Fig. 14).

Figure 14: Schematic Illustration of the Cathepsin B Fluorescence Readout Assay. The DNA binding domain (DBD) of Gal4 is conjugated to the transactivator domain VP64 via a cathepsin B-cleavable peptide linker. Binding of the Gal4-DBD to the upstream activating sequence (UAS) in proximity to the mCherry gene induces mCherry overexpression via VP64. Cathepsin B cleavage of the linker separates Gal4-DBD and VP64 and consequently reduces mCherry expression.

We conducted fluorescence readout assays in HEK293T cells. The transfected plasmids encoded mCherry, the Gal4-VP64 constructs with different linkers, and cathepsin B (see Fig. 15). Additionally, a stuffer plasmid and a plasmid encoding eGFP were transfected for normalization. Based on preliminary tests, we determined that adding doxorubicin to the cell supernatant 24 hours post transfection at a final concentration of 500 nM was the optimal procedure. This induces the lysosomal escape of mature cathepsin B (Bien et al., 2004). 48 hours post transfection, we measured the fluorescence intensities of mCherry and eGFP and took micrographs of the transfected cells with a fluorescence microscope.

Figure 15: Transfection Plan of HEK293T Cells for Fluorescence Readout Experiments. HEK293T cells in a 96-well plate were transfected with plasmids encoding mCherry, the Gal4-VP64 constructs with different linkers, and cathepsin B (CatB). Additionally, a stuffer plasmid and a plasmid encoding eGFP were transfected for normalization.

In this experiment, mCherry and eGFP were evaluated as reporters to quantify the efficiency of cathepsin B-mediated cleavage of Gal4-Linker-VP64 constructs in HEK293T cells.
Figure 16 shows micrographs taken with a fluorescence microscope of three different conditions: the null control, the negative control and the test sample. Figure 17 shows the corresponding graphs. All samples were transfected with plasmids encoding eGFP and mCherry. The null control and the negative control were not transfected with the plasmid encoding cathepsin B. The null control was also not transfected with any of the plasmids encoding Gal4-Linker-VP64 constructs. The test sample was transfected with 30 ng of the plasmid encoding cathepsin B and with the plasmid encoding Gal4-GFLG-VP64. As expected, the null control exhibited no detectable mCherry signal, with corresponding fluorescence intensity measurements at baseline levels: since no plasmid encoding a Gal4-VP64 construct was transfected, mCherry overexpression via VP64 could not be induced. However, we observed a high fluorescence intensity for eGFP, indicating that the transfection was successful. The negative control showed strong signals of both mCherry and eGFP, indicating that the transfection was successful and that our mCherry readout system is functional. Interestingly, some cells seemed to express only mCherry or only eGFP, and some cells showed no fluorescence signal. The test sample showed less eGFP and mCherry fluorescence than the negative control. A reduced mCherry fluorescence intensity was expected, as the transfected cells express cathepsin B, which cleaves the linker and thereby decreases mCherry expression.

Figure 16: Micrographs of HEK293T Cells in Two Control Conditions and One Test Condition. Micrographs were taken with a fluorescence microscope 48 hours post transfection. An overlay of brightfield, eGFP and mCherry is shown. All samples were transfected with plasmids encoding eGFP. The null control and the negative control were not transfected with the plasmid encoding cathepsin B. The null control was also not transfected with any of the plasmids encoding Gal4-Linker-VP64 constructs. The test sample was transfected with 30 ng of the plasmid encoding cathepsin B and with the plasmid encoding Gal4-GFLG-VP64. The micrograph of the test sample is not from the same biological replicate as the micrographs of the two controls.

Figure 17: Fluorescence Readout After 48 Hours for Two Control Conditions and One Test Condition. The fluorescence intensity for mCherry was measured for the GFLG linker and normalized against a baseline eGFP fluorescence intensity. The null control and the negative control were not transfected with the plasmid encoding cathepsin B. The null control was also not transfected with any of the plasmids encoding Gal4-Linker-VP64 constructs. The test sample was transfected with 30 ng of the plasmid encoding cathepsin B and with the plasmid encoding Gal4-GFLG-VP64.

This experiment investigated the cleavage of five peptide linkers by cathepsin B in vivo, analyzing mCherry fluorescence intensity across the different linkers and transfection conditions.
Figure 18 shows the fluorescence intensity of mCherry for five different peptide linkers (GFLG, FFRG, FRRL, VA, FK). The negative control was not transfected with the plasmid encoding cathepsin B. We investigated two different test conditions, in which we either transfected 30 ng or 60 ng of the plasmid encoding cathepsin B. The fluorescence intensity of mCherry was normalized by the measured fluorescence intensity of eGFP in each condition. Additionally, the values for 30 ng and 60 ng cathepsin B were normalized against the corresponding negative controls. One data point for the VA linker, transfected with 60 ng of the plasmid encoding cathepsin B, was excluded due to severe deviation from the other values. We conducted a two-way analysis of variance (ANOVA) to assess the significance of the observed differences between the negative control and the test conditions for each linker. As the negative control did not contain the plasmid encoding cathepsin B, we expected the measured fluorescence intensity of mCherry to be the highest in these conditions. However, this was only observed for the GFLG and FK linkers. Contrary to our expectations, the fluorescence intensity of the negative control was the lowest out of the three conditions tested for the remaining linkers. It appears that the addition of the plasmid encoding cathepsin B increases mCherry fluorescence intensity when the linker is not cleaved. However, this increase is only significant for the FFRG linker in the 60 ng condition. For the GFLG linker, we observed significant decreases in fluorescence intensity between the negative control and both test conditions, with no difference between the 30 ng and 60 ng conditions. For the FK linker, no significant decreases in fluorescence intensity between the negative control and the test conditions were observed.
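The two-step normalization described above (mCherry divided by eGFP per well, then scaled so that the negative control equals one) can be sketched as follows. This is a minimal illustration, not the analysis script used for Figure 18: the function names and all intensity values are hypothetical placeholders, and we assume that normalization is done against the mean of the negative-control ratios.

```python
from statistics import mean


def normalize_readout(mcherry, egfp):
    """Per-well mCherry/eGFP ratio, correcting for transfection efficiency."""
    if len(mcherry) != len(egfp):
        raise ValueError("mCherry and eGFP lists must have the same length")
    return [m / g for m, g in zip(mcherry, egfp)]


def relative_to_control(sample_ratios, control_ratios):
    """Scale ratios so that the mean of the negative control equals 1."""
    reference = mean(control_ratios)
    return [r / reference for r in sample_ratios]


# Hypothetical plate-reader intensities (arbitrary units), NOT measured data.
neg_mcherry, neg_egfp = [900.0, 1000.0, 1100.0], [500.0, 500.0, 500.0]
test_mcherry, test_egfp = [400.0, 450.0, 350.0], [400.0, 450.0, 350.0]

neg_ratios = normalize_readout(neg_mcherry, neg_egfp)
test_ratios = normalize_readout(test_mcherry, test_egfp)

neg_rel = relative_to_control(neg_ratios, neg_ratios)
test_rel = relative_to_control(test_ratios, neg_ratios)

print(f"negative control: {mean(neg_rel):.2f}")
print(f"test condition:   {mean(test_rel):.2f}")
```

With this convention, a value below one in a test condition indicates reduced mCherry expression relative to the control without cathepsin B, and a value above one indicates an increase.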

Figure 18: Fluorescence Readout After 48 Hours for Five Different Peptide Linkers and Three Different Conditions. The fluorescence intensity for mCherry was measured for five different linkers and normalized against a baseline eGFP fluorescence intensity. The negative control was not transfected with the plasmid encoding cathepsin B. The fluorescence intensity of the negative control was set to one. Two different test conditions were investigated, in which either 30 ng or 60 ng of the plasmid encoding cathepsin B were transfected. The fluorescence readout was analyzed using a two-way ANOVA. Medium: DMEM (10% FCS). P values: ns, P > 0.05; *, P ≤ 0.05; **, P ≤ 0.01; ***, P ≤ 0.001; ****, P ≤ 0.0001.

Discussion

Initially, Western blot analysis confirmed that overexpressed wild-type cathepsin B was processed into its mature single-chain and double-chain forms as previously reported in the literature (Mentlein et al., 2012). This confirms that active cathepsin B is present inside the cells. However, since lysis of the cells also disrupts the lysosomes, we cannot conclude whether this active cathepsin B is also present in the cytosol in vivo. For the truncated and mutated version of cathepsin B, we only observed protein bands in samples that were not treated with doxorubicin. The three samples incubated with doxorubicin for 24 hours showed only faint bands for the housekeeping protein β-tubulin, indicating generally low protein levels. This suggests that the cells were subjected to stress, possibly through doxorubicin or inadequate handling. Along with low transfection efficiency, these factors may have contributed to the low protein levels observed.
Using fluorescence microscopy, we demonstrated cathepsin B-mediated linker cleavage in our reporter assay setup. This assay employed a Gal4-VP64 system, in which cleavage of the peptide linker by cathepsin B reduced mCherry expression, providing a reliable measure of cleavage efficiency. The successful cleavage of linkers was indicated by reduced mCherry fluorescence intensity in cells overexpressing cathepsin B. It was also observed that not all cells were fluorescent in accordance with an expected transfection efficiency. We also noticed that some cells seemed to only express eGFP or mCherry. However, most cells seemed to express both eGFP and mCherry as indicated by their yellow fluorescence.
Finally, we investigated the cleavage of different peptide linker variants by cathepsin B with help of our fluorescence readout assay. Five different peptide linkers (GFLG, FFRG, FRRL, VA, FK) were tested. A two-way ANOVA revealed a significant reduction in fluorescence between the negative control and the two test conditions for the GFLG linker in the presence of cathepsin B, demonstrating efficient cleavage. Surprisingly, a significant increase in fluorescence was observed between the negative control and 60 ng test condition for the FFRG linker. However, since the increase between the negative control and the 30 ng test condition of the same linker was not significant, this difference is likely due to biological variability between the samples.
Additionally, our cathepsin B-cleavable linker can be combined with caged inteins (Gramespacher et al., 2017) conjugated to a dead Cas9 to selectively induce Cas-stapling in the presence of cathepsin B. Corresponding parts can be found in our part collection and in the iGEM registry, but still require characterization.
In conclusion, these findings demonstrate that our fluorescence-based readout assay can reliably detect cathepsin B-mediated cleavage of peptide linkers, with the GFLG linker showing particular susceptibility to cleavage. This makes GFLG a promising candidate for targeted applications in environments with upregulated cathepsin B activity, such as cancer tissues.

Outlook

Future experiments will investigate the influence of different doxorubicin concentrations on the activity of cathepsin B in the cytosol. Different linker lengths or a repeat of the GFLG linker could also be tested. Additionally, this system could be used for other proteases that are involved in certain diseases, such as different caspases in neurodegenerative conditions (Espinosa-Oliva et al., 2019).

Readout Systems

By developing custom EMSA and FRET assays, we established key tools for the validation and characterization of various protein staples. Beginning with the successful construction of basic staples, these assays provided key insights into the characteristics of DNA binding proteins and stapling mechanisms. By rigorously testing our workflow, we developed foundational techniques that future iGEMers and researchers can leverage to engineer and optimize protein-based DNA-folding systems.

Introduction

Selection of DNA-Binding Proteins TetR, Oct1 and GCN4

To establish easy-to-use assays for the characterization of protein staples, we started building on basic protein parts. Instead of Cas proteins, we used the Tetracycline Repressor (TetR) and the human transcription factor Oct1 as DNA-binding domains, as they are well-characterized proteins with known binding properties. TetR is a bacterial transcriptional repressor that binds specifically to the tetO operator sequence and dissociates in the presence of tetracycline. It is widely adopted as a synthetic gene regulation tool, both in prokaryotic and eukaryotic systems (Berens & Hillen, 2004). Similarly, Oct1, a POU domain transcription factor involved in immune cell regulation and stress response, has been shown to bind tightly to its octamer DNA motif (Lundbäck et al., 2000; Stepchenko et al., 2021). The DNA-binding domain of Oct1 can be readily fused to other proteins, providing increased protein solubility and strong DNA-binding capabilities, even during protein purification (J. H. Park et al., 2013; Y. Park et al., 2020). By fusing TetR to Oct1, we created "simple staple proteins" able to bind both operator sequences simultaneously. In addition to TetR and Oct1, we used small basic-region leucine zipper (bZip) proteins, as they are among the most compact DNA-binding domains known to date. The motif consists of a coiled-coil leucine zipper dimerization domain and a highly charged basic region that binds to DNA (Hollenbeck & Oakley, 2000). One well-characterized example is the General Control Protein 4 (GCN4), a transcriptional activator from yeast (Arndt & Fink, 1986). At its N-terminus, GCN4 contains basic residues, the so-called bZip domain, through which it binds specifically to the CRE (cyclic AMP response element) DNA sequence (Hollenbeck et al., 2002).
A variant of GCN4 with the DNA-binding bZip domain at the C-terminus (rGCN4) has been engineered to bind to the inverted CRE sequence, INV2, with similar affinity (Hollenbeck et al., 2001). By genetically fusing GCN4 to rGCN4, we created a small bivalent DNA-binding staple with fewer than 150 amino acids.

Förster Resonance Energy Transfer (FRET)

Förster Resonance Energy Transfer (FRET) is a distance-dependent physical process in which energy is transferred non-radiatively from an excited donor fluorophore to an acceptor fluorophore via dipole-dipole coupling. The efficiency of energy transfer is highly sensitive to the distance between the donor and acceptor, typically in the range of 1-10 nm, making FRET ideal for studying molecular proximity (Hochreiter et al., 2019). This proximity sensitivity is governed by the Förster radius (R₀), which is the distance at which 50% energy transfer occurs. Factors affecting FRET efficiency include the overlap of the donor's emission spectrum with the acceptor's absorption spectrum, the quantum yield of the donor, and the relative orientation of the fluorophores (Wu & Brand, 1994). These characteristics allow FRET to detect interactions such as protein-DNA binding or DNA-DNA contacts in real time. For our assay, we selected mNeonGreen and mScarlet-I as donor and acceptor fluorophores, respectively, due to their strong fluorescence, spectral overlap, and minimal photobleaching, ensuring high FRET efficiency in our system (Bindels et al., 2017; Shaner et al., 2013). FRET's sensitivity to small changes in distance makes it especially powerful for analyzing molecular interactions in living cells (Okamoto & Sako, 2017).
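To illustrate how steeply the transfer efficiency falls off with distance, the standard Förster relation E = 1 / (1 + (r/R₀)⁶) can be evaluated for a few donor-acceptor distances. The sketch below is for illustration only: the value used for R₀ is a hypothetical placeholder, not a measured Förster radius for the mNeonGreen/mScarlet-I pair.

```python
def fret_efficiency(r_nm: float, r0_nm: float) -> float:
    """Förster relation: E = 1 / (1 + (r / R0)^6)."""
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


# Hypothetical Förster radius in nm (placeholder, not a measured value).
R0_NM = 6.0

for r in (3.0, 6.0, 9.0, 12.0):
    print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r, R0_NM):.3f}")
```

By definition the efficiency is exactly 0.5 at r = R₀, and because of the sixth-power dependence, doubling the distance to 2 R₀ reduces it to 1/65, which is why FRET only reports on contacts within a few nanometers.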

Electrophoretic Mobility Shift Assay (EMSA)

The electrophoretic mobility shift assay (EMSA) is a widely adopted method used to study DNA-protein interactions. EMSA exploits the fact that nucleic acids bound to proteins have reduced electrophoretic mobility compared to their unbound counterparts (Hellman & Fried, 2007). Mobility-shift assays can be used to qualitatively assess DNA binding capabilities as well as for the quantitative determination of binding stoichiometries and affinities, such as the apparent dissociation constant (Kd) (Fried, 1989).

Aim of This Subproject

Engineering a solid and versatile toolbox is a huge challenge, especially when working with complex systems such as DNA-bound protein staples. To systematically characterize these components, we first set out to develop a well-tested collection of assays that lay the foundation for the investigation of the more complex aspects of our PICasSO toolbox. These assays provide essential tools for studying protein-DNA interactions and DNA-DNA proximity. We used electrophoretic mobility shift assays (EMSA) to analyze protein-DNA binding kinetics and sequence specificity in vitro and a Förster resonance energy transfer (FRET) assay to detect interactions in vivo. Together, these assays form the backbone of our experimental approach and were utilized to systematically analyze and create new staples.

Results

The FRET assay was developed using a two-plasmid system in bacterial cells. After testing different constructs, our final expression plasmid contains a tetR binding site and expresses three key proteins under the control of a single T7 promoter in a polycistronic operon: (1) tetR-Oct1, our simple staple fusion protein that acts as a bivalent DNA binding protein, tethering two plasmids via tetR and Oct1 binding sites; (2) Oct1-mNeonGreen, serving as the FRET donor; and (3) tetR-mScarlet-I, the FRET acceptor. This ensures all three proteins are co-expressed in similar stoichiometry. The folding plasmid contains an Oct1 binding site for the staple and FRET donor binding.

When tetR-Oct1 binds the respective sites on both plasmids, mNeonGreen and mScarlet-I are brought into proximity, facilitating FRET between the two fluorophores (see Fig. 19). Successful stapling of the plasmids results in increased energy transfer from mNeonGreen to mScarlet-I, which can be detected by exciting mNeonGreen and measuring enhanced emission from mScarlet-I. A positive control, consisting of a direct fusion of mNeonGreen and mScarlet-I, ensures maximal FRET efficiency and serves as a benchmark for the assay.

Figure 19: Overview of DNA stapling and FRET measurement

Fluorescence intensity of mNeonGreen and mScarlet-I, normalized to OD600, was measured 18 h after inducing staple protein expression with varying concentrations of IPTG (see Fig. 20A and 20B). Counterintuitively, fluorescence increased with decreasing IPTG concentrations, likely because the high burden of strong induction slowed culture growth. Fluorescence intensity of the positive control was significantly stronger than that of the negative control and the staple. The negative control and the staple, both of which carried the same expression plasmid construct, had similar fluorescence intensities for mNeonGreen and mScarlet-I down to approximately 0.05 mM IPTG. Lower concentrations resulted in strong discrepancies. To ensure comparability between the negative control and the staple, further fluorescence intensity measurements were conducted after induction with 0.05 mM IPTG. In this experiment, only minor differences were detected for the individual fluorescent proteins between the non-connected negative control and the stapled sample. When we measured FRET efficiency instead, we detected a strong increase for the staple compared to the negative control, indicating successful induction of spatial proximity between both DNA strands (see Fig. 20C).

Figure 20: Fluorescence measurement of mNeonGreen, mScarlet-I and FRET. Fluorescent measurement, normalized to cell count, of mNeonGreen (ex. 490 nm, em. 530 nm), mScarlet-I (ex. 560 nm, em. 600 nm), and FRET (ex. 490 nm, em. 600 nm) in E. coli, 18 h after induction with 0.025 mM IPTG. Data is presented as mean +/- SD. A, B Fluorescence intensity of mNeonGreen and mScarlet-I with different IPTG concentrations. C Fluorescence intensity of FRET pair after induction with 0.05 mM IPTG. (n = 3) Statistical significance was determined with Ordinary two-way ANOVA with Šidák's multiple comparison test, with a single pooled variance. *p < 0.05, ****p < 0.001. Only significant results are shown.

To better characterize the mini staples described above and to perform EMSA experiments, we aimed to purify all of them. The DNA binding proteins TetR and Oct1-DBD were fused to mScarlet-I and mNeonGreen, respectively, and to each other, each harboring a His6-tag. The bZip proteins GCN4, rGCN4 and their fusion bGCN4 were fused N-terminally to a FLAG-tag (DYKDDDDK). All proteins could be readily expressed under control of a T7 promoter in E. coli BL21 DE3 and purified with Ni-NTA or Anti-FLAG affinity columns for His-tagged and FLAG-tagged proteins, respectively (see Fig. 21).

Figure 21: SDS-PAGE analysis of purified DNA binding proteins. A) Analysis of eluate fractions of purified protein taken during Ni-NTA affinity chromatography. B) Analysis of eluate fractions of purified protein taken during Anti-FLAG affinity chromatography. 1 µL of each sample was prepared with Laemmli buffer and loaded on a 4-15% TGX gel. Bands of interest are highlighted by red boxes.

To assess DNA binding, a qualitative EMSA was performed with the purified Oct1-DBD, TetR and bZip (GCN4, rGCN4, bGCN4) proteins, and three different buffer systems were tested. DNA binding could be detected for the single purified proteins, including Oct1-DBD, TetR, GCN4 and rGCN4, but not for the bGCN4 fusion (see Fig. 22). Binding buffer 1 (137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 1.8 mM KH2PO4, 0.1 % (v/v) IGEPAL® CA-360, 1 mM EDTA; also described by Hollenbeck et al. (2001)) was the best-performing buffer and was used for subsequent experiments.

Figure 22: Qualitative EMSA results. Gel electrophoresis was performed in TBE buffer with a 10 % TGX gel pre-equilibrated with TBE. Bands were visualized by post-staining with SYBR-Safe. A, 8 µM purified mNeonGreen-Oct1 fusion protein was equilibrated with different concentrations (1000 nM, 100 nM, 10 nM) of DNA containing three Oct1 binding sites in different buffer compositions (Binding buffer 1: 137 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 1.8 mM KH2PO4, 0.1 % (v/v) IGEPAL® CA-360, 1 mM EDTA; Binding buffer 2: 10 mM Tris, 50 mM KCl; NaP250: 50 mM NaH2PO4, 150 mM NaCl, 250 mM imidazole). B, Purified TetR-Oct1 fusion protein was incubated with 0.5 µM DNA containing either a TetR or an Oct1 binding site. C, 200 µM purified protein was equilibrated with 0.5 µM DNA containing one target site.

To further analyze DNA binding, quantitative shift assays were performed for GCN4 and rGCN4. Here, 0.5 µM DNA was incubated with varying concentrations of protein until equilibration. After electrophoresis, bands were stained with SYBR-Safe and quantified based on pixel intensity. The obtained values were fitted to Equation 1, which describes the formation of a 2:1 protein-DNA complex.

$$\Theta_{\mathrm{app}} = \Theta_{\min} + (\Theta_{\max} - \Theta_{\min}) \, \frac{K_a^2 \, [L]_{\mathrm{tot}}^2}{1 + K_a^2 \, [L]_{\mathrm{tot}}^2}$$
Equation 1


Here, [L]tot is the total protein monomer concentration and Ka is the apparent monomeric equilibrium constant. Θmin and Θmax are the experimentally determined site saturation values (for this experiment, 0 and 1 were chosen for min and max, respectively). GCN4 binds to its optimal DNA binding motif with an apparent dissociation constant Kd of (0.293 ± 0.033) × 10⁻⁶ M, which is almost identical to the Kd of (0.298 ± 0.030) × 10⁻⁶ M for rGCN4 binding to INV2 (see Fig. 23).
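A fit of this kind can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the procedure used to obtain the values above: the data points are synthetic (generated from the model itself with a hypothetical Ka), the function names are our own, Θmin and Θmax are fixed to 0 and 1, and Ka is estimated with a simple golden-section search on log10(Ka) instead of a dedicated curve-fitting library. We also assume that the reported apparent Kd corresponds to 1/Ka, which is the protein concentration at half-saturation in this model.

```python
import math


def theta_app(l_tot, ka, theta_min=0.0, theta_max=1.0):
    """Bound DNA fraction for a 2:1 protein-DNA complex (Equation 1)."""
    x = (ka * l_tot) ** 2
    return theta_min + (theta_max - theta_min) * x / (1.0 + x)


def fraction_bound(bound_px, unbound_px):
    """Bound fraction from gel band pixel intensities."""
    return bound_px / (bound_px + unbound_px)


def fit_ka(l_tot_values, theta_values, lo=1e4, hi=1e9, iters=200):
    """Least-squares estimate of Ka (in 1/M) via golden-section search."""
    def sse(log_ka):
        ka = 10.0 ** log_ka
        return sum((theta_app(l, ka) - t) ** 2
                   for l, t in zip(l_tot_values, theta_values))

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = math.log10(lo), math.log10(hi)
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(c) < sse(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)


# Synthetic, noise-free example data (NOT measurements).
ka_true = 1.0 / 0.3e-6  # hypothetical Ka in 1/M
protein_m = [0.05e-6, 0.1e-6, 0.2e-6, 0.4e-6, 0.8e-6, 1.6e-6]
theta_obs = [theta_app(l, ka_true) for l in protein_m]

ka_fit = fit_ka(protein_m, theta_obs)
kd_fit = 1.0 / ka_fit  # assumption: apparent Kd = 1 / Ka
print(f"fitted Ka = {ka_fit:.3e} 1/M, apparent Kd = {kd_fit:.3e} M")
```

With real gel data, `theta_obs` would be computed per lane with `fraction_bound`, and replicate measurements would be needed to estimate the uncertainty of the fitted value.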

Figure 23: Kd Calculation of GCN4 and rGCN4. Quantitative assessment of binding affinity for GCN4 and rGCN4. Proteins of varying concentrations were incubated with 0.5 µM DNA in Binding buffer 1, and the bound fraction was determined by dividing the pixel intensity of the bound fraction by the combined pixel intensity of the bound and unbound fractions using ImageJ. At least three separate measurements were conducted for each data point. Values are presented as mean +/- SD.

Discussion

The results from the fluorescence intensity measurements showed stronger expression of the fluorescent proteins for the positive control. Based on our preliminary testing, explained in our engineering cycles, we suspect that a strong metabolic burden influences expression levels, especially for the staple construct. This could be due both to the polycistronic expression of multiple proteins and to the strong T7 promoter resulting in excessively high mRNA levels. The negative control and staple showed strong discrepancies in mNeonGreen and mScarlet-I fluorescence for IPTG concentrations below 0.05 mM. This is surprising, as both samples have the same expression plasmid and only the folding plasmid of the negative control is missing the binding site for Oct1. One possible explanation could be off-target binding in the E. coli genome, resulting in changes in the expression pattern. With 84 specific Oct1 binding sequences previously reported in the genome, this could be a major factor (Y. Park et al., 2020). Studies also showed that Oct1 binds to different target sites, albeit with lower affinity, possibly resulting in even more potential genomic binding sites (Verrijzer et al., 1992). The folding plasmid, harboring a p15A origin of replication, is expected to have around 11 copies per cell (Shao et al., 2021), resulting in approximately 130 Oct1 binding sites, which compete with genomic binding sites. Further experiments are needed to better understand this observation. All proteins for in vitro characterization could be expressed and purified. Some nonspecific proteins still remained in the eluate as detected by SDS-PAGE, but this is to be expected given the well-known nonspecific binding of cellular proteins to the Ni-NTA affinity matrix. Initial qualitative tests showed successful binding of the single DNA binding proteins to the DNA.
Since TetR and Oct1 were fused to mScarlet-I and mNeonGreen, respectively, we could show that these proteins accept fusions while still being able to bind DNA, which also matches previous observations (Gossen & Bujard, 1992; J. H. Park et al., 2013b). Bound and unbound fractions were visualized on the gel by SYBR-Safe staining and illumination with a trans-illuminator. To enable exact quantification and to characterize additional DNA binding proteins, the bZip proteins were chosen for quantitative gel shift assays. For the purified proteins GCN4, rGCN4 and the fusion bGCN4, a qualitative gel was run with high protein concentration. Bands for GCN4 and rGCN4 were visible, but no band for bGCN4 could be detected, indicating a lack of DNA binding. This suggests that the dimerization necessary for simultaneous binding of two DNA strands is disrupted by the GSG-linker (Ellenberger et al., 1992; Liu et al., 2006; Lupas et al., 2017; Woolfson, 2023). To better understand possible problems in dimerization, circular dichroism could be used to analyze secondary structure and proper coiled-coil formation (Greenfield, 2006). Further engineering will be required, including the investigation of various linkers with specific properties to ensure correct folding and dimerization (Chen et al., 2013). The apparent dissociation constants calculated for GCN4 ((0.293 ± 0.033) × 10⁻⁶ M) and rGCN4 ((0.298 ± 0.030) × 10⁻⁶ M) are approximately 3-fold and 10-fold higher, respectively, than those described in the literature ((9 ± 6) × 10⁻⁸ M for GCN4 and (2.9 ± 0.8) × 10⁻⁸ M for rGCN4) (Hollenbeck et al., 2001). The differences could be explained by the lower sensitivity of SYBR-Safe staining compared to radio-labeled oligos. Most likely, however, the protein concentration was miscalculated due to the presence of additional (lower intensity) bands in the SDS-PAGE analysis, indicating the co-purification of small amounts of nonspecific proteins.

The FLAG-tag fusion to the N-terminus of proteins could potentially decrease binding affinity, likely due to steric hindrance affecting the interaction with DNA. Interestingly, the differences in binding affinity between GCN4 and rGCN4 appear negligible. Since GCN4 binds to DNA via its N-terminus and rGCN4 binds C-terminally, the FLAG-tag likely does not directly influence DNA binding. However, it may influence the dimerization of the proteins, which is necessary for DNA binding. To further investigate this, the FLAG-tag can be cleaved using an enterokinase and potential changes in binding affinity could be analyzed.

Outlook

Future work will focus on quantifying FRET interaction in vivo to assess stapling efficiency and DNA-DNA proximity more accurately. Additionally, the development of mini staples can be further improved by testing alternative linkers predicted by our dry lab simulations. By systematically assessing different linker sequences, we aim to create functional staples and furthermore lay the foundation for future engineering of small bZip DNA binding domains.
Finally, we are also currently testing out our measurement system with the more complex Cas staples.

Delivery System

As part of the PICasSO toolbox, we propose bacterial conjugation as a cheap, simple and scalable alternative to conventional DNA delivery methods, particularly suited for large plasmid constructs encoding protein-based staples for controlled modulation of mammalian 3D genome organization. Bacterial conjugation is one of the key mechanisms of horizontal gene transfer between bacteria. We validated the proficiency of the RP4 conjugative machinery in mediating conjugation between bacteria on solid media. Future experiments will extend the DNA delivery capabilities of the type IV secretion system (T4SS) to mammalian cells by employing synthetic adhesins to enhance cell-cell contact.

Introduction

The bacterial T4SS is implicated in conjugation, a conserved mechanism of horizontal gene transfer among gram-negative (De La Cruz et al., 2010) and gram-positive (Grohmann et al., 2003) bacteria. In nature, conjugation is a major driver of bacterial genome evolution in several niches ranging from soil and water to biofilms in animal hosts, enabling the utilization of new metabolites, conferring pathogenic properties, and even resistance to heavy metals and antibiotics, under selection pressure (Virolle et al., 2020).

The mechanism of conjugation and the many factors it requires are well characterized, although several questions remain open and the function of some of the structures is still elusive. For instance, the fundamental question of what triggers conjugation remains poorly understood, as we learned from Dr. Christian Lesterlin, an expert in the field.

Moreover, different conjugative systems have been described in gram-negative bacteria (IncF, IncW, IncP, IncN, etc.) all of which are functionally similar. In all these systems, following the establishment of pilus-mediated contact between bacteria (mating pair formation), a mobilizable plasmid is processed in the donor bacterium and a single strand of the plasmid is covalently attached to a protein (relaxase). This DNA-protein complex is then transferred through T4SS into the recipient bacterium, where it is re-circularized and maintained. One of the key elements in this process is the origin of transfer (oriT) sequence in the mobilizable plasmid, which is recognized and nicked by the relaxase to initiate the transfer.

Dr. Christian Lesterlin recommended that we work with the RP4 system (IncP), which encodes short and rigid pili, as it is a well-characterized and established broad-host-range conjugative system. As a consequence of the short and rigid nature of the pili encoded by the RP4 system, conjugation rates tend to be two to four orders of magnitude higher on solid media than in liquid media (Robledo et al., 2022), where shear forces have a destabilizing effect on the mating pair.

Notably, all the events leading to DNA transfer by conjugation are driven by the donor bacterium and its protein components are typically plasmid encoded. Thus, it is technically possible for any type of cell to serve as the recipient (Waters, 2001). In line with this claim, there have been some studies that report on conjugational DNA transfer not only between different bacterial species (Hamilton et al., 2019) but also from bacteria to members of other kingdoms – Agrobacterium tumefaciens that is able to deliver DNA to plants (Gelvin, 2003) being a well-known example. Besides DNA transfer to plants by A. tumefaciens, other bacteria like Bartonella henselae have been shown to deliver DNA to human cells (Schröder et al., 2011) via their T4SS during infection, albeit with very low efficiency.

Fascinated by this phenomenon, we wondered whether conjugation can be engineered as a generalized DNA delivery tool. Indeed, a few studies have succeeded in generating bacteria capable of delivering DNA to yeast using their T4SS (Soltysiak et al., 2019), but little research has been done on engineering bacteria to conjugate with mammalian cells. Taking inspiration from the work of Waters (2001), who showed that it is possible to deliver DNA to mammalian cells by conjugation using the RP4 machinery, we contemplated ways to rationally engineer this system to enable higher efficiency and selectivity.

Since the primary trigger for conjugation remains obscure, we hypothesized that cell-cell contact might be one of the main determinants for conjugation to occur efficiently. In line with this, Robledo et al. (2022) showed that enhanced cell-cell contact mediated by synthetic adhesins led to an increase in conjugation efficiency between bacteria. Specifically, a 100-fold increase in conjugation efficiency was observed in liquid matings between bacteria carrying the RP4 conjugation system.

This combination of knowledge motivated us to test inter-kingdom conjugation between bacteria and mammalian cells using synthetic adhesins against EGFR (a common mammalian surface receptor) to potentially increase the probability of DNA transfer. We received valuable feedback from Dr. Robledo herself, who said that if cell-cell contact were to be the limiting factor, we are likely to succeed with our approach.

What are the benefits of our system?

By establishing our conjugation-based DNA delivery system, we seek to contribute to the available array of DNA delivery methods with an innovative alternative. The advantages of using conjugation over commonly used gene delivery methods include:

  • The ability to deliver large plasmids (~ 100 kb in size): It is known that lipofection efficiency decreases with increasing plasmid size (Kreiss et al., 1999), so conjugation can be employed as an alternative tool to deliver large plasmids to mammalian cells in vitro. Furthermore, our system circumvents the limited coding capacity of AAVs which can only be used to deliver DNA up to ~ 5 kb (Wang et al., 2019).
  • Low cost and easy scalability: propagating bacteria transformed with the conjugative helper plasmid and the mobilizable plasmid is easy to scale up and is considerably cheaper than transfection reagents available on the market.
  • Tunable specificity: The synthetic adhesin module can be modified to target bacteria to specific cell types and therefore allows targeted DNA delivery in a heterogeneous cellular environment, such as in 3D in vitro cancer models.

Aim of This Subproject

We sought to explore the potential of using the bacterial Type IV secretion system (T4SS) as a novel platform to deliver genetic payloads to mammalian cells. Intrigued by the work of Waters (2001), who showed conjugation-mediated DNA transfer between bacteria and mammalian cells, we brainstormed ways to engineer conjugation as a generalized DNA delivery tool. Recognizing cell-cell contact as an important parameter for efficient conjugation between bacteria (Robledo et al., 2022), we were curious to test whether it also promotes conjugation between bacteria and mammalian cells. To this end, we planned to incorporate adhesins into our conjugation system that increase conjugation efficiency and direct plasmid transfer to specific cell types.

Results

Bacterial conjugation can be mediated by plasmids in either a cis or a trans configuration. In the cis configuration, a single plasmid both encodes the conjugation machinery and carries the oriT sequence. Very high conjugation efficiencies can be achieved with this system (frequencies in the order of 10⁻² in 24 hours (Hamilton et al., 2019)), as every successfully conjugated recipient automatically becomes a donor. The trans configuration, on the other hand, is a dual-plasmid system in which the conjugation machinery and the oriT sequence are present on separate plasmids, so that only the plasmid carrying the oriT is mobilized between bacteria. Because transconjugants cannot act as donors themselves, conjugation efficiencies are more modest in the trans configuration (frequencies in the order of 10⁻⁵ in 24 hours (Hamilton et al., 2019)). Nevertheless, we decided to use the trans setup since we did not want to deliver the entire T4SS machinery to our future recipients, mammalian cells. As a second benefit of this strategy, recipient cells do not become conjugation-competent themselves, increasing the safety of our system. Therefore, we aimed to assess the functionality of the RP4 system in the trans configuration.
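The qualitative difference between the two configurations can be illustrated with a toy model. The sketch below is a deliberately simplified mass-action simulation. The function name, the transfer rate and the population sizes are hypothetical, and the model is not meant to reproduce the published frequencies. It only shows why transconjugants that act as new donors (cis) accelerate plasmid spread compared to the trans configuration.

```python
def transconjugants_per_recipient(cis: bool, donors: float, recipients: float,
                                  rate: float, rounds: int) -> float:
    """Toy mass-action model; returns transconjugants per initial recipient.

    All parameters are illustrative assumptions, not measured values.
    """
    total = donors + recipients
    remaining = recipients
    transconjugants = 0.0
    for _ in range(rounds):
        # In cis, transconjugants carry the machinery and act as donors too.
        active_donors = donors + (transconjugants if cis else 0.0)
        # New transfers scale with donor-recipient encounters, capped at what is left.
        new = min(remaining, rate * active_donors * remaining / total)
        remaining -= new
        transconjugants += new
    return transconjugants / recipients


if __name__ == "__main__":
    for label, cis in (("cis", True), ("trans", False)):
        value = transconjugants_per_recipient(cis, donors=1.0, recipients=1.0,
                                              rate=0.1, rounds=10)
        print(f"{label}: {value:.3f} transconjugants per recipient")
```

With identical parameters, both configurations give the same result after a single round and diverge afterwards, since only in cis does the pool of donors grow.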

Our dual-plasmid system consists of pHelper_RP4 (RP4 conjugation helper plasmid) and pmob_b (mobilizable plasmid carrying the oriT sequence). For information on how we cloned these plasmids, please refer to the experiments page. We chose pTA-Mob 2.0 as the backbone for pHelper_RP4 not only because it encodes the RP4 conjugation machinery, but also because it has been shown to mediate inter-kingdom conjugation between bacterial and yeast cells (Soltysiak et al., 2019). Finally, as the RP4 system encodes short and rigid pili, the conjugation efficiency is considerably lower in liquid media than on solid media (Robledo et al., 2022), which is why we decided to test the proficiency of our system in both solid and liquid media.

The donors in our test group carried two plasmids (pHelper_RP4 and pmob_b) and all recipients carried a plasmid conferring resistance to ampicillin (to allow for selection against donor bacteria). Compatibility of the origins of replication (ori) of the different plasmids was ensured.

In addition, prior to mixing the experimental donor and recipient groups (presented in Table 1), the bacterial suspensions were brought to an OD600 of either 1.2 or 10 to test the effect of cell density on conjugation efficiency.
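Adjusting a suspension to a target OD600 is a simple proportionality, sketched below. The calculation assumes that OD600 scales linearly with cell density, which is an approximation at high densities, and the culture volume and measured OD in the example are hypothetical. Reaching an OD600 of 10 typically means pelleting the cells and resuspending them in a smaller volume.

```python
def final_volume_ml(culture_volume_ml: float, measured_od: float,
                    target_od: float) -> float:
    """Volume in which the cells must end up to reach the target OD600.

    Assumes OD600 is proportional to cell density (an approximation).
    A result below the culture volume means concentrating (pellet and
    resuspend); a result above it means diluting.
    """
    if measured_od <= 0 or target_od <= 0:
        raise ValueError("OD600 values must be positive")
    return culture_volume_ml * measured_od / target_od


if __name__ == "__main__":
    # Hypothetical overnight culture: 10 mL measured at OD600 = 2.4.
    for target in (1.2, 10.0):
        volume = final_volume_ml(10.0, 2.4, target)
        print(f"target OD600 {target}: bring cells to {volume:.1f} mL")
```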

Table 1: Donors and recipients in the different controls and in the test group used for the bacteria-bacteria conjugation assay.
Group | Donor | Recipient
Negative control 1 | E.coli 10-beta carrying the RP4 helper plasmid | E.coli BL21(DE3)
Negative control 2 | E.coli 10-beta carrying the mobilizable plasmid | E.coli BL21(DE3)
Positive control | - | E.coli BL21(DE3) carrying the mobilizable plasmid
Test group | E.coli 10-beta carrying both the RP4 helper and the mobilizable plasmid | E.coli BL21(DE3)

We observed efficient conjugation in all test groups (calculated as transconjugants per recipient), with conjugation efficiencies depending on the OD600 of the donors and recipients at the start of conjugation. Notably, a 1000-fold increase in conjugation efficiency was observed between the test groups upon increasing the OD600 from 1.2 to 10 (see Fig. 24). This clearly indicates that cell density, and therefore enhanced cell-cell contact, is an important parameter for conjugation on solid media. Interestingly, at an OD600 of 10 we also observed some apparent natural transformation in the negative control 2 samples, which could be a consequence of the high cell density. In liquid media, however, we could not detect any conjugation, which is consistent with the inefficiency of the wild-type RP4 system in mediating conjugation in liquid media. Moreover, the chemically transformed positive controls grew on selective agar plates, suggesting that the apparent lack of conjugation in liquid media is not due to the experimental conditions but rather to the inefficiency of the conjugative system itself under liquid conditions.

Figure 24: Efficiency of conjugation on solid media. Bar charts depicting the conjugation efficiency (reported as transconjugants/recipient) of the various experimental groups. Three technical replicates were used for the OD600 10 test group.
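Conjugation efficiency, reported here as transconjugants per recipient, can be back-calculated from colony counts on selective plates. The sketch below shows one common way to do this. The function names, colony counts, dilution factors and plated volume are hypothetical and are not the values measured in our assay.

```python
def cfu_per_ml(colonies: int, dilution_factor: float,
               plated_volume_ml: float) -> float:
    """Back-calculate CFU/mL of the undiluted mating mix from one plate."""
    return colonies * dilution_factor / plated_volume_ml


def conjugation_efficiency(transconjugants_cfu_ml: float,
                           recipients_cfu_ml: float) -> float:
    """Transconjugants per recipient."""
    if recipients_cfu_ml <= 0:
        raise ValueError("recipient titer must be positive")
    return transconjugants_cfu_ml / recipients_cfu_ml


if __name__ == "__main__":
    # Hypothetical counts: transconjugants on double-selective plates,
    # recipients on plates selecting for the recipient marker only.
    transconjugants = cfu_per_ml(colonies=50, dilution_factor=1e2,
                                 plated_volume_ml=0.1)
    recipients = cfu_per_ml(colonies=200, dilution_factor=1e6,
                            plated_volume_ml=0.1)
    efficiency = conjugation_efficiency(transconjugants, recipients)
    print(f"{efficiency:.1e} transconjugants per recipient")
```

The fold change between two conditions, such as the two starting densities, is then simply the ratio of their efficiencies.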

The general workflow for a co-immunoprecipitation (coIP) experiment involves using antibodies to pull down the target protein (bait) from a crude or pre-cleared cell lysate. The target protein tends to form complexes with its interaction partners (prey), which are then pulled down along with the bait protein. Following immunoprecipitation, the protein complexes bound to the beads can be analyzed by resolving them using SDS-PAGE and subsequently detecting them via Western Blot.

In our case, we plan to use coIP to prove interaction between the anti-EGFR adhesin and EGFR in vitro. This will be achieved by checking for co-immunoprecipitation of GFP-tagged EGFR along with myc-tagged anti-EGFR adhesins following a myc pulldown. Figure 25 provides a graphic illustration of the concept of this assay.

Figure 25. Co-immunoprecipitation assay. Anti-myc beads bind to the myc-tagged adhesin, which carries a nanobody against EGFR (wild-type 7D12). The nanobody mediates high affinity binding of the adhesin to EGFR which is fused to GFP. After the myc pull down, the beads are washed and the proteins are eluted from the beads. The eluted proteins are then analysed by Fluorescent Western Blot analysis against both myc and GFP.

The coIP itself is yet to be executed, but preliminary experiments have already been performed to validate proper expression of EGFR-GFP by HEK293T cells and of the adhesins (without nanobodies) by E.coli 10-beta. In addition, since the proteins of interest are rather large (~180 kDa for EGFR-GFP and ~76 kDa for the adhesin without nanobody), we also tested two different sample preparation conditions (boiling for 5 min at 95 ℃ and heating for 30 min at 37 ℃) and two blotting conditions (15 V for 30 min at 4 ℃ and 15 V for 1 hour at 4 ℃). The results revealed successful EGFR-GFP expression after transfection in HEK293T cells and adhesin (without nanobody) expression after induction in E.coli 10-beta (see Fig. 26). These results confirm expression of full-length intimin by E.coli 10-beta, allowing this strain to be used in upcoming conjugation assays.

Figure 26: Fluorescent Western Blot scans after staining against GFP (green) and myc (red) and subject to different sample preparation and blotting conditions. Contents of samples 1,2,3 and 4 are presented in the text under the figure. The proteins of interest are indicated using arrows: white arrows point at the location of EGFR-GFP (~180 kDa) and blue arrows point at the location of myc-tagged adhesin without nanobody (~76 kDa).

For the planned coIP assay, HEK293T cells will be transfected with a plasmid encoding EGFR-GFP and lysed. E.coli lysates will be generated after transformation and induction of protein expression. Please refer to the experiments page for a more detailed description of the procedures. The experimental design is planned to include pulldowns of various lysate combinations using anti-myc beads:

  • Pulldown of HEK293T cell lysate that would reveal any cross-reaction of the anti-myc beads with proteins in the HEK293T cell lysate;
  • Pulldown of E.coli lysates (both adhesins without and with nanobody) to validate the beads and also to test proper expression of full-length adhesins by E.coli 10-beta;
  • coIP of HEK293T and E.coli lysate expressing adhesins without nanobody to test for any interaction between the bait and prey proteins that is not mediated by the nanobody;
  • coIP of HEK293T and E.coli lysate expressing adhesins with the nanobody to prove specific interaction between the bait and prey proteins.

To showcase the improved adhesion of conjugative bacteria expressing anti-EGFR adhesins to the surface of mammalian cells, we aimed to compare the accumulation of anti-EGFR adhesin-expressing bacteria on the surface of HEK293T and HeLa cells against bacteria expressing adhesins without nanobodies.

We chose to conduct a fluorescence microscopy-based assay to test whether adhesins expressed on bacteria enable better binding to the surface of mammalian cells. Following IPTG induction of adhesin expression, E.coli 10-beta were labeled by growing them in LB-media supplemented with fluorescently labeled D-amino acids (RADA). In preliminary tests we confirmed this to result in the incorporation of fluorescent D-amino acids into the bacterial cell wall during synthesis, thereby labeling the peptidoglycan layer. However, the adhesion assay itself could not be performed due to time constraints.

In the future, we aim to confirm adhesin-dependent binding of fluorescently labeled bacteria by adding them to HEK293T or HeLa cells grown on coverslips, followed by incubation, washing and fixation for microscopy. Following fixation and subsequent staining with Hoechst and WGA-Alexa Fluor 488, the coverslips will be mounted on glass slides to be visualized under a fluorescence microscope. A more thorough description of the assay can be found in the experiments page. The experimental design is planned to include the following bacteria:

  • E.coli 10-beta transformed with pHelper_RP4: to assess natural bacterial adhesion to mammalian cell surfaces;
  • E.coli 10-beta transformed with pHelper_RP4 and pNeae2: no nanobody control;
  • E.coli 10-beta transformed with pHelper_RP4 and pNeae2_7D12: test group.

Through this assay, we hope to prove correct extracellular display of nanobodies by conjugative bacteria and also gain qualitative insights into the enhanced affinity of nanobody-expressing bacteria to the mammalian cell surface.

The goal of the bacteria-mammalian conjugation assay is to explore inter-kingdom conjugation and potential improvements in conjugation efficiency due to enhanced cell-cell contact. This concept is illustrated in Fig. 27.

Figure 27: Bacteria-mammalian conjugation mediated by synthetic adhesins. DNA delivery to mammalian cells via the T4SS by adhesin-expressing bacteria docked on the mammalian cell surface.

We aim to perform the bacteria-mammalian conjugation assay using a custom protocol that we designed. It involves seeding HEK293T cells in T25 flasks and growing them for 24 hours prior to the experiment. Then, conjugation-competent bacteria carrying a mobilizable plasmid encoding EGFP under an SV40 promoter (for constitutive expression in mammalian cells) will be added to the flasks containing adherent HEK293T cells. Conjugation will be allowed to proceed for 12 hours in media supplemented with DNaseI (to prevent natural transformation) and cytochalasin D (to prevent endocytosis-mediated bacterial uptake), as done by Waters (2001). After these 12 hours, the media will be removed and the mammalian cells will be washed to remove most of the attached bacteria. The remaining bacteria will be eliminated by adding fresh media supplemented with appropriate antibiotics to the flasks. After confirming successful elimination of live bacteria, the HEK293T cells will be placed back into the humidified CO2 incubator. After a further 24 hours, the HEK293T cells will be imaged under a fluorescence microscope to visualize reporter protein expression. FACS can be performed for a more quantitative read-out. For a more detailed description of the assay, please refer to the experiments page.

Several controls will be included in the experimental design to allow proper interpretation of results. In all cases, bacteria transformed with different plasmid combinations shall be added to the HEK293T cells. Following are the bacteria that have been planned to be utilized in the bacteria-mammalian conjugation assay:

  • No adhesin negative controls: E.coli 10-beta carrying either:
    • pHelper_RP4 or
    • pmob_m_CMV
  • Adhesin expressing negative controls: E.coli 10-beta carrying either:
    • pHelper_RP4 + pNeae_7D12 or
    • pmob_m_CMV + pNeae_7D12
  • Conjugation test groups: E.coli 10-beta carrying:
    • pHelper_RP4 and pmob_m_CMV (no adhesin test group)
    • pHelper_RP4 and pmob_m_CMV + pNeae_7D12 (test group with adhesins)
  • Positive control for comparison against other groups:
    • Transfection with pmob_m_CMV.
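The logic behind these controls can be summarized in a few lines. The sketch below encodes the bacterial groups listed above as sets of plasmids and derives which groups are expected to transfer DNA, assuming that transfer requires both the helper and the mobilizable plasmid in the same donor. The group labels and function names are illustrative only.

```python
# Plasmid combinations of the planned bacterial groups (labels are illustrative).
GROUPS = {
    "helper only": {"pHelper_RP4"},
    "mobilizable only": {"pmob_m_CMV"},
    "helper + adhesin": {"pHelper_RP4", "pNeae_7D12"},
    "mobilizable + adhesin": {"pmob_m_CMV", "pNeae_7D12"},
    "test, no adhesin": {"pHelper_RP4", "pmob_m_CMV"},
    "test, with adhesin": {"pHelper_RP4", "pmob_m_CMV", "pNeae_7D12"},
}


def transfer_expected(plasmids: set) -> bool:
    """Transfer needs the conjugation machinery and an oriT-carrying plasmid."""
    return {"pHelper_RP4", "pmob_m_CMV"} <= plasmids


def displays_adhesin(plasmids: set) -> bool:
    """True if the group carries the anti-EGFR adhesin plasmid."""
    return "pNeae_7D12" in plasmids


if __name__ == "__main__":
    for name, plasmids in GROUPS.items():
        print(f"{name}: transfer expected = {transfer_expected(plasmids)}, "
              f"adhesin = {displays_adhesin(plasmids)}")
```

Only the two test groups satisfy the transfer condition, so any reporter signal in the four negative controls would point to a transfer route other than conjugation.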

We are yet to conduct the experiment and obtain data. Nevertheless all the necessary materials are in place and the assay shall be duly executed.

Discussion

We started by validating the functionality of the RP4 helper plasmid in our dual-plasmid system, as well as of the oriT sequence in the mobilizable plasmid, by performing conjugation assays between bacteria on both solid and liquid media. We observed that the conjugation efficiency achieved with our experimental design was comparable to published values (Hamilton et al., 2019), thereby validating our RP4 helper plasmid and its ability to catalyze transfer of the oriT-carrying mobilizable plasmid. Moreover, we noted a 1000-fold increase in conjugation efficiency between the test groups upon increasing the OD600 from 1.2 to 10. This clearly indicated that cell density, and therefore enhanced cell-cell contact, is an important parameter for conjugation on solid media. In addition, almost no conjugation could be observed in liquid media, pointing to the inefficiency of the RP4 conjugative machinery in mediating conjugation under destabilizing conditions. Thus, we laid the foundation for our future conjugation experiments between bacteria and mammalian cells.

Western Blot analysis confirmed expression of myc-tagged adhesins (without nanobodies) by E.coli 10-beta. This allows the strain to be used as the donor in upcoming conjugation assays, even though it is not an optimal strain for protein expression. We also established a coIP workflow to prove specific interaction between the nanobody and its target protein in vitro, which can be easily adapted to other nanobodies and their respective target proteins. Validation of the nanobody-target protein interaction is vital for interpreting results from the adhesion assay, which is designed to examine the enrichment of nanobody-expressing bacteria on the surface of mammalian cells. The results from the adhesion assay will be used to verify adhesin display in the extracellular space and will also give us a qualitative understanding of whether the presence of adhesins results in an increased affinity between the conjugative bacteria and mammalian cells. These results will complement and support those from the coIP experiment.

Finally, through the bacteria-mammalian conjugation assay, we hope to delineate the role of cell-cell contact in mediating inter-kingdom conjugation between bacteria and mammalian cells. In the future, we also aim to test our system on different cell lines to examine their susceptibilities to receiving DNA from bacteria via the T4SS. We also seek to further tune the specificity of our delivery system using cell type-specific promoters, the constructs for which have already been cloned.

Bien, S., Ritter, C. A., Gratz, M., Sperker, B., Sonnemann, J., Beck, J. F., Kroemer, H. K. (2004). Nuclear factor-kappaB mediates up-regulation of cathepsin B by doxorubicin in tumor cells. Molecular Pharmacology 65(5), 1092-102. https://doi.org/10.1124/mol.65.5.1092

Bleuez, C., Koch, W. F., Urbach, C., Hollfelder, F., & Jermutus, L. (2022). Exploiting protease activation for therapy. Drug Discov Today, 27(6), 1743-1754. https://doi.org/10.1016/j.drudis.2022.03.011

Espinosa-Oliva, A. M., García-Revilla, J., Alonso-Bellido, I. M., & Burguillos, M. A. (2019). Brainiac Caspases: Beyond the Wall of Apoptosis [Mini Review]. Frontiers in Cellular Neuroscience, 13. https://doi.org/10.3389/fncel.2019.00500

Gramespacher, J. A., Stevens, A. J., Nguyen, D. P., Chin, J. W., & Muir, T. W. (2017). Intein Zymogens: Conditional Assembly and Splicing of Split Inteins via Targeted Proteolysis. J Am Chem Soc, 139(24), 8074-8077. https://doi.org/10.1021/jacs.7b02618

Jin, C., El-Sagheer, A. H., Li, S., Vallis, K. A., Tan, W., & Brown, T. (2022). Engineering Enzyme-Cleavable Oligonucleotides by Automated Solid-Phase Incorporation of Cathepsin B Sensitive Dipeptide Linkers. Angewandte Chemie International Edition, 61(13), e202114016. https://doi.org/10.1002/anie.202114016

Mentlein, R., Hattermann, K., & Held-Feindt, J. (2012). Lost in disruption: Role of proteases in glioma invasion and progression. Biochimica et Biophysica Acta (BBA) - Reviews on Cancer, 1825(2), 178-185. https://doi.org/10.1016/j.bbcan.2011.12.001

Muench, P., Fiumara, M., Southern, N., Coda, D., Aschenbrenner, S., Correia, B., Gräff, J., Niopek, D., & Mathony, J. (2023). A modular toolbox for the optogenetic deactivation of transcription. bioRxiv, 2023.11.06.565805. https://doi.org/10.1101/2023.11.06.565805

Müntener, K., Willimann, A., Zwicky, R., Svoboda, B., Mach, L., & Baici, A. (2005). Folding Competence of N-terminally Truncated Forms of Human Procathepsin B. Journal of Biological Chemistry, 280(12), 11973-11980. https://doi.org/10.1074/jbc.M413052200

Nägler, D. K., Storer, A. C., Portaro, F. C. V., Carmona, E., Juliano, L., & Ménard, R. (1997). Major Increase in Endopeptidase Activity of Human Cathepsin B upon Removal of Occluding Loop Contacts. Biochemistry, 36(41), 12608-12615. https://doi.org/10.1021/bi971264+

Ruan, H., Hao, S., Young, P., & Zhang, H. (2015). Targeting Cathepsin B for Cancer Therapies. Horiz Cancer Res, 56, 23-40.

Shim, N., Jeon, S. I., Yang, S., Park, J. Y., Jo, M., Kim, J., Choi, J., Yun, W. S., Kim, J., Lee, Y., Shim, M. K., Kim, Y., & Kim, K. (2022). Comparative study of cathepsin B-cleavable linkers for the optimal design of cathepsin B-specific doxorubicin prodrug nanoparticles for targeted cancer therapy. Biomaterials, 289, 121806. https://doi.org/10.1016/j.biomaterials.2022.121806

Wang, J., Liu, M., Zhang, X., Wang, X., Xiong, M., & Luo, D. (2024). Stimuli-responsive linkers and their application in molecular imaging. Exploration, 4(4), 20230027. https://doi.org/10.1002/EXP.20230027

Zhong, Y.-J., Shao, L.-H., & Li, Y. (2013). Cathepsin B-cleavable doxorubicin prodrugs for targeted cancer therapy (Review). Int J Oncol, 42(2), 373-383. https://doi.org/10.3892/ijo.2012.1754

Arndt, K., & Fink, G. R. (1986). GCN4 protein, a positive transcription factor in yeast, binds general control promoters at all 5’ TGACTC 3’ sequences. Proceedings of the National Academy of Sciences, 83(22), 8516–8520. https://doi.org/10.1073/pnas.83.22.8516

Berens, C., & Hillen, W. (2004). Gene Regulation By Tetracyclines. In J. K. Setlow (Ed.), Genetic Engineering: Principles and Methods (pp. 255–277). Springer US. https://doi.org/10.1007/978-0-306-48573-2_13

Chen, X., Zaro, J. L., & Shen, W.-C. (2013). Fusion protein linkers: Property, design and functionality. Advanced Drug Delivery Reviews, 65(10), 1357–1369. https://doi.org/10.1016/j.addr.2012.09.039

Ellenberger, T. E., Brandl, C. J., Struhl, K., & Harrison, S. C. (1992). The GCN4 basic region leucine zipper binds DNA as a dimer of uninterrupted alpha helices: Crystal structure of the protein-DNA complex. Cell, 71(7), 1223–1237. https://doi.org/10.1016/s0092-8674(05)80070-4

Fried, M. G. (1989). Measurement of protein-DNA interaction parameters by electrophoresis mobility shift assay. ELECTROPHORESIS, 10(5–6), 366–376. https://doi.org/10.1002/elps.1150100515

Gossen, M., & Bujard, H. (1992). Tight control of gene expression in mammalian cells by tetracycline-responsive promoters. Proceedings of the National Academy of Sciences, 89(12), 5547–5551. https://doi.org/10.1073/pnas.89.12.5547

Greenfield, N. J. (2006). Using circular dichroism spectra to estimate protein secondary structure. Nature Protocols, 1(6), 2876–2890. https://doi.org/10.1038/nprot.2006.202

Hellman, L. M., & Fried, M. G. (2007). Electrophoretic mobility shift assay (EMSA) for detecting protein–nucleic acid interactions. Nature Protocols, 2(8), 1849–1861. https://doi.org/10.1038/nprot.2007.249

Hochreiter, B., Kunze, M., Moser, B., & Schmid, J. A. (2019). Advanced FRET normalization allows quantitative analysis of protein interactions including stoichiometries and relative affinities in living cells. Scientific Reports, 9(1), 8233. https://doi.org/10.1038/s41598-019-44650-0

Hollenbeck, J. J., Gurnon, D. G., Fazio, G. C., Carlson, J. J., & Oakley, M. G. (2001). A GCN4 Variant with a C-Terminal Basic Region Binds to DNA with Wild-Type Affinity. Biochemistry, 40(46), 13833–13839.

Hollenbeck, J. J., McClain, D. L., & Oakley, M. G. (2002). The role of helix stabilizing residues in GCN4 basic region folding and DNA binding. Protein Science, 11(11), 2740–2747. https://doi.org/10.1110/ps.0211102

Hollenbeck, J. J., & Oakley, M. G. (2000). GCN4 Binds with High Affinity to DNA Sequences Containing a Single Consensus Half-Site. Biochemistry, 39(21), 6380–6389. https://doi.org/10.1021/bi992705n

Liu, J., Zheng, Q., Deng, Y., Cheng, C.-S., Kallenbach, N. R., & Lu, M. (2006). A seven-helix coiled coil. Proceedings of the National Academy of Sciences, 103(42), 15457–15462. https://doi.org/10.1073/pnas.0604871103

Lundbäck, T., Chang, J.-F., Phillips, K., Luisi, B., & Ladbury, J. E. (2000). Characterization of Sequence-Specific DNA binding by the Transcription Factor Oct-1. Biochemistry, 39(25), 7570–7579. https://doi.org/10.1021/bi000377h

Lupas, A. N., Bassler, J., & Dunin-Horkawicz, S. (2017). The Structure and Topology of α-Helical Coiled Coils. Fibrous Proteins: Structures and Mechanisms, 82, 95–129. https://doi.org/10.1007/978-3-319-49674-0_4

Park, J. H., Kwon, H. W., & Jeong, K. J. (2013a). Development of a plasmid display system with an Oct-1 DNA binding domain suitable for in vitro screening of engineered proteins. Journal of Bioscience and Bioengineering, 116(2), 246–252. https://doi.org/10.1016/j.jbiosc.2013.02.005

Park, J. H., Kwon, H. W., & Jeong, K. J. (2013b). Development of a plasmid display system with an Oct-1 DNA binding domain suitable for in vitro screening of engineered proteins. Journal of Bioscience and Bioengineering, 116(2), 246–252. https://doi.org/10.1016/j.jbiosc.2013.02.005

Park, Y., Shin, J., Yang, J., Kim, H., Jung, Y., Oh, H., Kim, Y., Hwang, J., Park, M., Ban, C., Jeong, K. J., Kim, S.-K., & Kweon, D.-H. (2020). Plasmid Display for Stabilization of Enzymes Inside the Cell to Improve Whole-Cell Biotransformation Efficiency. Frontiers in Bioengineering and Biotechnology, 7. https://doi.org/10.3389/fbioe.2019.00444

Shao, B., Rammohan, J., Anderson, D. A., Alperovich, N., Ross, D., & Voigt, C. A. (2021). Single-cell measurement of plasmid copy number and promoter activity. Nature Communications, 12(1), 1475. https://doi.org/10.1038/s41467-021-21734-y

Stepchenko, A. G., Portseva, T. N., Glukhov, I. A., Kotnova, A. P., Lyanova, B. M., Georgieva, S. G., & Pankratova, E. V. (2021). Primate-specific stress-induced transcription factor POU2F1Z protects human neuronal cells from stress. Scientific Reports, 11(1), 18808. https://doi.org/10.1038/s41598-021-98323-y

Verrijzer, C. P., Alkema, M. J., van Weperen, W. W., Van Leeuwen, H. C., Strating, M. J., & van der Vliet, P. C. (1992). The DNA binding specificity of the bipartite POU domain and its subdomains. The EMBO Journal, 11(13), 4993–5003. https://doi.org/10.1002/j.1460-2075.1992.tb05606.x

Woolfson, D. N. (2023). Understanding a protein fold: The physics, chemistry, and biology of α-helical coiled coils. Journal of Biological Chemistry, 299(4), 104579. https://doi.org/10.1016/j.jbc.2023.104579

Wu, P. G., & Brand, L. (1994). Resonance Energy Transfer: Methods and Applications. Analytical Biochemistry, 218(1), 1–13. https://doi.org/10.1006/abio.1994.1134

Anderson, N. M. & Simon, M. C. (2020). The tumor microenvironment. Current Biology, 30(16), R921–R925, doi:10.1016/j.cub.2020.06.081

Cong, L., Ran, F. A., Cox, D., Lin, S., Barretto, R., Habib, N., Hsu, P. D., Wu, X., Jiang, W., Marraffini, L. A., and Zhang, F. (2013). Multiplex Genome Engineering Using CRISPR/Cas Systems. Science 339, 819–823, doi:10.1126/science.1231143.

Cramer, P. (2019). Organization and regulation of gene transcription. Nature 573, 45–54, doi:10.1038/s41586-019-1517-4.

Dixon, J. R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J. S., and Ren, B. (2012). Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380, doi:10.1038/nature11082.

Gonatopoulos-Pournatzis, T., Aregger, M., Brown, K. R., Farhangmehr, S., Braunschweig, U., Ward, H. N., Ha, K. C. H., Weiss, A., Billmann, M., Durbic, T., Myers, C. L., Blencowe, B. J., and Moffat, J. (2020). Genetic interaction mapping and exon-resolution functional genomics with a hybrid Cas9–Cas12a platform. Nature Biotechnology 38, 638–648, doi:10.1038/s41587-020-0437-z.

Jinek, M., Chylinski, K., Fonfara, I., Hauer, M., Doudna, J. A., and Charpentier, E. (2012). A Programmable Dual-RNA–Guided DNA Endonuclease in Adaptive Bacterial Immunity. Science 337, 816–821, doi:10.1126/science.1225829.

Kampmann, M. (2017). CRISPRi and CRISPRa Screens in Mammalian Cells for Precision Biology and Medicine. ACS Chemical Biology 13, 406–416, doi:10.1021/acschembio.7b00657.

Kim, M.-S., and Kini, A. G. (2017). Engineering and Application of Zinc Finger Proteins and TALEs for Biomedical Research. Molecules and Cells 40, 533–541, doi:10.14348/molcells.2017.0139.

Kim, J. H., Rege, M., Valeri, J., Dunagin, M. C., Metzger, A., Titus, K. R., Gilgenast, T. G., Gong, W., Beagan, J. A., Raj, A., and Phillips-Cremins, J. E. (2019). LADL: light-activated dynamic looping for endogenous gene expression control. Nature Methods 16, 633–639, doi:10.1038/s41592-019-0436-5.

Kleinstiver, B. P., Sousa, A. A., Walton, R. T., Tak, Y. E., Hsu, J. Y., Clement, K., Welch, M. M., Horng, J. E., Malagon-Lopez, J., Scarfò, I., Maus, M. V., Pinello, L., Aryee, M. J., and Joung, J. K. (2019). Engineered CRISPR–Cas12a variants with increased activities and improved targeting ranges for gene, epigenetic and base editing. Nature Biotechnology 37, 276–282, doi:10.1038/s41587-018-0011-0.

Koonin, E. V., Gootenberg, J. S., and Abudayyeh, O. O. (2023). Discovery of Diverse CRISPR-Cas Systems and Expansion of the Genome Engineering Toolbox. Biochemistry 62, 3465–3487, doi:10.1021/acs.biochem.3c00159.

Kweon, J., Jang, A.-H., Kim, D.-e., Yang, J. W., Yoon, M., Shin, H. R., Kim, J.-S., and Kim, Y. (2017). Fusion guide RNAs for orthogonal gene manipulation with Cas9 and Cpf1. Nature Communications 8, doi:10.1038/s41467-017-01650-w.

Lieberman-Aiden, E., van Berkum, N. L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B. R., Sabo, P. J., Dorschner, M. O., Sandstrom, R., Bernstein, B., Bender, M. A., Groudine, M., Gnirke, A., Stamatoyannopoulos, J., Mirny, L. A., Lander, E. S., and Dekker, J. (2009). Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome. Science 326, 289–293, doi:10.1126/science.1181369.

Pacesa, M., Pelea, O., and Jinek, M. (2024). Past, present, and future of CRISPR genome editing technologies. Cell 187, 1076–1100, doi:10.1016/j.cell.2024.01.042.

Paul, B., and Montoya, G. (2020). CRISPR-Cas12a: Functional overview and applications. Biomedical Journal 43, 8–17, doi:10.1016/j.bj.2019.10.005.

Sheridan, C. (2023). The world’s first CRISPR therapy is approved: who will receive it? Nature Biotechnology 42, 3–4, doi:10.1038/d41587-023-00016-6.

Sternberg, S. H., Redding, S., Jinek, M., Greene, E. C., and Doudna, J. A. (2014). DNA interrogation by the CRISPR RNA-guided endonuclease Cas9. Nature 507, 62–67, doi:10.1038/nature13011.

Yang, Y., Rocamonde-Lago, I., Shen, B., Berzina, I., Zipf, J., and Högberg, B. (2024). Re-engineered guide RNA enables DNA loops and contacts modulating repression in E. coli. Nucleic Acids Research, doi:10.1093/nar/gkae591.

Zetsche, B., Gootenberg, J. S., Abudayyeh, O. O., Slaymaker, I. M., Makarova, K. S., Essletzbichler, P., Volz, S. E., Joung, J., van der Oost, J., Regev, A., Koonin, E. V., and Zhang, F. (2015). Cpf1 Is a Single RNA-Guided Endonuclease of a Class 2 CRISPR-Cas System. Cell 163, 759–771, doi:10.1016/j.cell.2015.09.038.

De La Cruz, F., Frost, L. S., Meyer, R. J., & Zechner, E. L. (2010). Conjugative DNA metabolism in Gram-negative bacteria. FEMS Microbiology Reviews, 34(1). https://doi.org/10.1111/j.1574-6976.2009.00195.x

Gelvin, S. B. (2003). Agrobacterium-Mediated Plant Transformation: the Biology behind the “Gene-Jockeying” Tool. Microbiology and Molecular Biology Reviews, 67(1). https://doi.org/10.1128/mmbr.67.1.16-37.2003

Grohmann, E., Muth, G., & Espinosa, M. (2003). Conjugative Plasmid Transfer in Gram-Positive Bacteria. Microbiology and Molecular Biology Reviews, 67(2). https://doi.org/10.1128/mmbr.67.2.277-301.2003

Hamilton, T. A., Pellegrino, G. M., Therrien, J. A., Ham, D. T., Bartlett, P. C., Karas, B. J., Gloor, G. B., & Edgell, D. R. (2019). Efficient inter-species conjugative transfer of a CRISPR nuclease for targeted bacterial killing. Nature Communications, 10(1). https://doi.org/10.1038/s41467-019-12448-3

Kreiss, P., Cameron, B., Rangara, R., Mailhe, P., Aguerre-Charriol, O., Airiau, M., Scherman, D., Crouzet, J., & Pitard, B. (1999). Plasmid DNA size does not affect the physicochemical properties of lipoplexes but modulates gene transfer efficiency. Nucleic Acids Research, 27(19). https://doi.org/10.1093/nar/27.19.3792

Robledo, M., Álvarez, B., Cuevas, A., González, S., Ruano-Gallego, D., Fernández, L. Á., & De La Cruz, F. (2022). Targeted bacterial conjugation mediated by synthetic cell-to-cell adhesions. Nucleic Acids Research, 50(22). https://doi.org/10.1093/nar/gkac1164

Schröder, G., Schuelein, R., Quebatte, M., & Dehio, C. (2011). Conjugative DNA transfer into human cells by the VirB/VirD4 type IV secretion system of the bacterial pathogen Bartonella henselae. Proceedings of the National Academy of Sciences of the United States of America, 108(35). https://doi.org/10.1073/pnas.1019074108

Silbert, J., De Lorenzo, V., & Aparicio, T. (2021). Refactoring the Conjugation Machinery of Promiscuous Plasmid RP4 into a Device for Conversion of Gram-Negative Isolates to Hfr Strains. ACS Synthetic Biology, 10(4). https://doi.org/10.1021/acssynbio.0c00611

Soltysiak, M. P. M., Meaney, R. S., Hamadache, S., Janakirama, P., Edgell, D. R., & Karas, B. J. (2019). Trans-kingdom conjugation within solid media from Escherichia coli to Saccharomyces cerevisiae. International Journal of Molecular Sciences, 20(20). https://doi.org/10.3390/ijms20205212

Virolle, C., Goldlust, K., Djermoun, S., Bigot, S., & Lesterlin, C. (2020). Plasmid transfer by conjugation in Gram-negative bacteria: From the cellular to the community level. Genes, 11(11). https://doi.org/10.3390/genes11111239

Wang, D., Tai, P. W. L., & Gao, G. (2019). Adeno-associated virus vector as a platform for gene therapy delivery. Nature Reviews Drug Discovery, 18(5). https://doi.org/10.1038/s41573-019-0012-9

Waters, V. L. (2001). Conjugation between bacterial and mammalian cells. Nature Genetics, 29(4). https://doi.org/10.1038/ng779