- Overview -
In today’s highly digitalized world, keeping transmitted information secure and confidential is increasingly difficult. To address this, we propose encoding information into DNA sequences using the Wubi input method and inserting these sequences into Escherichia coli for secure, covert information transfer. Our design has three key components. First, we encode Chinese characters into DNA sequences using the Wubi input method and introduce them into bacteria for storage. Second, we design a conditional growth mechanism in which the bacteria require caffeine or xanthine as an essential growth factor, so the information can only be recovered in specific environments. Third, we incorporate a self-destruct mechanism: at high temperatures the bacteria express a DNase that degrades their own DNA, destroying the encoded information. Together, these components protect the confidentiality and security of the stored information.
Module 1:
- Information Encoding and Transmission -
The central dogma of molecular biology describes how the base sequence of DNA is transcribed into RNA and translated into protein, so the DNA sequence determines the amino acid sequence. In our project, we leverage this principle: we encode information into DNA sequences using the traditional Wubi input method and design these synthetic sequences to resemble bacterial protein-coding regions, thereby disguising the encoded information.
design1

design2

First, we classify codons (base triplets, each specifying one amino acid) into three categories: the first contains codons for amino acids that readily form alpha helices, the second contains codons for amino acids that readily form beta sheets, and the third contains codons that can represent either structural element. For example, GCT (alanine) and GTT (valine) are each assigned to a particular structural element.
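As a rough illustration of this three-way classification, the sketch below translates each codon with the standard genetic code and then sorts it by an assumed secondary-structure preference. The sets `HELIX_FORMERS` and `SHEET_FORMERS` and the function `classify_codon` are hypothetical placeholders loosely based on commonly cited propensities; they are not the project's actual table.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES), AMINO))

# ASSUMPTION: illustrative residue groupings, not the project's real categories.
HELIX_FORMERS = set("AELMQKRH")
SHEET_FORMERS = set("VIYFWTC")


def classify_codon(codon: str) -> str:
    """Return 'helix', 'sheet', 'either', or 'stop' for a DNA codon."""
    residue = CODON_TABLE[codon.upper()]
    if residue == "*":
        return "stop"
    if residue in HELIX_FORMERS:
        return "helix"
    if residue in SHEET_FORMERS:
        return "sheet"
    # Everything else is lumped into the third, ambiguous category here.
    return "either"


if __name__ == "__main__":
    for codon in ("GCT", "GTT", "GGT", "TAA"):
        print(codon, CODON_TABLE[codon], classify_codon(codon))
```

With these assumed sets, GCT (alanine) lands in the helix category and GTT (valine) in the sheet category; a real table would need to be tuned to the structural propensities the project relies on.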
design3

For the information encoded by the third category of codons, we use the Lorenz equations to choose the disguise: when the computed value exceeds a set threshold, we use codons representing alpha helices; when it falls below the threshold, we use codons representing beta sheets. In this way, codons carrying important information are disguised as sequences that appear to encode proteins, hiding the data within a biologically plausible context. (A figure explaining the Lorenz equations is missing here.)
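The threshold rule can be sketched as follows. This is a minimal illustration, assuming the classic Lorenz parameters (sigma = 10, rho = 28, beta = 8/3), a simple forward-Euler integrator, the x coordinate as the compared value, and a threshold of 0. The text specifies none of these, so the function names, the seed, and every parameter below are hypothetical.

```python
def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the Lorenz system by one forward-Euler step."""
    x, y, z = state
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return (x + dx * dt, y + dy * dt, z + dz * dt)


def structure_choices(n, seed=(1.0, 1.0, 1.0), threshold=0.0,
                      steps_per_symbol=50):
    """Pick 'helix' or 'sheet' for n ambiguous codon positions.

    ASSUMPTION: the seed acts like a shared secret, and x is compared
    with the threshold after every `steps_per_symbol` integration steps.
    """
    state = seed
    choices = []
    for _ in range(n):
        for _ in range(steps_per_symbol):
            state = lorenz_step(state)
        choices.append("helix" if state[0] > threshold else "sheet")
    return choices


if __name__ == "__main__":
    print(structure_choices(8))
```

Because the trajectory is fully determined by the seed and parameters, a receiver using the same values and the same arithmetic can reproduce the sequence of choices, while the sequence looks irregular to an observer who does not know them.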
design4

Although the probability is low, gene mutations can still occur and corrupt the encoded information. To detect this, we append a checksum (our "hash value") to the end of the codon sequence, recording the number of A, T, C, and G bases in the sequence. If the actual base counts do not match the checksum, a mutation has likely occurred, and the information must be re-extracted from another bacterium. This check helps preserve the integrity and reliability of the information encoded in the bacterial DNA.
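A base-count check of this kind might look like the sketch below. The function names and the tuple layout (counts in the order A, T, C, G) are assumptions made for illustration; how the counts are themselves written into DNA is not specified in the text and is left out.

```python
from collections import Counter


def base_counts(sequence: str) -> tuple:
    """Count A, T, C and G in a DNA sequence, returned in that order."""
    counts = Counter(sequence.upper())
    return tuple(counts.get(base, 0) for base in "ATCG")


def passes_check(sequence: str, expected: tuple) -> bool:
    """True if the observed base counts match the stored counts."""
    return base_counts(sequence) == expected


if __name__ == "__main__":
    payload = "ATGGCTGTT"
    stored = base_counts(payload)                 # (1, 4, 1, 3)
    print(passes_check(payload, stored))          # intact copy
    print(passes_check("ATGGCTGTA", stored))      # one substitution (T -> A)
    print(passes_check("TAGGCTGTT", stored))      # two bases swapped
```

A single substitution changes two of the four counts and is caught. The last case shows a limit of counting alone: changes that leave the composition unchanged, such as two swapped bases, still pass, so this check detects many mutations but not all of them.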
Module 2:
- Conditional Growth and Caffeine Dependency -
· Knockout of guaB
In its natural state, the guaB gene in Escherichia coli encodes IMP dehydrogenase, a key enzyme in the guanine nucleotide synthesis pathway that converts IMP to XMP (xanthosine monophosphate). XMP is the precursor of the guanine nucleotides required for RNA and DNA synthesis, making guaB crucial for bacterial survival. To make the bacteria dependent on specific substances, we used gene editing to knock out guaB in E. coli. The knockout strain cannot synthesize XMP on its own and therefore cannot make RNA or DNA, so it is unable to grow on media that lack externally supplied xanthine (or caffeine, once the caffeine degradation pathway described below is introduced). This design makes bacterial growth strictly dependent on exogenous substances in specific environments, enhancing the security and controllability of information transmission.
design5

design6

· Introduction of the DeCaf Pathway
To restore growth of the guaB-knockout bacteria, and to replace the less common xanthine with widely available caffeine for greater stealth, we introduced the caffeine degradation pathway (DeCaf pathway) from Pseudomonas putida CBB5. We cloned the key genes of this pathway into an expression vector and introduced it into the guaB-knockout E. coli strain (BW-ΔguaB), yielding the BW-ΔguaB-DeCaf strain. The pathway enables the bacteria to demethylate caffeine to xanthine, which they can then use to make the nucleotides necessary for growth.
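The growth logic described in this module can be summarized as a small truth function. This is a deliberately simplified sketch: the function name, its parameters, and the representation of a medium as a set of supplement names are all hypothetical, and it ignores antibiotics, concentrations, and any other nutrients.

```python
def can_grow(guab_intact: bool, has_decaf: bool, supplements: set) -> bool:
    """Toy model of the conditional growth described in this module."""
    if guab_intact:
        # Wild-type cells make XMP themselves.
        return True
    if "xanthine" in supplements:
        # The knockout is rescued by externally supplied xanthine.
        return True
    # Caffeine only helps if the DeCaf pathway converts it to xanthine.
    return has_decaf and "caffeine" in supplements


if __name__ == "__main__":
    strains = {
        "BW-ΔguaB": (False, False),
        "BW-ΔguaB-DeCaf": (False, True),
    }
    for name, (intact, decaf) in strains.items():
        for medium in (set(), {"xanthine"}, {"caffeine"}):
            print(name, sorted(medium), can_grow(intact, decaf, medium))
```

Under this model, only the strain carrying the DeCaf pathway grows when caffeine is the sole supplement, which is the dependency the design aims for.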
design7

design8

We aim to screen common caffeine-containing beverages to determine whether they can support normal growth of BW-ΔguaB-DeCaf. We plan to prepare solid media from these beverages and test the growth of the strain on them. This experiment will further demonstrate the caffeine dependency of BW-ΔguaB-DeCaf and show how well it grows in different caffeine-containing environments. More importantly, it will show that the strain can grow on beverages that are widely available in daily life, without relying on the uncommon xanthine, which makes agents less conspicuous and the information easier to retrieve.
design9
Module 3:
- Information Protection -
· Resistance Mechanisms
To ensure selective growth of the bacteria and further protect the transmitted information, we designed multiple resistance mechanisms. First, when knocking out the guaB gene, we replaced it in the genome with a kanamycin resistance gene, making the modified E. coli resistant to kanamycin. This allows us to use kanamycin-selective media to screen for successfully modified strains.
Additionally, the caffeine degradation pathway (DeCaf pathway) construct carries a streptomycin resistance gene, further enhancing the selective resistance of the strain. The plasmid also includes an ampicillin resistance gene, enabling the strain to grow in the presence of ampicillin. Consequently, the final modified strain is resistant to ampicillin, streptomycin, and kanamycin, ensuring that the bacteria only grow under specific conditions and protecting the internal information.
design10

· Self-Destruct Mechanism
To prevent unauthorized access to the information, we designed a temperature-sensitive self-destruct mechanism. At high temperatures, the bacteria express a DNase that degrades their own DNA, destroying the encoded information before it can be read.
design11

We achieve this by placing the gene for a DNA endonuclease under the control of a transcription factor (cI dimer), which represses it and protects the bacteria from digesting their own DNA under normal conditions. When the temperature rises to 37°C or above, the dimer begins to dissociate and leaves its binding site, so the endonuclease is expressed and cuts the DNA into numerous fragments.
design12

We chose the gene encoding the DpnI endonuclease because the E. coli genome is rich in its recognition sites: the sequence GATC (5’ to 3’) recurs frequently throughout the genome. DpnI cuts GATC only when the adenine is methylated, which is the case in E. coli strains that carry Dam methylase. When the temperature reaches 37°C, the DpnI gene is activated, and the resulting enzyme cuts the bacterial DNA, killing the bacteria and destroying the information.
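To get a feel for how densely a sequence is covered by this recognition site, one can simply scan for GATC. The sketch below is illustrative only: the function name and the example sequence are made up, and it assumes every GATC found is Dam-methylated and therefore cuttable by DpnI. Because GATC is palindromic, scanning one strand finds the sites on both.

```python
def find_sites(sequence: str, site: str = "GATC") -> list:
    """Return the 0-based start positions of every occurrence of `site`."""
    sequence = sequence.upper()
    width = len(site)
    return [i for i in range(len(sequence) - width + 1)
            if sequence[i:i + width] == site]


if __name__ == "__main__":
    example = "GATCAAGATCGATC"  # hypothetical toy sequence
    positions = find_sites(example)
    print(positions)            # [0, 6, 10]
    print(len(positions), "sites in", len(example), "bases")
```

Running the same scan over a real genome file would give the actual site count; in random sequence with equal base frequencies, a given 4-base site is expected about once every 256 bases.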
design13

Through these designs, the modified bacteria grow selectively under specific conditions and automatically trigger the self-destruct mechanism when exposed to unfavorable environments, protecting the information and preventing leaks.
Conclusion
We have designed a method for covert information transmission using microorganisms. By encoding information into DNA sequences through the Wubi input method, we store it within Escherichia coli and use the bacteria to carry hidden data. Our approach has three essential components. First, codons chosen to mimic protein-coding regions disguise the encoded information. Second, the knockout of the guaB gene, coupled with the introduction of the DeCaf pathway, makes bacterial growth dependent on specific substances such as caffeine, adding control over where the bacteria can be revived. Finally, the temperature-sensitive DNase expression mechanism safeguards against unauthorized access by degrading the encoded information under unfavorable conditions. Together, these elements form a microbial system for discreet information transfer in which sensitive data remains secure and accessible only in intended environments.