Integrated Human Practices

January

Anmol Trehan

Background

Anmol is a venture manager with extensive experience in founding biotechnology startups, and he has previously helped several undergraduate iGEM teams (including McGill 2023) with their finance and sponsorship outreach efforts. This made him an excellent candidate to reach out to for advice not only on our finance efforts, but also on how to find the right target audience and use case to make our project more appealing to a lay audience and to potential investors for the entrepreneurship plan.

Interview

Our discussion with Anmol focused on the various sponsorship outreach strategies our team could employ in our fundraising efforts. He advised our team to first reach out to local companies and to faculties within UBC for financial support. He also connected us with potential sponsors that could be interested in supporting our team, along with relevant resources. In terms of scientific communication, Anmol gave us advice on how to frame and pitch our project so that the general public could easily digest it.

Reflection

Through our discussion with Anmol, our team was able to formulate a fundraising and sponsorship outreach strategy to increase our chances of raising sufficient funds for the 2024 iGEM season. We also realized that the sponsorship package our team prepared might not be easily understandable for readers without a background in scientific research. Anmol also suggested that we connect with local biotech venture investors such as Nucleate BC as a part of our entrepreneurship plan.

Intervention

Based on Anmol’s feedback on our fundraising strategies, we made major edits to our sponsorship packages to ensure that readers can easily understand and digest the science behind our project regardless of their technical background. We also decided to focus our initial fundraising efforts on reaching out to Faculties and Departments within UBC, local biotech companies, and investors that are more likely to be interested in supporting a local student-led research team.

February

March

Dr. Holly Longstaff

Background

Holly Longstaff is a healthcare data researcher with expertise in ethics, and the Director of Research Integration and Innovation within Research & Academic Services at the Provincial Health Services Authority. Her professional background made her a relevant iHP contact for gathering thoughts on the implementation of nuCloud as a healthcare data storage platform. We contacted Holly to gain insights into healthcare data management, regulatory hurdles related to new technologies like DNA storage, and the importance of interdisciplinary perspectives.

Interview

Our primary aim was to explore the feasibility and acceptance of DNA storage in healthcare systems. We wanted to understand the current methods and challenges in healthcare data storage, as well as the regulatory and ethical considerations of transitioning to DNA storage. We also briefly discussed EDI efforts in the field of data science and the importance of holding initiatives that practice inclusivity and accessibility.

Reflection

From our discussions, we learned that healthcare data storage is currently fragmented and archaic, posing significant challenges in accessibility and innovation. This is due to the substantial regulatory challenges surrounding privacy concerns, ethical implications, and the high stakes involved in healthcare data accuracy. The COVID-19 pandemic was one major event that opened the door to innovation in this area, enabling virtual health records and prompting changes to privacy laws. Thus, we learned that the acceptance and implementation of new technologies in healthcare are influenced by global events, regulatory changes, and the need for rigorous compliance with healthcare standards. Furthermore, Holly believes that storing healthcare data in DNA would be helpful only in certain areas that concern long-term storage.

Intervention

Our discussion with Holly moved our project forward by emphasizing the necessity of interdisciplinary collaboration to navigate complex regulatory landscapes effectively. It also steered our project away from healthcare and towards archival data, while making us realize the need to integrate regulatory compliance, ethical practices, and interdisciplinary collaboration.

Jon Corbett

Background

Jon is an Instructor at Simon Fraser University, within the School of Interactive Arts and Technology. Jon also has a background in computer science, and has previously worked at S as a software developer, implementing software for tracking labels on packages. His current research is multidisciplinary, drawing on his roots as a member of the Métis Nation of Alberta and his youth spent tinkering with computers. Our co-director, Narjis, met Jon, sparking a conversation about the applications of DNA storage within the Indigenous community: DNA represents a living vessel, which aligns well with the Indigenous belief that their stories, languages, and events are living knowledge. Thus, we reached out to Jon for advice regarding use cases for DNA storage, as well as ideas for error correction in software.

Interview

Our team first asked Jon about the steps we should take to incorporate Indigenous communities and their data as a potential application for our project. Jon emphasized that if we wanted to involve the Indigenous community by storing their history in DNA, a long-term relationship between us and the Indigenous community would need to be established. First Nations communities have developed policies such as OCAP (Ownership, Control, Access, and Possession) to outline how researchers can conduct research ethically while respecting the rights and privacy of Indigenous communities.

While discussing our software design and how error correction algorithms could help retrieve more data, Jon mentioned that the error correction on his tracking labels carried a great deal of redundancy, because these tracking codes must withstand physical wear and tear. Notably, an entire corner of a tracking label could be lost and the information could still be recovered. He told us to look for use cases where transmitted information must cope with high rates of deletion, and at data formats with lots of redundancy, such as QR codes, generative art, and SVGs.

Reflection

Based on Jon’s comments that long-term connections must first be established before considering Indigenous data storage as a use case, our Human Practices team decided to conduct further iHP interviews to explore the idea of integrating our DNA storage system with Indigenous stories. With respect to our dry lab software design, his mention of data formats with built-in redundancy led our team to look at the error correction methods used in QR codes. Jon also suggested looking beyond standard ASCII encoding to check whether other encoding methods might be more suitable for our project, as in the sketch below.
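
To illustrate what moving beyond 8-bit ASCII could look like, here is a minimal sketch in Python. The 32-symbol alphabet, function names, and test message are illustrative assumptions, not part of our actual pipeline.

```python
# A minimal sketch (not our final scheme) of a sub-8-bit text encoding:
# restricting text to a 32-symbol alphabet lets each character fit in
# 5 bits instead of the 8 bits used by ASCII.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ .,?!-"  # 32 symbols -> 5 bits each

def encode_bits(text: str) -> str:
    """Map each character to a 5-bit binary code."""
    return "".join(format(ALPHABET.index(ch), "05b") for ch in text.upper())

def decode_bits(bits: str) -> str:
    """Read the bit string back, 5 bits per character."""
    return "".join(ALPHABET[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5))

message = "HELLO IGEM"
packed = encode_bits(message)
print(len(message) * 8, "bits in ASCII vs", len(packed), "bits packed")  # 80 vs 50
assert decode_bits(packed) == message
```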

Intervention

Although our Human Practices team reached out to several more Indigenous iHP contacts to better understand how our project could be applied, our team ran into several logistical challenges. Due to the short timescale of the iGEM competition and the inaccessibility of technologies such as sequencing machines on Reserves, we were not able to demonstrate how data sovereignty could be achieved with our DNA storage system using currently available technology. Our team concluded that until technologies like NGS platforms become cheaper and DNA can be stored without expensive freezers, DNA storage is not yet an accessible and effective way for First Nations communities to store their history. Regarding Jon’s comments about our software, several modifications were made to our software implementation plan based on this interview. To build in more redundancy, our dry lab team decided to implement Reed-Solomon error correction, the method used in QR codes (sketched below). We also decided to look into alternative encoding methods that use fewer than 8 bits per character.
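
As a rough illustration of the Reed-Solomon approach, the sketch below uses the third-party reedsolo Python package; the number of parity symbols and the payload are illustrative assumptions, not our final parameters.

```python
# A hedged sketch of Reed-Solomon error correction with the `reedsolo`
# package (pip install reedsolo); parameters here are illustrative.
from reedsolo import RSCodec

rsc = RSCodec(10)                      # append 10 error-correction symbols
encoded = rsc.encode(b"nuCloud data")  # message + parity bytes
corrupted = bytearray(encoded)
corrupted[0] ^= 0xFF                   # corrupt one byte, like wear on a label
corrupted[5] ^= 0xFF                   # corrupt another
# reedsolo >= 1.0 returns (message, message+ecc, errata positions)
decoded = rsc.decode(bytes(corrupted))[0]
assert decoded == b"nuCloud data"      # both errors are corrected
```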

Kenny Hammond

Background

Kenny Hammond is the manager of Research Data Services at UBC’s Faculty of Medicine, responsible for supporting researchers and medical educators with technological solutions that meet their needs. Kenny works with small-scale, short-term data that is recent and accessed daily. As a part of the Faculty’s research technology department, which handles healthcare data and resource storage and management, Kenny was a relevant contact in this field.

Interview

We mainly wanted to ask Kenny for another perspective on how healthcare data is managed and whether a transition to DNA storage is feasible in this field. Building on the insights from our previous interview with Holly Longstaff, we also wanted to learn Kenny’s views on the strict regulations and privacy issues surrounding healthcare data.

Reflection

Following our interview, we learned that while DNA storage is geared toward long-term storage, current healthcare data, especially the data Kenny deals with, is short-term and transactional, as it needs to be accessed regularly. There are indeed privacy issues spanning the different health authorities in BC, but data storage is hardly a technical problem: data is easy to move, yet it is difficult to convince policymakers and authorities to collectively agree on a method to move it. Kenny believes that DNA storage would work better for data that is not entangled with privacy issues; healthcare data is only used when absolutely needed, and thus exists in minimal amounts. Moving forward, Kenny suggested that our project could be beneficial in reducing costs for large datasets such as medical imaging data. In particular, Kenny believes that DNA storage may be a good solution for ‘cold storage’, referring to data that is rarely accessed.

Intervention

In combination with our previous healthcare data interviews, this discussion steered our DNA storage project away from the healthcare field. In particular, Kenny believes we have potential in archival and Indigenous data, so our next step was to research and interview people working with Indigenous and archival data. Kenny also provided us with another contact, Eugene Barsky, who works with Borealis, a UBC Dataverse repository that manages long-term data storage.

Catherine Zhu

Background

Catherine Zhu is a Master of Journalism candidate with previous industry experience in science communication, including an internship as a journalist at Canadian Geographic. We decided to reach out to Catherine to learn how to facilitate science communication effectively, and especially how to foster a two-way dialogue with our community.

Interview

During our interview, we discussed the then-current stage of our DNA project and strategies for improving how we communicate it. Catherine pointed out that our project is quite complicated in terms of its scientific concepts. She guided us to emphasize the importance of our idea and to use it to create a story that captivates our audiences. Catherine also prompted us to constantly ask ourselves why our project is important and why people should know the information we are delivering. In doing so, we would be able to engage our audience, as well as drive home the point of our project. Catherine also helped us develop the question-asking skills we use in our iHP interviews with members of our community. As a journalism student, Catherine was also able to get our Simply Synbio Blog started by advising us on how to write about science effectively.

Reflection

Human Practices takes a very multidisciplinary approach to how we communicate science. Not only do we communicate our project to the judges at the Jamboree, but we also do so with the community around us, whether through iHP interviews or through our Human Practices projects. Catherine was able to shed light on how we can effectively communicate science, specifically our project, to the many audiences we may encounter in our iGEM journey.

Intervention

The Human Practices team applied Catherine’s journalism advice to our writing in the Simply Synbio Blog. More broadly, her advice contributed to creating a compelling story for our project, one that we hope engages and educates our various audiences.

Professor Lindsay Eltis

Background:

Dr. Lindsay Eltis is a professor in the Department of Microbiology and Immunology whose research covers a broad spectrum of topics related to microbial catabolism, including uncovering the catabolic mechanisms of specific enzymes. As an expert in researching and utilizing enzymes, Dr. Eltis was a great candidate to answer our questions on TdT protein synthesis, purification, and kinetics measurement.

Interview:

We were able to compile a list of questions related to wet lab experimental design:

  • Both the mutant TdT sequence (insert) and the backbone expression plasmid pET28b+ contain a 6x His-tag. Should the C-terminal His-tag be removed, as it is not part of the sequence provided in the paper? If so, is it sufficient to remove the His-tag by adding a stop codon before it?
  • We are planning to use the encoded His-tag for purification via magnetic beads. Would the protein purity from magnetic bead purification be sufficient for our purpose? Should we consider other purification methods, e.g., IMAC?
  • How should we measure the protein kinetics for mutant TdT?

Reflection:

Dr. Lindsay Eltis answered each of the questions listed above.

  • For cloning, we should not include the C-terminal His-tag. It suffices to add a stop codon.
  • Dr. Eltis indicated that there are two considerations in purifying any protein: (a) you don’t want a competing activity in your final preparation; and (b) you don’t want an inhibitory activity in your final preparation. Therefore, both magnetic beads and IMAC might work for our system, though he would still prefer IMAC.
  • Dr. Eltis also made the excellent point that once we have purified our TdT, we should verify that it is active. We can use standard nucleotides and visualize addition using a high-resolution gel-based assay. If we want to use modified nucleotides to allow one addition at a time, we will need to test them; he recommended a fluorimetric assay.

Intervention:

Dr. Eltis provided constructive advice for multiple parts of our experiment. Incorporating his advice into the experimental design, we will remove the C-terminal His-tag. We will also try magnetic beads if we need to purify the proteins. We decided not to use modified nucleotides for our project, but we will follow his suggestion to test mutant TdT activity before using it for SPS DNA synthesis and optimization.

Professor Nozomu Yachie #1

Background

Dr. Yachie is an expert in the field of synthetic biology and genome editing. His previous work in Japan involved storing data in the bacterial genome by inserting synthetic DNA sequences (doi: 10.1021/bp060261y). He also has previous experience working with TdT. This made Dr. Yachie an excellent iHP contact for feedback on our wet lab team’s experimental design.

Interview

Since Dr. Yachie’s research background was closely tied to our project this year, we wanted to hear his opinions on our overall project and experimental design. Specifically, we requested his thoughts on which sequencing method to choose for ssDNA synthesis confirmation; whether a 15% Urea-PAGE or a fluorometric assay would be suitable to verify single-nucleotide additions; how our optimization experiments could be designed so the process is efficient and can be completed within the competition timeframe; and whether it would be feasible for our team to consider storing our data in the form of dsDNA strands.

Reflection

For sequencing, Dr. Yachie suggested NGS over Sanger. He also suggested we look into GENEWIZ or GenScript.

When comparing the two assay options for confirming single-nucleotide addition, Dr. Yachie suggested that we look further into the Urea-PAGE assay. After we explained that our wet lab team was aiming for single dNTP addition at the time of the interview, Dr. Yachie also suggested an alternative, as he thought using modified dNTPs could be too stringent for this project.

In terms of reaction optimization, Dr. Yachie suggested that we strategically design our experiments so that multiple parameter combinations are tested with each round of optimization, in a specific order. Since NGS results take several weeks, he suggested that our wet lab team generate a large dataset covering various optimization conditions to submit for NGS sequencing, then begin the next round of optimization testing separate parameters, rather than waiting for the previous round’s results to be analyzed. This saves considerable time: all we need to do is wait for the sequencing data while preparing more products for the next round of sequencing. Dr. Yachie also suggested that our team test temperature optimization first, using a gradient PCR machine.

Dr. Yachie also recommended that our team synthesize the forward and complementary reverse ssDNA strands separately, if we were to proceed with a specific synthesis method. If we combined this approach with a high-fidelity DNase that can identify faulty base-pair alignments, he suggested we could build a quality control system into the platform by effectively ‘killing’ faulty dsDNA strands with incorrect base pairing.

Intervention

Our wet lab team was able to modify and adjust our experimental design based on Dr. Yachie’s specific feedback and suggestions on the overall direction of our project. In particular, our wet lab team decided to look into using natural dNTPs to encode information instead of modified dNTPs, an approach our team ultimately went forward with after further discussion with the entire team and our advisors.

Based on the optimization experimental design strategy that Dr. Yachie suggested, our wet lab team decided to reach out to dry lab for assistance, since we learned that it will be more efficient to use machine learning to identify optimal reaction conditions in large batches of NGS sequencing datasets.

Professor Nozomu Yachie #2

Background

Dr. Yachie is an expert in the field of synthetic biology and genome editing, with several publications based on bioinformatics analyses that process large high-throughput sequencing data. His technical expertise in both encoding information using DNA and extracting information from sequencing results made him an excellent iHP contact to reach out to. In terms of our project’s dry lab aspect, we wanted to present our overall plans for the software design and request his feedback on its feasibility, novelty, and area for improvement.

Interview

Our team first presented the overall software design for Dr. Yachie’s feedback on its direction. He mentioned that while our current plan doesn’t offer novel scientific discoveries, it can provide a valuable extension from an engineering perspective, since we are using a new programming language. He also suggested that to push our project further and make it stand out, we could consider developing a novel error correction algorithm; it wouldn’t have to be perfectly polished, as we are an undergraduate team, and we could reach out to UBC Faculty members for feedback and advice. In terms of chaosDNA, Dr. Yachie mentioned that testing it against purely random mutations should be fine and that we shouldn’t include insertion errors, since deletion errors are more likely, making it unclear whether downstream bases were mutated or inserted. Dr. Yachie also mentioned that with our current project design, we will only have to test up to 100 nucleotides in our de novo assembly, since we are only synthesizing short ssDNA strands. He added, however, that if we were to store the synthesized DNA strand in a plasmid, we would have to consider the assembly of sequences that are thousands of nucleotides long.
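
To make this concrete, here is a small Python sketch of the deletion-only error model described above; the function name, deletion rate, and test strand are illustrative assumptions rather than chaosDNA’s actual implementation.

```python
# A hedged sketch of deletion-only error simulation, in the spirit of
# what chaosDNA could be tested against.
import random

def simulate_deletions(strand: str, deletion_rate: float, seed: int = 0) -> str:
    """Drop each base independently with probability `deletion_rate`.
    Only deletions are modeled: after a deletion, it is ambiguous whether
    downstream bases were mutated or inserted."""
    rng = random.Random(seed)
    return "".join(base for base in strand if rng.random() >= deletion_rate)

print(simulate_deletions("ACGT" * 25, deletion_rate=0.1))  # 100-nt test strand
```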

Reflection

Dr. Yachie’s advice on our software design was helpful in guiding its overall direction, since he could also comment on our project’s wet lab aspect and helped us consider experimental constraints when designing our software. He also provided context on how information retrieval differs between computer science and bioinformatics by stressing the different types of mutations the two fields handle. Our dry lab team came away with a better understanding of the types of mutations our error correction should focus on, and a suitable ssDNA strand length on which to base our de novo assembly.

Intervention

Based on Dr. Yachie’s feedback, our dry lab team decided that we could make the bold assumption in our error correction algorithm that only deletion errors occur, given the bioinformatics context Dr. Yachie provided. In terms of next steps, our team decided that we need more research into the types of mutations that could be introduced through our platform, from both synthesis and sequencing. We also noted that the types of mutations introduced may differ depending on the sequencing platform we choose for analyzing wet lab’s data.

Scott Baker

Background

Scott Baker is the Manager of Sensitive Research at UBC’s ARC (Advanced Research Computing), and has over 20 years of experience managing projects and data systems while adapting to new, effective solutions built on modern research techniques. We reached out to Scott to gain more insight into how university data is stored and the logistics behind its security.

Interview

Our main questions for Scott were whether there are any issues with current storage methods, how feasible our project is for future application, and what regulatory frameworks or legal considerations we would need to take into account when implementing DNA storage for archival data. Additionally, since we were deciding between healthcare, archival, governmental, and Indigenous data as the main beneficiaries of our research, we asked Scott about the feasibility of storing governmental and university data.

Reflection

Overall, we learned that if DNA storage can offer durability (a long lifespan) and accessibility, it would be an excellent solution to current problems in the archival data storage realm. Current archival methods take up large amounts of space and resources, and the commercial data storage methods that tackle this problem are very expensive (10-20 cents per gigabyte). Thus, if we were able to create a DNA storage method that is relatively inexpensive and requires fewer resources to maintain, it could be a viable solution.

For our use case, Scott shared that DNA storage could start as a backup solution for current governmental and university data storage, but we would most likely have to wait many years to gain the public’s trust before properly integrating it into current methods. He stated that simple data storage without security needs is best for the prototype phase.

Intervention

Scott’s knowledge of governmental storage moved our project forward by affirming that it has real potential as a novel and viable solution in the field of data storage. However, it also set us back in the sense that, as he discussed with us, the public fears new solutions, and it would take a very long time to see any real implementation. We had been debating which aspect of DNA storage to focus on, whether governmental archival, healthcare, or Indigenous/museum storage, and this interview helped affirm that we should focus on museum and Indigenous archival data, as there are many safety concerns with governmental data.

April

Dr. Aria Hahn

Background

Our Principal Investigator Dr. Steven Hallam introduced us to Koonkie, a biotech company that utilizes large biological datasets and bioinformatics tools to make new scientific discoveries. We were connected to the CEO, Dr. Aria Hahn, who comes from a metagenomics and bioinformatics background. This made her an excellent iHP candidate to reach out to during our initial project design phase, particularly given that our project’s aim was to be able to store large datasets that companies like Koonkie could employ.

Interview

Our discussion with Dr. Hahn was focused on the problem that nuCloud aims to address and how it can be better framed such that a lay audience could be more engaged. When we pitched our project’s purpose, Dr. Hahn shared her enthusiasm towards nuCloud’s vision and pointed out that various ground-breaking scientific innovations came to life through benchmarking nature. She also noted that by incorporating nature into nuCloud’s storyline, a lay audience will be more inclined towards understanding our project and the urgent global demand for more data storage. She also mentioned that Koonkie would be willing to financially support our team as a sponsor, as they strongly believed that our project’s goal is aligned with Koonkie’s vision as well.

Reflection

Prior to this discussion, our team did not consider framing our project in a way that focuses on how we are benchmarking nature’s method to store data. However, through our discussion with Dr. Hahn, our team learned that the way a scientific project is story-lined and shared with the public plays a significant role in its potential impact.

Intervention

After our interview with Dr. Hahn, our team held several internal discussions where we brainstormed the best way to frame nuCloud’s storyline. We decided to build on Dr. Hahn’s advice of incorporating the ‘nature’ aspect, which was later refined and developed into the nuCloud storyline presented on our wiki.

Dr. Eugene Barsky

Background

Eugene Barsky is a Research Data Management Librarian at the UBC Walter C. Koerner Library. He is a part of the Portage Data Discovery group and is participating in building the Canadian Federated Research Data Repository (FRDR) service. He is also an adjunct professor at the UBC iSchool, teaching courses in science librarianship and research data management. His expertise and familiarity with archival data led us to reach out in the hope that his work in data management would offer insight into use cases for our project.

Interview

In our interview with Eugene, we asked how data is currently stored in his field. Eugene explained that data is stored in two ways in his industry:

  • Repository data, which is constantly being used and accessed
  • Archival data, which is stored and never really touched again

Eugene explained that our project aligns more with archival data, as our data access system may not reach the speeds needed to quickly access and store repository data. He suggested that within archival data, we could work with microform data storage (e.g., old newspapers and records) as a possible use case, since these are pieces of information that do not change often. As for the pieces of data we could work with, Eugene recommended starting with data in the public domain: anything that is government licensed or available under Creative Commons is a viable source of data for our project. While explaining all of this, Eugene highlighted that we should always consider the biological risks and privacy rights that come with data.

Reflection

Going into our interview with Eugene, we initially wanted to ask whether his field, specifically data management in libraries, would be interested in a DNA storage system. Eugene introduced us to direct pieces of information, such as newspapers and historical records, that never change, and directed our project to work with these objects. In contrast to medical data, our interview with Eugene shed light on a possible use case for our project. While pointing us toward a possible project route, he also raised the concerns we should be thinking about: privacy and data management rights. This would become a common theme for nuCloud, one that recurs when we talk to our contacts at the First Nations Data Centre, where data privacy is crucial to Indigenous Data Sovereignty.

Intervention

It was great to hear one perspective on how nuCloud could be used for archival data, specifically with microform. The attention to privacy and data management rights made us realize that our project needs to account for how we manage the privacy of our end users. To learn more about privacy and safety policies, we turned to our iGEM mentor, Sriram Kumar, who is a part of iGEM’s Safety and Security Committee.

May

Laura Gonzalez Campos

Background

Laura Gonzalez is a PhD candidate in the Eaves Lab at the BC Cancer Research Centre, co-supervised by Dr. Peter Zandstra. Laura’s ongoing project uses factorial design to determine optimal media conditions for directing primitive human hematopoietic stem cells toward the lymphocyte lineage. With this substantial experience, Laura was the perfect candidate to guide us through our experimental factorial design.

Interview:

We conducted the interview with Laura to learn about factorial design, aiming to apply the relevant knowledge to our DNA synthesis process for mutant TdT activity testing and optimization. Our specific question was: how should we proceed with the factorial design for liquid phase synthesis?

Reflection:

We first chose the best factorial design model for our experiments: a full factorial design. The initial TdT testing aims to determine whether our mutant TdT is able to add nucleotides to the DNA primer, so we decided to use only the minimal factors most essential for TdT activity: dNTP concentration and TdT concentration. A full factorial design needs both an upper and a lower limit for each factor; in addition to these boundaries, we included a midpoint.

To determine the range of values for TdT concentration, we referenced Chua et al. (2020), the research group that used this specific TdT mutant to test DNA addition. However, Chua et al. (2020) did not purify the mutant protein, so the paper indicates no specific TdT concentration; instead, they used cell lysate containing the mutant TdT protein and tested it at different dilutions. Since no other paper has tested this TdT mutant, we could not find a previously determined optimal concentration for purified mutant TdT. We therefore decided to follow Chua et al.’s (2020) protocol and test different lysate concentrations in our initial testing, using the same dilutions as the paper: 1:5, 1:50, and 1:500.

According to Lee et al. (2019), TdT has different specificities for different dNTPs, so that paper used different dNTP concentrations to allow optimal DNA addition for all four nucleotides (ATCG). For the initial mutant TdT activity testing, Laura agreed that we should not put effort into testing all four nucleotides, as this would quadruple the amount of work. We will therefore choose one type of nucleotide for the initial testing; when moving into solid phase synthesis, all four dNTPs will be tested at different concentrations to determine the optimal concentration for each. Lee et al. (2019) tested dNTP concentrations ranging from 5 uM to 400 uM, which led us to choose 4 uM, 40 uM, and 400 uM. We then generated a matrix for these two factors:

dNTPs (uM)    TdT (lysate dilution)
4             1:5
4             1:50
4             1:500
40            1:5
40            1:50
40            1:500
400           1:5
400           1:50
400           1:500
Besides those two factors, other reaction conditions were also considered. We agreed to use a medium reaction time of 5 min for now, at 47 degrees Celsius, the optimal temperature for the mutant TdT as determined by Chua et al. (2020). TdT is semi-specific, meaning it will add as many nucleotides as are available; however, for a more efficient production and decoding process, the minimum number of nucleotides should be added so that the maximum amount of information can be encoded. This means we want to minimize the addition of multiple repeating nucleotides, a constraint that will feed into our algorithm calculations once the relevant measurements are gathered. We also determined the readout of the assay: the size of the DNA product, read against a DNA ladder.
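
As a small illustration, the sketch below generates the same 3 x 3 full factorial matrix in Python; the variable names are ours, and the values are the ones chosen above.

```python
# A minimal sketch: enumerate every combination of the two factors
# (3 x 3 = 9 runs) for the full factorial design above.
from itertools import product

dntp_uM = [4, 40, 400]        # dNTP concentration levels (uM)
tdt_dilutions = [5, 50, 500]  # TdT lysate dilutions (1:x)

for dntp, dilution in product(dntp_uM, tdt_dilutions):
    print(f"{dntp:>3} uM dNTP, 1:{dilution} TdT lysate")
```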

Intervention:

This was a very meaningful discussion. We ultimately decided to go forward with purified proteins directly, so this exact factorial design was not used in our later experiments. Nevertheless, the factorial design template and the reasoning process behind it can be applied to future experimental design; we can take a similar factorial approach for our solid phase synthesis once it is set up.

June

Dr. Anthony Tang

Background:

Dr. Anthony Tang, who earned his PhD from UBC and is currently working at Amgen, is highly experienced in molecular cloning techniques. Amgen is one of the world’s largest biotechnology companies, focused on developing and manufacturing innovative human therapeutics. His expertise makes him an ideal candidate to help troubleshoot our ThTdT cloning experiments.

Interview:

After the wet lab was unable to successfully clone the plasmid and transform it into BL21 E. coli, we consulted Dr. Tang for feedback on our current cloning design and potential improvements to our protocol.

Reflection:

During the initial planning phase of cloning, several recommendations were provided to improve our approach. We were advised to use a high-fidelity polymerase for inverse PCR. While our primers for Gibson Assembly had 15 bp overhangs, Dr. Tang suggested increasing this to 30 bp to enhance the assembly process. For transformation, we initially used the BL21(DE3) strain due to its capacity for high protein synthesis, which is critical for producing mutant TdT; however, its lower transformation efficiency prompted Dr. Tang to recommend switching to the DH5α strain. Additionally, lowering the Gibson reaction temperature to 45°C for one hour may improve assembly efficiency. As a contingency, restriction enzyme digestion was suggested as an alternative cloning method to Gibson Assembly.

Intervention:

Moving forward, we will switch the E. coli strain from BL21(DE3) to DH5α to improve transformation efficiency and incorporate a 30 bp overhang onto the ThTdT G-block. If these adjustments do not yield successful results, we will revise our cloning strategy and implement restriction enzyme digestion as an alternative approach.

Drew Pihlainen

Background

The First Nations Data Centre (FNDC) is a resource from The First Nations Information Governance Centre (FNIGC) that supports the data sovereignty of every First Nation. The FNDC promotes the appropriate use and dissemination of all First Nations data, offering access to requestable data resources. To discover how applicable our DNA storage project could be to storing Indigenous archival data, we reached out to Drew Pihlainen and his colleagues Maria Santos and Kayla Boileau at the FNDC. While the FNDC’s work is closely related to First Nations communities, it is important to note that the staff themselves are not from First Nations. They are currently working on a project through CIHR to engage First Nations in biobanking activities and genomic research, which we thought was relevant to the themes of iGEM as well.

Interview

We wanted to discover whether it would be feasible, by any means, to store Indigenous data via DNA and cloud storage, as well as the ethics and policies surrounding First Nations data. Prior to the interview, we were asked to review the FNIGC/FNDC website as well as the policies that guide the FNDC. We discussed and identified the important principles of OCAP, which support information governance on the path to First Nations data sovereignty.

Reflection

From our discussion with Drew, Maria, and Kayla, we learned that it is not feasible to keep cloud data on Indigenous land. To collaborate with Indigenous communities, we would need to first honour the OCAP principles, reach out directly to every community individually, and discuss their data involvement with them from the start of our project. We were also reminded of the importance of receiving consent from the entire Indigenous community, rather than one individual, in order to continue with our project in this direction. Genomic research and biobanking are met with a lot of resistance, and the goal of the FNDC is to reach a point of understanding and reduce fear. Thus, we learned that it is important to establish connections like these early and to plan only for engagement before thinking of implementation.

Intervention

The information from this interview led us to pursue DNA storage for museums and archival data, since a relationship with the local Indigenous communities around data storage had not been established from the beginning of the project. They recommended that, if we still wanted to involve First Nations communities, we take the Fundamentals of OCAP course and truly integrate its principles into our storage system.

July

iGEM Startups Summer School (Event)

Background

The iGEM Startups Summer School trains iGEM teams in entrepreneurship and guides them toward winning the Best Entrepreneurship Prize. They do extensive case studies and research to bring teams the best insider tips to help them build advanced business plans.

Interview

The event was divided into five main parts: an introduction, a business plan workshop, a start-up panel, best entrepreneurship case studies, and a pitching workshop.

Reflection

We learned that getting experts to endorse the science behind our idea is key, as it helps us determine its potential to scale. They also guided us by advising us to establish stakeholder relationships and to connect with what our customers want, rather than just focusing on our idea. We received an introduction to business pitches, which provided insights into tailoring a presentation deck for investors. Importantly, they emphasized the need to address skepticism about GMOs by explaining them in simple terms to ease concerns.

Intervention

We received valuable feedback from conversations with other iGEM teams. Initially, we were considering traditional long-term storage, which involves freezers, but other teams raised sustainability concerns due to the high energy consumption. A member from TU Braunschweig mentioned drying DNA to enhance longevity without refrigeration, which helped us consider alternative storage options for our DNA. Additionally, we started reaching out to more experts who could validate our project and customers who could vouch for it. Lastly, we began looking for incubator programs and contacted entrepreneurship@UBC to gain insight into legal considerations.

Jonathon Jafari

Background

Jonathon (Jon) Jafari has over 20 years of biotech industry experience and patented a technology that was developed at UBC. He is also an entrepreneur in residence and the lead of the Human Health Venture Studio at entrepreneurship@UBC (e@UBC). e@UBC’s goal is to empower UBC students, researchers, faculty members, alumni and staff with the resources, networks, and funding they need to succeed in building transformational ventures that positively shape British Columbia’s local and global society. This made him an excellent iHP candidate to reach out to for feedback on our team’s entrepreneurship plan.

Interview

Jon met with us under the premise of discussing a potential venture startup opportunity for our project. We first presented our project to him, and then we engaged in a conversation to discuss available resources, regulatory policies, and important legal considerations such as IP law. We were curious to understand what the immediate next steps would be to take our project one step further as a potential venture startup, as well as the downstream long-term considerations to make when constructing our entrepreneurship plan.

Reflection

Jon started by emphasizing the importance of having well-known experts in our field endorse our project. Because the science behind biological and chemical ventures is often convoluted, it is key for respected individuals to validate our claims so that investors can feel confident. Regarding this point, Jon referred us to UBC’s Venture Founder program, which would not only help us validate our idea with the right people, but also provide mentors to guide us on our entrepreneurship journey through 16 weeks of experiential workshops.

Another point of conversation was UBC’s claim on the intellectual property of our project. He mentioned that the university’s inventions policy (LR11) dictates that UBC has complete ownership of any work done using UBC resources, although we would be credited. Furthermore, if UBC decides to commence market mobilization, 50% of the revenue would be given to us. That said, he mentioned that because we are undergraduate students, some rules might differ, and there are special cases where UBC might transfer those rights. Given his limited knowledge on the subject, he advised us to reach out to the Tech Transfer Office.

Regarding LR11, it is likely that our work will be categorized as University Research; the major determinant is how the Hallam Lab operates within UBC and what agreement our PI, Dr. Hallam, has. While we could argue that the financial side was clear about what sponsors would receive when donating, thus potentially bypassing LR11, what is unarguable is our use of the Hallam Lab.

Intervention

Before proceeding with communicating with the tech transfer office, we decided to discuss the idea of creating our project into a startup with all our members to gauge the interest of the group. Additionally, we decided on having a conversation with our PI, Dr. Hallam, to inquire about his interest and claim towards the startup idea, in addition to any information he might have on LR11.

Concerning the actual entrepreneurship plan, we set a goal of reaching out to experts to request their input on our idea and to ask what we would need to adjust in our plans for them to endorse it. We also became more familiar with UBC’s IP policy, which helped us shape our own IP strategy.

Dr. Reda Tafirout

Background

Dr. Reda Tafirout is an ATLAS Tier-1 researcher at TRIUMF, Canada’s particle accelerator centre. A major portion of TRIUMF’s research requires collaboration with other research facilities around the world, including the European Organization for Nuclear Research (CERN). Apart from its commitment to groundbreaking scientific research in fields including life sciences and molecular physics, TRIUMF is also one of the data storage centres for ATLAS, which generates thousands of terabytes of data every year. Dr. Tafirout’s background as a senior researcher at this facility made him an excellent iHP candidate for feedback on the feasibility of implementing nuCloud to store large datasets for collaborative research.

Interview

Through the interview with Dr. Tafirout, our team wanted to learn more about the data storage demands specific to research centres, including data transport between collaborators, sustainability concerns, data integrity, and access times. We learned that TRIUMF’s data is kept on local servers and on systems like the Digital Research Alliance of Canada. We also learned that TRIUMF utilizes magnetic tapes for long-term storage, similar to CERN, and participates in the Worldwide LHC Computing Grid for distributed data management. Dr. Tafirout noted that to facilitate cross-institutional collaboration at this scale, high-speed networks are essential for data transport across universities and international borders. With regard to environmental concerns, our discussion touched on Power Usage Effectiveness (PUE) metrics for data centres and the efficiency of magnetic tapes, to gauge the feasibility of using nuCloud as an alternative data storage medium. As for nuCloud’s potential application in a research setting similar to TRIUMF, Dr. Tafirout noted that if DNA storage can be completely streamlined and automated from encoding to decoding, it could potentially be considered.
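
For reference, PUE is the standard ratio of a facility's total energy draw to the energy actually delivered to its computing equipment:

$$\mathrm{PUE} = \frac{\text{total facility energy}}{\text{IT equipment energy}}$$

so an ideal data centre approaches a PUE of 1.0, with all overhead (cooling, power conversion) driving the value higher.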

Reflection

Through the interview with Dr. Tafirout, our team wanted to gain a clearer understanding of TRIUMF’s data storage practices, particularly in terms of large-scale data management, spanning from local servers to global data grids. As we explore various use cases for nuCloud in the real world, requesting Dr. Tafirout’s insight on the data storage demands specific to large research facilities helped shape our understanding of the project’s future directions. In particular, we learned that TRIUMF’s data storage demands required relatively fast access times, which is an area that requires further optimization for nuCloud.

Intervention

After our discussion with Dr. Tafirout and reflecting on his feedback, our team was further motivated to implement reaction automation features with our Hardware, particularly with the microfluidic pumps. This aligned with the downstream implementation goals of nuCloud, as we wanted to ensure that the project could eventually be developed into a large-scale biomanufacturing platform for data storage.

Tony Liu

Background

Tony Liu is a doctoral student in the Hallam Lab whose bioinformatics research focuses on metabolic networks. During a discussion with our PI, Dr. Hallam, our software lead expressed difficulty balancing error correction with short nucleotide strands. Dr. Hallam noted that our dry lab was approaching DNA sequence error correction from a very computer-science-focused angle, and reminded our team that DNA is a biological entity and that biology doesn’t want to be “correct.” He reminded us that the field of bioinformatics already deals with recovering correct DNA sequences from genomes and other biological entities, and he recommended several iHP contacts with bioinformatics backgrounds, one of whom was Tony. Additionally, at a Hallam Lab meeting where one of our co-directors was discussing the project, Tony expressed interest in learning more, and so we established contact.

Interview

Our dry lab team had just wrapped up the first DBTL cycle of their software pipeline and was realizing that error correction for DNA is more challenging than they had thought. Error-correcting codes (ECC) and error detection (ED) come from the field of information theory, which has historically developed ECC and ED for bit flips, the equivalent of nucleotide substitutions in DNA sequences. However, as another iGEM team’s results showed, and in general, edit errors in DNA sequences are much more frequent, and DNA also suffers from deletions and insertions on top of substitutions. Additionally, our semi-specific synthesis method makes it hard to add error correction, because we get fewer unique nucleotides per strand. Thus, our team wanted to discuss two questions with Tony:

  1. How can we work with semi-specific synthesis?
  2. How do we deal with correcting errors with such short nucleotide sequences, and with a minimal number of unique sequences?

We wanted to see if our ideas about error correction, from a computational science point of view, made sense in the context of DNA storage. We first explained the constraints on our software, and then our encoding strategies, our hypothesis and estimation on the type of errors we would get, and then our error correction strategies.

Reflection

Tony mentioned throughout the interview that since biology can “break” for no reason, approaching DNA error correction solely with a traditional computer science mindset would probably not work. When we told Tony that our nucleotide sequences were 50 nucleotides long, he explained that bioinformatics traditionally works with populations of reads to find a consensus sequence among them, and that this is how robustness is achieved. Our dry lab first highlighted that, based on Aachen’s results, most edit errors would occur during synthesis; Tony responded that to counter these errors, the reaction should be run multiple times to create enough redundant copies of the sequence that, by sheer numbers, the large rate of deletion errors can eventually be minimized. Additionally, to add bases reliably, Tony suggested we urge our wet lab team to focus on adding as many nucleotides as possible rather than trying to use an optimal amount of reagents; once bases can be added reliably, reagent optimization can begin. A simplified sketch of the consensus idea follows below.
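
Here is a toy Python sketch of the population-consensus idea Tony described: majority vote per position across many reads of the same strand. It assumes the reads are already aligned and equal-length; real reads would first need alignment to handle indels.

```python
# A simplified consensus caller: the most common base at each position wins.
from collections import Counter

def consensus(reads: list[str]) -> str:
    """Majority vote per position across aligned, equal-length reads."""
    return "".join(
        Counter(read[i] for read in reads).most_common(1)[0][0]
        for i in range(len(reads[0]))
    )

reads = ["ACGTACGT", "ACGAACGT", "ACGTTCGT", "ACGTACGT", "ACGTACGA"]
print(consensus(reads))  # ACGTACGT (errors in individual reads average out)
```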

When we discussed our two proposed error correction strategies, HEDGES and G&C+, Tony stated that for ease of development and orthogonality, we should separate encoding and error correction: HEDGES is very complex, and it was not convincing that HEDGES could recover from the high rate of deletion errors.

We then discussed sequencing strategies. Tony noted that our proposed methods, Sanger and NGS, would need to be discussed with wet lab, since Sanger would be an ill-suited sequencing method if our reaction pool contained short sequences with high variance between them. He suggested nanopore sequencing, which detects nucleotides based on the signal changes between them, fitting quite well with our ternary-based encoding. Finally, Tony asked how we would store large files of information, and cautioned that if we use the primer as the strand ID, we may run out of primers.

Intervention

Tony’s points about DNA being imperfect allowed our dry lab team to plan what to prioritize in our upcoming DBTL cycles. Firstly, while we were split between working on compression or error correction, Tony’s points led us to prioritize error correction. Additionally, we agreed that keeping encoding and error correction separate would ease development and make the software easier to design and reason about. Hence, we dropped HEDGES in favour of adding checksums for error detection (sketched below) and G&C+, an error correction method that focuses specifically on deletions and insertions in short nucleotide strands and can be applied to any encoding scheme. We also thought more about how we could order strands for large files: instead of using primers as IDs, we decided we could intentionally overlap strands. While this method may not scale well, our team does not have the means to create primers on the fly, and by intentionally overlapping strands we achieve both redundancy and some ordering of strands, all in one. We will also pursue overlapping of bits and bases as a form of statistical error correction, which, as Tony highlighted, is the upside of DNA’s density and a way to deal with biology being imperfect. Finally, our dry lab team will now begin incorporating both the bioinformatics (statistical) and computational science (formal methods) points of view into our further DBTL cycles, and will reach out to computer science iHP contacts so that we have a balanced perspective on developing our software.
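
As a small illustration of the checksum idea, here is a Python sketch that appends one parity base to a strand; the mod-4 scheme and names are illustrative assumptions, not our final error detection design.

```python
# A minimal checksum sketch: one parity base makes the base values
# sum to 0 (mod 4), so any single substitution is detectable.
BASE_VAL = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"

def add_checksum(payload: str) -> str:
    """Append a parity base so that the strand sums to 0 (mod 4)."""
    parity = (-sum(BASE_VAL[b] for b in payload)) % 4
    return payload + BASES[parity]

def verify(strand: str) -> bool:
    """Return True if the parity check passes."""
    return sum(BASE_VAL[b] for b in strand) % 4 == 0

strand = add_checksum("ACGTTGCA")
assert verify(strand)
assert not verify("AAGTTGCA" + strand[-1])  # a substitution breaks the check
```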

August

Beth Davenport

Background:

Beth completed her MSc at Imperial College London in Applied Biosciences and Biotechnology and is currently pursuing a PhD in Synthetic Microbiology at the Hallam lab. With her expertise in molecular cloning and E. coli protein expression, she is well-equipped to assist with our cloning and protein extraction experiments.

Interview:

Before the successful cloning, we consulted her on our primer design and on possible ways to improve our experiments. After successfully integrating the pET-28b(+) - ThTdT plasmid into BL21 E. coli, we consulted Beth for advice on possible E. coli lysis methods to proceed with protein extraction.

Reflection:

Beth confirmed our plasmid design and cloning protocol, and provided us with Lysing Matrix A and a mechanical lysis protocol to aid the E. coli lysis step of protein extraction.

Intervention:

We will utilize the equipment and protocol provided by Beth to lyse BL21 E. coli for protein purification.

Alonso Flores

Background

Alonso is a Safety & Security Program Officer on iGEM’s Safety and Security Committee. His expertise in iGEM’s commitment to safety and biosecurity, along with his background in dual-use technology, made him a good iHP candidate for discussing the safety aspects of nuCloud.

Interview

Our discussion with Alonso focused on the potential safety hazards posed by nuCloud’s downstream implementation in the market. Although nuCloud does not utilize any hazardous strains of bacteria or pathogens, an interesting safety hazard was brought up during our discussion: information hazards. As nuCloud aims to be scaled up into a biomanufacturing platform for large-scale data storage, Alonso pointed out that our team must consider the potential information hazards associated with its use. For instance, there may be malicious user intent regarding the types of information stored with nuCloud, inherent data security risks associated with using DNA as a data storage medium, and more. Such concerns aligned with the feedback provided by our iGEM mentor, Sriram Kumar. When asked about the different mitigation strategies our team could take to handle such risks, Alonso suggested that we first identify the different steps involved in implementing nuCloud in the real world. This involves asking questions such as whether access to this technology should be limited, or whether data will be stored in a centralized manner or distributed to each user.

Reflection

Through our discussion with Alonso, our team wanted to gain a better understanding of how information hazards could be addressed for iGEM projects and their future implementations in the real-world. Alonso’s advice regarding what steps we could take to mitigate foreseeable risks by identifying major milestones in project implementation was very insightful. We learned that although iGEM projects are not typically developed into a full product that is launched to the market, there are ways to plan ahead and identify potential safety risks upstream in the design process.

Intervention

Based on our discussion with Alonso, our team proceeded to conduct a more thorough analysis of nuCloud’s data security aspect and the potential information hazards and dual-use risks associated with this project. We were also reminded of the importance of incorporating best safety practices into our project design and of reaching out to relevant iHP candidates for their insight from a safety perspective.

Anne Condon

Background

Dr. Condon is a professor in the Department of Computer Science. Her work involves computational complexity theory and the design of algorithms; she is currently researching models and algorithms for computing with molecules and predicting DNA kinetics. Dr. Condon’s interdisciplinary work in computing and the life sciences made her an excellent iHP contact for our software team.

Interview

During this interview, our software team presented our current designs, which involved complex error codes. Dr. Condon and her grad students asked whether these extra levels of error correction would actually be able to prevent or correct deletions; we were not sure. We then mainly discussed the paper Terminator-free template-independent enzymatic DNA synthesis for digital information storage, and Dr. Condon suggested adopting a similar strategy, inspiring our current error correction design to use a scaffolding technique. Her grad students also mentioned that because there were so many deletions, we could simply filter out the strands with deletions.

Reflection

The interview with Dr. Condon made our software team realize that the traditional error correction methods for SSDs, HDDs, and the like would not apply well to DNA storage; in fact, for nuCloud, they would not work at all. Hence, we started thinking about each encoded strand as existing in a pool, and decided that deletions would be much better resolved through wet lab means than through preventative software means.

Intervention

Our software algorithm now does not correct for deletions, because we filter them out, as sketched below. Currently, trying to correct strands that may have around 10-50% of their nucleotides deleted is far more work than the alternative, which is to synthesize more strands in wet lab. Adding error correction bits is also not a viable strategy at the moment, because if those bases are themselves mutated or deleted, our redundancy cannot actually correct errors. Thus, until TdT DNA synthesis has a lower error rate, it is better to let random noise cancel itself out through sheer numbers (under the assumption that correct bases are added more often than deletions, mutations, and insertions occur).
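
A minimal Python sketch of this filtering strategy, assuming we know the expected strand length; names and values are illustrative.

```python
# Rather than correcting deletions, keep only reads of the expected length.
def filter_deletions(reads: list[str], expected_length: int) -> list[str]:
    """Discard reads with deletions (too short) or insertions (too long)."""
    return [read for read in reads if len(read) == expected_length]

reads = ["ACGTACGT", "ACGACGT", "ACGTACGT", "ACGTACG"]
print(filter_deletions(reads, expected_length=8))  # keeps the two full reads
```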

Dr. Changsung Lee

Background

Dr. Changsung Lee is a lead systems architect at Samsung Electronics HQ, South Korea. As the manager of various data centres worldwide, his background made him a good iHP candidate to request feedback on nuCloud’s potential application as a large-scale IT industry data storage platform.

Interview

We asked Dr. Lee about the current technical drawbacks, sustainability concerns, and management costs of running large data centres. We also inquired whether IT companies are considering alternative data storage media given these technical limitations, and what the important factors would be when implementing new types of data storage media. Dr. Lee responded that there are clear limitations in current SSD- and HDD-based data storage centres, but that industry is mitigating these drawbacks through optimization rather than searching for new data storage media. He also noted that for DNA to be feasible for IT-related data storage, it must guarantee data integrity, be secure, and be capable of completely deleting selected data when needed. He added that significantly faster read and write speeds must first be guaranteed for DNA to be considered a potential candidate. In terms of sustainability and relevant regulations, Dr. Lee noted that compliance with Environmental, Social, and Governance (ESG)-related policies is typically the greatest concern for industry data centres these days.

Reflection

We requested to meet with Dr. Lee for his feedback on the feasibility of implementing nuCloud as a data storage platform in the IT industry, and on the potential roadblocks and technical considerations involved in making it a suitable candidate for this use case. Through this interview, we learned that several key technical optimizations must first be made to pursue this use case, most importantly demonstrating faster read and write speeds. We also learned that we must be mindful of the relevant sustainability regulations if this project were to be implemented in industry.

Intervention

Having learned the functional requirements the IT industry demands of data storage media, our team recognized the importance of ensuring that nuCloud has the potential for further optimization to meet such requirements in the future. Dr. Lee’s comments on compliance with sustainability regulations also drove us to conduct a more thorough analysis of our Sustainable Development Goals.

Professor Rashmi Prakash

Background

Rashmi Prakash is an adjunct professor at UBC’s School of Biomedical Engineering and the CEO of Aruna Revolution, a company that aims to revolutionize menstrual health with compostable menstrual pads. She comes from a background in electrical engineering and has a passion for integrating sustainability principles into scientific research and its real-world applications. This made Rashmi a great iHP candidate for discussing how the UN SDGs could be incorporated into nuCloud’s design to ensure it is developed and implemented sustainably.

Interview

Our discussion with Rashmi centred around two specific UN SDGs (SDG 12, Responsible Consumption and Production; SDG 13, Climate Action), how she has worked towards achieving the SDGs in an engineering context, and how our team could further develop nuCloud to work towards various SDGs. We also requested her advice on how to better incorporate stakeholder input during the design iteration phase of our hardware, and on the importance of conducting a thorough life cycle assessment for nuCloud as a biomanufacturing platform.

Reflection

Through our discussion with Rashmi, we learned that sustainability must be integrated into the entire project design process, from brainstorming through iterative design to eventual market release. She emphasized that engineers must be mindful of the potential sustainability impacts their products could pose from both short-term and long-term perspectives. We also learned that user feedback and stakeholder perspectives must be integrated into the sustainable design of any engineering project, as stakeholders’ socioeconomic status, their stance on the importance of sustainable design practices, and the extent to which they are willing to compromise other aspects of user experience may vary significantly. One thing she noted was that the public often becomes desensitized to pressing environmental challenges, such as climate change, due to the sheer amount of data shared over the media.

Intervention

After our discussion with Rashmi, our team was reminded of the importance of incorporating the UN SDGs and sustainable practices into our project design process. This motivated us to conduct a proof-of-concept product life cycle analysis for nuCloud’s real-world implementation to quantify its impact. Her feedback and advice were reflected in our sustainable development plans, particularly when identifying relevant problems and potential mitigation strategies for SDGs 12 and 13.