🧮 Modeling

Classical experimental techniques for optimization typically assess one response by altering a single factor while keeping others constant. However, this approach has limitations, particularly in understanding the interactions between factors and responses. It also often requires a greater number of experimental runs [1]. In contrast, the Design of Experiments (DOE) approach allows for the simultaneous evaluation of multiple factors and their interactions, leading to more efficient optimization. This results in fewer experimental runs, lower costs, and a better overall understanding of the process [2]. Accordingly, we were able to minimize the number of experiments to optimize our framework. Additionally, we were able to assess the interactions and the effect of each individual factor within the experimental workflow.

Design of Experiment: Energy Buffer Optimization

Previous efforts in energy buffer optimization have focused on replacing individual components and testing their practicality [3]-[5]. However, we noticed a gap in the research: the need to explore a fully alternative energy buffer recipe. Simply replacing individual components may not address broader limitations such as cost, scalability, or compatibility with different cell-free systems. A completely new recipe could unlock more efficient or sustainable solutions. To optimize multiple component concentrations simultaneously, we employed the Design of Experiments (DOE) approach. This method allowed us to systematically test and refine the buffer recipe, with the aim of approaching the best performance and efficiency achievable within the tested ranges.

Experimental Protocol

As outlined in the Results section, the individual substitutions were successful. We replaced nucleoside triphosphates (NTP) with nucleoside monophosphates (NMP); amino acids (AA) with yeast extract and tryptone; and 3-phosphoglyceric acid (3-PGA) with maltodextrin (MD) and hexametaphosphate (HMP).

Building on this, we applied the same experimental setup to assess a central composite design (CCD) for a fully substituted alternative energy buffer. The design was a randomized, face-centered CCD built on a full factorial core, giving broad coverage of the variable space.

i.DOT liquid handler.

In simple terms, we tested each ingredient at several levels in a structured set of combinations, run in random order, so that the effect of each ingredient could be estimated reliably. To enable high-throughput and precise experimentation, we used the i.DOT liquid handler for accurate low-volume dispensing.

Pre-Processing Data

The response assessed for the Central Composite Designs (CCD) was the maximum fluorescence, determined using the QurvE R package, which implemented a linear regression method. The data quality thresholds were set as follows: an R² threshold of 95%, a relative standard deviation threshold of 0.15, and a dY threshold of 0.1.

Before analysis, the fluorescence data for each sample was blanked to the mean of the first three readings. To minimize batch-to-batch variation, a control containing the original energy buffer recipe without substitutions was included in the DOE runs. Each sample was tested in triplicate to ensure accuracy.

To compare performance across samples, the maximum fluorescence of each sample was divided by the maximum fluorescence of the control within the same run. On the resulting scale, 1 represents the maximum fluorescence of the control within each run. This standardization was inspired by the method used by Borkowski et al. [6].
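The blanking and standardization steps described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the project: the function names are hypothetical, the example readings are made up, and the plain maximum of the blanked trace stands in for the maximum fluorescence that QurvE reports.

```python
from statistics import mean


def blank(trace, n_baseline=3):
    """Subtract the mean of the first n_baseline readings from every reading."""
    baseline = mean(trace[:n_baseline])
    return [y - baseline for y in trace]


def standardize(sample_max, control_max):
    """Scale a sample maximum so that the same-run control equals 1."""
    if control_max <= 0:
        raise ValueError("control maximum must be positive")
    return sample_max / control_max


# Hypothetical raw fluorescence readings (arbitrary units), for illustration only.
control_trace = [100, 102, 98, 300, 500]
sample_trace = [100, 100, 100, 250, 400]

control_max = max(blank(control_trace))  # stand-in for the QurvE maximum
sample_max = max(blank(sample_trace))
standardized = standardize(sample_max, control_max)
```

A value below 1 means the sample reached a lower maximum fluorescence than the control measured in the same run.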

1. The first Central Composite Design (CCD)

Table 1. A summary of the master mix (MMX) component concentrations within the reaction volume.

MMX	Final Reaction Concentration
K-glutamate (mM)	76.68
HEPES (mM)	54.76
NMP (mM)	1.64
CoA (mM)	0.28
NAD (mM)	0.36
cAMP (mM)	0.82
Folinic Acid (mM)	0.07
Spermidine (mM)	1.10
Putrescine (mM)	1.10
DTT (mM)	1.10
Ammonium glu (mM)	10.95
Oxalic Acid (mM)	4.38
PEG-8000 (%)	2.19

For the assessment of the alternative energy buffer recipe, a master mix was prepared containing the following fixed components: K-glutamate, HEPES (4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid), NMP, CoA (Coenzyme A), NAD (Nicotinamide adenine dinucleotide), cAMP (Cyclic adenosine monophosphate), folinic acid, spermidine, putrescine, DTT (Dithiothreitol), ammonium glutamate, oxalic acid, and PEG-8000 (Polyethylene glycol 8000). These components were pre-mixed in specific ratios before being added to a reaction volume of 5 µL to ensure that the final concentrations in the reaction mixture met the desired specifications, as shown above in Table 1 [7,8].

For the CCD design, maltodextrin, tryptone, yeast extract, and Mg-glutamate (Magnesium glutamate) were tested at various levels, based on insights from the literature and previous OFAT (One Factor At a Time) analysis for energy buffer optimization. The concentration of HMP (Hexametaphosphate) was adjusted to maintain a ratio of 5 % relative to the maltodextrin added, as outlined in the study by Warfel et al. [4]. These stock solutions were added to the reaction volume, followed by the addition of a DNA template (a plasmid with GFP gene under constitutive promoter).

Table 2. A summary of the factor ranges of the first CCD.

Component	Minimum level	Maximum level	Center point	Reference
Yeast extract (%)	0.77	2.3	1.54	[3]
Tryptone (%)	0.83	2.5	1.67	[3]
Mg-glutamate (mM)	4.5	10	7.25	[6]
MD (mg/ml)	30	50	40	[4]
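As a rough illustration of how a randomized, face-centered CCD over the ranges in Table 2 can be laid out, the sketch below builds the coded runs (corner, axial, and center points) and maps them to concentrations. The number of center points, the random seed, and all function names are assumptions made for the example; the source does not state how many runs or replicates the actual design contained. HMP is set to 5 % of the maltodextrin concentration, as described in the text.

```python
import itertools
import random

# Factor ranges taken from Table 2: name -> (minimum, maximum).
FACTORS = {
    "yeast_extract_pct": (0.77, 2.3),
    "tryptone_pct": (0.83, 2.5),
    "mg_glutamate_mM": (4.5, 10.0),
    "md_mg_per_ml": (30.0, 50.0),
}


def face_centered_ccd(n_factors, n_center=3):
    """Return coded runs (-1, 0, +1) of a face-centered CCD (alpha = 1)."""
    corners = list(itertools.product((-1, 1), repeat=n_factors))
    axial = []
    for i in range(n_factors):
        for level in (-1, 1):
            point = [0] * n_factors
            point[i] = level  # one factor at its face, all others at center
            axial.append(tuple(point))
    centers = [(0,) * n_factors] * n_center  # n_center is an assumed value
    return corners + axial + centers


def decode(coded_run):
    """Map one coded run to real concentrations, adding HMP at 5 % of MD."""
    real = {}
    for (name, (low, high)), x in zip(FACTORS.items(), coded_run):
        center = (low + high) / 2
        half_range = (high - low) / 2
        real[name] = center + x * half_range
    real["hmp_mg_per_ml"] = 0.05 * real["md_mg_per_ml"]
    return real


def randomized_design(seed=0, n_center=3):
    """Return the decoded runs in a reproducible random order."""
    runs = face_centered_ccd(len(FACTORS), n_center)
    random.Random(seed).shuffle(runs)
    return [decode(run) for run in runs]
```

With four factors this gives 16 corner runs, 8 axial runs, and the chosen number of center runs; each run would then be dispensed in triplicate.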

Table 3. Statistics of the full quadratic regression model fit of the first CCD.

Statistic	Value
Degrees of Freedom	73
Root Mean Square of Error	0.05
R-Squared	0.90
Adj. R-Squared	0.88
Residual Sums of Squares	0.17
Predicted R-Squared	0.84

The resulting design was initially analyzed using a full quadratic regression model. The key statistics of the model are summarized in Table 3, and the regression equation is as follows:

Standardized_fluorescence = 0.29621 - 0.02937 · Tryptone - 0.09285 · MD - 0.08326 · Yeast + 0.0456 · Mg - 0.00257 · Tryptone² - 0.02967 · MD² - 0.01618 · Yeast² + 0.00539 · Mg² - 0.01037 · (Tryptone × MD) - 0.01576 · (Tryptone × Yeast) + 0.00972 · (Tryptone × Mg) + 0.02588 · (MD × Yeast) + 0.04032 · (MD × Mg) - 0.00363 · (Yeast × Mg)


Figure 1: Effect plot of standardized fluorescence, displaying the standardized effects of the factors and their interactions. The horizontal blue line represents the significance threshold at 1.993; bars exceeding this value are considered statistically significant. Maltodextrin (MD) and yeast extract show the strongest effects, followed by magnesium glutamate (Mg) and terms such as MD*Mg and MD*MD. The smaller effects of tryptone and of the interactions involving tryptone, Mg, and yeast extract suggest less influence on the standardized fluorescence.

The generated regression model showed a very good fit, explaining approximately 90 % of the variance in the response. However, the gap between the R-squared and the adjusted and predicted R-squared values points to potential overfitting. This observation is supported by the effect plot (Fig. 1), where the Mg², Yeast × Mg, and Tryptone² terms were insignificant but still included in the regression model. Accordingly, a reduced regression model was developed that excludes these insignificant terms, with the aim of improving predictive performance.

The reduced regression model showed a slightly higher predicted R-squared (0.86 versus 0.84) while maintaining the R-squared and adjusted R-squared values (Table 4). This is consistent with mild overfitting in the initial model and supports the use of the simplified model. The simplified regression model equation is as follows:

Standardized_fluorescence = 0.29621 - 0.02937 · Tryptone - 0.09285 · MD - 0.08326 · Yeast + 0.0456 · Mg - 0.01037 · (Tryptone × MD) - 0.01576 · (Tryptone × Yeast) + 0.02588 · (MD × Yeast) + 0.04032 · (MD × Mg)
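To make the equation above easier to explore, the sketch below evaluates it and searches a coarse grid for the highest predicted response. It assumes that the predictors are expressed in coded units, with -1 at the minimum level, 0 at the center point, and +1 at the maximum level; the source does not state the units of the model terms, so treat this as an assumption. The function names are hypothetical.

```python
import itertools

# Coefficients copied from the reduced model printed above.
INTERCEPT = 0.29621
MAIN = {"Tryptone": -0.02937, "MD": -0.09285, "Yeast": -0.08326, "Mg": 0.0456}
INTERACTIONS = {
    ("Tryptone", "MD"): -0.01037,
    ("Tryptone", "Yeast"): -0.01576,
    ("MD", "Yeast"): 0.02588,
    ("MD", "Mg"): 0.04032,
}


def predict(x):
    """Predicted standardized fluorescence for coded factor settings x."""
    y = INTERCEPT
    for name, coef in MAIN.items():
        y += coef * x[name]
    for (a, b), coef in INTERACTIONS.items():
        y += coef * x[a] * x[b]
    return y


def best_on_grid(levels=(-1, 0, 1)):
    """Return the grid point (coded units, assumed) with the highest prediction."""
    names = list(MAIN)
    candidates = (
        dict(zip(names, combo))
        for combo in itertools.product(levels, repeat=len(names))
    )
    return max(candidates, key=predict)


best = best_on_grid()
best_value = predict(best)
```

Under the coded-unit assumption, the grid search selects the lowest tested levels of tryptone, MD, and yeast extract together with the highest level of Mg-glutamate. At the lowest MD level the net Mg coefficient shrinks to 0.0456 - 0.04032 = 0.00528, so Mg-glutamate has very little influence there.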

Table 4. Statistics of the reduced quadratic model excluding the insignificant variables.

Statistic	Value
Root Mean Square of Error	0.05
R-Squared	0.90
Adj. R-Squared	0.88
Residual Sums of Squares	0.17
Predicted R-Squared	0.86
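For reference, the three fit statistics compared in Tables 3 and 4 are conventionally defined as shown below. This assumes the analysis software follows the standard definitions, which the source does not state explicitly. Here n is the number of observations, p the number of model terms excluding the intercept, and ŷ₍ᵢ₎ the prediction for observation i from a model fitted without that observation.

```latex
\begin{aligned}
R^2 &= 1 - \frac{SS_{\mathrm{res}}}{SS_{\mathrm{tot}}}, \\
R^2_{\mathrm{adj}} &= 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}, \\
R^2_{\mathrm{pred}} &= 1 - \frac{\mathrm{PRESS}}{SS_{\mathrm{tot}}},
\qquad \mathrm{PRESS} = \sum_{i=1}^{n} \left(y_i - \hat{y}_{(i)}\right)^2 .
\end{aligned}
```

Removing terms that carry no real signal leaves the R-squared almost unchanged but tends to lower PRESS, which is why the predicted R-squared is the statistic that improves when an overfitted model is reduced.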


Figure 2: Residual plot analysis for the reduced quadratic regression model. A: The residuals (differences between observed and predicted values) are randomly scattered across the row order, showing no clear pattern, which is a good sign. B: Histogram of the spread of the residuals, which suggests that most residuals are close to zero with a few small deviations. C: Residuals versus predicted values, and the lack of a clear pattern indicates that the model is consistent. D: Most points follow a straight line, meaning the residuals are close to normally distributed, with minor deviations at the edges.

As shown in Figure 2, the residual plots indicated homoscedasticity, meaning that the spread of the residuals (the differences between observed and predicted values) is consistent across the range of predicted values. This consistency, coupled with only minor deviations from normality in the tails (Fig. 2D), suggests that the model was well fitted, with no significant issues in residual behavior. Figures 2A and 2B further show that the residuals were randomly scattered around zero, supporting the robustness of the regression model and its ability to give reliable predictions within the tested conditions.

Figure 3: Interaction plots for standardized fluorescence as a function of different predictors: tryptone, maltodextrin (MD), yeast extract, and Mg-glutamate. Each plot illustrates the effect of one predictor at varying levels of the others. Black, red, and green lines represent different concentrations of: A - tryptone (0.8%, 1.65%, and 2.5%), B - MD (30 mg/mL, 40 mg/mL, and 50 mg/mL), C - yeast extract (0.8%, 1.55%, and 2.3%), and D - Mg-glutamate (4.5 mM, 7.25 mM, and 10 mM). The plots demonstrate how changing these concentrations affects the standardized fluorescence and highlight key interactions between the variables.

Both tryptone and yeast extract had a negative effect on standardized fluorescence across all tested ranges, with lower concentrations (ideally below 1 %) providing better results (Fig. 3A and C). Maltodextrin exhibited a similar trend, where the lowest concentration (30 mg/mL) was found to be optimal (Fig. 3B). In contrast, higher concentrations of magnesium glutamate led to improved standardized fluorescence (Fig. 3D). However, at a maltodextrin concentration of 30 mg/mL, changes in magnesium glutamate levels showed no significant effect on fluorescence variation. The contour and surface plots further confirm that reducing tryptone and yeast extract concentrations brings the system closer to the optimal fitness space for peak performance.

Figure 4: Surface plot of standardized fluorescence as a function of tryptone and yeast extract concentrations, with maltodextrin (MD) held at 40 mg/mL and magnesium glutamate (Mg) held at 7.25 mM. The plot demonstrates that lower concentrations of both tryptone and yeast extract result in higher standardized fluorescence, supporting the conclusion that reduced levels of these components lead to better performance in the energy buffer system.

Figure 5: Contour plot of standardized fluorescence as a function of tryptone and yeast extract concentrations, with maltodextrin (MD) held at 40 mg/mL and magnesium glutamate (Mg) held at 7.25 mM. The color gradient indicates the standardized fluorescence levels, with red representing lower values and blue representing higher values. The plot shows that reducing the concentrations of both tryptone and yeast extract leads to higher fluorescence, supporting the conclusion that lower concentrations of these components are beneficial for optimal performance.

2. The second Central Composite Design (CCD)

Since yeast extract and tryptone both serve as amino acid sources [3], their effects may be confounded. The design was therefore simplified by using yeast extract as the sole source of amino acids. This decision was based on the significance analysis, which indicated that yeast extract had a greater impact on the results than tryptone. Additionally, we hypothesized that using a single amino acid source could enhance the model's predictive capability and improve the fit of the regression model. The explored design space was shifted according to the optimization of the first CCD, which favored a lower concentration of yeast extract and a low MD concentration of about 20 mg/mL. For Mg-glutamate, the range was narrowed around the optimized value to refine the design close to the expected optimum.

Table 5. A summary of the factor ranges of the second CCD.

Component	Minimum level	Maximum level	Center point
Yeast extract (%)	1	2	1.5
Mg-glutamate (mM)	3	7.6	5.3
MD (mg/ml)	15	25	20

The resulting quadratic model was as follows:

Standardized_fluorescence = 0.08618 + 0.04077 · MD - 0.02461 · Yeast - 0.08813 · Mg + 0.09064 · MD² + 0.04619 · Yeast² + 0.07845 · Mg² - 0.02978 · (MD × Yeast) - 1.08487E-5 · (MD × Mg) + 0.05925 · (Yeast × Mg)

Table 6. Statistics of the full quadratic regression model fit of the second CCD.

Statistic	Value
Root Mean Square of Error	0.05466
R-Squared	0.88277
Adj. R-Squared	0.86167
Residual Sums of Squares	0.14937
Predicted R-Squared	0.8119

The model fit was very good: 88 % of the variance in the response could be explained by the model. However, the drop in the predicted R-squared (0.81) relative to the R-squared (0.88) indicates potential overfitting. This can be traced in the effect plot (Fig. 6), where the MD × Mg interaction term was not significant but still included. Accordingly, the model was simplified by excluding the MD × Mg interaction term:

Standardized_fluorescence = 0.08618 + 0.04077 · MD - 0.02461 · Yeast - 0.08813 · Mg + 0.09064 · MD² + 0.04619 · Yeast² + 0.07845 · Mg² - 0.02978 · (MD × Yeast) + 0.05925 · (Yeast × Mg)

The resulting predicted R-squared improved slightly to 82 %, while the R-squared and adjusted R-squared were unaffected. The contour plot and the surface plot (Fig. 6 and 7) suggest that fluorescence could be improved further by combining a low concentration of yeast extract with an MD concentration slightly above the range tested in the current design (>25 mg/mL). As shown in the contour plot (Fig. 7), within the tested ranges, a lower concentration of yeast extract and a higher concentration of MD gave higher lysate activity.

Figure 6: Effect plot of standardized fluorescence, showing the standardized effects of the factors and their interactions. The horizontal blue line represents the significance threshold at 2.009; bars exceeding this value are considered statistically significant. The strongest effects were observed for maltodextrin (MD) and its quadratic term (MD*MD), magnesium glutamate (Mg), and the Mg*Mg term. Yeast extract, its interaction with magnesium glutamate (Yeast*Mg), and its quadratic term (Yeast*Yeast) also had notable effects, though to a lesser extent. The smaller effects of MD*Yeast and MD*Mg indicate smaller contributions to the standardized fluorescence.

Figure 7: Contour plot of standardized fluorescence as a function of yeast extract and maltodextrin (MD) concentrations, with magnesium glutamate (Mg) held constant at 5.3 mM. The color gradient represents the levels of standardized fluorescence, where blue indicates lower fluorescence and red indicates higher fluorescence. The plot shows that moderate levels of yeast extract and MD yield the highest fluorescence, with decreasing fluorescence observed at both higher and lower concentrations of these components.

Confirmatory Runs

First CCD Optimization:

To achieve maximum standardized fluorescence using tryptone, maltodextrin, yeast extract, and Mg-glutamate as predictors for the alternative energy buffer preparation, the reduced regression model of the first CCD was applied. In addition to maximizing standardized fluorescence, a second response, the standard deviation (std) of the triplicate fluorescence measurements, was included to improve prediction accuracy and the robustness of the energy buffer design.

A two-response optimization procedure was performed, focusing on minimizing the log of the standard deviation (log[std]) while maximizing standardized fluorescence. This approach was inspired by [9]. The log of the standard deviation was used to further refine the prediction capability.

Statistics for the log[std] response regression are summarized in Table 7. The model showed an R-squared of 52 %, meaning that it explains only about half of the variation in the response. This suggests that much of the standard deviation among replicates within the same CCD run was random rather than driven by specific predictors, which is consistent with a robust experimental design. Minimizing log[std] while maximizing fluorescence nevertheless guided the optimization towards a more robust region of the design space, with the aim of improving the reliability of the predicted activity.
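One common way to combine "maximize fluorescence" and "minimize log[std]" into a single score is a desirability function, where each response is mapped to a value between 0 and 1 and the two values are combined by a geometric mean. The sketch below illustrates that idea only; the source does not state which multi-response method or which limits were used, so the function names, the linear ramps, and the example ranges are all assumptions.

```python
def desirability_max(y, low, high):
    """Larger is better: 0 at or below low, 1 at or above high, linear between."""
    if y <= low:
        return 0.0
    if y >= high:
        return 1.0
    return (y - low) / (high - low)


def desirability_min(y, low, high):
    """Smaller is better: 1 at or below low, 0 at or above high, linear between."""
    return 1.0 - desirability_max(y, low, high)


def overall_desirability(fluorescence, log_std, fluo_range, log_std_range):
    """Geometric mean of the two individual desirabilities."""
    d_fluo = desirability_max(fluorescence, *fluo_range)
    d_std = desirability_min(log_std, *log_std_range)
    return (d_fluo * d_std) ** 0.5


# Hypothetical predictions and acceptance ranges, for illustration only.
score = overall_desirability(
    fluorescence=0.75, log_std=-2.5, fluo_range=(0.0, 1.0), log_std_range=(-3.0, -1.0)
)
```

Because the geometric mean is zero whenever either individual desirability is zero, a candidate recipe cannot score well by excelling on one response while failing the other.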

Table 7. A summary of the statistics of the first CCD regression model for fitting the standard deviation (std) of the triplicated standardized fluorescence measurements.

Statistic	Value
Root Mean Square of Error	0.043
R-Squared	0.52
Adj. R-Squared	0.43
Residual Sums of Squares	0.14937
Predicted R-Squared	0.33

The regression model developed for the log[std] response of the standardized fluorescence is as follows:

log[std] = -2.04245 - 0.0472 · Tryptone - 0.20784 · MD - 0.00912 · Yeast + 0.25215 · Mg - 0.15394 · MD² - 0.15394 · Yeast² + 0.00539 · Mg² - 0.09885 · (Tryptone × MD) - 0.03989 · (Tryptone × Yeast) - 0.01835 · (Tryptone × Mg) - 0.18588 · (MD × Yeast) + 0.17078 · (MD × Mg)

Second CCD Optimization:

The same strategy used in the first CCD was applied in the second optimization. The regression model for the second CCD displayed relatively similar statistics, as summarized below:

Table 8. A summary of the statistics of the second CCD regression model for fitting the standard deviation (std) of the triplicated standardized fluorescence measurements.

Statistic	Value
Root Mean Square of Error	0.09
R-Squared	0.61
Adj. R-Squared	0.55
Residual Sums of Squares	0.00431
Predicted R-Squared	0.42

The regression model developed for the log[std] response of the standardized fluorescence is as follows:

log[std] = 0.0103 + 0.00351 · MD - 2.47131E-4 · Yeast - 0.00945 · Mg + 0.00625 · MD² + 0.00228 · Yeast² + 0.00315 · Mg² - 0.00476 · (MD × Yeast) + 0.05925 · (Yeast × Mg)

The optimization process followed the same approach as in the first CCD, focusing on either maximizing standardized fluorescence alone or minimizing the log[std] while maximizing fluorescence for improved robustness.

Confirmatory Runs Optimization and cost estimation

Table 9. DOE optimization strategy results. The table summarizes the different optimization approaches for the alternative energy buffer. In the first CCD, the strategies included maximizing standardized fluorescence alone and maximizing fluorescence while minimizing the log of the standard deviation (log[std]). The second CCD optimization involved maximizing standardized fluorescence and/or minimizing log[std]. Key parameters such as tryptone, maltodextrin (MD), yeast, Mg-glutamate, and HMP concentrations are listed alongside predicted and actual standardized fluorescence values. Standardized fluorescence values closer to the predicted ones indicate successful optimization with minimal variation.

DOE optimization strategy	Variant number	Tryptone (%)	MD (mg/ml)	Yeast (%)	Mg-glutamate (mM)	HMP (mg/ml)	Predicted standardized fluorescence	Actual standardized fluorescence
First CCD: maximizing standardized fluorescence & minimizing log[std]	1	0	20	0.5	3	1	0.76	0.75 ± 0.01
Second CCD: maximizing standardized fluorescence &/or minimizing log[std]	2	0	27	0.5	3	1.3	0.65	0.6 ± 0.15
First CCD: maximizing standardized fluorescence alone	3	3	20	0.5	4	1	0.56	0.37 ± 0.09

The confirmatory runs demonstrated promising alternative energy buffer formulations compared to the standard control, highlighting their potential for various tests and laboratory applications. This is particularly noteworthy given the cost-effectiveness and sustainability of the components used. As shown in the Results section and in the DOE optimization, yeast extract outperformed tryptone as an amino acid source. Additionally, the combination of yeast extract and maltodextrin (MD) reduced the need for Mg-glutamate compared to the optimal concentrations required in the control. Variants 1 and 2, as shown in Table 9 and Figure 8, approached the efficiency of the canonical energy buffer. Variant 1 achieved 75 % of the maximum fluorescence compared to the average of the controls, while Variant 2 achieved 61 % of the average control fluorescence.

Beyond performance, the cost-effectiveness of these variants is significant. For a 12 µL reaction, the cost of the two alternative energy buffer variants is approximately 0.00725 euros, compared to 0.0819 euros for the canonical energy buffer. This means the alternative formulations are about 91 % cheaper than the traditional energy buffer composition. Interestingly, lower chemical concentrations led to better performance of the energy components, with complex amino acid sources like yeast extract contributing to reduced optimal Mg-glutamate levels. However, while these alternative energy buffers show promise, their robustness could raise concerns in applications where maintaining signal strength within the dynamic range is critical. The inherent variability of complex amino acid sources could also affect consistency because of batch-to-batch variation [3]. In the first CCD, maximizing the response while reducing variance appeared to offer a viable solution. For the second CCD, the predicted runs provided similar results, showing consistency in the optimization approach.
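The cost comparison above can be checked with a few lines of arithmetic. The two per-reaction costs are the figures quoted in the text; the per-millilitre scaling is simply a linear extrapolation added for illustration, and the function names are hypothetical.

```python
ALTERNATIVE_EUR_PER_REACTION = 0.00725  # alternative buffer, 12 µL reaction
CANONICAL_EUR_PER_REACTION = 0.0819     # canonical buffer, 12 µL reaction


def relative_saving(alternative_cost, reference_cost):
    """Fraction of the reference cost that is saved by the alternative."""
    return 1.0 - alternative_cost / reference_cost


def cost_per_ml(cost_per_reaction_eur, reaction_volume_ul=12.0):
    """Linear extrapolation of the per-reaction cost to 1 mL of reaction."""
    return cost_per_reaction_eur * 1000.0 / reaction_volume_ul


saving = relative_saving(ALTERNATIVE_EUR_PER_REACTION, CANONICAL_EUR_PER_REACTION)
canonical_per_ml = cost_per_ml(CANONICAL_EUR_PER_REACTION)
alternative_per_ml = cost_per_ml(ALTERNATIVE_EUR_PER_REACTION)
```

The saving works out to a little over 91 %, and the alternative buffer costs roughly one eleventh of the canonical buffer at any reaction volume, since both costs scale linearly.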

Figure 8: Normalized fluorescence over time for controls (standard composition) and variants (new composition) of energy buffers. Control 1 (same run) and Control 2 (previous run), shown as blue and yellow lines, represent the standard energy buffer formulation, while Variant 1 and Variant 2 (green and orange lines) represent the optimized energy buffer formulations. The shaded areas around each line indicate the variation in the measurements (n = 3).

Both optimized variants show reduced fluorescence compared to the controls, but demonstrate more consistent and predictable behavior over time.

Conclusions:

Optimizing an alternative energy buffer using yeast extract as the sole source of amino acids, NMP instead of NTP, and MD and HMP instead of 3-PGA has proven feasible through the Design of Experiments (DOE) approach. This strategy reduced costs by roughly 90 %, with the trade-off that lysate activity reaches approximately 75 % of that observed with the canonical energy buffer. Despite these promising results, our study has limitations in reproducibility across different batches of yeast extract, which may vary between labs.

For industrial applications or scientific assays, further investigations are essential to assess batch-to-batch variations when using different batches of yeast extract, lysate, or DNA. Such studies are crucial for ensuring the consistency and reliability of the energy buffer. Future research should focus on standardizing yeast quality and its processing to mitigate these variations, thereby enhancing the buffer's applicability across various settings.

Scale-up Modeling

We also modeled the scale-up of our reaction. This Process Flow Diagram (PFD) maps the inputs and outputs of our system, to better understand the flow of the reaction and to identify the potential bottlenecks in the system. The PFD also allows us to define the necessary equipment and the amount of each component required for reactions of varying sizes.


References:

[1] Bashir, M. J., Amr, S. S. A., Aziz, S. Q., Aun, N. C., & Sethupathi, S. (2015). Wastewater treatment processes optimization using response surface methodology (RSM) compared with conventional methods: review and comparative study. Middle-East J. Sci. Res, 23(2), 244-252.

[2] Rifi, S. K., Souabi, S., El Fels, L., Driouich, A., Nassri, I., Haddaji, C., & Hafidi, M. (2022). Optimization of coagulation process for treatment of olive oil mill wastewater using Moringa oleifera as a natural coagulant, CCD combined with RSM for treatment optimization. Process Safety and Environmental Protection, 162, 406-418.

[3] Nagappa, L. K., Sato, W., Alam, F., Chengan, K., Smales, C. M., Von Der Haar, T., ... & Moore, S. J. (2022). A ubiquitous amino acid source for prokaryotic and eukaryotic cell-free transcription-translation systems. Frontiers in Bioengineering and Biotechnology, 10, 992708.

[4] Warfel, K. F., Williams, A., Wong, D. A., Sobol, S. E., Desai, P., Li, J., ... & Jewett, M. C. (2022). A low-cost, thermostable, cell-free protein synthesis platform for on-demand production of conjugate vaccines. ACS synthetic biology, 12(1), 95-107.

[5] Guzman-Chavez, F., Arce, A., Adhikari, A., Vadhin, S., Pedroza-Garcia, J. A., Gandini, C., ... & Haseloff, J. (2022). Constructing cell-free expression systems for low-cost access. ACS Synthetic Biology, 11(3), 1114-1128.

[6] Borkowski, O., Koch, M., Zettor, A., Pandi, A., Batista, A. C., Soudier, P., & Faulon, J. L. (2020). Large scale active-learning-guided exploration for in vitro protein production optimization. Nature communications, 11(1), 1872.

[7] Sun, Z. Z., Hayes, C. A., Shin, J., Caschera, F., Murray, R. M., & Noireaux, V. (2013). Protocols for implementing an Escherichia coli based TX-TL cell-free expression system for synthetic biology. JoVE (Journal of Visualized Experiments), (79), e50762.

[8] Lehr, F. X., Gaizauskaite, A., Lipińska, K. E., Gilles, S., Sahoo, A., Inckemann, R., & Niederholtmeyer, H. (2023). Modular Golden Gate Assembly of Linear DNA Templates for Cell-free Prototyping. arXiv preprint arXiv:2310.13665.

[9] Statistics Made Easy by Stat-Ease, "DOE for on target results with minimal variation," YouTube. Available: https://www.youtube.com/watch?v=bntcmZR0bwE. [Accessed: Oct. 1, 2024].
