1. Overview
Fuel ethanol, currently one of the most promising transportation fuels, is produced by fermentation with Saccharomyces cerevisiae (brewer's yeast). Using traditional starch-based feedstocks for fuel ethanol production raises the issue of "competing with food for grains and land." Replacing them with lignocellulosic feedstocks is expected to address this problem. After pretreatment, lignocellulosic materials contain glucose and are rich in xylose. However, brewer's yeast cannot naturally utilize xylose, so part of the raw material is wasted. The traditional way to give brewer's yeast the ability to utilize xylose is to introduce a xylose isomerization pathway and then culture the strain continuously in xylose medium (domestication). This process is time-consuming and labor-intensive, and the resulting phenotype can be unstable.
This project is based on the I492N mutation in the NFS1 gene, identified through comparative genomics, combined with multi-copy integration of the xylose isomerase gene PsXI, to rationally construct a brewer's yeast strain capable of rapid xylose utilization.
Based on the experimental design, we successfully constructed four different strains of brewer's yeast: xyl-8XI, xyl-8XI-nfs1, xyl-8XI-△ISU1, and xyl-8XI-nfs1-△ISU1. Subsequently, we conducted xylose fermentation tests with these four strains alongside the wild-type strain xyl. Every 8 hours, we extracted samples from the anaerobic tubes, retaining 50 µL for high-performance liquid chromatography (HPLC) to analyze xylose consumption until the fermentation ended at 48 hours.
Using the experimental data, we established fitting curves for the xylose consumption of the five strains over the cultivation times (8, 16, 24, 32, 40, and 48 hours). These fitting curves were used to evaluate how xylose consumption changes with cultivation time, to test the impact of the NFS1 and ISU1 genes on the strains' ability to metabolize xylose, and to determine the strain with the best xylose metabolism capability and reaction time.
2. Raw data
We conducted xylose fermentation tests with the four strains alongside the wild-type strain xyl. Samples were taken at regular intervals: every 8 hours, 200 µL of culture was transferred into a 1.5 mL centrifuge tube, and 50 µL of it was used for high-performance liquid chromatography (HPLC) analysis of xylose consumption, until the fermentation ended at 48 hours. The samples were diluted 20-fold with deionized water, vortexed, and centrifuged at 12,000 rpm for 30 minutes. A 200 µL aliquot of the supernatant was taken as the sample. The instrument was an Agilent 1200 HPLC with a Bio-Rad HPX-87H column and a differential refractive index detector. The column temperature was 65 °C, the mobile phase was 5 mM dilute sulfuric acid at a flow rate of 0.6 mL/min, and each sample run lasted 30 minutes.
Table 1 lists the xylose measured at each sampling time. The rows correspond to the cultivation times of 8 h, 16 h, 24 h, 32 h, 40 h, and 48 h, and the columns give the xylose (in mg) for each of the five strains at these times. The data show that the xyl strain behaves differently from the other four strains: its xylose stays at roughly the same level, whereas the xylose of the other four strains drops sharply. We used statistical modeling to analyze the data.
Table 1 Xylose metabolism in yeast strains at different times (xylose in mg)
Time/h | xyl | xyl-8XI | xyl-8XI-nfs1 | xyl-8XI-△ISU1 | xyl-8XI-nfs1-△ISU1 |
8 | 789.8 | 232.7 | 211.4333333 | 239.1 | 266.0666667 |
16 | 866.03 | 26.4 | 20.63333333 | 47.7 | 46.23333333 |
24 | 773.33 | 2.2333 | 42.36666667 | 22.03333333 | 4.633333333 |
32 | 768.73 | 0.6067 | 18.63333333 | 3.366666667 | 3.266666667 |
40 | 932.37 | 1.3167 | 8.666666667 | 4.563333333 | 0 |
48 | 829.47 | 0 | 0 | 2.2 | 0 |
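As a rough illustration of how Table 1 can be read, the MATLAB sketch below converts the values of the four engineered strains into the percentage of xylose consumed. It assumes an initial amount of 800 mg, which is the t = 0 value used in the raw data input of Section 4.1; the variable names (`residual`, `x0`, `consumedPct`) are ours and the values are rounded to four decimals.

```matlab
% Xylose (mg) from Table 1; rows = 8, 16, 24, 32, 40, 48 h
t = (8:8:48)';
% Columns: xyl-8XI, xyl-8XI-nfs1, xyl-8XI-dISU1, xyl-8XI-nfs1-dISU1
residual = [232.7     211.4333  239.1     266.0667
            26.4      20.6333   47.7      46.2333
            2.2333    42.3667   22.0333   4.6333
            0.6067    18.6333   3.3667    3.2667
            1.3167    8.6667    4.5633    0
            0         0         2.2       0];
x0 = 800;                                  % assumed initial xylose (mg), t = 0 entry of Section 4.1
consumedPct = 100 * (x0 - residual) / x0;  % percent of the initial xylose consumed
disp([t consumedPct]);                     % first column: time (h)
```

Under this assumption, each row shows how far the fermentation has progressed for the four strains at that sampling time.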
3. Modelling process and results
3.1 Differential Analysis
We first tested whether the data from the four engineered strains differ from one another. The Friedman test for paired samples was used to check for significant differences among the variables. The resulting P-value was 0.134, which is greater than 0.05, so the result is not statistically significant: there is no significant difference among xyl-8XI, xyl-8XI-nfs1, xyl-8XI-△ISU1, and xyl-8XI-nfs1-△ISU1. The effect size, measured by Cohen's f value, was 0.012, indicating a very small difference. From a statistical perspective, the experimental data of the four strains therefore do not differ significantly, so the same model can be used to describe them.
3.1.1 Normality Test
Table 2 presents the descriptive statistics and the results of the normality tests for the variables xyl-8XI, xyl-8XI-nfs1, xyl-8XI-△ISU1, and xyl-8XI-nfs1-△ISU1, including means, standard deviations, and other metrics used to assess normality. Two standard tests for normality are available: the Shapiro-Wilk test, suitable for small sample sizes (N ≤ 5000), and the Kolmogorov-Smirnov test, suitable for large sample sizes (N > 5000). A significant result (P < 0.05) rejects the null hypothesis that the data follow a normal distribution, meaning that the data do not meet the normality requirement. If the result is not significant, the data can be treated as normally distributed.
For xyl-8XI and xyl-8XI-nfs1, with N < 5000, the Shapiro-Wilk test was applied. It gave P < 0.001 (reported as 0.000***), which is significant, so the null hypothesis is rejected. These data therefore do not meet the normality requirement, and the non-parametric Friedman test is appropriate. For xyl-8XI-△ISU1 and xyl-8XI-nfs1-△ISU1, also with N < 5000, the Shapiro-Wilk test gave P = 0.001***, which is likewise significant. These data also do not meet the normality requirement, which again supports the use of the Friedman test.
Table 2 Normality Test Results
Variable Name | Sample Size | Mean | Standard Deviation | Skewness | Kurtosis | S-W Test | K-S Test |
xyl-8XI | 7 | 151.894 | 298.147 | 2.263 | 5.16 | 0.614(0.000***) | 0.377(0.210) |
xyl-8XI-nfs1 | 7 | 157.39 | 292.673 | 2.342 | 5.574 | 0.617(0.000***) | 0.367(0.236) |
xyl-8XI-△ISU1 | 7 | 159.852 | 294.703 | 2.254 | 5.135 | 0.634(0.001***) | 0.363(0.249) |
xyl-8XI-nfs1-△ISU1 | 7 | 160.029 | 298.244 | 2.164 | 4.658 | 0.642(0.001***) | 0.363(0.248) |
Note: ***, **, and * represent significance levels of 1%, 5%, and 10%, respectively. |
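The descriptive statistics in Table 2 can be cross-checked in base MATLAB. The sketch below does this for xyl-8XI only, using the seven values from Section 4.1 (including the t = 0 entry). The bias-adjusted formulas for sample skewness and excess kurtosis are an assumption on our part about the convention behind the table, not something the analysis software's output states.

```matlab
% xyl-8XI series including the t = 0 value (7 points, as in Table 2)
x = [800 232.7 26.4 2.233333333 0.606666667 1.316666667 0];
n = numel(x);
m = mean(x);
s = std(x);                 % sample standard deviation (n - 1 denominator)
z = (x - m) / s;            % standardized values

% Bias-adjusted sample skewness and excess kurtosis (assumed convention)
skew = n / ((n-1)*(n-2)) * sum(z.^3);
kurt = n*(n+1) / ((n-1)*(n-2)*(n-3)) * sum(z.^4) ...
       - 3*(n-1)^2 / ((n-2)*(n-3));

fprintf('mean = %.3f, sd = %.3f, skewness = %.3f, kurtosis = %.3f\n', ...
        m, s, skew, kurt);
```

With these formulas the values agree with the xyl-8XI row of Table 2 to the reported precision. The strong positive skew comes from the single large t = 0 value against many near-zero late values.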
Figure 1 shows the histogram of the xyl-8XI data. The distribution is only roughly bell-shaped (high in the center and low at the ends) and is not perfectly normal, which is in line with the Shapiro-Wilk result in Table 2.
Figure 1. Histogram of the Normality Test for xyl-8XI
Figure 2 shows the histogram of the xyl-8XI-nfs1 data. The distribution is only roughly bell-shaped (high in the center and low at the ends) and is not perfectly normal, which is in line with the Shapiro-Wilk result in Table 2.
Figure 2. Histogram of the Normality Test for xyl-8XI-nfs1
Figure 3 shows the histogram of the xyl-8XI-△ISU1 data. The distribution is only roughly bell-shaped (high in the center and low at the ends) and is not perfectly normal, which is in line with the Shapiro-Wilk result in Table 2.
Figure 3. Histogram of the Normality Test for xyl-8XI-△ISU1
Figure 4 shows the histogram of the xyl-8XI-nfs1-△ISU1 data. The distribution is only roughly bell-shaped (high in the center and low at the ends) and is not perfectly normal, which is in line with the Shapiro-Wilk result in Table 2.
Figure 4. Histogram of the Normality Test for xyl-8XI-nfs1-△ISU1
3.1.2 Friedman Test
Table 3 presents the results of the Friedman test, including the median, test statistic, and effect size (Cohen's f value). The P-value is 0.134, which is greater than 0.05, so the result is not statistically significant. This suggests no significant differences among xyl-8XI, xyl-8XI-nfs1, xyl-8XI-△ISU1, and xyl-8XI-nfs1-△ISU1. The effect size, measured by Cohen's f value, is 0.012, indicating a very small difference.
Table 3: Results of the Friedman Test Analysis
Variable Name | Sample Size | Median | Standard Deviation | Test Statistic | P | Cohen's f value |
xyl-8XI | 7 | 2.233 | 298.147 | 5.571 | 0.134 | 0.012 |
xyl-8XI-nfs1 | 7 | 20.633 | 292.673 | |||
xyl-8XI-△ISU1 | 7 | 22.033 | 294.703 | |||
xyl-8XI-nfs1-△ISU1 | 7 | 4.633 | 298.244 | |||
Note: ***, **, and * represent significance levels of 1%, 5%, and 10%, respectively. |
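To make the Friedman calculation concrete, here is a minimal base-MATLAB sketch that ranks the four strains within each time point and computes the tie-corrected statistic. It treats the seven time points (including t = 0 from Section 4.1) as blocks. The chi-square approximation with k - 1 degrees of freedom and the tie correction are our assumptions about how the statistic in Table 3 was obtained, and the variable names are ours.

```matlab
% Rows = time points 0, 8, ..., 48 h (blocks); columns = the four strains
X = [800    800      800      800
     232.7  211.4333 239.1    266.0667
     26.4   20.6333  47.7     46.2333
     2.2333 42.3667  22.0333  4.6333
     0.6067 18.6333  3.3667   3.2667
     1.3167 8.6667   4.5633   0
     0      0        2.2      0];
[n, k] = size(X);
R = zeros(n, k);      % within-row mid-ranks (ties share the average rank)
tieTerm = 0;          % sum of (t^3 - t) over all tie groups
for i = 1:n
    row = X(i, :);
    for j = 1:k
        R(i, j) = sum(row < row(j)) + (sum(row == row(j)) + 1) / 2;
    end
    [~, ~, grp] = unique(row);
    cnt = accumarray(grp(:), 1);             % size of each tie group
    tieTerm = tieTerm + sum(cnt.^3 - cnt);
end
rankSum = sum(R, 1);
chi2 = 12 / (n*k*(k+1)) * sum(rankSum.^2) - 3*n*(k+1);
chi2 = chi2 / (1 - tieTerm / (n*k*(k^2 - 1)));   % tie correction
p = gammainc(chi2/2, (k-1)/2, 'upper');          % chi-square upper tail, k-1 d.f.
fprintf('Friedman chi2 = %.3f, p = %.3f\n', chi2, p);
```

With these inputs the sketch gives a statistic of about 5.571 and a P-value of about 0.134, matching Table 3. Without the tie correction the statistic would be smaller, because the t = 0 row and the 48 h row contain tied values.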
Post-hoc multiple comparisons were conducted using the Nemenyi test for pairwise differences, with the results showing:
For the paired comparisons of xyl-8XI with xyl-8XI-nfs1, xyl-8XI-△ISU1, and xyl-8XI-nfs1-△ISU1, the significance P-values were 0.588, 0.163, and 0.820, respectively. These values indicate non-significance, concluding that there are no significant differences between xyl-8XI and xyl-8XI-nfs1, xyl-8XI-△ISU1, or xyl-8XI-nfs1-△ISU1.
For the paired comparisons of xyl-8XI-nfs1 with xyl-8XI-△ISU1 and xyl-8XI-nfs1-△ISU1, the significance P-values were 0.820 and 0.900, respectively. These also indicate non-significance. Thus, there are no significant differences between xyl-8XI-nfs1 and xyl-8XI-△ISU1 or xyl-8XI-nfs1-△ISU1.
For the paired comparisons of xyl-8XI-△ISU1 with xyl-8XI-nfs1-△ISU1, the significance P-value was 0.588, indicating non-significance. Therefore, there are no significant differences between xyl-8XI-△ISU1 and xyl-8XI-nfs1-△ISU1.
Table 4: Post-hoc Multiple Comparisons
Paired Variables | Pair 1 (Median ± Standard Deviation) | Pair 2 (Median ± Standard Deviation) | Pair Difference (Pair 1 - Pair 2) | Test Statistic | P | Cohen's |
xyl-8XI paired with xyl-8XI-nfs1 | 2.233±298.147 | 20.633±292.673 | 18.4±5.474 | 1.757 | 0.588 | 0.019 |
xyl-8XI paired with xyl-8XI-△ISU1 | 2.233±298.147 | 22.033±294.703 | 19.8±3.445 | 2.928 | 0.163 | 0.027 |
xyl-8XI paired with xyl-8XI-nfs1-△ISU1 | 2.233±298.147 | 4.633±298.244 | 2.4±0.097 | 1.171 | 0.820 | 0.027 |
xyl-8XI-nfs1 paired with xyl-8XI-△ISU1 | 20.633±292.673 | 22.033±294.703 | 1.4±2.03 | 1.171 | 0.820 | 0.008 |
xyl-8XI-nfs1 paired with xyl-8XI-nfs1-△ISU1 | 20.633±292.673 | 4.633±298.244 | 16±5.571 | 0.586 | 0.900 | 0.009 |
xyl-8XI-△ISU1 paired with xyl-8XI-nfs1-△ISU1 | 22.033±294.703 | 4.633±298.244 | 17.4±3.541 | 1.757 | 0.588 | 0.001 |
Note: ***, **, and * represent significance levels of 1%, 5%, and 10%, respectively. |
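The pairwise test statistics in Table 4 can be sketched from the mean ranks of the strains. The code below assumes the studentized-range form of the Nemenyi statistic, q = |difference of mean ranks| / sqrt(k(k+1)/(12n)), with the seven time points of Section 4.1 as blocks. It does not compute the P-values, which require the studentized range distribution, and the ASCII strain labels (`dISU1` for △ISU1) are ours.

```matlab
% Rows = time points 0, 8, ..., 48 h (blocks); columns = the four strains
X = [800    800      800      800
     232.7  211.4333 239.1    266.0667
     26.4   20.6333  47.7     46.2333
     2.2333 42.3667  22.0333  4.6333
     0.6067 18.6333  3.3667   3.2667
     1.3167 8.6667   4.5633   0
     0      0        2.2      0];
names = {'xyl-8XI', 'xyl-8XI-nfs1', 'xyl-8XI-dISU1', 'xyl-8XI-nfs1-dISU1'};
[n, k] = size(X);
R = zeros(n, k);                       % within-row mid-ranks
for i = 1:n
    row = X(i, :);
    for j = 1:k
        R(i, j) = sum(row < row(j)) + (sum(row == row(j)) + 1) / 2;
    end
end
meanRank = mean(R, 1);
se = sqrt(k*(k+1) / (12*n));           % scale for the studentized-range form (assumed)
q = zeros(k);                          % upper triangle holds the pairwise statistics
for a = 1:k-1
    for b = a+1:k
        q(a, b) = abs(meanRank(a) - meanRank(b)) / se;
        fprintf('%s vs %s: q = %.3f\n', names{a}, names{b}, q(a, b));
    end
end
```

Under these assumptions the six statistics agree with the Test Statistic column of Table 4 (for example, about 1.757 for xyl-8XI versus xyl-8XI-nfs1).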
4. Modeling Results
Simple inspection of the data indicates that the xyl data fluctuate but can be considered stable, with the fluctuations attributed to measurement error. The other four datasets show a clear overall decreasing trend. We employed a general differential equation model (used in epidemiology, population decline, demographics, etc.): $dy/dx = by$, where $y$ is the xylose amount, $x$ is the cultivation time, and $b$ is the rate constant. This equation leads to the exponential model $y = ae^{bx}$, commonly used for predictions in infectious diseases, population forecasting, and similar applications.
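For completeness, the step from the differential equation to the exponential model can be written out as follows. The rate constant is written as $b$ so that it matches the exponent of the fitted model; the half-life $x_{1/2}$ is an extra quantity introduced here for illustration and is not part of the original analysis.

```latex
\begin{align}
\frac{dy}{dx} &= b\,y \\
\int \frac{dy}{y} &= \int b\,dx
  \;\Longrightarrow\; \ln y = bx + C \\
y(x) &= a\,e^{bx}, \qquad a = e^{C} = y(0) \\
x_{1/2} &= \frac{\ln 2}{|b|} \quad (b < 0)
\end{align}
```

Here $a$ is the amount of xylose at $x = 0$, and a more negative $b$ corresponds to faster consumption.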
4.1 Raw data input
% Sampling times (h); the first entry is t = 0
t0=[0 8 16 24 32 40 48];
% Xylose (mg) of each strain at the times in t0
xyl0=[800 789.8 866.0333333 773.3333333 768.7333333 932.3666667 829.4666667];
xyl8XI0=[800 232.7 26.4 2.233333333 0.606666667 1.316666667 0];
xyl8XInfs10=[800 211.4333333 20.63333333 42.36666667 18.63333333 8.666666667 0];
xyl8XIISU10=[800 239.1 47.7 22.03333333 3.366666667 4.563333333 2.2];
xyl8XInfs1ISU10=[800 266.0666667 46.23333333 4.633333333 3.266666667 0 0];
4.2 Exponential Model Fitting for xyl-8XI Data
% Fit y = a*exp(b*x) to the xyl-8XI data (requires the Curve Fitting Toolbox)
[xData1, yData1] = prepareCurveData( t0, xyl8XI0 );
ft1 = fittype( 'exp1' );
opts = fitoptions( 'Method', 'NonlinearLeastSquares' );
opts.Display = 'Off';
opts.StartPoint = [895.542960213597 -0.2308525541144];
[fitresult1, gof] = fit( xData1, yData1, ft1, opts );
figure(1);
h1=plot( fitresult1);
h1.LineWidth = 1.2;
hold on
plot(xData1, yData1,'*','LineWidth',1.2)
hold off
Figure 5. Linear Fitting Plot of xyl-8XI
Figure 6. Exponential Function Analysis of xyl-8XI
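The exponential fit for xyl-8XI can also be cross-checked without the Curve Fitting Toolbox. The sketch below minimizes the sum of squared errors of y = a·exp(b·t) with `fminsearch` from base MATLAB and reports R². It is not the fitting procedure used above: the optimizer, the starting guess `p0`, and the variable names are our own assumptions, so small differences from the toolbox result are to be expected.

```matlab
% xyl-8XI data from Section 4.1
t = [0 8 16 24 32 40 48];
y = [800 232.7 26.4 2.233333333 0.606666667 1.316666667 0];

model = @(p, t) p(1) * exp(p(2) * t);           % p = [a b]
sse   = @(p) sum((model(p, t) - y).^2);         % sum of squared errors
p0    = [y(1) -0.1];                            % assumed start: a = y(0), mild decay
pHat  = fminsearch(sse, p0);                    % Nelder-Mead least-squares fit
r2    = 1 - sse(pHat) / sum((y - mean(y)).^2);  % coefficient of determination
fprintf('a = %.1f, b = %.4f, R^2 = %.4f\n', pHat(1), pHat(2), r2);
```

The same pattern applies to the other three strains by swapping in their data vectors.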
4.3 Exponential Model Fitting for xyl-8XI-nfs1 Data (Original Data Has Been Entered)
% Fit y = a*exp(b*x) to the xyl-8XI-nfs1 data
[xData2, yData2] = prepareCurveData( t0, xyl8XInfs10 );
ft2 = fittype( 'exp1' );
opts2 = fitoptions( 'Method', 'NonlinearLeastSquares' );
opts2.Display = 'Off';
opts2.StartPoint = [654.358037949279 -0.112316524170885];
[fitresult2, gof] = fit( xData2, yData2, ft2, opts2 );
figure(2);
h2=plot( fitresult2);
h2.LineWidth = 1.2;
hold on
plot(xData2, yData2,'*','LineWidth',1.2)
hold off
Figure 7. Linear Fitting Plot of xyl-8XI-nfs1
Figure 8. Exponential Function Analysis of xyl-8XI-nfs1
4.4 Exponential Model Fitting for xyl-8XI-△ISU1
% Fit y = a*exp(b*x) to the xyl-8XI-△ISU1 data
[xData3, yData3] = prepareCurveData( t0, xyl8XIISU10 );
ft3 = fittype( 'exp1' );
opts3 = fitoptions( 'Method', 'NonlinearLeastSquares' );
opts3.Display = 'Off';
opts3.StartPoint = [781.106905967862 -0.149625769406093];
[fitresult3, gof] = fit( xData3, yData3, ft3, opts3 );
figure(3);
h3=plot( fitresult3);
h3.LineWidth = 1.2;
hold on
plot(xData3, yData3,'*','LineWidth',1.2)
hold off
Figure 9. Linear Fitting Plot of xyl-8XI-△ISU1
Figure 10. Exponential Function Analysis of xyl-8XI-△ISU1
4.5 Exponential Model Fitting for xyl-8XI-nfs1-△ISU1
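% Fit y = a*exp(b*x) to the xyl-8XI-nfs1-△ISU1 data
[xData4, yData4] = prepareCurveData( t0, xyl8XInfs1ISU10 );
ft4 = fittype( 'exp1' );
opts4 = fitoptions( 'Method', 'NonlinearLeastSquares' );
opts4.Display = 'Off';
opts4.StartPoint = [904.88237079592 -0.206138435964666];
[fitresult4, gof] = fit( xData4, yData4, ft4, opts4 );
figure(4);
h4=plot( fitresult4);
h4.LineWidth = 1.2;
hold on
plot(xData4, yData4,'*','LineWidth',1.2)
hold off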
Figure 11. Linear Fitting Plot of xyl-8XI-nfs1-△ISU1
Figure 12. Exponential Function Analysis of xyl-8XI-nfs1-△ISU1
4.6 Comparison of Variations Among the Four Strains
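% Overlay the four fitted curves in a new figure
figure(5);
h1=plot( fitresult1,'r');   % red: xyl-8XI
h1.LineWidth = 1.2;
hold on
h2=plot( fitresult2,'b');   % blue: xyl-8XI-nfs1
h2.LineWidth = 1.2;
h3=plot( fitresult3,'g');   % green: xyl-8XI-△ISU1
h3.LineWidth = 1.2;
h4=plot( fitresult4,'k');   % black: xyl-8XI-nfs1-△ISU1
h4.LineWidth = 1.2;
legend('xyl-8XI','xyl-8XI-nfs1','xyl-8XI-△ISU1','xyl-8XI-nfs1-△ISU1')
xlabel('Time (h)')
ylabel('Xylose (mg)')
grid on
hold off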
Figure 13. Comparison of Xylose Metabolism Capability Among the Four Strains
As shown in Figure 13, among the four strains, the blue curve representing xyl-8XI-nfs1 demonstrates superior xylose consumption compared to the other three strains. The black curve for xyl-8XI-nfs1-△ISU1 indicates the weakest performance in the first 24 hours. The red curve for xyl-8XI is similar to the green curve for xyl-8XI-△ISU1.
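One way to put numbers on this visual comparison is to compare the fitted rate constants and the corresponding half-lives, ln(2)/|b|. The sketch below refits each strain with `fminsearch` from base MATLAB rather than the Curve Fitting Toolbox, so its estimates may differ slightly from the curves in Figure 13. The half-life metric, the starting guess, and the ASCII strain labels (`dISU1` for △ISU1) are our own assumptions for illustration.

```matlab
% Data from Section 4.1; one row per strain
t = [0 8 16 24 32 40 48];
Y = [800 232.7       26.4        2.233333333 0.606666667 1.316666667 0
     800 211.4333333 20.63333333 42.36666667 18.63333333 8.666666667 0
     800 239.1       47.7        22.03333333 3.366666667 4.563333333 2.2
     800 266.0666667 46.23333333 4.633333333 3.266666667 0           0];
names = {'xyl-8XI', 'xyl-8XI-nfs1', 'xyl-8XI-dISU1', 'xyl-8XI-nfs1-dISU1'};

b = zeros(1, 4);          % fitted rate constants (1/h)
halfLife = zeros(1, 4);   % time for the fitted curve to halve (h)
for s = 1:4
    y = Y(s, :);
    sse = @(p) sum((p(1) * exp(p(2) * t) - y).^2);  % least-squares objective
    pHat = fminsearch(sse, [y(1) -0.1]);            % assumed starting guess
    b(s) = pHat(2);
    halfLife(s) = log(2) / abs(b(s));
    fprintf('%-20s b = %.4f 1/h, half-life = %.2f h\n', ...
            names{s}, b(s), halfLife(s));
end
```

A shorter half-life corresponds to a steeper fitted curve. Because the fit is dominated by the first few time points, the metric mainly reflects the early phase of the fermentation.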
5. Conclusion
Introducing multiple copies of the xylose isomerase PsXI into different xyl control strains has enhanced their xylose metabolism capabilities. Analysis of the xylose metabolism abilities among the four strains shows that the xyl-8XI-nfs1 strain outperforms the other three. The xyl-8XI-nfs1-△ISU1 strain exhibits the weakest performance in the first 24 hours, while xyl-8XI is similar to xyl-8XI-△ISU1. This is consistent with previous qualitative tests on xylose metabolism using xylose plates. The actual performance of xylose metabolism in the fermentation broth further validates the advantages of the NFS1 mutant strain in xylose metabolism.
In the future, the xyl-8XI-nfs1 strain will be tested as the dominant strain for its metabolic capability in actual hydrolysates, which contain glucose, xylose, and many inhibitors produced from lignin degradation. Our goal is to improve the strain's utilization of xylose and enhance its stability in hydrolysates, ultimately promoting the production of second-generation bioethanol.
6. References
[1] Scientific Platform Serving for Statistics Professional 2021. SPSSPRO. (Version 1.0.11) [Online Application Software]. Retrieved from https://www.spsspro.com.
[2] Xu, Weichao. "A Review of Correlation Coefficient Research." Journal of Guangdong University of Technology, 2012, 29(3): 12-17.
[3] Cheng, Xiaoliang. "Nonparametric Statistical Analysis of Economic Data in Anshan Area." Journal of Anshan Normal University, 2017, 19(04): 6-8.