Model

1. Summary

This project aims to engineer bacterial outer membrane vesicles (OMVs) as delivery vehicles that carry gene-editing tools and targeting elements. Our main concern is the loading efficiency of the delivery vehicle. In the preliminary phase, after a literature review and data pre-processing, we used bioinformatics methods to predict several indicators that affect loading efficiency; these predictions were then used to adjust later experiments.

1.1. Characterization of drug-loaded OMVs
The method of drug loading and the type of drug may affect the structure and function of OMVs. For example, electroporation may cause OMVs to aggregate or fuse, ultrasonic treatment may damage their structure, and the use of a liposome extruder may affect membrane proteins and zeta potential. To address these issues, in the preliminary experiment phase, we washed and centrifuged OMVs at high speed after drug loading to purify the samples, and then lysed the purified OMVs to detect protein concentration. The assessment indicators of drug loading efficiency include encapsulation efficiency, drug loading amount, and positive rate.

2. Formula

2.1. Encapsulation efficiency:
Refers to the ratio of the amount of drug successfully loaded into OMVs to the total amount of drug input, reflecting the utilization rate of the drug. Different drugs and loading methods will lead to different encapsulation efficiencies, and differences in detection methods will also affect the calculation results.
Calculation formula:
Encapsulation efficiency (%) = (Amount of loaded drug / Amount of input drug) × 100%
2.2. Drug loading amount:
Refers to the amount of drug loaded per unit of OMVs; the higher the drug loading amount, the fewer OMVs are required. The unit of drug loading amount may vary depending on the type of drug, such as the number of moles, mass, or copy number.
Calculation formula:
Drug loading amount = Total amount of loaded drug / Total amount of OMVs
2.3. Positive rate:
Refers to the ratio of OMVs that have successfully loaded the drug to the total amount of OMVs, which is an important indicator for evaluating drug loading efficiency. The positive rate, in combination with the drug loading amount, can comprehensively assess the drug loading performance of OMVs.
Calculation formula:
Positive rate (%) = (Amount of drug-loaded OMVs / Total amount of OMVs) × 100%
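The three formulas above can be written as small helper functions. This is a minimal sketch: the function names are our own for illustration, the example numbers are made up rather than measured, and the units of each argument are whatever the assay reports (mass, moles, copy number, or particle counts).

```python
def encapsulation_efficiency(loaded_drug, input_drug):
    """Percentage of the input drug that ended up inside OMVs."""
    if input_drug <= 0:
        raise ValueError("input_drug must be positive")
    return loaded_drug / input_drug * 100.0


def drug_loading_amount(loaded_drug, total_omvs):
    """Drug carried per unit of OMVs; units follow the inputs."""
    if total_omvs <= 0:
        raise ValueError("total_omvs must be positive")
    return loaded_drug / total_omvs


def positive_rate(loaded_omvs, total_omvs):
    """Percentage of OMVs that carry the drug."""
    if total_omvs <= 0:
        raise ValueError("total_omvs must be positive")
    return loaded_omvs / total_omvs * 100.0


# Illustrative numbers only, not measurements from this project.
print(encapsulation_efficiency(30.0, 120.0))  # 25.0
print(drug_loading_amount(30.0, 60.0))        # 0.5
print(positive_rate(75, 150))                 # 50.0
```

Keeping the three indicators as separate functions makes it easy to report them side by side, since a high encapsulation efficiency does not by itself imply a high positive rate.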
In summary, building on the laboratory's existing experimental work, we used machine learning in Python to build a model that predicts, and helps optimize, the drug loading efficiency of OMVs.

3. Single-variable experimental conditions include:

Amount of Cas9-RNP plasmid transfected per 50 μl of competent cells:
A: 100 ng; B: 150 ng; C: 200 ng; D: 250 ng
IPTG induction concentration:
A: 0.1 mM; B: 0.2 mM; C: 0.5 mM; D: 1 mM
Induction temperature:
A: 18°C; B: 25°C; C: 30°C; D: 37°C
Induction time:
A: 8 hours; B: 10 hours; C: 12 hours; D: 16 hours
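The single-variable design above can be sketched as a list of runs in which one factor changes while the others stay fixed. Two things here are assumptions rather than facts from the experiments: the baseline condition (the text does not say at which level the other factors are held) and the reading of IPTG level C as 0.5 mM, in line with the other three levels. The dictionary keys are hypothetical names chosen for this sketch.

```python
# Factor levels A-D taken from the list above.
LEVELS = {
    "plasmid_ng": [100, 150, 200, 250],   # per 50 μl of competent cells
    "iptg_mM": [0.1, 0.2, 0.5, 1.0],      # level C read as 0.5 mM (assumption)
    "temperature_C": [18, 25, 30, 37],
    "time_h": [8, 10, 12, 16],
}

# ASSUMPTION: hypothetical baseline; the source does not state which level
# the other factors are held at while one factor is varied.
BASELINE = {"plasmid_ng": 100, "iptg_mM": 0.1, "temperature_C": 18, "time_h": 8}


def single_variable_runs(levels, baseline):
    """Return one run per level of each factor, other factors at baseline."""
    runs = []
    for factor, values in levels.items():
        for value in values:
            run = dict(baseline)
            run[factor] = value
            if run not in runs:  # the baseline condition is listed only once
                runs.append(run)
    return runs


runs = single_variable_runs(LEVELS, BASELINE)
print(len(runs))  # 13: the baseline plus 3 other levels for each of 4 factors
```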

4. Model establishment and result visualization process:

Data collection:
Collect the experimental factors for each run: Cas9-RNP plasmid transfection amount, IPTG induction concentration, induction temperature, and induction time.
Data preprocessing:
Organize the data into a format suitable for machine learning models (as shown in Fig.1 and 2).

Fig.1 Collate the dataset
Fig.2 Visually inspect the dataset
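As a rough illustration of the preprocessing step, the sketch below parses a table of runs into a feature matrix and a target list using only the Python standard library. The column names and the rows are placeholders invented for this example; they do not reproduce the dataset shown in the figures. Rows with a missing value are dropped rather than filled in.

```python
import csv
import io

# Placeholder rows that only illustrate the layout; not measured values.
# Column names are hypothetical.
RAW = """plasmid_ng,iptg_mM,temperature_C,time_h,loading_efficiency
100,0.1,18,8,
150,0.2,25,10,12.5
200,0.5,30,12,15.0
"""

FEATURES = ["plasmid_ng", "iptg_mM", "temperature_C", "time_h"]
TARGET = "loading_efficiency"


def load_dataset(text):
    """Parse CSV text into (X, y), skipping rows with missing values."""
    X, y = [], []
    for row in csv.DictReader(io.StringIO(text)):
        values = [row[name] for name in FEATURES + [TARGET]]
        if any(v is None or v.strip() == "" for v in values):
            continue  # drop incomplete records rather than guessing
        X.append([float(row[name]) for name in FEATURES])
        y.append(float(row[TARGET]))
    return X, y


X, y = load_dataset(RAW)
print(X)  # two complete rows remain
print(y)
```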

Feature Selection:
Determine which features (variables) will be used for model training. We select the amount of Cas9-RNP plasmid transfection, IPTG induction concentration, induction temperature, and induction time.
Model Selection:
Choose an appropriate machine learning model. For problems that predict continuous values, regression models such as linear regression, decision trees, or random forests are commonly used. We use a linear regression model.
Model Training:
Train the model using the training dataset.
Model Evaluation:
Evaluate the performance of the model using the testing dataset. Common evaluation metrics include Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and R² score.
Result Visualization:
Use libraries such as matplotlib or seaborn to plot a comparison of the model's predicted results with the actual results, as well as possible residual plots. (As shown in Fig.3, 4, and 5)

Fig.3 Model training
Fig.4 Visualization of model training results
Fig.5 Visualization of model training results
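The training and evaluation steps can be sketched with scikit-learn as follows. This is not the script shown in the figures: the data are synthetic placeholders generated from an arbitrary linear relationship so that the example runs, and the split ratio and random seeds are assumptions made for this sketch.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic placeholder data: NOT measurements from this project.
rng = np.random.default_rng(0)
n = 40
X = np.column_stack([
    rng.choice([100, 150, 200, 250], n),  # plasmid amount, ng
    rng.choice([0.1, 0.2, 0.5, 1.0], n),  # IPTG, mM
    rng.choice([18, 25, 30, 37], n),      # induction temperature, °C
    rng.choice([8, 10, 12, 16], n),       # induction time, h
])
# Arbitrary linear relationship plus noise, used only so the script runs.
y = (5 + 0.02 * X[:, 0] + 3 * X[:, 1] + 0.1 * X[:, 2] + 0.2 * X[:, 3]
     + rng.normal(0, 0.5, n))

# Hold out a quarter of the runs for testing (ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
rmse = float(np.sqrt(mse))
r2 = r2_score(y_test, y_pred)
residuals = y_test - y_pred  # what a residual plot would display

print(f"MSE={mse:.3f}  RMSE={rmse:.3f}  R2={r2:.3f}")
```

The arrays `y_test`, `y_pred`, and `residuals` are the inputs a predicted-versus-actual plot and a residual plot would use in matplotlib or seaborn.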

5. Model Optimization

Based on the results of the model evaluation, adjust the model parameters or feature selection to optimize the model performance.
Final Prediction:
Use the optimized model to predict the loading efficiency under new experimental conditions. (As shown in Fig. 6 and 7)

Fig.6 Input data to the resulting prediction model to obtain the predicted loading rate

6. Predicting the Optimal Solution

The current model is for reference only and serves mainly to illustrate how well it fits the data. Subsequent experiments will adjust the experimental conditions based on its predictions.
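One way to search for the predicted optimum is to score every combination of the factor levels and keep the best one. The sketch below assumes a linear model; its intercept and coefficients are hypothetical stand-ins, not fitted values from this project, and IPTG level C is taken as 0.5 mM.

```python
from itertools import product

# Factor levels from section 3 (IPTG level C read as 0.5 mM, an assumption).
LEVELS = {
    "plasmid_ng": [100, 150, 200, 250],
    "iptg_mM": [0.1, 0.2, 0.5, 1.0],
    "temperature_C": [18, 25, 30, 37],
    "time_h": [8, 10, 12, 16],
}

# HYPOTHETICAL coefficients standing in for a fitted linear model.
INTERCEPT = 5.0
COEFS = {"plasmid_ng": 0.02, "iptg_mM": 3.0, "temperature_C": -0.1, "time_h": 0.2}


def predict(condition):
    """Linear prediction of loading efficiency for one condition."""
    return INTERCEPT + sum(COEFS[name] * value for name, value in condition.items())


def best_condition(levels, predict_fn):
    """Score all level combinations and return the highest-scoring one."""
    names = list(levels)
    candidates = (dict(zip(names, combo)) for combo in product(*levels.values()))
    return max(candidates, key=predict_fn)


best = best_condition(LEVELS, predict)
print(best)
```

Because a linear model rises or falls steadily with each factor, its predicted optimum on such a grid always sits at the lowest or highest level of every factor. Finding an optimum at an intermediate level would require a model with nonlinear terms, which is one reason to treat this prediction as a guide for follow-up experiments rather than a final answer.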