miRNA model


MicroRNA levels can reveal something about a person’s physiological state

MicroRNAs (miRNAs) are noncoding, single-stranded RNA molecules that are 18-24 nucleotides long.1 They play a key role in the regulation of gene expression by repressing translation and degrading mRNA.2 Previous studies have shown that a change in the miRNA expression profile correlates with the progression of many diseases, including neurodegenerative diseases.3-5 In multiple sclerosis (MS) patients, some miRNAs are also dysregulated and can be up- or down-regulated compared to healthy people.6 Through human practices, we found that relapsing-remitting MS (RRMS) was the best type of MS to focus on. As miRNAs can be found in the peripheral blood, they are potential biomarkers for minimally invasive diagnosis and reproducible testing.7 Their stability and accuracy in blood samples also make them suitable biomarkers.7,8 This part of our dry-lab aims to find an miRNA combination in the blood that is specific for RRMS. This RRMS-specific miRNA combination will be incorporated in our test design for the diagnosis of RRMS.

Data processing

First, we built a classification model to distinguish the miRNA expression data of MS patients from those of healthy controls. For this classification, we used the dataset obtained by Cox et al. (2010).9 In addition to the miRNA expression data of 37 healthy controls, this dataset contains the miRNA expression data of 18 primary progressive, 17 secondary progressive and 24 RRMS patients. The full dataset was filtered to retain only the RRMS patients and healthy controls. The miRNA expression values were determined with the Illumina BeadArray reader. This programme reports a detection p-value after comparing the miRNA expression data with the background values determined by the negative control probes. As the detection p-value indicates the significance of miRNA detection, miRNA expression values with a p-value higher than 0.05 were replaced by "Not a Number" (NaN) values. After a binary logarithmic (log2) transformation, the data were processed according to the same procedure used by Cox et al. (2010).9 To account for deviation and outliers, a baseline transformation to the median was performed. A quantile normalisation was performed to decrease the technical variability between the samples.10
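The processing steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for our analysis: the function names are hypothetical, the baseline transformation is interpreted here as subtracting each miRNA's median across samples (an assumption), and the quantile normalisation assumes complete data without NaN values and breaks ties by position rather than averaging them.

```python
import math
from statistics import median

NAN = float("nan")

def mask_and_log2(values, pvalues, alpha=0.05):
    """Replace values with a detection p-value above alpha by NaN, then take log2.

    Assumes all detected expression values are strictly positive.
    """
    return [math.log2(v) if p <= alpha else NAN for v, p in zip(values, pvalues)]

def baseline_to_median(row):
    """Subtract the median of the detected (non-NaN) values of one miRNA."""
    detected = [x for x in row if not math.isnan(x)]
    if not detected:
        return list(row)
    m = median(detected)
    return [x - m for x in row]  # NaN stays NaN

def quantile_normalise(samples):
    """Give every sample the same value distribution.

    `samples` is a list of equally long lists (one list per sample).
    The k-th smallest value of each sample is replaced by the mean of the
    k-th smallest values over all samples.
    """
    n = len(samples[0])
    ranked = [sorted(s) for s in samples]
    reference = [sum(r[k] for r in ranked) / len(samples) for k in range(n)]
    result = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # indices from low to high
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        result.append(out)
    return result

# Toy example with made-up numbers (not taken from the dataset)
logged = mask_and_log2([8.0, 4.0, 2.0], [0.01, 0.2, 0.05])
centred = baseline_to_median(logged)
normalised = quantile_normalise([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

In the toy example the second value is masked because its p-value (0.2) exceeds 0.05, and after quantile normalisation both samples contain exactly the same set of values.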
 

Classification of miRNA expression profiles of RRMS patients and healthy controls

A random forest, a machine learning algorithm, was used to classify RRMS patients and healthy controls. Comparing the predicted values with the true values gave our model an accuracy of 0.69. The miRNAs with a prominent role in the classification were determined through a method called permutation importance. This method permutes the expression values of each miRNA separately and recalculates the accuracy of the model. Positive importance values indicate a decrease in accuracy after permuting the values, while negative values indicate an increase.11 Seven miRNAs had a positive importance value and therefore play a prominent role in this classification (Figure 1). We also found two miRNAs with a negative importance value, meaning that randomly permuting the expression values of these miRNAs increased the accuracy of the model.

Figure 1. miRNAs important in the classification of RRMS patients and healthy controls. Feature (y-axis) indicates the miRNAs found to play a prominent role in the classification. The importance (x-axis) was determined by permutation importance. The importance values were positive for seven miRNAs and negative for two miRNAs. A positive importance score indicates that a random permutation of the miRNA expression values decreased the accuracy of the model, while a negative importance score indicates that a random permutation resulted in a higher model accuracy.
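The idea behind permutation importance can be illustrated with a short, self-contained sketch. It is not our actual pipeline: the random forest is replaced here by a hypothetical toy classifier (`toy_predict`) so that the example runs with the Python standard library only, and the data are made up. Libraries such as scikit-learn offer a ready-made version (`sklearn.inspection.permutation_importance`).

```python
import random

def accuracy(predict, X, y):
    """Fraction of samples for which the predicted label equals the true label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Importance of feature j = baseline accuracy minus the mean accuracy
    obtained after shuffling column j (all other columns left untouched)."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances

# Hypothetical stand-in for the trained model: it only looks at feature 0.
def toy_predict(row):
    return 1 if row[0] > 0 else 0

# Made-up data: two features (miRNAs), four samples, labels 1 = RRMS, 0 = control
X = [[2.0, 0.3], [1.5, 0.9], [-1.0, 0.4], [-2.0, 0.8]]
y = [1, 1, 0, 0]
baseline, importances = permutation_importance(toy_predict, X, y)
```

Because the toy classifier ignores the second feature, shuffling it never changes the accuracy and its importance is exactly zero. A negative importance arises when the shuffled data happen to be classified better than the original data.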

MS-specific miRNA combination excluding mimic diseases

Through human practices, we found that Dr. Pablo Villoslada (Professor of Neurology and MS researcher, Barcelona) was concerned about the specificity of the miRNAs for the diagnosis of MS. During the progression of diseases similar to RRMS, referred to as mimic diseases, the same miRNAs could be upregulated. This similarity in expression profile makes it difficult to distinguish between MS and these mimic diseases. To prevent our test from detecting a mimic disease, we introduced the use of the Human miRNA Disease Database (HMDD) in our pipeline.12 This database lists miRNAs found to be up- or down-regulated in disease. We searched it for the seven miRNAs found to have a positive importance value in the previous step. We were unable to identify two of them (HS 263.1 and HS 65) in this database, due to the use of an older nomenclature structure. Therefore, with the miRNA sequences obtained from the paper by Cox et al. (2010), we performed sequence alignments in miRBase, which lists miRNA names and sequences.9,13 Using this webtool with the default settings, we tried to find the names of the two miRNAs according to the new nomenclature structure. However, no similar miRNAs were found. Therefore, we continued the analysis with the five remaining miRNAs. By searching for their names with the default settings in the HMDD, we obtained a table showing all diseases in which the five miRNAs are found to be dysregulated (Table 1). For hsa-miR-17-5p, the search term hsa-miR-17 was used, as HMDD does not distinguish between the two mature miRNAs derived from the opposite arms of a pre-miRNA (hsa-miR-17-3p and hsa-miR-17-5p).


hsa-miR-17 | hsa-miR-431 | hsa-miR-494 | hsa-miR-106a | hsa-miR-191
Acth-Independent Macronodular Adrenal Hyperplasia | Adrenocortical Carcinoma | Acute Kidney Injury | Acute Lymphoblastic Leukemia | Acquired Immunodeficiency Syndrome
Acute Coronary Syndrome | Alzheimer Disease | Acute Lung Injury | Adenocarcinoma of Lung | Acromegaly
Acute Lung Injury | Arthritis | Aggressive Periodontitis | Adenomatous Polyposis Coli | Acute Kidney Injury
Adenocarcinoma of Lung | Breast Neoplasms | Arthritis | Alzheimer Disease | Adenocarcinoma of Lung
Adenomyosis | Carcinoma | Atherosclerosis | Angelman Syndrome | Adenomyosis
Alcohol Withdrawal Delirium | Cardiomyopathies | Brain Ischemia | Aortic Aneurysm | Alzheimer Disease
Alopecia Areata | Cicatrix | Breast Neoplasms | Asthma | Aneuploidy
Alzheimer Disease | Colonic Neoplasms | Carcinoma | Atherosclerosis | Anorexia Nervosa
Amyotrophic Lateral Sclerosis | Colorectal Neoplasms | Cardiovascular Diseases | Atrial Fibrillation | Aortic Aneurysm, Abdominal
Angina Pectoris | Diabetic Retinopathy | Cerebral Hemorrhage | Autism Spectrum Disorder | Aortic Valve Insufficiency
ankylosing spondylitis 1 | Enterocolitis | Cervical Spondylomyelopathy | Bipolar Disorder | Arthritis, Rheumatoid
Anodontia | Esophageal Neoplasms | Cicatrix | Bone Neoplasms | Atrial Fibrillation
Anxiety Disorders | Glioma | Colitis | Brain Neoplasms | Brain Injuries
Aortic Aneurysm | Hand | Colonic Diseases | Breast Neoplasms | Breast Neoplasms
Arthritis | Hepatitis | Colorectal Neoplasms | Carcinoma | Burns
Aspergillosis | Hirschsprung Disease | Coronary Occlusion | Cleft Palate | Carcinoma, Hepatocellular
Asthma | Infertility | Crohn Disease | Colonic Neoplasms | Carcinoma, Non-Small-Cell Lung
Atherosclerosis | Intervertebral Disc Degeneration | Cystic Fibrosis | Colorectal Neoplasms | Carcinoma, Transitional Cell
Autistic Disorder | Lissencephaly | Depressive Disorder | Coronary Artery Disease | Cardiomegaly
Benign Paroxysmal Positional Vertigo | Liver Neoplasms | Diabetes Mellitus | Diabetes Mellitus | Cardiomyopathies
Biliary Tract Neoplasms | Lung Carcinoid | Diabetes | Diabetic Foot | Cervical Intraepithelial Neoplasia
Brain Injuries | Lung Neoplasms | Diabetic Nephropathies | Diabetic Nephropathies | Colonic Neoplasms
Brain Ischemia | Lymphoma | Dmd-Associated Dilated Cardiomyopathy | Drug Hypersensitivity | Colorectal Neoplasms
Brain Neoplasms | Melanoma | Dwarfism | Endometriosis | Crohn Disease
Breast Neoplasms | Multiple Sclerosis | Early-Stage Malignant Melanoma | Enteropathy-Associated T-Cell Lymphoma | Dermatitis, Atopic
Bronchopulmonary Dysplasia | Myocardial Infarction | Endometrial Neoplasms | Ependymoma | Diabetes Mellitus
Burns | Nasopharyngeal Carcinoma | Esophageal Neoplasms | Epstein-Barr Virus Infections | Diabetes Mellitus, Type 2
Carcinogenesis | Osteoarthritis | Esophageal Squamous Cell Carcinoma | esophagus adenocarcinoma | Diabetic Nephropathies
Carcinoma | Osteosarcoma | Fanconi Anemia | familial chylomicronemia syndrome | Down Syndrome
Cardiomegaly | Pancreatic Neuroendocrine Tumor | Glioma | Fractures | Endometrial Neoplasms
Cardiomyopathies | Precancerous Conditions | Hand | Glioblastoma | Glioblastoma
Cardiotoxicity | Premature Birth | Hemangioma | Glioma | Heart Defects, Congenital
Celiac Disease | Respiratory Distress Syndrome | Hyperglycemia | Glomerulonephritis | Heart Septal Defects, Ventricular
Cerebral Small Vessel Diseases | Sciatic Neuropathy | Hypospadias | Hearing Loss | Hematologic Neoplasms
Cerebrovascular Disorders | Squamous Cell Carcinoma of Head and Neck | Intervertebral Disc Degeneration | Heart Failure | Hemolysis
cervical squamous cell carcinoma | Stomach Neoplasms | Ischemic Stroke | hereditary diffuse gastric cancer | Hypertension
Charcot-Marie-Tooth disease type 1B | Thrombocythemia | Kidney Diseases | Hypercholesterolemia | Hypertension, Pregnancy-Induced
Cholesteatoma | Thyroid Cancer | Liver Cirrhosis | Hypertension | hypomelanosis of Ito
Chordoma | Thyroid Neoplasms | Liver Diseases | idiopathic scoliosis | Hypoplastic Left Heart Syndrome
Chromosome Duplication | Tuberculosis | Lung Neoplasms | Infertility | Infant, Low Birth Weight
Chronic Rhinosinusitis With Nasal Polyps | Uterine Cervical Neoplasms | Lymphoma | Inflammation | Infertility
Colitis | vulva squamous cell carcinoma | Macular Degeneration | Inflammatory Bowel Diseases | Intracranial Aneurysm
Colonic Neoplasms | | Medulloblastoma | Ischemic Stroke | Ischemic Stroke
Colorectal Neoplasms | | Melanoma | Laryngeal Neoplasms | Kidney Diseases
cone-rod dystrophy 6 | | Mitochondrial Diseases | Lung Neoplasms | late onset Parkinson's disease
Congenital central hypoventilation syndrome | | Mouth Neoplasms | Lymphoma | Leiomyosarcoma, Uterine
congenital nongoitrous hypothyroidism 4 | | Muscular Dystrophy | Lymphoproliferative Disorders | Leukemia, Acute
Coronary Artery Disease | | Myocardial Infarction | Marek Disease | Liver Diseases
Coronary Disease | | Myocardial Reperfusion Injury | Melanoma | Lung Neoplasms
COVID-19 | | Nerve Degeneration | Moebius syndrome 1 | Macular Degeneration
CREST Syndrome | | Neuralgia | Mouth Neoplasms | Malaria, Vivax
Cushing Syndrome | | Neuroblastoma | Multiple Sclerosis | Melanoma
Cystoid Macular Edema | | Neurotoxicity Syndromes | Myasthenia Gravis | Moebius syndrome 1
Depression | | Osteosarcoma | Mycobacterium avium-intracellulare Infection | Multiple Organ Failure
Dermatitis | | pancreatic adenocarcinoma | Mycobacterium Infections | Multiple System Atrophy
Diabetes Mellitus | | Pancreatic Carcinoma | Myocardial Reperfusion Injury | Multiple Trauma
Diabetes | | Parkinson Disease | Nasopharyngeal Carcinoma | Muscular Dystrophy, Duchenne
Diabetic Foot | | Pneumonia | Neoplasm | Myocardial Infarction
Diabetic Nephropathies | | Pre-Eclampsia | Neuralgia | Neurotoxicity Syndromes
Diabetic Retinopathy | | Precursor Cell Lymphoblastic Leukemia-Lymphoma | Non-alcoholic Fatty Liver Disease | Obesity
Down Syndrome | | Premature Birth | nonpapillary renal cell carcinoma | Out-of-Hospital Cardiac Arrest
dystonia 5 | | Prostatic Hyperplasia | Osteosarcoma | Pancreatic Carcinoma
Embolic Stroke | | Prostatic Neoplasms | Ovarian Neoplasms | Parkinsonian Disorders
Encephalocraniocutaneous lipomatosis | | Pulmonary Disease | Pancreatitis | Pediatric Obesity
End Stage Liver Disease | | Reperfusion Injury | Parkinson Disease | Peripheral Blood Leukocytes
Endometrial Hyperplasia | | Retinoblastoma | Periodontal Diseases | Phenylketonurias
Endometriosis | | Rift Valley Fever | Polycystic Ovary Syndrome | Pre-Eclampsia
Enterovirus Infections | | Sarcoma | Pre-Eclampsia | Pregnancy Complications
Ependymoma | | Schizophrenia | Prostatic Neoplasms | Premature Birth
Epstein-Barr Virus Infections | | Sepsis | Prostatic Neoplasms | Prostatic Neoplasms
Esophageal Neoplasms | | Sepsis-Associated Encephalopathy | Rhabdomyosarcoma 1 | Pulmonary Arterial Hypertension
Esophageal Squamous Cell Carcinoma | | Shock | rippling muscle disease 1 | Reperfusion Injury, Renal Ischemia
esophagus adenocarcinoma | | Small Cell Lung Carcinoma | Sepsis | Sepsis
Exudative Vitreoretinopathy 4 | | Spinal Cord Injuries | spinal cord glioma | Squamous Cell Carcinoma of Head and Neck
Fabry Disease | | Squamous Cell Carcinoma of Head and Neck | Squamous Cell Carcinoma of Head and Neck | Stomach Neoplasms
Familial encephalopathy with neuroserpin inclusion bodies | | Stomach Neoplasms | Stomach Neoplasms | Triple Negative Breast Neoplasms
Familial Mediterranean Fever | | T-cell acute lymphoblastic leukemia | Thyroid Cancer | Tuberculosis
Familial primary gastric lymphoma | | Triple Negative Breast Neoplasms | Tibial Fractures | Urinary Bladder Neoplasms
Fibrosis | | Urinary Bladder Neoplasms | Triple Negative Breast Neoplasms | Uveal melanoma
Fluorosis | | Uterine Cervical Neoplasms | Tuberculosis | Wounds and Injuries
Gastrointestinal Neoplasms | | | Urinary Bladder Neoplasms |
Glioblastoma | | | Uterine Cervical Neoplasms |
Glioma | | | Vascular Calcification |
Glomerulonephritis | | | Venous Thromboembolism |
Gout | | | |
Head and Neck Neoplasms | | | |
Heart Failure | | | |
Heart Neoplasms | | | |
Hematologic Diseases | | | |
Hepatitis B | | | |
Hepatitis C | | | |
hereditary diffuse gastric cancer | | | |
Hernias | | | |
high grade glioma | | | |
Histiocytic Sarcoma | | | |
Hypercholesterolemia | | | |
Hyperparathyroidism | | | |
Hypertrophy | | | |
Hypoxia-Ischemia | | | |
idiopathic scoliosis | | | |
Immune System Diseases | | | |
Inflammation | | | |
Insulin Resistance | | | |
Intervertebral Disc Degeneration | | | |
Intracranial Aneurysm | | | |
Iron Metabolism Disorders | | | |
Ischemic Stroke | | | |
Kidney Diseases | | | |
Kidney Failure | | | |
Kidney Neoplasms | | | |
Leukemia | | | |
Lipidoses | | | |
Liver Cirrhosis | | | |
Liver Diseases | | | |
Liver Neoplasms | | | |
Lumbar Radicular Pain | | | |
Lung Injury | | | |
Lung Neoplasms | | | |
Lupus Erythematosus | | | |
Lymphoma | | | |
Lymphoproliferative Disorders | | | |
Macular Degeneration | | | |
Melanoma | | | |
Meningioma | | | |
Metabolic Diseases | | | |
Multiple Myeloma | | | |
Multiple Sclerosis | | | |
Muscular Dystrophy | | | |
Mycosis Fungoides | | | |
Myocardial Infarction | | | |
Myocarditis | | | |
Nasopharyngeal Carcinoma | | | |
Necrobiotic Disorders | | | |
Neointimal Hyperplasia | | | |
Neoplasms | | | |
Nephritis | | | |
Nephrotic Syndrome | | | |
nephrotic syndrome type 1 | | | |
Neuroblastoma | | | |
Neuroectodermal Tumors | | | |
Neuroendocrine Tumors | | | |
Neurotic Disorders | | | |
Neurotoxicity Syndromes | | | |
Nijmegen Breakage Syndrome | | | |
Non-alcoholic Fatty Liver Disease | | | |
Norrie disease | | | |
Obesity | | | |
Oligospermia | | | |
Optic Nerve Injuries | | | |
Ossification of Posterior Longitudinal Ligament | | | |
Ossification of the posterior longitudinal ligament of the spine | | | |
Osteoarthritis | | | |
Osteoarthropathy | | | |
Osteochondritis Dissecans | | | |
osteogenesis imperfecta type 1 | | | |
Osteosarcoma | | | |
Ovarian Neoplasms | | | |
Pancreatic Carcinoma | | | |
Parathyroid Neoplasms | | | |
Parkinson Disease | | | |
Periapical Diseases | | | |
Peripheral Arterial Occlusive Disease 1 | | | |
Peritoneal Fibrosis | | | |
Plaque | | | |
Platelet Storage Pool Deficiency | | | |
Pneumonia | | | |
Polycystic Kidney Diseases | | | |
Post-Dural Puncture Headache | | | |
Pre-Eclampsia | | | |
Precursor T-Cell Lymphoblastic Leukemia-Lymphoma | | | |
Pregnancy Complications | | | |
Premature Birth | | | |
Prostatic Neoplasms | | | |
psoriasis 1 | | | |
Pulmonary Arterial Hypertension | | | |
Pulmonary Fibrosis | | | |
Quadruple Negative Breast Cancer | | | |
Rectal Neoplasms | | | |
Renal Insufficiency | | | |
Reperfusion Injury | | | |
Retinal Diseases | | | |
Retinal Neovascularization | | | |
Retinal Vein Occlusion | | | |
Retinoblastoma | | | |
Retinopathy of Prematurity | | | |
Rhinitis | | | |
Sepsis | | | |
Sexually Transmitted Diseases | | | |
Siderosis | | | |
Sjogren's Syndrome | | | |
skin melanoma | | | |
Spinal Cord Injuries | | | |
Squamous Cell Carcinoma of Head and Neck | | | |
Stanford Type A Aortic Dissection | | | |
Stomach Neoplasms | | | |
Sveinsson Chorioretinal Atrophy | | | |
Teratoid Rhabdoid Tumor | | | |
Thrombotic Stroke | | | |
Thymic aplasia | | | |
Thyroid Cancer | | | |
Thyroid Carcinoma | | | |
Thyroid Neoplasms | | | |
Thyroid Nodule | | | |
Triple Negative Breast Neoplasms | | | |
Tuberculosis | | | |
Urinary Bladder Neoplasms | | | |
Uterine Cervical Neoplasms | | | |
uterine corpus endometrial carcinoma | | | |
Uveal melanoma | | | |
Varicose Veins | | | |
Vascular Diseases | | | |
Ventricular Dysfunction | | | |
Virus Diseases | | | |
Wounds and Injuries | | | |
Wounds | | | |
Table 1. Diseases in which each of the five miRNAs is found to be dysregulated according to the HMDD. Each column is an independent alphabetical list; entries in the same row are not related.
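One possible way to go from the per-miRNA disease lists to a count per mimic disease is simple keyword matching, sketched below. This is an illustration only: the dictionary contains a small excerpt of the disease lists, and both the helper name `mirnas_linked_to` and the keyword-matching rule are assumptions rather than a description of how our counts were produced.

```python
# Small excerpt of the HMDD disease lists (not the complete lists)
HMDD_EXCERPT = {
    "hsa-miR-17": ["Diabetes Mellitus", "Ischemic Stroke", "Parkinson Disease",
                   "Multiple Sclerosis", "Lupus Erythematosus"],
    "hsa-miR-431": ["Diabetic Retinopathy", "Multiple Sclerosis"],
    "hsa-miR-494": ["Diabetes Mellitus", "Ischemic Stroke", "Parkinson Disease"],
    "hsa-miR-106a": ["Diabetes Mellitus", "Ischemic Stroke", "Parkinson Disease",
                     "Myasthenia Gravis", "Multiple Sclerosis"],
    "hsa-miR-191": ["Diabetes Mellitus", "Ischemic Stroke", "Parkinsonian Disorders"],
}

def mirnas_linked_to(keyword, table):
    """Return the miRNAs whose disease list contains the keyword (case-insensitive)."""
    key = keyword.lower()
    return [mirna for mirna, diseases in table.items()
            if any(key in disease.lower() for disease in diseases)]

# Number of the five miRNAs associated with a few mimic diseases
counts = {kw: len(mirnas_linked_to(kw, HMDD_EXCERPT))
          for kw in ("diabet", "stroke", "parkinson", "lupus", "myasthenia")}
```

A keyword such as "diabet" matches both "Diabetes Mellitus" and "Diabetic Retinopathy", so the choice of keyword directly affects the resulting count.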

As all five miRNAs together indicate MS, we implemented so-called AND gates on our diagnostic test platform. With these AND gates, we only obtain an output when all RRMS-indicating miRNAs are present. To prevent the diagnostic test from detecting mimic diseases, we can additionally implement NOT gates. Each NOT gate should represent an miRNA that is dysregulated in the progression of a mimic disease but not in the progression of RRMS. If this miRNA is present, the NOT gate is not activated and no output is seen.

To find a proper NOT gate, we first needed to find the mimic diseases in which all five miRNAs are dysregulated. If all five miRNAs are also dysregulated in a mimic disease, AND gates alone cannot distinguish RRMS from this mimic disease. Through a literature search, we obtained a list of diseases that mimic MS (Table 2). We found that in most mimic diseases not all five miRNAs were dysregulated. The combination of these five miRNAs therefore distinguishes RRMS from those mimic diseases, and AND gates are sufficient to exclude them. However, all five miRNAs are dysregulated in the mimic disease diabetes. Therefore, we searched for an miRNA that can be implemented as a NOT gate for diabetes. This miRNA should be involved in the disease progression of diabetes but not in that of RRMS. If this diabetes-related miRNA is present in the sample, the gate is not activated and no output is produced, whereas the gate is activated if the miRNA is absent.

By searching for ’diabetes’ and ’multiple sclerosis’ in the HMDD with default settings, we obtained two lists of miRNAs dysregulated in diabetes and MS. After comparing these two lists, we found 84 miRNAs that could potentially function as a NOT gate. One criterion for this miRNA was that it is involved in the disease in both men and women. As MS is an autoimmune inflammatory disease, the miRNA in the NOT gate should also not be involved in processes related to immune cells and inflammation, as these might overlap with MS. Through literature searches, we found hsa-miR-1287 to be the most suitable miRNA for the NOT gate for diabetes. This miRNA has not been found to be dysregulated in immune cells or inflammation.
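The intended behaviour of the combined AND and NOT gates can be written down as a small truth function. This is a logical sketch only, under the simplifying assumption that each miRNA is either detected or not detected in a sample; thresholds and the molecular implementation of the gates are not modelled, and the function name `gate_output` is hypothetical.

```python
# miRNAs that must all be detected (AND gates)
AND_INPUTS = ("hsa-miR-17-5p", "hsa-miR-431", "hsa-miR-494",
              "hsa-miR-106a", "hsa-miR-191")
# miRNA that must be absent (NOT gate for diabetes)
NOT_INPUT = "hsa-miR-1287"

def gate_output(detected):
    """Return True only if all AND inputs are detected and the NOT input is absent.

    `detected` is the set of miRNA names detected in one sample.
    """
    return all(mirna in detected for mirna in AND_INPUTS) and NOT_INPUT not in detected

rrms_like = set(AND_INPUTS)                # all five present, no diabetes marker
diabetes_like = rrms_like | {NOT_INPUT}    # all five present plus the diabetes marker
incomplete = rrms_like - {"hsa-miR-431"}   # one RRMS-indicating miRNA missing

results = [gate_output(s) for s in (rrms_like, diabetes_like, incomplete)]
# results is [True, False, False]
```

Only the first sample produces an output: the second is blocked by the NOT gate and the third fails the AND condition.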

Mimic disease | Number of miRNAs involved
Epstein-Barr virus | 2
Vitamin B12 deficiency | 0
Diabetes | 5
Nerve damage | 2
Eye problems | 0
Stroke | 4
Lupus | 1
Parkinson disease | 4
Lyme disease | 0
Myasthenia gravis | 1
Amyotrophic lateral sclerosis | 1
Guillain-Barré syndrome | 0
Acute disseminated encephalomyelitis | 0
Table 2. miRNAs found in mimic diseases. Number of miRNAs discovered with our model that are dysregulated in each of the mimic diseases found in a literature search.
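The decision rule applied to these counts is simple enough to state as code. The sketch below uses the counts from the table above; the names `MIMIC_COUNTS` and `split_mimic_diseases` are hypothetical and only serve this illustration.

```python
N_AND_MIRNAS = 5  # number of miRNAs combined in the AND gates

# Number of the five miRNAs dysregulated in each mimic disease (from the table)
MIMIC_COUNTS = {
    "Epstein-Barr virus": 2,
    "Vitamin B12 deficiency": 0,
    "Diabetes": 5,
    "Nerve damage": 2,
    "Eye problems": 0,
    "Stroke": 4,
    "Lupus": 1,
    "Parkinson disease": 4,
    "Lyme disease": 0,
    "Myasthenia gravis": 1,
    "Amyotrophic lateral sclerosis": 1,
    "Guillain-Barré syndrome": 0,
    "Acute disseminated encephalomyelitis": 0,
}

def split_mimic_diseases(counts, n_and=N_AND_MIRNAS):
    """Separate mimic diseases excluded by the AND gates alone from those
    in which all AND-gate miRNAs are dysregulated (these need a NOT gate)."""
    needs_not_gate = sorted(d for d, c in counts.items() if c >= n_and)
    excluded_by_and = sorted(d for d, c in counts.items() if c < n_and)
    return excluded_by_and, needs_not_gate

excluded_by_and, needs_not_gate = split_mimic_diseases(MIMIC_COUNTS)
# needs_not_gate is ["Diabetes"]; the other 12 diseases are excluded by the AND gates
```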

Conclusion

Our model identified seven miRNAs as important for distinguishing the miRNA expression data of RRMS patients from those of healthy controls. The accuracy of this classification was 0.69. With an additional miRNA, we obtained an miRNA combination that is specific for RRMS and excludes diseases that mimic MS. By implementing the detection of these eight miRNAs in AND and NOT gates in our diagnostic test platform, we can diagnose RRMS and minimise the chance of false positives.