1
dos Santos RR, Marumo MB, Eckeli AL, Salgado HC, Silva LEV, Tinós R, Fazan R. The use of heart rate variability, oxygen saturation, and anthropometric data with machine learning to predict the presence and severity of obstructive sleep apnea. Front Cardiovasc Med 2025; 12:1389402. [PMID: 40161388; PMCID: PMC11949982; DOI: 10.3389/fcvm.2025.1389402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Accepted: 03/03/2025] [Indexed: 04/02/2025] [Open access]
Abstract
Introduction: Obstructive sleep apnea (OSA) is a prevalent sleep disorder with a high rate of undiagnosed patients, primarily due to the complexity of its diagnosis by polysomnography (PSG). Considering the severe comorbidities associated with OSA, especially in the cardiovascular system, the development of early screening tools for this disease is imperative. Heart rate variability (HRV) is a simple and non-invasive approach used to evaluate cardiac autonomic modulation, and many newly developed HRV indices have not yet been studied in OSA patients.
Objectives: We aimed to evaluate numerous HRV indices, some linear but most nonlinear, combined or not with oxygen saturation indices, for detecting the presence and severity of OSA using machine learning models.
Methods: ECG waveforms were collected from 291 PSG recordings to calculate 34 HRV indices. The minimum oxygen saturation value during sleep (SatMin), the percentage of total sleep time the patient spent with oxygen saturation below 90% (T90), and patient anthropometric data were also considered as inputs to the models. The Apnea-Hypopnea Index (AHI) was used to categorize patients into OSA severity classes (normal, mild, moderate, severe) to train multiclass or binary (normal-to-mild and moderate-to-severe) classification models using the Random Forest (RF) algorithm. Since the OSA severity groups were unbalanced, we used the Synthetic Minority Over-sampling Technique (SMOTE) to oversample the minority classes.
Results: Multiclass models achieved a mean area under the ROC curve (AUROC) of 0.92 and 0.86 in classifying normal individuals and severe OSA patients, respectively, when using all attributes. When the groups were dichotomized into normal-to-mild OSA vs. moderate-to-severe OSA, an AUROC of 0.83 was obtained. The RF feature importances indicate that all feature modalities (HRV, SpO2, and anthropometric variables) are represented among the 10 top-ranked features.
Conclusion: The present study demonstrates the feasibility of using classification models to detect the presence and severity of OSA using these indices. Our findings have the potential to contribute to the development of rapid screening tools aimed at assisting individuals affected by this condition, to expedite diagnosis and initiate timely treatment.
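The oxygen saturation inputs and the AHI-based class labels described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The AHI cut-offs (5, 15, and 30 events/h) are the conventional adult thresholds and are an assumption here, since the abstract does not state which ones were used. The function names and the sample values are hypothetical, and T90 is computed under the assumption that the SpO2 samples are evenly spaced over total sleep time.

```python
def sat_min(spo2):
    """Minimum oxygen saturation (%) observed during sleep."""
    return min(spo2)


def t90(spo2):
    """Percentage of samples with SpO2 below 90%.

    Assumes evenly spaced samples covering total sleep time, so the
    share of samples equals the share of time.
    """
    below = sum(1 for s in spo2 if s < 90)
    return 100.0 * below / len(spo2)


def osa_severity(ahi):
    """Map an AHI value (events/h) to a four-level severity class.

    Cut-offs are the conventional adult thresholds (assumed, not taken
    from the abstract).
    """
    if ahi < 5:
        return "normal"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"


def binary_class(ahi):
    """Collapse the four classes into the dichotomy used for the binary models."""
    return "normal-to-mild" if ahi < 15 else "moderate-to-severe"


# Hypothetical SpO2 trace (%), for illustration only.
spo2_samples = [96, 95, 93, 89, 88, 91, 94, 97]
print(sat_min(spo2_samples), t90(spo2_samples))
print(osa_severity(22.4), binary_class(22.4))
```

In practice these values would be joined with the HRV indices and anthropometric data to form one feature vector per recording before training a classifier.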
Affiliation(s)
- Rafael Rodrigues dos Santos
  - Department of Physiology, School of Medicine of Ribeirao Preto, University of Sao Paulo, Ribeirão Preto, Brazil
- Matheo Bellini Marumo
  - Department of Computing and Mathematics, Faculty of Philosophy, Sciences and Letters, University of Sao Paulo, Ribeirão Preto, Brazil
- Alan Luiz Eckeli
  - Department of Neuroscience and Behavior Sciences, Division of Neurology, School of Medicine of Ribeirao Preto, University of Sao Paulo, Ribeirão Preto, Brazil
- Helio Cesar Salgado
  - Department of Physiology, School of Medicine of Ribeirao Preto, University of Sao Paulo, Ribeirão Preto, Brazil
- Luiz Eduardo Virgílio Silva
  - Department of Biomedical and Health Informatics, Children’s Hospital of Philadelphia, Philadelphia, PA, United States
- Renato Tinós
  - Department of Computing and Mathematics, Faculty of Philosophy, Sciences and Letters, University of Sao Paulo, Ribeirão Preto, Brazil
- Rubens Fazan
  - Department of Physiology, School of Medicine of Ribeirao Preto, University of Sao Paulo, Ribeirão Preto, Brazil
2
Rahmati R, Zarimeidani F, Ahmadi F, Yousefi-Koma H, Mohammadnia A, Hajimoradi M, Shafaghi S, Nazari E. Identification of novel diagnostic and prognostic microRNAs in sarcoma on TCGA dataset: bioinformatics and machine learning approach. Sci Rep 2025; 15:7521. [PMID: 40032929; PMCID: PMC11876432; DOI: 10.1038/s41598-025-91007-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2024] [Accepted: 02/17/2025] [Indexed: 03/05/2025] [Open access]
Abstract
The discovery of unique microRNA (miR) patterns and their corresponding genes in sarcoma patients indicates their involvement in cancer development and suggests their potential use in medical management. MiRs were identified from The Cancer Genome Atlas (TCGA) dataset, with a Deep Neural Network (DNN) employed for novel miR identification. MiRDB facilitated target predictions. Functional enrichment, critical pathways, the protein-protein interaction network, and correlations with disease and clinical data were explored. Cox regression, Kaplan-Meier analyses, and CombioROC were also utilized. The population consisted of 119 females and 142 males, and 1046 miRs were uncovered. Ten miRs were selected for further analysis using the DNN. Gene ontology analysis showed that the target genes were enriched in various activities. We identified a significant association between the overall survival rate of sarcoma patients and miR levels. The combination of miR.3688 and miR.3936 achieved the best diagnostic performance. MiRs have the capability to screen sarcoma patients to identify undetected tumors, predict prognosis, and pinpoint prospective targets for treatment. Further large clinical trials are required to validate our findings.
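The Kaplan-Meier analysis mentioned in this abstract rests on the product-limit estimator, which can be sketched in a few lines of Python. This is a generic illustration rather than the authors' pipeline. The function name and the follow-up data are hypothetical, and the sketch follows the usual convention that subjects censored at time t are still counted as at risk at t.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  : follow-up durations
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored subjects both leave the risk set.
        at_risk -= removed
    return curve


# Hypothetical follow-up times (months) and event indicators.
curve = kaplan_meier([5, 8, 8, 12, 15], [1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Comparing such curves between patients with high and low expression of a given miR is one common way to relate expression to overall survival.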
Affiliation(s)
- Rahem Rahmati
  - Students Research Committee, Shahrekord University of Medical Sciences, Shahrekord, Iran
  - Lung Transplantation Research Center, National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Fatemeh Zarimeidani
  - Students Research Committee, Shahrekord University of Medical Sciences, Shahrekord, Iran
  - Lung Transplantation Research Center, National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Farnaz Ahmadi
  - Lung Transplantation Research Center, National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Hannaneh Yousefi-Koma
  - Lung Transplantation Research Center, National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Abdolreza Mohammadnia
  - Chronic Respiratory Diseases Research Center, National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Maryam Hajimoradi
  - Lung Transplantation Research Center, National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Shadi Shafaghi
  - Lung Transplantation Research Center, National Research Institute of Tuberculosis and Lung Diseases (NRITLD), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Elham Nazari
  - Proteomics Research Center, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
3
Ghofrani A, Taherdoost H. Biomedical data analytics for better patient outcomes. Drug Discov Today 2025; 30:104280. [PMID: 39732322; DOI: 10.1016/j.drudis.2024.104280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Revised: 12/16/2024] [Accepted: 12/20/2024] [Indexed: 12/30/2024]
Abstract
Medical professionals today have access to immense amounts of data, which enables them to make decisions that enhance patient care and treatment efficacy. This innovative strategy can improve global health care by bridging the divide between clinical practice and medical research. This paper reviews biomedical developments aimed at improving patient outcomes by addressing three main questions regarding techniques, data sources and challenges. The review includes peer-reviewed articles from 2018 to 2023, found via systematic searches in PubMed, Scopus and Google Scholar. The results show diverse disease-specific applications. Challenges such as data quality and ethics are discussed, underscoring data analytics' potential for patient-focused health care. The review concludes that successful implementation requires addressing gaps, collaboration and innovation in biomedical science and data analytics.
Affiliation(s)
- Hamed Taherdoost
  - Hamta Business Corporation, Vancouver, Canada; University Canada West, Vancouver, Canada; Westcliff University, Irvine, USA; GUS Institute | Global University Systems, London, UK
4
Vignolle GA, Bauerstätter P, Schönthaler S, Nöhammer C, Olischar M, Berger A, Kasprian G, Langs G, Vierlinger K, Goeral K. Predicting Outcomes of Preterm Neonates Post Intraventricular Hemorrhage. Int J Mol Sci 2024; 25:10304. [PMID: 39408633; PMCID: PMC11477204; DOI: 10.3390/ijms251910304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2024] [Revised: 09/13/2024] [Accepted: 09/20/2024] [Indexed: 10/20/2024] [Open access]
Abstract
Intraventricular hemorrhage (IVH) in preterm neonates presents a high risk for developing posthemorrhagic ventricular dilatation (PHVD), a severe complication that can impact survival and long-term outcomes. Early detection of PHVD before clinical onset is crucial for optimizing therapeutic interventions and providing accurate parental counseling. This study explores the potential of explainable machine learning models based on targeted liquid biopsy proteomics data to predict outcomes in preterm neonates with IVH. In recent years, research has focused on leveraging advanced proteomic technologies and machine learning to improve prediction of neonatal complications, particularly in relation to neurological outcomes. Machine learning (ML) approaches, combined with proteomics, offer a powerful tool to identify biomarkers and predict patient-specific risks. However, challenges remain in integrating large-scale, multiomic datasets and translating these findings into actionable clinical tools. Identifying reliable, disease-specific biomarkers and developing explainable ML models that clinicians can trust and understand are key barriers to widespread clinical adoption. In this prospective longitudinal cohort study, we analyzed 1109 liquid biopsy samples from 99 preterm neonates with IVH, collected at up to six timepoints over 13 years. Various explainable ML techniques (including statistical, regularization, deep learning, decision tree, and Bayesian methods) were employed to predict PHVD development and survival and to discover disease-specific protein biomarkers. Targeted proteomic analyses were conducted using serum and urine samples through a proximity extension assay capable of detecting low-concentration proteins in complex biofluids.
The study identified 41 significant independent protein markers in the 1600 calculated ML models that surpassed our rigorous threshold (AUC-ROC of ≥0.7, sensitivity ≥ 0.6, and selectivity ≥ 0.6), alongside gestational age at birth, as predictive of PHVD development and survival. Both known biomarkers, such as neurofilament light chain (NEFL), and novel biomarkers were revealed. These findings underscore the potential of targeted proteomics combined with ML to enhance clinical decision-making and parental counseling, though further validation is required before clinical implementation.
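The selection threshold quoted in this abstract (AUC-ROC ≥ 0.7, sensitivity ≥ 0.6, selectivity ≥ 0.6) can be expressed as a simple filter over per-model results. The Python sketch below is only an illustration: it reads "selectivity" as specificity, which is an assumption, and the marker names, AUC values, and confusion-matrix counts are invented placeholders rather than data from the study.

```python
from dataclasses import dataclass


@dataclass
class ModelResult:
    marker: str   # hypothetical marker label
    auc: float    # AUC-ROC, assumed to be computed elsewhere
    tp: int       # true positives
    fn: int       # false negatives
    tn: int       # true negatives
    fp: int       # false positives

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)


def passes(r: ModelResult, min_auc=0.7, min_sens=0.6, min_spec=0.6) -> bool:
    """True if a model meets all three thresholds named in the abstract."""
    return (r.auc >= min_auc
            and r.sensitivity >= min_sens
            and r.specificity >= min_spec)


# Invented example results, for illustration only.
results = [
    ModelResult("marker_A", 0.82, tp=14, fn=6, tn=60, fp=19),
    ModelResult("marker_B", 0.74, tp=9, fn=11, tn=70, fp=9),
    ModelResult("marker_C", 0.66, tp=15, fn=5, tn=65, fp=14),
]
kept = [r.marker for r in results if passes(r)]
print(kept)
```

Requiring all three criteria at once guards against models that reach a good AUC while performing poorly on one of the two classes.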
Affiliation(s)
- Gabriel A. Vignolle
  - Center for Health & Bioresources, Competence Unit Molecular Diagnostics, AIT Austrian Institute of Technology GmbH, 1210 Vienna, Austria
- Priska Bauerstätter
  - Center for Health & Bioresources, Competence Unit Molecular Diagnostics, AIT Austrian Institute of Technology GmbH, 1210 Vienna, Austria
- Silvia Schönthaler
  - Center for Health & Bioresources, Competence Unit Molecular Diagnostics, AIT Austrian Institute of Technology GmbH, 1210 Vienna, Austria
- Christa Nöhammer
  - Center for Health & Bioresources, Competence Unit Molecular Diagnostics, AIT Austrian Institute of Technology GmbH, 1210 Vienna, Austria
- Monika Olischar
  - Comprehensive Center for Pediatrics, Department of Pediatrics and Adolescent Medicine, Division of Neonatology, Intensive Care and Neuropediatrics, Medical University of Vienna, 1090 Vienna, Austria
- Angelika Berger
  - Comprehensive Center for Pediatrics, Department of Pediatrics and Adolescent Medicine, Division of Neonatology, Intensive Care and Neuropediatrics, Medical University of Vienna, 1090 Vienna, Austria
- Gregor Kasprian
  - Department of Biomedical Imaging and Image-Guided Therapy, Division of Neuro- and Musculoskeletal Radiology, Medical University of Vienna, 1090 Vienna, Austria
- Georg Langs
  - Computational Imaging Research Lab, Department of Biomedical Imaging and Image-Guided Therapy, Medical University of Vienna, 1090 Vienna, Austria
  - Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Klemens Vierlinger
  - Center for Health & Bioresources, Competence Unit Molecular Diagnostics, AIT Austrian Institute of Technology GmbH, 1210 Vienna, Austria
- Katharina Goeral
  - Comprehensive Center for Pediatrics, Department of Pediatrics and Adolescent Medicine, Division of Neonatology, Intensive Care and Neuropediatrics, Medical University of Vienna, 1090 Vienna, Austria
5
Paylar B, Bezabhe YH, Jass J, Olsson PE. Exploring the Sublethal Impacts of Cu and Zn on Daphnia magna: a transcriptomic perspective. BMC Genomics 2024; 25:790. [PMID: 39160502; PMCID: PMC11331620; DOI: 10.1186/s12864-024-10701-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Accepted: 08/12/2024] [Indexed: 08/21/2024] [Open access]
Abstract
Metal contamination of aquatic environments remains a major concern due to the persistence of metals. The water flea Daphnia magna is an important model species for metal toxicity studies and water quality assessment. However, most research has focused on physiological endpoints such as mortality, growth, and reproduction in laboratory settings and has neglected toxicogenomic responses. Copper (Cu) and zinc (Zn) are essential trace elements that play crucial roles in many biological processes, including iron metabolism, connective tissue formation, neurotransmitter synthesis, DNA synthesis, and immune function. Excess amounts of these metals result in deviations from homeostasis and may induce toxic responses. In this study, we analyzed Daphnia magna transcriptomic responses to IC5 levels of Cu (120 µg/L) and Zn (300 µg/L) in environmental water obtained from a pristine lake with adjusted water hardness (150 mg/L CaCO3). The study was carried out to gain insights into the Cu- and Zn-regulated stress response mechanisms in Daphnia magna at the transcriptome level. A total of 2,688 and 3,080 differentially expressed genes (DEGs) were found between the control and Cu and between the control and Zn, respectively. There were 1,793 DEGs in common for Cu and Zn, whereas the numbers of unique DEGs for Cu and Zn were 895 and 1,287, respectively. Gene ontology and KEGG pathway enrichment analyses were carried out to identify the molecular functions and biological processes affected by metal exposures. In addition to well-known biomarkers, novel targets for metal toxicity screening at the genomic level were identified.
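The DEG counts reported in this abstract are mutually consistent, which can be checked with basic set arithmetic: 2,688 − 1,793 = 895 Cu-only genes and 3,080 − 1,793 = 1,287 Zn-only genes. The short Python sketch below reproduces that bookkeeping. The integer IDs stand in for gene identifiers and are placeholders, and the helper name is hypothetical.

```python
def overlap_summary(a, b):
    """Count shared and condition-specific members of two gene sets."""
    a, b = set(a), set(b)
    return {
        "shared": len(a & b),
        "only_a": len(a - b),
        "only_b": len(b - a),
        "union": len(a | b),
    }


# Placeholder IDs arranged so that the set sizes match the abstract:
# 2,688 Cu DEGs, 3,080 Zn DEGs, 1,793 of them shared.
cu_degs = set(range(0, 2688))
zn_degs = set(range(895, 895 + 3080))

summary = overlap_summary(cu_degs, zn_degs)
print(summary)
```

With real data, the two inputs would be the lists of gene identifiers that pass the differential expression cut-off in each exposure.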
Affiliation(s)
- Berkay Paylar
  - Biology, The Life Science Center, School of Science and Technology, Örebro University, SE-701 82, Örebro, Sweden
- Yared H Bezabhe
  - Biology, The Life Science Center, School of Science and Technology, Örebro University, SE-701 82, Örebro, Sweden
- Jana Jass
  - Biology, The Life Science Center, School of Science and Technology, Örebro University, SE-701 82, Örebro, Sweden
- Per-Erik Olsson
  - Biology, The Life Science Center, School of Science and Technology, Örebro University, SE-701 82, Örebro, Sweden
6
Nevoránková P, Šulcová M, Kavková M, Zimčík D, Balková SM, Peléšková K, Kristeková D, Jakešová V, Zikmund T, Kaiser J, Holá LI, Kolář M, Buchtová M. Region-specific gene expression profiling of early mouse mandible uncovered SATB2 as a key molecule for teeth patterning. Sci Rep 2024; 14:18212. [PMID: 39107332; PMCID: PMC11303781; DOI: 10.1038/s41598-024-68016-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Accepted: 07/18/2024] [Indexed: 08/10/2024] [Open access]
Abstract
Mammalian dentition exhibits distinct heterodonty, with simpler teeth located in the anterior area of the jaw and more complex teeth situated posteriorly. While some region-specific differences in signalling have been described previously, here we performed a comprehensive analysis of gene expression at the early stages of odontogenesis to obtain complete knowledge of the signalling pathways involved in early jaw patterning. Gene expression was analysed separately in anterior and posterior areas of the lower jaw at two early stages (E11.5 and E12.5) of odontogenesis. Gene expression profiling revealed distinct region-specific expression patterns in mouse mandibles, including several known BMP and FGF signalling members. We also identified several new molecules exhibiting significant differences in expression along the anterior-posterior axis, which may play a role during incisor and molar specification. Next, we followed one of the anterior molecules, SATB2, which was expressed not only in the anterior mesenchyme where incisor germs are initiated; we also uncovered a distinct SATB2-positive region in the mesenchyme closely surrounding molars. Satb2-deficient animals demonstrated defective incisor development, confirming a crucial role of SATB2 in the formation of anterior teeth. On the other hand, ectopic tooth germs were observed in the molar area, indicating a differential effect of Satb2 deficiency in individual jaw regions. In conclusion, our data provide a rich source of fundamental information, which can be used to determine the molecular regulation driving early embryonic jaw patterning and to deepen understanding of the molecular signalling directed towards incisor and molar development.
Affiliation(s)
- Petra Nevoránková
  - Laboratory of Molecular Morphogenesis, Institute of Animal Physiology and Genetics, v.v.i., Czech Academy of Sciences, Veveri 97, 602 00, Brno, Czech Republic
  - Department of Stomatology, Faculty of Medicine, Masaryk University, Brno, Czech Republic
  - Department of Stomatology, St. Anne's University Hospital, Brno, Czech Republic
- Marie Šulcová
  - Laboratory of Molecular Morphogenesis, Institute of Animal Physiology and Genetics, v.v.i., Czech Academy of Sciences, Veveri 97, 602 00, Brno, Czech Republic
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, Czech Republic
- Michaela Kavková
  - Laboratory of Computed Tomography, CEITEC BUT, Brno, Czech Republic
- David Zimčík
  - Laboratory of Molecular Morphogenesis, Institute of Animal Physiology and Genetics, v.v.i., Czech Academy of Sciences, Veveri 97, 602 00, Brno, Czech Republic
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, Czech Republic
- Simona Moravcová Balková
  - Laboratory of Molecular Morphogenesis, Institute of Animal Physiology and Genetics, v.v.i., Czech Academy of Sciences, Veveri 97, 602 00, Brno, Czech Republic
- Kristýna Peléšková
  - Laboratory of Molecular Morphogenesis, Institute of Animal Physiology and Genetics, v.v.i., Czech Academy of Sciences, Veveri 97, 602 00, Brno, Czech Republic
- Daniela Kristeková
  - Laboratory of Molecular Morphogenesis, Institute of Animal Physiology and Genetics, v.v.i., Czech Academy of Sciences, Veveri 97, 602 00, Brno, Czech Republic
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, Czech Republic
- Veronika Jakešová
  - Laboratory of Molecular Morphogenesis, Institute of Animal Physiology and Genetics, v.v.i., Czech Academy of Sciences, Veveri 97, 602 00, Brno, Czech Republic
- Tomáš Zikmund
  - Laboratory of Computed Tomography, CEITEC BUT, Brno, Czech Republic
- Jozef Kaiser
  - Laboratory of Computed Tomography, CEITEC BUT, Brno, Czech Republic
- Lydie Izakovičová Holá
  - Department of Stomatology, Faculty of Medicine, Masaryk University, Brno, Czech Republic
  - Department of Stomatology, St. Anne's University Hospital, Brno, Czech Republic
- Michal Kolář
  - Laboratory of Genomics and Bioinformatics, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czech Republic
- Marcela Buchtová
  - Laboratory of Molecular Morphogenesis, Institute of Animal Physiology and Genetics, v.v.i., Czech Academy of Sciences, Veveri 97, 602 00, Brno, Czech Republic
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, Czech Republic
7
Sun J, Zhang Z, Cai J, Li X, Xu X. Identification of Hub Genes in Liver Hepatocellular Carcinoma Based on Weighted Gene Co-expression Network Analysis. Biochem Genet 2024:10.1007/s10528-024-10803-8. [PMID: 38683466; DOI: 10.1007/s10528-024-10803-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Accepted: 04/05/2024] [Indexed: 05/01/2024]
Abstract
Liver hepatocellular carcinoma (LIHC) is a malignant cancer with high incidence and poor prognosis. This study aimed to investigate the correlation between hub genes and the progression of LIHC and to provide potential prognostic markers and therapy targets for LIHC. Our study mainly used The Cancer Genome Atlas (TCGA) LIHC database and the gene expression profiles of GSE54236 from the Gene Expression Omnibus (GEO) to explore the differential co-expression genes between LIHC and normal tissues. The differential co-expression genes were extracted by Weighted Gene Co-expression Network Analysis (WGCNA) and differential gene expression analysis methods. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were carried out to annotate the function of the differential genes. The hub genes were then validated using a protein-protein interaction (PPI) network, and expression-level and prognostic analyses were performed. The probable associations between the expression of hub genes and both tumor purity and infiltration of immune cells were explored with TIMER. A total of 68 differential co-expression genes were extracted. These genes were mainly enriched in complement activation (biological process), collagen trimer (cellular component), carbohydrate binding and receptor ligand activity (molecular function), and cytokine-cytokine receptor interaction. We then demonstrated that the 10 hub genes (CFP, CLEC1B, CLEC4G, CLEC4M, FCN2, FCN3, PAMR1 and TIMD4) were weakly expressed in LIHC tissues; the qRT-PCR results of clinical samples showed that six genes were significantly downregulated in LIHC patients compared with adjacent tissues. Worse overall survival (OS) and disease-free survival (DFS) in LIHC patients were associated with lower expression of CFP, CLEC1B, FCN3 and TIMD4. The ten hub genes were positively associated with tumor purity. CFP, CLEC1B, FCN3 and TIMD4 could serve as novel potential molecular targets for prognosis prediction in LIHC.
Affiliation(s)
- Jiawei Sun
  - Shulan International Medical College, Zhejiang Shuren University, Hangzhou, 31005, China
- Zizhen Zhang
  - Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Department of Gastrointestinal Oncology, Peking University Cancer Hospital & Institute, Beijing, 100142, China
- Jiaru Cai
  - Shulan International Medical College, Zhejiang Shuren University, Hangzhou, 31005, China
- Xiaoping Li
  - Shulan International Medical College, Zhejiang Shuren University, Hangzhou, 31005, China
- Xiaoling Xu
  - Shulan International Medical College, Zhejiang Shuren University, Hangzhou, 31005, China
8
Rezaei Z, Tahmasebi A, Pourabbas B. Using meta-analysis and machine learning to investigate the transcriptional response of immune cells to Leishmania infection. PLoS Negl Trop Dis 2024; 18:e0011892. [PMID: 38190401; PMCID: PMC10798641; DOI: 10.1371/journal.pntd.0011892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Revised: 01/19/2024] [Accepted: 12/29/2023] [Indexed: 01/10/2024] [Open access]
Abstract
BACKGROUND: Leishmaniasis is a parasitic disease caused by the Leishmania protozoan, affecting millions of people worldwide, especially in tropical and subtropical regions. The immune response involves the activation of various cells to eliminate the infection. Understanding the complex interplay between Leishmania and the host immune system is crucial for developing effective treatments against this disease.
METHODS: This study collected extensive transcriptomic data from macrophages, dendritic cells, and NK cells exposed to Leishmania spp. Our objective was to determine the Leishmania-responsive genes in immune system cells by applying meta-analysis and feature selection algorithms, followed by co-expression analysis.
RESULTS: The meta-analysis identified 703 differentially expressed genes (DEGs), primarily associated with the immune system and cellular metabolic processes. In addition, we substantiated the significance of transcription factor families, such as bZIP and C2H2 ZF, in the response to Leishmania infection. Furthermore, the feature selection techniques revealed the potential of two genes, G0S2 and CXCL8, as biomarkers and therapeutic targets for Leishmania infection. Lastly, our co-expression analysis unveiled seven hub genes (PFKFB3, DIAPH1, BSG, BIRC3, GOT2, EIF3H, and ATF3), chiefly related to signaling pathways.
CONCLUSIONS: These findings provide valuable insights into the molecular mechanisms underlying the response of immune system cells to Leishmania infection and offer novel potential therapeutic targets.
Affiliation(s)
- Zahra Rezaei
  - Professor Alborzi Clinical Microbiology Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Ahmad Tahmasebi
  - Professor Alborzi Clinical Microbiology Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
  - Shiraz Institute for Cancer Research, School of Medicine, Shiraz University of Medical Sciences, Shiraz, Iran
- Bahman Pourabbas
  - Professor Alborzi Clinical Microbiology Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
9
Kim K, Kim S, Ahn T, Kim H, Shin SJ, Choi CH, Park S, Kim YB, No JH, Suh DH. A differential diagnosis between uterine leiomyoma and leiomyosarcoma using transcriptome analysis. BMC Cancer 2023; 23:1215. [PMID: 38066476; PMCID: PMC10709939; DOI: 10.1186/s12885-023-11394-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Accepted: 09/11/2023] [Indexed: 12/18/2023] [Open access]
Abstract
BACKGROUND: The objective of this study was to estimate the accuracy of a transcriptome-based classifier in the differential diagnosis of uterine leiomyoma and leiomyosarcoma. We manually selected 114 normal uterine tissue and 31 leiomyosarcoma samples from publicly available transcriptome data in UCSC Xena as training/validation sets. We developed a pre-processing procedure and a gene selection method to sensitively find genes with larger variance in leiomyosarcoma than in normal uterine tissue. Through our method, 17 genes were selected to build the transcriptome-based classifier. The prediction accuracies of deep feedforward neural network (DNN), support vector machine (SVM), random forest (RF), and gradient boosting (GB) models were examined. We interpreted the biological functionality of the selected genes via network-based analysis using GeneMANIA. To validate the performance of the trained model, we additionally collected 35 clinical samples of leiomyosarcoma and leiomyoma as a test set (18 + 17 as 1st and 2nd test sets).
RESULTS: We discovered genes whose expression is highly variable in leiomyosarcoma but conserved in normal uterine samples. These genes were mainly associated with DNA replication. Because gene selection and model training were performed on leiomyosarcoma and normal uterine tissue, it was necessary to demonstrate that the model can discriminate between leiomyosarcoma and leiomyoma. Thus, the trained model was further validated on newly collected clinical samples of leiomyosarcoma and leiomyoma. The DNN classifier achieved sensitivities of 0.88 and 0.77 (8/9, 7/9) and a specificity of 1.0 (8/8, 8/8) on the two test sets, supporting that the selected genes, in conjunction with the DNN classifier, discriminate well between leiomyosarcoma and leiomyoma in clinical samples.
CONCLUSION: The transcriptome-based classifier accurately distinguished uterine leiomyosarcoma from leiomyoma. Our method can be helpful in clinical practice when applied to a biopsy sample before surgery. Identifying leiomyosarcoma allows the doctor to avoid laparoscopic surgery, thus minimizing unwanted tumor spread.
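The sensitivity and specificity figures in this abstract follow directly from the reported fractions, as the worked Python example below shows. It assumes that leiomyosarcoma is the positive class and that the fractions 8/9, 7/9, and 8/8 are correct calls over class totals. The dictionary layout and function names are illustrative only.

```python
def sensitivity(tp, fn):
    """Share of true positives among all actual positives."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of true negatives among all actual negatives."""
    return tn / (tn + fp)


# Counts implied by the fractions in the abstract
# (positive class assumed to be leiomyosarcoma).
test_sets = {
    "test set 1": {"tp": 8, "fn": 1, "tn": 8, "fp": 0},
    "test set 2": {"tp": 7, "fn": 2, "tn": 8, "fp": 0},
}

for name, c in test_sets.items():
    sens = sensitivity(c["tp"], c["fn"])
    spec = specificity(c["tn"], c["fp"])
    print(f"{name}: sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Since 8/9 ≈ 0.889 and 7/9 ≈ 0.778, the reported values of 0.88 and 0.77 appear to be truncated rather than rounded.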
Collapse
Affiliation(s)
- Kidong Kim
- Department of Obstetrics and Gynecology, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
| | - Sarah Kim
- Department of Life Science, Handong Global University, Pohang, Republic of Korea
| | - TaeJin Ahn
- Department of Life Science, Handong Global University, Pohang, Republic of Korea.
| | - Hyojin Kim
- Department of Pathology, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
| | - So-Jin Shin
- Department of Gynecology and Obstetrics, School of Medicine, Keimyung University, Daegu, Republic of Korea
| | - Chel Hun Choi
- Department of Obstetrics and Gynecology, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea
| | - Sungmin Park
- Department of Life Science, Handong Global University, Pohang, Republic of Korea
| | - Yong-Beom Kim
- Department of Obstetrics and Gynecology, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
| | - Jae Hong No
- Department of Obstetrics and Gynecology, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
| | - Dong Hoon Suh
- Department of Obstetrics and Gynecology, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
| |
|
10
|
Biswas A, Kumari A, Gaikwad DS, Pandey DK. Revolutionizing Biological Science: The Synergy of Genomics in Health, Bioinformatics, Agriculture, and Artificial Intelligence. OMICS : A JOURNAL OF INTEGRATIVE BIOLOGY 2023; 27:550-569. [PMID: 38100404 DOI: 10.1089/omi.2023.0197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/17/2023]
Abstract
With the climate emergency, COVID-19, and the rise of planetary health scholarship, the binary of human and ecosystem health has been deeply challenged. The interdependence of human and nonhuman animal health is increasingly acknowledged and is paving the way for new frontiers in integrative biology. The convergence of genomics in health, bioinformatics, agriculture, and artificial intelligence (AI) has ushered in a new era of possibilities and applications. However, the sheer volume of genomic/multiomics big data generated also presents formidable sociotechnical challenges in extracting meaningful biological, planetary health and ecological insights. Over the past few years, AI-guided bioinformatics has emerged as a powerful tool for managing, analyzing, and interpreting complex biological datasets. The advances in AI, particularly in machine learning and deep learning, have been transforming the fields of genomics, planetary health, and agriculture. This article aims to unpack and explore the formidable range of possibilities and challenges that result from such transdisciplinary integration, and emphasizes its radically transformative potential for human and ecosystem health. The integration of these disciplines is also driving significant advancements in precision medicine and personalized health care. This presents an unprecedented opportunity to deepen our understanding of complex biological systems and advance the well-being of all life in planetary ecosystems. Notwithstanding its sociotechnical, ethical, and critical policy challenges, the integration of genomics, multiomics, planetary health, and agriculture with AI-guided bioinformatics opens up vast opportunities for transnational collaborative efforts, data sharing, analysis, valorization, and interdisciplinary innovations in life sciences and integrative biology.
Affiliation(s)
- Aakanksha Biswas
- Amity Institute of Biotechnology, Amity University Jharkhand, Ranchi, India
| | - Aditi Kumari
- Amity Institute of Biotechnology, Amity University Jharkhand, Ranchi, India
| | - D S Gaikwad
- Amity Institute of Organic Agriculture, Amity University, Noida, India
| | - Dhananjay K Pandey
- Amity Institute of Biotechnology, Amity University Jharkhand, Ranchi, India
| |
|
11
|
Cava C, D'Antona S, Maselli F, Castiglioni I, Porro D. From genetic correlations of Alzheimer's disease to classification with artificial neural network models. Funct Integr Genomics 2023; 23:293. [PMID: 37682415 PMCID: PMC10491691 DOI: 10.1007/s10142-023-01228-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/28/2023] [Revised: 08/30/2023] [Accepted: 09/03/2023] [Indexed: 09/09/2023]
Abstract
Sporadic Alzheimer's disease (AD) is a complex neurological disorder characterized by many risk loci with potential associations with different traits and diseases. AD, characterized by a progressive loss of neuronal functions, manifests with different symptoms such as decline in memory, movement, coordination, and speech. The mechanisms underlying the onset of AD are not always fully understood, but involve a multiplicity of factors. Early diagnosis of AD plays a central role as it can offer the possibility of early treatment, which can slow disease progression. Currently, the methods of diagnosis are cognitive testing, neuroimaging, or cerebrospinal fluid analysis, which can be time-consuming, expensive, invasive, and not always accurate. In the present study, we performed a genetic correlation analysis using genome-wide association statistics from a large study of AD and UK Biobank, to examine the association of AD with other human traits and disorders. In addition, since the hippocampus, a part of the cerebral cortex, could play a central role in several traits that are associated with AD, we analyzed the gene expression profiles of the hippocampus of AD patients by applying 4 different artificial neural network models. We found 65 traits correlated with AD grouped into 9 clusters: medical conditions, fluid intelligence, education, anthropometric measures, employment status, activity, diet, lifestyle, and sexuality. The comparison of 4 different neural network models along with feature selection methods on 5 Alzheimer's gene expression datasets showed that the simple basic neural network model performs better (66% accuracy) than other more complex methods with dropout and weight regularization of the network.
Affiliation(s)
- Claudia Cava
- Institute of Molecular Bioimaging and Physiology, National Research Council (IBFM-CNR), Via F. Cervi 93, Segrate-Milan, 20090, Milan, Italy.
- Department of Science, Technology and Society, University School for Advanced Studies IUSS Pavia, Palazzo del Broletto, Piazza Della Vittoria 15, 27100, Pavia, Italy.
| | - Salvatore D'Antona
- Institute of Molecular Bioimaging and Physiology, National Research Council (IBFM-CNR), Via F. Cervi 93, Segrate-Milan, 20090, Milan, Italy
| | - Francesca Maselli
- Institute of Molecular Bioimaging and Physiology, National Research Council (IBFM-CNR), Via F. Cervi 93, Segrate-Milan, 20090, Milan, Italy
| | - Isabella Castiglioni
- Department of Physics "Giuseppe Occhialini", University of Milan-Bicocca Piazza Dell'Ateneo Nuovo, 20126, Milan, Italy
| | - Danilo Porro
- Institute of Molecular Bioimaging and Physiology, National Research Council (IBFM-CNR), Via F. Cervi 93, Segrate-Milan, 20090, Milan, Italy
- NBFC, National Biodiversity Future Center, 90133, Palermo, Italy
| |
|
12
|
Mirza Z, Ansari MS, Iqbal MS, Ahmad N, Alganmi N, Banjar H, Al-Qahtani MH, Karim S. Identification of Novel Diagnostic and Prognostic Gene Signature Biomarkers for Breast Cancer Using Artificial Intelligence and Machine Learning Assisted Transcriptomics Analysis. Cancers (Basel) 2023; 15:3237. [PMID: 37370847 DOI: 10.3390/cancers15123237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2023] [Revised: 06/10/2023] [Accepted: 06/13/2023] [Indexed: 06/29/2023] Open
Abstract
BACKGROUND Breast cancer (BC) is one of the most common female cancers. Clinical and histopathological information is collectively used for diagnosis, but is often not precise. We applied machine learning (ML) methods to identify the valuable gene signature model based on differentially expressed genes (DEGs) for BC diagnosis and prognosis. METHODS A cohort of 701 samples from 11 GEO BC microarray datasets was used for the identification of significant DEGs. Seven ML methods, including RFECV-LR, RFECV-SVM, LR-L1, SVC-L1, RF, and Extra-Trees were applied for gene reduction and the construction of a diagnostic model for cancer classification. Kaplan-Meier survival analysis was performed for prognostic signature construction. The potential biomarkers were confirmed via qRT-PCR and validated by another set of ML methods including GBDT, XGBoost, AdaBoost, KNN, and MLP. RESULTS We identified 355 DEGs and predicted BC-associated pathways, including kinetochore metaphase signaling, PTEN, senescence, and phagosome-formation pathways. A hub of 28 DEGs and a novel diagnostic nine-gene signature (COL10A, S100P, ADAMTS5, WISP1, COMP, CXCL10, LYVE1, COL11A1, and INHBA) were identified using stringent filter conditions. Similarly, a novel prognostic model consisting of eight-gene signatures (CCNE2, NUSAP1, TPX2, S100P, ITM2A, LIFR, TNXA, and ZBTB16) was also identified using disease-free survival and overall survival analysis. Gene signatures were validated by another set of ML methods. Finally, qRT-PCR results confirmed the expression of the identified gene signatures in BC. CONCLUSION The ML approach helped construct novel diagnostic and prognostic models based on the expression profiling of BC. The identified nine-gene signature and eight-gene signatures showed excellent potential in BC diagnosis and prognosis, respectively.
Affiliation(s)
- Zeenat Mirza
- King Fahd Medical Research Center, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Department of Medical Laboratory Science, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah 21589, Saudi Arabia
| | - Md Shahid Ansari
- Department of Clinical Data Analytics, Max Super Speciality Hospital, Saket, New Delhi 110017, India
| | - Md Shahid Iqbal
- Department of Statistics and Computer Applications, Tilka Manjhi Bhagalpur University, Bhagalpur 812007, India
| | - Nesar Ahmad
- Department of Statistics and Computer Applications, Tilka Manjhi Bhagalpur University, Bhagalpur 812007, India
| | - Nofe Alganmi
- Computer Science Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Center of Excellence in Genomic Medicine Research, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Centre of Artificial Intelligence in Precision Medicines, King Abdulaziz University, Jeddah 21589, Saudi Arabia
| | - Haneen Banjar
- Computer Science Department, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Center of Excellence in Genomic Medicine Research, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Centre of Artificial Intelligence in Precision Medicines, King Abdulaziz University, Jeddah 21589, Saudi Arabia
| | - Mohammed H Al-Qahtani
- Department of Medical Laboratory Science, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Center of Excellence in Genomic Medicine Research, King Abdulaziz University, Jeddah 21589, Saudi Arabia
| | - Sajjad Karim
- Department of Medical Laboratory Science, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- Center of Excellence in Genomic Medicine Research, King Abdulaziz University, Jeddah 21589, Saudi Arabia
| |
|
13
|
Zhu J, Luo J, Ma Y. Screening of serum exosome markers for colorectal cancer based on Boruta and multi-cluster feature selection algorithms. Mol Cell Toxicol 2023. [DOI: 10.1007/s13273-023-00348-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/28/2023]
|
14
|
Deshpande D, Chhugani K, Chang Y, Karlsberg A, Loeffler C, Zhang J, Muszyńska A, Munteanu V, Yang H, Rotman J, Tao L, Balliu B, Tseng E, Eskin E, Zhao F, Mohammadi P, Łabaj PP, Mangul S. RNA-seq data science: From raw data to effective interpretation. Front Genet 2023; 14:997383. [PMID: 36999049 PMCID: PMC10043755 DOI: 10.3389/fgene.2023.997383] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2022] [Accepted: 02/24/2023] [Indexed: 03/14/2023] Open
Abstract
RNA sequencing (RNA-seq) has become an exemplary technology in modern biology and clinical science. Its immense popularity is due in large part to the continuous efforts of the bioinformatics community to develop accurate and scalable computational tools to analyze the enormous amounts of transcriptomic data that it produces. RNA-seq analysis enables genes and their corresponding transcripts to be probed for a variety of purposes, such as detecting novel exons or whole transcripts, assessing expression of genes and alternative transcripts, and studying alternative splicing structure. It can be a challenge, however, to obtain meaningful biological signals from raw RNA-seq data because of the enormous scale of the data as well as the inherent limitations of different sequencing technologies, such as amplification bias or biases of library preparation. The need to overcome these technical challenges has pushed the rapid development of novel computational tools, which have evolved and diversified in accordance with technological advancements, leading to the current myriad of RNA-seq tools. These tools, combined with the diverse computational skill sets of biomedical researchers, help to unlock the full potential of RNA-seq. The purpose of this review is to explain basic concepts in the computational analysis of RNA-seq data and define discipline-specific jargon.
Affiliation(s)
- Dhrithi Deshpande
- Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Karishma Chhugani
- Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Yutong Chang
- Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Aaron Karlsberg
- Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Caitlin Loeffler
- Department of Computer Science, University of California, Los Angeles, CA, United States
| | - Jinyang Zhang
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
| | - Agata Muszyńska
- Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
- Institute of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, Poland
| | - Viorel Munteanu
- Department of Computers, Informatics and Microelectronics, Technical University of Moldova, Chisinau, Moldova
| | - Harry Yang
- Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, CA, United States
| | - Jeremy Rotman
- Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
| | - Laura Tao
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
| | - Brunilda Balliu
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
| | | | - Eleazar Eskin
- Department of Computer Science, University of California, Los Angeles, CA, United States
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, United States
| | - Fangqing Zhao
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
- Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China
| | - Pejman Mohammadi
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, United States
| | - Paweł P. Łabaj
- Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
- Department of Biotechnology, Boku University Vienna, Vienna, Austria
| | - Serghei Mangul
- Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Department of Quantitative and Computational Biology, USC Dornsife College of Letters, Arts and Sciences, Los Angeles, CA, United States
- *Correspondence: Serghei Mangul
| |
|
15
|
Bajo-Morales J, Castillo-Secilla D, Herrera LJ, Caba O, Prados JC, Rojas I. Predicting COVID-19 Severity Integrating RNA-Seq Data Using Machine Learning Techniques. Curr Bioinform 2023; 18:221-231. [DOI: 10.2174/1574893617666220718110053] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2022] [Revised: 05/21/2022] [Accepted: 05/31/2022] [Indexed: 11/22/2022]
Abstract
A fundamental challenge in the fight against COVID-19 is the development of reliable and accurate tools to predict disease progression in a patient. This information can be extremely useful in distinguishing hospitalized patients at higher risk of needing the ICU from patients with low severity. How SARS-CoV-2 infection will evolve is still unclear.
Methods:
A novel pipeline was developed that can integrate RNA-Seq data from different databases to obtain a genetic biomarker COVID-19 severity index using an artificial intelligence algorithm. Our pipeline ensures robustness through multiple cross-validation processes in different steps.
Results:
CD93, RPS24, PSCA, and CD300E were identified as a COVID-19 severity gene signature. Furthermore, using the obtained gene signature, an effective multi-class classifier capable of discriminating between control, outpatient, inpatient, and ICU COVID-19 patients was optimized, achieving an accuracy of 97.5%.
Conclusion:
In summary, during this research, a new intelligent pipeline was implemented with the goal of developing a specific gene signature that can detect the severity of patients suffering from COVID-19. Our approach to clinical decision support systems achieved excellent results, even when processing unseen samples. Our system can be of great clinical utility for planning, organizing, and managing human and material resources, as well as for automatically classifying the severity of patients affected by COVID-19.
Affiliation(s)
- Javier Bajo-Morales
- Department of Computer Architecture and Technology, University of Granada, C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
- Deuser Tech Group, Calle Islandia, 182-NAV 24A, Córdoba, 14014, Córdoba, Spain
| | - Daniel Castillo-Secilla
- Department of Computer Architecture and Technology, University of Granada, C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
- Fujitsu Technology Solutions S.A, CoE Data Intelligence, Camino del Cerro de los Gamos, 1, Pozuelo de Alarcón, 28224, Madrid, Spain
| | - Luis Javier Herrera
- Department of Computer Architecture and Technology, University of Granada, C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
| | - Octavio Caba
- Nuclear Medicine Department, IMIBIC, University Hospital Reina Sofia, Menéndez Pidal Avenue, 14004, Córdoba, Spain
| | - Jose Carlos Prados
- Nuclear Medicine Department, IMIBIC, University Hospital Reina Sofia, Menéndez Pidal Avenue, 14004, Córdoba, Spain
| | - Ignacio Rojas
- Department of Computer Architecture and Technology, University of Granada, C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
| |
|
16
|
Ashraf MT, Hamid I, Nawaz Q, Ali H. Hybrid Approach using Extreme Gradient Boosting (XGBoost) and Evolutionary Algorithm for Cancer Classification. 2023 INTERNATIONAL MULTI-DISCIPLINARY CONFERENCE IN EMERGING RESEARCH TRENDS (IMCERT) 2023. [DOI: 10.1109/imcert57083.2023.10075236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/01/2023]
Affiliation(s)
| | - Isma Hamid
- National Textile University, Department of Computer Science, Faisalabad, Pakistan
| | - Qamar Nawaz
- University of Agriculture, Department of Computer Science, Faisalabad, Pakistan
| | - Hamid Ali
- National Textile University, Department of Computer Science, Faisalabad, Pakistan
| |
|
17
|
Sussman L, Garcia-Robledo JE, Ordóñez-Reyes C, Forero Y, Mosquera AF, Ruíz-Patiño A, Chamorro DF, Cardona AF. Integration of artificial intelligence and precision oncology in Latin America. FRONTIERS IN MEDICAL TECHNOLOGY 2022; 4:1007822. [PMID: 36311461 PMCID: PMC9608820 DOI: 10.3389/fmedt.2022.1007822] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2022] [Accepted: 09/21/2022] [Indexed: 11/07/2022] Open
Abstract
Next-generation medicine encompasses different concepts related to healthcare models and technological developments. In Latin America and the Caribbean, healthcare systems differ considerably between countries, and cancer control is known to be insufficient and inefficient considering socioeconomic discrepancies. Despite advancements in knowledge about the biology of different oncological diseases, the disease remains a challenge in terms of diagnosis, treatment, and prognosis for clinicians and researchers. With the development of molecular biology, better diagnostic methods, and therapeutic tools in recent years, artificial intelligence (AI) has become important, because it could improve different clinical scenarios: predicting clinically relevant parameters, cancer diagnosis, cancer research, and accelerating the growth of personalized medicine. The incorporation of AI represents an important challenge in terms of diagnosis, treatment, and prognosis for clinicians and researchers in cancer care. Therefore, some studies about AI in Latin America and the Caribbean are being conducted with the aim of improving the performance of AI in those countries. This review introduces AI in cancer care in Latin America and the Caribbean, and the advantages and promising results that it has shown in this socio-demographic context.
Affiliation(s)
- Liliana Sussman
- Department of Neurology, Fundación Universitaria de Ciencias de la Salud, Bogotá, Colombia; Foundation for Clinical and Applied Cancer Research – FICMAC, Bogotá, Colombia
| | - Juan Esteban Garcia-Robledo
- Foundation for Clinical and Applied Cancer Research – FICMAC, Bogotá, Colombia; Division of Hematology/Oncology, Mayo Clinic, Scottsdale, AZ, United States
| | - Camila Ordóñez-Reyes
- Foundation for Clinical and Applied Cancer Research – FICMAC, Bogotá, Colombia; Molecular Oncology and Biology Systems Research Group (Fox-G), Universidad el Bosque, Bogotá, Colombia
| | - Yency Forero
- Foundation for Clinical and Applied Cancer Research – FICMAC, Bogotá, Colombia; Molecular Oncology and Biology Systems Research Group (Fox-G), Universidad el Bosque, Bogotá, Colombia
| | - Andrés F. Mosquera
- Foundation for Clinical and Applied Cancer Research – FICMAC, Bogotá, Colombia; Molecular Oncology and Biology Systems Research Group (Fox-G), Universidad el Bosque, Bogotá, Colombia
| | - Alejandro Ruíz-Patiño
- Foundation for Clinical and Applied Cancer Research – FICMAC, Bogotá, Colombia; Molecular Oncology and Biology Systems Research Group (Fox-G), Universidad el Bosque, Bogotá, Colombia
| | - Diego F. Chamorro
- Foundation for Clinical and Applied Cancer Research – FICMAC, Bogotá, Colombia; Molecular Oncology and Biology Systems Research Group (Fox-G), Universidad el Bosque, Bogotá, Colombia
| | - Andrés F. Cardona
- Foundation for Clinical and Applied Cancer Research – FICMAC, Bogotá, Colombia; Molecular Oncology and Biology Systems Research Group (Fox-G), Universidad el Bosque, Bogotá, Colombia; Direction of Research, Science and Education, Luis Carlos Sarmiento Angulo Cancer Treatment and Research Center (CTIC), Bogotá, Colombia; Correspondence: Andrés F. Cardona
| |
|
18
|
Raufaste-Cazavieille V, Santiago R, Droit A. Multi-omics analysis: Paving the path toward achieving precision medicine in cancer treatment and immuno-oncology. Front Mol Biosci 2022; 9:962743. [PMID: 36304921 PMCID: PMC9595279 DOI: 10.3389/fmolb.2022.962743] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2022] [Accepted: 09/21/2022] [Indexed: 11/13/2022] Open
Abstract
The acceleration of large-scale sequencing and the progress in high-throughput computational analyses, defined as omics, was a hallmark for the comprehension of the biological processes in human health and diseases. In cancerology, the omics approach, initiated by genomics and transcriptomics studies, has revealed an incredible complexity, with unsuspected molecular diversity within the same tumor type as well as spatial and temporal heterogeneity of tumors. The integration of multiple biological layers of omics studies brought oncology to a new paradigm, from tumor site classification to pan-cancer molecular classification, offering new therapeutic opportunities for precision medicine. In this review, we will provide a comprehensive overview of the latest innovations for multi-omics integration in oncology and summarize the largest multi-omics datasets available for adult and pediatric cancers. We will present multi-omics techniques for characterizing cancer biology and show how multi-omics data can be combined with clinical data for the identification of prognostic and treatment-specific biomarkers, opening the way to personalized therapy. To conclude, we will detail the newest strategies for dissecting the tumor immune environment and host–tumor interaction. We will explore the advances in immunomics and microbiomics for biomarker identification to guide therapeutic decisions in immuno-oncology.
Affiliation(s)
| | - Raoul Santiago
- CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Division of Pediatric Hematology-Oncology, Centre Hospitalier Universitaire de L’Université Laval, Charles Bruneau Cancer Center, Québec, QC, Canada
- *Correspondence: Raoul Santiago; Arnaud Droit
| | - Arnaud Droit
- CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- *Correspondence: Raoul Santiago; Arnaud Droit
| |
|
19
|
Guha A, Goda JS, Dasgupta A, Mahajan A, Halder S, Gawde J, Talole S. Classifying primary central nervous system lymphoma from glioblastoma using deep learning and radiomics based machine learning approach - a systematic review and meta-analysis. Front Oncol 2022; 12:884173. [PMID: 36263203 PMCID: PMC9574102 DOI: 10.3389/fonc.2022.884173] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/25/2022] [Accepted: 09/07/2022] [Indexed: 01/06/2023] Open
Abstract
BACKGROUND Glioblastoma (GBM) and primary central nervous system lymphoma (PCNSL) are common in the elderly yet difficult to differentiate on MRI. Their management and prognosis are quite different. The recent surge of interest in predictive analytics, using machine learning (ML) from radiomic features and deep learning (DL) for diagnosing, predicting response, and prognosticating disease, has evinced interest among radiologists and clinicians. The objective of this systematic review and meta-analysis was to evaluate DL and ML algorithms in classifying PCNSL from GBM. METHODS The authors performed a systematic review of the literature from MEDLINE, EMBASE, and the Cochrane central trials register, in accordance with PRISMA guidelines, to select and evaluate studies that included themes of ML, DL, AI, GBM, and PCNSL. All studies reporting on ML or DL algorithms for differentiating PCNSL from GBM on MR imaging were included. These studies were further narrowed down to focus on works published between 2018 and 2021. Two researchers independently conducted the literature screening, database extraction, and risk of bias assessment. The extracted data were synthesised and analysed by forest plots. Outcomes assessed were test characteristics such as accuracy, sensitivity, specificity, and balanced accuracy. RESULTS Ten articles meeting the eligibility criteria were identified, addressing the use of ML and DL in training and validating classifiers to distinguish PCNSL from GBM on MR imaging. The total sample size was 1311 in the included studies. An ML approach was used in 6 studies and DL in 4 studies. The lowest reported sensitivity was 80%, while the highest reported sensitivity was 99%, in studies in which ML and DL were directly compared with the gold standard, histopathology. The lowest reported specificity was 87%, while the highest reported specificity was 100%. The highest reported balanced accuracy was 100% and the lowest was 84%. CONCLUSIONS An extensive search of the databases revealed a limited number of studies that have applied ML or DL to differentiate PCNSL from GBM. Among the currently published studies, both DL and ML algorithms have demonstrated encouraging results and have the potential to aid neuro-oncologists in making preoperative decisions in the future, leading not only to a reduction in morbidity but also to cost-effectiveness.
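Balanced accuracy, one of the outcomes assessed in this review, is conventionally the mean of sensitivity and specificity. The sketch below only illustrates that definition: the function name and the input values are hypothetical and are not taken from any of the included studies.

```python
def balanced_accuracy(sensitivity, specificity):
    """Mean of sensitivity and specificity, both given as fractions in [0, 1]."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity <= 1.0):
        raise ValueError("sensitivity and specificity must lie in [0, 1]")
    return (sensitivity + specificity) / 2.0


# Hypothetical classifier: detects 90% of positives, rejects 80% of negatives.
example = balanced_accuracy(0.90, 0.80)
print(f"balanced accuracy = {example:.2f}")
```

Unlike plain accuracy, this measure weights both classes equally, which matters when one tumor type is much rarer than the other in a study cohort.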
Affiliation(s)
- Amrita Guha
- Department of Radio Diagnosis, Tata Memorial Centre, Homi Bhabha National Institute, Mumbai, India
- *Correspondence: Amrita Guha; Jayant S. Goda
| | - Jayant S. Goda
- Department of Radio Diagnosis, Tata Memorial Centre, Homi Bhabha National Institute, Mumbai, India
- *Correspondence: Amrita Guha; Jayant S. Goda
| | - Archya Dasgupta
- Department of Radiation Oncology, Tata Memorial Centre, Homi Bhabha National Institute, Mumbai, India
| | - Abhishek Mahajan
- Department of Radio Diagnosis, Tata Memorial Centre, Homi Bhabha National Institute, Mumbai, India
| | - Soutik Halder
- Department of Biostatistics, Tata Memorial Centre, Homi Bhabha National Institute, Mumbai, India
| | - Jeetendra Gawde
- Department of Biostatistics, Tata Memorial Centre, Homi Bhabha National Institute, Mumbai, India
| | - Sanjay Talole
- Department of Biostatistics, Tata Memorial Centre, Homi Bhabha National Institute, Mumbai, India
| |
|
20
|
Janssen A, Bennis FC, Mathôt RAA. Adoption of Machine Learning in Pharmacometrics: An Overview of Recent Implementations and Their Considerations. Pharmaceutics 2022; 14:1814. [PMID: 36145562 PMCID: PMC9502080 DOI: 10.3390/pharmaceutics14091814] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2022] [Revised: 08/17/2022] [Accepted: 08/22/2022] [Indexed: 11/23/2022] Open
Abstract
Pharmacometrics is a multidisciplinary field utilizing mathematical models of physiology, pharmacology, and disease to describe and quantify the interactions between medication and patient. As these models become more and more advanced, the need for advanced data analysis tools grows. Recently, there has been much interest in the adoption of machine learning (ML) algorithms. These algorithms offer strong function approximation capabilities and might reduce the time spent on model development. However, ML tools are not yet an integral part of the pharmacometrics workflow. The goal of this work is to discuss how ML algorithms have been applied in four stages of the pharmacometrics pipeline: data preparation, hypothesis generation, predictive modelling, and model validation. We will also discuss considerations before the use of ML algorithms with respect to each topic. We conclude by summarizing applications that hold potential for adoption by pharmacometricians.
Affiliation(s)
- Alexander Janssen
- Department of Clinical Pharmacology, Hospital Pharmacy, Amsterdam University Medical Center, 1105 Amsterdam, The Netherlands
- Frank C. Bennis
- Quantitative Data Analytics Group, Department of Computer Science, Vrije Universiteit Amsterdam, 1081 Amsterdam, The Netherlands
- Ron A. A. Mathôt
- Department of Clinical Pharmacology, Hospital Pharmacy, Amsterdam University Medical Center, 1105 Amsterdam, The Netherlands
21
Giannikopoulos P, Parham DM. Pediatric Sarcomas: The Next Generation of Molecular Studies. Cancers (Basel) 2022; 14:2515. [PMID: 35626119 PMCID: PMC9139929 DOI: 10.3390/cancers14102515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2022] [Revised: 05/13/2022] [Accepted: 05/13/2022] [Indexed: 02/04/2023] Open
Abstract
Pediatric sarcomas constitute one of the largest groups of childhood cancers, following hematopoietic, neural, and renal lesions. Partly because of their diversity, they continue to offer challenges in diagnosis and treatment. In spite of the diagnostic, nosologic, and therapeutic gains made with genetic technology, newer means for investigation are needed. This article reviews emerging technology being used to study human neoplasia and how these methods might be applicable to pediatric sarcomas. Methods reviewed include single cell RNA sequencing (scRNAseq), spatial multi-omics, high-throughput functional genomics, and clustered regularly interspaced short palindromic repeats-Cas9 (CRISPR-Cas9) technology. In spite of these advances, the field continues to be challenged by a dearth of properly annotated materials, particularly from recurrences and metastases and pre- and post-treatment samples.
Affiliation(s)
- David M. Parham
- Department of Anatomic Pathology, Children’s Hospital Los Angeles, Los Angeles, CA 90027, USA
- Department of Pathology, University of Southern California Keck School of Medicine, Los Angeles, CA 90033, USA
22
Ma Z, Zhu T, Wang H, Wang B, Fu L, Yu G. Investigation of serum markers of esophageal squamous cell carcinoma based on machine learning methods. J Biochem 2022; 172:29-36. [PMID: 35415740 DOI: 10.1093/jb/mvac030] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/25/2021] [Accepted: 03/19/2022] [Indexed: 11/15/2022] Open
Abstract
Esophageal squamous cell carcinoma (ESCC) is a malignant tumor with high mortality in humans, and effective, convenient methods for early diagnosis are lacking. By analyzing serum miRNA expression data from ESCC tumor samples and normal samples, and applying max-relevance and min-redundancy (mRMR) feature selection followed by incremental feature selection (IFS), we obtained a random forest classifier built on 5 feature miRNAs. The receiver operating characteristic (ROC) curve showed that the model was able to distinguish the samples. Principal component analysis (PCA) and hierarchical clustering of the samples showed that the 5 feature miRNAs could distinguish ESCC patients from healthy individuals well. miR-663a, miR-5100 and miR-221-3p were expressed at higher levels in ESCC patients than in healthy individuals, whereas miR-6763-5p and miR-7111-5p were expressed at lower levels. In addition, collected clinical serum samples were analyzed by qRT-PCR, and the expression trends of the 5 feature miRNAs followed a pattern similar to that in the training set. These findings indicate that the 5 feature miRNAs may be serum tumor markers of ESCC. This study offers new insights for the early diagnosis of ESCC.
Affiliation(s)
- Zhifeng Ma
- Department of Thoracic Surgery, Shaoxing People's Hospital, Shaoxing, Zhejiang Province, China, 312000
- Ting Zhu
- Department of Thoracic Surgery, Shaoxing People's Hospital, Shaoxing, Zhejiang Province, China, 312000
- Haiyong Wang
- Department of Thoracic Surgery, Shaoxing People's Hospital, Shaoxing, Zhejiang Province, China, 312000
- Bin Wang
- Department of Thoracic Surgery, Shaoxing People's Hospital, Shaoxing, Zhejiang Province, China, 312000
- Linhai Fu
- Department of Thoracic Surgery, Shaoxing People's Hospital, Shaoxing, Zhejiang Province, China, 312000
- Guangmao Yu
- Department of Thoracic Surgery, Shaoxing People's Hospital, Shaoxing, Zhejiang Province, China, 312000
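The selection scheme named in the abstract of this entry (mRMR ranking followed by incremental feature selection) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: absolute Pearson correlation stands in for mutual information, a leave-one-out nearest-centroid classifier stands in for the random forest, and the toy columns `a`, `b`, `a_like` and all function names are hypothetical.

```python
from math import sqrt
from statistics import mean

# Toy data (invented for illustration): 8 samples, binary class labels.
LABELS = [0, 0, 0, 0, 1, 1, 1, 1]
COLUMNS = {
    "a":      [0, 0, 0, 1, 1, 1, 1, 1],  # informative
    "a_like": [0, 0, 1, 1, 1, 1, 1, 1],  # informative but redundant with "a"
    "b":      [0, 0, 1, 0, 0, 1, 1, 1],  # weaker, less correlated with "a"
}

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    vx = sum((p - mx) ** 2 for p in x)
    vy = sum((q - my) ** 2 for q in y)
    return cov / sqrt(vx * vy) if vx and vy else 0.0

def mrmr_rank(columns, labels):
    """Greedy mRMR ordering: relevance minus mean redundancy.
    Assumption: |Pearson r| replaces mutual information."""
    relevance = {f: abs(pearson(col, labels)) for f, col in columns.items()}
    ranked, remaining = [], set(columns)
    while remaining:
        def score(f):
            if not ranked:
                return relevance[f]
            redundancy = mean(abs(pearson(columns[f], columns[s])) for s in ranked)
            return relevance[f] - redundancy
        best = max(sorted(remaining), key=score)  # sorted() makes ties deterministic
        ranked.append(best)
        remaining.remove(best)
    return ranked

def loo_accuracy(columns, labels, features):
    """Leave-one-out accuracy of a nearest-centroid classifier
    (a simple stand-in for the random forest in the abstract)."""
    n, correct = len(labels), 0
    for i in range(n):
        centroids = {}
        for cls in set(labels):
            rows = [j for j in range(n) if j != i and labels[j] == cls]
            centroids[cls] = [mean(columns[f][j] for j in rows) for f in features]
        point = [columns[f][i] for f in features]
        pred = min(sorted(centroids),
                   key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])))
        correct += pred == labels[i]
    return correct / n

def incremental_feature_selection(columns, labels):
    """IFS: score every prefix of the mRMR ranking, keep the shortest best one."""
    ranked = mrmr_rank(columns, labels)
    scores = [loo_accuracy(columns, labels, ranked[:k]) for k in range(1, len(ranked) + 1)]
    best_k = scores.index(max(scores)) + 1
    return ranked[:best_k], scores

if __name__ == "__main__":
    print(mrmr_rank(COLUMNS, LABELS))
    print(incremental_feature_selection(COLUMNS, LABELS))
```

On this toy data `a_like` is nearly a copy of `a`, so the redundancy penalty pushes it behind `b` in the ranking; the IFS step then keeps the shortest prefix of the ranking that reaches the best score.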
23
Bajo-Morales J, Prieto-Prieto JC, Herrera LJ, Rojas I, Castillo-Secilla D. COVID-19 Biomarkers Recognition & Classification Using Intelligent Systems. Curr Bioinform 2022. [DOI: 10.2174/1574893617666220328125029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/08/2023]
Abstract
Background:
SARS-CoV-2 has paralyzed mankind due to its high transmissibility and its associated mortality, causing millions of infections and deaths worldwide. The search for gene expression biomarkers from the host transcriptional response to infection may help understand the underlying mechanisms by which the virus causes COVID-19. This research proposes a smart methodology integrating different RNA-Seq datasets from SARS-CoV-2, other respiratory diseases, and healthy patients.
Methods:
The proposed pipeline exploits the functionality of the ‘KnowSeq’ R/Bioc package, integrating different data sources and attaining a significantly larger gene expression dataset, thus endowing the results with higher statistical significance and robustness in comparison with previous studies in the literature. A detailed preprocessing step was carried out to homogenize the samples and build a clinical decision system for SARS-CoV-2. It uses machine learning techniques such as a feature selection algorithm and a supervised classification system. This clinical decision system uses the most differentially expressed genes among different diseases (including SARS-CoV-2) to develop a four-class classifier.
Results:
The multiclass classifier designed can discern SARS-CoV-2 samples, reaching an accuracy equal to 91.5%, a mean F1-Score equal to 88.5%, and a SARS-CoV-2 AUC equal to 94% by using only 15 genes as predictors. A biological interpretation of the gene signature extracted reveals relations with processes involved in viral responses.
Conclusion:
This work proposes a COVID-19 gene signature composed of 15 genes, selected after applying the ‘minimum Redundancy Maximum Relevance’ feature selection algorithm. The integration of several RNA-Seq datasets was a success, allowing for a considerably large number of samples and therefore providing greater statistical significance to the results than previous studies. Biological interpretation of the selected genes was also provided.
Affiliation(s)
- Javier Bajo-Morales
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
- Juan Carlos Prieto-Prieto
- Nuclear Medicine Department, IMIBIC, University Hospital Reina Sofia, Menéndez Pidal Avenue, 14004, Córdoba, Spain
- Luis Javier Herrera
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
- Ignacio Rojas
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
- Daniel Castillo-Secilla
- Department of Computer Architecture and Technology, University of Granada. C.I.T.I.C., Periodista Rafael Gómez Montero, 2, 18014, Granada, Spain
24
Integration of Multimodal Data from Disparate Sources for Identifying Disease Subtypes. Biology 2022; 11:360. [PMID: 35336734 PMCID: PMC8945377 DOI: 10.3390/biology11030360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2022] [Revised: 02/17/2022] [Accepted: 02/23/2022] [Indexed: 11/17/2022]
Abstract
Simple Summary: The diagnostic and treatment strategies of cancer remain generally suboptimal, resulting in over-diagnosis or under-treatment. Though many attempts at optimizing treatment decisions by early prediction of disease progression have been undertaken, these efforts have yielded only modest success so far due to the heterogeneity of cancer with multifactorial etiology. Here, we propose a deep-learning based data integration model capable of predicting disease progression by integrating collective information available through multiple studies with different cohorts and heterogeneous data types. The results have shown that the proposed data integration pipeline is able to identify disease progression with higher accuracy and robustness compared to using a single cohort, by offering a more complete picture of the specific disease on patients with brain, blood, and pancreatic cancers.
Abstract: Studies over the past decade have generated a wealth of molecular data that can be leveraged to better understand cancer risk, progression, and outcomes. However, understanding the progression risk and differentiating long- and short-term survivors cannot be achieved by analyzing data from a single modality due to the heterogeneity of disease. Using a scientifically developed and tested deep-learning approach that leverages aggregate information collected from multiple repositories with multiple modalities (e.g., mRNA, DNA Methylation, miRNA) could lead to a more accurate and robust prediction of disease progression. Here, we propose an autoencoder based multimodal data fusion system, in which a fusion encoder flexibly integrates collective information available through multiple studies with partially coupled data. Our results on a fully controlled simulation-based study have shown that inferring the missing data through the proposed data fusion pipeline allows a predictor that is superior to other baseline predictors with missing modalities.
Results have further shown that short- and long-term survivors of glioblastoma multiforme, acute myeloid leukemia, and pancreatic adenocarcinoma can be successfully differentiated with an AUC of 0.94, 0.75, and 0.96, respectively.
25
Li MD, Ahmed SR, Choy E, Lozano-Calderon SA, Kalpathy-Cramer J, Chang CY. Artificial intelligence applied to musculoskeletal oncology: a systematic review. Skeletal Radiol 2022; 51:245-256. [PMID: 34013447 DOI: 10.1007/s00256-021-03820-w] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/18/2021] [Revised: 05/13/2021] [Accepted: 05/13/2021] [Indexed: 02/02/2023]
Abstract
Developments in artificial intelligence have the potential to improve the care of patients with musculoskeletal tumors. We performed a systematic review of the published scientific literature to identify the current state of the art of artificial intelligence applied to musculoskeletal oncology, including both primary and metastatic tumors, and across the radiology, nuclear medicine, pathology, clinical research, and molecular biology literature. Through this search, we identified 252 primary research articles, of which 58 used deep learning and 194 used other machine learning techniques. Articles involving deep learning have mostly involved bone scintigraphy, histopathology, and radiologic imaging. Articles involving other machine learning techniques have mostly involved transcriptomic analyses, radiomics, and clinical outcome prediction models using medical records. These articles predominantly present proof-of-concept work, other than the automated bone scan index for bone metastasis quantification, which has translated to clinical workflows in some regions. We systematically review and discuss this literature, highlight opportunities for multidisciplinary collaboration, and identify potentially clinically useful topics with a relative paucity of research attention. Musculoskeletal oncology is an inherently multidisciplinary field, and future research will need to integrate and synthesize noisy siloed data from across clinical, imaging, and molecular datasets. Building the data infrastructure for collaboration will help to accelerate progress towards making artificial intelligence truly useful in musculoskeletal oncology.
Affiliation(s)
- Matthew D Li
- Division of Musculoskeletal Imaging and Intervention, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Syed Rakin Ahmed
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Harvard Medical School, Harvard Graduate Program in Biophysics, Harvard University, Cambridge, MA, USA; Geisel School of Medicine at Dartmouth, Dartmouth College, Hanover, NH, USA
- Edwin Choy
- Division of Hematology Oncology, Department of Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Santiago A Lozano-Calderon
- Department of Orthopedic Surgery, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Jayashree Kalpathy-Cramer
- Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Connie Y Chang
- Division of Musculoskeletal Imaging and Intervention, Department of Radiology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
26
El-Nakeep S. Molecular and genetic markers in hepatocellular carcinoma: In silico analysis to clinical validation (current limitations and future promises). World J Gastrointest Pathophysiol 2022; 13:1-14. [PMID: 35116176 PMCID: PMC8788164 DOI: 10.4291/wjgp.v13.i1.1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/21/2021] [Revised: 05/15/2021] [Accepted: 12/22/2021] [Indexed: 02/06/2023] Open
Abstract
Hepatocellular carcinoma (HCC) is the second leading cause of cancer-related mortality. The diagnosis of HCC depends mainly on α-fetoprotein, which is limited in its diagnostic and screening capabilities. There is an urgent need for a biomarker that detects early HCC to give patients a chance for curative treatment. New therapeutic targets could enhance survival and create alternative curative methods in the future. In silico analysis provides both the discovery of biomarkers and an understanding of the molecular pathways, paving the way for treatment development. This review discusses the role of in silico analysis in the discovery of biomarkers and molecular pathways, and the author's contributions to this area of research. It also discusses future aspirations and current limitations. A literature review was conducted using various databases (PubMed, Science Direct, and Wiley Online Library), searching reviews and editorials on the topic, together with an overview of the author's own published and unpublished work. This review discusses the steps of the validation process from in silico analysis to in vivo validation to incorporation into clinical practice guidelines. In addition, it reviews recent lines of bioinformatic research related to HCC. In conclusion, the discovery of genetic, molecular and epigenetic markers is a hot area of HCC research. Bioinformatics will enhance our ability to accomplish this understanding in the near future. We face certain limitations that we need to overcome.
Affiliation(s)
- Sarah El-Nakeep
- Gastroenterology and Hepatology Unit, Department of Internal Medicine, Faculty of Medicine, Ain Shams University, Cairo 11591, Egypt
27
Perrier A, Hainaut P, Guenoun A, Nguyen DP, Lamy PJ, Guerber F, Troalen F, Denis JA, Boissan M. En marche vers une oncologie personnalisée : l’apport des techniques génomiques et de l’intelligence artificielle dans l’usage des biomarqueurs tumoraux circulants [Toward personalized oncology: the contribution of genomic techniques and artificial intelligence to the use of circulating tumor biomarkers]. Bull Cancer 2022; 109:170-184. [DOI: 10.1016/j.bulcan.2021.12.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2021] [Revised: 11/20/2021] [Accepted: 12/17/2021] [Indexed: 11/24/2022]
28
Liu Y, Qu HQ, Chang X, Tian L, Glessner J, Sleiman PAM, Hakonarson H. Expansion of Schizophrenia Gene Network Knowledge Using Machine Learning Selected Signals From Dorsolateral Prefrontal Cortex and Amygdala RNA-seq Data. Front Psychiatry 2022; 13:797329. [PMID: 35386517 PMCID: PMC8978801 DOI: 10.3389/fpsyt.2022.797329] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/20/2021] [Accepted: 02/07/2022] [Indexed: 11/13/2022] Open
Abstract
It is widely accepted, given the complex nature of schizophrenia (SCZ) gene networks, that a few or a small number of genes are unlikely to represent the underlying functional pathways responsible for SCZ pathogenesis. Several studies from large cohorts have been performed to search for key SCZ network genes using different analytical approaches, such as differential expression tests, genome-wide association study (GWAS), copy number variations, and differential methylations, or from the analysis of mutations residing in the coding regions of the genome. However, only a small portion (<10%) of candidate genes identified in these studies were considered SCZ disease-associated genes in SCZ pathways. RNA sequencing (RNA-seq) has been a powerful method to detect functional signals. In this study, we used RNA-seq data from the dorsolateral prefrontal cortex (DLPFC) from 254 individuals and RNA-seq data from the amygdala region from 46 individuals. Analysis was performed using machine learning methods, including random forest and factor analysis, to prioritize the numbers of genes from previous SCZ studies. For genes most differentially expressed between SCZ and healthy controls, 18 were added to known SCZ-associated pathways. These include three genes (GNB2, ITPR1, and PLCB2) for the glutamatergic synapse pathway, six genes (P2RX6, EDNRB, GHR, GRID2, TSPO, and S1PR1) for neuroactive ligand-receptor interaction, eight genes (CAMK2G, MAP2K1, RAF1, PDE3A, RRAS2, VAV1, ATP1B2, and GLI3) for the cAMP signaling pathway, and four genes (GNB2, CAMK2G, ITPR1, and PLCB2) for the dopaminergic synapse pathway. Besides the previously established pathways, 103 additional gene interactions were expanded to SCZ-associated networks, which were shared among both the DLPFC and amygdala regions. The novel knowledge of molecular targets gained from this study brings opportunities for a more complete picture of the SCZ pathogenesis. 
A notable finding is that hub genes in the expanded networks are not necessarily differentially expressed, nor do they necessarily contain hotspots from GWAS, indicating that individual methods, such as differential expression tests, are not enough to identify the underlying SCZ pathways and that more integrative analysis is required to unfold the pathobiology of SCZ.
Affiliation(s)
- Yichuan Liu
- Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, United States
- Hui-Qi Qu
- Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, United States
- Xiao Chang
- Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, United States
- Lifeng Tian
- Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, United States
- Joseph Glessner
- Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, United States
- Patrick A M Sleiman
- Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, United States; Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States; Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, PA, United States
- Hakon Hakonarson
- Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, PA, United States; Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States; Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, PA, United States; Division of Pulmonary Medicine, Children's Hospital of Philadelphia, Philadelphia, PA, United States; Faculty of Medicine, University of Iceland, Reykjavik, Iceland
29
Liu Y, Liu C, Zhang H, Yi X, Yu A. Establishment of A Nomogram for Predicting the Prognosis of Soft Tissue Sarcoma Based on Seven Glycolysis-Related Gene Risk Score. Front Genet 2021; 12:675865. [PMID: 34925434 PMCID: PMC8674658 DOI: 10.3389/fgene.2021.675865] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/04/2021] [Accepted: 11/16/2021] [Indexed: 12/31/2022] Open
Abstract
Background: Soft tissue sarcoma (STS) is a group of tumors with a low incidence and a complex classification. Therefore, accurately diagnosing and treating them is an arduous task. Glycolysis-related genes are closely related to tumor progression and metastasis. Hence, our study is dedicated to developing a risk signature and nomogram based on glycolysis-related genes to assess the survival probability of patients with STS. Methods: The data sets used in our research include gene expression data and clinical characteristics from the Genomic Data Commons Data Portal (National Cancer Institute) Soft Tissue Sarcoma cohort (TCGA SARC) and the GEO database, and gene sequence data of corresponding non-diseased human tissues from the Genotype Tissue Expression (GTEx) project. Next, transcriptome data in TCGA SARC were analyzed as the training set to construct a glycolysis-related gene risk signature and nomogram, which were confirmed in an external test set. Results: We identified and verified a 7-gene glycolysis-related signature that is highly correlated with the overall survival (OS) of STS patients and that performed well as assessed by AUC and calibration curves. In addition, univariate and multivariate Cox regression analyses demonstrated that this 7-gene glycolysis-related signature acts as an independent prognostic predictor for STS patients. Therefore, a prognostic nomogram combining the 7-gene signature with clinical features was constructed to predict OS of patients with STS in the training set, and it demonstrated strong predictive value for survival. Conclusion: These results demonstrate that both the glycolysis-related gene risk signature and the nomogram are efficient prognostic indicators for patients with STS. These findings may contribute to individualized clinical decisions on prognosis and treatment.
Affiliation(s)
- Yuhang Liu
- Department of Trauma and Microsurgery Orthopedics, Zhongnan Hospital of Wuhan University, Wuhan, China
- Changjiang Liu
- Department of Trauma and Microsurgery Orthopedics, Zhongnan Hospital of Wuhan University, Wuhan, China
- Hao Zhang
- Department of Trauma and Microsurgery Orthopedics, Zhongnan Hospital of Wuhan University, Wuhan, China
- Xinzeyu Yi
- Department of Trauma and Microsurgery Orthopedics, Zhongnan Hospital of Wuhan University, Wuhan, China
- Aixi Yu
- Department of Trauma and Microsurgery Orthopedics, Zhongnan Hospital of Wuhan University, Wuhan, China
30
Lanzi C, Favini E, Dal Bo L, Tortoreto M, Arrighetti N, Zaffaroni N, Cassinelli G. Upregulation of ERK-EGR1-heparanase axis by HDAC inhibitors provides targets for rational therapeutic intervention in synovial sarcoma. J Exp Clin Cancer Res 2021; 40:381. [PMID: 34857011 PMCID: PMC8638516 DOI: 10.1186/s13046-021-02150-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/08/2021] [Accepted: 10/21/2021] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Synovial sarcoma (SS) is an aggressive soft tissue tumor with limited therapeutic options in advanced stage. SS18-SSX fusion oncogenes, which are the hallmarks of SS, cause epigenetic rewiring involving histone deacetylases (HDACs). Promising preclinical studies supporting HDAC targeting for SS treatment were not reflected in clinical trials with HDAC inhibitor (HDACi) monotherapies. We investigated pathways implicated in SS cell response to HDACi to identify vulnerabilities exploitable in combination treatments and improve the therapeutic efficacy of HDACi-based regimens. METHODS Antiproliferative and proapoptotic effects of the HDACi SAHA and FK228 were examined in SS cell lines in parallel with biochemical and molecular analyses to bring out cytoprotective pathways. Treatments combining HDACi with drugs targeting HDACi-activated prosurvival pathways were tested in functional assays in vitro and in a SS orthotopic xenograft model. Molecular mechanisms underlying synergisms were investigated in SS cells through pharmacological and gene silencing approaches and validated by qRT-PCR and Western blotting. RESULTS SS cell response to HDACi was consistently characterized by activation of a cytoprotective and auto-sustaining axis involving ERKs, EGR1, and the β-endoglycosidase heparanase, a well recognized pleiotropic player in tumorigenesis and disease progression. HDAC inhibition was shown to upregulate heparanase by inducing expression of the positive regulator EGR1 and by hampering negative regulation by p53 through its acetylation. Interception of HDACi-induced ERK-EGR1-heparanase pathway by cell co-treatment with a MEK inhibitor (trametinib) or a heparanase inhibitor (SST0001/roneparstat) enhanced antiproliferative and pro-apoptotic effects. HDAC and heparanase inhibitors had opposite effects on histone acetylation and nuclear heparanase levels. 
The combination of SAHA with SST0001 prevented the upregulation of ERK-EGR1-heparanase induced by the HDACi and promoted caspase-dependent cell death. In vivo, the combined treatment with SAHA and SST0001 potentiated the antitumor efficacy against the CME-1 orthotopic SS model as compared to single agent administration. CONCLUSIONS The present study provides preclinical rationale and mechanistic insights into drug combinatory strategies based on the use of ERK pathway and heparanase inhibitors to improve the efficacy of HDACi-based antitumor therapies in SS. The involvement of classes of agents already clinically available, or under clinical evaluation, indicates the transferability potential of the proposed approaches.
Affiliation(s)
- Cinzia Lanzi
- Department of Applied Research and Technological Development, Molecular Pharmacology Unit, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Amadeo 42, 20133, Milan, Italy
- Enrica Favini
- Department of Applied Research and Technological Development, Molecular Pharmacology Unit, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Amadeo 42, 20133, Milan, Italy
- Laura Dal Bo
- Department of Applied Research and Technological Development, Molecular Pharmacology Unit, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Amadeo 42, 20133, Milan, Italy
- Monica Tortoreto
- Department of Applied Research and Technological Development, Molecular Pharmacology Unit, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Amadeo 42, 20133, Milan, Italy
- Noemi Arrighetti
- Department of Applied Research and Technological Development, Molecular Pharmacology Unit, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Amadeo 42, 20133, Milan, Italy
- Nadia Zaffaroni
- Department of Applied Research and Technological Development, Molecular Pharmacology Unit, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Amadeo 42, 20133, Milan, Italy
- Giuliana Cassinelli
- Department of Applied Research and Technological Development, Molecular Pharmacology Unit, Fondazione IRCCS Istituto Nazionale dei Tumori, Via Amadeo 42, 20133, Milan, Italy
31
Cheng ASK, Guan Q, Su Y, Zhou P, Zeng Y. Integration of Machine Learning and Blockchain Technology in the Healthcare Field: A Literature Review and Implications for Cancer Care. Asia Pac J Oncol Nurs 2021; 8:720-724. [PMID: 34790856 PMCID: PMC8522602 DOI: 10.4103/apjon.apjon-2140] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/14/2021] [Accepted: 08/04/2021] [Indexed: 11/04/2022] Open
Abstract
This brief report presents a narrative review of the application of machine learning (ML) methods and Blockchain technology (BCT) in the healthcare field, and illustrates the integration of these two technologies in cancer survivorship care. A total of six eligible papers were included in the narrative review. ML and BCT are two data-driven technologies, and there is rapidly growing interest in integrating them for clinical data management and analysis in healthcare. The findings of this report indicate that the two technologies can be integrated feasibly and effectively. In conclusion, this brief report provides state-of-the-art evidence on the integration of ML and BCT, two of the most promising technologies, in the health field, and gives an example of how to apply these disruptive technologies in cancer survivorship care.
Affiliation(s)
- Andy S K Cheng
- Department of Rehabilitation Sciences, The Hong Kong Polytechnic University, Hong Kong, China
- Qiongyao Guan
- Department of Nursing, Yunnan Cancer Hospital, Kunming, China
- Yan Su
- Department of Nursing, Yunnan Cancer Hospital, Kunming, China
- Ping Zhou
- Department of Oncology, Affiliated Hospital of Southwest Medical University, Luzhou, China
- Yingchun Zeng
- Department of Rehabilitation Sciences, The Hong Kong Polytechnic University, Hong Kong, China
32
WNT/β-Catenin Pathway in Soft Tissue Sarcomas: New Therapeutic Opportunities? Cancers (Basel) 2021; 13:5521. [PMID: 34771683 PMCID: PMC8583315 DOI: 10.3390/cancers13215521] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/15/2021] [Revised: 10/27/2021] [Accepted: 10/28/2021] [Indexed: 12/12/2022] Open
Abstract
Simple Summary: The WNT/β-catenin signaling pathway is involved in fundamental processes for the proliferation and differentiation of mesenchymal stem cells. However, little is known about its relevance for mesenchymal neoplasms, such as soft tissue sarcomas (STS). Chemotherapy based on doxorubicin (DXR) still remains the standard first-line treatment for locally advanced unresectable or metastatic STS, although overall survival could not be improved by combination with other chemotherapeutics. In this sense, the development of new therapeutic approaches continues to be an unmet goal. This review covers the most important molecular alterations of the WNT signaling pathway in STS, broadening the current knowledge about STS as well as identifying novel drug targets. Furthermore, the current therapeutic options and drug candidates to modulate WNT signaling, which are usually classified by their interaction site upstream or downstream of β-catenin, and their presumable clinical impact on STS are discussed.
Abstract: Soft tissue sarcomas (STS) are a very heterogeneous group of rare tumors, comprising more than 50 different histological subtypes that originate from mesenchymal tissue. Despite their heterogeneity, chemotherapy based on doxorubicin (DXR) has been in use for forty years now and remains the standard first-line treatment for locally advanced unresectable or metastatic STS, although overall survival could not be improved by combination with other chemotherapeutics. In this sense, the development of new therapeutic approaches continues to be a largely unmet goal. The WNT/β-catenin signaling pathway is involved in various fundamental processes for embryogenic development, including the proliferation and differentiation of mesenchymal stem cells. Although the role of this pathway has been widely researched in neoplasms of epithelial origin, little is known about its relevance for mesenchymal neoplasms.
This review covers the most important molecular alterations of the WNT signaling pathway in STS. The detection of these alterations and the understanding of their functional consequences for those pathways controlling sarcomagenesis development and progression are crucial to broaden the current knowledge about STS as well as to identify novel drug targets. In this regard, the current therapeutic options and drug candidates to modulate WNT signaling, which are usually classified by their interaction site upstream or downstream of β-catenin, and their presumable clinical impact on STS are also discussed.
|
33
|
Caudai C, Galizia A, Geraci F, Le Pera L, Morea V, Salerno E, Via A, Colombo T. AI applications in functional genomics. Comput Struct Biotechnol J 2021; 19:5762-5790. [PMID: 34765093 PMCID: PMC8566780 DOI: 10.1016/j.csbj.2021.10.009] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 04/16/2021] [Revised: 10/05/2021] [Accepted: 10/05/2021] [Indexed: 12/13/2022] Open
Abstract
We review the current applications of artificial intelligence (AI) in functional genomics. The recent explosion of AI follows the remarkable achievements made possible by "deep learning", along with a burst of "big data" that can meet its hunger. Biology is about to overtake astronomy as the paradigmatic producer of big data. This has been made possible by huge advancements in high-throughput technologies, applied to determine how the individual components of a biological system work together to accomplish different processes. The disciplines contributing to this bulk of data are collectively known as functional genomics. They consist of studies of: i) the information contained in the DNA (genomics); ii) the modifications that DNA can reversibly undergo (epigenomics); iii) the RNA transcripts originated by a genome (transcriptomics); iv) the ensemble of chemical modifications decorating different types of RNA transcripts (epitranscriptomics); v) the products of protein-coding transcripts (proteomics); and vi) the small molecules produced from cell metabolism (metabolomics) present in an organism or system at a given time, in physiological or pathological conditions. After reviewing the main applications of AI in functional genomics, we discuss important accompanying issues, including ethical, legal and economic issues and the importance of explainability.
Affiliation(s)
- Claudia Caudai
  - CNR, Institute of Information Science and Technologies “A. Faedo” (ISTI), Pisa, Italy
- Antonella Galizia
  - CNR, Institute of Applied Mathematics and Information Technologies (IMATI), Genoa, Italy
- Filippo Geraci
  - CNR, Institute for Informatics and Telematics (IIT), Pisa, Italy
- Loredana Le Pera
  - CNR, Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies (IBIOM), Bari, Italy
  - CNR, Institute of Molecular Biology and Pathology (IBPM), Rome, Italy
- Veronica Morea
  - CNR, Institute of Molecular Biology and Pathology (IBPM), Rome, Italy
- Emanuele Salerno
  - CNR, Institute of Information Science and Technologies “A. Faedo” (ISTI), Pisa, Italy
- Allegra Via
  - CNR, Institute of Molecular Biology and Pathology (IBPM), Rome, Italy
- Teresa Colombo
  - CNR, Institute of Molecular Biology and Pathology (IBPM), Rome, Italy
|
34
|
Gupta R, Srivastava D, Sahu M, Tiwari S, Ambasta RK, Kumar P. Artificial intelligence to deep learning: machine intelligence approach for drug discovery. Mol Divers 2021; 25:1315-1360. [PMID: 33844136 PMCID: PMC8040371 DOI: 10.1007/s11030-021-10217-3] [Citation(s) in RCA: 407] [Impact Index Per Article: 101.8] [Received: 01/29/2021] [Accepted: 03/22/2021] [Indexed: 02/06/2023]
Abstract
Drug design and development is an important area of research for pharmaceutical companies and chemical scientists. However, low efficacy, off-target delivery, time consumption, and high cost are hurdles that hamper drug design and discovery. Further, complex and big data from genomics, proteomics, microarrays, and clinical trials pose an additional obstacle in the drug discovery pipeline. Artificial intelligence and machine learning technology play a crucial role in drug discovery and development. In particular, artificial neural networks and deep learning algorithms have modernized the area. Machine learning and deep learning algorithms have been implemented in several drug discovery processes such as peptide synthesis, structure-based virtual screening, ligand-based virtual screening, toxicity prediction, drug monitoring and release, pharmacophore modeling, quantitative structure-activity relationship, drug repositioning, polypharmacology, and physicochemical activity. Evidence from the past strengthens the case for implementing artificial intelligence and deep learning in this field. Moreover, novel data mining, curation, and management techniques have provided critical support to recently developed modeling algorithms. In summary, artificial intelligence and deep learning advancements provide an excellent opportunity for rational drug design and discovery, which will eventually impact mankind.
The primary concern associated with drug design and development is time consumption and production cost. Further, inefficiency, inaccurate target delivery, and inappropriate dosage are other hurdles that inhibit the process of drug delivery and development. With advancements in technology, computer-aided drug design integrating artificial intelligence algorithms can eliminate the challenges and hurdles of traditional drug design and development. Artificial intelligence is regarded as a superset comprising machine learning, whereas machine learning comprises supervised learning, unsupervised learning, and reinforcement learning. Further, deep learning, a subset of machine learning, has been extensively implemented in drug design and development. The artificial neural network, deep neural network, support vector machines, classification and regression, generative adversarial networks, symbolic learning, and meta-learning are examples of the algorithms applied to the drug design and discovery process. Artificial intelligence has been applied to different areas of the drug design and development process, from peptide synthesis to molecule design, virtual screening to molecular docking, quantitative structure-activity relationship to drug repositioning, protein misfolding to protein-protein interactions, and molecular pathway identification to polypharmacology. Artificial intelligence principles have been applied to the classification of active and inactive compounds, monitoring drug release, pre-clinical and clinical development, primary and secondary drug screening, biomarker development, pharmaceutical manufacturing, bioactivity identification and physicochemical properties, prediction of toxicity, and identification of mode of action.
Affiliation(s)
- Rohan Gupta
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Devesh Srivastava
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Mehar Sahu
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Swati Tiwari
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Rashmi K Ambasta
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Pravir Kumar
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
|
35
|
Hou Y, Zhang G. Identification of immune-infiltrating cell-related biomarkers in hepatocellular carcinoma based on gene co-expression network analysis. Diagn Pathol 2021; 16:57. [PMID: 34218795 PMCID: PMC8255019 DOI: 10.1186/s13000-021-01118-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/09/2021] [Accepted: 06/14/2021] [Indexed: 12/24/2022] Open
Abstract
Background: Hepatocellular carcinoma (HCC) is often caused by chronic liver infection or inflammation. Searching for potential immunotherapy targets will aid the early diagnosis and treatment of HCC. Methods: Detailed HCC data were downloaded from The Cancer Genome Atlas database. GDCRNATools was used for the comprehensive analysis of RNA sequencing data. Subsequently, the CIBERSORT package was used to estimate infiltration scores of 22 types of immune cells in complex samples. Hub genes were then identified via weighted gene co-expression network analysis (WGCNA) and protein-protein interaction (PPI) network analysis. In addition, multiple databases were used to validate the expression of the hub genes in tumor tissue. Finally, prognostic, diagnostic and immunohistochemical analyses of key hub genes were performed. Results: Nine hub genes were identified using WGCNA and PPI network analysis. The expression levels of these 9 genes were positively correlated with the infiltration levels of CD8-positive T (CD8+ T) cells. In multiple dataset validations, the expression levels of CCL5, CXCR6, CD3E, and LCK were decreased in cancer tissues. In addition, survival analysis revealed that patients with low LCK expression had a poor survival prognosis (P < 0.05). Immunohistochemistry results demonstrated that CCL5, CD3E and LCK were expressed at low levels in HCC tissues. Conclusion: The identification of CCL5, CXCR6, CD3E and LCK may be helpful in the development of early diagnosis and therapy of HCC. LCK may be a potential prognostic biomarker for immunotherapy for HCC. Supplementary Information: The online version contains supplementary material available at 10.1186/s13000-021-01118-y.
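The co-expression step named in this abstract can be illustrated with a minimal sketch. It assumes the common unsigned WGCNA convention in which the adjacency between two genes is the absolute Pearson correlation raised to a soft-threshold power beta; the value beta = 6, the gene labels, and the expression values are hypothetical placeholders, and this is not the authors' pipeline.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** beta for every gene pair."""
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes for h in genes if g != h
    }

def connectivity(adjacency, gene):
    """Sum of adjacencies to all other genes; high values suggest hub candidates."""
    return sum(w for (g, _), w in adjacency.items() if g == gene)

# Hypothetical toy expression matrix (gene -> values across four samples).
expression = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with gene_a
    "gene_c": [4.0, 1.0, 3.0, 2.0],   # weakly correlated with the others
}
adjacency = soft_threshold_adjacency(expression)
hub_ranking = sorted(expression, key=lambda g: connectivity(adjacency, g), reverse=True)
```

Raising the correlation to a power suppresses weak correlations while keeping strong ones, which is why the weakly correlated gene ends up with the lowest connectivity in this toy example.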
Affiliation(s)
- Yinghui Hou
  - Department of Gastroenterology, The Second People's Hospital of Liaocheng City, No.306 Jiankang Street, Linqing City, 252600, Shandong Province, China
- Guizhi Zhang
  - Department of Gastroenterology, The Second People's Hospital of Liaocheng City, No.306 Jiankang Street, Linqing City, 252600, Shandong Province, China
|
36
|
Lin LL, Liu ZZ, Tian JZ, Zhang X, Zhang Y, Yang M, Zhong HC, Fang W, Wei RX, Hu C. Integrated Analysis of Nine Prognostic RNA-Binding Proteins in Soft Tissue Sarcoma. Front Oncol 2021; 11:633024. [PMID: 34026613 PMCID: PMC8138553 DOI: 10.3389/fonc.2021.633024] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/24/2020] [Accepted: 03/10/2021] [Indexed: 12/24/2022] Open
Abstract
RNA-binding proteins (RBPs) have been shown to be dysregulated in cancer transcription and translation, but few studies have investigated their mechanism of action in soft tissue sarcoma (STS). Here, The Cancer Genome Atlas (TCGA) and Genotype-Tissue Expression (GTEx) databases were used to identify differentially expressed RBPs in STS and normal tissues. Through a series of bioinformatics analyses, 329 differentially expressed RBPs were identified. Functional enrichment analysis showed that differentially expressed RBPs were mainly involved in RNA transport, RNA splicing, mRNA surveillance pathways, ribosome biogenesis and translation regulation. Through Cox regression analyses, 9 RBPs (BYSL, IGF2BP3, DNMT3B, TERT, CD3EAP, SRSF12, TLR7, TRIM21 and MEX3A), all up-regulated in STS, were identified as prognosis-related genes, and a prognostic model was established. The model calculated a risk score based on the expression of the 9 hub RBPs. The risk score could be used for risk stratification of patients and had a high prognostic value based on the receiver operating characteristic (ROC) curve. We also established a nomogram containing risk scores and the 9 key RBPs to predict the 1-year, 3-year, and 5-year survival rates of patients with STS. Afterwards, methylation analysis showed significant changes in the methylation degree of BYSL, CD3EAP and MEX3A. Furthermore, the expression of the 9 hub RBPs was closely related to immune infiltration rather than tumor purity. These findings may provide new insights into the pathogenesis of STS and candidate biomarkers for the prognosis of STS.
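A risk score of the kind described here is conventionally the linear predictor of a Cox model: the sum of each gene's expression multiplied by its regression coefficient. The sketch below assumes that form and a median cutoff for stratification, neither of which is confirmed by the abstract; the coefficients, gene labels, and patient values are hypothetical and are not results from the study.

```python
from statistics import median

def risk_score(expression, coefficients):
    """Cox-style linear predictor: sum of coefficient * expression per gene."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)

def stratify(scores):
    """Split patients into high and low risk at the median score (assumed cutoff)."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

# Hypothetical coefficients and expression values, for illustration only.
coefficients = {"rbp_1": 0.5, "rbp_2": -0.25, "rbp_3": 1.0}
patients = {
    "p1": {"rbp_1": 2.0, "rbp_2": 4.0, "rbp_3": 1.0},
    "p2": {"rbp_1": 0.0, "rbp_2": 0.0, "rbp_3": 3.0},
    "p3": {"rbp_1": 4.0, "rbp_2": 8.0, "rbp_3": 0.0},
    "p4": {"rbp_1": 2.0, "rbp_2": 0.0, "rbp_3": 1.0},
}
scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = stratify(scores)
```

In practice the coefficients would come from a fitted Cox regression, and the high and low groups would then be compared with survival curves and time-dependent ROC analysis.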
Affiliation(s)
- Lu-Lu Lin
  - Department of Pathology and Pathophysiology, School of Basic Medicine, Wuhan University, Wuhan, China
- Zi-Zhen Liu
  - The Third Clinical School, Hubei University of Medicine, Shiyan, China
- Jing-Zhuo Tian
  - The Third Clinical School, Hubei University of Medicine, Shiyan, China
- Xiao Zhang
  - Department of Hepatobiliary and Pancreatic Surgery, Zhongnan Hospital of Wuhan University, Wuhan, China
- Yan Zhang
  - The Third Clinical School, Hubei University of Medicine, Shiyan, China
- Min Yang
  - Department of Spine and Orthopedic Oncology, Zhongnan Hospital of Wuhan University, Wuhan, China
- Hou-Cheng Zhong
  - Department of Spine and Orthopedic Oncology, Zhongnan Hospital of Wuhan University, Wuhan, China
- Wei Fang
  - Hubei University of Medicine, Shiyan, China
- Ren-Xiong Wei
  - Department of Spine and Orthopedic Oncology, Zhongnan Hospital of Wuhan University, Wuhan, China
- Chao Hu
  - Department of Spine and Orthopedic Oncology, Zhongnan Hospital of Wuhan University, Wuhan, China
|
37
|
Dey TK, Mandal S, Mukherjee S. Gene expression data classification using topology and machine learning models. BMC Bioinformatics 2021; 22:627. [PMID: 35596135 PMCID: PMC9121583 DOI: 10.1186/s12859-022-04704-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Accepted: 04/28/2022] [Indexed: 12/02/2022] Open
Abstract
Background: Interpretation of high-throughput gene expression data continues to require mathematical tools in data analysis that recognize the shape of the data in high dimensions. Topological data analysis (TDA) has recently been successful in extracting robust features in several applications dealing with high-dimensional constructs. In this work, we utilize some recent developments in TDA to curate gene expression data. Our work differs from its predecessors in two aspects: (1) Traditional TDA pipelines use topological signatures called barcodes to enhance feature vectors which are used for classification. In contrast, this work involves curating relevant features to obtain somewhat better representatives with the help of TDA. These representatives of the entire data facilitate better comprehension of the phenotype labels. (2) Most of the earlier works employ barcodes obtained using topological summaries as fingerprints for the data. Even though they are stable signatures, there exists no direct mapping between the data and said barcodes. Results: The topology-relevant curated data that we obtain provide an improvement in shallow learning as well as deep learning based supervised classifications. We further show that the representative cycles we compute have an unsupervised inclination towards phenotype labels. This work thus shows that topological signatures are able to comprehend gene expression levels and classify cohorts accordingly. Conclusions: In this work, we engender representative persistent cycles to discern the gene expression data. These cycles allow us to directly procure genes entailed in similar processes.
|
38
|
Del Giudice M, Peirone S, Perrone S, Priante F, Varese F, Tirtei E, Fagioli F, Cereda M. Artificial Intelligence in Bulk and Single-Cell RNA-Sequencing Data to Foster Precision Oncology. Int J Mol Sci 2021; 22:4563. [PMID: 33925407 PMCID: PMC8123853 DOI: 10.3390/ijms22094563] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 03/20/2021] [Revised: 04/21/2021] [Accepted: 04/23/2021] [Indexed: 02/01/2023] Open
Abstract
Artificial intelligence, the discipline of developing computational algorithms able to perform tasks that require human intelligence, offers the opportunity to improve how we conceive and deliver precision medicine. Here, we provide an overview of artificial intelligence approaches for the analysis of large-scale RNA-sequencing datasets in cancer. We present the major solutions to disentangle inter- and intra-tumor heterogeneity of transcriptome profiles for an effective improvement of patient management. We outline the contributions of learning algorithms to the needs of cancer genomics, from identifying rare cancer subtypes to personalizing therapeutic treatments.
Affiliation(s)
- Marco Del Giudice
  - Cancer Genomics and Bioinformatics Unit, IIGM - Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
  - Candiolo Cancer Institute, FPO-IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
- Serena Peirone
  - Cancer Genomics and Bioinformatics Unit, IIGM - Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
  - Department of Physics and INFN, Università degli Studi di Torino, via P.Giuria 1, 10125 Turin, Italy
- Sarah Perrone
  - Cancer Genomics and Bioinformatics Unit, IIGM - Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
  - Department of Physics, Università degli Studi di Torino, via P.Giuria 1, 10125 Turin, Italy
- Francesca Priante
  - Cancer Genomics and Bioinformatics Unit, IIGM - Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
  - Department of Physics, Università degli Studi di Torino, via P.Giuria 1, 10125 Turin, Italy
- Fabiola Varese
  - Cancer Genomics and Bioinformatics Unit, IIGM - Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
  - Department of Life Science and System Biology, Università degli Studi di Torino, via Accademia Albertina 13, 10123 Turin, Italy
- Elisa Tirtei
  - Paediatric Onco-Haematology Division, Regina Margherita Children’s Hospital, City of Health and Science of Turin, 10126 Turin, Italy
- Franca Fagioli
  - Paediatric Onco-Haematology Division, Regina Margherita Children’s Hospital, City of Health and Science of Turin, 10126 Turin, Italy
  - Department of Public Health and Paediatric Sciences, University of Torino, 10124 Turin, Italy
- Matteo Cereda
  - Cancer Genomics and Bioinformatics Unit, IIGM - Italian Institute for Genomic Medicine, c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
  - Candiolo Cancer Institute, FPO-IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
  - Correspondence: Tel.: +39-011-993-3969
|
39
|
Lam SW, Kostine M, de Miranda NFCC, Schöffski P, Lee CJ, Morreau H, Bovée JVMG. Mismatch repair deficiency is rare in bone and soft tissue tumors. Histopathology 2021; 79:509-520. [PMID: 33825202 PMCID: PMC8518745 DOI: 10.1111/his.14377] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 10/07/2020] [Revised: 03/25/2021] [Accepted: 03/30/2021] [Indexed: 12/19/2022]
Abstract
Introduction: There has been an increased demand for mismatch repair (MMR) status testing in sarcoma patients after the success of immune checkpoint inhibition (ICI) in MMR deficient tumors. However, data on MMR deficiency in bone and soft tissue tumors are sparse, rendering it unclear if routine screening should be applied. Hence, we aimed to study the frequency of MMR deficiency in bone and soft tissue tumors after we were prompted by two (potential) Lynch syndrome patients developing sarcomas. Methods: Immunohistochemical expression of MLH1, PMS2, MSH2 and MSH6 was assessed on tissue microarrays (TMAs), which included 353 bone and 539 soft tissue tumors. Molecular data were either retrieved from reports or obtained by microsatellite instability (MSI) analysis. In MLH1 negative cases, additional MLH1 promoter hypermethylation analysis followed. Furthermore, a systematic literature review on MMR deficiency in bone and soft tissue tumors was conducted. Results: Eight MMR deficient tumors were identified (1%), which included four leiomyosarcomas, two rhabdomyosarcomas, one malignant peripheral nerve sheath tumor and one radiation‐associated sarcoma. Three patients were suspected of having Lynch syndrome. Literature review revealed 30 MMR deficient sarcomas, of which 33% were undifferentiated/unclassifiable sarcomas; 57% of the patients were genetically predisposed. Conclusion: MMR deficiency is rare in bone and soft tissue tumors. Screening focused on tumors with myogenic differentiation, undifferentiated/unclassifiable sarcomas, and patients with a genetic predisposition or co‐occurrence of other malignancies can be helpful in identifying patients potentially eligible for ICI.
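The reported frequency can be checked with a short calculation. The sketch assumes that the denominator is every tumor on the tissue microarrays (353 bone plus 539 soft tissue), which the abstract does not state explicitly; some cores may not have been evaluable.

```python
# Counts taken from the abstract.
bone_tumors = 353
soft_tissue_tumors = 539
mmr_deficient = 8

# Assumed denominator: all tumors on the microarrays.
total = bone_tumors + soft_tissue_tumors
frequency_percent = 100 * mmr_deficient / total

# The listed subtypes should add up to the eight deficient tumors.
subtype_counts = {
    "leiomyosarcoma": 4,
    "rhabdomyosarcoma": 2,
    "malignant peripheral nerve sheath tumor": 1,
    "radiation-associated sarcoma": 1,
}
assert sum(subtype_counts.values()) == mmr_deficient

print(f"{mmr_deficient}/{total} = {frequency_percent:.1f}%")  # prints "8/892 = 0.9%"
```

Under that assumption the frequency is about 0.9%, consistent with the rounded 1% given in the Results.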
Affiliation(s)
- Suk Wai Lam
  - Department of Pathology, Leiden University Medical Center, Leiden, The Netherlands
- Marie Kostine
  - Department of Rheumatology, Centre Hospitalier Universitaire de Bordeaux Groupe hospitalier Pellegrin, Bordeaux, France
- Patrick Schöffski
  - Department of General Medical Oncology, University Hospitals Leuven, Leuven Cancer Institute, Leuven, Belgium
  - Department of Oncology, KU Leuven, Laboratory of Experimental Oncology, Leuven, Belgium
- Che-Jui Lee
  - Department of General Medical Oncology, University Hospitals Leuven, Leuven Cancer Institute, Leuven, Belgium
  - Department of Oncology, KU Leuven, Laboratory of Experimental Oncology, Leuven, Belgium
- Hans Morreau
  - Department of Pathology, Leiden University Medical Center, Leiden, The Netherlands
- Judith V M G Bovée
  - Department of Pathology, Leiden University Medical Center, Leiden, The Netherlands
|
40
|
Saghaleyni R, Sheikh Muhammad A, Bangalore P, Nielsen J, Robinson JL. Machine learning-based investigation of the cancer protein secretory pathway. PLoS Comput Biol 2021; 17:e1008898. [PMID: 33819271 PMCID: PMC8049480 DOI: 10.1371/journal.pcbi.1008898] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/14/2020] [Revised: 04/15/2021] [Accepted: 03/22/2021] [Indexed: 12/13/2022] Open
Abstract
Deregulation of the protein secretory pathway (PSP) is linked to many hallmarks of cancer, such as promoting tissue invasion and modulating cell-cell signaling. The collection of secreted proteins processed by the PSP, known as the secretome, is often studied due to its potential as a reservoir of tumor biomarkers. However, there has been less focus on the protein components of the secretory machinery itself. We therefore investigated the expression changes in secretory pathway components across many different cancer types. Specifically, we implemented a dual approach involving differential expression analysis and machine learning to identify PSP genes whose expression was associated with key tumor characteristics: mutation of p53, cancer status, and tumor stage. Eight different machine learning algorithms were included in the analysis to enable comparison between methods and to focus on signals that were robust to algorithm type. The machine learning approach was validated by identifying PSP genes known to be regulated by p53, and even outperformed the differential expression analysis approach. Among the different analysis methods and cancer types, the kinesin family members KIF20A and KIF23 were consistently among the top genes associated with malignant transformation or tumor stage. However, unlike most cancer types, which exhibited elevated KIF20A expression that remained relatively constant across tumor stages, renal carcinomas displayed a more gradual increase that continued with increasing disease severity. Collectively, our study demonstrates the complementary nature of a combined differential expression and machine learning approach for analyzing gene expression data, and highlights key PSP components relevant to features of tumor pathophysiology that may constitute potential therapeutic targets.
The secretory pathway is a series of intracellular compartments and enzymes that process and export proteins from the cell to its surrounding environment. Dysfunction of the secretory pathway is associated with many diseases, including cancer, and therefore constitutes a potential target for novel therapeutic strategies. The large number of interacting components that comprise the secretory pathway poses a challenge when attempting to identify where the dysfunction originates or how to restore healthy function. To improve our understanding of how the secretory pathway is changed within tumors, we used gene expression data from normal tissue and tumor samples from thousands of individuals, covering many different types of cancer. The data were analyzed using different machine learning algorithms, which we trained to predict sample characteristics such as disease severity. This training quantified the relative degree to which each gene was associated with the tumor characteristic, allowing us to predict which secretory pathway components were important for processes such as tumor progression, both within specific cancer types and across many different cancer types. The machine learning-based approach demonstrated excellent performance compared to traditional gene expression analysis methods and identified several secretory pathway components with strong evidence of involvement in tumor development.
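One simple way to focus on signals that are robust to algorithm type is to convert each algorithm's gene importance scores to ranks and average the ranks per gene. The sketch below assumes this mean-rank aggregation purely for illustration; the abstract does not say how the authors combined results, and the model names, gene labels, and scores are hypothetical.

```python
def ranks(scores):
    """Map each gene to its rank (1 = most important) under one model."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {gene: position + 1 for position, gene in enumerate(ordered)}

def mean_rank(importances_by_model):
    """Average each gene's rank across all models; lower means more consistent."""
    per_model = [ranks(scores) for scores in importances_by_model.values()]
    genes = list(per_model[0])
    return {g: sum(r[g] for r in per_model) / len(per_model) for g in genes}

# Hypothetical importance scores from three models over the same four genes.
importances = {
    "model_1": {"gene_a": 0.9, "gene_b": 0.5, "gene_c": 0.1, "gene_d": 0.3},
    "model_2": {"gene_a": 0.4, "gene_b": 0.6, "gene_c": 0.2, "gene_d": 0.1},
    "model_3": {"gene_a": 0.8, "gene_b": 0.2, "gene_c": 0.3, "gene_d": 0.1},
}
consensus = mean_rank(importances)
top_genes = sorted(consensus, key=consensus.get)
```

Ranks are used instead of raw scores because importance values from different algorithms are not on a common scale.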
Affiliation(s)
- Rasool Saghaleyni
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Azam Sheikh Muhammad
  - Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Jens Nielsen
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
  - Wallenberg Center for Protein Research, Chalmers University of Technology, Gothenburg, Sweden
  - BioInnovation Institute, Copenhagen, Denmark
- Jonathan L. Robinson
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
  - Wallenberg Center for Protein Research, Chalmers University of Technology, Gothenburg, Sweden
  - Department of Biology and Biological Engineering, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Chalmers University of Technology, Gothenburg, Sweden
|
41
|
Kim Y, Kang JW, Kang J, Kwon EJ, Ha M, Kim YK, Lee H, Rhee JK, Kim YH. Novel deep learning-based survival prediction for oral cancer by analyzing tumor-infiltrating lymphocyte profiles through CIBERSORT. Oncoimmunology 2021; 10:1904573. [PMID: 33854823 PMCID: PMC8018482 DOI: 10.1080/2162402x.2021.1904573] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 11/17/2020] [Revised: 02/22/2021] [Accepted: 03/13/2021] [Indexed: 01/13/2023] Open
Abstract
The tumor microenvironment (TME) within mucosal neoplastic tissue in oral cancer (ORCA) is greatly influenced by tumor-infiltrating lymphocytes (TILs). Here, hierarchical clustering was applied to CIBERSORT profiles of ORCA data filtered from the publicly accessible head and neck cancer data in The Cancer Genome Atlas (TCGA). Patients were regrouped into binary risk groups based on the clustering-measuring scores and the survival patterns associated with individual groups. Based on this analysis, clinically reasonable differences were identified in 16 out of 22 TIL fractions between groups. A deep neural network classifier was trained using the TIL fraction patterns. This internally validated classifier was applied to an independent ORCA dataset from the International Cancer Genome Consortium data portal, and patient survival patterns were accurately predicted. Seven common differentially expressed genes between the two risk groups were obtained. This new approach confirms the importance of TILs in the TME and provides a direction for the use of a novel deep-learning approach for cancer prognosis.
Affiliation(s)
- Yeongjoo Kim
  - Interdisciplinary Program of Genomic Science, Pusan National University, Yangsan, Republic of Korea
  - Department of Biomedical Informatics, School of Medicine, Pusan National University, Yangsan, Republic of Korea
- Ji Wan Kang
  - Interdisciplinary Program of Genomic Science, Pusan National University, Yangsan, Republic of Korea
  - Department of Biomedical Informatics, School of Medicine, Pusan National University, Yangsan, Republic of Korea
- Junho Kang
  - Interdisciplinary Program of Genomic Science, Pusan National University, Yangsan, Republic of Korea
  - Department of Biomedical Informatics, School of Medicine, Pusan National University, Yangsan, Republic of Korea
- Eun Jung Kwon
  - Interdisciplinary Program of Genomic Science, Pusan National University, Yangsan, Republic of Korea
  - Department of Biomedical Informatics, School of Medicine, Pusan National University, Yangsan, Republic of Korea
- Mihyang Ha
  - Interdisciplinary Program of Genomic Science, Pusan National University, Yangsan, Republic of Korea
  - Department of Biomedical Informatics, School of Medicine, Pusan National University, Yangsan, Republic of Korea
- Yoon Kyeong Kim
  - Interdisciplinary Program of Genomic Science, Pusan National University, Yangsan, Republic of Korea
  - Department of Biomedical Informatics, School of Medicine, Pusan National University, Yangsan, Republic of Korea
- Hansong Lee
  - Interdisciplinary Program of Genomic Science, Pusan National University, Yangsan, Republic of Korea
  - Department of Biomedical Informatics, School of Medicine, Pusan National University, Yangsan, Republic of Korea
- Je-Keun Rhee
  - School of Systems Biomedical Science, Soongsil University, Seoul, Republic of Korea
- Yun Hak Kim
  - Department of Biomedical Informatics, School of Medicine, Pusan National University, Yangsan, Republic of Korea
  - Department of Anatomy, School of Medicine, Pusan National University, Yangsan, Republic of Korea
|
42
|
Liu Y, Qu HQ, Chang X, Tian L, Qu J, Glessner J, Sleiman PMA, Hakonarson H. Machine Learning Reduced Gene/Non-Coding RNA Features That Classify Schizophrenia Patients Accurately and Highlight Insightful Gene Clusters. Int J Mol Sci 2021; 22:3364. [PMID: 33805976 PMCID: PMC8037538 DOI: 10.3390/ijms22073364] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/03/2021] [Revised: 03/20/2021] [Accepted: 03/23/2021] [Indexed: 12/28/2022] Open
Abstract
RNA-seq has been a powerful method to detect differentially expressed genes and long non-coding RNAs (lncRNAs) in schizophrenia (SCZ) patients; however, due to overfitting problems, differentially expressed targets (DETs) cannot be used properly as biomarkers. This study used machine learning to reduce gene/non-coding RNA features. Dorsolateral prefrontal cortex (DLPFC) RNA-seq data from 254 individuals were obtained from the CommonMind consortium. The average predictive accuracy for SCZ patients was 67% based on coding genes and 96% based on lncRNAs. Machine learning is a powerful approach to reduce functional biomarkers in SCZ patients. The lncRNAs capture the characteristics of SCZ tissue more accurately than mRNA, as the former regulate every level of gene expression, not limited to mRNA levels.
Affiliation(s)
- Yichuan Liu
  - Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Hui-Qi Qu
  - Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Xiao Chang
  - Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Lifeng Tian
  - Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Jingchun Qu
  - Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Joseph Glessner
  - Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Patrick M. A. Sleiman
  - Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
  - Division of Human Genetics, Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Hakon Hakonarson
  - Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
  - Division of Human Genetics, Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Department of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA
|
43
|
Zhang X, Jonassen I, Goksøyr A. Machine Learning Approaches for Biomarker Discovery Using Gene Expression Data. Bioinformatics 2021. [DOI: 10.36255/exonpublications.bioinformatics.2021.ch4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/28/2022] Open
|
44
|
Ye T, Li S, Zhang Y. Genomic pan-cancer classification using image-based deep learning. Comput Struct Biotechnol J 2021; 19:835-846. [PMID: 33598099 PMCID: PMC7848437 DOI: 10.1016/j.csbj.2021.01.010] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/26/2020] [Revised: 01/05/2021] [Accepted: 01/08/2021] [Indexed: 12/24/2022] Open
Abstract
Accurate cancer type classification based on genetic mutation can significantly facilitate cancer-related diagnosis. However, existing methods usually use feature selection combined with simple classifiers to quantify key mutated genes, resulting in poor classification performance. To circumvent this problem, a novel image-based deep learning strategy is employed to distinguish different types of cancer. Unlike conventional methods, we first convert gene mutation data containing single nucleotide polymorphisms, insertions, and deletions into a genetic mutation map, and then apply deep learning networks to classify different cancer types based on the mutation map. We outline these methods and present results obtained in training VGG-16, Inception-v3, ResNet-50, and Inception-ResNet-v2 neural networks to classify 36 types of cancer from 9047 patient samples. Our approach achieves overall higher accuracy (over 95%) compared with other widely adopted classification methods. Furthermore, we demonstrate the application of Guided Grad-CAM visualization to generate heatmaps and identify the top-ranked tumor-type-specific genes and pathways. Experimental results on prostate and breast cancer demonstrate that our method can be applied to various types of cancer. Powered by deep learning, this approach can potentially provide a new solution for pan-cancer classification and cancer driver gene discovery. The source code and datasets supporting the study are available at https://github.com/yetaoyu/Genomic-pan-cancer-classification.
Affiliation(s)
- Taoyu Ye
  - Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong, 518055, China
- Sen Li
  - Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong, 518055, China
- Yang Zhang
  - Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong, 518055, China
|
45
|
Feng Y, Wang Y, Zeng C, Mao H. Artificial Intelligence and Machine Learning in Chronic Airway Diseases: Focus on Asthma and Chronic Obstructive Pulmonary Disease. Int J Med Sci 2021; 18:2871-2889. [PMID: 34220314 PMCID: PMC8241767 DOI: 10.7150/ijms.58191] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 01/14/2021] [Accepted: 05/20/2021] [Indexed: 02/05/2023] Open
Abstract
Chronic airway diseases are characterized by airway inflammation, obstruction, and remodeling and show high prevalence, especially in developing countries. Among them, asthma and chronic obstructive pulmonary disease (COPD) show the highest morbidity and socioeconomic burden worldwide. Although there are extensive guidelines for the prevention, early diagnosis, and rational treatment of these lifelong diseases, their value in precision medicine is very limited. Artificial intelligence (AI) and machine learning (ML) techniques have emerged as effective methods for mining and integrating large-scale, heterogeneous medical data for clinical practice, and several AI and ML methods have recently been applied to asthma and COPD. However, very few methods have significantly contributed to clinical practice. Here, we review four aspects of AI and ML implementation in asthma and COPD to summarize existing knowledge and indicate future steps required for the safe and effective application of AI and ML tools by clinicians.
Affiliation(s)
- Yinhe Feng
  - Department of Respiratory and Critical Care Medicine, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China
  - Department of Respiratory and Critical Care Medicine, People's Hospital of Deyang City, Affiliated Hospital of Chengdu College of Medicine, Deyang, Sichuan Province, China
- Yubin Wang
  - Department of Respiratory and Critical Care Medicine, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China
- Chunfang Zeng
  - Department of Respiratory and Critical Care Medicine, People's Hospital of Deyang City, Affiliated Hospital of Chengdu College of Medicine, Deyang, Sichuan Province, China
- Hui Mao
  - Department of Respiratory and Critical Care Medicine, West China Hospital, Sichuan University, Chengdu, Sichuan Province, China
|
46
|
Martin E, Acem I, Grünhagen DJ, Bovée JVMG, Verhoef C. Prognostic Significance of Immunohistochemical Markers and Genetic Alterations in Malignant Peripheral Nerve Sheath Tumors: A Systematic Review. Front Oncol 2020; 10:594069. [PMID: 33415076 PMCID: PMC7783392 DOI: 10.3389/fonc.2020.594069] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 08/12/2020] [Accepted: 11/19/2020] [Indexed: 12/20/2022] Open
Abstract
Background: Malignant peripheral nerve sheath tumors (MPNSTs) are aggressive soft tissue sarcomas with dismal prognosis. Pathological and genetic markers may predict more aggressive behavior in MPNSTs but have rarely been investigated, and few are used in daily practice. This study reviews the prognostic value of immunohistochemical markers and genetic alterations in MPNST. Methods: A systematic search was performed in PubMed and Embase databases according to the PRISMA guidelines. Search terms related to ‘MPNST’ and ‘prognostic’ were used. Studies investigating the association of immunohistochemical markers or genetic alterations with prognosis were included. Qualitative synthesis was performed on all studies. A distinction was made between univariable and multivariable associations. Results: Forty-six studies were included after full-text screening. Sixty-seven different immunohistochemical markers were investigated. Absence of S100 and H3K27me3 and high Ki67 and p53 staining were most commonly independently associated with worse survival and disease-free survival. Several genetic alterations were investigated as well, with varying association with survival. TP53, CDK4, and RASSF1A alterations were independently associated with worse survival, as were changes in chromosomal length in Xp, 10q, and 16p. Conclusions: MPNSTs harbor complex and heterogeneous biology. Immunohistochemical markers and genetic alterations have variable prognostic value. Absence of S100 and H3K27me3 and increased Ki67 can be of prognostic value. Alterations in TP53 or an increase in p53 staining may distinguish MPNSTs with worse outcomes. Genetic alterations and staining of other cell cycle regulatory and Ras pathway proteins may also help stratify patients with worse outcomes. A combination of markers can increase the prognostic value.
Affiliation(s)
- Enrico Martin
  - Department of Surgical Oncology, Erasmus Medical Center, Rotterdam, Netherlands
  - Department of Plastic and Reconstructive Surgery, University Medical Center Utrecht, Utrecht, Netherlands
- Ibtissam Acem
  - Department of Surgical Oncology, Erasmus Medical Center, Rotterdam, Netherlands
- Dirk J Grünhagen
  - Department of Surgical Oncology, Erasmus Medical Center, Rotterdam, Netherlands
- Judith V M G Bovée
  - Department of Pathology, Leiden University Medical Center, Leiden, Netherlands
- Cornelis Verhoef
  - Department of Surgical Oncology, Erasmus Medical Center, Rotterdam, Netherlands
|
47
|
Cheng N, Schulte AJ, Santosa F, Kim JH. Machine learning application identifies novel gene signatures from transcriptomic data of spontaneous canine hemangiosarcoma. Brief Bioinform 2020; 22:5930848. [PMID: 33078825 DOI: 10.1093/bib/bbaa252] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/01/2020] [Revised: 09/02/2020] [Accepted: 09/08/2020] [Indexed: 01/20/2023] Open
Abstract
Angiosarcomas are soft-tissue sarcomas that form malignant vascular tissues. Angiosarcomas are very rare, and due to their aggressive behavior and high metastatic propensity, they have poor clinical outcomes. Hemangiosarcomas commonly occur in domestic dogs, and share pathological and clinical features with human angiosarcomas. Typical pathognomonic features of this tumor are irregular vascular channels that are filled with blood and are lined by a mixture of malignant and nonmalignant endothelial cells. The current gold standard is the histological diagnosis of angiosarcoma; however, microscopic evaluation may be complicated, particularly when tumor cells are undetectable due to the presence of excessive amounts of nontumor cells or when tissue specimens have insufficient tumor content. In this study, we implemented machine learning applications from next-generation transcriptomic data of canine hemangiosarcoma tumor samples (n = 76) and nonmalignant tissues (n = 10) to evaluate their training performance for diagnostic utility. The 10-fold cross-validation test and multiple feature selection methods were applied. We found that extra trees and random forest learning models were the best classifiers for hemangiosarcoma in our testing datasets. We also identified novel gene signatures using the mutual information and Monte Carlo feature selection method. The extra trees model revealed high classification accuracy for hemangiosarcoma in validation sets. We demonstrate that high-throughput sequencing data of canine hemangiosarcoma are trainable for machine learning applications. Furthermore, our approach enables us to identify novel gene signatures as reliable determinants of hemangiosarcoma, providing significant insights into the development of potential applications for this vascular malignancy.
Affiliation(s)
- Nuojin Cheng
  - School of Mathematics, College of Science and Engineering at the University of Minnesota, Minneapolis, MN, USA
- Ashley J Schulte
  - Animal Cancer Care and Research Program, Department of Veterinary Clinical Sciences, College of Veterinary Medicine at the University of Minnesota, St Paul, MN, USA
- Fadil Santosa
  - Department of Applied Mathematics & Statistics, Whiting School of Engineering at the Johns Hopkins University, Baltimore, MD, USA
- Jong Hyuk Kim
  - Department of Veterinary Clinical Sciences, College of Veterinary Medicine at the University of Minnesota, St Paul, MN, USA
|
48
|
Chen C, Zheng A, Ou X, Wang J, Ma X. Comparison of Radiomics-Based Machine-Learning Classifiers in Diagnosis of Glioblastoma From Primary Central Nervous System Lymphoma. Front Oncol 2020; 10:1151. [PMID: 33042784 PMCID: PMC7522159 DOI: 10.3389/fonc.2020.01151] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 07/17/2019] [Accepted: 06/08/2020] [Indexed: 02/05/2023] Open
Abstract
Purpose: The purpose of the current study was to evaluate the ability of magnetic resonance (MR) radiomics-based machine-learning algorithms to differentiate glioblastoma (GBM) from primary central nervous system lymphoma (PCNSL). Method: One hundred and thirty-eight patients were enrolled in this study. Radiomics features were extracted from contrast-enhanced MR images, and the machine-learning models were established using five selection methods (distance correlation, random forest, least absolute shrinkage and selection operator (LASSO), eXtreme gradient boosting (Xgboost), and Gradient Boosting Decision Tree) and three radiomics-based machine-learning classifiers [linear discriminant analysis (LDA), support vector machine (SVM), and logistic regression (LR)]. Sensitivity, specificity, accuracy, and area under the curve (AUC) were calculated for each model and used to evaluate and compare the performance of the classifiers. Result: Excellent discriminative performance was observed for all classifiers when combined with a suitable selection method. For LDA-based models, the optimal one was Distance Correlation + LDA with an AUC of 0.978. For SVM-based models, Distance Correlation + SVM had the highest AUC of 0.959, while for LR-based models, the highest AUC was 0.966, obtained with LASSO + LR. Conclusion: Radiomics-based machine-learning algorithms show promising performance in differentiating GBM from PCNSL.
Affiliation(s)
- Chaoyue Chen
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Collaborative Innovation Center for Biotherapy, Chengdu, China
  - Department of Neurosurgery, West China Hospital, Sichuan University, Chengdu, China
- Aiping Zheng
  - West China School of Medicine, West China Hospital, Sichuan University, Chengdu, China
- Xuejin Ou
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Collaborative Innovation Center for Biotherapy, Chengdu, China
  - West China School of Medicine, West China Hospital, Sichuan University, Chengdu, China
- Jian Wang
  - School of Computer Science, Nanjing University of Science and Technology, Nanjing, China
- Xuelei Ma
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Collaborative Innovation Center for Biotherapy, Chengdu, China
  - Department of Biotherapy, Cancer Center, West China Hospital, Sichuan University, Chengdu, China
|
49
|
Exarchos KP, Beltsiou M, Votti CA, Kostikas K. Artificial intelligence techniques in asthma: a systematic review and critical appraisal of the existing literature. Eur Respir J 2020; 56:13993003.00521-2020. [PMID: 32381498 DOI: 10.1183/13993003.00521-2020] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 03/02/2020] [Accepted: 04/29/2020] [Indexed: 12/22/2022]
Abstract
Artificial intelligence (AI), when coupled with large amounts of well characterised data, can yield models that are expected to facilitate clinical practice and contribute to the delivery of better care, especially in chronic diseases such as asthma. The purpose of this paper is to review the utilisation of AI techniques in all aspects of asthma research, i.e. from asthma screening and diagnosis, to patient classification and the overall asthma management and treatment, in order to identify trends, draw conclusions and discover potential gaps in the literature. We conducted a systematic review of the literature using PubMed and DBLP from 1988 up to 2019, yielding 425 articles; after removing duplicate and irrelevant articles, 98 were further selected for detailed review. The resulting articles were organised in four categories, and subsequently compared based on a set of qualitative and quantitative factors. Overall, we observed an increasing adoption of AI techniques for asthma research, especially within the last decade. AI is a scientific field that is in the spotlight, especially in the last decade. In asthma there are already numerous studies; however, there are certain unmet needs that need to be further elucidated.
Affiliation(s)
- Maria Beltsiou
  - Respiratory Medicine Dept, School of Medicine, University of Ioannina, Ioannina, Greece
- Konstantinos Kostikas
  - Respiratory Medicine Dept, School of Medicine, University of Ioannina, Ioannina, Greece
|
50
|
Benzekry S. Artificial Intelligence and Mechanistic Modeling for Clinical Decision Making in Oncology. Clin Pharmacol Ther 2020; 108:471-486. [PMID: 32557598 DOI: 10.1002/cpt.1951] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Received: 03/27/2020] [Accepted: 06/04/2020] [Indexed: 12/24/2022]
Abstract
The amount of "big" data generated in clinical oncology, whether of molecular, imaging, pharmacological, or biological origin, brings novel challenges. To mine this source of information efficiently, mathematical models able to produce predictive algorithms and simulations are required, with applications for diagnosis, prognosis, drug development, or prediction of the response to therapy. Such mathematical and computational constructs can be subdivided into two broad classes: biologically agnostic, statistical models using artificial intelligence techniques, and physiologically based, mechanistic models. In this review, recent advances in the applications of such methods in clinical oncology are outlined. These include machine learning applied to big data (omics, imaging, or electronic health records), pharmacometrics and quantitative systems pharmacology, as well as tumor kinetics and metastasis modeling. The focus is on studies with high potential for clinical translation, and particular attention is given to cancer immunotherapy. Perspectives are given in terms of combinations of the two approaches: "mechanistic learning."
Affiliation(s)
- Sebastien Benzekry
  - MONC Team, Inria Bordeaux Sud-Ouest, Talence, France
  - Institut de Mathématiques de Bordeaux, CNRS UMR 5251, Bordeaux University, Talence, France
|