1
Petrušić I, Savić A, Mitrović K, Bačanin N, Sebastianelli G, Secci D, Coppola G. Machine learning classification meets migraine: recommendations for study evaluation. J Headache Pain 2024; 25:215. [PMID: 39639193 PMCID: PMC11622592 DOI: 10.1186/s10194-024-01924-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2024] [Accepted: 11/22/2024] [Indexed: 12/07/2024]
Abstract
The integration of machine learning (ML) classification techniques into migraine research has offered new insights into the pathophysiology and classification of migraine types and subtypes. However, inconsistencies in study design, lack of methodological transparency, and the absence of external validation limit the impact and reproducibility of such studies. This paper presents a framework of six essential recommendations for evaluating ML-based classification in migraine research: (1) group homogenization by clinical phenotype, attack frequency, comorbidity, therapy, and demographics; (2) defining adequate sample size; (3) quality control of collected and preprocessed data; (4) transparent training, testing, and performance evaluation of ML models, including strategies for data splitting, overfitting control, and feature selection; (5) interpretability of results with clinical relevance; and (6) open data and code sharing to facilitate reproducibility. These recommendations aim to balance the trade-off between model generalization and precision while encouraging collaborative standardization across the ML and headache communities. Furthermore, this framework is intended to stimulate discussion toward forming a consortium to establish definitive guidelines for ML-based classification research in the migraine field.
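Recommendation (4) names data splitting as something a study should report transparently. As a minimal sketch of one such strategy, the snippet below performs a stratified train/test split in plain Python, so that each class keeps roughly the same proportion in both partitions. The function name `stratified_split`, the 20% test fraction, and the class labels `"MwA"` and `"MwoA"` are illustrative assumptions, not taken from the paper.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split sample indices into train/test, preserving class proportions.

    Hypothetical helper for illustration only; not the paper's method.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # At least one test sample per class; a class with a single
        # sample would therefore end up only in the test set.
        n_test = max(1, round(len(indices) * test_fraction))
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Toy cohort with made-up labels: 10 of one class, 20 of the other.
labels = ["MwA"] * 10 + ["MwoA"] * 20
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=1)
print(len(train_idx), len(test_idx))
```

Reporting the seed and the split fraction alongside the results is one simple way to make the split itself reproducible.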
Affiliation(s)
- Igor Petrušić
- Laboratory for Advanced Analysis of Neuroimages, Faculty of Physical Chemistry, University of Belgrade, Belgrade, Serbia.
- Andrej Savić
- Science and Research Centre, School of Electrical Engineering, University of Belgrade, Belgrade, Serbia
- Katarina Mitrović
- Department of Information Technologies, Faculty of Technical Sciences Čačak, University of Kragujevac, Čačak, Serbia
- Nebojša Bačanin
- Department of Informatics and Computing, Singidunum University, Belgrade, Serbia
- Gabriele Sebastianelli
- Department of Medico-Surgical Sciences and Biotechnologies, Sapienza University of Rome Polo Pontino ICOT, Latina, Italy
- Daniele Secci
- Department of Engineering and Architecture, University of Parma, Parma, Italy
- Gianluca Coppola
- Department of Medico-Surgical Sciences and Biotechnologies, Sapienza University of Rome Polo Pontino ICOT, Latina, Italy
2
Rydzewski NR, Shi Y, Li C, Chrostek MR, Bakhtiar H, Helzer KT, Bootsma ML, Berg TJ, Harari PM, Floberg JM, Blitzer GC, Kosoff D, Taylor AK, Sharifi MN, Yu M, Lang JM, Patel KR, Citrin DE, Sundling KE, Zhao SG. A platform-independent AI tumor lineage and site (ATLAS) classifier. Commun Biol 2024; 7:314. [PMID: 38480799 PMCID: PMC10937974 DOI: 10.1038/s42003-024-05981-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 02/27/2024] [Indexed: 03/17/2024]
Abstract
Histopathologic diagnosis and classification of cancer play a critical role in guiding treatment. Advances in next-generation sequencing have ushered in new complementary molecular frameworks. However, existing approaches do not independently assess both site-of-origin (e.g. prostate) and lineage (e.g. adenocarcinoma) and have minimal validation in metastatic disease, where classification is more difficult. Utilizing gradient-boosted machine learning, we developed ATLAS, a pair of separate AI Tumor Lineage and Site-of-origin models from RNA expression data on 8249 tumor samples. We assessed performance independently in 10,376 total tumor samples, including 1490 metastatic samples, achieving an accuracy of 91.4% for cancer site-of-origin and 97.1% for cancer lineage. High-confidence predictions (encompassing the majority of cases) were accurate 98-99% of the time in localized and, remarkably, even in metastatic samples. We also identified emergent properties of our lineage scores for tumor types on which the model was never trained (zero-shot learning). Adenocarcinoma/sarcoma lineage scores differentiated epithelioid from biphasic/sarcomatoid mesothelioma. Also, predicted lineage de-differentiation identified neuroendocrine/small cell tumors and was associated with poor outcomes across tumor types. Our platform-independent single-sample approach can be easily translated to existing RNA-seq platforms. ATLAS can complement and guide traditional histopathologic assessment in challenging situations and tumors of unknown primary.
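The abstract reports accuracy separately for high-confidence predictions. As a rough sketch of how such a figure can be computed in general, the snippet below keeps only samples whose top predicted class probability meets a threshold, then reports the share of samples retained (coverage) and the accuracy within that subset. The function name, the 0.9 threshold, and the toy probabilities are assumptions for illustration; the abstract does not state how ATLAS defines high confidence.

```python
def confident_subset_accuracy(probabilities, true_labels, threshold=0.9):
    """Return (coverage, accuracy) over predictions at or above `threshold`.

    `probabilities` is a list of dicts mapping class name to probability.
    Hypothetical helper; the threshold is an assumed value.
    """
    confident = 0
    correct = 0
    for class_probs, truth in zip(probabilities, true_labels):
        predicted = max(class_probs, key=class_probs.get)
        if class_probs[predicted] >= threshold:
            confident += 1
            if predicted == truth:
                correct += 1
    coverage = confident / len(true_labels) if true_labels else 0.0
    accuracy = correct / confident if confident else float("nan")
    return coverage, accuracy


# Made-up predictions for four samples and two sites of origin.
probabilities = [
    {"prostate": 0.97, "lung": 0.03},
    {"prostate": 0.55, "lung": 0.45},  # below threshold, excluded
    {"prostate": 0.05, "lung": 0.95},
    {"prostate": 0.92, "lung": 0.08},  # confident but wrong
]
true_labels = ["prostate", "lung", "lung", "lung"]
coverage, accuracy = confident_subset_accuracy(probabilities, true_labels)
print(coverage, accuracy)
```

Reporting coverage next to the subset accuracy matters, because a stricter threshold can raise accuracy simply by excluding more of the difficult cases.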
Affiliation(s)
- Nicholas R Rydzewski
- Radiation Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Department of Human Oncology, University of Wisconsin, Madison, WI, USA
- Yue Shi
- Department of Human Oncology, University of Wisconsin, Madison, WI, USA
- Chenxuan Li
- Department of Human Oncology, University of Wisconsin, Madison, WI, USA
- Hamza Bakhtiar
- Department of Human Oncology, University of Wisconsin, Madison, WI, USA
- Kyle T Helzer
- Department of Human Oncology, University of Wisconsin, Madison, WI, USA
- Matthew L Bootsma
- Department of Human Oncology, University of Wisconsin, Madison, WI, USA
- Tracy J Berg
- Department of Human Oncology, University of Wisconsin, Madison, WI, USA
- Paul M Harari
- Department of Human Oncology, University of Wisconsin, Madison, WI, USA
- Carbone Cancer Center, University of Wisconsin, Madison, WI, USA
- John M Floberg
- Department of Human Oncology, University of Wisconsin, Madison, WI, USA
- Carbone Cancer Center, University of Wisconsin, Madison, WI, USA
- Grace C Blitzer
- Department of Human Oncology, University of Wisconsin, Madison, WI, USA
- Carbone Cancer Center, University of Wisconsin, Madison, WI, USA
- David Kosoff
- Carbone Cancer Center, University of Wisconsin, Madison, WI, USA
- Department of Medicine, University of Wisconsin, Madison, WI, USA
- Amy K Taylor
- Carbone Cancer Center, University of Wisconsin, Madison, WI, USA
- Department of Medicine, University of Wisconsin, Madison, WI, USA
- Marina N Sharifi
- Carbone Cancer Center, University of Wisconsin, Madison, WI, USA
- Department of Medicine, University of Wisconsin, Madison, WI, USA
- Menggang Yu
- Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, USA
- Joshua M Lang
- Carbone Cancer Center, University of Wisconsin, Madison, WI, USA
- Department of Medicine, University of Wisconsin, Madison, WI, USA
- Krishnan R Patel
- Radiation Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Deborah E Citrin
- Radiation Oncology Branch, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Kaitlin E Sundling
- Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, WI, USA
- Wisconsin State Laboratory of Hygiene, University of Wisconsin, Madison, WI, USA
- Shuang G Zhao
- Department of Human Oncology, University of Wisconsin, Madison, WI, USA.
- Carbone Cancer Center, University of Wisconsin, Madison, WI, USA.
- William S. Middleton Veterans Hospital, Madison, WI, USA.
3
Quraish RU, Hirahata T, Quraish AU, ul Quraish S. An Overview: Genetic Tumor Markers for Early Detection and Current Gene Therapy Strategies. Cancer Inform 2023; 22:11769351221150772. [PMID: 36762284 PMCID: PMC9903029 DOI: 10.1177/11769351221150772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Accepted: 12/24/2022] [Indexed: 02/04/2023]
Abstract
Genomic instability is considered a fundamental factor in any neoplastic disease. Consequently, genetically unstable cells contribute to the intratumoral genetic heterogeneity and phenotypic diversity of cancer. These genetic alterations can be detected by several molecular biology diagnostic techniques, and detected alterations in genomic integrity may serve as reliable genetic molecular markers for the early detection of cancer or cancer-related abnormal changes in body cells. These genetic molecular markers can detect cancer earlier than any other method of cancer diagnosis. Once a tumor is diagnosed, replacement or therapeutic manipulation of these cancer-related abnormal genetic changes becomes possible, which leads toward effective, target-specific cancer treatment; in many cases, personalized treatment could be performed without the adverse effects of chemotherapy and radiotherapy. In this review, we describe how these genetic molecular markers can be detected and how this gene diagnosis can be applied to gene therapy that attacks cancerous cells, directly or indirectly, leading to improved overall management and quality of life for cancer patients.
Affiliation(s)
- Tetsuyuki Hirahata
- Tetsuyuki Hirahata, Hirahata Gene Therapy Laboratory, HIC Clinic #1105, Itocia Office Tower 11F, 2-7-1, Yurakucho, Chiyoda-ku, Tokyo 100-0006, Japan.
4
Liu Q, Zhang M, He Y, Zhang L, Zou J, Yan Y, Guo Y. Predicting the Risk of Incident Type 2 Diabetes Mellitus in Chinese Elderly Using Machine Learning Techniques. J Pers Med 2022; 12:905. [PMID: 35743691 PMCID: PMC9224915 DOI: 10.3390/jpm12060905] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/09/2022] [Revised: 05/21/2022] [Accepted: 05/27/2022] [Indexed: 02/04/2023]
Abstract
Early identification of individuals at high risk of diabetes is crucial for implementing early intervention strategies. However, algorithms specific to elderly Chinese adults are lacking. The aim of this study was to build effective machine learning (ML) prediction models for the risk of type 2 diabetes mellitus (T2DM) in elderly Chinese adults. A retrospective cohort study was conducted using the health screening data of adults older than 65 years in Wuhan, China from 2018 to 2020. After strict data filtering, 127,031 records from eligible participants were used. Overall, 8298 participants were diagnosed with incident T2DM during the 2-year follow-up (2019–2020). The dataset was randomly split into a training set (n = 101,625) and a test set (n = 25,406). We developed prediction models based on four ML algorithms: logistic regression (LR), decision tree (DT), random forest (RF), and extreme gradient boosting (XGBoost). Using LASSO regression, 21 prediction features were selected. Random under-sampling (RUS) was applied to address the class imbalance, and Shapley Additive Explanations (SHAP) were used to calculate and visualize feature importance. Model performance was evaluated by the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy. The XGBoost model achieved the best performance (AUC = 0.7805, sensitivity = 0.6452, specificity = 0.7577, accuracy = 0.7503). Fasting plasma glucose (FPG), education, exercise, gender, and waist circumference (WC) were the top five predictors. This study showed that the XGBoost model can be applied to screen individuals at high risk of T2DM in the early phase, which has strong potential for intelligent prevention and control of diabetes. The key features could also be useful for developing targeted diabetes prevention interventions.
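To make the evaluation terms in this abstract concrete, the following is a minimal plain-Python sketch of random under-sampling and of sensitivity, specificity, and accuracy computed from binary predictions. It is not the authors' code: the function names, the toy labels, and the seed are assumptions, and AUC is omitted because it needs predicted scores rather than hard labels. Under-sampling is commonly applied to the training set only; the abstract does not say where it was applied. The last helper uses the identity that accuracy equals the prevalence-weighted average of sensitivity and specificity, assuming the test-set prevalence matches the cohort's 8298/127,031.

```python
import random


def random_under_sample(labels, seed=0):
    """Return indices keeping every minority sample and an equal-sized
    random subset of the majority class (labels are 0 or 1)."""
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    if len(positives) <= len(negatives):
        minority, majority = positives, negatives
    else:
        minority, majority = negatives, positives
    rng = random.Random(seed)
    kept_majority = rng.sample(majority, len(minority))
    return sorted(minority + kept_majority)


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from hard 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "accuracy": (tp + tn) / len(pairs) if pairs else nan,
    }


def accuracy_from_rates(prevalence, sensitivity, specificity):
    """Accuracy as the prevalence-weighted mean of the two rates."""
    return prevalence * sensitivity + (1 - prevalence) * specificity


# Toy data (made up): 2 positives and 6 negatives.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 1, 1]
balanced_idx = random_under_sample(y_true, seed=42)
metrics = binary_metrics(y_true, y_pred)

# Consistency check on the figures reported in the abstract.
implied_accuracy = accuracy_from_rates(8298 / 127031, 0.6452, 0.7577)
print(balanced_idx, metrics, round(implied_accuracy, 3))
```

With a prevalence of roughly 6.5%, accuracy is dominated by specificity, which is why the reported accuracy sits so close to the reported specificity and why sensitivity and AUC are the more informative figures for a screening model.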
Affiliation(s)
- Qing Liu
- Department of Epidemiology, School of Public Health, Wuhan University, Wuhan 430071, China; (Q.L.); (M.Z.)
- Miao Zhang
- Department of Epidemiology, School of Public Health, Wuhan University, Wuhan 430071, China; (Q.L.); (M.Z.)
- Yifeng He
- School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China; (Y.H.); (J.Z.)
- Lei Zhang
- School of Mathematics and Statistics, Wuhan University, Wuhan 430070, China;
- Jingui Zou
- School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China; (Y.H.); (J.Z.)
- Yaqiong Yan
- Wuhan Center for Disease Control and Prevention, Wuhan 430015, China;
- Yan Guo
- Wuhan Center for Disease Control and Prevention, Wuhan 430015, China;
- Correspondence: