1
Nath A, Kumar CJ, Kalita SK, Singh TP, Dhir R. HybridGWOSPEA2ABC: a novel feature selection algorithm for gene expression data analysis and cancer classification. Comput Methods Biomech Biomed Engin 2025:1-22. [PMID: 40285642 DOI: 10.1080/10255842.2025.2495248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2024] [Revised: 02/25/2025] [Accepted: 04/13/2025] [Indexed: 04/29/2025]
Abstract
BACKGROUND AND OBJECTIVE: DNA microarray technology has had a remarkable impact on biological research, particularly in categorizing and diagnosing cancer and studying gene features and functions. With the availability of extensive collections of cancer-related data, there has been an increased focus on developing optimized Machine Learning (ML) techniques for cancer classification through gene pattern analysis and the identification of specific genes for cancer type categorization. Selecting the relevant genes for diagnosing and treating cancer poses a significant challenge that requires efficient feature selection methods. METHODS: This study introduces a novel hybrid algorithm for gene selection that integrates the Grey Wolf Optimizer (GWO), Strength Pareto Evolutionary Algorithm 2 (SPEA2), and Artificial Bee Colony (ABC). This combination uses intelligence and evolutionary computation to enhance solution diversity, convergence efficiency, and exploration and exploitation capabilities in high-dimensional gene expression data. The algorithm was compared with five bio-inspired algorithms using five different classifiers on various cancer datasets to validate its effectiveness in feature selection. RESULTS: The HybridGWOSPEA2ABC algorithm demonstrated superior performance in identifying relevant cancer biomarkers compared to the conventional bio-inspired algorithms. Comparison with the benchmark algorithms showed the hybrid approach's enhanced capability in addressing the challenges of high-dimensional data and advancing the gene selection problem for cancer classification. CONCLUSION: The novel hybridization algorithm enhances performance by maintaining solution diversity, efficiently converging to optimal solutions, and improving the exploration and exploitation of the search space. This study provides a better understanding of relevant genes for cancer classification and promotes effective methodologies for disease detection and classification.
Affiliation(s)
- Ashimjyoti Nath
  - Department of Computer Science and Information Technology, Cotton University, Guwahati, Assam, India
- Chandan Jyoti Kumar
  - Department of Computer Science and Information Technology, Cotton University, Guwahati, Assam, India
- Sanjib Kr Kalita
  - Department of Computer Science, Gauhati University, Guwahati, Assam, India
- Thipendra Pal Singh
  - Department of Computer Science Engineering and Technology, Bennett University, Greater Noida, India
- Renu Dhir
  - Department of Computer Science and Engineering, NIT Jalandhar, Jalandhar, Punjab, India
2
Shah SNA, Parveen R. Differential gene expression analysis and machine learning identified structural, TFs, cytokine and glycoproteins, including SOX2, TOP2A, SPP1, COL1A1, and TIMP1 as potential drivers of lung cancer. Biomarkers 2025; 30:200-215. [PMID: 39888730 DOI: 10.1080/1354750x.2025.2461698] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2024] [Accepted: 01/26/2025] [Indexed: 02/02/2025]
Abstract
BACKGROUND: Lung cancer is a primary global health concern, responsible for a considerable portion of cancer-related fatalities worldwide. Understanding its molecular complexities is crucial for identifying potential targets for treatment. The goal is to slow disease progression and intervene early to prevent the development of advanced lung cancer cases. Hence, there is an urgent need for new biomarkers that can detect lung cancer in its early stages. METHODS: The study conducted RNA-Seq analysis of lung cancer samples from the publicly available SRA database (NCBI SRP009408), including both control and tumour samples. The genes with differential expression between tumour and healthy tissues were identified using R and Bioconductor. Machine learning (ML) techniques (Random Forest, Lasso, XGBoost, Gradient Boosting and Elastic Net) were employed to pinpoint significant genes, followed by the classifiers Multilayer Perceptron (MLP), Support Vector Machines (SVM) and k-Nearest Neighbours (k-NN). Gene ontology and pathway analyses were performed on the significant differentially expressed genes (DEGs). The top genes from DEG and machine learning analyses were combined for protein-protein interaction (PPI) analysis, identifying 10 hub genes essential for lung cancer progression. RESULTS: The integrated analysis of ML and DEGs revealed the significance of specific genes in lung cancer samples and identified the top 5 upregulated genes (COL11A1, TOP2A, SULF1, DIO2, MIR196A2) and the top 5 downregulated genes (PDK4, FOSB, FLYWCH1, CYB5D2, MIR328), along with their associated genes implicated in pathways or co-expression networks. Among the various algorithms employed, Random Forest and XGBoost proved effective in identifying common genes, underscoring their potential significance in lung cancer pathogenesis. The MLP exhibited the highest accuracy in classifying samples using all genes. Additionally, the PPI analysis identified 10 hub genes that are pivotal in lung cancer pathogenesis: COL1A1, SOX2, SPP1, THBS2, POSTN, COL5A1, COL11A1, TIMP1, TOP2A and PKP1. CONCLUSION: The study contributes to the early prediction of lung cancer by identifying potential biomarkers that could enhance early diagnosis and pave the way for practical clinical applications in the future. Integrating DEGs and machine learning-derived significant genes for PPI analysis offers a robust approach to uncovering critical molecular targets for lung cancer treatment.
Affiliation(s)
- Rafat Parveen
  - Department of Computer Science, Jamia Millia Islamia, New Delhi, India
3
Tang X, Zhong J, Luo H, Zhou F, Wang L, Lin S, Xiong J, Lv H, Zhou Z, Yu H, Cao K. Efficacy of Naringenin against aging and degeneration of nucleus pulposus cells through IGFBP3 inhibition. Sci Rep 2025; 15:6780. [PMID: 40000729 PMCID: PMC11861589 DOI: 10.1038/s41598-025-90909-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2024] [Accepted: 02/17/2025] [Indexed: 02/27/2025] Open
Abstract
Naringenin (NAR), a natural flavonoid, exerts anti-inflammatory and antioxidant effects. However, the pharmacological mechanisms through which NAR prevents and treats intervertebral disc degeneration (IDD) remain unclear. We utilized bioinformatics, machine learning, and network pharmacology to identify shared targets among NAR, senescence, and IDD. Subsequently, molecular docking was conducted to evaluate NAR's binding affinity to the common target. Additionally, we used IL-1β to induce senescence and degeneration in nucleus pulposus cells (NPCs) and conducted a series of cellular assays, including immunoblotting, immunofluorescence, β-galactosidase staining, cell proliferation, cell cycle analysis, and measurement of reactive oxygen species levels, to investigate NAR's impact on IL-1β-induced senescence and degeneration of NPCs. Our study revealed that insulin-like growth factor binding protein 3 (IGFBP3) was the only common target. IGFBP3 exhibited significant differences between the IDD and healthy groups and proved to be an effective diagnostic marker for IDD. Molecular docking confirmed the binding between NAR and IGFBP3. In in vitro experiments, we observed that Igfbp3 expression increased in the senescence and degeneration groups. Igfbp3 knockdown and NAR attenuated IL-1β-induced senescence and degenerative phenotypes in NPCs. In contrast, the effect of NAR was attenuated by recombinant IGFBP3 protein. In conclusion, our findings suggest that NAR plays a preventive and therapeutic role in IDD, likely achieved through the inhibition of Igfbp3 expression.
Affiliation(s)
- Xiaokai Tang
  - Department of Orthopedics, People's Hospital of Deyang City, Deyang, China
  - Orthopedic Hospital, The First Affiliated Hospital, Jiangxi Medical College, Nanchang University, #1519 Dongyue Avenue, Nanchang, 330209, Jiangxi, China
  - The Key Laboratory of Spine and Spinal Cord Disease of Jiangxi Province, Nanchang, 330006, China
- Junlong Zhong
  - Orthopedic Hospital, The First Affiliated Hospital, Jiangxi Medical College, Nanchang University, #1519 Dongyue Avenue, Nanchang, 330209, Jiangxi, China
- Hao Luo
  - Orthopedic Hospital, The First Affiliated Hospital, Jiangxi Medical College, Nanchang University, #1519 Dongyue Avenue, Nanchang, 330209, Jiangxi, China
  - The Key Laboratory of Spine and Spinal Cord Disease of Jiangxi Province, Nanchang, 330006, China
- Faxin Zhou
  - Orthopedic Hospital, The First Affiliated Hospital, Jiangxi Medical College, Nanchang University, #1519 Dongyue Avenue, Nanchang, 330209, Jiangxi, China
  - The Key Laboratory of Spine and Spinal Cord Disease of Jiangxi Province, Nanchang, 330006, China
- Lixia Wang
  - Orthopedic Hospital, The First Affiliated Hospital, Jiangxi Medical College, Nanchang University, #1519 Dongyue Avenue, Nanchang, 330209, Jiangxi, China
  - The Key Laboratory of Spine and Spinal Cord Disease of Jiangxi Province, Nanchang, 330006, China
- Sijian Lin
  - Department of Rehabilitation Medicine, The Second Affiliated Hospital, Jiangxi Medical College, Nanchang University, Nanchang, Jiangxi, China
- Jiachao Xiong
  - Orthopedic Hospital, The First Affiliated Hospital, Jiangxi Medical College, Nanchang University, #1519 Dongyue Avenue, Nanchang, 330209, Jiangxi, China
- Hao Lv
  - Orthopedic Hospital, The First Affiliated Hospital, Jiangxi Medical College, Nanchang University, #1519 Dongyue Avenue, Nanchang, 330209, Jiangxi, China
  - The Key Laboratory of Spine and Spinal Cord Disease of Jiangxi Province, Nanchang, 330006, China
- Zhenhai Zhou
  - Orthopedic Hospital, The First Affiliated Hospital, Jiangxi Medical College, Nanchang University, #1519 Dongyue Avenue, Nanchang, 330209, Jiangxi, China
- Honggui Yu
  - Orthopedic Hospital, The First Affiliated Hospital, Jiangxi Medical College, Nanchang University, #1519 Dongyue Avenue, Nanchang, 330209, Jiangxi, China
- Kai Cao
  - The Key Laboratory of Spine and Spinal Cord Disease of Jiangxi Province, Nanchang, 330006, China
  - Department of Orthopedics, Affiliated Rehabilitation Hospital of Nanchang University, Nanchang, 330002, China
4
Tsai PF, Yuan SM. Using Infrared Raman Spectroscopy with Machine Learning and Deep Learning as an Automatic Textile-Sorting Technology for Waste Textiles. SENSORS (BASEL, SWITZERLAND) 2024; 25:57. [PMID: 39796848 PMCID: PMC11722779 DOI: 10.3390/s25010057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2024] [Revised: 12/18/2024] [Accepted: 12/20/2024] [Indexed: 01/13/2025]
Abstract
With the fast-fashion trend, an increasing number of clothing items are being discarded each year at both the pre-consumer and post-consumer stages. The linear economy produces large volumes of waste, which harm environmental sustainability. This study addresses the pressing need for efficient textile recycling in the circular economy (CE). We developed a highly accurate Raman-spectroscopy-based textile-sorting technology, which overcomes the challenge of diverse fiber combinations in waste textiles. By categorizing textiles into six groups based on their fiber compositions, the sorter improves the quality of recycled fibers. Our study demonstrates the potential of Raman spectroscopy in providing detailed molecular compositional information, which is crucial for effective textile sorting. Furthermore, AI technologies, including PCA, KNN, SVM, RF, ANN, and CNN, are integrated into the sorting process, bringing the sorting speed to 1 piece per second with a precision of over 95% in grouping textiles based on the fiber compositional analysis. This interdisciplinary approach offers a promising solution for sustainable textile recycling, contributing to the objectives of the CE.
Affiliation(s)
- Shyan-Ming Yuan
  - Department of Computer Science, National Yang Ming Chiao Tung University, ChiaoTung Campus, Hsinchu 300093, Taiwan
5
Zhou Z, Zhang R, Zhou A, Lv J, Chen S, Zou H, Zhang G, Lin T, Wang Z, Zhang Y, Weng S, Han X, Liu Z. Proteomics appending a complementary dimension to precision oncotherapy. Comput Struct Biotechnol J 2024; 23:1725-1739. [PMID: 38689716 PMCID: PMC11058087 DOI: 10.1016/j.csbj.2024.04.044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Revised: 04/11/2024] [Accepted: 04/17/2024] [Indexed: 05/02/2024] Open
Abstract
Recent advances in high-throughput proteomic profiling technologies have facilitated the precise quantification of numerous proteins across multiple specimens concurrently. Researchers have the opportunity to comprehensively analyze the molecular signatures in plentiful medical specimens or disease pattern cell lines. Along with advances in data analysis and integration, proteomics data could be efficiently consolidated and employed to recognize precise elementary molecular mechanisms and decode individual biomarkers, guiding the precision treatment of tumors. Herein, we review a broad array of proteomics technologies and the progress and methods for the integration of proteomics data and further discuss how to better merge proteomics in precision medicine and clinical settings.
Affiliation(s)
- Zhaokai Zhou
  - Department of Interventional Radiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
  - Department of Urology, The First Affiliated Hospital of Zhengzhou University, Henan 450052, China
- Ruiqi Zhang
  - Department of Interventional Radiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Aoyang Zhou
  - Department of Interventional Radiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Jinxiang Lv
  - Department of Gastroenterology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Shuang Chen
  - Center of Reproductive Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Haijiao Zou
  - Center of Reproductive Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Ge Zhang
  - Department of Cardiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Ting Lin
  - Department of Interventional Radiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Zhan Wang
  - Department of Urology, The First Affiliated Hospital of Zhengzhou University, Henan 450052, China
- Yuyuan Zhang
  - Department of Interventional Radiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Siyuan Weng
  - Department of Interventional Radiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
- Xinwei Han
  - Department of Interventional Radiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
  - Interventional Institute of Zhengzhou University, Zhengzhou, Henan 450052, China
  - Interventional Treatment and Clinical Research Center of Henan Province, Zhengzhou, Henan 450052, China
- Zaoqu Liu
  - Department of Interventional Radiology, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, China
  - Interventional Institute of Zhengzhou University, Zhengzhou, Henan 450052, China
  - Interventional Treatment and Clinical Research Center of Henan Province, Zhengzhou, Henan 450052, China
  - Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100730, China
6
Cai K, Fu W, Liu H, Yang X, Wang Z, Zhao X. Leveraging Bioinformatics and Machine Learning for Identifying Prognostic Biomarkers and Predicting Clinical Outcomes in Lung Adenocarcinoma. Genes (Basel) 2024; 15:1497. [PMID: 39766765 PMCID: PMC11675206 DOI: 10.3390/genes15121497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2024] [Revised: 11/06/2024] [Accepted: 11/21/2024] [Indexed: 01/11/2025] Open
Abstract
Background/Objectives: Lung adenocarcinoma (LUAD) poses significant challenges due to its poor prognosis and limited treatment options, particularly in the advanced stages. It is crucial to identify genetic biomarkers for improving outcome predictions and guiding personalized therapies. Methods: In this study, we utilize a multi-step approach that combines principled sure independence screening, penalized regression methods and information gain to identify the key genetic features of the ultra-high dimensional RNA-sequencing data from LUAD patients. We then evaluate three methods of survival analysis, the Cox model, survival tree, and random survival forests (RSFs), to compare their predictive performance. Additionally, a protein-protein interaction network is used to explore the biological significance of identified genes. Results: DKK1 and TNS4 are consistently selected as significant predictors across all feature selection methods. The Kaplan-Meier method shows that high expression levels of these genes are strongly correlated with poorer survival outcomes, suggesting their potential as prognostic biomarkers. RSF outperforms Cox and survival tree methods, showing higher AUC and C-index values. The protein-protein interaction network highlights key nodes such as VEGFC and LAMA3, which play central roles in LUAD progression. Conclusions: Our findings provide valuable insights into the genetic mechanisms of LUAD. These results contribute to the development of more accurate prognostic tools and personalized treatment strategies for LUAD.
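The C-index used above to compare the Cox model, survival tree and RSF can be illustrated with a minimal sketch of Harrell's concordance index. This is not the authors' code: the function name `concordance_index` and the toy inputs are hypothetical, ties in survival time are simply ignored, and a higher risk score is assumed to mean a shorter expected survival.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times       -- observed follow-up times
    events      -- 1 if the event (death) was observed, 0 if censored
    risk_scores -- model output; higher is assumed to mean higher risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            # Pair (i, j) is comparable when i had the event before j's time.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # model ranks the earlier event as riskier
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example (made-up values): a perfectly ordered model.
print(concordance_index([1, 2, 3], [1, 1, 1], [3.0, 2.0, 1.0]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why a higher C-index is read as better predictive performance.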
Affiliation(s)
- Kaida Cai
  - Department of Epidemiology and Biostatistics, School of Public Health, Southeast University, Nanjing 210009, China
  - Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China
  - Key Laboratory of Environmental Medicine Engineering, Ministry of Education, School of Public Health, Southeast University, Nanjing 210009, China
- Wenzhi Fu
  - Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China
- Hanwen Liu
  - Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China
- Xiaofang Yang
  - Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China
- Zhengyan Wang
  - Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China
- Xin Zhao
  - Department of Statistics and Actuarial Science, School of Mathematics, Southeast University, Nanjing 211189, China
  - Key Laboratory of Measurement and Control of Complex Systems of Engineering, Ministry of Education, Southeast University, Nanjing 210096, China
7
Bourdiec A, Messaoudi S, El Kasmi I, Chow-Shi-Yée M, Kadoch E, Stebenne ME, Tadevosyan A, Kadoch IJ. Development of a New Personalized Molecular Test Based on Endometrial Receptivity and Maternal-Fetal Dialogue: Adhesio. Biochem Genet 2024:10.1007/s10528-024-10950-y. [PMID: 39488671 DOI: 10.1007/s10528-024-10950-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2024] [Accepted: 10/21/2024] [Indexed: 11/04/2024]
Abstract
Successful embryo implantation relies on a receptive endometrium and a maternofetal dialogue. Abnormal receptivity is a common cause of implantation failure in assisted reproductive techniques. This study aimed to develop a novel transcriptomic-based diagnostic assay, Adhesio, for assessing endometrial receptivity and guiding personalized embryo transfer. Adhesio was developed based on an initial dataset of 74 endometrial biopsies. Two types of biopsy samples were involved: 45 endometrial biopsies collected during the optimal theoretical window of implantation (WOI) and 29 endometrial biopsies whose cells were cultured with or without an autologous embryo. Microarray analysis was performed to identify differentially expressed genes associated with endometrial receptivity, and selected candidate genes were assessed using quantitative real-time polymerase chain reaction (RT-qPCR) on biopsy samples. Statistical analyses were conducted to assess the performance and accuracy of Adhesio. The microarray analysis identified three distinct clusters of endometrial samples with differential gene expression patterns. Cluster 1 exhibited 1717 differentially expressed genes involved in biological processes associated with endometrial receptivity. A specific transcriptomic signature of 60 genes associated with endometrial co-culture was obtained using a class prediction approach. Thereafter, an original panel of 10 genes was selected as potential biomarkers for endometrial receptivity based on their expression profiles in both endometrial biopsies and co-cultured cells. This article outlines the methodology employed to develop Adhesio, a test that assesses endometrial receptivity using an original panel of 10 genes. These genes are not only involved during the WOI but are also influenced by the maternal-fetal dialogue.
Affiliation(s)
- Amelie Bourdiec
  - Clinique ovo, 8000 Boul. Décarie, Montreal, QC, H4P 2S4, Canada
- Imane El Kasmi
  - Clinique ovo, 8000 Boul. Décarie, Montreal, QC, H4P 2S4, Canada
- Eva Kadoch
  - Clinique ovo, 8000 Boul. Décarie, Montreal, QC, H4P 2S4, Canada
- Artak Tadevosyan
  - Clinique ovo, 8000 Boul. Décarie, Montreal, QC, H4P 2S4, Canada
  - Department of Pharmacology and Physiology, Université de Montreal, Montreal, QC, Canada
- Isaac-Jacques Kadoch
  - Clinique ovo, 8000 Boul. Décarie, Montreal, QC, H4P 2S4, Canada
  - Department of Obstetrics and Gynecology, Université de Montreal, Montreal, QC, Canada
8
Liu G, Yang X, Li N. Towards key genes identification for breast cancer survival risk with neural network models. Comput Biol Chem 2024; 112:108143. [PMID: 39142146 DOI: 10.1016/j.compbiolchem.2024.108143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 06/27/2024] [Accepted: 07/02/2024] [Indexed: 08/16/2024]
Abstract
Breast cancer, a common malignant tumor worldwide, has a considerably high rate of recurrence, which endangers the health and life of patients. As more and more data have become available, how to leverage gene expression data to predict the survival risk of cancer patients and identify key genes has become a hot topic in cancer research. Therefore, in this work, we investigate the gene expression and clinical data of breast cancer patients and propose a novel framework focusing on the survival risk classification and key gene identification tasks. We first combine differential expression and univariate Cox regression analysis to achieve dimensionality reduction of the gene expression data. The median survival time is subsequently proposed as the risk classification threshold, and a learning model based on a neural network is trained to classify the survival risk of patients. Innovatively, in this work, activation region visualization technology is selected as the identification tool, which identifies 20 key genes related to the survival risk of breast cancer patients. We further analyze the gene function of these 20 key genes based on the STRING database. Importantly, the genetic biomarkers identified in this paper may possess value for the subsequent clinical treatment of breast cancer according to the literature findings. Our work accomplishes the objective of proposing a targeted approach to enhancing the survival analysis and therapeutic strategies in breast cancer through advanced computational techniques and gene analysis.
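The median-survival-time threshold described above can be sketched in a few lines. This is an illustrative reading rather than the authors' implementation: the function name `label_survival_risk` is hypothetical, patients surviving less than the median are assumed to be the high-risk class, and censoring is ignored because the abstract does not say how it was handled.

```python
from statistics import median


def label_survival_risk(survival_times):
    """Split patients into risk classes at the median survival time.

    Returns the threshold and a list of labels: 1 = high risk (survival
    shorter than the median), 0 = low risk. The direction of the labels
    and the handling of ties are assumptions made for this sketch.
    """
    threshold = median(survival_times)
    labels = [1 if t < threshold else 0 for t in survival_times]
    return threshold, labels


# Toy example with made-up survival times (in months).
threshold, labels = label_survival_risk([10, 20, 30, 40])
print(threshold, labels)
```

The resulting binary labels are what a classifier such as the neural network mentioned in the abstract would be trained to predict from the reduced gene expression features.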
Affiliation(s)
- Gang Liu
  - School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu 730000, China
- Xiao Yang
  - School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu 730000, China
- Nan Li
  - School of Information Science and Engineering, Lanzhou University, Lanzhou, Gansu 730000, China
9
Murmu A, Győrffy B. Artificial intelligence methods available for cancer research. Front Med 2024; 18:778-797. [PMID: 39115792 DOI: 10.1007/s11684-024-1085-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Accepted: 05/17/2024] [Indexed: 11/01/2024]
Abstract
Cancer is a heterogeneous and multifaceted disease with a significant global footprint. Despite substantial technological advancements for battling cancer, early diagnosis and selection of effective treatment remain a challenge. With the convenience of large-scale datasets including multiple levels of data, new bioinformatic tools are needed to transform this wealth of information into clinically useful decision-support tools. In this field, artificial intelligence (AI) technologies with their highly diverse applications are rapidly gaining ground. Machine learning methods, such as Bayesian networks, support vector machines, decision trees, random forests, gradient boosting, and K-nearest neighbors, including neural network models like deep learning, have proven valuable in predictive, prognostic, and diagnostic studies. Researchers have recently employed large language models to tackle new dimensions of problems. However, leveraging the opportunity to utilize AI in clinical settings will require surpassing significant obstacles; a major issue is the lack of use of the available reporting guidelines, which obstructs the reproducibility of published studies. In this review, we discuss the applications of AI methods and explore their benefits and limitations. We summarize the available guidelines for AI in healthcare and highlight the potential role and impact of AI models on future directions in cancer research.
Affiliation(s)
- Ankita Murmu
  - Institute of Molecular Life Sciences, HUN-REN Research Centre for Natural Sciences, Budapest, 1117, Hungary
  - National Laboratory for Drug Research and Development, Budapest, 1117, Hungary
  - Department of Bioinformatics, Semmelweis University, Budapest, 1094, Hungary
- Balázs Győrffy
  - Institute of Molecular Life Sciences, HUN-REN Research Centre for Natural Sciences, Budapest, 1117, Hungary
  - Department of Bioinformatics, Semmelweis University, Budapest, 1094, Hungary
  - Department of Biophysics, University of Pecs, Pecs, 7624, Hungary
10
Hwang J, Cheney P, Kanick SC, Le HND, McClatchy DM, Zhang H, Liu N, John Lu ZQ, Cho TJ, Briggman K, Allen DW, Wells WA, Pogue BW. Hyperspectral dark-field microscopy of human breast lumpectomy samples for tumor margin detection in breast-conserving surgery. JOURNAL OF BIOMEDICAL OPTICS 2024; 29:093503. [PMID: 38715717 PMCID: PMC11075096 DOI: 10.1117/1.jbo.29.9.093503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 03/18/2024] [Accepted: 03/20/2024] [Indexed: 01/06/2025]
Abstract
Significance: Hyperspectral dark-field microscopy (HSDFM) and data cube analysis algorithms demonstrate successful detection and classification of various tissue types, including carcinoma regions in human post-lumpectomy breast tissues excised during breast-conserving surgeries. Aim: We expand the application of HSDFM to the classification of tissue types and tumor subtypes in pre-histopathology human breast lumpectomy samples. Approach: Breast tissues excised during breast-conserving surgeries were imaged by the HSDFM and analyzed. The performance of the HSDFM is evaluated by comparing the backscattering intensity spectra of polystyrene microbead solutions with the Monte Carlo simulation of the experimental data. For classification, two analysis approaches, a supervised technique based on the spectral angle mapper (SAM) algorithm and an unsupervised technique based on the K-means algorithm, are applied to classify various tissue types including carcinoma subtypes. In the supervised technique, the SAM algorithm uses manually extracted endmembers, guided by H&E annotations, as reference spectra, allowing for segmentation maps with classified tissue types including carcinoma subtypes. Results: The manually extracted endmembers of known tissue types and their corresponding threshold spectral correlation angles for classification make a good reference library that validates endmembers computed by the unsupervised K-means algorithm. The unsupervised K-means algorithm, with no a priori information, produces abundance maps with dominant endmembers of various tissue types, including carcinoma subtypes of invasive ductal carcinoma and invasive mucinous carcinoma. The two carcinomas' unique endmembers produced by the two methods agree with each other within a < 2% residual error margin. Conclusions: Our report demonstrates a robust procedure for the validation of an unsupervised algorithm with the essential set of parameters based on the ground truth, histopathological information. We have demonstrated that a trained library of the histopathology-guided endmembers and associated threshold spectral correlation angles computed against well-defined reference data cubes serve as such parameters. Two classification algorithms, one supervised and one unsupervised, are employed to identify regions with carcinoma subtypes of invasive ductal carcinoma and invasive mucinous carcinoma present in the tissues. The two carcinomas' unique endmembers used by the two methods agree to within a < 2% residual error margin. This high-quality library, collected in an environment with no ambient background, may be instrumental in developing or validating more advanced unsupervised data cube analysis algorithms, such as effective neural networks for efficient subtype classification.
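The supervised step described above, matching each pixel spectrum against reference endmembers by spectral angle with a per-class threshold, can be sketched as follows. This is a generic illustration of the spectral angle mapper idea, not the authors' pipeline: the function names, the toy three-band spectra, and the threshold values are all hypothetical.

```python
import math


def spectral_angle(spectrum, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(s * r for s, r in zip(spectrum, reference))
    norm_s = math.sqrt(sum(s * s for s in spectrum))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_s == 0.0 or norm_r == 0.0:
        raise ValueError("spectra must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_angle = max(-1.0, min(1.0, dot / (norm_s * norm_r)))
    return math.acos(cos_angle)


def classify_pixel(spectrum, endmembers, thresholds):
    """Return the label of the closest endmember within its threshold angle.

    endmembers -- dict mapping label -> reference spectrum
    thresholds -- dict mapping label -> maximum accepted angle (radians)
    Returns None when no endmember is within its threshold.
    """
    best_label, best_angle = None, math.inf
    for label, reference in endmembers.items():
        angle = spectral_angle(spectrum, reference)
        if angle <= thresholds[label] and angle < best_angle:
            best_label, best_angle = label, angle
    return best_label


# Toy example with made-up 3-band spectra and thresholds.
endmembers = {"tissue_a": [1.0, 2.0, 3.0], "tissue_b": [3.0, 2.0, 1.0]}
thresholds = {"tissue_a": 0.1, "tissue_b": 0.1}
print(classify_pixel([2.0, 4.1, 6.0], endmembers, thresholds))
```

Because the angle depends only on the direction of the spectrum vector, the measure is insensitive to overall intensity scaling, which is one reason it is commonly used for matching against a library of reference spectra.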
Affiliation(s)
- Jeeseong Hwang
  - National Institute of Standards and Technology, Applied Physics Division, Boulder, Colorado, United States
- Philip Cheney
  - National Institute of Standards and Technology, Applied Physics Division, Boulder, Colorado, United States
  - Battelle Memorial Institute, Columbus, Ohio, United States
- Stephen C. Kanick
  - Dartmouth College, Thayer School of Engineering, Hanover, New Hampshire, United States
- Hanh N. D. Le
  - National Institute of Standards and Technology, Applied Physics Division, Boulder, Colorado, United States
- David M. McClatchy
  - Dartmouth College, Thayer School of Engineering, Hanover, New Hampshire, United States
  - Massachusetts General Hospital, Department of Radiation Oncology, Boston, Massachusetts, United States
- Helen Zhang
  - National Institute of Standards and Technology, Applied Physics Division, Boulder, Colorado, United States
- Nian Liu
  - National Institute of Standards and Technology, Statistical Engineering Division, Gaithersburg, Maryland, United States
- Zhan-Qian John Lu
  - National Institute of Standards and Technology, Statistical Engineering Division, Gaithersburg, Maryland, United States
- Tae Joon Cho
  - National Institute of Standards and Technology, Materials Measurement Science Division, Gaithersburg, Maryland, United States
- Kimberly Briggman
  - National Institute of Standards and Technology, Applied Physics Division, Boulder, Colorado, United States
- David W. Allen
  - National Institute of Standards and Technology, Sensor Science Division, Gaithersburg, Maryland, United States
- Wendy A. Wells
  - Dartmouth Hitchcock Medical Center, Department of Pathology, Lebanon, New Hampshire, United States
- Brian W. Pogue
  - Dartmouth College, Thayer School of Engineering, Hanover, New Hampshire, United States
|
11
|
Wang B, Chen RQ, Li J, Roy K. Interfacing data science with cell therapy manufacturing: where we are and where we need to be. Cytotherapy 2024; 26:967-979. [PMID: 38842968 DOI: 10.1016/j.jcyt.2024.03.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 03/08/2024] [Accepted: 03/12/2024] [Indexed: 08/25/2024]
Abstract
Although several cell-based therapies have received FDA approval, and others are showing promising results, scalable, quality-driven, and reproducible manufacturing of therapeutic cells at a lower cost remains challenging. Challenges include starting material and patient variability, limited understanding of manufacturing process parameter effects on quality, complex supply chain logistics, and lack of predictive, well-understood product quality attributes. These issues can manifest as increased production costs, longer production times, greater batch-to-batch variability, and lower overall yield of viable, high-quality cells. The lack of data-driven insights and decision-making in cell manufacturing and delivery is an underlying commonality behind all these problems. Data collection and analytics from discovery, preclinical and clinical research, process development, and product manufacturing have not been sufficiently utilized to develop a "systems" understanding and identify actionable controls. Experience from other industries shows that data science and analytics can drive technological innovations and manufacturing optimization, leading to improved consistency, reduced risk, and lower cost. The cell therapy manufacturing industry will benefit from implementing data science tools, such as data-driven modeling, data management and mining, AI, and machine learning. The integration of data-driven predictive capabilities into cell therapy manufacturing, such as predicting product quality and clinical outcomes based on manufacturing data, or ensuring robustness and reliability using data-driven supply-chain modeling, could enable more precise and efficient production processes and lead to better patient access and outcomes. In this review, we introduce some of the relevant computational and data science tools and how they are being or can be implemented in the cell therapy manufacturing workflow.
We also identify areas where innovative approaches are required to address challenges and opportunities specific to the cell therapy industry. We conclude that interfacing data science throughout a cell therapy product lifecycle, developing data-driven manufacturing workflow, designing better data collection tools and algorithms, using data analytics and AI-based methods to better understand critical quality attributes and critical-process parameters, and training the appropriate workforce will be critical for overcoming current industry and regulatory barriers and accelerating clinical translation.
Affiliation(s)
- Bryan Wang
  - Marcus Center for Therapeutic Cell Characterization and Manufacturing, Parker H. Petit Institute of Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, GA, USA; Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA; School of Chemical and Biomolecular Engineering, Georgia Institute of Technology, Atlanta, GA, USA; NSF Engineering Research Center (ERC) for cell Manufacturing Technologies (CMaT), USA
- Rui Qi Chen
  - H. Milton Stewart School of Industrial and Systems Engineering, Atlanta, GA, USA
- Jing Li
  - H. Milton Stewart School of Industrial and Systems Engineering, Atlanta, GA, USA
- Krishnendu Roy
  - NSF Engineering Research Center (ERC) for cell Manufacturing Technologies (CMaT), USA; Department of Biomedical Engineering, School of Engineering, Vanderbilt University, Nashville, TN, USA; Department of Pathology, Microbiology, and Immunology, Vanderbilt University School of Medicine, Nashville, TN, USA; Department of Chemical and Biomolecular Engineering, School of Engineering, Vanderbilt University, Nashville, TN, USA
|
12
|
Barrett JS. Artificial Intelligence Opportunities to Guide Precision Dosing Strategies. J Pediatr Pharmacol Ther 2024; 29:434-440. [PMID: 39144390 PMCID: PMC11321806 DOI: 10.5863/1551-6776-29.4.434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2024] [Accepted: 03/12/2024] [Indexed: 08/16/2024]
|
13
|
Wu Y, Yao J, Xu XM, Zhou LL, Salvi R, Ding S, Gao X. Combination of static and dynamic neural imaging features to distinguish sensorineural hearing loss: a machine learning study. Front Neurosci 2024; 18:1402039. [PMID: 38933814 PMCID: PMC11201293 DOI: 10.3389/fnins.2024.1402039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2024] [Accepted: 05/13/2024] [Indexed: 06/28/2024] Open
Abstract
Purpose Sensorineural hearing loss (SNHL) is the most common form of sensory deprivation and is often unrecognized by patients, inducing not only auditory but also nonauditory symptoms. Data-driven classifier modeling with the combination of neural static and dynamic imaging features could be effectively used to classify SNHL individuals and healthy controls (HCs). Methods We conducted hearing evaluation, neurological scale tests and resting-state MRI on 110 SNHL patients and 106 HCs. A total of 1,267 static and dynamic imaging characteristics were extracted from MRI data, and three methods of feature selection were computed, including the Spearman rank correlation test, least absolute shrinkage and selection operator (LASSO) and t test as well as LASSO. Linear, polynomial, radial basis functional kernel (RBF) and sigmoid support vector machine (SVM) models were chosen as the classifiers with fivefold cross-validation. The receiver operating characteristic curve, area under the curve (AUC), sensitivity, specificity and accuracy were calculated for each model. Results SNHL subjects had higher hearing thresholds in each frequency, as well as worse performance in cognitive and emotional evaluations, than HCs. After comparison, the selected brain regions using LASSO based on static and dynamic features were consistent with the between-group analysis, including auditory and nonauditory areas. The subsequent AUCs of the four SVM models (linear, polynomial, RBF and sigmoid) were as follows: 0.8075, 0.7340, 0.8462 and 0.8562. The RBF and sigmoid SVM had relatively higher accuracy, sensitivity and specificity. Conclusion Our research raised attention to static and dynamic alterations underlying hearing deprivation. Machine learning-based models may provide several useful biomarkers for the classification and diagnosis of SNHL.
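The evaluation metrics named in this abstract (AUC, sensitivity, specificity and accuracy) can be sketched in a few lines. This is a generic illustration, not the authors' pipeline: the label coding (1 for SNHL, 0 for HC), the function names, and the toy scores are assumptions. The AUC is computed here as the fraction of patient/control pairs in which the patient receives the higher score, with ties counted as half.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }

def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Toy labels and classifier scores, for illustration only.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
print(confusion_metrics(y_true, y_pred), auc(y_true, scores))
```

In a fivefold cross-validation setting such as the one described, these quantities would typically be computed on each held-out fold and then averaged.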
Affiliation(s)
- Yuanqing Wu
  - Department of Otorhinolaryngology Head and Neck Surgery, Nanjing Drum Tower Hospital Clinical College of Nanjing Medical University, Nanjing, China
  - Department of Otorhinolaryngology Head and Neck Surgery, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
- Jun Yao
  - Department of Radiology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
- Xiao-Min Xu
  - Department of Radiology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
- Lei-Lei Zhou
  - Department of Radiology, Nanjing First Hospital, Nanjing Medical University, Nanjing, China
- Richard Salvi
  - Center for Hearing and Deafness, University at Buffalo, The State University of New York, Buffalo, NY, United States
- Shaohua Ding
  - Department of Radiology, The Affiliated Taizhou People's Hospital of Nanjing Medical University, Taizhou School of Clinical Medicine, Nanjing Medical University, Taizhou, China
- Xia Gao
  - Department of Otorhinolaryngology Head and Neck Surgery, Nanjing Drum Tower Hospital Clinical College of Nanjing Medical University, Nanjing, China
|
14
|
Khanna NN, Singh M, Maindarkar M, Kumar A, Johri AM, Mentella L, Laird JR, Paraskevas KI, Ruzsa Z, Singh N, Kalra MK, Fernandes JFE, Chaturvedi S, Nicolaides A, Rathore V, Singh I, Teji JS, Al-Maini M, Isenovic ER, Viswanathan V, Khanna P, Fouda MM, Saba L, Suri JS. Polygenic Risk Score for Cardiovascular Diseases in Artificial Intelligence Paradigm: A Review. J Korean Med Sci 2023; 38:e395. [PMID: 38013648 PMCID: PMC10681845 DOI: 10.3346/jkms.2023.38.e395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Accepted: 10/15/2023] [Indexed: 11/29/2023] Open
Abstract
Cardiovascular disease (CVD) related mortality and morbidity heavily strain society. The relationship between external risk factors and our genetics has not been well established. It is widely acknowledged that environmental influence and individual behaviours play a significant role in CVD vulnerability, leading to the development of polygenic risk scores (PRS). We employed the PRISMA search method to locate pertinent research and literature to extensively review artificial intelligence (AI)-based PRS models for CVD risk prediction. Furthermore, we analyzed and compared conventional vs. AI-based solutions for PRS. We summarized the recent advances in our understanding of the use of AI-based PRS for risk prediction of CVD. Our study proposes three hypotheses: i) Multiple genetic variations and risk factors can be incorporated into AI-based PRS to improve the accuracy of CVD risk prediction. ii) AI-based PRS for CVD circumvents the drawbacks of conventional PRS calculators by incorporating a larger variety of genetic and non-genetic components, allowing for more precise and individualised risk estimations. iii) Using AI approaches, it is possible to significantly reduce the dimensionality of huge genomic datasets, resulting in more accurate and effective disease risk prediction models. Our study highlighted that the AI-PRS model outperformed traditional PRS calculators in predicting CVD risk. Furthermore, using AI-based methods to calculate PRS may increase the precision of risk predictions for CVD and have significant ramifications for individualized prevention and treatment plans.
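For context on what a conventional PRS calculator does, the sketch below shows the usual additive form: a weighted sum of risk-allele counts, with one effect-size weight per variant. The variant identifiers, weights, and the choice to treat missing genotypes as zero are illustrative assumptions and are not drawn from the review.

```python
def polygenic_risk_score(effect_sizes, dosages):
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per variant)."""
    score = 0.0
    for variant, beta in effect_sizes.items():
        # Assumption: a variant missing from the genotype contributes zero.
        score += beta * dosages.get(variant, 0)
    return score

# Hypothetical variant IDs and effect sizes, for illustration only.
effect_sizes = {"variant_a": 0.20, "variant_b": -0.10, "variant_c": 0.05}
dosages = {"variant_a": 2, "variant_b": 1}
print(polygenic_risk_score(effect_sizes, dosages))
```

The AI-based approaches discussed in the abstract differ mainly in replacing this fixed linear sum with a learned model that can also take non-genetic inputs.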
Affiliation(s)
- Narendra N Khanna
  - Department of Cardiology, Indraprastha APOLLO Hospitals, New Delhi, India
  - Asia Pacific Vascular Society, New Delhi, India
- Manasvi Singh
  - Stroke Monitoring and Diagnostic Division, AtheroPoint™, Roseville, CA, USA
  - Bennett University, Greater Noida, India
- Mahesh Maindarkar
  - Asia Pacific Vascular Society, New Delhi, India
  - Stroke Monitoring and Diagnostic Division, AtheroPoint™, Roseville, CA, USA
  - School of Bioengineering Sciences and Research, Maharashtra Institute of Technology's Art, Design and Technology University, Pune, India
- Amer M Johri
  - Department of Medicine, Division of Cardiology, Queen's University, Kingston, Canada
- Laura Mentella
  - Department of Medicine, Division of Cardiology, University of Toronto, Toronto, Canada
- John R Laird
  - Heart and Vascular Institute, Adventist Health St. Helena, St. Helena, CA, USA
- Zoltan Ruzsa
  - Invasive Cardiology Division, University of Szeged, Szeged, Hungary
- Narpinder Singh
  - Department of Food Science and Technology, Graphic Era Deemed to be University, Dehradun, Uttarakhand, India
- Seemant Chaturvedi
  - Department of Neurology & Stroke Program, University of Maryland, Baltimore, MD, USA
- Andrew Nicolaides
  - Vascular Screening and Diagnostic Centre and University of Nicosia Medical School, Cyprus
- Vijay Rathore
  - Nephrology Department, Kaiser Permanente, Sacramento, CA, USA
- Inder Singh
  - Stroke Monitoring and Diagnostic Division, AtheroPoint™, Roseville, CA, USA
- Jagjit S Teji
  - Ann and Robert H. Lurie Children's Hospital of Chicago, Chicago, IL, USA
- Mostafa Al-Maini
  - Allergy, Clinical Immunology and Rheumatology Institute, Toronto, ON, Canada
- Esma R Isenovic
  - Department of Radiobiology and Molecular Genetics, National Institute of The Republic of Serbia, University of Belgrade, Beograd, Serbia
- Puneet Khanna
  - Department of Anaesthesiology, AIIMS, New Delhi, India
- Mostafa M Fouda
  - Department of Electrical and Computer Engineering, Idaho State University, Pocatello, ID, USA
- Luca Saba
  - Department of Radiology, Azienda Ospedaliero Universitaria, Cagliari, Italy
- Jasjit S Suri
  - Asia Pacific Vascular Society, New Delhi, India
  - Stroke Monitoring and Diagnostic Division, AtheroPoint™, Roseville, CA, USA
  - Department of Computer Engineering, Graphic Era Deemed to be University, Dehradun, India
|
15
|
Zengin HY, Karabulut E. Biomarker detection using corrected degree of domesticity in hybrid social network feature selection for improving classifier performance. BMC Bioinformatics 2023; 24:407. [PMID: 37904081 PMCID: PMC10617059 DOI: 10.1186/s12859-023-05540-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Accepted: 10/20/2023] [Indexed: 11/01/2023] Open
Abstract
BACKGROUND Dimension reduction, especially feature selection, is an important step in improving classification performance for high-dimensional data. Particularly in cancer research, when reducing the number of features, i.e., genes, it is important to select the most informative features/potential biomarkers that could affect the diagnostic accuracy. Therefore, researchers continuously try to explore more efficient ways to reduce the large number of features/genes to a small but informative subset before the classification task. Hybrid methods have been extensively investigated for this purpose, and research to find the optimal approach is ongoing. Social network analysis is used as a part of a hybrid method, although there are several issues that have arisen when using social network tools, such as using a single environment for computing, constructing an adjacency matrix or computing network measures. Therefore, in our study, we apply a hybrid feature selection method consisting of several machine learning algorithms in addition to social network analysis with our proposed network metric, called the corrected degree of domesticity, in a single environment, R, to improve the support vector machine classifier's performance. In addition, we evaluate and compare the performances of several combinations used in the different steps of the method with a simulation experiment. RESULTS The proposed method improves the classifier's performance compared to using the whole feature set in all the cases we investigate. Additionally, in terms of the area under the receiver operating characteristic (ROC) curve, our approach improves classification performance compared to several approaches in the literature. CONCLUSION When using the corrected degree of domesticity as a network degree centrality measure, it is important to use our correction to compare nodes/features with no connection outside of their community since it provides a more accurate ranking among the features. 
Due to the nature of the hybrid method, which includes social network analysis, it is necessary to investigate possible combinations to provide an optimal solution for the microarray data used in the research.
Affiliation(s)
- Hatice Yağmur Zengin
  - Department of Biostatistics, Hacettepe University Faculty of Medicine, Sıhhiye, 06230, Ankara, Türkiye
- Erdem Karabulut
  - Department of Biostatistics, Hacettepe University Faculty of Medicine, Sıhhiye, 06230, Ankara, Türkiye
|
16
|
Khatun R, Akter M, Islam MM, Uddin MA, Talukder MA, Kamruzzaman J, Azad AKM, Paul BK, Almoyad MAA, Aryal S, Moni MA. Cancer Classification Utilizing Voting Classifier with Ensemble Feature Selection Method and Transcriptomic Data. Genes (Basel) 2023; 14:1802. [PMID: 37761941 PMCID: PMC10530870 DOI: 10.3390/genes14091802] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 09/10/2023] [Accepted: 09/12/2023] [Indexed: 09/29/2023] Open
Abstract
Biomarker-based cancer identification and classification tools are widely used in bioinformatics and machine learning fields. However, the high dimensionality of microarray gene expression data poses a challenge for identifying important genes in cancer diagnosis. Many feature selection algorithms optimize cancer diagnosis by selecting optimal features. This article proposes an ensemble rank-based feature selection method (EFSM) and an ensemble weighted average voting classifier (VT) to overcome this challenge. The EFSM uses a ranking method that aggregates features from individual selection methods to efficiently discover the most relevant and useful features. The VT combines support vector machine, k-nearest neighbor, and decision tree algorithms to create an ensemble model. The proposed method was tested on three benchmark datasets and compared to existing built-in ensemble models. The results show that our model achieved higher accuracy, with 100% for leukaemia, 94.74% for colon cancer, and 94.34% for the 11-tumor dataset. This study concludes by identifying a subset of the most important cancer-causing genes and demonstrating their significance compared to the original data. The proposed approach surpasses existing strategies in accuracy and stability, significantly impacting the development of ML-based gene analysis. It detects vital genes with higher precision and stability than other existing methods.
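The two ideas in this abstract, aggregating feature rankings from several selectors and combining classifiers by weighted voting, can be sketched as follows. The abstract does not specify the aggregation rule or the voting weights, so the summed-rank (Borda-style) rule, the function names, and all example values below are assumptions chosen only to make the mechanics concrete.

```python
def aggregate_ranks(rankings, top_k):
    """Combine best-first feature rankings by summed position; keep the top_k.

    Assumption: every ranking lists the same set of features.
    """
    totals = {}
    for ranking in rankings:
        for position, feature in enumerate(ranking):
            totals[feature] = totals.get(feature, 0) + position
    # Lower summed position is better; ties are broken alphabetically.
    ordered = sorted(totals, key=lambda f: (totals[f], f))
    return ordered[:top_k]

def weighted_vote(predictions, weights):
    """Return the label with the largest total weight across classifiers."""
    scores = {}
    for label, weight in zip(predictions, weights):
        scores[label] = scores.get(label, 0.0) + weight
    # Iterating labels in sorted order makes tie-breaking deterministic.
    return max(sorted(scores), key=scores.get)

# Hypothetical gene rankings from three selectors, for illustration only.
rankings = [["g1", "g2", "g3"], ["g2", "g1", "g3"], ["g2", "g3", "g1"]]
print(aggregate_ranks(rankings, top_k=2))

# Hypothetical per-classifier predictions (e.g. SVM, k-NN, decision tree) and weights.
print(weighted_vote(["tumor", "normal", "normal"], [0.5, 0.2, 0.2]))
```

Note that with weighted voting a single high-weight classifier can outvote two lower-weight ones, which is the behaviour that distinguishes it from simple majority voting.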
Affiliation(s)
- Rabea Khatun
  - Department of Computer Science and Engineering, Green University of Bangladesh, Dhaka 1207, Bangladesh
- Maksuda Akter
  - Department of Computer Science and Engineering, Jagannath University, Dhaka 1100, Bangladesh
- Md. Manowarul Islam
  - Department of Computer Science and Engineering, Jagannath University, Dhaka 1100, Bangladesh
- Md. Ashraf Uddin
  - School of Information Technology, Deakin University, Waurn Ponds Campus, Geelong, VIC 3125, Australia
- Md. Alamin Talukder
  - Department of Computer Science and Engineering, Jagannath University, Dhaka 1100, Bangladesh
- Joarder Kamruzzaman
  - Centre for Smart Analytics, Federation University Australia, Ballarat, VIC 3842, Australia
- AKM Azad
  - Department of Mathematics and Statistics, College of Science, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11564, Saudi Arabia
- Bikash Kumar Paul
  - Department of Information and Communication Technology, Mawlana Bhashani Science and Technology University, Tangail 1902, Bangladesh
  - Department of Software Engineering, Daffodil International University (DIU), Dhaka 1342, Bangladesh
- Muhammad Ali Abdulllah Almoyad
  - Department of Basic Medical Sciences, College of Applied Medical Sciences in Khamis Mushyt King Khalid University, Abha 61412, Saudi Arabia
- Sunil Aryal
  - School of Information Technology, Deakin University, Waurn Ponds Campus, Geelong, VIC 3125, Australia
- Mohammad Ali Moni
  - Artificial Intelligence & Data Science, School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland, St Lucia, QLD 4072, Australia
|
17
|
Barik K, Watanabe K, Hirosawa T, Yoshimura Y, Kikuchi M, Bhattacharya J, Saha G. Autism Detection in Children using Common Spatial Patterns of MEG Signals. Annu Int Conf IEEE Eng Med Biol Soc 2023; 2023:1-4. [PMID: 38083789 DOI: 10.1109/embc40787.2023.10340449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/18/2023]
Abstract
Autism exhibits a wide range of developmental disabilities and is associated with aberrant anatomical and functional neural patterns. To detect autism in young children (4-7 years) in an automatic and non-invasive fashion, we have recorded magnetoencephalogram (MEG) signals from 30 autistic and 30 age-matched typically developing (TD) children. We have used a machine learning classification framework with common spatial pattern (CSP)-based logarithmic band power (LBP) features. When comparing the LBP feature to the conventional logarithmic variance (LV) spatial pattern, CSP + LBP (92.77%) has performed better than CSP + LV (90.66%) in the 1-100 Hz frequency range for distinguishing autistic children from TD children. In frequency band-wise analysis using our proposed method, the high gamma frequency band (50-100 Hz) has shown the highest classification accuracy (97.14%). Our findings reveal that the occipital lobe exhibits the most distinct spatial pattern in autistic children over the whole frequency range. This study shows that spatial brain activation patterns can be utilized as potential biomarkers of autism in young children. The improved performance signifies the clinical relevance of the work for autism detection using MEG signals.
|
18
|
Shenker JJ, Steele CJ, Zatorre RJ, Penhune VB. Using cortico-cerebellar structural patterns to classify early- and late-trained musicians. Hum Brain Mapp 2023. [PMID: 37326147 PMCID: PMC10365229 DOI: 10.1002/hbm.26395] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/29/2022] [Revised: 04/19/2023] [Accepted: 05/25/2023] [Indexed: 06/17/2023] Open
Abstract
A body of current evidence suggests that there is a sensitive period for musical training: people who begin training before the age of seven show better performance on tests of musical skill, and also show differences in brain structure (especially in motor cortical and cerebellar regions) compared with those who start later. We used support vector machine models, a subtype of supervised machine learning, to investigate distributed patterns of structural differences between early-trained (ET) and late-trained (LT) musicians and to better understand the age boundaries of the sensitive period for early musicianship. After selecting regions of interest from the cerebellum and cortical sensorimotor regions, we applied recursive feature elimination with cross-validation to produce a model which optimally and accurately classified ET and LT musicians. This model identified a combination of 17 regions, including 9 cerebellar and 8 sensorimotor regions, and maintained a high accuracy and sensitivity (true positives, i.e., ET musicians) without sacrificing specificity (true negatives, i.e., LT musicians). Critically, this model, which defined ET musicians as those who began their training before the age of 7, outperformed all other models in which age of start was earlier or later (between ages 5-10). Our model's ability to accurately classify ET and LT musicians provides additional evidence that musical training before age 7 affects cortico-cerebellar structure in adulthood, and is consistent with the hypothesis that connected brain regions interact during development to reciprocally influence brain and behavioral maturation.
Affiliation(s)
- Joseph J Shenker
  - Department of Psychology, Concordia University, Montreal, Quebec, Canada
  - BRAMS: International Laboratory for Brain, Music, and Sound Research, Montreal, Quebec, Canada
- Christopher J Steele
  - Department of Psychology, Concordia University, Montreal, Quebec, Canada
  - Department of Neurology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Robert J Zatorre
  - BRAMS: International Laboratory for Brain, Music, and Sound Research, Montreal, Quebec, Canada
  - Cognitive Neuroscience Unit, Montreal Neurological Institute, McGill University, Montreal, Quebec, Canada
- Virginia B Penhune
  - Department of Psychology, Concordia University, Montreal, Quebec, Canada
  - BRAMS: International Laboratory for Brain, Music, and Sound Research, Montreal, Quebec, Canada
|
19
|
Xiang T, Li T, Li J, Li X, Wang J. Using machine learning to realize genetic site screening and genomic prediction of productive traits in pigs. FASEB J 2023; 37:e22961. [PMID: 37178007 DOI: 10.1096/fj.202300245r] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2023] [Revised: 03/30/2023] [Accepted: 04/25/2023] [Indexed: 05/15/2023]
Abstract
Genomic prediction, which is based on solving linear mixed-model (LMM) equations, is the most popular method for predicting breeding values or phenotypic performance for economic traits in livestock. With the need to further improve the performance of genomic prediction, nonlinear methods have been considered as an alternative and promising approach. The excellent ability to predict phenotypes in animal husbandry has been demonstrated by machine learning (ML) approaches, which have been rapidly developed. To investigate the feasibility and reliability of implementing genomic prediction using nonlinear models, the performances of genomic predictions for pig productive traits using the linear genomic selection model and nonlinear machine learning models were compared. Then, to reduce the high-dimensional features of genome sequence data, different machine learning algorithms, including the random forest (RF), support vector machine (SVM), extreme gradient boosting (XGBoost) and convolutional neural network (CNN) algorithms, were used to perform genomic feature selection as well as genomic prediction on reduced feature genome data. All of the analyses were processed on two real pig datasets: the published PIC pig dataset and a dataset comprising data from a national pig nucleus herd in Chifeng, North China. Overall, the accuracies of predicted phenotypic performance for traits T1, T2, T3 and T5 in the PIC dataset and average daily gain (ADG) in the Chifeng dataset were higher using the ML methods than the LMM method, while those for trait T4 in the PIC dataset and total number of piglets born (TNB) in the Chifeng dataset were slightly lower using the ML methods than the LMM method. Among all the different ML algorithms, SVM was the most appropriate for genomic prediction. For the genomic feature selection experiment, the most stable and most accurate results across different algorithms were achieved using XGBoost in combination with the SVM algorithm. 
Through feature selection, the number of genomic markers can be reduced to 1 in 20, while the predictive performance on some traits can even be improved compared to using the full genome data. Finally, we developed a new tool that can be used to execute combined XGBoost and SVM algorithms to realize genomic feature selection and phenotypic prediction.
Affiliation(s)
- Tao Xiang
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education & Key Laboratory of Swine Genetics and Breeding of Ministry of Agriculture, Huazhong Agricultural University, Wuhan, China
- Tao Li
  - College of Informatics, Huazhong Agricultural University, Wuhan, China
  - Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, China
  - Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China
- Jielin Li
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education & Key Laboratory of Swine Genetics and Breeding of Ministry of Agriculture, Huazhong Agricultural University, Wuhan, China
- Xin Li
  - College of Informatics, Huazhong Agricultural University, Wuhan, China
  - Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, China
  - Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China
- Jia Wang
  - College of Informatics, Huazhong Agricultural University, Wuhan, China
  - Key Laboratory of Smart Farming for Agricultural Animals, Huazhong Agricultural University, Wuhan, China
  - Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China
|
20
|
Wu Z, Wang W, Zhang K, Fan M, Lin R. Epigenetic and Tumor Microenvironment for Prognosis of Patients with Gastric Cancer. Biomolecules 2023; 13:biom13050736. [PMID: 37238607 DOI: 10.3390/biom13050736] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 11/08/2022] [Revised: 04/02/2023] [Accepted: 04/12/2023] [Indexed: 05/28/2023] Open
Abstract
BACKGROUND Epigenetics studies heritable or inheritable mechanisms that regulate gene expression rather than altering the DNA sequence. However, no research has investigated the link between TME-related genes (TRGs) and epigenetic-related genes (ERGs) in GC. METHODS A complete review of genomic data was performed to investigate the relationship between the epigenesis tumor microenvironment (TME) and machine learning algorithms in GC. RESULTS Firstly, TME-related differential expression of genes (DEGs) performed non-negative matrix factorization (NMF) clustering analysis and determined two clusters (C1 and C2). Then, Kaplan-Meier curves for overall survival (OS) and progression-free survival (PFS) rates suggested that cluster C1 predicted a poorer prognosis. The Cox-LASSO regression analysis identified eight hub genes (SRMS, MET, OLFML2B, KIF24, CLDN9, RNF43, NETO2, and PRSS21) to build the TRG prognostic model and nine hub genes (TMPO, SLC25A15, SCRG1, ISL1, SOD3, GAD1, LOXL4, AKR1C2, and MAGEA3) to build the ERG prognostic model. Additionally, the signature's area under curve (AUC) values, survival rates, C-index scores, and mean squared error (RMS) curves were evaluated against those of previously published signatures, which revealed that the signature identified in this study performed comparably. Meanwhile, based on the IMvigor210 cohort, a statistically significant difference in OS between immunotherapy and risk scores was observed. It was followed by LASSO regression analysis which identified 17 key DEGs and a support vector machine (SVM) model identified 40 significant DEGs, and based on the Venn diagram, eight co-expression genes (ENPP6, VMP1, LY6E, SHISA6, TMEM158, SYT4, IL11, and KLK8) were discovered. CONCLUSION The study identified some hub genes that could be useful in predicting prognosis and management in GC.
Affiliation(s)
- Zenghong Wu
- Division of Gastroenterology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Weijun Wang
- Division of Gastroenterology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Kun Zhang
- Division of Gastroenterology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Mengke Fan
- Division of Gastroenterology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Rong Lin
- Division of Gastroenterology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430074, China
| |
|
21
|
Hoang VT, Jeon HJ, You ES, Yoon Y, Jung S, Lee OJ. Graph Representation Learning and Its Applications: A Survey. SENSORS (BASEL, SWITZERLAND) 2023; 23:4168. [PMID: 37112507 PMCID: PMC10144941 DOI: 10.3390/s23084168] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 03/08/2023] [Revised: 04/16/2023] [Accepted: 04/17/2023] [Indexed: 06/19/2023]
Abstract
Graphs are data structures that effectively represent relational data in the real world. Graph representation learning is a significant task since it could facilitate various downstream tasks, such as node classification, link prediction, etc. Graph representation learning aims to map graph entities to low-dimensional vectors while preserving graph structure and entity relationships. Over the decades, many models have been proposed for graph representation learning. This paper aims to show a comprehensive picture of graph representation learning models, including traditional and state-of-the-art models on various graphs in different geometric spaces. First, we begin with five types of graph embedding models: graph kernels, matrix factorization models, shallow models, deep-learning models, and non-Euclidean models. In addition, we also discuss graph transformer models and Gaussian embedding models. Second, we present practical applications of graph embedding models, from constructing graphs for specific domains to applying models to solve tasks. Finally, we discuss challenges for existing models and future research directions in detail. As a result, this paper provides a structured overview of the diversity of graph embedding models.
Affiliation(s)
- Van Thuy Hoang
- Department of Artificial Intelligence, The Catholic University of Korea, 43, Jibong-ro, Bucheon-si 14662, Gyeonggi-do, Republic of Korea; (V.T.H.); (E.-S.Y.)
| | - Hyeon-Ju Jeon
- Data Assimilation Group, Korea Institute of Atmospheric Prediction Systems (KIAPS), 35, Boramae-ro 5-gil, Dongjak-gu, Seoul 07071, Republic of Korea;
| | - Eun-Soon You
- Department of Artificial Intelligence, The Catholic University of Korea, 43, Jibong-ro, Bucheon-si 14662, Gyeonggi-do, Republic of Korea; (V.T.H.); (E.-S.Y.)
| | - Yoewon Yoon
- Department of Social Welfare, Dongguk University, 30, Pildong-ro 1-gil, Jung-gu, Seoul 04620, Republic of Korea;
| | - Sungyeop Jung
- Semiconductor Devices and Circuits Laboratory, Advanced Institute of Convergence Technology (AICT), Seoul National University, 145, Gwanggyo-ro, Yeongtong-gu, Suwon-si 16229, Gyeonggi-do, Republic of Korea;
| | - O-Joun Lee
- Department of Artificial Intelligence, The Catholic University of Korea, 43, Jibong-ro, Bucheon-si 14662, Gyeonggi-do, Republic of Korea; (V.T.H.); (E.-S.Y.)
| |
|
22
|
Sharma K, Aminian M, Ghosh T, Liu X, Kirby M. Using machine learning to determine the time of exposure to infection by a respiratory pathogen. Sci Rep 2023; 13:5340. [PMID: 37005391 PMCID: PMC10067823 DOI: 10.1038/s41598-023-30306-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2022] [Accepted: 02/21/2023] [Indexed: 04/04/2023] Open
Abstract
Given an infected host, estimating the time that has elapsed since initial exposure to the pathogen is an important problem in public health. In this paper we use longitudinal gene expression data from human challenge studies of viral respiratory illnesses to build predictive models that estimate the time elapsed since onset of respiratory infection. We apply sparsity-driven machine learning to this time-stamped gene expression data to model the time of exposure to a pathogen and subsequent infection accompanied by the onset of the host immune response. These predictive models exploit the fact that the host gene expression profile evolves in time and its characteristic temporal signature can be effectively modeled using a small number of features. Predicting whether the time of exposure falls within the first 48 h after exposure produces a BSR in the range of 80-90% on sequestered test data. A variety of machine learning experiments provide evidence that models developed on one virus can be used to predict exposure time for other viruses, e.g., H1N1, H3N2, and HRV. The interferon [Formula: see text] signaling pathway appears to play a central role in keeping time from onset of infection. Successful prediction of the time of exposure to a pathogen has potential ramifications for patient treatment and contact tracing.
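The abstract reports BSR without expanding it. Assuming it stands for balanced success rate, i.e. the mean of the per-class recalls, a minimal sketch of the metric looks as follows; the labels and predictions are hypothetical and only illustrate why it differs from plain accuracy on imbalanced data.

```python
from collections import defaultdict

def balanced_success_rate(y_true, y_pred):
    """Mean of per-class recall (assumed meaning of BSR)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)

# Hypothetical labels: was the sample taken within 48 h of exposure or later?
y_true = ["early"] * 8 + ["late"] * 2
y_pred = ["early"] * 7 + ["late"] + ["late", "early"]

bsr = balanced_success_rate(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Here plain accuracy is 0.8 while the balanced rate is 0.6875, because the minority class is recalled only half the time.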
Affiliation(s)
- Kartikay Sharma
- Department of Computer Science, Colorado State University, Fort Collins, CO, USA
| | - Manuchehr Aminian
- Department of Mathematics, California State Polytechnic University, Pomona, CA, USA
| | - Tomojit Ghosh
- Department of Computer Science, Colorado State University, Fort Collins, CO, USA
| | - Xiaoyu Liu
- Department of Computer Science, University of Maryland, College Park, MD, USA
| | - Michael Kirby
- Department of Computer Science, Colorado State University, Fort Collins, CO, USA.
- Department of Mathematics, Colorado State University, Fort Collins, CO, USA.
| |
|
23
|
Barik K, Watanabe K, Bhattacharya J, Saha G. Functional connectivity based machine learning approach for autism detection in young children using MEG signals. J Neural Eng 2023; 20. [PMID: 36812588 DOI: 10.1088/1741-2552/acbe1f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2022] [Accepted: 02/22/2023] [Indexed: 02/24/2023]
Abstract
Objective. Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder, and identifying early autism biomarkers plays a vital role in improving detection and subsequent life outcomes. This study aims to reveal hidden biomarkers in the patterns of functional brain connectivity as recorded by the neuro-magnetic brain responses in children with ASD. Approach. We recorded resting-state magnetoencephalogram signals from thirty children with ASD (4-7 years) and thirty age and gender-matched typically developing (TD) children. We used a complex coherency-based functional connectivity analysis to understand the interactions between different brain regions of the neural system. The work characterizes the large-scale neural activity at different brain oscillations using functional connectivity analysis and assesses the classification performance of coherence-based (COH) measures for autism detection in young children. A comparative study has also been carried out on COH-based connectivity networks both region-wise and sensor-wise to understand frequency-band-specific connectivity patterns and their connections with autism symptomatology. We used artificial neural network (ANN) and support vector machine (SVM) classifiers in the machine learning framework with a five-fold CV technique. Main results. To classify ASD from TD children, the COH connectivity feature yields the highest classification accuracy of 91.66% in the high gamma (50-100 Hz) frequency band. In region-wise connectivity analysis, the second highest performance is in the delta band (1-4 Hz) after the gamma band. Combining the delta and gamma band features, we achieved a classification accuracy of 95.03% and 93.33% in the ANN and SVM classifiers, respectively. Using classification performance metrics and further statistical analysis, we show that ASD children demonstrate significant hyperconnectivity. Significance. Our findings support the weak central coherency theory in autism detection. Further, despite its lower complexity, we show that region-wise COH analysis outperforms the sensor-wise connectivity analysis. Altogether, these results demonstrate the functional brain connectivity patterns as an appropriate biomarker of autism in young children.
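To make the coherence-based connectivity feature concrete, the sketch below estimates magnitude-squared coherence between two channels by averaging spectra over non-overlapping segments. It is an assumption-laden illustration: the study describes a complex coherency analysis whose exact estimator is not given in the abstract, and the signals, segment length, and function names here are hypothetical.

```python
import cmath
import math
import random

def dft(segment):
    """One-sided discrete Fourier transform (bins 0 .. n/2)."""
    n = len(segment)
    return [sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(segment))
            for k in range(n // 2 + 1)]

def coherence(x, y, seg_len):
    """Magnitude-squared coherence per frequency bin.

    Auto- and cross-spectra are averaged over non-overlapping segments
    (no windowing), then |Sxy|^2 / (Sxx * Syy) is returned for each bin.
    """
    n_bins = seg_len // 2 + 1
    sxx = [0.0] * n_bins
    syy = [0.0] * n_bins
    sxy = [0j] * n_bins
    for start in range(0, len(x) - seg_len + 1, seg_len):
        fx = dft(x[start:start + seg_len])
        fy = dft(y[start:start + seg_len])
        for k in range(n_bins):
            sxx[k] += abs(fx[k]) ** 2
            syy[k] += abs(fy[k]) ** 2
            sxy[k] += fx[k] * fy[k].conjugate()
    return [abs(sxy[k]) ** 2 / (sxx[k] * syy[k])
            if sxx[k] > 0 and syy[k] > 0 else 0.0
            for k in range(n_bins)]

# Hypothetical sensor signals: two independent noise channels.
rng = random.Random(0)
a = [rng.gauss(0, 1) for _ in range(256)]
b = [rng.gauss(0, 1) for _ in range(256)]

same = coherence(a, a, 32)    # a channel against itself
indep = coherence(a, b, 32)   # two unrelated channels
```

A channel is fully coherent with itself at every frequency, whereas unrelated channels give low values; band-wise averages of such values between sensor or region pairs are the kind of feature a classifier could consume.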
Affiliation(s)
- Kasturi Barik
- Department of Electronics and Electrical Communication Engineering, Indian Institute of Technology Kharagpur, Kharagpur, India
| | - Katsumi Watanabe
- Faculty of Science and Engineering, Waseda University, Tokyo, Japan
| | - Joydeep Bhattacharya
- Department of Psychology, Goldsmiths, University of London, London, United Kingdom
| | - Goutam Saha
- Department of Electronics and Electrical Communication Engineering, Indian Institute of Technology Kharagpur, Kharagpur, India
| |
|
24
|
Jin D, Yang M, Qin Z, Peng J, Ying S. A Weighting Method for Feature Dimension by Semisupervised Learning With Entropy. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:1218-1227. [PMID: 34546928 DOI: 10.1109/tnnls.2021.3105127] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
In this article, a semisupervised weighting method for feature dimension based on entropy is proposed for classification, dimension reduction, and correlation analysis. For real-world data, different feature dimensions usually show different importance. Generally, data in the same class are supposed to be similar, so their entropy should be small; and those in different classes are supposed to be dissimilar, so their entropy should be large. According to this, we propose a way to construct the weights of feature dimensions with the whole entropy and the innerclass entropies. The weights indicate the contribution of their corresponding feature dimensions in classification. They can be used to improve the performance of classification by giving a weighted distance metric and can be applied to dimension reduction and correlation analysis as well. Some numerical experiments are given to test the proposed method by comparing it with some other representative methods. They demonstrate that the proposed method is feasible and efficient in classification, dimension reduction, and correlation analysis.
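The idea of weighting a feature dimension from its whole entropy and its inner-class entropies can be illustrated with a small sketch. The weighting rule below (whole entropy minus the size-weighted inner-class entropy, then normalised) is an assumption chosen for illustration and is not the formula from the article; it also assumes discrete feature values and fully labelled data.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (bits) of a list of discrete values."""
    counts = Counter(values)
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def feature_weights(rows, labels):
    """Illustrative weights: whole entropy minus mean inner-class entropy."""
    n_features = len(rows[0])
    raw = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        whole = entropy(column)
        inner = 0.0
        for cls in set(labels):
            members = [v for v, l in zip(column, labels) if l == cls]
            inner += len(members) / len(column) * entropy(members)
        raw.append(max(whole - inner, 0.0))
    total = sum(raw)
    # Fall back to uniform weights when no feature separates the classes.
    return [w / total if total else 1.0 / n_features for w in raw]

# Hypothetical data: feature 0 follows the class, feature 1 does not.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = ["a", "a", "b", "b"]
weights = feature_weights(rows, labels)
```

With these toy rows all weight goes to feature 0, whose values are uniform inside each class (small inner-class entropy) yet varied overall.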
|
25
|
Song X, Liang K, Li J. WGRLR: A Weighted Group Regularized Logistic Regression for Cancer Diagnosis and Gene Selection. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:1563-1573. [PMID: 36044492 DOI: 10.1109/tcbb.2022.3203167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
Sparse regressions applied to cancer diagnosis face challenges in noise reduction, gene grouping, and group significance evaluation. This paper presents the weighted group regularized logistic regression (WGRLR) to deal with these problems. Clean data was separated from noisy gene expression profile data, based on which gene grouping and model building were performed. An interpretable gene group significance evaluation criterion was proposed based on symmetrical uncertainty and module eigengene. A group-wise individual gene significance evaluation criterion was also presented. The performance of the proposed method was compared with WGGL, ASGL-CMI, SGL, GL, Elastic Net, and lasso on acute leukemia and brain cancer data. Experimental results demonstrate that the proposed method is superior to the other six methods in cancer diagnosis accuracy and gene selection.
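Symmetrical uncertainty, one ingredient of the group significance criterion, has a standard definition: SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)). The sketch below computes it for discrete variables; how the paper combines it with module eigengenes is not shown here, and the toy vectors are hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (bits) of a list of discrete values."""
    counts = Counter(values)
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def symmetrical_uncertainty(x, y):
    """SU = 2 * (H(X) + H(Y) - H(X, Y)) / (H(X) + H(Y)), in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0          # both variables are constant
    hxy = entropy(list(zip(x, y)))
    return 2.0 * (hx + hy - hxy) / (hx + hy)

# Hypothetical discretised expression levels and class labels.
gene_a = [0, 0, 1, 1]
gene_b = [0, 1, 0, 1]
label = [0, 0, 1, 1]

su_informative = symmetrical_uncertainty(gene_a, label)
su_uninformative = symmetrical_uncertainty(gene_b, label)
```

A gene that tracks the label perfectly scores 1 and a gene independent of it scores 0, which is what makes the measure usable as a relevance score.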
|
26
|
Wyatt CDR, Bentley MA, Taylor D, Favreau E, Brock RE, Taylor BA, Bell E, Leadbeater E, Sumner S. Social complexity, life-history and lineage influence the molecular basis of castes in vespid wasps. Nat Commun 2023; 14:1046. [PMID: 36828829 PMCID: PMC9958023 DOI: 10.1038/s41467-023-36456-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2021] [Accepted: 01/31/2023] [Indexed: 02/26/2023] Open
Abstract
A key mechanistic hypothesis for the evolution of division of labour in social insects is that a shared set of genes co-opted from a common solitary ancestral ground plan (a genetic toolkit for sociality) regulates caste differentiation across levels of social complexity. Using brain transcriptome data from nine species of vespid wasps, we test for overlap in differentially expressed caste genes and use machine learning models to predict castes using different gene sets. We find evidence of a shared genetic toolkit across species representing different levels of social complexity. We also find evidence of additional fine-scale differences in predictive gene sets, functional enrichment and rates of gene evolution that are related to level of social complexity, lineage and of colony founding. These results suggest that the concept of a shared genetic toolkit for sociality may be too simplistic to fully describe the process of the major transition to sociality.
Affiliation(s)
- Christopher Douglas Robert Wyatt
- Centre for Biodiversity and Environment Research, Dept Genetics, Evolution & Environment, University College London, London, WC1E 6BT, UK.
| | - Michael Andrew Bentley
- Centre for Biodiversity and Environment Research, Dept Genetics, Evolution & Environment, University College London, London, WC1E 6BT, UK
| | - Daisy Taylor
- School of Biological Sciences, University of Bristol, Bristol, BS8 1TQ, UK
| | - Emeline Favreau
- Centre for Biodiversity and Environment Research, Dept Genetics, Evolution & Environment, University College London, London, WC1E 6BT, UK
| | - Ryan Edward Brock
- School of Biological Sciences, University of Bristol, Bristol, BS8 1TQ, UK
- Department of Crop Genetics, John Innes Centre, Norwich Research Park, Norwich, Norfolk, NR4 7UH, UK
| | - Benjamin Aaron Taylor
- Centre for Biodiversity and Environment Research, Dept Genetics, Evolution & Environment, University College London, London, WC1E 6BT, UK
| | - Emily Bell
- School of Biological Sciences, University of Bristol, Bristol, BS8 1TQ, UK
| | - Ellouise Leadbeater
- Department of Biological Sciences, Royal Holloway University of London, Egham, TW20 0EX, UK
| | - Seirian Sumner
- Centre for Biodiversity and Environment Research, Dept Genetics, Evolution & Environment, University College London, London, WC1E 6BT, UK.
| |
|
27
|
Automated Lung Cancer Segmentation in Tissue Micro Array Analysis Histopathological Images Using a Prototype of Computer-Assisted Diagnosis. J Pers Med 2023; 13:jpm13030388. [PMID: 36983570 PMCID: PMC10051974 DOI: 10.3390/jpm13030388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2023] [Revised: 02/16/2023] [Accepted: 02/16/2023] [Indexed: 02/25/2023] Open
Abstract
Background: Lung cancer is a fatal disease that kills approximately 85% of those diagnosed with it. In recent years, advances in medical imaging have greatly improved the acquisition, storage, and visualization of various pathologies, making it a necessary component in medicine today. Objective: Develop a computer-aided diagnostic system to detect lung cancer early by segmenting tumor and non-tumor tissue on Tissue Micro Array Analysis (TMA) histopathological images. Method: The prototype computer-aided diagnostic system was developed to segment tumor areas, non-tumor areas, and fundus on TMA histopathological images. Results: The system achieved an average accuracy of 83.4% and an F-measurement of 84.4% in segmenting tumor and non-tumor tissue. Conclusion: The computer-aided diagnostic system provides a second diagnostic opinion to specialists, allowing for more precise diagnoses and more appropriate treatments for lung cancer.
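Accuracy and F-measure, the two figures reported for the segmentation, can be computed from pixel-wise labels as in the sketch below. The pixel vectors are hypothetical and the function name `segmentation_scores` is an illustrative assumption, not part of the described prototype.

```python
def segmentation_scores(truth, predicted, positive=1):
    """Return (accuracy, F-measure) for pixel-wise binary labels."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return accuracy, f_measure

# Hypothetical flattened masks: 1 = tumor pixel, 0 = non-tumor pixel.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
acc, f1 = segmentation_scores(truth, predicted)
```

The two numbers can differ because accuracy counts correctly labelled background pixels while the F-measure looks only at the positive class.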
|
28
|
Depto DS, Rizvee MM, Rahman A, Zunair H, Rahman MS, Mahdy MRC. Quantifying imbalanced classification methods for leukemia detection. Comput Biol Med 2023; 152:106372. [PMID: 36516574 DOI: 10.1016/j.compbiomed.2022.106372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2022] [Revised: 11/01/2022] [Accepted: 11/27/2022] [Indexed: 12/03/2022]
Abstract
Uncontrolled proliferation of B-lymphoblast cells is a common characteristic of Acute Lymphoblastic Leukemia (ALL). B-lymphoblasts are found in large numbers in peripheral blood in malignant cases. Early detection of the cell in bone marrow is essential as the disease progresses rapidly if left untreated. However, automated classification of the cell is challenging, owing to its fine-grained variability with B-lymphoid precursor cells and imbalanced data points. Deep learning algorithms show potential for such fine-grained classification but also suffer from the class imbalance problem. In this paper, we explore different deep learning-based State-Of-The-Art (SOTA) approaches to tackle imbalanced classification problems. Our experiment includes input, GAN (Generative Adversarial Networks), and loss-based methods to mitigate the class imbalance issue on the challenging C-NMC and ALLIDB-2 datasets for leukemia detection. We have shown empirical evidence that loss-based methods outperform GAN-based and input-based methods in imbalanced classification scenarios.
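As one example of what a loss-based method can look like, the sketch below applies inverse-frequency class weights to cross-entropy. The abstract does not say which losses were evaluated, so this particular weighting scheme, the function names, and the toy predictions are assumptions for illustration only.

```python
import math
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by n / (k * count), so rare classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}

def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class); probs[i][c] is p(class c)."""
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    return total / sum(weights[y] for y in labels)

# Hypothetical imbalanced batch: nine majority samples, one minority sample,
# and a model that always leans towards the majority class.
labels = [0] * 9 + [1]
probs = [[0.9, 0.1]] * 10

weights = inverse_frequency_weights(labels)
plain = weighted_cross_entropy(probs, labels, {0: 1.0, 1: 1.0})
weighted = weighted_cross_entropy(probs, labels, weights)
```

The weighted loss is much larger than the plain one for this majority-biased model, so training against it penalises ignoring the minority class.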
Affiliation(s)
- Deponker Sarker Depto
- Department of Electrical and Computer Engineering, North South University, Bashundhara, Dhaka, 1229, Bangladesh.
| | - Md Mashfiq Rizvee
- Department of Electrical and Computer Engineering, North South University, Bashundhara, Dhaka, 1229, Bangladesh; Texas Tech University, Lubbock, TX, United States of America.
| | - Aimon Rahman
- Department of Electrical and Computer Engineering, North South University, Bashundhara, Dhaka, 1229, Bangladesh.
| | | | - M Sohel Rahman
- Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology, ECE Building, West Palasi, Dhaka 1205, Bangladesh.
| | - M R C Mahdy
- Department of Electrical and Computer Engineering, North South University, Bashundhara, Dhaka, 1229, Bangladesh.
| |
|
29
|
Jiang X, Dai W, Cai Y. Comparison of machine learning algorithms to SAPS II in predicting in-hospital mortality of fractures of the pelvis and acetabulum: analyzes based on MIMIC-III database. ALL LIFE 2022. [DOI: 10.1080/26895293.2022.2125448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/14/2022] Open
Affiliation(s)
- Xiang Jiang
- Department of Orthopaedics and Rehabilitation, Shanghai Yangzhi Rehabilitation Hospital (Shanghai Sunshine Rehabilitation Center), Tongji University School of Medicine, Shanghai, People’s Republic of China
| | - Weifan Dai
- Department of Digital Hub, Decathlon International, Shanghai, People’s Republic of China
| | - Yanrong Cai
- Department of Medicine, Heidelberg University Hospital, University of Heidelberg, Heidelberg, Germany
| |
|
30
|
Sanders LM, Chandra R, Zebarjadi N, Beale HC, Lyle AG, Rodriguez A, Kephart ET, Pfeil J, Cheney A, Learned K, Currie R, Gitlin L, Vengerov D, Haussler D, Salama SR, Vaske OM. Machine learning multi-omics analysis reveals cancer driver dysregulation in pan-cancer cell lines compared to primary tumors. Commun Biol 2022; 5:1367. [PMID: 36513728 PMCID: PMC9747808 DOI: 10.1038/s42003-022-04075-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2020] [Accepted: 10/06/2022] [Indexed: 12/15/2022] Open
Abstract
Cancer cell lines have been widely used for decades to study biological processes driving cancer development, and to identify biomarkers of response to therapeutic agents. Advances in genomic sequencing have made possible large-scale genomic characterizations of collections of cancer cell lines and primary tumors, such as the Cancer Cell Line Encyclopedia (CCLE) and The Cancer Genome Atlas (TCGA). These studies allow for the first time a comprehensive evaluation of the comparability of cancer cell lines and primary tumors on the genomic and proteomic level. Here we employ bulk mRNA and micro-RNA sequencing data from thousands of samples in CCLE and TCGA, and proteomic data from partner studies in the MD Anderson Cell Line Project (MCLP) and The Cancer Proteome Atlas (TCPA), to characterize the extent to which cancer cell lines recapitulate tumors. We identify dysregulation of a long non-coding RNA and microRNA regulatory network in cancer cell lines, associated with differential expression between cell lines and primary tumors in four key cancer driver pathways: KRAS signaling, NFKB signaling, IL2/STAT5 signaling and TP53 signaling. Our results emphasize the necessity for careful interpretation of cancer cell line experiments, particularly with respect to therapeutic treatments targeting these important cancer pathways.
Affiliation(s)
- Lauren M. Sanders
- Department of Biomolecular Engineering, UC Santa Cruz, Santa Cruz, CA, USA; UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
| | - Rahul Chandra
- Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
| | - Navid Zebarjadi
- UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA; Department of Molecular, Cell and Developmental Biology, UC Santa Cruz, Santa Cruz, CA, USA
| | - Holly C. Beale
- UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA; Department of Molecular, Cell and Developmental Biology, UC Santa Cruz, Santa Cruz, CA, USA
| | - A. Geoffrey Lyle
- UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA; Department of Molecular, Cell and Developmental Biology, UC Santa Cruz, Santa Cruz, CA, USA
| | - Analiz Rodriguez
- Department of Neurosurgery, University of Arkansas for Medical Sciences, Little Rock, AR, USA
| | - Ellen Towle Kephart
- UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
| | - Jacob Pfeil
- Department of Biomolecular Engineering, UC Santa Cruz, Santa Cruz, CA, USA; UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
| | - Allison Cheney
- UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA; Department of Molecular, Cell and Developmental Biology, UC Santa Cruz, Santa Cruz, CA, USA
| | - Katrina Learned
- Department of Biomolecular Engineering, UC Santa Cruz, Santa Cruz, CA, USA; UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
| | - Rob Currie
- Department of Biomolecular Engineering, UC Santa Cruz, Santa Cruz, CA, USA; UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
| | - Leonid Gitlin
- Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, California, USA
| | - David Vengerov
- Oracle Labs, Oracle Corporation, Pleasanton, CA, USA
| | - David Haussler
- Department of Biomolecular Engineering, UC Santa Cruz, Santa Cruz, CA, USA; UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA
| | - Sofie R. Salama
- Department of Biomolecular Engineering, UC Santa Cruz, Santa Cruz, CA, USA; Howard Hughes Medical Institute, UC Santa Cruz, Santa Cruz, CA, USA
| | - Olena M. Vaske
- UC Santa Cruz Genomics Institute, Santa Cruz, CA, USA; Department of Molecular, Cell and Developmental Biology, UC Santa Cruz, Santa Cruz, CA, USA
| |
|
31
|
Ke W, Crist RM, Clogston JD, Stern ST, Dobrovolskaia MA, Grodzinski P, Jensen MA. Trends and patterns in cancer nanotechnology research: A survey of NCI's caNanoLab and nanotechnology characterization laboratory. Adv Drug Deliv Rev 2022; 191:114591. [PMID: 36332724 PMCID: PMC9712232 DOI: 10.1016/j.addr.2022.114591] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2022] [Revised: 10/22/2022] [Accepted: 10/27/2022] [Indexed: 11/11/2022]
Abstract
Cancer nanotechnologies possess immense potential as therapeutic and diagnostic treatment modalities and have undergone significant and rapid advancement in recent years. With this emergence, the complexities of data standards in the field are on the rise. Data sharing and reanalysis is essential to more fully utilize this complex, interdisciplinary information to answer research questions, promote the technologies, optimize use of funding, and maximize the return on scientific investments. In order to support this, various data-sharing portals and repositories have been developed which not only provide searchable nanomaterial characterization data, but also provide access to standardized protocols for synthesis and characterization of nanomaterials as well as cutting-edge publications. The National Cancer Institute's (NCI) caNanoLab is a dedicated repository for all aspects pertaining to cancer-related nanotechnology data. The searchable database provides a unique opportunity for data mining and the use of artificial intelligence and machine learning, which aims to be an essential arm of future research studies, potentially speeding the design and optimization of next-generation therapies. It also provides an opportunity to track the latest trends and patterns in nanomedicine research. This manuscript provides the first look at such trends extracted from caNanoLab and compares these to similar metrics from the NCI's Nanotechnology Characterization Laboratory, a laboratory providing preclinical characterization of cancer nanotechnologies to researchers around the globe. Together, these analyses provide insight into the emerging interests of the research community and rise of promising nanoparticle technologies.
Affiliation(s)
- Weina Ke
- Bioinformatics and Computational Science, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD, United States
| | - Rachael M Crist
- Nanotechnology Characterization Laboratory, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD, United States
| | - Jeffrey D Clogston
- Nanotechnology Characterization Laboratory, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD, United States
| | - Stephan T Stern
- Nanotechnology Characterization Laboratory, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD, United States
| | - Marina A Dobrovolskaia
- Nanotechnology Characterization Laboratory, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD, United States
| | - Piotr Grodzinski
- Nanodelivery Systems and Devices Branch, Cancer Imaging Program, National Cancer Institute, Rockville, MD, United States
| | - Mark A Jensen
- Bioinformatics and Computational Science, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD, United States.
| |
|
32
|
Disease-related compound identification based on deeping learning method. Sci Rep 2022; 12:20594. [PMID: 36446871 PMCID: PMC9708143 DOI: 10.1038/s41598-022-24385-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2022] [Accepted: 11/15/2022] [Indexed: 12/02/2022] Open
Abstract
Acute lung injury (ALI) is a serious respiratory disease that can lead to acute respiratory failure or death. It is closely related to the pathogenesis of novel coronavirus pneumonia (COVID-19). Many studies have shown that traditional Chinese medicine (TCM) is effective in its intervention, and network pharmacology could play a very important role. To construct the "disease-gene-target-drug" interaction network more accurately, a deep learning algorithm is utilized in this paper. Two ALI-related target genes (REAL and SATA3) are considered, and the active and inactive compounds of the two corresponding target genes are collected as training data, respectively. Molecular descriptors and molecular fingerprints are utilized to characterize each compound. A forest graph-embedded deep feedforward network (forgeNet) is proposed for training. The experimental results show that forgeNet performs better than support vector machines (SVM), random forest (RF), logistic regression (LR), Naive Bayes (NB), XGBoost, LightGBM and gcForest. forgeNet could identify 19 compounds in Erhuang decoction (EhD) and Dexamethasone (DXMS) more accurately.
|
33
|
Liu CF, Hung CM, Ko SC, Cheng KC, Chao CM, Sung MI, Hsing SC, Wang JJ, Chen CJ, Lai CC, Chen CM, Chiu CC. An artificial intelligence system to predict the optimal timing for mechanical ventilation weaning for intensive care unit patients: A two-stage prediction approach. Front Med (Lausanne) 2022; 9:935366. [PMID: 36465940 PMCID: PMC9715756 DOI: 10.3389/fmed.2022.935366] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/03/2022] [Accepted: 10/11/2022] [Indexed: 11/03/2023] Open
Abstract
BACKGROUND For intensivists, accurately assessing the ideal timing for successful weaning from mechanical ventilation (MV) in the intensive care unit (ICU) is very challenging. PURPOSE To use an artificial intelligence (AI) approach to build two-stage predictive models, namely a try-weaning stage and a weaning-MV stage, to determine the optimal timing of weaning from MV for intubated ICU patients, and to implement the models in practice to assist clinical decision making. METHODS AI and machine learning (ML) technologies were used to establish the predictive models in the two stages. Each stage comprised 11 prediction time points with 11 prediction models. Twenty-five features were used for the first-stage models while 20 features were used for the second-stage models. The optimal models for each time point were selected for further practical implementation in a digital dashboard style. Seven machine learning algorithms including Logistic Regression (LR), Random Forest (RF), Support Vector Machines (SVM), K Nearest Neighbor (KNN), lightGBM, XGBoost, and Multilayer Perceptron (MLP) were used. The electronic medical records of the intubated ICU patients of Chi Mei Medical Center (CMMC) from 2016 to 2019 were included for modeling. Models with the highest area under the receiver operating characteristic curve (AUC) were regarded as optimal models and used to develop the prediction system accordingly. RESULTS A total of 5,873 cases were included in machine learning modeling for Stage 1 with the AUCs of optimal models ranging from 0.843 to 0.953. Further, 4,172 cases were included for Stage 2 with the AUCs of optimal models ranging from 0.889 to 0.944. A prediction system (dashboard) with the optimal models of the two stages was developed and deployed in the ICU setting. Respiratory care members expressed high recognition of the AI dashboard assisting ventilator weaning decisions. Also, the impact analysis of with- and without-AI assistance revealed that our AI models could shorten the patients' intubation time by 21 hours, besides gaining the benefit of substantial consistency between these two decision-making strategies. CONCLUSION We noticed that the two-stage AI prediction models could effectively and precisely predict the optimal timing to wean intubated patients in the ICU from ventilator use. This could reduce patient discomfort, improve medical quality, and lower medical costs. This AI-assisted prediction system is beneficial for clinicians to cope with a high demand for ventilators during the COVID-19 pandemic.
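The abstract above selects, for each prediction time point, the model with the highest AUC. A minimal sketch of that selection rule follows, assuming toy labels and scores; the function names `auc` and `pick_best`, the two model names used as dictionary keys, and every number are hypothetical illustrations, not the study's data or code.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pick_best(labels, scores_by_model):
    """Return the name of the model with the highest AUC, plus all AUCs."""
    aucs = {name: auc(labels, s) for name, s in scores_by_model.items()}
    best = max(aucs, key=aucs.get)
    return best, aucs


# Hypothetical scores from two candidate models at a single time point.
labels = [1, 1, 1, 0, 0, 0]
scores_by_model = {
    "LR": [0.9, 0.4, 0.6, 0.5, 0.2, 0.1],
    "RF": [0.9, 0.8, 0.7, 0.3, 0.2, 0.1],
}
best, aucs = pick_best(labels, scores_by_model)
print(best, aucs)
```

Repeating this per time point would give one selected model for each of the 11 points in a stage; how the authors split and validated their data is not stated in the abstract.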
Affiliation(s)
- Chung-Feng Liu
  - Department of Medical Research, Chi Mei Medical Center, Tainan, Taiwan
- Chao-Ming Hung
  - Department of General Surgery, E-Da Cancer Hospital, Kaohsiung, Taiwan
  - College of Medicine, I-Shou University, Kaohsiung, Taiwan
- Shian-Chin Ko
  - Department of Respiratory Therapy, Chi Mei Medical Center, Tainan, Taiwan
- Kuo-Chen Cheng
  - Department of Internal Medicine, Chi Mei Medical Center, Tainan, Taiwan
- Chien-Ming Chao
  - Department of Intensive Care Medicine, Chi Mei Medical Center, Liouying, Taiwan
  - Department of Dental Laboratory Technology, Min-Hwei College of Health Care Management, Liouying, Taiwan
- Mei-I Sung
  - Department of Respiratory Therapy, Chi Mei Medical Center, Tainan, Taiwan
- Shu-Chen Hsing
  - Department of Respiratory Therapy, Chi Mei Medical Center, Tainan, Taiwan
- Jhi-Joung Wang
  - Department of Anesthesiology, Chi Mei Medical Center, Tainan, Taiwan
  - Department of Anesthesiology, National Defense Medical Center, Taipei, Taiwan
- Chia-Jung Chen
  - Department of Information Systems, Chi Mei Medical Center, Tainan, Taiwan
- Chih-Cheng Lai
  - Division of Hospital Medicine, Department of Internal Medicine, Chi Mei Medical Center, Tainan, Taiwan
- Chin-Ming Chen
  - Department of Intensive Care Medicine, Chi Mei Medical Center, Tainan, Taiwan
- Chong-Chi Chiu
  - Department of General Surgery, E-Da Cancer Hospital, Kaohsiung, Taiwan
  - School of Medicine, College of Medicine, I-Shou University, Kaohsiung, Taiwan
  - Department of Medical Education and Research, E-Da Cancer Hospital, Kaohsiung, Taiwan
  - Department of General Surgery, Chi Mei Medical Center, Tainan, Taiwan
|
34
|
Bhandari N, Walambe R, Kotecha K, Khare SP. A comprehensive survey on computational learning methods for analysis of gene expression data. Front Mol Biosci 2022; 9:907150. [PMID: 36458095 PMCID: PMC9706412 DOI: 10.3389/fmolb.2022.907150] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 03/29/2022] [Accepted: 09/28/2022] [Indexed: 09/19/2023] Open
Abstract
Computational analysis methods including machine learning have a significant impact in the fields of genomics and medicine. High-throughput gene expression analysis methods such as microarray technology and RNA sequencing produce enormous amounts of data. Traditionally, statistical methods are used for comparative analysis of gene expression data. However, more complex analysis for classification of sample observations, or discovery of feature genes requires sophisticated computational approaches. In this review, we compile various statistical and computational tools used in analysis of expression microarray data. Even though the methods are discussed in the context of expression microarrays, they can also be applied for the analysis of RNA sequencing and quantitative proteomics datasets. We discuss the types of missing values, and the methods and approaches usually employed in their imputation. We also discuss methods of data normalization, feature selection, and feature extraction. Lastly, methods of classification and class discovery along with their evaluation parameters are described in detail. We believe that this detailed review will help the users to select appropriate methods for preprocessing and analysis of their data based on the expected outcome.
Affiliation(s)
- Nikita Bhandari
  - Computer Science Department, Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune, India
- Rahee Walambe
  - Electronics and Telecommunication Department, Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune, India
  - Symbiosis Center for Applied AI (SCAAI), Symbiosis International (Deemed University), Pune, India
- Ketan Kotecha
  - Computer Science Department, Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune, India
  - Symbiosis Center for Applied AI (SCAAI), Symbiosis International (Deemed University), Pune, India
- Satyajeet P. Khare
  - Symbiosis School of Biological Sciences, Symbiosis International (Deemed University), Pune, India
|
35
|
Application of Advanced Non-Linear Spectral Decomposition and Regression Methods for Spectroscopic Analysis of Targeted and Non-Targeted Irradiation Effects in an In-Vitro Model. Int J Mol Sci 2022; 23:ijms232112986. [DOI: 10.3390/ijms232112986] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2022] [Revised: 09/30/2022] [Accepted: 10/11/2022] [Indexed: 12/24/2022] Open
Abstract
Irradiation of the tumour site during treatment for cancer with external-beam ionising radiation results in a complex and dynamic series of effects in both the tumour itself and the normal tissue which surrounds it. The development of a spectral model of the effect of each exposure and interaction mode between these tissues would enable label free assessment of the effect of radiotherapeutic treatment in practice. In this study Fourier transform Infrared microspectroscopic imaging was employed to analyse an in-vitro model of radiotherapeutic treatment for prostate cancer, in which a normal cell line (PNT1A) was exposed to low-dose X-ray radiation from the scattered treatment beam, and also to irradiated cell culture medium (ICCM) from a cancer cell line exposed to a treatment relevant dose (2 Gy). Various exposure modes were studied and reference was made to previously acquired data on cellular survival and DNA double strand break damage. Spectral analysis with manifold methods, linear spectral fitting, non-linear classification and non-linear regression approaches were found to accurately segregate spectra on irradiation type and provide a comprehensive set of spectral markers which differentiate on irradiation mode and cell fate. The study demonstrates that high dose irradiation, low-dose scatter irradiation and radiation-induced bystander exposure (RIBE) signalling each produce differential effects on the cell which are observable through spectroscopic analysis.
|
36
|
Irshad MT, Nisar MA, Huang X, Hartz J, Flak O, Li F, Gouverneur P, Piet A, Oltmanns KM, Grzegorzek M. SenseHunger: Machine Learning Approach to Hunger Detection Using Wearable Sensors. SENSORS (BASEL, SWITZERLAND) 2022; 22:s22207711. [PMID: 36298061 PMCID: PMC9609214 DOI: 10.3390/s22207711] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/17/2022] [Revised: 09/26/2022] [Accepted: 10/06/2022] [Indexed: 05/23/2023]
Abstract
The perception of hunger and satiety is of great importance to maintaining a healthy body weight and avoiding chronic diseases such as obesity, underweight, or deficiency syndromes due to malnutrition. A number of disease patterns are characterized by a chronic loss of this perception. To the best of our knowledge, hunger and satiety cannot be classified using non-invasive measurements. Aiming to develop an objective classification system, this paper presents a multimodal sensory system using associated signal processing and pattern recognition methods for hunger and satiety detection based on non-invasive monitoring. We used an Empatica E4 smartwatch, a RespiBan wearable device, and JINS MEME smart glasses to capture physiological signals from five healthy normal-weight subjects sitting inactively on a chair in a state of hunger and satiety. After pre-processing the signals, we compared different feature extraction approaches, either based on manual feature engineering or deep feature learning. Comparative experiments were carried out to determine the most appropriate sensor channel, device, and classifier to reliably discriminate between hunger and satiety states. Our experiments showed that the most discriminative features come from three specific sensor modalities: Electrodermal Activity (EDA), infrared Thermopile (Tmp), and Blood Volume Pulse (BVP).
Affiliation(s)
- Muhammad Tausif Irshad
  - Institute of Medical Informatics, University of Lübeck, Ratzeburger Allee 160, 23562 Lübeck, Germany
  - Department of IT, University of the Punjab, Katchery Road, Lahore 54000, Pakistan
- Muhammad Adeel Nisar
  - Institute of Medical Informatics, University of Lübeck, Ratzeburger Allee 160, 23562 Lübeck, Germany
  - Department of IT, University of the Punjab, Katchery Road, Lahore 54000, Pakistan
- Xinyu Huang
  - Institute of Medical Informatics, University of Lübeck, Ratzeburger Allee 160, 23562 Lübeck, Germany
- Jana Hartz
  - Institute of Medical Informatics, University of Lübeck, Ratzeburger Allee 160, 23562 Lübeck, Germany
- Olaf Flak
  - Department of Management, Faculty of Law and Social Sciences, Jan Kochanowski University of Kielce, ul. Żeromskiego 5, 25-369 Kielce, Poland
- Frédéric Li
  - Institute of Medical Informatics, University of Lübeck, Ratzeburger Allee 160, 23562 Lübeck, Germany
- Philip Gouverneur
  - Institute of Medical Informatics, University of Lübeck, Ratzeburger Allee 160, 23562 Lübeck, Germany
- Artur Piet
  - Institute of Medical Informatics, University of Lübeck, Ratzeburger Allee 160, 23562 Lübeck, Germany
- Kerstin M. Oltmanns
  - Section of Psychoneurobiology, Center of Brain, Behavior and Metabolism, University of Lübeck, Ratzeburger Allee 160, 23562 Lübeck, Germany
- Marcin Grzegorzek
  - Institute of Medical Informatics, University of Lübeck, Ratzeburger Allee 160, 23562 Lübeck, Germany
  - Department of Knowledge Engineering, University of Economics in Katowice, Bogucicka 3, 40-287 Katowice, Poland
|
37
|
Xu D, Liu B, Wang J, Zhang Z. Bibliometric analysis of artificial intelligence for biotechnology and applied microbiology: Exploring research hotspots and frontiers. Front Bioeng Biotechnol 2022; 10:998298. [PMID: 36277390 PMCID: PMC9585160 DOI: 10.3389/fbioe.2022.998298] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/19/2022] [Accepted: 09/23/2022] [Indexed: 11/13/2022] Open
Abstract
Background: In the biotechnology and applied microbiology sectors, artificial intelligence (AI) has been extensively used in disease diagnostics, drug research and development, functional genomics, biomarker recognition, and medical imaging diagnostics. In our study, from 2000 to 2021, science publications focusing on AI in biotechnology were reviewed, and quantitative, qualitative, and modeling analyses were performed. Methods: On 6 May 2022, the Web of Science Core Collection (WoSCC) was screened for AI applications in biotechnology and applied microbiology; 3,529 studies were identified between 2000 and 2022, and analyzed. The following information was collected: publication, country or region, references, knowledgebase, institution, keywords, journal name, and research hotspots, and examined using VOSviewer and CiteSpace V bibliometric platforms. Results: We showed that 128 countries published articles related to AI in biotechnology and applied microbiology; the United States had the most publications. In addition, 584 global institutions contributed to publications, with the Chinese Academy of Science publishing the most. Reference clusters from studies were categorized into ten headings: deep learning, prediction, support vector machines (SVM), object detection, feature representation, synthetic biology, amyloid, human microRNA precursors, systems biology, and single cell RNA-Sequencing. Research frontier keywords were represented by microRNA (2012–2020) and protein-protein interactions (PPIs) (2012–2020). Conclusion: We systematically, objectively, and comprehensively analyzed AI-related biotechnology and applied microbiology literature, and additionally, identified current hot spots and future trends in this area. Our review provides researchers with a comprehensive overview of the dynamic evolution of AI in biotechnology and applied microbiology and identifies future key research areas.
Affiliation(s)
- Dongyu Xu
  - Department of Computer, School of Intelligent Medicine, China Medical University, Shenyang, Liaoning, China
- Bing Liu
  - Department of Bone Oncology, The People’s Hospital of Liaoning Province, Shenyang, Liaoning, China
- Jian Wang
  - Department of Pathogenic Biology, School of Basic Medicine, China Medical University, Shenyang, Liaoning, China
- Zhichang Zhang
  - Department of Computer, School of Intelligent Medicine, China Medical University, Shenyang, Liaoning, China
  - *Correspondence: Zhichang Zhang
|
38
|
Li T, Huang H, Zhang S, Zhang Y, Jing H, Sun T, Zhang X, Lu L, Zhang M. Predictive models based on machine learning for bone metastasis in patients with diagnosed colorectal cancer. Front Public Health 2022; 10:984750. [PMID: 36203663 PMCID: PMC9531117 DOI: 10.3389/fpubh.2022.984750] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/02/2022] [Accepted: 08/25/2022] [Indexed: 01/25/2023] Open
Abstract
Background This study aimed to develop an artificial intelligence model for predicting the probability of developing bone metastasis (BM) in colorectal cancer (CRC) patients. Methods From the SEER database, 50,566 CRC patients without missing data were identified between January 2015 and December 2019. Support vector machine (SVM) and logistic regression (LR) models were trained and tested on the dataset. Accuracy, area under the curve (AUC), and IDI were used to evaluate and compare the models. Results For bone metastases in the entire cohort, the SVM model with a polynomial ("poly") kernel function performed best, with an accuracy of 0.908, a recall of 0.838, and an AUC of 0.926, outperforming the LR model. The top three most important factors affecting the model's prediction of BM were extraosseous metastases (EM), CEA, and size. Conclusion Our study developed an SVM model with a polynomial kernel function for predicting BM in CRC patients. The SVM model could improve personalized clinical decision-making, help rationalize the bone metastasis screening process, and reduce the burden on healthcare systems and patients.
Affiliation(s)
- Tianhao Li
  - Tianjin Union Medical Center, Tianjin Medical University, Tianjin, China
- Honghong Huang
  - Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Shuocun Zhang
  - Department of General Surgery, Tianjin Hongqiao Hospital, Tianjin, China
- Yongdan Zhang
  - Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, China
  - Tianjin Institute of Coloproctology, Tianjin, China
- Haoren Jing
  - Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, China
  - Tianjin Institute of Coloproctology, Tianjin, China
- Tianwei Sun
  - Department of Spinal Surgery, Tianjin Union Medical Center, Tianjin, China
- Xipeng Zhang
  - Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, China
  - Tianjin Institute of Coloproctology, Tianjin, China
  - The Institute of Translational Medicine, Tianjin Union Medical Center of Nankai University, Tianjin, China
  - Nankai University School of Medicine, Nankai University, Tianjin, China
  - *Correspondence: Xipeng Zhang
- Liangfu Lu
  - Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, China
- Mingqing Zhang
  - Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, China
  - Tianjin Institute of Coloproctology, Tianjin, China
  - The Institute of Translational Medicine, Tianjin Union Medical Center of Nankai University, Tianjin, China
  - Nankai University School of Medicine, Nankai University, Tianjin, China
|
39
|
Tazikeh S, Davoudi A, Shafiei A, Parsaei H, Atabaev TS, Ivakhnenko OP. A Comparison between the Perturbed-Chain Statistical Associating Fluid Theory Equation of State and Machine Learning Modeling Approaches in Asphaltene Onset Pressure and Bubble Point Pressure Prediction during Gas Injection. ACS OMEGA 2022; 7:30113-30124. [PMID: 36061711 PMCID: PMC9434618 DOI: 10.1021/acsomega.2c03192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2022] [Accepted: 08/01/2022] [Indexed: 06/15/2023]
Abstract
Predicting asphaltene onset pressure (AOP) and bubble point pressure (Pb) is essential for optimization of gas injection for enhanced oil recovery. Pressure-Volume-Temperature or PVT studies along with equations of state (EoSs) are widely used to predict AOP and Pb. However, PVT experiments are costly and time-consuming. The perturbed-chain statistical associating fluid theory or PC-SAFT is a sophisticated EoS used for prediction of the AOP and Pb. However, this method is computationally complex and has high data requirements. Hence, developing precise and reliable smart models for prediction of the AOP and Pb is inevitable. In this paper, we used machine learning (ML) methods to develop predictive tools for the estimation of the AOP and Pb using experimental data (AOP data set: 170 samples; Pb data set: 146 samples). Extra trees (ET), support vector machine (SVM), decision tree, and k-nearest neighbors ML methods were used. Reservoir temperature, reservoir pressure, SARA fraction, API gravity, gas-oil ratio, fluid molecular weight, monophasic composition, and composition of gas injection are considered as input data. The ET (R²: 0.793, RMSE: 7.5) and the SVM models (R²: 0.988, RMSE: 0.76) attained more reliable results for estimation of the AOP and Pb, respectively. Generally, the accuracy of the PC-SAFT model is higher than that of the AI/ML models. However, our results confirm that the AI/ML approach is an acceptable alternative for the PC-SAFT model when we face lack of data and/or complex mathematical equations. The developed smart models are accurate and fast and produce reliable results with lower data requirements.
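The two regression metrics quoted in the abstract above, the coefficient of determination and the root-mean-square error, can be computed as in the following sketch. It uses the standard textbook definitions; the function names and the four sample values are hypothetical and are not taken from the AOP or Pb data sets.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observations and predictions (illustrative only).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
print(r2(y_true, y_pred), rmse(y_true, y_pred))
```

RMSE carries the units of the predicted quantity, so the two RMSE values in the abstract are not directly comparable across the AOP and Pb targets.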
Affiliation(s)
- Simin Tazikeh
  - Petroleum Engineering Program, School of Mining and Geosciences, Nazarbayev University, 53 Kabanbay Batyr Avenue, Nur-Sultan 010000, Kazakhstan
- Abdollah Davoudi
  - Department of Petroleum Engineering, School of Chemical and Petroleum Engineering, Shiraz University, Shiraz 71348-14336, Iran
- Ali Shafiei
  - Petroleum Engineering Program, School of Mining and Geosciences, Nazarbayev University, 53 Kabanbay Batyr Avenue, Nur-Sultan 010000, Kazakhstan
- Hossein Parsaei
  - Department of Medical Physics and Engineering, School of Medicine, Shiraz University of Medical Sciences, Shiraz 71348-14336, Iran
- Timur Sh. Atabaev
  - Department of Chemistry, Nazarbayev University, Nur-Sultan 010000, Kazakhstan
- Oleksandr P. Ivakhnenko
  - Department of Petroleum Engineering, Kazakh British Technical University, Almaty 050000, Kazakhstan
|
40
|
Bayesian nonnegative matrix factorization in an incremental manner for data representation. APPL INTELL 2022. [DOI: 10.1007/s10489-022-03522-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2023]
|
41
|
Goettsch KA, Zhang L, Singh AB, Dhawan P, Bastola DK. Reliable epithelial-mesenchymal transition biomarkers for colorectal cancer detection. Biomark Med 2022; 16:889-901. [PMID: 35892269 PMCID: PMC9442548 DOI: 10.2217/bmm-2022-0071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022] Open
Abstract
Aims: To combat increases in colorectal cancer (CRC) incidence and mortality, biomarkers among differentially expressed genes (DEGs) have been identified to objectively detect cancer. However, DEGs are numerous, and additional parameters may identify more reliable biomarkers. Here, CRC DEGs were filtered into a prioritized list of biomarkers. Materials & methods: Two independent datasets (COAD-READ [n = 698] and GSE50760 [n = 36]) were input alternatively to the recently published data-driven reference method. Results were filtered based on epithelial-mesenchymal transition enrichment (χ-square statistic: 919.05; p = 2.2e-16) to produce 37 potential CRC biomarkers. Results: All 37 genes reliably classified CRC samples and ETV4, CLDN1 and CA2 together were top-ranked by DDR (accuracy: 89%; F1 score: 0.89). Conclusion: Biological and statistical information were combined to produce a better set of CRC detection biomarkers.
Affiliation(s)
- Kaitlin A Goettsch
  - School of Interdisciplinary Informatics, College of Information Science & Technology, University of Nebraska at Omaha, 1110 S. 67th Street, Omaha, NE 68182, USA
- Ling Zhang
  - School of Interdisciplinary Informatics, College of Information Science & Technology, University of Nebraska at Omaha, 1110 S. 67th Street, Omaha, NE 68182, USA
- Amar B Singh
  - Department of Biochemistry & Molecular Biology, University of Nebraska Medical Center, 42nd & Emile Streets, Omaha, NE 68198, USA
  - Veterans Affairs Nebraska - Western Iowa Health Care System, Research Service, Omaha, NE 68105, USA
- Punita Dhawan
  - Department of Biochemistry & Molecular Biology, University of Nebraska Medical Center, 42nd & Emile Streets, Omaha, NE 68198, USA
- Dhundy K Bastola
  - School of Interdisciplinary Informatics, College of Information Science & Technology, University of Nebraska at Omaha, 1110 S. 67th Street, Omaha, NE 68182, USA
|
42
|
Thiis‐Evensen E, Kjellman M, Knigge U, Gronbaek H, Schalin‐Jäntti C, Welin S, Sorbye H, del Pilar Schneider M, Belusa R. Plasma protein biomarkers for the detection of pancreatic neuroendocrine tumors and differentiation from small intestinal neuroendocrine tumors. J Neuroendocrinol 2022; 34:e13176. [PMID: 35829662 PMCID: PMC9787472 DOI: 10.1111/jne.13176] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/25/2021] [Revised: 03/31/2022] [Accepted: 05/31/2022] [Indexed: 12/30/2022]
Abstract
There is an unmet need for novel biomarkers to diagnose and monitor patients with neuroendocrine neoplasms. The EXPLAIN study explores a multi-plasma protein and supervised machine learning strategy to improve the diagnosis of pancreatic neuroendocrine tumors (PanNET) and differentiate them from small intestinal neuroendocrine tumors (SI-NET). At time of diagnosis, blood samples were collected and analyzed from 39 patients with PanNET, 135 with SI-NET (World Health Organization Grade 1-2) and 144 controls. Exclusion criteria were other malignant diseases, chronic inflammatory diseases, reduced kidney or liver function. Prosed Oncology-II (i.e., OLink) was used to measure 92 cancer related plasma proteins. Chromogranin A was analyzed separately. Median age in all groups was 65-67 years and with a similar sex distribution (females: PanNET, 51%; SI-NET, 42%; controls, 42%). Tumor grade (G1/G2): PanNET, 39/61%; SI-NET, 46/54%. Patients with liver metastases: PanNET, 78%; SI-NET, 63%. The classification model of PanNET versus controls provided a sensitivity (SEN) of 0.84, specificity (SPE) 0.98, positive predictive value (PPV) of 0.92 and negative predictive value (NPV) of 0.95, and area under the receiver operating characteristic curve (AUROC) of 0.99; the model for the discrimination of PanNET versus SI-NET providing a SEN 0.61, SPE 0.96, PPV 0.83, NPV 0.90 and AUROC 0.98. These results suggest that a multi-plasma protein strategy can significantly improve diagnostic accuracy of PanNET and SI-NET.
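The four classification metrics reported in the abstract above (SEN, SPE, PPV, NPV) all derive from the cells of a 2x2 confusion matrix. The sketch below shows the standard definitions; the function name `diagnostic_metrics` and the counts passed to it are hypothetical and are not the EXPLAIN study's confusion matrix.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic metrics from confusion-matrix counts.

    tp/fn: diseased cases classified correctly/incorrectly;
    tn/fp: control cases classified correctly/incorrectly.
    """
    return {
        "SEN": tp / (tp + fn),  # sensitivity (recall)
        "SPE": tn / (tn + fp),  # specificity
        "PPV": tp / (tp + fp),  # positive predictive value
        "NPV": tn / (tn + fn),  # negative predictive value
    }


# Hypothetical counts chosen only to give round numbers.
metrics = diagnostic_metrics(tp=40, fn=10, tn=90, fp=10)
print(metrics)
```

PPV and NPV depend on how many diseased and control cases are in the sample, so values from a case-control design such as this one do not transfer directly to a population with a different prevalence.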
Affiliation(s)
- Espen Thiis‐Evensen
  - Center for Neuroendocrine tumors, ENETS Neuroendocrine Tumor Centre of Excellence, Department of Transplantation Medicine, Oslo University Hospital Rikshospitalet, Oslo, Norway
- Magnus Kjellman
  - Department of Breast, Endocrine Tumours and Sarcoma, Karolinska University Hospital Solna, Stockholm, Sweden
- Ulrich Knigge
  - Departments of Surgery and Endocrinology, ENETS Neuroendocrine Tumor Centre of Excellence, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Henning Gronbaek
  - Department of Hepatology and Gastroenterology, ENETS Neuroendocrine Tumor Centre of Excellence, Aarhus University Hospital and Clinical Institute, Aarhus, Denmark
- Camilla Schalin‐Jäntti
  - Endocrinology, Abdominal Centre, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Staffan Welin
  - Department of Endocrine Oncology, ENETS Neuroendocrine Tumor Centre of Excellence, Uppsala University Hospital, Uppsala, Sweden
- Halfdan Sorbye
  - Department of Oncology, Haukeland University Hospital, Bergen, Norway
  - Department of Clinical Science, University of Bergen, Bergen, Norway
|
43
|
Pei Q, Luo Y, Chen Y, Li J, Xie D, Ye T. Artificial intelligence in clinical applications for lung cancer: diagnosis, treatment and prognosis. Clin Chem Lab Med 2022; 60:1974-1983. [PMID: 35771735 DOI: 10.1515/cclm-2022-0291] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Received: 03/28/2022] [Accepted: 06/17/2022] [Indexed: 12/12/2022]
Abstract
Artificial Intelligence (AI) is a branch of computer science that includes research in robotics, language recognition, image recognition, natural language processing, and expert systems. AI is poised to change medical practice, and oncology is no exception to this trend. As a matter of fact, lung cancer has the highest morbidity and mortality worldwide. The leading cause is the complexity of associating early pulmonary nodules with neoplastic changes and numerous factors leading to strenuous treatment choice and poor prognosis. AI can effectively enhance the diagnostic efficiency of lung cancer while providing optimal treatment and evaluating prognosis, thereby reducing mortality. This review seeks to provide an overview of AI relevant to all the fields of lung cancer. We define the core concepts of AI and cover the basics of the functioning of natural language processing, image recognition, human-computer interaction and machine learning. We also discuss the most recent breakthroughs in AI technologies and their clinical application regarding diagnosis, treatment, and prognosis in lung cancer. Finally, we highlight the future challenges of AI in lung cancer and its impact on medical practice.
Affiliation(s)
- Qin Pei
  - Department of Laboratory Medicine, The Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan, P.R. China
- Yanan Luo
  - Department of Laboratory Medicine, The Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan, P.R. China
- Yiyu Chen
  - Department of Laboratory Medicine, The Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan, P.R. China
- Jingyuan Li
  - Department of Laboratory Medicine, The Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan, P.R. China
- Dan Xie
  - Department of Laboratory Medicine, The Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan, P.R. China
- Ting Ye
  - Department of Laboratory Medicine, The Affiliated Hospital of Southwest Medical University, Luzhou, Sichuan, P.R. China
|
44
|
Unique Deep Radiomic Signature Shows NMN Treatment Reverses Morphology of Oocytes from Aged Mice. Biomedicines 2022; 10:biomedicines10071544. [PMID: 35884850 PMCID: PMC9313081 DOI: 10.3390/biomedicines10071544] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/21/2022] [Revised: 06/20/2022] [Accepted: 06/27/2022] [Indexed: 01/02/2023] Open
Abstract
The purpose of this study is to develop a deep radiomic signature based on an artificial intelligence (AI) model. This radiomic signature identifies oocyte morphological changes corresponding to reproductive aging in bright field images captured by optical light microscopy. Oocytes were collected from three mice groups: young (4- to 5-week-old) C57BL/6J female mice, aged (12-month-old) mice, and aged mice treated with the NAD+ precursor nicotinamide mononucleotide (NMN), a treatment recently shown to rejuvenate aspects of fertility in aged mice. We applied deep learning, swarm intelligence, and discriminative analysis to images of mouse oocytes taken by bright field microscopy to identify a highly informative deep radiomic signature (DRS) of oocyte morphology. Predictive DRS accuracy was determined by evaluating sensitivity, specificity, and cross-validation, and was visualized using scatter plots of the data associated with three groups: Young, old and Old + NMN. DRS could successfully distinguish morphological changes in oocytes associated with maternal age with 92% accuracy (AUC~1), reflecting this decline in oocyte quality. We then employed the DRS to evaluate the impact of the treatment of reproductively aged mice with NMN. The DRS signature classified 60% of oocytes from NMN-treated aged mice as having a ‘young’ morphology. In conclusion, the DRS signature developed in this study was successfully able to detect aging-related oocyte morphological changes. The significance of our approach is that DRS applied to bright field oocyte images will allow us to distinguish and select oocytes originally affected by reproductive aging and whose quality has been successfully restored by the NMN therapy.
|
45
|
Al-Obeidat F, Rocha Á, Akram M, Razzaq S, Maqbool F. (CDRGI)-Cancer detection through relevant genes identification. Neural Comput Appl 2022. [DOI: 10.1007/s00521-021-05739-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
|
46
|
Dong X, Li M, Zhou P, Deng X, Li S, Zhao X, Wu Y, Qin J, Guo W. Fusing pre-trained convolutional neural networks features for multi-differentiated subtypes of liver cancer on histopathological images. BMC Med Inform Decis Mak 2022; 22:122. [PMID: 35509058 PMCID: PMC9066403 DOI: 10.1186/s12911-022-01798-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2021] [Accepted: 02/21/2022] [Indexed: 11/10/2022] Open
Abstract
Liver cancer is a malignant tumor with high morbidity and mortality, which has a tremendous negative impact on human survival. However, it is a challenging task to recognize tens of thousands of histopathological images of liver cancer by naked eye, which poses numerous challenges to inexperienced clinicians. In addition, factors such as long time-consuming, tedious work and huge number of images impose a great burden on clinical diagnosis. Therefore, our study combines convolutional neural networks with histopathology images and adopts a feature fusion approach to help clinicians efficiently discriminate the differentiation types of primary hepatocellular carcinoma histopathology images, thus improving their diagnostic efficiency and relieving their work pressure. In this study, for the first time, 73 patients with different differentiation types of primary liver cancer tumors were classified. We performed an adequate classification evaluation of liver cancer differentiation types using four pre-trained deep convolutional neural networks and nine different machine learning (ML) classifiers on a dataset of liver cancer histopathology images with multiple differentiation types. And the test set accuracy, validation set accuracy, running time with different strategies, precision, recall and F1 value were used for adequate comparative evaluation. Proved by experimental results, fusion networks (FuNet) structure is a good choice, which covers both channel attention and spatial attention, and suppresses channel interference with less information. Meanwhile, it can clarify the importance of each spatial location by learning the weights of different locations in space, then apply it to the study of classification of multi-differentiated types of liver cancer. 
In addition, in most cases the stacking-based ensemble learning classifier outperforms the other ML classifiers at classifying multi-differentiation types of liver cancer under the FuNet fusion strategy, after the fused features are reduced in dimensionality by principal component analysis (PCA); it achieves a satisfactory result of 72.46% on the test set, which indicates a degree of practical value.
Affiliation(s)
- Xiaogang Dong
- Department of Hepatopancreatobiliary Surgery, Cancer Affiliated Hospital of Xinjiang Medical University, Ürümqi, Xinjiang, China
| | - Min Li
- Key Laboratory of Signal Detection and Processing, Xinjiang University, Ürümqi, 830046, China; College of Information Science and Engineering, Xinjiang University, Ürümqi, 830046, China
| | - Panyun Zhou
- College of Software, Xinjiang University, Ürümqi, 830046, China
| | - Xin Deng
- College of Software, Xinjiang University, Ürümqi, 830046, China
| | - Siyu Li
- College of Software, Xinjiang University, Ürümqi, 830046, China
| | - Xingyue Zhao
- College of Software, Xinjiang University, Ürümqi, 830046, China
| | - Yi Wu
- College of Software, Xinjiang University, Ürümqi, 830046, China
| | - Jiwei Qin
- College of Information Science and Engineering, Xinjiang University, Ürümqi, 830046, China.
| | - Wenjia Guo
- Cancer Institute, Affiliated Cancer Hospital of Xinjiang Medical University, Ürümqi, 830011, China; Key Laboratory of Oncology of Xinjiang Uyghur Autonomous Region, Ürümqi, 830011, China.
| |
|
47
|
Cho H, Tong F, You S, Jung S, Kim WH, Kim J. Prediction of the Immune Phenotypes of Bladder Cancer Patients for Precision Oncology. IEEE OPEN JOURNAL OF ENGINEERING IN MEDICINE AND BIOLOGY 2022; 3:47-57. [PMID: 35519421 PMCID: PMC9060513 DOI: 10.1109/ojemb.2022.3163533] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Grants] [Received: 09/27/2021] [Revised: 01/27/2022] [Accepted: 03/19/2022] [Indexed: 11/11/2022] Open
Abstract
Bladder cancer (BC) is the most common urinary malignancy; however accurate diagnosis and prediction of recurrence after therapies remain elusive. This study aimed to develop a biosignature of immunotherapy-based responses using gene expression data. Publicly available BC datasets were collected, and machine learning (ML) approaches were applied to identify a novel biosignature to differentiate patient subgroups. Immune phenotyping of BC in the IMvigor210 dataset included three subtypes: inflamed, excluded, and desert immune. Immune phenotypes were analyzed with gene expressions using traditional but powerful classification methods such as random forests, Deep Neural Networks (DNN), Support Vector Machines (SVM) together with boosting and feature selection methods. Specifically, DNN yielded the highest area under the curve (AUC) with precision and recall (PR) curves and receiver operating characteristic (ROC) curves for each phenotype ([Formula: see text] and [Formula: see text], respectively) resulting in the identification of gene expression features useful for immune phenotype classification. Our results suggest significant potential to further develop and utilize machine learning algorithms for analysis of BC and its precaution. In conclusion, the findings from this study present a novel gene expression assay that can accurately discriminate BC patients from controls. Upon further validation in independent cohorts, this gene signature could be developed into a predictive test that can support clinical evaluation and patient care.
Affiliation(s)
- Hyuna Cho
- Graduate School of Artificial Intelligence (GSAI), Pohang University of Science and Technology, Pohang 37673, South Korea
| | - Feng Tong
- Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, TX 76019, USA
| | - Sungyong You
- Department of Surgery and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
| | - Sungyoung Jung
- Department of Electrical Engineering, University of Texas at Arlington, Arlington, TX 76019, USA
| | - Won Hwa Kim
- Graduate School of Artificial Intelligence (GSAI), Pohang University of Science and Technology, Pohang 37673, South Korea
- Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, TX 76019, USA
- Department of Computer Science and Engineering, Pohang University of Science and Technology, Pohang 37673, South Korea
| | - Jayoung Kim
- Department of Surgery and Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
- Department of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
| |
|
48
|
Hasan MK, Alam MA, Dahal L, Roy S, Wahid SR, Elahi MTE, Martí R, Khanal B. Challenges of deep learning methods for COVID-19 detection using public datasets. INFORMATICS IN MEDICINE UNLOCKED 2022; 30:100945. [PMID: 35434261 PMCID: PMC9005223 DOI: 10.1016/j.imu.2022.100945] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/12/2022] [Revised: 04/03/2022] [Accepted: 04/04/2022] [Indexed: 02/07/2023] Open
Abstract
Since the COVID-19 pandemic, several research studies have proposed Deep Learning (DL)-based automated COVID-19 detection, reporting high cross-validation accuracy when classifying COVID-19 patients from normal or other common Pneumonia. Although the reported outcomes are very high in most cases, these results were obtained without an independent test set from a separate data source(s). DL models are likely to overfit training data distribution when independent test sets are not utilized or are prone to learn dataset-specific artifacts rather than the actual disease characteristics and underlying pathology. This study aims to assess the promise of such DL methods and datasets by investigating the key challenges and issues by examining the compositions of the available public image datasets and designing different experimental setups. A convolutional neural network-based network, called CVR-Net (COVID-19 Recognition Network), has been proposed for conducting comprehensive experiments to validate our hypothesis. The presented end-to-end CVR-Net is a multi-scale-multi-encoder ensemble model that aggregates the outputs from two different encoders and their different scales to convey the final prediction probability. Three different classification tasks, such as 2-, 3-, 4-classes, are designed where the train-test datasets are from the single, multiple, and independent sources. The obtained binary classification accuracy is 99.8% for a single train-test data source, where the accuracies fall to 98.4% and 88.7% when multiple and independent train-test data sources are utilized. Similar outcomes are noticed in multi-class categorization tasks for single, multiple, and independent data sources, highlighting the challenges in developing DL models with the existing public datasets without an independent test set from a separate dataset. Such a result concludes a requirement for a better-designed dataset for developing DL tools applicable in actual clinical settings. 
The dataset should have an independent test set; for a single machine or hospital source, have a more balanced set of images for all the prediction classes; and have a balanced dataset from several hospitals and demography. Our source codes and model are publicly available for the research community for further improvements.
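The distinction the abstract draws between single-source and independent-source evaluation can be illustrated with a small sketch. This is not the authors' CVR-Net code; the function name `split_by_source` and the `(source, item)` record layout are assumptions made for illustration only. The idea is that every image from a held-out source goes to the test set, so the model is never evaluated on a source it saw during training.

```python
def split_by_source(records, test_sources):
    """Split (source, item) records so that test sources never appear in training.

    `records` is a list of (source, item) pairs and `test_sources` is an
    iterable of source names to hold out. Both are hypothetical conventions
    chosen for this sketch, not part of the cited study.
    """
    held_out = set(test_sources)
    train = [record for record in records if record[0] not in held_out]
    test = [record for record in records if record[0] in held_out]
    if not train or not test:
        raise ValueError("both the training and the test split must be non-empty")
    return train, test


# Example: hold out every image that came from one hospital.
records = [
    ("hospital_a", "img1"),
    ("hospital_a", "img2"),
    ("hospital_b", "img3"),
]
train, test = split_by_source(records, ["hospital_b"])
```

A random split over pooled records would instead mix sources across training and test data, which is the setting in which the abstract reports the highest accuracies.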
Affiliation(s)
- Md Kamrul Hasan
- Department of Electrical and Electronic Engineering (EEE), Khulna University of Engineering & Technology (KUET), Khulna 9203, Bangladesh
| | - Md Ashraful Alam
- Department of Electrical and Electronic Engineering (EEE), Khulna University of Engineering & Technology (KUET), Khulna 9203, Bangladesh
| | - Lavsen Dahal
- Nepal Applied Mathematics and Informatics Institute for Research (NAAMII), Nepal
| | - Shidhartho Roy
- Department of Electrical and Electronic Engineering (EEE), Khulna University of Engineering & Technology (KUET), Khulna 9203, Bangladesh
| | - Sifat Redwan Wahid
- Department of Electrical and Electronic Engineering (EEE), Khulna University of Engineering & Technology (KUET), Khulna 9203, Bangladesh
| | - Md Toufick E Elahi
- Department of Electrical and Electronic Engineering (EEE), Khulna University of Engineering & Technology (KUET), Khulna 9203, Bangladesh
| | - Robert Martí
- Computer Vision and Robotics Institute, University of Girona, Spain
| | - Bishesh Khanal
- Nepal Applied Mathematics and Informatics Institute for Research (NAAMII), Nepal
| |
|
49
|
Liver Disease Detection – Evaluation of Machine Learning Algorithms Performances with Optimal Thresholds. INTERNATIONAL JOURNAL OF HEALTHCARE INFORMATION SYSTEMS AND INFORMATICS 2022. [DOI: 10.4018/ijhisi.299956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Intelligent predictive systems are showing a greater level of accuracy and effectiveness in early detection of critical diseases like cancer and liver and lung disease. Predictive models assist medical practitioners in identifying diseases based on symptoms and health indicators like hormones, enzymes, age, blood counts, etc. This study proposes a framework that uses classification models to accurately detect chronic liver disease by enhancing the prediction accuracy through cutting-edge analytics techniques. The article proposes an enhanced framework based on the original study by Ramana et al. (2011). It uses evaluation measures like Precision and Balanced Accuracy to choose the most efficient classification algorithm on India and USA patient datasets using various factors like enzymes, age, etc. Using Youden's Index, individual thresholds for each model were identified to increase sensitivity and specificity. A framework is proposed for highly accurate automated disease detection in the medical industry, and it helps in strategizing preventive measures for patients with liver diseases.
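As a rough illustration of threshold selection with Youden's Index, the sketch below picks the score cut-off that maximises J = sensitivity + specificity - 1. It is a minimal pure-Python example, not the study's implementation; the function name `youden_threshold`, the 0/1 label encoding, and the rule that a score at or above the threshold counts as positive are all assumptions.

```python
def youden_threshold(y_true, scores):
    """Return (threshold, J) maximising Youden's J = sensitivity + specificity - 1.

    `y_true` holds 0/1 labels and `scores` holds the model's predicted scores.
    A case is predicted positive when its score is >= the threshold
    (an assumed convention for this sketch).
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    best_threshold, best_j = None, -1.0
    # Every distinct observed score is a candidate cut-off.
    for threshold in sorted(set(scores)):
        true_pos = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
        true_neg = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:  # ties keep the lowest threshold
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Example with perfectly separable scores.
threshold, j = youden_threshold([0, 0, 0, 1, 1, 1], [0.1, 0.2, 0.3, 0.6, 0.7, 0.9])
```

Computing a separate threshold per model, as the abstract describes, would amount to calling such a function once on each model's scores rather than using a fixed 0.5 cut-off.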
|
50
|
Shehab M, Abualigah L, Shambour Q, Abu-Hashem MA, Shambour MKY, Alsalibi AI, Gandomi AH. Machine learning in medical applications: A review of state-of-the-art methods. Comput Biol Med 2022; 145:105458. [PMID: 35364311 DOI: 10.1016/j.compbiomed.2022.105458] [Citation(s) in RCA: 146] [Impact Index Per Article: 48.7] [Received: 11/26/2021] [Revised: 03/23/2022] [Accepted: 03/24/2022] [Indexed: 12/11/2022]
Abstract
Machine learning (ML) methods have been used extensively in recent years to solve complex challenges in application areas such as medicine, finance, the environment, marketing, security, and industry. ML methods are characterized by their ability to examine large amounts of data and discover interesting relationships, provide interpretation, and identify patterns. ML can help enhance the reliability, performance, predictability, and accuracy of diagnostic systems for many diseases. This survey provides a comprehensive review of the use of ML in the medical field, highlighting standard technologies and how they affect medical diagnosis. Five major medical applications are discussed in depth, focusing on adapting ML models to solve problems in cancer, medical chemistry, brain, medical imaging, and wearable sensors. Finally, this survey provides valuable references and guidance for researchers, practitioners, and decision-makers framing future research and development directions.
Affiliation(s)
- Mohammad Shehab
- Information Technology, The World Islamic Sciences and Education University. Amman, Jordan.
| | - Laith Abualigah
- Faculty of Computer Sciences and Informatics, Amman Arab University, Amman, Jordan; School of Computer Sciences, Universiti Sains Malaysia, Pulau, Pinang, 11800, Malaysia.
| | - Qusai Shambour
- Department of Software Engineering, Al-Ahliyya Amman University, Amman, Jordan.
| | - Muhannad A Abu-Hashem
- Department of Geomatics, Faculty of Architecture and Planning, King Abdulaziz University, Jeddah, Saudi Arabia.
| | | | | | - Amir H Gandomi
- Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW, 2007, Australia.
| |
|