1
Schwartz B, Giesemann J, Delgadillo J, Schaffrath J, Hehlmann MI, Moggia D, Baumann C, Lutz W. Comparing three neural networks to predict depression treatment outcomes in psychological therapies. Behav Res Ther 2025; 190:104752. [PMID: 40286684] [DOI: 10.1016/j.brat.2025.104752]
Abstract
OBJECTIVE: Artificial neural networks have been used in various fields to solve classification and prediction tasks. However, it remains unclear whether they are adequate methods for predicting psychological treatment outcomes. This study aimed to evaluate the prognostic accuracy of neural networks using psychological treatment outcome data.
METHOD: Three neural network models (TensorFlow, nnet, and monmlp) and a generalised linear regression model were compared in their ability to predict post-treatment remission of depression symptoms in a large naturalistic sample (n = 69,489) of patients accessing low-intensity cognitive behavioural therapy. Prognostic accuracy was evaluated using the area under the curve (AUC) in an external cross-validation design.
RESULTS: The AUC of the neural networks in an external test sample ranged from 0.64 to 0.65, and the AUC of the linear regression model was 0.63.
CONCLUSION: Neural networks can help predict symptom remission in new samples with moderate accuracy, although they were no more accurate than a simpler inferential statistical model (linear regression).
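The comparison protocol is simple: fit each model on a training sample, then score discrimination by AUC on held-out data. A minimal sketch in Python, assuming scikit-learn and synthetic stand-in data (the clinical predictors are not public):

```python
# Hedged sketch of the NN-vs-regression comparison; make_classification stands
# in for pre-treatment predictors and post-treatment remission labels.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "neural_net": MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"{name}: AUC = {auc:.2f}")  # the paper reports ~0.63-0.65 on external data
```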
Affiliation(s)
- Danilo Moggia
- Department of Psychology, Trier University, Germany.
- Wolfgang Lutz
- Department of Psychology, Trier University, Germany.
2
Kuo DP, Chen YC, Cheng SJ, Hsieh KLC, Li YT, Kuo PC, Chang YC, Chen CY. A vision transformer-convolutional neural network framework for decision-transparent dual-energy X-ray absorptiometry recommendations using chest low-dose CT. Int J Med Inform 2025; 199:105901. [PMID: 40187299] [DOI: 10.1016/j.ijmedinf.2025.105901]
Abstract
OBJECTIVE: This study introduces an ensemble framework that integrates Vision Transformer (ViT) and convolutional neural network (CNN) models to leverage their complementary strengths, generating visualized, decision-transparent recommendations for dual-energy X-ray absorptiometry (DXA) scans from chest low-dose computed tomography (LDCT).
METHODS: The framework was developed using data from 321 individuals and validated with an independent test cohort of 186 individuals. It addresses two classification tasks: (1) distinguishing normal from abnormal bone mineral density (BMD) and (2) differentiating osteoporosis from non-osteoporosis. Three field-of-view (FOV) settings were analyzed to assess their impact on model performance: fitFOV (entire vertebra), halfFOV (vertebral body only), and largeFOV (fitFOV + 20%). Model predictions were weighted and combined to enhance classification accuracy, and visualizations were generated to improve decision transparency. DXA scans were recommended for individuals classified as having abnormal BMD or osteoporosis.
RESULTS: The ensemble framework significantly outperformed the individual models in both classification tasks (McNemar test, p < 0.001). In the development cohort, it achieved 91.6% accuracy for task 1 with largeFOV (area under the receiver operating characteristic curve [AUROC]: 0.97) and 86.0% accuracy for task 2 with fitFOV (AUROC: 0.94). In the test cohort, it demonstrated 86.6% accuracy for task 1 (AUROC: 0.93) and 76.9% accuracy for task 2 (AUROC: 0.99). DXA recommendation accuracy was 91.6% and 87.1% in the development and test cohorts, respectively, with notably high accuracy for osteoporosis detection (98.7% and 100%).
CONCLUSIONS: This combined ViT-CNN framework effectively assesses bone status from LDCT images, particularly when utilizing the fitFOV and largeFOV settings. By visualizing classification confidence and vertebral abnormalities, the proposed framework enhances decision transparency and supports clinicians in making informed DXA recommendations following opportunistic osteoporosis screening.
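The fusion step reduces to a weighted combination of the two models' class probabilities. A hedged sketch, with accuracy-proportional weights assumed for illustration (the paper's exact weighting scheme is not reproduced here):

```python
# Weighted probability fusion between a ViT and a CNN; the weights here are an
# assumption (proportional to each model's validation accuracy).
import numpy as np

def weighted_ensemble(p_vit: np.ndarray, p_cnn: np.ndarray,
                      acc_vit: float, acc_cnn: float) -> np.ndarray:
    """Combine per-class probabilities from two models."""
    w_vit = acc_vit / (acc_vit + acc_cnn)
    return w_vit * p_vit + (1.0 - w_vit) * p_cnn

# Example: probabilities for [normal BMD, abnormal BMD] on three LDCT crops.
p_vit = np.array([[0.7, 0.3], [0.4, 0.6], [0.2, 0.8]])
p_cnn = np.array([[0.6, 0.4], [0.5, 0.5], [0.1, 0.9]])
fused = weighted_ensemble(p_vit, p_cnn, acc_vit=0.88, acc_cnn=0.84)
print(fused.argmax(axis=1))  # recommend DXA where the abnormal class wins
```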
Affiliation(s)
- Duen-Pang Kuo
- Department of Medical Imaging, Taipei Medical University Hospital, Taipei, Taiwan; Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan; Department of Radiology, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- Yung-Chieh Chen
- Department of Medical Imaging, Taipei Medical University Hospital, Taipei, Taiwan; Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan; Department of Radiology, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- Sho-Jen Cheng
- Department of Medical Imaging, Taipei Medical University Hospital, Taipei, Taiwan; Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan
- Kevin Li-Chun Hsieh
- Department of Medical Imaging, Taipei Medical University Hospital, Taipei, Taiwan; Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan; Department of Radiology, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan
- Yi-Tien Li
- Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan; Research Center for Neuroscience, Taipei Medical University, Taipei, Taiwan; Ph.D. Program in Medical Neuroscience, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Po-Chih Kuo
- Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan; Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
- Yung-Chun Chang
- Graduate Institute of Data Science, Taipei Medical University, Taipei, Taiwan; Clinical Big Data Research Center, Taipei Medical University Hospital, Taipei, Taiwan.
- Cheng-Yu Chen
- Department of Medical Imaging, Taipei Medical University Hospital, Taipei, Taiwan; Translational Imaging Research Center, Taipei Medical University Hospital, Taipei, Taiwan; Department of Radiology, School of Medicine, College of Medicine, Taipei Medical University, Taipei, Taiwan; Research Center for Artificial Intelligence in Medicine, Taipei Medical University, Taipei, Taiwan; Department of Radiology, National Defense Medical Center, Taipei, Taiwan
3
Khalil RU, Sajjad M, Dhahbi S, Bourouis S, Hijji M, Muhammad K. Mitosis detection and classification for breast cancer diagnosis: What we know and what is next. Comput Biol Med 2025; 191:110057. [PMID: 40209577] [DOI: 10.1016/j.compbiomed.2025.110057]
Abstract
Breast cancer is the second most deadly malignancy in women, behind lung cancer. Despite significant advances in medical research, histological analysis remains the definitive method for diagnosing breast cancer. During this procedure, pathologists examine a physical sample for the presence of mitotic cells, i.e., dividing cells. However, the high resolution of histopathology images and the difficulty of manually detecting tiny mitotic nuclei make it particularly challenging to differentiate mitotic cells from other types of cells. Numerous studies have addressed the detection and classification of mitosis, owing to growing computational capacity and advances in automated approaches. The combination of machine learning and deep learning techniques has revolutionized the identification of mitotic cells by offering automated, precise, and efficient solutions. In the last ten years, several pioneering methods have been presented, advancing towards practical applications in clinical settings. Unlike most other forms of cancer, breast cancer and gliomas are graded according to the number of mitotic divisions. Numerous papers have been published on techniques for identifying mitosis, owing to easy access to datasets and open competitions. Convolutional neural networks and other deep learning architectures can precisely identify mitotic cells, significantly decreasing pathologists' workload. This article examines the techniques used over the past decade to identify and classify mitotic cells in hematoxylin and eosin-stained breast cancer histopathology images. Furthermore, we examine the benefits of current research techniques and predict forthcoming developments in the investigation of breast cancer mitosis, specifically highlighting machine learning and deep learning.
Affiliation(s)
- Rafi Ullah Khalil
- Digital Image Processing Lab, Department of Computer Science, Islamia College Peshawar, Peshawar, 25000, Pakistan.
- Muhammad Sajjad
- Digital Image Processing Lab, Department of Computer Science, Islamia College Peshawar, Peshawar, 25000, Pakistan.
- Sami Dhahbi
- Applied College of Mahail Aseer, King Khalid University, Muhayil Aseer, 62529, Saudi Arabia.
- Sami Bourouis
- Department of Information Technology, College of Computers and Information Technology, Taif University, Taif, 21944, Saudi Arabia.
- Mohammad Hijji
- Faculty of Computers and Information Technology, University of Tabuk, Tabuk, 71491, Saudi Arabia.
- Khan Muhammad
- Visual Analytics for Knowledge Laboratory (VIS2KNOW Lab), Department of Applied AI, School of Convergence, College of Computing and Informatics, Sungkyunkwan University, Seoul, 03063, South Korea.
4
Synek A, Benca E, Licandro R, Hirtler L, Pahr DH. Predicting strength of femora with metastatic lesions from single 2D radiographic projections using convolutional neural networks. Comput Methods Programs Biomed 2025; 265:108724. [PMID: 40174318] [DOI: 10.1016/j.cmpb.2025.108724]
Abstract
BACKGROUND AND OBJECTIVE: Patients with metastatic bone disease are at risk of pathological femoral fractures and may require prophylactic surgical fixation. Current clinical decision support tools often overestimate fracture risk, leading to overtreatment. While novel scores integrating femoral strength assessment via finite element (FE) models show promise, they require 3D imaging, extensive computation, and are difficult to automate. Predicting femoral strength directly from single 2D radiographic projections using convolutional neural networks (CNNs) could address these limitations, but this approach has not yet been explored for femora with metastatic lesions. This study aimed to test whether CNNs can accurately predict the strength of femora with metastatic lesions from single 2D radiographic projections.
METHODS: CNNs with various architectures were developed and trained using a training dataset generated with FE models. This training dataset was based on 36,000 modified computed tomography (CT) scans, created by randomly inserting artificial lytic lesions into the CT scans of 36 intact anatomical femoral specimens. From each modified CT scan, an anterior-posterior 2D projection was generated, and femoral strength in one-legged stance was determined using nonlinear FE models. Following training, CNN performance was evaluated on an independent experimental test dataset consisting of 31 anatomical femoral specimens (16 intact, 15 with artificial lytic lesions). 2D projections of each specimen were created from the corresponding CT scans, and femoral strength was assessed in mechanical tests. The CNNs' performance was evaluated using linear regression analysis and compared to 2D densitometric predictors (bone mineral density and content) and CT-based 3D FE models.
RESULTS: All CNNs accurately predicted the experimentally measured strength of femora with and without metastatic lesions in the test dataset (R² ≥ 0.80, concordance correlation coefficient [CCC] ≥ 0.81). In femora with metastatic lesions, the performance of the CNNs (best: R² = 0.84, CCC = 0.86) was considerably superior to 2D densitometric predictors (R² ≤ 0.07) and slightly inferior to 3D FE models (R² = 0.90, CCC = 0.94).
CONCLUSIONS: CNNs, trained on a large dataset generated via FE models, predicted the experimentally measured strength of femora with artificial metastatic lesions with accuracy comparable to 3D FE models. By eliminating the need for 3D imaging and reducing computational demands, this novel approach shows potential for application in a clinical setting.
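Both headline metrics are easy to reproduce: R² from standard tooling, and Lin's concordance correlation coefficient (CCC) implemented directly, since scikit-learn offers no built-in CCC. A sketch with hypothetical strength values:

```python
# Regression-style evaluation of predicted vs. measured femoral strength;
# the values below are made-up stand-ins, not the paper's data.
import numpy as np
from sklearn.metrics import r2_score

def concordance_ccc(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Lin's concordance correlation coefficient."""
    mu_t, mu_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = ((y_true - mu_t) * (y_pred - mu_p)).mean()
    return 2 * cov / (var_t + var_p + (mu_t - mu_p) ** 2)

measured = np.array([4.1, 6.3, 2.8, 5.5, 7.0])   # hypothetical strengths in kN
predicted = np.array([4.4, 6.0, 3.1, 5.2, 6.6])
print(f"R^2 = {r2_score(measured, predicted):.2f}, "
      f"CCC = {concordance_ccc(measured, predicted):.2f}")
```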
Affiliation(s)
- Alexander Synek
- Institute of Lightweight Design and Structural Biomechanics, TU Wien, Gumpendorfer Straße 7, 1060 Vienna, Austria.
- Emir Benca
- Department of Orthopedics and Trauma-Surgery, Medical University of Vienna, Währinger Gürtel 18-20, 1090 Vienna, Austria
- Roxane Licandro
- Department of Biomedical Imaging and Image-guided Therapy, Computational Imaging Research Lab (CIR), Medical University of Vienna, Spitalgasse 23, 1090 Vienna, Austria
- Lena Hirtler
- Center for Anatomy and Cell Biology, Medical University of Vienna, Währinger Straße 13, 1090 Vienna, Austria
- Dieter H Pahr
- Institute of Lightweight Design and Structural Biomechanics, TU Wien, Gumpendorfer Straße 7, 1060 Vienna, Austria
5
Fan F, Li F, Wang Y, Liu T, Wang K, Xi X, Wang B. Deep learning based on ultrasound images predicting cervical lymph node metastasis in postoperative patients with differentiated thyroid carcinoma. Br J Radiol 2025; 98:920-928. [PMID: 40073229] [DOI: 10.1093/bjr/tqaf047]
Abstract
OBJECTIVES: To develop a deep learning (DL) model based on ultrasound (US) images of lymph nodes for predicting cervical lymph node metastasis (CLNM) in postoperative patients with differentiated thyroid carcinoma (DTC).
METHODS: We retrospectively collected 352 lymph nodes from 330 patients with cytopathology findings between June 2021 and December 2023 at our institution. The database was randomly divided into training and test cohorts at an 8:2 ratio. Basic DL models for the longitudinal and cross-sectional views of the lymph nodes were each constructed on ResNet50, and the outputs of the two basic models were fused (1:1) to construct a longitudinal + cross-sectional DL model. Univariate and multivariate analyses were used to assess US features and construct a conventional US model. Subsequently, a combined model was constructed by integrating DL and US.
RESULTS: The diagnostic accuracy of the longitudinal + cross-sectional DL model was higher than that of either view alone. The area under the curve (AUC) of the combined model (US + DL) was 0.855 (95% CI, 0.767-0.942), and the accuracy, sensitivity, and specificity were 0.786 (95% CI, 0.671-0.875), 0.972 (95% CI, 0.855-0.999), and 0.588 (95% CI, 0.407-0.754), respectively. Compared with the US and DL models, the integrated discrimination improvement and net reclassification improvement of the combined model were both positive.
CONCLUSIONS: This preliminary study shows that a DL model based on US images of lymph nodes has high diagnostic efficacy for predicting CLNM in postoperative patients with DTC, and that the combined US + DL model is superior to either conventional US or DL alone for predicting CLNM in this population.
ADVANCES IN KNOWLEDGE: We innovatively used DL on lymph node US images to predict the status of cervical lymph nodes in postoperative patients with DTC.
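A minimal PyTorch sketch of the two-view design described above, assuming two ResNet50 branches whose softmax outputs are averaged 1:1 (training, preprocessing, and the US-feature model are omitted):

```python
# Two-branch ResNet50 with 1:1 probability fusion; an illustrative sketch,
# not the authors' released code.
import torch
import torchvision.models as models

class TwoViewFusion(torch.nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.long_branch = models.resnet50(weights=None)   # longitudinal view
        self.cross_branch = models.resnet50(weights=None)  # cross-sectional view
        for branch in (self.long_branch, self.cross_branch):
            branch.fc = torch.nn.Linear(branch.fc.in_features, num_classes)

    def forward(self, x_long, x_cross):
        p_long = torch.softmax(self.long_branch(x_long), dim=1)
        p_cross = torch.softmax(self.cross_branch(x_cross), dim=1)
        return 0.5 * p_long + 0.5 * p_cross  # 1:1 fusion of the two views

model = TwoViewFusion()
x = torch.randn(1, 3, 224, 224)  # stand-in for a preprocessed US image
print(model(x, x).shape)         # torch.Size([1, 2])
```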
Affiliation(s)
- Fengjing Fan
- Department of Medical Ultrasound, Shandong Medicine and Health Key Laboratory of Abdominal Medical Imaging, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan, 250014, China
- Fei Li
- Department of Medical Ultrasound, Shandong Medicine and Health Key Laboratory of Abdominal Medical Imaging, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan, 250014, China
- Yixuan Wang
- Department of Medical Ultrasound, Shandong Medicine and Health Key Laboratory of Abdominal Medical Imaging, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan, 250014, China
- Tong Liu
- Ultrasound Medicine, Jining Medical University, Jining, 272000, China
- Kesong Wang
- School of Computer Science and Technology, Shandong Jianzhu University, Jinan, 250100, China
- Xiaoming Xi
- School of Computer Science and Technology, Shandong Jianzhu University, Jinan, 250100, China
- Bei Wang
- Department of Medical Ultrasound, Shandong Medicine and Health Key Laboratory of Abdominal Medical Imaging, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan, 250014, China
6
Li X, Li L, Jiang Y, Wang H, Qiao X, Feng T, Luo H, Zhao Y. Vision-Language Models in medical image analysis: From simple fusion to general large models. Information Fusion 2025; 118:102995. [DOI: 10.1016/j.inffus.2025.102995]
7
Krishnan C, Hussain S, Stanford D, Sthanam V, Bodduluri S, Raju SV, Rowe SM, Kim H. AirSeg: Learnable Interconnected Attention Framework for Robust Airway Segmentation. J Imaging Inform Med 2025. [PMID: 40404874] [DOI: 10.1007/s10278-025-01545-z]
Abstract
Accurate airway segmentation is vital for diagnosing and managing lung diseases, yet it remains challenging due to data imbalance and the difficulty of detecting small airway branches. This study proposes AirSeg, a learnable interconnected attention framework incorporating advanced attention mechanisms and a learnable embedding module, to enhance airway segmentation accuracy in computed tomography (CT) images. The proposed framework integrates multiple attention mechanisms, including image, positional, semantic, self-channel, and cross-spatial attention, to refine feature representations at various network and data levels. Additionally, a learnable variance-based embedding module dynamically adjusts input features, improving robustness against spatial inconsistencies and noise and yielding more reliable segmentation, especially in clinically challenging regions. AirSeg can be flexibly integrated into any UNet-like network. The framework was evaluated on two datasets (in vivo and in situ) using several UNet-based architectures, comparing performance with and without AirSeg integration. Training employed data augmentation, a hybrid loss function combining Dice similarity coefficient and intersection-over-union losses, and statistical analysis to assess accuracy improvements. Integrating AirSeg into segmentation models led to statistically significant improvements in accuracy: 16.18% (p = 0.0035) for the in vivo dataset and 10.32% (p = 0.0097) for the in situ dataset, with a weighted average improvement of 12.43% (p = 0.0004) over conventional models. AirSeg demonstrated superior performance in capturing both global structures and fine details, effectively segmenting large airways and intricate branches, and ablation studies validated the contributions of the individual attention mechanisms and the embedding module. These gains translate to more precise identification of airway structures, including the small branches critical for early diagnosis and treatment planning in pulmonary care, reducing manual correction effort and improving the efficiency of automated airway analysis in clinical settings.
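A hedged sketch of a hybrid Dice + IoU loss of the kind described above; the exact weighting used in AirSeg is not stated here, so equal weights are assumed:

```python
# Hybrid segmentation loss: equal-weight combination of soft Dice and soft IoU.
import torch

def hybrid_dice_iou_loss(logits: torch.Tensor, target: torch.Tensor,
                         eps: float = 1e-6) -> torch.Tensor:
    """logits, target: (N, 1, H, W) or (N, 1, D, H, W); target is binary."""
    prob = torch.sigmoid(logits)
    inter = (prob * target).sum()
    dice = (2 * inter + eps) / (prob.sum() + target.sum() + eps)
    union = prob.sum() + target.sum() - inter
    iou = (inter + eps) / (union + eps)
    return 0.5 * (1 - dice) + 0.5 * (1 - iou)

logits = torch.randn(2, 1, 64, 64)                     # stand-in predictions
target = (torch.rand(2, 1, 64, 64) > 0.5).float()      # stand-in airway mask
print(hybrid_dice_iou_loss(logits, target))
```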
Affiliation(s)
- Chetana Krishnan
- Department of Biomedical Engineering, The University of Alabama at Birmingham, Birmingham, AL, 35294, USA
- Shah Hussain
- Department of Pulmonary, Allergy, and Critical Care, The University of Alabama at Birmingham, Birmingham, AL, 35294, USA
- Denise Stanford
- Department of Pulmonary, Allergy, and Critical Care, The University of Alabama at Birmingham, Birmingham, AL, 35294, USA
- Venkata Sthanam
- Department of Electrical Engineering, The University of Alabama at Birmingham, Birmingham, AL, 35294, USA
- Sandeep Bodduluri
- Department of Pulmonary, Allergy, and Critical Care, The University of Alabama at Birmingham, Birmingham, AL, 35294, USA
- Department of Electrical Engineering, The University of Alabama at Birmingham, Birmingham, AL, 35294, USA
- S Vamsee Raju
- Department of Pulmonary, Allergy, and Critical Care, The University of Alabama at Birmingham, Birmingham, AL, 35294, USA
- Steven M Rowe
- Department of Pulmonary, Allergy, and Critical Care, The University of Alabama at Birmingham, Birmingham, AL, 35294, USA
- Harrison Kim
- Department of Radiology, The University of Alabama at Birmingham, VH G082, 1720 2nd Avenue South, Birmingham, AL 35294, USA.
8
Abudalou S, Choi J, Gage K, Pow-Sang J, Yilmaz Y, Balagurunathan Y. Challenges in Using Deep Neural Networks Across Multiple Readers in Delineating Prostate Gland Anatomy. J Imaging Inform Med 2025. [PMID: 40392414] [DOI: 10.1007/s10278-025-01504-8]
Abstract
Deep learning methods provide enormous promise in automating manually intensive tasks such as medical image segmentation and in assisting the workflows of clinical experts. Deep neural networks (DNNs) require a significant number of training examples and a variety of expert opinions to capture the nuances and the context, a challenging proposition in oncological studies (H. Wang et al., Nature, vol. 620, no. 7972, pp. 47-60, Aug 2023). Inter-reader variability among clinical experts is a real-world problem that severely impacts the generalization and reproducibility of DNNs. This study proposes quantifying the variability in DNN performance using expert opinions and explores strategies to train the network and adapt between expert opinions. We address the inter-reader variability problem in the context of prostate gland segmentation using a well-studied DNN, the 3D U-Net model. Reference data include magnetic resonance imaging (MRI, T2-weighted) with prostate glandular anatomy annotations from two expert readers (R#1, n = 342 and R#2, n = 204). The 3D U-Net was trained and tested with individual expert examples (R#1 and R#2) and had average Dice coefficients of 0.825 (CI, [0.81, 0.84]) and 0.85 (CI, [0.82, 0.88]), respectively. Combined training with a representative cohort proportion (R#1, n = 100 and R#2, n = 150) yielded enhanced model reproducibility across readers, achieving average test Dice coefficients of 0.863 (CI, [0.85, 0.87]) for R#1 and 0.869 (CI, [0.87, 0.88]) for R#2. We re-evaluated model performance across gland volumes (large, small) and found improved performance for large glands, with average Dice coefficients of 0.846 (CI, [0.82, 0.87]) and 0.872 (CI, [0.86, 0.89]) for R#1 and R#2, respectively, estimated using fivefold cross-validation. Performance for small glands diminished, with average Dice coefficients of 0.80 (CI, [0.79, 0.82]) and 0.80 (CI, [0.79, 0.83]) for R#1 and R#2, respectively.
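The Dice similarity coefficient at the center of this evaluation is simple to compute directly, both for scoring the model against a reader and for quantifying inter-reader agreement. A sketch with hypothetical masks:

```python
# Dice similarity coefficient for binary 3D masks; the masks below are
# synthetic stand-ins for two readers' prostate annotations.
import numpy as np

def dice_coefficient(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    a, b = mask_a.astype(bool), mask_b.astype(bool)
    denom = a.sum() + b.sum()
    return 1.0 if denom == 0 else 2.0 * np.logical_and(a, b).sum() / denom

reader1 = np.zeros((32, 128, 128), dtype=bool)
reader1[10:20, 40:90, 40:90] = True            # hypothetical gland region
reader2 = np.zeros_like(reader1)
reader2[11:21, 42:92, 42:92] = True            # slightly shifted delineation
print(f"inter-reader Dice = {dice_coefficient(reader1, reader2):.3f}")
```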
Affiliation(s)
- Shatha Abudalou
- Department of Machine Learning, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Department of Electrical Engineering, University of South Florida, Tampa, FL, USA
- Jung Choi
- Department of Diagnostic Radiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Kenneth Gage
- Department of Diagnostic Radiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Julio Pow-Sang
- Department of Genitourinary Oncology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
- Yasin Yilmaz
- Department of Electrical Engineering, University of South Florida, Tampa, FL, USA
- Yoganand Balagurunathan
- Department of Machine Learning, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA.
- Department of Electrical Engineering, University of South Florida, Tampa, FL, USA.
- Department of Diagnostic Radiology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA.
- Department of Genitourinary Oncology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA.
9
Wang P, Zhang J, Li Y, Guo Y, Li P, Chen R. Histopathology image classification based on semantic correlation clustering domain adaptation. Artif Intell Med 2025; 163:103110. [PMID: 40107119] [DOI: 10.1016/j.artmed.2025.103110]
Abstract
Deep learning has been successfully applied to histopathology image classification tasks. However, the performance of deep models is data-driven, and the acquisition and annotation of pathological image samples are difficult, which limits model performance. Compared with whole slide images (WSIs) of patients, histopathology image datasets from animal models are easier to acquire and annotate. Therefore, this paper proposes an unsupervised domain adaptation method based on semantic correlation clustering for histopathology image classification. The aim is to utilize the Minmice model histopathology image dataset to achieve classification and recognition of human WSIs. First, the multi-scale fused features extracted from the source and target domains are normalized and mapped. In the new feature space, the cosine distance between class centers is used to measure the semantic correlation between categories. Then, the domain centers, class centers, and sample distributions are aligned in a self-constrained manner. Multi-granular information is applied to achieve cross-domain semantic correlation knowledge transfer between classes. Finally, a probabilistic heatmap is used to visualize the model's predictions and annotate the cancerous regions in WSIs. Experimental results show that the proposed method has high classification accuracy for WSIs, with annotations close to manual annotation, indicating its potential for clinical applications.
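The semantic-correlation measure can be sketched directly: compute class centers in the mapped feature space and take the cosine distance between them. Stand-in features are assumed; the paper's multi-scale feature extraction is not reproduced:

```python
# Cosine distance between class centers as a semantic-correlation proxy.
import numpy as np

def class_centers(features: np.ndarray, labels: np.ndarray) -> dict:
    return {int(c): features[labels == c].mean(axis=0) for c in np.unique(labels)}

def cosine_distance(u: np.ndarray, v: np.ndarray) -> float:
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 64))      # stand-in multi-scale fused features
labels = rng.integers(0, 3, size=100)   # three hypothetical tissue classes
centers = class_centers(feats, labels)
print(cosine_distance(centers[0], centers[1]))  # smaller = more related classes
```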
Affiliation(s)
- Pin Wang
- School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400030, PR China.
- Jinhua Zhang
- School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400030, PR China
- Yongming Li
- School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400030, PR China
- Yurou Guo
- School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400030, PR China
- Pufei Li
- School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400030, PR China
- Rui Chen
- Chongqing University Cancer Hospital, Chongqing 400030, PR China
10
Dong Y, Li J, Huang S, Wu L, Zhao H, Zhao Y. Artificial Intelligence Measurement of Preoperative Radiographs in Adolescent Idiopathic Scoliosis Based on Multiple-View Semantic Segmentation. Global Spine J 2025; 15:1924-1931. [PMID: 39109794] [PMCID: PMC11571967] [DOI: 10.1177/21925682241270036]
Abstract
Study Design: Cross-sectional study.
Objectives: Imaging classification of adolescent idiopathic scoliosis (AIS) is directly related to the surgical strategy, but manual classification is complex and depends on the doctor's experience. This study investigated a deep learning-based automated classification method (DL group) for AIS and validated the consistency of machine classification with manual classification (M group).
Methods: A total of 506 cases (81 males and 425 females) and 1812 AIS full-spine images in the anteroposterior (AP), lateral (LAT), left-bending (LB), and right-bending (RB) positions were retrospectively used for training. The mean age was 13.6 ± 1.8 years, and the mean maximum Cobb angle was 46.8 ± 12.0°. U-Net semantic segmentation neural network technology and deep learning methods were used to automatically segment the spine, establish the alignment relationship between its multiple views, and extract spinal features such as the Cobb angle. The type of each test case was automatically determined according to Lenke's rule. An additional 107 cases of adolescent idiopathic scoliosis imaging were prospectively used for testing, and the consistency of the DL and M groups was compared.
Results: Automatic vertebral body segmentation and recognition, multi-view alignment of the spine, and automatic Cobb angle measurement were implemented. Compared with the M group, the consistency of the DL group was significantly higher in three aspects: type of lateral convexity (0.989 vs 0.566), lumbar curvature modifier (0.932 vs 0.738), and sagittal plane modifier (0.987 vs 0.522).
Conclusions: Deep learning enables automated Cobb angle measurement and automated Lenke classification of idiopathic scoliosis whole-spine radiographs with higher consistency than manual measurement and classification.
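Once the segmentation yields endplate landmarks, the Cobb angle is the angle between the two most-tilted endplate lines. A sketch under that assumption, with hypothetical pixel coordinates (the paper's landmark extraction from U-Net masks is not reproduced):

```python
# Cobb angle from two endplate lines, each given as two (x, y) landmark points.
import numpy as np

def cobb_angle(upper_endplate: np.ndarray, lower_endplate: np.ndarray) -> float:
    """Returns the angle in degrees between the two endplate lines."""
    def slope_angle(p: np.ndarray) -> float:
        dx, dy = p[1] - p[0]
        return np.degrees(np.arctan2(dy, dx))
    return abs(slope_angle(upper_endplate) - slope_angle(lower_endplate))

upper = np.array([[100.0, 200.0], [180.0, 185.0]])  # hypothetical coordinates
lower = np.array([[105.0, 420.0], [185.0, 455.0]])
print(f"Cobb angle = {cobb_angle(upper, lower):.1f} degrees")
```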
Affiliation(s)
- Yulei Dong
- Department of Orthopaedic Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Jiahao Li
- Department of Orthopaedic Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Shanqi Huang
- Beijing Sanyuanju Technology Co., Ltd., Beijing, China
- Ling Wu
- Beijing Sanyuanju Technology Co., Ltd., Beijing, China
- Hong Zhao
- Department of Orthopaedic Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yu Zhao
- Department of Orthopaedic Surgery, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- State Key Laboratory of Common Mechanism Research for Major Diseases, Beijing, China
11
Zhang C, Li S, Huang D, Wen B, Wei S, Song Y, Wu X. Development and Validation of an AI-Based Multimodal Model for Pathological Staging of Gastric Cancer Using CT and Endoscopic Images. Acad Radiol 2025; 32:2604-2617. [PMID: 39753481] [DOI: 10.1016/j.acra.2024.12.029]
Abstract
RATIONALE AND OBJECTIVES: Accurate preoperative pathological staging of gastric cancer is crucial for optimal treatment selection and improved patient outcomes. Traditional imaging methods such as CT and endoscopy have limitations in staging accuracy.
METHODS: This retrospective study included 691 gastric cancer patients treated from March 2017 to March 2024. Enhanced venous-phase CT and endoscopic images, along with postoperative pathological results, were collected. We developed three modeling approaches: (1) nine deep learning models applied to CT images (DeepCT), (2) 11 machine learning algorithms using handcrafted radiomic features from CT images (HandcraftedCT), and (3) ResNet-50-extracted deep features from endoscopic images followed by 11 machine learning algorithms (DeepEndo). The two top-performing models from each approach were combined into the Integrated Multi-Modal Model using a stacking ensemble method. Performance was assessed using ROC-AUC, sensitivity, and specificity.
RESULTS: The Integrated Multi-Modal Model achieved an ROC-AUC of 0.933 (95% CI, 0.887-0.979) on the test set, outperforming the individual models. Sensitivity and specificity were 0.869 and 0.840, respectively. Various evaluation metrics demonstrated that the final fusion model effectively integrated the strengths of each sub-model, resulting in balanced and robust performance with reduced false-positive and false-negative rates.
CONCLUSION: The Integrated Multi-Modal Model effectively integrates radiomic and deep learning features from CT and endoscopic images, demonstrating superior performance in preoperative pathological staging of gastric cancer. This multimodal approach enhances predictive accuracy and provides a reliable tool for clinicians to develop individualized treatment plans, thereby improving patient outcomes.
DATA AVAILABILITY: The data presented in this study are available on request from the corresponding author. The data are not publicly available due to ethical reasons. All code used in this study is based on third-party libraries, and all custom code developed for this study is available upon reasonable request from the corresponding author.
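The stacking step can be sketched with standard tooling: out-of-fold predictions from sub-models become features for a meta-learner. The sub-models below are simple stand-ins, not the paper's CT/endoscopy pipelines:

```python
# Stacking ensemble sketch with scikit-learn; synthetic data stands in for
# the multimodal feature sets.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=691, n_features=30, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("svm", SVC(probability=True, random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000),
    stack_method="predict_proba",  # sub-model probabilities feed the meta-learner
)
stack.fit(X_tr, y_tr)
print(f"AUC = {roc_auc_score(y_te, stack.predict_proba(X_te)[:, 1]):.3f}")
```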
Affiliation(s)
- Chao Zhang
- Guangxi Medical University, Nanning, Guangxi 530021, China (C.Z., D.H., B.W., S.W., Y.S., X.W.); Guangxi Key Laboratory of Enhanced Recovery After Surgery for Gastrointestinal Cancer, Nanning, Guangxi 530021, China (C.Z., D.H., B.W., S.W., Y.S., X.W.)
- Siyuan Li
- Department of Obstetrics, Qingdao Municipal Hospital, Qingdao, Shandong 266071, China (S.L.)
- Daolai Huang
- Guangxi Medical University, Nanning, Guangxi 530021, China (C.Z., D.H., B.W., S.W., Y.S., X.W.); Guangxi Key Laboratory of Enhanced Recovery After Surgery for Gastrointestinal Cancer, Nanning, Guangxi 530021, China (C.Z., D.H., B.W., S.W., Y.S., X.W.); Department of Gastrointestinal Gland Surgery, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi 530021, China (D.H., X.W.)
- Bo Wen
- Guangxi Medical University, Nanning, Guangxi 530021, China (C.Z., D.H., B.W., S.W., Y.S., X.W.); Guangxi Key Laboratory of Enhanced Recovery After Surgery for Gastrointestinal Cancer, Nanning, Guangxi 530021, China (C.Z., D.H., B.W., S.W., Y.S., X.W.)
- Shizhuang Wei
- Guangxi Medical University, Nanning, Guangxi 530021, China (C.Z., D.H., B.W., S.W., Y.S., X.W.); Guangxi Key Laboratory of Enhanced Recovery After Surgery for Gastrointestinal Cancer, Nanning, Guangxi 530021, China (C.Z., D.H., B.W., S.W., Y.S., X.W.)
- Yaodong Song
- Guangxi Medical University, Nanning, Guangxi 530021, China (C.Z., D.H., B.W., S.W., Y.S., X.W.); Guangxi Key Laboratory of Enhanced Recovery After Surgery for Gastrointestinal Cancer, Nanning, Guangxi 530021, China (C.Z., D.H., B.W., S.W., Y.S., X.W.)
- Xianghua Wu
- Guangxi Medical University, Nanning, Guangxi 530021, China (C.Z., D.H., B.W., S.W., Y.S., X.W.); Guangxi Key Laboratory of Enhanced Recovery After Surgery for Gastrointestinal Cancer, Nanning, Guangxi 530021, China (C.Z., D.H., B.W., S.W., Y.S., X.W.); Department of Gastrointestinal Gland Surgery, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi 530021, China (D.H., X.W.).
12
Cai X, Lu Z, Peng Z, Xu Y, Huang J, Luo H, Zhao Y, Lou Z, Shen Z, Chen Z, Yang X, Wu Y, Lu S. A Neural Network Model for Intelligent Classification of Distal Radius Fractures Using Statistical Shape Model Extraction Features. Orthop Surg 2025; 17:1513-1524. [PMID: 40180705] [PMCID: PMC12050184] [DOI: 10.1111/os.70034]
Abstract
OBJECTIVE: Distal radius fractures account for 12%-17% of all fractures, and accurate classification is crucial for proper treatment planning. Studies have shown that in emergency settings, the misdiagnosis rate of hand/wrist fractures can reach up to 29%, particularly among non-specialist physicians, due to high workloads and limited experience. While existing AI methods can detect fractures, they typically require large training datasets and are limited to fracture detection without type classification. There is therefore an urgent need for an efficient and accurate method that can both detect and classify different types of distal radius fractures. This study aimed to develop and validate an intelligent classifier for distal radius fractures by combining a statistical shape model (SSM) with a neural network (NN) based on CT imaging data.
METHODS: From August 2022 to May 2023, a total of 80 CT scans were collected, including 43 normal radial bones and 37 distal radius fractures (17 Colles', 12 Barton's, and 8 Smith's fractures). We established the distal radius SSM by combining mean values with principal component analysis (PCA) features and proposed six morphological indicators across four groups. The intelligent classifier (SSM + NN) was trained using SSM features as input and fracture types as output, and four-fold cross-validation was performed to verify its robustness.
RESULTS: The SSMs for both the normal and fractured distal radius were successfully established based on CT data. Analysis of variance revealed significant differences in all six morphological indicators among groups (p < 0.001). The intelligent classifier achieved optimal performance when using the first 15 PCA-extracted features, with a cumulative variance contribution rate exceeding 75%. The classifier demonstrated excellent discrimination capability, with a mean area under the curve (AUC) of 0.95 in four-fold cross-validation, and achieved an overall classification accuracy of 97.5% in the test set. The optimal prediction threshold range was determined to be 0.2-0.4.
CONCLUSION: The CT-based SSM + NN intelligent classifier demonstrated excellent performance in identifying and classifying different types of distal radius fractures. This novel approach provides an efficient, accurate, and automated tool for clinical fracture diagnosis, which could improve diagnostic efficiency and treatment planning in orthopedic practice.
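A hedged sketch of the pipeline's core idea: PCA on aligned shape vectors, the leading 15 components retained (over 75% cumulative variance in the paper), then a small neural network classifier. The shape vectors and labels below are synthetic stand-ins:

```python
# SSM-style PCA features feeding a small classifier; illustrative only.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
shapes = rng.normal(size=(80, 3000))   # stand-in: 80 radii x flattened landmarks
labels = rng.integers(0, 4, size=80)   # normal / Colles / Barton / Smith (hypothetical)

pca = PCA(n_components=15).fit(shapes)
features = pca.transform(shapes)
print(f"cumulative variance: {pca.explained_variance_ratio_.sum():.2f}")

X_tr, X_te, y_tr, y_te = train_test_split(features, labels, test_size=0.25, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0).fit(X_tr, y_tr)
print(f"test accuracy: {clf.score(X_te, y_te):.2f}")
```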
Affiliation(s)
- Xing-bo Cai
- Department of Orthopedic Surgery, The First People's Hospital of Yunnan Province, The Affiliated Hospital of Kunming University of Science and Technology, Kunming, Yunnan, China
- The Key Laboratory of Digital Orthopaedics of Yunnan Province, Kunming, Yunnan, China
- Department of Orthopedics, 920th Hospital of Joint Logistics Support Force, PLA, Kunming, China
- Ze-hui Lu
- The Faculty of Medicine, Nursing, and Health Sciences, Monash University, Australia
- Zhi Peng
- Department of Orthopedic Surgery, The First People's Hospital of Yunnan Province, The Affiliated Hospital of Kunming University of Science and Technology, Kunming, Yunnan, China
- The Key Laboratory of Digital Orthopaedics of Yunnan Province, Kunming, Yunnan, China
- Yong-qing Xu
- Department of Orthopedics, 920th Hospital of Joint Logistics Support Force, PLA, Kunming, China
- Jun-shen Huang
- Key Lab of Statistical Modeling and Data Analysis of Yunnan, Yunnan University, Kunming, China
- Hao-tian Luo
- Department of Orthopedic Surgery, The First People's Hospital of Yunnan Province, The Affiliated Hospital of Kunming University of Science and Technology, Kunming, Yunnan, China
- The Key Laboratory of Digital Orthopaedics of Yunnan Province, Kunming, Yunnan, China
- Yu Zhao
- Department of Orthopaedics, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China
- Zhong-qi Lou
- Department of Orthopedic Surgery, The First People's Hospital of Yunnan Province, The Affiliated Hospital of Kunming University of Science and Technology, Kunming, Yunnan, China
- The Key Laboratory of Digital Orthopaedics of Yunnan Province, Kunming, Yunnan, China
- Zi-qi Shen
- Department of Orthopedic Surgery, The First People's Hospital of Yunnan Province, The Affiliated Hospital of Kunming University of Science and Technology, Kunming, Yunnan, China
- The Key Laboratory of Digital Orthopaedics of Yunnan Province, Kunming, Yunnan, China
- Zhang-cong Chen
- Department of Orthopedic Surgery, The First People's Hospital of Yunnan Province, The Affiliated Hospital of Kunming University of Science and Technology, Kunming, Yunnan, China
- The Key Laboratory of Digital Orthopaedics of Yunnan Province, Kunming, Yunnan, China
- Xiong-gang Yang
- Department of Orthopedic Surgery, The First People's Hospital of Yunnan Province, The Affiliated Hospital of Kunming University of Science and Technology, Kunming, Yunnan, China
- The Key Laboratory of Digital Orthopaedics of Yunnan Province, Kunming, Yunnan, China
- Ying Wu
- Key Lab of Statistical Modeling and Data Analysis of Yunnan, Yunnan University, Kunming, China
- Sheng Lu
- Department of Orthopedic Surgery, The First People's Hospital of Yunnan Province, The Affiliated Hospital of Kunming University of Science and Technology, Kunming, Yunnan, China
- The Key Laboratory of Digital Orthopaedics of Yunnan Province, Kunming, Yunnan, China
13
Boumendil A, Bechkit W, Benatchba K. On-Device Deep Learning: Survey on Techniques Improving Energy Efficiency of DNNs. IEEE Trans Neural Netw Learn Syst 2025; 36:7806-7821. [PMID: 39046860] [DOI: 10.1109/tnnls.2024.3430028]
Abstract
Providing high-quality predictions is no longer the sole goal for neural networks. As we live in an increasingly interconnected world, these models need to match the constraints of the resource-limited devices powering the Internet of Things (IoT) and embedded systems. Moreover, in the era of climate change, reducing the carbon footprint of neural networks is a critical step for green artificial intelligence, which is no longer an aspiration but a major need. Enhancing the energy efficiency of neural networks, in both the training and inference phases, has become a predominant research topic in the field. Training optimization has attracted growing interest recently but remains challenging, as it involves changes to the learning procedure that can significantly impact prediction quality. This article presents a study of the most popular techniques for reducing the energy consumption of neural network training. We first propose a classification of the methods before discussing and comparing the different categories. In addition, we outline some energy measurement techniques. We discuss the limitations identified during our study as well as some interesting directions, such as neuromorphic and reservoir computing (RC).
14
Bennasar C, Nadal-Martínez A, Arroyo S, Gonzalez-Cid Y, López-González ÁA, Tárraga PJ. Integrating Machine Learning and Deep Learning for Predicting Non-Surgical Root Canal Treatment Outcomes Using Two-Dimensional Periapical Radiographs. Diagnostics (Basel) 2025; 15:1009. [PMID: 40310439] [PMCID: PMC12025965] [DOI: 10.3390/diagnostics15081009]
Abstract
Background/Objectives: In a previous study, we utilized categorical variables and machine learning (ML) algorithms to predict the success of non-surgical root canal treatments (NSRCTs) in apical periodontitis (AP), classifying the outcome as either success (healed) or failure (not healed). Given the importance of radiographic imaging in diagnosis, the present study evaluates the efficacy of deep learning (DL) in predicting NSRCT outcomes using two-dimensional (2D) periapical radiographs, comparing its performance with ML models.
Methods: The DL model was trained and validated using leave-one-out cross-validation (LOOCV). Its output was incorporated into the set of categorical variables, and the ML study was reproduced using backward stepwise selection (BSS). The chi-square test was applied to assess the association between this new variable and NSRCT outcomes. Finally, after identifying the best-performing method from the ML study reproduction, statistical comparisons were conducted between this method, clinical professionals, and the image-based model using Fisher's exact test.
Results: The association study yielded a p-value of 0.000000127, highlighting the predictive capability of 2D radiographs. After incorporating the DL-based predictive variable, the best-performing ML algorithm was logistic regression (LR), differing from the previous study, where random forest (RF) was the top performer. When comparing the deep learning-logistic regression (DL-LR) model with the clinician's prognosis (DP), DL-LR showed superior performance, with a statistically significant difference (p < 0.05) in sensitivity, NPV, and accuracy. The same trend was observed in the DL vs. DP comparison. However, no statistically significant differences were found in the comparisons of RF vs. DL-LR, RF vs. DL, or DL vs. DL-LR.
Conclusions: The findings of this study suggest that image-based artificial intelligence models exhibit superior predictive capability compared with those relying exclusively on categorical data. Moreover, they outperform clinicians' prognoses.
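Leave-one-out cross-validation, as used for the DL model above, trains on all samples but one and tests on the held-out case, cycling through the whole dataset. A sketch with a stand-in classifier in place of the image-based model:

```python
# LOOCV loop with scikit-learn; synthetic data stands in for the radiographs.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneOut

X, y = make_classification(n_samples=60, n_features=10, random_state=0)
preds = np.empty_like(y)
for train_idx, test_idx in LeaveOneOut().split(X):
    model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    preds[test_idx] = model.predict(X[test_idx])  # one held-out case per fold
print(f"LOOCV accuracy: {(preds == y).mean():.2f}")
```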
Affiliation(s)
- Catalina Bennasar
- Academia Dental de Mallorca (ADEMA), School of Dentistry, University of Balearic Islands, 07122 Palma de Mallorca, Spain
- Antonio Nadal-Martínez
- Soft Computing, Image Processing and Aggregation (SCOPIA) Research Group, University of the Balearic Islands (UIB), 07122 Palma de Mallorca, Spain
- Sebastiana Arroyo
- Academia Dental de Mallorca (ADEMA), School of Dentistry, University of Balearic Islands, 07122 Palma de Mallorca, Spain
- Yolanda Gonzalez-Cid
- Department of Mathematical Sciences and Informatics, University of the Balearic Islands, 07120 Palma de Mallorca, Spain
- Ángel Arturo López-González
- ADEMA-Health Group, University Institute of Health Sciences of Balearic Islands (IUNICS), 02008 Palma de Mallorca, Spain
- Pedro Juan Tárraga
- Faculty of Medicine, University of Castilla-La Mancha, 02001 Albacete, Spain
15
Agyekum EA, Wang YG, Issaka E, Ren YZ, Tan G, Shen X, Qian XQ. Predicting the efficacy of microwave ablation of benign thyroid nodules from ultrasound images using deep convolutional neural networks. BMC Med Inform Decis Mak 2025; 25:161. [PMID: 40217199] [PMCID: PMC11987319] [DOI: 10.1186/s12911-025-02989-7]
Abstract
BACKGROUND: Thyroid nodules are frequent in clinical settings, their detection in adults is increasing, and some patients experience symptoms. Ultrasound-guided thermal ablation can shrink nodules and alleviate discomfort. Because the degree and rate of lesion absorption vary greatly between individuals, there is no reliable model for predicting the therapeutic efficacy of thermal ablation.
METHODS: Five convolutional neural network models (VGG19, ResNet50, EfficientNetB1, EfficientNetB0, and InceptionV3), pre-trained on ImageNet, were compared for predicting the efficacy of ultrasound-guided microwave ablation (MWA) for benign thyroid nodules using ultrasound data. The patients were randomly assigned to either a training (70%) or a validation (30%) set. Accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and area under the curve (AUC) were used to assess predictive performance.
RESULTS: In the validation set, the fine-tuned EfficientNetB1 performed best, with an AUC of 0.85 and an accuracy of 0.79.
CONCLUSIONS: The study found that our deep learning model accurately predicts nodules with a volume reduction rate (VRR) < 50% after a single MWA session. Indeed, when thermal therapies compete with surgery, anticipating which nodules will be poor responders provides useful information that may help physicians and patients decide whether thermal ablation or surgery is the preferable option. This was a preliminary deep learning study, and a gap remains before actual clinical application; more in-depth research should therefore be undertaken to develop deep learning models that better support clinical practice. Prospective studies are expected to generate high-quality evidence and improve clinical performance in subsequent research.
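A hedged sketch of the transfer-learning setup described above, assuming TensorFlow/Keras and an ImageNet-pretrained EfficientNetB1 backbone; the ultrasound data pipeline and preprocessing are omitted:

```python
# Fine-tuning an ImageNet-pretrained EfficientNetB1 for a binary efficacy label.
# Note: weights="imagenet" downloads pretrained weights on first use.
import tensorflow as tf

base = tf.keras.applications.EfficientNetB1(
    include_top=False, weights="imagenet", input_shape=(240, 240, 3))
base.trainable = True  # fine-tune the pretrained backbone

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # responder vs. poor responder
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(name="auc")])
model.summary()
```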
Affiliation(s)
- Enock Adjei Agyekum
- Department of Ultrasound, Affiliated People's Hospital of Jiangsu University, Zhenjiang, 212002, China
- School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu Province, China
- Yu-Guo Wang
- Department of Ultrasound, Jiangsu Hospital of Integrated Traditional Chinese and Western Medicine, Nanjing, China
- Eliasu Issaka
- College of Engineering, Birmingham City University, Birmingham, B4 7XG, UK
- Yong-Zhen Ren
- Department of Ultrasound, Affiliated People's Hospital of Jiangsu University, Zhenjiang, 212002, China
- Gongxun Tan
- Department of Ultrasound, Affiliated People's Hospital of Jiangsu University, Zhenjiang, 212002, China
- Xiangjun Shen
- School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu Province, China.
- Xiao-Qin Qian
- Northern Jiangsu People's Hospital Affiliated to Yangzhou University, Yangzhou, China.
- Northern Jiangsu People's Hospital, Yangzhou, Jiangsu Province, China.
- The Yangzhou Clinical Medical College of Xuzhou Medical University, Yangzhou, Jiangsu Province, China.
16
Wu C, Andaloussi MA, Hormuth DA, Lima EABF, Lorenzo G, Stowers CE, Ravula S, Levac B, Dimakis AG, Tamir JI, Brock KK, Chung C, Yankeelov TE. A critical assessment of artificial intelligence in magnetic resonance imaging of cancer. npj Imaging 2025; 3:15. [PMID: 40226507] [PMCID: PMC11981920] [DOI: 10.1038/s44303-025-00076-0]
Abstract
Given the enormous output and pace of development of artificial intelligence (AI) methods in medical imaging, it can be challenging to identify the true success stories and determine the state of the art of the field. This report seeks to provide the magnetic resonance imaging (MRI) community with an initial guide to the major areas in which AI methods are contributing to MRI in oncology. After a general introduction to artificial intelligence, we discuss the successes and current limitations of AI in MRI when used for image acquisition, reconstruction, registration, and segmentation, as well as its utility for assisting in diagnostic and prognostic settings. Within each section, we attempt to present a balanced summary covering common techniques, the state of readiness, current clinical needs, and barriers to practical deployment in the clinical setting. We conclude by presenting areas in which new advances must be realized to address questions regarding generalizability, quality assurance and control, and uncertainty quantification when applying MRI to cancer, in order to maintain patient safety and practical utility.
Affiliation(s)
- Chengyue Wu
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Department of Breast Imaging, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Institute for Data Science in Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX, USA
- David A. Hormuth
- Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX, USA
- Livestrong Cancer Institutes, The University of Texas at Austin, Austin, TX, USA
- Ernesto A. B. F. Lima
- Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX, USA
- Texas Advanced Computing Center, The University of Texas at Austin, Austin, TX, USA
- Guillermo Lorenzo
- Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX, USA
- Health Research Institute of Santiago de Compostela, Santiago de Compostela, Spain
- Casey E. Stowers
- Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX, USA
- Sriram Ravula
- Chandra Family Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, USA
- Brett Levac
- Chandra Family Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, USA
- Alexandros G. Dimakis
- Chandra Family Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, USA
- Jonathan I. Tamir
- Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX, USA
- Chandra Family Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX, USA
- Department of Diagnostic Medicine, The University of Texas at Austin, Austin, TX, USA
- Kristy K. Brock
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Institute for Data Science in Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Department of Radiation Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Caroline Chung
- Institute for Data Science in Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Department of Neuroradiology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Thomas E. Yankeelov
- Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX, USA
- Livestrong Cancer Institutes, The University of Texas at Austin, Austin, TX, USA
- Department of Diagnostic Medicine, The University of Texas at Austin, Austin, TX, USA
- Department of Biomedical Engineering, The University of Texas at Austin, Austin, TX, USA
- Department of Oncology, The University of Texas at Austin, Austin, TX, USA
17
|
Zhang D, Cheng KT. Generalized Task-Driven Medical Image Quality Enhancement With Gradient Promotion. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2025; 47:2785-2798. [PMID: 40030882 DOI: 10.1109/tpami.2025.3525671]
Abstract
Thanks to recent achievements in task-driven image quality enhancement (IQE) models such as ESTR (Liu et al. 2023), an image enhancement model and a visual recognition model can boost each other's performance while producing high-quality processed images that remain perceptually meaningful to human vision. However, existing task-driven IQE models tend to overlook an underlying fact: different levels of vision tasks have varying, and sometimes conflicting, requirements of image features. To address this problem, this paper proposes a generalized gradient promotion (GradProm) training strategy for task-driven IQE of medical images. Specifically, we partition a task-driven IQE system into two sub-models, i.e., a mainstream model for image enhancement and an auxiliary model for visual recognition. During training, GradProm updates only the parameters of the image enhancement model, using the gradients of both sub-models, but only when these gradients are aligned in the same direction, as measured by their cosine similarity. When the gradients of the two sub-models point in different directions, GradProm uses only the gradient of the image enhancement model to update its parameters. Theoretically, we prove that the optimization direction of the image enhancement model is not biased by the auxiliary visual recognition model under GradProm. Empirically, extensive experiments on four public yet challenging medical image datasets demonstrate the superior performance of GradProm over existing state-of-the-art methods.
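The gating rule at the heart of GradProm is compact enough to sketch directly. Below is a minimal PyTorch rendering of the idea, assuming an `enhancer` module and two scalar losses computed upstream; the function name, loss names, and the zero threshold on the cosine similarity are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of the GradProm gating rule (assumptions noted above).
import torch
import torch.nn.functional as F

def gradprom_step(enhancer, enhance_loss, recog_loss, optimizer):
    """Update only the enhancement model, gating the auxiliary gradient
    by the cosine similarity between the two task gradients."""
    params = [p for p in enhancer.parameters() if p.requires_grad]

    # Gradient of the mainstream (enhancement) objective.
    g_main = torch.autograd.grad(enhance_loss, params, retain_graph=True)
    # Gradient of the auxiliary (recognition) objective w.r.t. the same params.
    g_aux = torch.autograd.grad(recog_loss, params,
                                retain_graph=True, allow_unused=True)

    flat_main = torch.cat([g.flatten() for g in g_main])
    flat_aux = torch.cat([(g if g is not None else torch.zeros_like(p)).flatten()
                          for g, p in zip(g_aux, params)])
    cos = F.cosine_similarity(flat_main, flat_aux, dim=0)

    optimizer.zero_grad()
    for p, gm, ga in zip(params, g_main, g_aux):
        ga = ga if ga is not None else torch.zeros_like(p)
        # Aligned gradients: use both; conflicting: keep only the main gradient.
        p.grad = gm + ga if cos > 0 else gm
    optimizer.step()
    return cos.item()
```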
18
Mekki YM, Rhim HC, Daneshvar D, Pouliopoulos AN, Curtin C, Hagert E. Applications of artificial intelligence in ultrasound imaging for carpal-tunnel syndrome diagnosis: a scoping review. INTERNATIONAL ORTHOPAEDICS 2025; 49:965-973. [PMID: 40100390 PMCID: PMC11971218 DOI: 10.1007/s00264-025-06497-1]
Abstract
PURPOSE The purpose of this scoping review is to analyze the application of artificial intelligence (AI) in ultrasound (US) imaging for diagnosing carpal tunnel syndrome (CTS), with the aim of exploring the potential of AI to enhance diagnostic accuracy, efficiency, and patient outcomes by automating tasks, providing objective measurements, and facilitating earlier detection of CTS. METHODS We systematically searched multiple electronic databases, including Embase, PubMed, IEEE Xplore, and Scopus, to identify relevant studies published up to January 1, 2025. Studies were included if they focused on the application of AI in US imaging for CTS diagnosis. Editorials, expert opinions, conference papers, dataset publications, and studies without a clear clinical application of the AI algorithm were excluded. RESULTS A total of 345 articles were identified; following abstract and full-text review by two independent reviewers, 18 manuscripts were included. Of these, thirteen were experimental studies, three were comparative studies, and one was a feasibility study. All eighteen studies shared the common objective of improving CTS diagnosis and/or initial assessment using AI, with aims ranging from median nerve segmentation (n = 12) to automated diagnosis (n = 9) and severity classification (n = 2). The majority of studies utilized deep learning approaches, particularly CNNs (n = 15), while some focused on radiomics features (n = 5) or traditional machine learning techniques. CONCLUSION The integration of AI into US imaging for CTS diagnosis holds significant promise for transforming clinical practice. AI has the potential to improve diagnostic accuracy, streamline the diagnostic process, reduce variability, and ultimately lead to better patient outcomes. Further research is needed to address challenges related to dataset limitations, variability in US imaging, and ethical considerations.
Affiliation(s)
- Hye Chang Rhim
- Department of Physical Medicine and Rehabilitation, Harvard Medical School, Spaulding Rehabilitation Hospital, Boston, MA, USA
- Daniel Daneshvar
- Department of Physical Medicine and Rehabilitation, Harvard Medical School, Spaulding Rehabilitation Hospital, Boston, MA, USA
- Antonios N Pouliopoulos
- Department of Surgical & Interventional Engineering, School of Biomedical Engineering & Imaging Sciences, King's College London, London, UK
- Catherine Curtin
- Department of Plastic Surgery, Stanford Medicine, Stanford, CA, USA
- Elisabet Hagert
- Aspetar Orthopedic and Sports Medicine Hospital, Doha, Qatar.
- Karolinska Institutet, Stockholm, Sweden.
19
Wang DD, Lin S, Lyu GR. Advances in the Application of Artificial Intelligence in the Ultrasound Diagnosis of Vulnerable Carotid Atherosclerotic Plaque. ULTRASOUND IN MEDICINE & BIOLOGY 2025; 51:607-614. [PMID: 39828500 DOI: 10.1016/j.ultrasmedbio.2024.12.010]
Abstract
Vulnerable atherosclerotic plaque is a type of plaque that carries a significant risk of mortality in patients with cardiovascular disease. Ultrasound has long been used for carotid atherosclerosis screening and plaque assessment because of its safety, low cost, and non-invasive nature. However, conventional ultrasound techniques suffer from subjectivity, operator dependence, and low inter-observer agreement, leading to inconsistent and possibly inaccurate diagnoses. In recent years, a promising approach to addressing these limitations has emerged through the integration of artificial intelligence (AI) into ultrasound imaging. By training AI algorithms on large datasets of ultrasound images, the technology can learn to recognize the specific characteristics and patterns associated with vulnerable plaques, allowing a more objective and consistent assessment and improved diagnostic accuracy. This article reviews the application of AI in diagnostic ultrasound, with a particular focus on vulnerable carotid plaques, and discusses the limitations and prospects of AI-assisted ultrasound, providing a deeper understanding of the role of AI in diagnostic ultrasound and encouraging further research in the field.
Affiliation(s)
- Dan-Dan Wang
- Department of Ultrasound, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China
- Shu Lin
- Centre of Neurological and Metabolic Research, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China; Group of Neuroendocrinology, Garvan Institute of Medical Research, Sydney, Australia
- Guo-Rong Lyu
- Department of Ultrasound, The Second Affiliated Hospital of Fujian Medical University, Quanzhou, China; Departments of Medical Imaging, Quanzhou Medical College, Quanzhou, China.
20
Faddi Z, da Mata K, Silva P, Nagaraju V, Ghosh S, Kul G, Fiondella L. Quantitative assessment of machine learning reliability and resilience. RISK ANALYSIS : AN OFFICIAL PUBLICATION OF THE SOCIETY FOR RISK ANALYSIS 2025; 45:790-807. [PMID: 39043579 DOI: 10.1111/risa.14666]
Abstract
Advances in machine learning (ML) have led to applications in safety-critical domains, including security, defense, and healthcare. These ML models are confronted with the dynamically changing and actively hostile conditions characteristic of real-world applications, requiring systems incorporating ML to be reliable and resilient. Many studies propose techniques to improve the robustness of ML algorithms; however, fewer consider quantitative techniques to assess changes in the reliability and resilience of these systems over time. To address this gap, this study demonstrates how to collect data during the training and testing of ML that is suitable for the application of software reliability models (with and without covariates) and resilience models, and how to interpret the resulting analyses. The proposed approach promotes quantitative risk assessment of ML technologies, providing the ability to track and predict degradation and improvement in ML model performance, and assisting ML and system engineers with an objective approach to compare the relative effectiveness of alternative training and testing methods. The approach is illustrated in the context of an image recognition model, which is subjected to two generative adversarial attacks and then iteratively retrained to improve the system's performance. Our results indicate that software reliability models incorporating covariates characterized the misclassification discovery process more accurately than models without covariates. Moreover, a resilience model based on multiple linear regression incorporating interactions between covariates tracked and predicted degradation and recovery of performance best. Thus, software reliability and resilience models offer rigorous quantitative assurance methods for ML-enabled systems and processes.
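The resilience model the authors favor, multiple linear regression with covariate interactions, is easy to illustrate. The sketch below fits such a regression to an invented post-attack recovery curve; the covariates, the data values, and the scikit-learn pipeline are assumptions for demonstration only.

```python
# Illustrative sketch (not the authors' code): fit a multiple linear
# regression with covariate interactions to a model's recovery curve.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical covariates: retraining iteration and fraction of adversarial
# examples added to the training set at each iteration (values invented).
epochs = np.array([0, 1, 2, 3, 4, 5, 6, 7])
adv_fraction = np.array([0.0, 0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.3])
accuracy = np.array([0.91, 0.62, 0.70, 0.78, 0.83, 0.86, 0.88, 0.90])

X = np.column_stack([epochs, adv_fraction])
# interaction_only=True adds the epochs x adv_fraction cross term.
X_int = PolynomialFeatures(degree=2, interaction_only=True,
                           include_bias=False).fit_transform(X)

model = LinearRegression().fit(X_int, accuracy)
print("R^2 on observed recovery curve:", model.score(X_int, accuracy))
```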
Affiliation(s)
- Zakaria Faddi
- Department of Electrical and Computer Engineering, University of Massachusetts Dartmouth, Dartmouth, Massachusetts, USA
- Karen da Mata
- Department of Electrical and Computer Engineering, University of Massachusetts Dartmouth, Dartmouth, Massachusetts, USA
- Priscila Silva
- Department of Electrical and Computer Engineering, University of Massachusetts Dartmouth, Dartmouth, Massachusetts, USA
- Susmita Ghosh
- Department of Computer Science and Engineering, Jadavpur University, Kolkata, India
- Gokhan Kul
- Department of Computer and Information Science, University of Massachusetts Dartmouth, Dartmouth, Massachusetts, USA
- Lance Fiondella
- Department of Electrical and Computer Engineering, University of Massachusetts Dartmouth, Dartmouth, Massachusetts, USA
21
Ocampo-López-Escalera J, Ochoa-Díaz-López H, Sánchez-Chino XM, Irecta-Nájera CA, Tobar-Alas SD, Rosete-Aguilar M. A low-cost platform for automated cervical cytology: addressing health and socioeconomic challenges in low-resource settings. FRONTIERS IN MEDICAL TECHNOLOGY 2025; 7:1531817. [PMID: 40231005 PMCID: PMC11994738 DOI: 10.3389/fmedt.2025.1531817]
Abstract
Introduction Cervical cancer remains a significant health challenge around the globe, with particularly high prevalence in low- and middle-income countries. The disease is preventable and curable if detected in its early stages, making regular screening critically important. Cervical cytology, the most widely used screening method, has proven highly effective in reducing cervical cancer incidence and mortality in high-income countries. However, its effectiveness in low-resource settings has been limited by, among other factors, insufficient diagnostic infrastructure and a shortage of trained healthcare personnel. Methods This paper introduces the development of a low-cost microscopy platform designed to address these limitations by enabling automatic reading of cervical cytology slides. The system features a robotized microscope capable of slide scanning, autofocus, and digital image capture, while supporting the integration of artificial intelligence (AI) algorithms, all at a production cost below 500 USD. A dataset of nearly 2,000 images, captured with the custom-built microscope and covering seven distinct cervical cell types relevant to cytologic analysis, was created. This dataset was then used to fine-tune and test several pre-trained models for classifying images as containing normal or abnormal cell subtypes. Results Most of the tested models performed well at classifying images containing abnormal and normal cervical cells, with sensitivities above 90%. Among these models, MobileNet demonstrated the highest accuracy in detecting abnormal cell types, achieving sensitivities of 98.26% and 97.95%, specificities of 88.91% and 88.72%, and F-scores of 96.42% and 96.23% on the validation and test sets, respectively. Conclusions The results indicate that MobileNet may be a suitable model for real-world deployment on the low-cost platform, offering high precision and efficiency in classifying cervical cytology images. This system is a first step toward a promising solution for improving cervical cancer screening in low-resource settings.
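The fine-tuning recipe described here follows standard transfer-learning practice, sketched below in Keras. The input size, single sigmoid output, learning rate, and dataset names are assumptions; the authors' exact training configuration may differ.

```python
# Minimal transfer-learning sketch for a MobileNet-based normal/abnormal
# classifier (configuration values are illustrative assumptions).
import tensorflow as tf

base = tf.keras.applications.MobileNet(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze ImageNet features for the first stage

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # normal vs. abnormal
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy",
              metrics=[tf.keras.metrics.Recall(name="sensitivity"),
                       "accuracy"])

# train_ds / val_ds would be tf.data.Dataset objects built from the
# microscope image folders (hypothetical names).
# model.fit(train_ds, validation_data=val_ds, epochs=20)
```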
Affiliation(s)
- Héctor Ochoa-Díaz-López
- Departamento de Salud, El Colegio de la Frontera Sur, San Cristóbal de las Casas, Chiapas, México
- Xariss M. Sánchez-Chino
- SECIHTI - Departamento de Salud, El Colegio de la Frontera Sur, Villahermosa, Tabasco, México
- Saúl D. Tobar-Alas
- Hospital General de Zona No. 2, Instituto Mexicano del Seguro Social, Tuxtla Gutiérrez, Chiapas, México
- Martha Rosete-Aguilar
- Instituto de Ciencias Aplicadas y Tecnología, Universidad Nacional Autónoma de México, Circuito Exterior S/N, Cd. Universitaria, México City, México
22
R P, M JPP, J S N. Brain tumor segmentation using multi-scale attention U-Net with EfficientNetB4 encoder for enhanced MRI analysis. Sci Rep 2025; 15:9914. [PMID: 40121246 PMCID: PMC11929897 DOI: 10.1038/s41598-025-94267-9]
Abstract
Accurate brain tumor segmentation is critical for clinical diagnosis and treatment planning. This study proposes an advanced segmentation framework that combines a multi-scale attention U-Net with an EfficientNetB4 encoder to enhance segmentation performance. Unlike conventional U-Net-based architectures, the proposed model leverages EfficientNetB4's compound scaling to optimize feature extraction at multiple resolutions while maintaining low computational overhead. Additionally, the multi-scale attention mechanism (utilizing kernels of several sizes) enhances feature representation by capturing tumor boundaries across different scales, addressing limitations of existing CNN-based segmentation methods. Our approach effectively suppresses irrelevant regions and enhances tumor localization through attention-enhanced skip connections and residual attention blocks. Extensive experiments were conducted on the publicly available Figshare brain tumor dataset, comparing different EfficientNet variants to determine the optimal architecture. EfficientNetB4 demonstrated superior performance, achieving an accuracy of 99.79%, a misclassification rate (MCR) of 0.21%, a Dice coefficient of 0.9339, and an intersection over union (IoU) of 0.8795, outperforming the other variants in accuracy and computational efficiency. The training process was analyzed using key metrics, including the Dice coefficient, Dice loss, precision, recall, specificity, and IoU, showing stable convergence and generalization. The proposed method was also evaluated against state-of-the-art approaches, surpassing them in all critical metrics, including accuracy, IoU, Dice coefficient, precision, recall, specificity, and mean IoU. This study demonstrates the effectiveness of the proposed method for robust and efficient segmentation of brain tumors, positioning it as a valuable tool for clinical and research applications.
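A multi-scale attention block of the kind the abstract describes can be sketched in a few lines of PyTorch: parallel convolutions at several kernel sizes are fused into a per-pixel gate applied to the skip-connection features. The kernel sizes (3, 5, 7) and the wiring below are assumptions standing in for the paper's exact block.

```python
# Schematic multi-scale attention gate for skip connections (assumed design).
import torch
import torch.nn as nn

class MultiScaleAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # Parallel branches at several receptive-field sizes.
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, k, padding=k // 2) for k in (3, 5, 7)
        ])
        self.fuse = nn.Sequential(
            nn.Conv2d(3 * channels, channels, 1),
            nn.Sigmoid(),  # per-pixel attention weights in [0, 1]
        )

    def forward(self, skip):
        multi = torch.cat([b(skip) for b in self.branches], dim=1)
        attn = self.fuse(multi)
        return skip * attn  # suppress irrelevant regions, keep tumor cues

x = torch.randn(1, 64, 56, 56)
print(MultiScaleAttention(64)(x).shape)  # torch.Size([1, 64, 56, 56])
```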
Affiliation(s)
- Preetha R
- School of Electronics Engineering, Vellore Institute of Technology, Vellore, 632014, Tamilnadu, India
- Nisha J S
- School of Electronics Engineering, Vellore Institute of Technology, Vellore, 632014, Tamilnadu, India
23
Chou T, Goldstein JA. How can artificial intelligence models advance placental biology? Placenta 2025:S0143-4004(25)00081-5. [PMID: 40187949 DOI: 10.1016/j.placenta.2025.03.010]
Abstract
The placenta is a vital organ that supports the developing fetus during pregnancy. Histologic examination of the placenta can reveal abnormalities in morphology and structure that impact placental function. Machine learning (ML) models have been successfully developed for digital pathology, leveraging rich image datasets from human tissue. ML models can be advantageous to placenta researchers, either by supplementing pathologist expertise or by providing knowledge to inform future hypothesis generation. Research projects fall into several categories. Cell classification methods have been applied to the placental disc and membranes; cell classification is useful as a "bottom-up" approach to characterizing tissue, using smaller image inputs than those at the tissue-region or whole-slide level. Classification of normal tissues, cells, and development can identify pathologies that deviate from the norm. Several studies have identified pathologies within the great obstetric syndromes or placental inflammation; these studies often use mechanisms to aggregate findings from small image patches up to the whole-slide level. Digital pathology slides are rich with data that inform our knowledge of placental function and disease; while many articles focus on model design and performance, the features extracted can add valuable biological and clinical knowledge to the field.
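The patch-to-slide aggregation the authors highlight can be as simple as pooling patch-level probabilities; the toy sketch below uses mean pooling and an assumed 0.5 decision threshold, whereas published studies typically use more elaborate aggregation schemes.

```python
# Toy patch-to-slide aggregation (pooling rule and threshold are assumptions).
import numpy as np

patch_probs = np.random.default_rng(0).uniform(size=500)  # P(lesion) per patch
slide_score = patch_probs.mean()          # mean pooling across the slide
slide_positive = slide_score > 0.5        # assumed decision threshold
print(f"slide-level score: {slide_score:.2f}, positive: {slide_positive}")
```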
Affiliation(s)
- Teresa Chou
- Department of Pathology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Jeffery A Goldstein
- Department of Pathology, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.
24
Yan T, Jin Y, Liu S, Li Q, Zuo G, Ye Z, Li J, Han B. ResGloTBNet: An interpretable deep residual network with global long-range dependency for tuberculosis screening of sputum smear microscopy images. Med Eng Phys 2025; 137:104300. [PMID: 40057359 DOI: 10.1016/j.medengphy.2025.104300]
Abstract
Tuberculosis is a high-mortality infectious disease. Manual sputum smear microscopy is a common and effective method for tuberculosis screening, but it is time-consuming, labor-intensive, and has low sensitivity. In this study, we propose ResGloTBNet, a framework that integrates a convolutional neural network and a graph convolutional network for sputum smear image classification with high discriminative power. In this framework, a global reasoning unit is introduced into the residual structure of ResNet to form the ResGloRe module, which not only fully extracts the local features of the image but also models the global relationships between different regions of the image. Furthermore, we applied activation maximization and class activation mapping to generate explanations for the model's predictions on the test sets. ResGloTBNet achieved remarkable results on a publicly available dataset, reaching 97.2% accuracy and 99.0% sensitivity, and maintained a high level of performance on a private dataset, attaining 98.0% accuracy and 96.6% sensitivity. In addition, interpretability analysis demonstrated that ResGloTBNet can effectively identify the features and regions of the input images that contribute most to the model's predictions, providing valuable insight into the decision-making process of the deep learning model.
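Class activation mapping, one of the two explanation techniques used here, can be sketched compactly in its gradient-weighted (Grad-CAM) variant. The ResNet-18 backbone, target layer, and two-class head below are assumptions; the point is the mechanic of weighting feature maps by their pooled gradients.

```python
# Compact Grad-CAM-style sketch for a ResNet-like classifier (assumed setup).
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

model = resnet18(weights=None, num_classes=2).eval()
feats = {}
model.layer4.register_forward_hook(lambda m, i, o: feats.update(a=o))

x = torch.randn(1, 3, 224, 224)       # stand-in for a smear image patch
score = model(x)[0, 1]                # logit of the "bacilli present" class
grads = torch.autograd.grad(score, feats["a"])[0]

weights = grads.mean(dim=(2, 3), keepdim=True)    # per-channel importance
cam = F.relu((weights * feats["a"]).sum(dim=1))   # (1, h, w) activation map
cam = F.interpolate(cam[None], size=x.shape[-2:], mode="bilinear")[0, 0]
print(cam.shape)  # upsampled heat map over the input image
```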
Affiliation(s)
- Taocui Yan
- Medical Data Science Academy, College of Medical Informatics, Chongqing Medical University, Chongqing 400016, China
- Yaqian Jin
- Department of Clinical Laboratory, University-Town Hospital of Chongqing Medical University, Chongqing 401331, China
- Shangqing Liu
- Medical Data Science Academy, College of Medical Informatics, Chongqing Medical University, Chongqing 400016, China
- Qiuni Li
- Medical Data Science Academy, College of Medical Informatics, Chongqing Medical University, Chongqing 400016, China
- Guowei Zuo
- Key Laboratory of Diagnostic Medicine Designated by the Chinese Ministry of Education, Department of Laboratory Medicine, Chongqing Medical University, Chongqing 400016, China
- Ziqian Ye
- Key Laboratory of Diagnostic Medicine Designated by the Chinese Ministry of Education, Department of Laboratory Medicine, Chongqing Medical University, Chongqing 400016, China
- Jin Li
- Department of Laboratory Medicine, Chongqing Medical University Affiliated Dazu Hospital, The People's Hospital of Dazu, Chongqing, 402360, China.
- Baoru Han
- Medical Data Science Academy, College of Medical Informatics, Chongqing Medical University, Chongqing 400016, China.
25
Ho QH, Nguyen TNQ, Tran TT, Pham VT. LiteMamba-Bound: A lightweight Mamba-based model with boundary-aware and normalized active contour loss for skin lesion segmentation. Methods 2025; 235:10-25. [PMID: 39864606 DOI: 10.1016/j.ymeth.2025.01.008]
Abstract
In the field of medical science, skin segmentation has gained significant importance, particularly in dermatology and skin cancer research. This domain demands high precision in distinguishing critical regions (such as lesions or moles) from healthy skin in medical images. With continued technological advances, deep learning models have become indispensable tools for addressing these challenges. One state-of-the-art module introduced in recent years, the 2D Selective Scan (SS2D), is based on state-space models that have already seen great success in natural language processing; it has been increasingly adopted and is gradually replacing Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). Leveraging the strength of this module, this paper introduces LiteMamba-Bound, a lightweight model with approximately 957K parameters designed for skin image segmentation tasks. Notably, a Channel Attention Dual Mamba (CAD-Mamba) block is proposed within both the encoder and decoder, alongside a Mix Convolution with Simple Attention bottleneck block, to emphasize key features. Additionally, we propose a Reverse Attention Boundary Module to highlight challenging boundary features. The Normalized Active Contour loss function presented in this paper also significantly improves the model's performance compared with other loss functions. To validate performance, we conducted tests on two skin image datasets, ISIC2018 and PH2, with results consistently showing superior performance compared to other models. Our code will be made publicly available at: https://github.com/kwanghwi242/A-new-segmentation-model.
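Active-contour losses of the general family named here combine a boundary-length term with inside/outside region terms. The sketch below shows one common formulation, normalized by image size; the paper's Normalized Active Contour loss may differ in its exact terms and weighting.

```python
# One common active-contour-style segmentation loss (assumed formulation).
import torch

def active_contour_loss(pred, target):
    """pred: soft mask in [0,1], shape (B,1,H,W); target: binary mask."""
    # Boundary length: total variation of the predicted mask.
    dh = torch.abs(pred[:, :, 1:, :] - pred[:, :, :-1, :]).sum()
    dw = torch.abs(pred[:, :, :, 1:] - pred[:, :, :, :-1]).sum()
    length = (dh + dw) / pred.numel()

    # Region terms: penalize disagreement with the target foreground
    # (c1 = 1) inside the prediction and background (c0 = 0) outside it.
    region_in = (pred * (target - 1.0) ** 2).mean()
    region_out = ((1.0 - pred) * (target - 0.0) ** 2).mean()
    return length + region_in + region_out

pred = torch.rand(2, 1, 64, 64, requires_grad=True)
target = (torch.rand(2, 1, 64, 64) > 0.5).float()
print(active_contour_loss(pred, target))
```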
Affiliation(s)
- Quang-Huy Ho
- School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Viet Nam
- Thi-Nhu-Quynh Nguyen
- School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Viet Nam
- Thi-Thao Tran
- School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Viet Nam
- Van-Truong Pham
- School of Electrical and Electronic Engineering, Hanoi University of Science and Technology, Hanoi, Viet Nam.
26
Zhao S, Sun Q, Yang J, Yuan Y, Huang Y, Li Z. Structure preservation constraints for unsupervised domain adaptation intracranial vessel segmentation. Med Biol Eng Comput 2025; 63:609-627. [PMID: 39432222 DOI: 10.1007/s11517-024-03195-9]
Abstract
Unsupervised domain adaptation (UDA) has attracted interest as a means to alleviate the burden of data annotation. Nevertheless, existing UDA segmentation methods exhibit performance degradation on fine intracranial vessel segmentation tasks because of structure mismatch in the image synthesis procedure. To improve image synthesis quality and segmentation performance, a novel UDA segmentation method with structure preservation approaches, named StruP-Net, is proposed. StruP-Net employs adversarial learning for image synthesis and utilizes two domain-specific segmentation networks to enhance the semantic consistency between real and synthesized images. Additionally, two distinct structure preservation approaches are proposed to alleviate structure mismatch in the image synthesis procedure: feature-level structure preservation (F-SP) and image-level structure preservation (I-SP). The F-SP, composed of two domain-specific graph convolutional networks (GCNs), provides feature-level constraints to enhance the structural similarity between real and synthesized images, while the I-SP imposes structure-similarity constraints based on a perceptual loss. Cross-modality experiments from magnetic resonance angiography (MRA) images to computed tomography angiography (CTA) images indicate that StruP-Net achieves better segmentation performance than other state-of-the-art methods. Furthermore, its high inference efficiency demonstrates the clinical application potential of StruP-Net. The code is available at https://github.com/Mayoiuta/StruP-Net .
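The image-level structure preservation (I-SP) constraint rests on a perceptual loss, which is straightforward to sketch: compare deep features of the real and synthesized images under a frozen pretrained network. The VGG-16 backbone, the layer cut, and the L1 distance below are assumptions, not necessarily the paper's configuration.

```python
# Minimal perceptual-loss sketch for image-level structure preservation.
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

# Frozen feature extractor (first conv blocks of an ImageNet VGG-16).
vgg = vgg16(weights="IMAGENET1K_V1").features[:16].eval()
for p in vgg.parameters():
    p.requires_grad_(False)

def perceptual_loss(real, synthesized):
    """Both inputs: (B, 3, H, W), ImageNet-normalized."""
    return F.l1_loss(vgg(synthesized), vgg(real))

real = torch.randn(1, 3, 256, 256)
fake = torch.randn(1, 3, 256, 256)
print(perceptual_loss(real, fake))
```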
Affiliation(s)
- Sizhe Zhao
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China
- School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China
- Qi Sun
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China
- School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China
- Jinzhu Yang
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China.
- School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China.
- Yuliang Yuan
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China
- School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China
- Yan Huang
- Key Laboratory of Intelligent Computing in Medical Image, Ministry of Education, Northeastern University, Shenyang, Liaoning, China
- School of Computer Science and Engineering, Northeastern University, Shenyang, Liaoning, China
- Zhiqing Li
- The First Affiliated Hospital of China Medical University, Shenyang, Liaoning, China
27
Kayadibi İ, Köse U, Güraksın GE, Çetin B. An AI-assisted explainable mTMCNN architecture for detection of mandibular third molar presence from panoramic radiography. Int J Med Inform 2025; 195:105724. [PMID: 39626596 DOI: 10.1016/j.ijmedinf.2024.105724]
Abstract
OBJECTIVE This study aimed to design and systematically evaluate an architecture, proposed as the Explainable Mandibular Third Molar Convolutional Neural Network (E-mTMCNN), for detecting the presence of mandibular third molars (m-M3) in panoramic radiographs (PRs). The proposed architecture seeks to improve the accuracy of early detection and to support clinical decision-making and treatment planning in dentistry. METHODS A new dataset, named the Mandibular Third Molar (m-TM) dataset, was developed through expert labeling of raw PR images from the UESB dataset and subsequently made publicly accessible to support further research. Several advanced image preprocessing techniques, including Gaussian filtering, gamma correction, and data augmentation, were applied to improve image quality. Various deep learning (DL)-based convolutional neural network (CNN) architectures were trained and validated using transfer learning (TL) methodologies. Among these, the E-mTMCNN, leveraging the GoogLeNet architecture, achieved the highest performance metrics. To ensure transparency in the model's decision-making process, Local Interpretable Model-Agnostic Explanations (LIME) were integrated as an eXplainable Artificial Intelligence (XAI) approach. Clinical reliability and applicability were assessed through an expert survey conducted among specialized dentists using a decision support system based on the E-mTMCNN. RESULTS The E-mTMCNN architecture achieved a classification accuracy of 87.02%, with a sensitivity of 75%, a specificity of 94.73%, a precision of 77.68%, an F1 score of 75.51%, and an area under the curve (AUC) of 87.01%. The integration of LIME provided visual explanations of the model's decision-making rationale, reinforcing the robustness of the proposed architecture. Results from the expert survey indicated high clinical acceptance and confidence in the reliability of the system. CONCLUSION The findings demonstrate that the E-mTMCNN architecture effectively detects the presence of m-M3 in PRs, outperforming current state-of-the-art methodologies. The proposed architecture shows considerable potential for integration into computer-aided diagnostic systems, advancing early detection capabilities and enhancing the precision of treatment planning in dental practice.
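The LIME step can be reproduced with the standard lime-image API, sketched below. The `predict_fn` and the input array are placeholders standing in for the trained E-mTMCNN and a preprocessed panoramic radiograph; only the lime calls themselves follow the package's documented usage.

```python
# LIME explanation sketch for an image classifier (inputs are placeholders).
import numpy as np
from lime import lime_image
from skimage.segmentation import mark_boundaries

def predict_fn(images):
    """Placeholder classifier: return class probabilities, shape (N, 2)."""
    return np.tile([0.3, 0.7], (len(images), 1))

image = np.random.rand(224, 224, 3)  # stand-in for a preprocessed PR patch

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image, predict_fn, top_labels=2, hide_color=0, num_samples=1000)

top = explanation.top_labels[0]
img, mask = explanation.get_image_and_mask(
    top, positive_only=True, num_features=5, hide_rest=False)
overlay = mark_boundaries(img, mask)  # superpixels supporting the m-M3 call
```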
Affiliation(s)
- İsmail Kayadibi
- Department of Computer Engineering, Faculty of Engineering and Natural Sciences, Suleyman Demirel University, Isparta, Turkey; Department of Management Information Systems, Faculty of Economic and Administrative Sciences, Afyon Kocatepe University, Afyonkarahisar, Turkey.
- Utku Köse
- Department of Computer Engineering, Faculty of Engineering and Natural Sciences, Suleyman Demirel University, Isparta, Turkey.
- Gür Emre Güraksın
- Department of Computer Engineering, Faculty of Engineering, Afyon Kocatepe University, Afyonkarahisar, Turkey.
- Bilgün Çetin
- Department of Oral and Maxillofacial Radiology, Faculty of Dentistry, Selcuk University, Konya, Turkey.
28
Yucheng L, Lingyun Q, Kainan S, Yongshi J, Wenming Z, Jieni D, Weijun C. Development and validation of a deep reinforcement learning algorithm for auto-delineation of organs at risk in cervical cancer radiotherapy. Sci Rep 2025; 15:6800. [PMID: 40000766 PMCID: PMC11861648 DOI: 10.1038/s41598-025-91362-9]
Abstract
This study was conducted to develop and validate a novel deep reinforcement learning (DRL) algorithm incorporating the Segment Anything Model (SAM) to enhance the accuracy of automatic contouring of organs at risk during radiotherapy for cervical cancer patients. CT images were collected from 150 cervical cancer patients treated at our hospital between 2021 and 2023; 122 of these were used as the training set for the SAM-based DRL model and 28 as the test set. The model's performance was evaluated by comparing its segmentation results with the ground truth obtained through manual contouring by expert clinicians, and the test results were further compared with the output of commercial automatic contouring software based on a deep learning (DL) model. The Dice similarity coefficient (DSC), 95th-percentile Hausdorff distance (HD95), average symmetric surface distance (ASSD), and relative absolute volume difference (RAVD) were used to quantitatively assess contouring accuracy from different perspectives, enabling a comprehensive and objective evaluation. The DRL model outperformed the DL model across all evaluated metrics. DRL achieved higher median DSC values, such as 0.97 versus 0.96 for the left kidney (P < 0.001), and demonstrated better boundary accuracy with lower HD95 values, e.g., 14.30 mm versus 17.24 mm for the rectum (P < 0.001). Moreover, DRL exhibited superior spatial agreement (median ASSD: 1.55 mm vs. 1.80 mm for the rectum, P < 0.001) and volume prediction accuracy (median RAVD: 10.25 vs. 10.64 for the duodenum, P < 0.001). These findings indicate that integrating SAM with reinforcement learning (RL) enhances segmentation accuracy and consistency compared with conventional DL methods. The proposed approach introduces a novel training strategy that improves performance without increasing model complexity, demonstrating its potential applicability in clinical practice.
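The two headline metrics, DSC and HD95, are simple to compute from binary masks. The NumPy/SciPy sketch below is a generic re-implementation for illustration, not the authors' evaluation code; the voxel spacing is an assumed parameter.

```python
# Generic Dice and 95th-percentile Hausdorff distance on binary masks.
import numpy as np
from scipy.ndimage import binary_erosion, distance_transform_edt

def dice(a, b):
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def hd95(a, b, spacing=1.0):
    """95th-percentile symmetric surface distance between binary masks."""
    surf_a = a & ~binary_erosion(a)          # surface voxels of each mask
    surf_b = b & ~binary_erosion(b)
    dist_to_b = distance_transform_edt(~surf_b) * spacing
    dist_to_a = distance_transform_edt(~surf_a) * spacing
    d_ab = dist_to_b[surf_a]                 # A-surface to nearest B-surface
    d_ba = dist_to_a[surf_b]
    return np.percentile(np.concatenate([d_ab, d_ba]), 95)

pred = np.zeros((64, 64), bool); pred[20:40, 20:40] = True
gt = np.zeros((64, 64), bool); gt[22:42, 21:41] = True
print(dice(pred, gt), hd95(pred, gt))
```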
Affiliation(s)
- Li Yucheng
- Cancer Center, Department of Radiation Oncology, Zhejiang Provincial People's Hospital, Affiliated People's Hospital, Hangzhou Medical College, Hangzhou, Zhejiang, China
- Qiu Lingyun
- Cancer Center, Department of Radiation Oncology, Zhejiang Provincial People's Hospital, Affiliated People's Hospital, Hangzhou Medical College, Hangzhou, Zhejiang, China
- Shao Kainan
- Cancer Center, Department of Radiation Oncology, Zhejiang Provincial People's Hospital, Affiliated People's Hospital, Hangzhou Medical College, Hangzhou, Zhejiang, China
- Jia Yongshi
- Cancer Center, Department of Radiation Oncology, Zhejiang Provincial People's Hospital, Affiliated People's Hospital, Hangzhou Medical College, Hangzhou, Zhejiang, China
- Zhan Wenming
- Cancer Center, Department of Radiation Oncology, Zhejiang Provincial People's Hospital, Affiliated People's Hospital, Hangzhou Medical College, Hangzhou, Zhejiang, China
- Ding Jieni
- Cancer Center, Department of Radiation Oncology, Zhejiang Provincial People's Hospital, Affiliated People's Hospital, Hangzhou Medical College, Hangzhou, Zhejiang, China
- Chen Weijun
- Cancer Center, Department of Radiation Oncology, Zhejiang Provincial People's Hospital, Affiliated People's Hospital, Hangzhou Medical College, Hangzhou, Zhejiang, China.
29
Tayfur B, Ritsche P, Sunderlik O, Wheeler M, Ramirez E, Leuteneker J, Faude O, Franchi MV, Johnson AK, Palmieri-Smith R. Automatic Segmentation of Quadriceps Femoris Cross-Sectional Area in Ultrasound Images: Development and Validation of Convolutional Neural Networks in People With Anterior Cruciate Ligament Injury and Surgery. ULTRASOUND IN MEDICINE & BIOLOGY 2025; 51:364-372. [PMID: 39581823 DOI: 10.1016/j.ultrasmedbio.2024.11.004]
Abstract
OBJECTIVE Deep learning approaches such as DeepACSA enable automated segmentation of muscle cross-sectional area (CSA) in ultrasound images. Although they provide fast and accurate results, most are developed using data from healthy populations. The changes in muscle size and quality following anterior cruciate ligament (ACL) injury challenge the validity of these automated approaches in the ACL population. Quadriceps muscle CSA is an important outcome following ACL injury; therefore, our aim was to validate DeepACSA, a convolutional neural network (CNN) approach, for the ACL-injured population. METHODS Quadriceps panoramic CSA ultrasound images (vastus lateralis [VL], n = 430; rectus femoris [RF], n = 349; and vastus medialis [VM], n = 723) from 124 participants with an ACL injury (age 22.8 ± 7.9 y, 61 females) were used to train CNN models. For VL and RF, combined models included extra images from the healthy participants (n = 153, age 38.2, range 13-78) on which DeepACSA was originally developed. All models were tested on unseen external validation images (n = 100) from ACL-injured participants, and model-predicted CSA results were compared with manual segmentation results. RESULTS All models showed good comparability (ICC > 0.81, standard error of measurement < 14.1%, mean differences < 1.56 cm2) to manual segmentation. Removal of erroneous predictions resulted in excellent comparability (ICC > 0.94, standard error of measurement < 7.40%, mean differences < 0.57 cm2). Erroneous predictions were 17% for the combined VL model, 11% for the combined RF model, and 20% for the ACL-only VM model. CONCLUSION The new CNN models provided here can be used in ACL-injured populations to automatically measure the CSA of the VL, RF, and VM muscles. The models yield high comparability to manual segmentation results and reduce the burden of manual segmentation.
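The comparability statistic used throughout, the intraclass correlation coefficient, can be computed directly. Below is a small NumPy sketch of a two-way random-effects, absolute-agreement ICC(2,1) on invented manual-versus-predicted CSA values; the study does not state its exact ICC form here, so treat this as one plausible variant.

```python
# ICC(2,1) sketch (Shrout & Fleiss two-way random, absolute agreement).
import numpy as np

def icc_2_1(ratings):
    """ratings: (n_subjects, k_raters) array."""
    n, k = ratings.shape
    grand = ratings.mean()
    row_m = ratings.mean(axis=1)
    col_m = ratings.mean(axis=0)

    ss_rows = k * ((row_m - grand) ** 2).sum()
    ss_cols = n * ((col_m - grand) ** 2).sum()
    ss_err = ((ratings - row_m[:, None] - col_m[None, :] + grand) ** 2).sum()

    msr = ss_rows / (n - 1)                  # mean square for subjects
    msc = ss_cols / (k - 1)                  # mean square for raters
    mse = ss_err / ((n - 1) * (k - 1))       # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

manual = np.array([12.1, 15.3, 9.8, 14.2, 11.7, 13.5])     # cm^2, invented
predicted = manual + np.random.default_rng(1).normal(0, 0.4, 6)
print(icc_2_1(np.column_stack([manual, predicted])))
```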
Affiliation(s)
- Beyza Tayfur
- School of Kinesiology, University of Michigan, Ann Arbor, MI, USA; Orthopedic Rehabilitation & Biomechanics (ORB) Laboratory, University of Michigan, Ann Arbor, MI, USA
- Paul Ritsche
- Department of Sport, Exercise and Health, University of Basel, Basel, Switzerland
- Olivia Sunderlik
- Orthopedic Rehabilitation & Biomechanics (ORB) Laboratory, University of Michigan, Ann Arbor, MI, USA
- Madison Wheeler
- Orthopedic Rehabilitation & Biomechanics (ORB) Laboratory, University of Michigan, Ann Arbor, MI, USA
- Eric Ramirez
- Orthopedic Rehabilitation & Biomechanics (ORB) Laboratory, University of Michigan, Ann Arbor, MI, USA
- Jacob Leuteneker
- Orthopedic Rehabilitation & Biomechanics (ORB) Laboratory, University of Michigan, Ann Arbor, MI, USA
- Oliver Faude
- Department of Sport, Exercise and Health, University of Basel, Basel, Switzerland
- Martino V Franchi
- Department of Biomedical Sciences, University of Padua, Padua, Italy
- Alexa K Johnson
- School of Kinesiology, University of Michigan, Ann Arbor, MI, USA; Orthopedic Rehabilitation & Biomechanics (ORB) Laboratory, University of Michigan, Ann Arbor, MI, USA
- Riann Palmieri-Smith
- School of Kinesiology, University of Michigan, Ann Arbor, MI, USA; Orthopedic Rehabilitation & Biomechanics (ORB) Laboratory, University of Michigan, Ann Arbor, MI, USA; Department of Orthopaedic Surgery, Michigan Medicine, Ann Arbor, MI, USA.
30
Lima RV, Arruda MP, Muniz MCR, Filho HNF, Ferrerira DMR, Pereira SM. Artificial intelligence methods in diagnosis of retinoblastoma based on fundus imaging: a systematic review and meta-analysis. Graefes Arch Clin Exp Ophthalmol 2025; 263:547-553. [PMID: 39289309 DOI: 10.1007/s00417-024-06643-2]
Abstract
BACKGROUND Artificial intelligence (AI) algorithms for the detection of retinoblastoma (RB) by fundus image analysis have been proposed as a potentially effective technique to facilitate diagnosis and screening programs. However, doubts remain about the accuracy of the technique, the best type of AI for this situation, and its feasibility for everyday use. Therefore, we performed a systematic review and meta-analysis to evaluate this issue. METHODS Following PRISMA 2020 guidelines, a comprehensive search of the MEDLINE, Embase, ClinicalTrials.gov, and IEEEX databases identified 494 studies whose titles and abstracts were screened for eligibility. We included diagnostic studies that evaluated the accuracy of AI in identifying retinoblastoma based on fundus imaging. Univariate and bivariate analyses were performed using a random-effects model. The study protocol was registered in PROSPERO under CRD42024499221. RESULTS Six studies with 9,902 fundus images were included, of which 5,944 (60%) had confirmed RB. Only one dataset used a semi-supervised machine learning (ML)-based method; all other studies used supervised ML, three using architectures requiring high computational power and two using more economical models. The pooled analysis of all models showed a sensitivity of 98.2% (95% CI: 0.947-0.994), a specificity of 98.5% (95% CI: 0.916-0.998), and an AUC of 0.986 (95% CI: 0.970-0.989). Subgroup analyses comparing models with high and low computational power showed no significant difference (p = 0.824). CONCLUSIONS AI methods showed high accuracy in the diagnosis of RB based on fundus images, with no significant difference between high and low computational power models, suggesting that their use is viable. Validation and cost-effectiveness studies are needed in countries of different income levels, and subpopulations should also be analyzed, as AI may be useful as an initial screening tool in populations at high risk for RB, serving as a bridge to the pediatric ophthalmologist or ocular oncologist, who are scarce globally. KEY MESSAGES What is known: Retinoblastoma is the most common intraocular cancer in childhood, and diagnostic delay is the main factor leading to a poor prognosis. The application of machine learning techniques offers promising methods for the screening and diagnosis of retinal diseases. What is new: This meta-analysis of the diagnostic accuracy of AI methods for diagnosing retinoblastoma from fundus images found a sensitivity of 98.2% (95% CI: 0.947-0.994) and a specificity of 98.5% (95% CI: 0.916-0.998). There was no statistically significant difference in diagnostic accuracy between high and low computational power models. The overall performance of supervised machine learning was better than that of the semi-supervised approach, although few studies of the latter type were available.
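Pooled estimates like the sensitivity above are typically built from per-study counts on the logit scale. The toy sketch below does fixed-effect inverse-variance pooling with invented counts; the paper's random-effects bivariate model adds between-study variance that this simplification omits.

```python
# Toy inverse-variance pooling of per-study sensitivities on the logit scale.
import numpy as np

# Hypothetical per-study (true positive, false negative) counts.
tp = np.array([480, 950, 300, 1200, 700, 850])
fn = np.array([12, 25, 9, 30, 15, 20])

sens = tp / (tp + fn)
logit = np.log(sens / (1 - sens))
var = 1 / tp + 1 / fn              # approx. variance of a logit proportion
w = 1 / var                        # inverse-variance weights

pooled_logit = np.sum(w * logit) / np.sum(w)
pooled_sens = 1 / (1 + np.exp(-pooled_logit))
print(f"pooled sensitivity ~ {pooled_sens:.3f}")
```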
Affiliation(s)
- Rian Vilar Lima
- Department of Medicine, University of Fortaleza, Av. Washington Soares, 1321 - Edson Queiroz, Fortaleza - CE, Ceará, 60811-905, Brazil.
- Maria Carolina Rocha Muniz
- Department of Medicine, University of Fortaleza, Av. Washington Soares, 1321 - Edson Queiroz, Fortaleza - CE, Ceará, 60811-905, Brazil
- Helvécio Neves Feitosa Filho
- Department of Medicine, University of Fortaleza, Av. Washington Soares, 1321 - Edson Queiroz, Fortaleza - CE, Ceará, 60811-905, Brazil
31
Zheng Q, Ma L, Wu Y, Gao Y, Li H, Lin J, Qing S, Long D, Chen X, Zhang W. Automatic 3-dimensional quantification of orthodontically induced root resorption in cone-beam computed tomography images based on deep learning. Am J Orthod Dentofacial Orthop 2025; 167:188-201. [PMID: 39503671 DOI: 10.1016/j.ajodo.2024.09.009]
Abstract
INTRODUCTION Orthodontically induced root resorption (OIRR) is a common and undesirable consequence of orthodontic treatment. Traditionally, studies have employed manual methods to conduct 3-dimensional quantitative analysis of OIRR via cone-beam computed tomography (CBCT), which is often subjective and time-consuming. With advancements in computer technology, deep learning-based approaches have gained traction in medical image processing. This study presents a deep learning-based model for fully automatic extraction of root volume information and localization of root resorption from CBCT images. METHODS In this cross-sectional, retrospective study, 4,534 teeth from 105 patients were used to train and validate an automatic model for OIRR quantification. The protocol encompassed several steps: preprocessing of CBCT images, involving automatic tooth segmentation and conversion into point clouds, followed by segmentation of tooth crowns and roots via a Dynamic Graph Convolutional Neural Network. The root volume was subsequently calculated, and OIRR localization was performed. The intraclass correlation coefficient (ICC) was employed to validate the consistency between the automatic model and manual measurements. RESULTS The proposed method correlated strongly with manual measurements for both root volume and OIRR severity assessment. The ICC values for average volume measurements at each tooth position exceeded 0.95 (P < 0.001), and the accuracy of the OIRR severity classifications surpassed 0.8. CONCLUSIONS The proposed methodology provides an automatic and reliable tool for OIRR assessment, offering potential improvements in orthodontic treatment planning and monitoring.
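Once roots are segmented, the volume computation itself is elementary: count voxels and multiply by the voxel size. The sketch below illustrates this with an assumed CBCT spacing; the paper derives volumes from point clouds, so this voxel-based version is a simplified stand-in.

```python
# Toy root-volume quantification from a binary segmentation mask.
import numpy as np

root_mask = np.zeros((120, 120, 120), dtype=bool)  # stand-in segmentation
root_mask[40:80, 50:70, 50:70] = True

voxel_spacing_mm = (0.25, 0.25, 0.25)              # assumed isotropic CBCT
voxel_volume_mm3 = np.prod(voxel_spacing_mm)

root_volume_mm3 = root_mask.sum() * voxel_volume_mm3
print(f"root volume: {root_volume_mm3:.1f} mm^3")
```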
Affiliation(s)
- Qianhan Zheng
- Stomatology Hospital, School of Stomatology, Zhejiang University School of Medicine, Clinical Research Center for Oral Diseases of Zhejiang Province, Key Laboratory of Oral Biomedical Research of Zhejiang Province, Cancer Center of Zhejiang University, Hangzhou, Zhejiang, China
- Lei Ma
- Department of Control Science and Engineering, School of Electronics and Information Engineering, Tongji University, Shanghai, China
- Yongjia Wu
- Stomatology Hospital, School of Stomatology, Zhejiang University School of Medicine, Clinical Research Center for Oral Diseases of Zhejiang Province, Key Laboratory of Oral Biomedical Research of Zhejiang Province, Cancer Center of Zhejiang University, Hangzhou, Zhejiang, China
- Yu Gao
- Stomatology Hospital, School of Stomatology, Zhejiang University School of Medicine, Clinical Research Center for Oral Diseases of Zhejiang Province, Key Laboratory of Oral Biomedical Research of Zhejiang Province, Cancer Center of Zhejiang University, Hangzhou, Zhejiang, China
- Huimin Li
- Stomatology Hospital, School of Stomatology, Zhejiang University School of Medicine, Clinical Research Center for Oral Diseases of Zhejiang Province, Key Laboratory of Oral Biomedical Research of Zhejiang Province, Cancer Center of Zhejiang University, Hangzhou, Zhejiang, China
- Jiaqi Lin
- Stomatology Hospital, School of Stomatology, Zhejiang University School of Medicine, Clinical Research Center for Oral Diseases of Zhejiang Province, Key Laboratory of Oral Biomedical Research of Zhejiang Province, Cancer Center of Zhejiang University, Hangzhou, Zhejiang, China
- Shuhong Qing
- Stomatology Hospital, School of Stomatology, Zhejiang University School of Medicine, Clinical Research Center for Oral Diseases of Zhejiang Province, Key Laboratory of Oral Biomedical Research of Zhejiang Province, Cancer Center of Zhejiang University, Hangzhou, Zhejiang, China
- Dan Long
- Zhejiang Cancer Hospital, Hangzhou Institute of Medicine, Chinese Academy of Sciences, Hangzhou, Zhejiang, China
- Xuepeng Chen
- Stomatology Hospital, School of Stomatology, Zhejiang University School of Medicine, Clinical Research Center for Oral Diseases of Zhejiang Province, Key Laboratory of Oral Biomedical Research of Zhejiang Province, Cancer Center of Zhejiang University, Hangzhou, Zhejiang, China.
- Weifang Zhang
- Stomatology Hospital, School of Stomatology, Zhejiang University School of Medicine, Clinical Research Center for Oral Diseases of Zhejiang Province, Key Laboratory of Oral Biomedical Research of Zhejiang Province, Cancer Center of Zhejiang University, Hangzhou, Zhejiang, China; Social Medicine and Health Affairs Administration, Zhejiang University, Hangzhou, Zhejiang, China.
32
Wang Y, Chen F, Ouyang Z, He S, Qin X, Liang X, Huang W, Wang R, Hu K. MRI-based deep learning and radiomics for predicting the efficacy of PD-1 inhibitor combined with induction chemotherapy in advanced nasopharyngeal carcinoma: A prospective cohort study. Transl Oncol 2025; 52:102245. [PMID: 39662448 PMCID: PMC11697067 DOI: 10.1016/j.tranon.2024.102245]
Abstract
BACKGROUND An increasing number of nasopharyngeal carcinoma (NPC) patients benefit from immunotherapy with chemotherapy as an induction treatment. Currently, there is no reliable method to assess the efficacy of this regimen, which hinders informed decision-making for follow-up care. AIM To establish and evaluate a model for predicting the efficacy of a programmed death-1 (PD-1) inhibitor combined with GP (gemcitabine and cisplatin) induction chemotherapy based on deep learning features (DLFs) and radiomic features. METHODS Ninety-nine patients diagnosed with advanced NPC were enrolled and randomly divided into a training set and a test set in a 7:3 ratio. DLFs and conventional radiomic features were extracted from MRI scans. The random forest algorithm was employed to identify the most valuable features. A prediction model was then created using these radiomic features and DLFs to determine the effectiveness of the PD-1 inhibitor combined with GP chemotherapy. The model's performance was assessed using receiver operating characteristic (ROC) curve analysis, the area under the curve (AUC), accuracy (ACC), and negative predictive value (NPV). RESULTS Twenty-one prediction models were constructed. The Tf_Radiomics+Resnet101 model, which combines radiomic features and DLFs, demonstrated the best performance, with AUC, ACC, and NPV values in the training and test sets of 0.936 (95% CI: 0.827-1.0), 0.9, and 0.923, respectively. CONCLUSION The Tf_Radiomics+Resnet101 model, based on MRI and Resnet101 deep learning, shows a high ability to predict the clinically complete response (cCR) to a PD-1 inhibitor combined with GP in advanced NPC. This model can significantly enhance the treatment management of patients with advanced NPC.
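The random-forest feature-selection step can be sketched with scikit-learn: rank the combined radiomic and deep-learning features by impurity-based importance and keep the top k. The feature matrix, labels, and k below are placeholders, not the study's data.

```python
# Random-forest feature ranking sketch (data and k are placeholders).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(99, 200))      # 99 patients x (radiomic + DLF) features
y = rng.integers(0, 2, size=99)     # cCR vs. non-cCR labels (stand-in)

rf = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)
top_k = 20
keep = np.argsort(rf.feature_importances_)[::-1][:top_k]
X_selected = X[:, keep]             # input to the downstream predictor
print(keep[:5])
```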
Affiliation(s)
- Yiru Wang
- Department of Radiation Oncology, The First Affiliated Hospital of Guangxi Medical University, Nanning 530021, Guangxi, China
- Fuli Chen
- Department of Radiation Oncology, The First Affiliated Hospital of Guangxi Medical University, Nanning 530021, Guangxi, China
- Zhechen Ouyang
- Department of Radiation Oncology, The First Affiliated Hospital of Guangxi Medical University, Nanning 530021, Guangxi, China
- Siyi He
- Department of Radiation Oncology, The First Affiliated Hospital of Guangxi Medical University, Nanning 530021, Guangxi, China
- Xinling Qin
- Department of Radiation Oncology, The First Affiliated Hospital of Guangxi Medical University, Nanning 530021, Guangxi, China
- Xian Liang
- Department of Radiation Oncology, The First Affiliated Hospital of Guangxi Medical University, Nanning 530021, Guangxi, China
- Weimei Huang
- Department of Radiation Oncology, The First Affiliated Hospital of Guangxi Medical University, Nanning 530021, Guangxi, China
- Rensheng Wang
- Department of Radiation Oncology, The First Affiliated Hospital of Guangxi Medical University, Nanning 530021, Guangxi, China.
- Kai Hu
- Department of Radiation Oncology, The First Affiliated Hospital of Guangxi Medical University, Nanning 530021, Guangxi, China; Key Laboratory of Early Prevention and Treatment for Regional High Frequency Tumor (Guangxi Medical University), Ministry of Education, Nanning 530021, Guangxi, China; Guangxi Key Laboratory of Immunology and Metabolism for Liver Diseases, Nanning 530021, Guangxi, China; State Key Laboratory of Targeting Oncology, Guangxi Medical University, Nanning 530021, Guangxi, China.
33
Rosero K, Salman AN, Harrison LM, Kane AA, Busso C, Hallac RR. Deep Learning-Based Assessment of Lip Symmetry for Patients With Repaired Cleft Lip. Cleft Palate Craniofac J 2025; 62:289-299. [PMID: 39838936 PMCID: PMC11909766 DOI: 10.1177/10556656241312730]
Abstract
OBJECTIVE Post-surgical lip symmetry assessment is a key indicator of cleft repair success. Traditional methods rely on distances between anatomical landmarks, which are impractical for video analysis and overlook texture and appearance. We propose an artificial intelligence (AI) approach to automate this process, analyzing lateral lip morphology for a quantitative symmetry evaluation. DESIGN We utilize contrastive learning to quantify lip symmetry by measuring the similarity between the representations of the two sides, which is subsequently used to classify the severity of asymmetry. Our model does not require patient images for training. Instead, we introduce dissimilarities into face images from open datasets using two methods: temporal misalignment of video frames and face transformations that simulate the lip asymmetry observed in the target population. The model differentiates the left and right image representations to assess asymmetry. We evaluated our model on 146 images of patients with repaired cleft lip. RESULTS The deep learning model trained with face transformations categorized patient images into five asymmetry levels, achieving a weighted accuracy of 75% and a Pearson correlation of 0.31 with medical expert evaluations. The model utilizing temporal misalignment achieved a weighted accuracy of 69% and a Pearson correlation of 0.27 on the same classification task. CONCLUSIONS We propose an automated approach for assessing lip asymmetry in patients with repaired cleft lip by transforming facial images of control subjects to train a deep learning model, eliminating the need for manual anatomical landmarks. Our promising results provide a more efficient and objective tool for evaluating surgical outcomes.
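The core symmetry measurement reduces to comparing embeddings of the two lip halves. The sketch below mirrors the right half, embeds both halves with a shared encoder, and scores their cosine similarity; the ResNet-18 encoder and the simple midline split are assumptions, not the authors' architecture.

```python
# Schematic left-right symmetry score from a shared embedding (assumed setup).
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

encoder = resnet18(weights=None)
encoder.fc = torch.nn.Identity()   # use the 512-d penultimate features
encoder.eval()

image = torch.randn(1, 3, 224, 224)            # stand-in lip crop
w = image.shape[-1] // 2
left = image[..., :w]
right = torch.flip(image[..., w:], dims=[-1])  # mirror for comparability

with torch.no_grad():
    z_left = encoder(F.interpolate(left, size=(224, 224)))
    z_right = encoder(F.interpolate(right, size=(224, 224)))

symmetry = F.cosine_similarity(z_left, z_right).item()
print(f"symmetry score: {symmetry:.3f}")  # lower -> more asymmetric
```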
Collapse
Affiliation(s)
- Karen Rosero: Department of Electrical and Computer Engineering, The University of Texas at Dallas, Richardson, TX, USA; Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Ali N. Salman: Department of Electrical and Computer Engineering, The University of Texas at Dallas, Richardson, TX, USA
- Lucas M. Harrison: Department of Plastic Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Alex A. Kane: Department of Plastic Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Carlos Busso: Department of Electrical and Computer Engineering, The University of Texas at Dallas, Richardson, TX, USA; Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA, USA
- Rami R. Hallac: Department of Plastic Surgery, University of Texas Southwestern Medical Center, Dallas, TX, USA; Analytical Imaging and Modeling Center, Children's Health, Dallas, TX, USA
34
Fu H, Zhang J, Chen L, Zou J. Personalized federated learning for abdominal multi-organ segmentation based on frequency domain aggregation. J Appl Clin Med Phys 2025; 26:e14602. [PMID: 39636019 PMCID: PMC11799920 DOI: 10.1002/acm2.14602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2024] [Revised: 11/19/2024] [Accepted: 11/21/2024] [Indexed: 12/07/2024] Open
Abstract
PURPOSE The training of deep learning (DL) models on medical images requires large amounts of sensitive patient data. However, acquiring adequately labeled datasets is challenging because of the heavy workload of manual annotation and stringent privacy protocols. METHODS Federated learning (FL) provides an alternative approach in which a coalition of clients collaboratively trains models without exchanging the underlying datasets. In this study, a novel Personalized Federated Learning Framework (PAF-Fed) is presented for abdominal multi-organ segmentation. Unlike traditional FL algorithms, PAF-Fed selectively gathers only a subset of model parameters for inter-client collaboration, retaining the remaining parameters to learn local data distributions at individual sites. Additionally, the Fourier transform combined with a self-attention mechanism is employed to aggregate the low-frequency components of parameters, promoting the extraction of shared knowledge and tackling the statistical heterogeneity of diverse client datasets. RESULTS The proposed method was evaluated on the Combined Healthy Abdominal Organ Segmentation magnetic resonance imaging (MRI) dataset (CHAOS 2019) and a private computed tomography (CT) dataset, achieving an average Dice Similarity Coefficient (DSC) of 72.65% for CHAOS and 85.50% for the private CT dataset. CONCLUSION The experimental results demonstrate the superiority of PAF-Fed, which outperforms state-of-the-art FL methods.
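The low-frequency parameter aggregation can be illustrated with a one-dimensional FFT over flattened weights. This is a simplified sketch under stated assumptions: the 25% cutoff is arbitrary, and the paper's self-attention weighting of the shared components is omitted.

import numpy as np

def aggregate_low_frequency(client_params, keep_ratio=0.25):
    """Average only the low-frequency FFT components of each client's
    flattened parameter tensor, so clients share coarse structure while
    keeping their own high-frequency detail."""
    specs = [np.fft.rfft(np.asarray(p, dtype=float).ravel()) for p in client_params]
    k = max(1, int(len(specs[0]) * keep_ratio))        # low-frequency cutoff
    shared_low = np.mean([s[:k] for s in specs], axis=0)
    personalized = []
    for p, s in zip(client_params, specs):
        s = s.copy()
        s[:k] = shared_low                             # swap in the shared lows
        flat = np.fft.irfft(s, n=np.asarray(p).size)
        personalized.append(flat.reshape(np.asarray(p).shape))
    return personalized

# Example: three clients, each holding a 4x4 weight matrix
clients = [np.random.randn(4, 4) for _ in range(3)]
print(aggregate_low_frequency(clients)[0].shape)       # (4, 4)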
Affiliation(s)
- Hao Fu: Department of Automation, School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China
- Jian Zhang: Department of Automation, School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China
- Lanlan Chen: Department of Automation, School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China
- Junzhong Zou: Department of Automation, School of Information Science and Engineering, East China University of Science and Technology, Shanghai, China
35
Li X, Zhao L, Zhang L, Wu Z, Liu Z, Jiang H, Cao C, Xu S, Li Y, Dai H, Yuan Y, Liu J, Li G, Zhu D, Yan P, Li Q, Liu W, Liu T, Shen D. Artificial General Intelligence for Medical Imaging Analysis. IEEE Rev Biomed Eng 2025; 18:113-129. [PMID: 39509310 DOI: 10.1109/rbme.2024.3493775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2024]
Abstract
Large-scale Artificial General Intelligence (AGI) models, including Large Language Models (LLMs) such as ChatGPT/GPT-4, have achieved unprecedented success in a variety of general domain tasks. Yet, when applied directly to specialized domains like medical imaging, which require in-depth expertise, these models face notable challenges arising from the medical field's inherent complexities and unique characteristics. In this review, we delve into the potential applications of AGI models in medical imaging and healthcare, with a primary focus on LLMs, Large Vision Models, and Large Multimodal Models. We provide a thorough overview of the key features and enabling techniques of LLMs and AGI, and further examine the roadmaps guiding the evolution and implementation of AGI models in the medical sector, summarizing their present applications, potentialities, and associated challenges. In addition, we highlight potential future research directions, offering a holistic view on upcoming ventures. This comprehensive review aims to offer insights into the future implications of AGI in medical imaging, healthcare, and beyond.
36
Kim JW, Khan AU, Banerjee I. Systematic Review of Hybrid Vision Transformer Architectures for Radiological Image Analysis. JOURNAL OF IMAGING INFORMATICS IN MEDICINE 2025:10.1007/s10278-024-01322-4. [PMID: 39871042 DOI: 10.1007/s10278-024-01322-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/28/2024] [Revised: 10/11/2024] [Accepted: 10/25/2024] [Indexed: 01/29/2025]
Abstract
Vision transformers (ViTs) and convolutional neural networks (CNNs) each possess distinct strengths in medical imaging: ViTs excel at capturing long-range dependencies through self-attention, while CNNs are adept at extracting local features via spatial convolution filters. However, ViTs may struggle to capture detailed local spatial information, critical for tasks like anomaly detection in medical imaging, while shallow CNNs often fail to effectively abstract global context. This study aims to explore and evaluate hybrid architectures that integrate ViT and CNN to leverage their complementary strengths for enhanced performance in medical vision tasks such as segmentation, classification, reconstruction, and prediction. Following the PRISMA guidelines, a systematic review was conducted on 34 articles published between 2020 and September 2024. These articles proposed novel hybrid ViT-CNN architectures specifically for medical imaging tasks in radiology. The review focused on analyzing architectural variations, merging strategies between ViT and CNN, innovative applications of ViT, and efficiency metrics including parameter count, inference cost (GFLOPs), and performance benchmarks, from which a ranked list of architectures was derived. The review identified that integrating ViT and CNN can mitigate the limitations of each architecture, offering comprehensive solutions that combine global context understanding with precise local feature extraction. By synthesizing the current literature, this review defines fundamental concepts of hybrid vision transformers and highlights emerging trends in the field, providing a clear direction for future research aimed at optimizing the integration of ViT and CNN for effective use in medical imaging and contributing to advancements in diagnostic accuracy and image analysis.
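As a toy instance of one merging strategy such reviews cover, a convolutional stem feeding a transformer encoder sequentially, consider the PyTorch sketch below; the layer sizes are illustrative assumptions and correspond to no specific reviewed architecture.

import torch
import torch.nn as nn

class HybridCNNViT(nn.Module):
    """Sequential hybrid: a CNN stem extracts local features, then a
    transformer encoder adds global context over the token grid."""
    def __init__(self, in_ch=1, dim=128, num_classes=2):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(in_ch, dim, kernel_size=7, stride=4, padding=3),
            nn.BatchNorm2d(dim),
            nn.ReLU(inplace=True),
        )
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        f = self.stem(x)                       # (B, dim, H/4, W/4): local features
        tokens = f.flatten(2).transpose(1, 2)  # (B, N, dim): one token per location
        tokens = self.encoder(tokens)          # global self-attention across tokens
        return self.head(tokens.mean(dim=1))   # mean-pool tokens, then classify

model = HybridCNNViT()
logits = model(torch.randn(2, 1, 64, 64))      # -> shape (2, 2)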
Affiliation(s)
- Ji Woong Kim: School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA
- Imon Banerjee: School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA; Department of Radiology, Mayo Clinic, Phoenix, AZ, USA; Department of Artificial Intelligence and Informatics (AI&I), Mayo Clinic, Scottsdale, AZ, USA
37
Wang H, Wu G, Liu Y. Efficient Generative-Adversarial U-Net for Multi-Organ Medical Image Segmentation. J Imaging 2025; 11:19. [PMID: 39852332 PMCID: PMC11766170 DOI: 10.3390/jimaging11010019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2024] [Revised: 01/06/2025] [Accepted: 01/10/2025] [Indexed: 01/26/2025] Open
Abstract
Manual labeling of lesions in medical image analysis presents a significant challenge due to its labor-intensive and inefficient nature, which ultimately strains essential medical resources and impedes the advancement of computer-aided diagnosis. This paper introduces a novel medical image-segmentation framework named Efficient Generative-Adversarial U-Net (EGAUNet), designed to facilitate rapid and accurate multi-organ labeling. To enhance the model's capability to comprehend spatial information, we propose the Global Spatial-Channel Attention Mechanism (GSCA). This mechanism enables the model to concentrate more effectively on regions of interest. Additionally, we have integrated Efficient Mapping Convolutional Blocks (EMCB) into the feature-learning process, allowing for the extraction of multi-scale spatial information and the adjustment of feature map channels through optimized weight values. Moreover, the proposed framework progressively enhances its performance by utilizing a generative-adversarial learning strategy, which contributes to improvements in segmentation accuracy. Consequently, EGAUNet demonstrates exemplary segmentation performance on public multi-organ datasets while maintaining high efficiency. For instance, in evaluations on the CHAOS T2SPIR dataset, EGAUNet achieves approximately 2% higher performance on the Jaccard metric, 1% higher on the Dice metric, and nearly 3% higher on the precision metric in comparison to advanced networks such as Swin-Unet and TransUnet.
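The abstract does not spell out the internals of GSCA, but a generic spatial-channel attention block conveys the underlying idea; the kernel size and reduction ratio below are assumptions, not the paper's exact design.

import torch
import torch.nn as nn

class SpatialChannelAttention(nn.Module):
    """Channel gate (squeeze-and-excite style) followed by a spatial gate,
    letting the network emphasize informative channels and regions."""
    def __init__(self, ch, reduction=8):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch // reduction, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch // reduction, ch, 1), nn.Sigmoid(),
        )
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(ch, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x):
        x = x * self.channel_gate(x)     # reweight channels
        return x * self.spatial_gate(x)  # reweight spatial positions

attn = SpatialChannelAttention(32)
out = attn(torch.randn(1, 32, 16, 16))   # output keeps the input shape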
Affiliation(s)
- Haoran Wang: Faculty of Data Science, City University of Macau, Avenida Padre Tomás Pereira, Taipa, Macao 999078, China
- Gengshen Wu: Faculty of Data Science, City University of Macau, Avenida Padre Tomás Pereira, Taipa, Macao 999078, China
- Yi Liu: School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou 213000, China
38
Mirabian S, Mohammadian F, Ganji Z, Zare H, Hasanpour Khalesi E. The potential role of machine learning and deep learning in differential diagnosis of Alzheimer's disease and FTD using imaging biomarkers: A review. Neuroradiol J 2025:19714009251313511. [PMID: 39787363 PMCID: PMC11719431 DOI: 10.1177/19714009251313511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2024] [Revised: 11/22/2024] [Accepted: 11/28/2024] [Indexed: 01/12/2025] Open
Abstract
INTRODUCTION The prevalence of neurodegenerative diseases has significantly increased, necessitating a deeper understanding of their symptoms, diagnostic processes, and prevention strategies. Frontotemporal dementia (FTD) and Alzheimer's disease (AD) are two prominent neurodegenerative conditions that present diagnostic challenges due to overlapping symptoms. To address these challenges, experts utilize a range of imaging techniques, including magnetic resonance imaging (MRI), diffusion tensor imaging (DTI), functional MRI (fMRI), positron emission tomography (PET), and single-photon emission computed tomography (SPECT). These techniques facilitate a detailed examination of the manifestations of these diseases. Recent research has demonstrated the potential of artificial intelligence (AI) in automating the diagnostic process, generating significant interest in this field. MATERIALS AND METHODS This narrative review aims to compile and analyze articles related to the AI-assisted diagnosis of FTD and AD. We reviewed 31 articles published between 2012 and 2024, with 23 focusing on machine learning techniques and 8 on deep learning techniques. The studies utilized features extracted from both single imaging modalities and multi-modal approaches, and evaluated the performance of various classification models. RESULTS Among the machine learning studies, Support Vector Machines (SVM) exhibited the most favorable performance in classifying FTD and AD. In deep learning studies, the ResNet convolutional neural network outperformed other networks. CONCLUSION This review highlights the utility of different imaging modalities as diagnostic aids in distinguishing between FTD and AD. However, it emphasizes the importance of incorporating clinical examinations and patient symptom evaluations to ensure comprehensive and accurate diagnoses.
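As a minimal illustration of the SVM pipelines these studies report, the sketch below trains an RBF-kernel SVM on synthetic stand-in features (in place of, say, regional volumes or connectivity measures from the imaging modalities above); all hyperparameters are illustrative.

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for imaging-derived features
X, y = make_classification(n_samples=200, n_features=50, n_informative=10,
                           random_state=0)

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
print(f"5-fold accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")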
Affiliation(s)
- Sara Mirabian: Department of Medical Physics, Faculty of Medicine, Mashhad University of Medical Sciences, Iran
- Fatemeh Mohammadian: Department of Medical Physics, Faculty of Medicine, Mashhad University of Medical Sciences, Iran
- Zohreh Ganji: Department of Medical Physics, Faculty of Medicine, Mashhad University of Medical Sciences, Iran
- Hoda Zare: Department of Medical Physics, Faculty of Medicine, Mashhad University of Medical Sciences, Iran; Medical Physics Research Center, Basic Sciences Research Institute, Mashhad University of Medical Sciences, Mashhad, Iran
- Erfan Hasanpour Khalesi: Department of Medical Physics, Faculty of Medicine, Mashhad University of Medical Sciences, Iran
39
Lu E, Zhang D, Han M, Wang S, He L. The application of artificial intelligence in insomnia, anxiety, and depression: A bibliometric analysis. Digit Health 2025; 11:20552076251324456. [PMID: 40035038 PMCID: PMC11873874 DOI: 10.1177/20552076251324456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2024] [Accepted: 02/11/2025] [Indexed: 03/05/2025] Open
Abstract
Background Mental health issues like insomnia, anxiety, and depression have increased significantly. Artificial intelligence (AI) has shown promise in diagnosing and providing personalized treatment. Objective This study aims to systematically review the application of AI in addressing insomnia, anxiety, and depression, identifying key research hotspots, and forecasting future trends through bibliometric analysis. Methods We analyzed a total of 875 articles from the Web of Science Core Collection (2000-2024) using bibliometric tools such as VOSviewer and CiteSpace. These tools were used to map research trends, highlight international collaboration, and examine the contributions of leading countries, institutions, and authors in the field. Results The United States and China lead the field in terms of research output and collaborations. Key research areas include "neural networks," "machine learning," "deep learning," and "human-robot interaction," particularly in relation to personalized treatment approaches. However, challenges around data privacy, ethical concerns, and the interpretability of AI models need to be addressed. Conclusions This study highlights the growing role of AI in mental health research and identifies future priorities, such as improving data quality, addressing ethical challenges, and integrating AI more seamlessly into clinical practice. These advancements will be crucial in addressing the global mental health crisis.
Affiliation(s)
- Enshi Lu: Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Di Zhang: Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Mingguang Han: School of Mathematical Sciences, Peking University, Beijing, China
- Shihua Wang: Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China
- Liyun He: Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China
40
Zhao H, Ren T, Li W, Wu D, Xu Z. EGFDA: Experience-guided Fine-grained Domain Adaptation for cross-domain pneumonia diagnosis. Knowl Based Syst 2025; 307:112752. [DOI: 10.1016/j.knosys.2024.112752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2025]
41
Lam VK, Fischer E, Jawad K, Tabaie S, Cleary K, Anwar SM. An automated framework for pediatric hip surveillance and severity assessment using radiographs. Int J Comput Assist Radiol Surg 2025; 20:203-211. [PMID: 39283409 DOI: 10.1007/s11548-024-03254-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2024] [Accepted: 08/12/2024] [Indexed: 01/25/2025]
Abstract
PURPOSE Hip dysplasia is the second most common orthopedic condition in children with cerebral palsy (CP) and may result in disability and pain. The migration percentage (MP) is a widely used metric in hip surveillance, calculated from an anterior-posterior pelvis radiograph. However, manual quantification of MP values from hip X-ray scans in current standard practice is time-intensive, requires expert knowledge, and is subject to human bias. The purpose of this study is to develop a machine learning algorithm to automatically quantify MP values from a hip X-ray scan and hence provide a severity assessment, which can then be used for surveillance, treatment planning, and management. METHODS X-ray scans from 210 patients were curated, pre-processed, and manually annotated at our clinical center. Several machine learning models were trained using pre-trained weights from Inception ResNet-V2, VGG-16, and VGG-19, with different strategies (pre-processing, with and without region of interest (ROI) detection, with and without data augmentation) to find an optimal model for automatic hip landmarking. The predicted landmarks were then used by our geometric algorithm to quantify the MP value for the input hip X-ray scan. RESULTS The pre-trained VGG-19 model, fine-tuned with additional custom layers, yielded the lowest mean squared error values for both train and test data when ROI-cropped images were used along with data augmentation for model training. The MP values calculated by the algorithm were benchmarked against manual ground-truth labels from our orthopedic fellows obtained using the HipScreen application. CONCLUSION The results showed the feasibility of the machine learning model in automatic hip landmark detection for reliably quantifying MP values from hip X-ray scans. The algorithm could serve as an accurate and reliable tool in orthopedic care for diagnosis, severity assessment, and hence treatment and surgical planning for hip displacement.
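For context, Reimers' migration percentage can be computed from three landmark-derived x-coordinates once the landmarks are predicted. The sketch below uses the standard definition and assumes the lateral direction corresponds to increasing x; it is not necessarily the paper's exact geometric algorithm.

def migration_percentage(head_medial_x: float, head_lateral_x: float,
                         perkins_line_x: float) -> float:
    """Reimers' migration percentage: the fraction of femoral head width
    lying lateral to Perkins' line, expressed as a percentage.
    Assumes image-space x increases toward the lateral side."""
    head_width = head_lateral_x - head_medial_x
    if head_width <= 0:
        raise ValueError("lateral landmark must lie lateral to medial landmark")
    uncovered = min(max(head_lateral_x - perkins_line_x, 0.0), head_width)
    return 100.0 * uncovered / head_width

# Example: a head 50 px wide with 15 px beyond Perkins' line -> MP = 30%
print(migration_percentage(100.0, 150.0, 135.0))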
Affiliation(s)
- Van Khanh Lam: Sheikh Zayed Institute for Pediatric Surgical Innovation, Children's National Hospital, Washington, DC, 20008, USA
- Elizabeth Fischer: Sheikh Zayed Institute for Pediatric Surgical Innovation, Children's National Hospital, Washington, DC, 20008, USA
- Kochai Jawad: Sheikh Zayed Institute for Pediatric Surgical Innovation, Children's National Hospital, Washington, DC, 20008, USA
- Sean Tabaie: Sheikh Zayed Institute for Pediatric Surgical Innovation, Children's National Hospital, Washington, DC, 20008, USA
- Kevin Cleary: Sheikh Zayed Institute for Pediatric Surgical Innovation, Children's National Hospital, Washington, DC, 20008, USA
- Syed Muhammad Anwar: Sheikh Zayed Institute for Pediatric Surgical Innovation, Children's National Hospital, Washington, DC, 20008, USA; School of Medicine and Health Sciences, George Washington University, Washington, DC, 20052, USA
42
Mann H, Khan S, Prasad A, Bayat F, Gu J, Jackson K, Li Y, Hosseinidoust Z, Didar TF, Filipe CDM. Bacteriophage-Activated DNAzyme Hydrogels Combined with Machine Learning Enable Point-of-Use Colorimetric Detection of Escherichia coli. ADVANCED MATERIALS (DEERFIELD BEACH, FLA.) 2025; 37:e2411173. [PMID: 39588857 PMCID: PMC11756048 DOI: 10.1002/adma.202411173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/30/2024] [Revised: 11/10/2024] [Indexed: 11/27/2024]
Abstract
Developing cost-effective, consumer-accessible platforms for point-of-use environmental and clinical pathogen testing is a priority, to reduce reliance on laborious, time-consuming culturing approaches. Unfortunately, a system offering ultrasensitive detection capabilities in a form that requires little auxiliary equipment or training has remained elusive. Here, a colorimetric DNAzyme-crosslinked hydrogel sensor is presented. In the presence of a target pathogen, DNAzyme cleavage results in hydrogel dissolution, yielding the release of entrapped gold nanoparticles in a manner visible to the naked eye. Recognizing that Escherichia coli holds high relevance within both environmental and clinical environments, an E. coli-responsive DNAzyme is incorporated into this platform. Through the optimization of the hydrogel polymerization process and the discovery of bacteriophage-induced DNAzyme signal amplification, 10¹ CFU mL⁻¹ E. coli is detected within real-world lake water samples. Subsequent pairing with an artificial intelligence model removed ambiguity in sensor readout, offering 96% true positive and 100% true negative accuracy. Finally, high sensor specificity and stability results supported clinical use, where 100% of urine samples collected from patients with E. coli urinary tract infections are accurately identified. No false positives are observed when testing healthy samples. Ultimately, this platform stands to significantly improve population health by substantially increasing pathogen testing accessibility.
Affiliation(s)
- Hannah Mann: Department of Chemical Engineering, McMaster University, 1280 Main Street West, Hamilton, ON L8S 4L8, Canada
- Shadman Khan: School of Biomedical Engineering, McMaster University, 1280 Main Street West, Hamilton, ON L8S 4L8, Canada
- Akansha Prasad: School of Biomedical Engineering, McMaster University, 1280 Main Street West, Hamilton, ON L8S 4L8, Canada
- Fereshteh Bayat: School of Biomedical Engineering, McMaster University, 1280 Main Street West, Hamilton, ON L8S 4L8, Canada
- Jimmy Gu: Department of Biochemistry and Biomedical Sciences, McMaster University, 1280 Main Street West, Hamilton, ON L8S 4L8, Canada
- Kyle Jackson: Department of Chemical Engineering, McMaster University, 1280 Main Street West, Hamilton, ON L8S 4L8, Canada
- Yingfu Li: Department of Biochemistry and Biomedical Sciences, McMaster University, 1280 Main Street West, Hamilton, ON L8S 4L8, Canada
- Zeinab Hosseinidoust: Department of Chemical Engineering, McMaster University, 1280 Main Street West, Hamilton, ON L8S 4L8, Canada
- Tohid F. Didar: School of Biomedical Engineering, McMaster University, 1280 Main Street West, Hamilton, ON L8S 4L8, Canada; Department of Mechanical Engineering, McMaster University, Hamilton, ON L8S 4L7, Canada
- Carlos D. M. Filipe: Department of Chemical Engineering, McMaster University, 1280 Main Street West, Hamilton, ON L8S 4L8, Canada
43
Zhong W, Zhang H. EF-net: Accurate edge segmentation for segmenting COVID-19 lung infections from CT images. Heliyon 2024; 10:e40580. [PMID: 39669151 PMCID: PMC11635652 DOI: 10.1016/j.heliyon.2024.e40580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/14/2024] [Revised: 11/19/2024] [Accepted: 11/19/2024] [Indexed: 12/14/2024] Open
Abstract
Despite advances in modern medicine, including the use of computed tomography for detecting COVID-19, precise identification and segmentation of lesions remain a significant challenge owing to indistinct boundaries and low contrast between infected and healthy lung tissues. This study introduces a novel model called the edge-based dual-parallel attention (EDA)-guided feature-filtering network (EF-Net), specifically designed to accurately segment the edges of COVID-19 lesions. The proposed model comprises two modules: an EDA module and a feature-filtering module (FFM). EDA efficiently extracts structural and textural features from low-level features, enabling precise identification of lesion boundaries. FFM receives semantically rich features from the deep levels of the encoder and integrates them with features rich in texture and contour information obtained from the EDA module. After filtering through the FFM's gating mechanism, the EDA features are fused with the deep-level features, yielding features rich in both semantic and textural information. Experiments demonstrate that our model outperforms existing models, including Inf_Net, GFNet, and BSNet, across various metrics, offering better and clearer segmentation results, particularly for lesion edges. Moreover, superior performance is achieved on the three datasets, with Dice coefficients of 98.1%, 97.3%, and 72.1%, respectively.
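A gating mechanism of the kind described, in which deep semantic features filter low-level edge features before fusion, might look like the sketch below (assuming the two feature maps have already been brought to the same shape; the channel count is illustrative).

import torch
import torch.nn as nn

class GatedFeatureFusion(nn.Module):
    """Filter low-level edge features with a gate derived from deep
    semantic features, then fuse the two streams with a 3x3 conv."""
    def __init__(self, ch):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(ch, ch, 1), nn.Sigmoid())
        self.fuse = nn.Conv2d(2 * ch, ch, 3, padding=1)

    def forward(self, edge_feat, deep_feat):
        g = self.gate(deep_feat)      # per-pixel, per-channel gate in (0, 1)
        filtered = edge_feat * g      # suppress noisy edge responses
        return self.fuse(torch.cat([filtered, deep_feat], dim=1))

fusion = GatedFeatureFusion(64)
out = fusion(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))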
44
Nie W, Jiang Y, Yao L, Zhu X, AL-Danakh AY, Liu W, Chen Q, Yang D. Prediction of bladder cancer prognosis and immune microenvironment assessment using machine learning and deep learning models. Heliyon 2024; 10:e39327. [PMID: 39687145 PMCID: PMC11647853 DOI: 10.1016/j.heliyon.2024.e39327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/12/2024] [Revised: 10/03/2024] [Accepted: 10/11/2024] [Indexed: 12/18/2024] Open
Abstract
Bladder cancer (BCa) is a heterogeneous malignancy characterized by distinct immune subtypes, primarily due to differences in tumor-infiltrating immune cells and their functional characteristics. Therefore, understanding the tumor immune microenvironment (TIME) landscape in BCa is crucial for prognostic prediction and guiding precision therapy. In this study, we integrated 10 machine learning algorithms to develop an immune-related machine learning signature (IRMLS) and subsequently created a deep learning model to detect the IRMLS subtype from pathological images. The IRMLS proved to be an independent prognostic factor for overall survival (OS) and demonstrated robust and stable performance (p < 0.01). The high-risk group exhibited an immune-inflamed phenotype, associated with poorer prognosis and higher levels of immune cell infiltration. We further investigated the cancer immune cycle and mutation landscape within the IRMLS model, confirming that the high-risk group is more sensitive to immune checkpoint inhibitor (ICI) immunotherapy and to adjuvant chemotherapy with cisplatin (p = 2.8e-10), docetaxel (p = 8.8e-13), etoposide (p = 1.8e-07), and paclitaxel (p = 6.2e-13). In conclusion, we identified and validated a machine learning-based molecular signature, IRMLS, which reflects various aspects of the BCa biological process and offers new insights into personalized precision therapy for BCa patients.
Affiliation(s)
- Weihao Nie: Department of Urology, First Affiliated Hospital of Dalian Medical University, Dalian, 116021, China
- Yiheng Jiang: Department of Urology, First Affiliated Hospital of Dalian Medical University, Dalian, 116021, China
- Luhan Yao: School of Information and Communication Engineering, Dalian University of Technology, Dalian, China
- Xinqing Zhu: Department of Urology, First Affiliated Hospital of Dalian Medical University, Dalian, 116021, China
- Abdullah Y. AL-Danakh: Department of Urology, First Affiliated Hospital of Dalian Medical University, Dalian, 116021, China
- Wenlong Liu: School of Information and Communication Engineering, Dalian University of Technology, Dalian, China
- Qiwei Chen: Department of Urology, First Affiliated Hospital of Dalian Medical University, Dalian, 116021, China; Zhongda Hospital, Medical School, Advanced Institute for Life and Health, Southeast University, Nanjing, 210096, China
- Deyong Yang: Department of Urology, First Affiliated Hospital of Dalian Medical University, Dalian, 116021, China
45
Hong Y, Jeong S, Park MJ, Song W, Lee N. Application of Pathomic Features for Differentiating Dysplastic Cells in Patients with Myelodysplastic Syndrome. Bioengineering (Basel) 2024; 11:1230. [PMID: 39768048 PMCID: PMC11673167 DOI: 10.3390/bioengineering11121230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2024] [Revised: 11/29/2024] [Accepted: 11/29/2024] [Indexed: 01/11/2025] Open
Abstract
Myelodysplastic syndromes (MDSs) are a group of hematologic neoplasms accompanied by dysplasia of bone marrow (BM) hematopoietic cells with cytopenia. Recently, digitalized pathology and pathomics using computerized feature analysis have been actively researched for classifying and predicting prognosis in various tumors of hematopoietic tissues. This study analyzed the pathomic features of hematopoietic cells in BM aspiration smears of patients with MDS according to each hematopoietic cell lineage and dysplasia. We included 24 patients with an MDS and 21 with normal BM. The 12,360 hematopoietic cells utilized were classified into seven types: normal erythrocytes, normal granulocytes, normal megakaryocytes, dysplastic erythrocytes, dysplastic granulocytes, dysplastic megakaryocytes, and others. Four hundred seventy-six pathomic features quantifying cell intensity, shape, and texture were extracted from each segmented cell. After comparing combinations of feature-selection and machine learning classifier methods using 5-fold cross-validated area under the receiver operating characteristic curve (AUROC), quadratic discriminant analysis (QDA) with a gradient-boosting decision tree (AUROC = 0.63) and QDA with eXtreme Gradient Boosting (XGB) (AUROC = 0.64) were the best-performing combinations. Through a feature-selection process, 30 features were further analyzed. Dysplastic erythrocytes and granulocytes showed lower median values in the heatmap analysis than normal erythrocytes and granulocytes. The data suggest that pathomic features could be applied to cell differentiation in hematologic malignancies and could serve as a new biomarker with an auxiliary role for more accurate diagnosis. Further studies, including prediction of survival and prognosis in larger patient cohorts, are needed.
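The evaluation protocol, pairing a feature-selection step with a classifier and ranking combinations by 5-fold cross-validated AUROC, can be sketched with scikit-learn. Synthetic data stands in for the 476 pathomic features, univariate selection stands in for the study's selection methods, and scikit-learn's gradient boosting substitutes for XGBoost to avoid an extra dependency.

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in for 476 pathomic features per cell
X, y = make_classification(n_samples=400, n_features=476, n_informative=20,
                           random_state=0)

classifiers = {
    "gbdt": GradientBoostingClassifier(random_state=0),
    "logreg": LogisticRegression(max_iter=1000),
}
for name, clf in classifiers.items():
    pipe = Pipeline([
        ("select", SelectKBest(f_classif, k=30)),  # keep 30 features, as in the study
        ("clf", clf),
    ])
    aucs = cross_val_score(pipe, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: AUROC = {aucs.mean():.2f} +/- {aucs.std():.2f}")

Keeping the selector inside the pipeline ensures feature selection is refit within each fold, avoiding information leakage into the cross-validated AUROC.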
Affiliation(s)
- Youngtaek Hong: CONNECT-AI Research Center, Yonsei University College of Medicine, Seoul 03764, Republic of Korea
- Seri Jeong: Department of Laboratory Medicine, Kangnam Sacred Heart Hospital, Hallym University College of Medicine, Seoul 07440, Republic of Korea
- Min-Jeong Park: Department of Laboratory Medicine, Kangnam Sacred Heart Hospital, Hallym University College of Medicine, Seoul 07440, Republic of Korea
- Wonkeun Song: Department of Laboratory Medicine, Kangnam Sacred Heart Hospital, Hallym University College of Medicine, Seoul 07440, Republic of Korea
- Nuri Lee: Department of Laboratory Medicine, Kangnam Sacred Heart Hospital, Hallym University College of Medicine, Seoul 07440, Republic of Korea
46
Zhou J, Jie B, Wang Z, Zhang Z, Du T, Bian W, Yang Y, Jia J. LCGNet: Local Sequential Feature Coupling Global Representation Learning for Functional Connectivity Network Analysis With fMRI. IEEE TRANSACTIONS ON MEDICAL IMAGING 2024; 43:4319-4330. [PMID: 38949932 DOI: 10.1109/tmi.2024.3421360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/03/2024]
Abstract
Analysis of functional connectivity networks (FCNs) derived from resting-state functional magnetic resonance imaging (rs-fMRI) has greatly advanced our understanding of brain diseases, including Alzheimer's disease (AD) and attention deficit hyperactivity disorder (ADHD). Advanced machine learning techniques, such as convolutional neural networks (CNNs), have been used to learn high-level feature representations of FCNs for automated brain disease classification. Even though convolution operations in CNNs are good at extracting local properties of FCNs, they generally cannot capture the global temporal representations of FCNs well. Recently, the transformer technique has demonstrated remarkable performance in various tasks, which is attributed to its effective self-attention mechanism for capturing global temporal feature representations. However, it cannot effectively model the local network characteristics of FCNs. To this end, in this paper, we propose a novel network structure for Local sequential feature Coupling Global representation learning (LCGNet) that takes advantage of both convolutional operations and self-attention mechanisms for enhanced FCN representation learning. Specifically, we first build a dynamic FCN for each subject using an overlapped sliding window approach. We then construct three sequential components (i.e., an edge-to-vertex layer, a vertex-to-network layer, and a network-to-temporality layer) with a dual backbone branch of CNN and transformer to extract and couple local-to-global topological information of brain networks. Experimental results on two real datasets (i.e., ADNI and ADHD-200) with rs-fMRI data show the superiority of our LCGNet.
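The first step, building a dynamic FCN from overlapping sliding windows of ROI time series, is straightforward to sketch; the window length and stride below are illustrative.

import numpy as np

def dynamic_fcns(ts: np.ndarray, win: int = 30, step: int = 10) -> np.ndarray:
    """Build a dynamic FCN from ROI time series of shape (T, R): one
    Pearson correlation matrix per overlapping sliding window."""
    T, _ = ts.shape
    mats = [np.corrcoef(ts[s:s + win].T)        # (R, R) matrix per window
            for s in range(0, T - win + 1, step)]
    return np.stack(mats)                        # (n_windows, R, R)

# Example: 120 time points, 90 ROIs -> 10 windows of 90x90 matrices
print(dynamic_fcns(np.random.randn(120, 90)).shape)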
47
Mahbod A, Dorffner G, Ellinger I, Woitek R, Hatamikia S. Improving generalization capability of deep learning-based nuclei instance segmentation by non-deterministic train time and deterministic test time stain normalization. Comput Struct Biotechnol J 2024; 23:669-678. [PMID: 38292472 PMCID: PMC10825317 DOI: 10.1016/j.csbj.2023.12.042] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/12/2023] [Revised: 12/26/2023] [Accepted: 12/26/2023] [Indexed: 02/01/2024] Open
Abstract
With the advent of digital pathology and microscopic systems that can scan and save whole slide histological images automatically, there is a growing trend to use computerized methods to analyze acquired images. Among different histopathological image analysis tasks, nuclei instance segmentation plays a fundamental role in a wide range of clinical and research applications. While many semi- and fully-automatic computerized methods have been proposed for nuclei instance segmentation, deep learning (DL)-based approaches have been shown to deliver the best performances. However, the performance of such approaches usually degrades when tested on unseen datasets. In this work, we propose a novel method to improve the generalization capability of a DL-based automatic segmentation approach. Besides utilizing one of the state-of-the-art DL-based models as a baseline, our method incorporates non-deterministic train time and deterministic test time stain normalization, and ensembling to boost the segmentation performance. We trained the model with one single training set and evaluated its segmentation performance on seven test datasets. Our results show that the proposed method provides up to 4.9%, 5.4%, and 5.9% better average performance in segmenting nuclei based on Dice score, aggregated Jaccard index, and panoptic quality score, respectively, compared to the baseline segmentation model.
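The train/test normalization policy itself (as opposed to any particular stain-normalization algorithm) can be sketched as follows; normalizer stands for an assumed implementation such as Reinhard or Macenko normalization, and the skip probability is illustrative.

import random

def normalize_train(image, reference_pool, normalizer, p_skip=0.25):
    """Non-deterministic train-time policy: occasionally skip normalization,
    otherwise normalize toward a randomly drawn reference image. The
    variability acts like augmentation over stain appearance."""
    if random.random() < p_skip:
        return image
    return normalizer(image, random.choice(reference_pool))

def normalize_test(image, fixed_reference, normalizer):
    """Deterministic test-time policy: always normalize toward one fixed
    reference so inference is reproducible across runs."""
    return normalizer(image, fixed_reference)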
Affiliation(s)
- Amirreza Mahbod: Research Center for Medical Image Analysis and Artificial Intelligence, Department of Medicine, Danube Private University, Krems an der Donau, Austria
- Georg Dorffner: Institute of Artificial Intelligence, Medical University of Vienna, Vienna, Austria
- Isabella Ellinger: Institute for Pathophysiology and Allergy Research, Medical University of Vienna, Vienna, Austria
- Ramona Woitek: Research Center for Medical Image Analysis and Artificial Intelligence, Department of Medicine, Danube Private University, Krems an der Donau, Austria
- Sepideh Hatamikia: Research Center for Medical Image Analysis and Artificial Intelligence, Department of Medicine, Danube Private University, Krems an der Donau, Austria; Austrian Center for Medical Innovation and Technology, Wiener Neustadt, Austria
48
Fortunov RM, Cabacungan E, Barry JS, Jagarapu J. Artificial intelligence and informatics in neonatal resuscitation. Semin Perinatol 2024; 48:151992. [PMID: 39488455 DOI: 10.1016/j.semperi.2024.151992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/04/2024]
Abstract
Neonatal intensive care unit resuscitative care continually evolves and increasingly relies on data. Data-driven precision resuscitation care can be enabled by leveraging informatics tools and artificial intelligence. Despite technological advancements, these data are often underutilized due to suboptimal data capture and aggregation and low adoption of artificial intelligence and analytic tools. This review describes the fundamentals of, and explores the evidence behind, informatics and artificial intelligence tools supporting neonatal intensive care unit resuscitative care, training, and education. Key findings include the need for effective interface design for accurate data capture, followed by storage and translation to wisdom using analytics and artificial intelligence tools. This review also addresses the issues of data privacy, bias, liability, and ethical frameworks when adopting these tools. While these emerging technologies hold great promise for improving resuscitation, further study of these applications in the neonatal population and broader awareness of informatics and artificial intelligence principles among clinicians are imperative.
Affiliation(s)
- Regine M Fortunov: Division of Neonatology, Baylor College of Medicine, Houston, TX, United States
- Erwin Cabacungan: Section of Neonatology, Medical College of Wisconsin, Milwaukee, WI, United States
- James S Barry: Section of Neonatology, University of Colorado School of Medicine, Aurora, CO, United States
- Jawahar Jagarapu: Division of Neonatology, UT Southwestern Medical Center, Dallas, TX, United States
49
Abaid A, Ilancheran S, Iqbal T, Hynes N, Ullah I. Exploratory analysis of Type B Aortic Dissection (TBAD) segmentation in 2D CTA images using various kernels. Comput Med Imaging Graph 2024; 118:102460. [PMID: 39577205 DOI: 10.1016/j.compmedimag.2024.102460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/10/2024] [Revised: 10/31/2024] [Accepted: 10/31/2024] [Indexed: 11/24/2024]
Abstract
Type B Aortic Dissection (TBAD) is a rare but fatal cardiovascular disease characterized by a tear in the inner layer of the aorta, affecting 3.5 per 100,000 individuals annually. In this work, we explore the feasibility of leveraging two-dimensional Convolutional Neural Network (CNN) models to perform accurate slice-by-slice segmentation of the true lumen, false lumen, and false lumen thrombus in Computed Tomography Angiography images. The study performed an exploratory analysis of three 2D U-Net models: the baseline 2D U-Net, a variant of U-Net with atrous convolutions, and a U-Net with a custom layer featuring a position-oriented, partially shared weighting scheme kernel. These models were trained and benchmarked against a state-of-the-art baseline 3D U-Net model. Overall, our U-Net with the VGG19 encoder architecture achieved the best performance among all models, with a mean Dice score of 80.48% and an IoU score of 72.93%. The segmentation results were also compared with the Segment Anything Model (SAM) and the UniverSeg models. Our findings indicate that our 2D U-Net models excel in false lumen and true lumen segmentation accuracy while achieving lower false lumen thrombus segmentation accuracy than the state-of-the-art 3D U-Net model. The study findings highlight the complexities involved in developing segmentation models, especially for cardiovascular medical images, and emphasize the importance of developing lightweight models for real-time decision-making to improve overall patient care.
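For reference, the two reported metrics are simple overlap ratios over binary masks; a NumPy sketch:

import numpy as np

def dice_and_iou(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7):
    """Compute Dice and IoU for binary masks:
    Dice = 2*|A & B| / (|A| + |B|), IoU = |A & B| / |A | B|."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
    iou = (inter + eps) / (union + eps)
    return dice, iou

pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
print(dice_and_iou(pred, target))  # approximately (0.67, 0.5)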
Affiliation(s)
- Ayman Abaid: School of Computer Science, University of Galway, Galway, Ireland
- Talha Iqbal: Insight SFI Research Centre for Data Analytics, University of Galway, Galway, Ireland
- Niamh Hynes: University Hospital Galway, Newcastle Road, University of Galway, Galway, Ireland
- Ihsan Ullah: Insight SFI Research Centre for Data Analytics, University of Galway, Galway, Ireland; School of Computer Science, University of Galway, Galway, Ireland
50
Lu Y, Gao H, Qiu J, Qiu Z, Liu J, Bai X. DSIFNet: Implicit feature network for nasal cavity and vestibule segmentation from 3D head CT. Comput Med Imaging Graph 2024; 118:102462. [PMID: 39556905 DOI: 10.1016/j.compmedimag.2024.102462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2024] [Revised: 10/14/2024] [Accepted: 11/03/2024] [Indexed: 11/20/2024]
Abstract
This study is dedicated to accurately segmenting the nasal cavity and its intricate internal anatomy from head CT images, which is critical for understanding nasal physiology, diagnosing diseases, and planning surgeries. The nasal cavity and its anatomical structures, such as the sinuses and vestibule, exhibit significant scale differences, with complex shapes and variable microstructures. These features require a segmentation method with strong cross-scale feature extraction capabilities. To effectively address this challenge, we propose an image segmentation network named the Deeply Supervised Implicit Feature Network (DSIFNet). This network uniquely incorporates an Implicit Feature Function Module Guided by Local and Global Positional Information (LGPI-IFF), enabling effective fusion of features across scales and enhancing the network's ability to recognize details and overall structures. Additionally, we introduce a deep supervision mechanism based on implicit feature functions in the network's decoding phase, optimizing the utilization of multi-scale feature information and thus improving segmentation precision and detail representation. Furthermore, we constructed a dataset comprising 7116 CT volumes (1,292,508 slices) and implemented PixPro-based self-supervised pretraining to exploit unlabeled data for enhanced feature extraction. Our tests on nasal cavity and vestibule segmentation, conducted on a dataset of 128 head CT volumes (34,006 slices), demonstrate the robustness and superior performance of the proposed method, achieving leading results across multiple segmentation metrics.
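The core of an implicit feature function, querying a discrete feature map at continuous coordinates and decoding each sample pointwise, can be sketched with grid_sample; this is a generic illustration, not the LGPI-IFF module itself.

import torch
import torch.nn as nn
import torch.nn.functional as F

def query_implicit_features(feature_map, coords, decoder):
    """Sample a discrete CNN feature map at continuous (x, y) locations and
    decode each sampled vector pointwise: the essence of an implicit feature
    function, queryable at arbitrary resolution."""
    # coords: (B, N, 2) in [-1, 1]; grid_sample wants (B, H_out, W_out, 2)
    grid = coords.unsqueeze(2)                                    # (B, N, 1, 2)
    feats = F.grid_sample(feature_map, grid, align_corners=True)  # (B, C, N, 1)
    feats = feats.squeeze(-1).transpose(1, 2)                     # (B, N, C)
    return decoder(feats)                                         # per-point output

decoder = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
fmap = torch.randn(1, 16, 8, 8)
pts = torch.rand(1, 100, 2) * 2 - 1                               # 100 query points
print(query_implicit_features(fmap, pts, decoder).shape)          # (1, 100, 1)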
Affiliation(s)
- Yi Lu: Image Processing Center, Beihang University, Beijing 102206, China
- Hongjian Gao: Image Processing Center, Beihang University, Beijing 102206, China
- Jikuan Qiu: Department of Otolaryngology, Head and Neck Surgery, Peking University First Hospital, Beijing 100034, China
- Zihan Qiu: Department of Otorhinolaryngology, Head and Neck Surgery, The Sixth Affiliated Hospital of Sun Yat-sen University, Sun Yat-sen University, Guangzhou 510655, China
- Junxiu Liu: Department of Otolaryngology, Head and Neck Surgery, Peking University First Hospital, Beijing 100034, China
- Xiangzhi Bai: Image Processing Center, Beihang University, Beijing 102206, China; The State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing 100191, China; Beijing Advanced Innovation Center for Biomedical Engineering, Beihang University, Beijing 100191, China