Retrospective Study
Copyright ©The Author(s) 2025.
World J Radiol. Jun 28, 2025; 17(6): 106682
Published online Jun 28, 2025. doi: 10.4329/wjr.v17.i6.106682
Table 1 Multi-cohort comparison of thyroid lobe characteristics, n (%)

| Characteristics | Training cohort | Validation cohort | Temporal test cohort | External test cohort | P value (a) | P value (b) | P value (c) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Grouped | n = 264 | n = 112 | n = 97 | n = 81 | 0.203 | 0.605 | 0.053 |
| Benign | 80 (30) | 26 (23) | 26 (27) | 15 (19) | | | |
| Malignant | 184 (70) | 86 (77) | 71 (73) | 66 (81) | | | |
| Gender | | | | | 0.355 | 0.415 | 0.463 |
| Female | 211 (80) | 84 (75) | 73 (75) | 61 (75) | | | |
| Male | 53 (20) | 28 (25) | 24 (25) | 20 (25) | | | |
| Age (years), median (P25-P75) | 51.00 (38.75-60.00) | 48.50 (35.75-58.00) | 50.00 (41.00-57.00) | 52.00 (41.00-60.00) | 0.100 | 0.944 | 0.413 |
| BMI, median (P25-P75) | 24.46 (22.04-26.30) | 24.73 (22.18-27.05) | 24.77 (23.11-26.9) | 25.01 (24.29-25.78) | 0.288 | 0.161 | 0.042 |
| Multiple nodules | | | | | 0.517 | 0.519 | 0.036 |
| No | 148 (56) | 58 (52) | 50 (52) | 34 (42) | | | |
| Yes | 116 (44) | 54 (48) | 47 (48) | 47 (58) | | | |
| Tumor size group | | | | | 0.095 | 0.001 | 0.007 |
| Small (≤ 5 mm) | 43 (16) | 29 (26) | 21 (22) | 24 (30) | | | |
| Medium (5-10 mm) | 99 (38) | 38 (34) | 52 (54) | 33 (41) | | | |
| Large (> 10 mm) | 122 (46) | 45 (40) | 24 (25) | 24 (30) | | | |
| Calcify | | | | | 0.629 | 0.344 | < 0.001 |
| No | 192 (73) | 78 (70) | 76 (78) | 42 (52) | | | |
| Yes | 72 (27) | 34 (30) | 21 (22) | 39 (48) | | | |
| Cystic | | | | | 0.741 | 0.399 | < 0.001 |
| No | 229 (87) | 95 (85) | 88 (91) | 38 (47) | | | |
| Yes | 35 (13) | 17 (15) | 9 (9) | 43 (53) | | | |
| FT3 (pmol/L), median (P25-P75) | 4.76 (4.39-5.09) | 4.76 (4.44-4.99) | 4.71 (4.46-5.03) | 4.79 (4.32-5.45) | 0.891 | 0.918 | 0.299 |
| FT4 (pmol/L), median (P25-P75) | 16.20 (14.88-17.90) | 16.20 (14.57-18.30) | 17.40 (15.70-19.00) | 16.80 (14.50-19.60) | 0.976 | 0.002 | 0.204 |
| TSH (mU/L), median (P25-P75) | 1.73 (1.37-2.44) | 1.62 (1.13-2.59) | 2.16 (1.38-2.67) | 1.68 (1.04-3.01) | 0.098 | 0.153 | 0.665 |
| TGAb (IU/mL), median (P25-P75) | 16.70 (15.4-21.45) | 16.70 (15.3-17.72) | 17.00 (15.81-17.90) | 17.80 (17.30-18.10) | 0.173 | 0.789 | 0.001 |
| TPOAb (IU/mL), median (P25-P75) | 13.00 (12.35-13.22) | 13.00 (11.35-15.10) | 15.00 (14.20-15.74) | 12.60 (12.10-13.50) | 0.707 | < 0.001 | < 0.001 |
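
The categorical P values in Table 1 compare counts between cohorts. The table does not state which test was used, so the following is only a sketch under an assumption: a Yates-corrected chi-square test on a 2x2 table, applied to the benign/malignant counts of the training and validation cohorts. The function name `chi2_2x2_yates` is hypothetical and introduced here for illustration.

```python
import math

def chi2_2x2_yates(a, b, c, d):
    """Yates-corrected chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    Assumption: this may or may not be the test behind the table's P values.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Continuity correction: shrink |ad - bc| by n/2, never below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0.0)
    stat = n * diff ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the chi-square tail probability is erfc(sqrt(x/2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Counts from Table 1: training (80 benign, 184 malignant) vs validation (26, 86).
stat, p = chi2_2x2_yates(80, 184, 26, 86)
print(f"chi2 = {stat:.3f}, p = {p:.3f}")
```

Under this assumption the P value comes out close to the first value listed in the Grouped row (0.203), which is consistent with, but does not confirm, a continuity-corrected test having been used.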

Table 2 Comparison between the benign and malignant thyroid lobes in the training cohort, n (%)

| Characteristics | Benign (n = 80) | Malignant (n = 184) | P value |
| --- | --- | --- | --- |
| Gender | | | 0.234 |
| Female | 68 (85) | 143 (78) | |
| Male | 12 (15) | 41 (22) | |
| Age (years), median (P25-P75) | 54.50 (49.00-64.25) | 49.00 (35.00-57.25) | < 0.001 |
| BMI, median (P25-P75) | 24.16 (21.93-25.98) | 24.80 (22.19-26.35) | 0.296 |
| Multiple nodules | | | 0.716 |
| No | 43 (54) | 105 (57) | |
| Yes | 37 (46) | 79 (43) | |
| Tumor size group | | | < 0.001 |
| Small (≤ 5 mm) | 4 (5) | 39 (21) | |
| Medium (5-10 mm) | 21 (26) | 78 (42) | |
| Large (> 10 mm) | 55 (69) | 67 (36) | |
| Calcify | | | 0.028 |
| No | 66 (82) | 126 (68) | |
| Yes | 14 (18) | 58 (32) | |
| Cystic | | | < 0.001 |
| No | 57 (71) | 172 (93) | |
| Yes | 23 (29) | 12 (7) | |
| FT3 (pmol/L), median (P25-P75) | 4.76 (4.43-5.01) | 4.76 (4.39-5.14) | 0.421 |
| FT4 (pmol/L), median (P25-P75) | 16.20 (14.88-16.92) | 16.20 (14.95-18.05) | 0.142 |
| TSH (mU/L), median (P25-P75) | 1.73 (1.40-2.03) | 1.73 (1.37-2.63) | 0.227 |
| TGAb (IU/mL), median (P25-P75) | 16.70 (15.62-18.05) | 16.70 (15.4-25.85) | 0.164 |
| TPOAb (IU/mL), median (P25-P75) | 13.00 (12.80-13.82) | 13.00 (12.35-13.00) | 0.671 |

Table 3 Weights of least absolute shrinkage and selection operator selected features and training set Z-score parameters

| Feature names | Weight | Average | Variance |
| --- | --- | --- | --- |
| Intercept | 0.250252 | - | - |
| Original_shape_Elongation | 0.311096 | 0.524 | 0.104 |
| Original_firstorder_10Percentile | 0.213349 | 50.029 | 15.014 |
| Original_firstorder_interquartilerange | 0.157092 | 29.216 | 10.044 |
| Original_glcm_maximumprobability | -0.10344 | 0.304 | 0.092 |
| Original_gldm_dependenceentropy | 0.192143 | 6.133 | 0.351 |
| Original_ngtdm_Contrast | -0.03429 | 0.009 | 0.006 |
| Original_ngtdm_Strength | -0.00931 | 0.472 | 1.371 |
| Log-sigma-1-mm-3D_glcm_InverseVariance | -0.44471 | 0.310 | 0.026 |
| Log-sigma-1-mm-3D_glszm_LargeAreaHighGrayLevelEmphasis | 0.326205 | 68429444.000 | 388518307.356 |
| log-sigma-1-mm-3D_glszm_LargeAreaLowGrayLevelEmphasis | -0.02438 | 10850.300 | 46169.980 |
| Log-sigma-2-mm-3D_firstorder_Maximum | 0.173416 | 56.164 | 27.585 |
| Log-sigma-2-mm-3D_glrlm_RunLengthNonUniformity | -0.11635 | 1356.938 | 450.535 |
| Log-sigma-2-mm-3D_glszm_ZoneEntropy | -0.18766 | 5.256 | 0.371 |
| Log-sigma-3-mm-3D_firstorder_TotalEnergy | -0.26768 | 57352644.000 | 20352814.000 |
| Log-sigma-3-mm-3D_glcm_ClusterProminence | 0.123895 | 1173.183 | 544.414 |
| Wavelet-LLH_firstorder_Mean | 0.15131 | 9.850 | 2.798 |
| Wavelet-LLH_firstorder_Skewness | 0.088985 | -1.869 | 1.851 |
| Wavelet-LHH_glcm_InverseVariance | -0.38886 | 0.503 | 0.005 |
| Wavelet-LHH_gldm_LargeDependenceHighGrayLevelEmphasis | -0.25005 | 3423.815 | 3331.008 |
| wavelet-HLL_glszm_GrayLevelNonUniformityNormalized | 0.104264 | 0.159 | 0.094 |
| Wavelet-HLL_glszm_LargeAreaLowGrayLevelEmphasis | -0.05978 | 10742.590 | 83666.610 |
| Wavelet-HLH_glcm_InverseVariance | -0.14526 | 0.501 | 0.005 |
| Wavelet-HHL_glszm_GrayLevelNonUniformity | 0.087039 | 18.166 | 12.923 |
| Wavelet-HHH_firstorder_InterquartileRange | 0.284918 | 6.497 | 1.234 |
| Wavelet-LLL_glcm_InverseVariance | -0.45374 | 0.448 | 0.023 |
| Wavelet-LLL_gldm_DependenceEntropy | -0.27325 | 7.039 | 0.328 |
| Wavelet-LLL_glszm_GrayLevelNonUniformity | -0.23569 | 39.394 | 14.681 |
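
Table 3 pairs each selected feature with a weight and the training-set parameters used for Z-scoring. One plausible way such parameters combine is a linear score: the intercept plus the sum of each weight times the standardized feature value. The sketch below illustrates that arithmetic on three rows of the table. It is an assumption, not the authors' stated pipeline: the table does not say whether the "Variance" column is divided into the centered value directly or after taking its square root, nor whether the score is passed through a further transform, so `linear_score` and its `spread_is_variance` switch are hypothetical.

```python
import math

# Three rows copied from Table 3: name -> (weight, average, spread).
# The last column is labelled "Variance" in the table; how it enters the
# Z-score is not stated, so both readings are supported below.
INTERCEPT = 0.250252
PARAMS = {
    "Original_shape_Elongation": (0.311096, 0.524, 0.104),
    "Original_gldm_dependenceentropy": (0.192143, 6.133, 0.351),
    "Wavelet-LLL_glcm_InverseVariance": (-0.45374, 0.448, 0.023),
}

def linear_score(features, params=PARAMS, intercept=INTERCEPT,
                 spread_is_variance=False):
    """Intercept plus weighted sum of Z-scored features (illustrative only)."""
    score = intercept
    for name, (weight, average, spread) in params.items():
        # Use the spread as a standard deviation, or take its square root
        # if it is to be read as a true variance.
        scale = math.sqrt(spread) if spread_is_variance else spread
        score += weight * (features[name] - average) / scale
    return score

# A lobe whose features sit exactly at the training averages scores the intercept.
at_mean = {name: avg for name, (_, avg, _) in PARAMS.items()}
print(round(linear_score(at_mean), 6))
```

Only three of the 27 selected features are included to keep the example short; a full score would iterate over every row of the table.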

Table 4 Multi-cohort predictive performance of various models

| Model | AUC (95%CI) | Accuracy | Sensitivity | Specificity | PPV | NPV | Precision | Recall | F1 | Brier |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Training cohort (benign = 80, malignant = 184), SMOTE (benign = 104) | | | | | | | | | | |
| LR | 0.845 (0.794-0.896) | 0.799 | 0.832 | 0.725 | 0.874 | 0.652 | 0.874 | 0.832 | 0.852 | 0.140 |
| DT | 0.806 (0.745-0.866) | 0.826 | 0.829 | 0.815 | 0.946 | 0.550 | 0.946 | 0.829 | 0.883 | 0.136 |
| RF | 1.000 (1.000-1.000) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.027 |
| XGB | 0.899 (0.845-0.932) | 0.845 | 0.783 | 0.875 | 0.935 | 0.636 | 0.935 | 0.783 | 0.852 | 0.125 |
| SVM | 0.844 (0.791-0.896) | 0.814 | 0.864 | 0.700 | 0.869 | 0.691 | 0.869 | 0.864 | 0.866 | 0.141 |
| KNN | 1.000 (1.000-1.000) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.031 |
| LGBM | 0.692 (0.631-0.752) | 0.561 | 0.755 | 0.600 | 0.813 | 0.516 | 0.813 | 0.755 | 0.783 | 0.208 |
| Senior radiologist | 0.596 (0.505-0.633) | 0.542 | 0.500 | 0.638 | 0.760 | 0.357 | 0.760 | 0.500 | 0.603 | 0.209 |
| Junior radiologist | 0.529 (0.464-0.594) | 0.496 | 0.446 | 0.613 | 0.736 | 0.324 | 0.726 | 0.446 | 0.552 | 0.211 |
| Validation cohort (benign = 26, malignant = 86) | | | | | | | | | | |
| LR | 0.834 (0.750-0.917) | 0.768 | 0.837 | 0.538 | 0.857 | 0.500 | 0.857 | 0.837 | 0.847 | 0.128 |
| DT | 0.646 (0.529-0.762) | 0.696 | 0.802 | 0.346 | 0.802 | 0.346 | 0.802 | 0.802 | 0.802 | 0.218 |
| RF | 0.729 (0.620-0.838) | 0.759 | 0.860 | 0.423 | 0.831 | 0.478 | 0.831 | 0.860 | 0.846 | 0.166 |
| XGB | 0.803 (0.715-0.890) | 0.696 | 0.744 | 0.538 | 0.842 | 0.389 | 0.842 | 0.744 | 0.790 | 0.144 |
| SVM | 0.820 (0.728-0.912) | 0.795 | 0.860 | 0.577 | 0.871 | 0.556 | 0.871 | 0.860 | 0.865 | 0.130 |
| KNN | 0.793 (0.697-0.890) | 0.741 | 0.779 | 0.615 | 0.870 | 0.457 | 0.870 | 0.779 | 0.822 | 0.180 |
| LGBM | 0.724 (0.628-0.820) | 0.545 | 0.430 | 0.923 | 0.949 | 0.329 | 0.949 | 0.430 | 0.592 | 0.182 |
| Senior radiologist | 0.558 (0.449-0.667) | 0.527 | 0.500 | 0.615 | 0.811 | 0.271 | 0.811 | 0.500 | 0.619 | 0.182 |
| Junior radiologist | 0.538 (0.428-0.649) | 0.518 | 0.500 | 0.577 | 0.796 | 0.259 | 0.796 | 0.500 | 0.614 | 0.182 |
| Temporal test cohort (benign = 26, malignant = 71) | | | | | | | | | | |
| LR | 0.814 (0.717-0.912) | 0.825 | 0.930 | 0.538 | 0.846 | 0.737 | 0.846 | 0.930 | 0.886 | 0.137 |
| DT | 0.757 (0.652-0.851) | 0.742 | 0.780 | 0.533 | 0.901 | 0.308 | 0.901 | 0.780 | 0.837 | 0.178 |
| RF | 0.795 (0.696-0.894) | 0.763 | 0.901 | 0.385 | 0.800 | 0.588 | 0.800 | 0.901 | 0.848 | 0.152 |
| XGB | 0.855 (0.775-0.935) | 0.773 | 0.831 | 0.615 | 0.855 | 0.571 | 0.855 | 0.831 | 0.843 | 0.139 |
| SVM | 0.816 (0.719-0.913) | 0.845 | 0.972 | 0.500 | 0.841 | 0.867 | 0.841 | 0.972 | 0.902 | 0.141 |
| KNN | 0.719 (0.607-0.832) | 0.711 | 0.803 | 0.462 | 0.803 | 0.462 | 0.803 | 0.803 | 0.803 | 0.208 |
| LGBM | 0.800 (0.697-0.904) | 0.835 | 0.873 | 0.731 | 0.899 | 0.679 | 0.899 | 0.873 | 0.886 | 0.193 |
| External test cohort (benign = 15, malignant = 66) | | | | | | | | | | |
| LR | 0.782 (0.607-0.907) | 0.765 | 0.848 | 0.400 | 0.862 | 0.375 | 0.862 | 0.848 | 0.855 | 0.127 |
| DT | 0.715 (0.566-0.863) | 0.778 | 0.887 | 0.421 | 0.833 | 0.533 | 0.833 | 0.887 | 0.859 | 0.166 |
| RF | 0.751 (0.589-0.913) | 0.815 | 0.879 | 0.533 | 0.892 | 0.500 | 0.892 | 0.879 | 0.885 | 0.143 |
| XGB | 0.802 (0.644-0.939) | 0.728 | 0.727 | 0.733 | 0.923 | 0.379 | 0.923 | 0.727 | 0.814 | 0.121 |
| SVM | 0.800 (0.674-0.926) | 0.815 | 0.909 | 0.400 | 0.870 | 0.500 | 0.870 | 0.909 | 0.889 | 0.120 |
| KNN | 0.813 (0.697-0.928) | 0.803 | 0.848 | 0.600 | 0.903 | 0.474 | 0.903 | 0.848 | 0.875 | 0.158 |
| LGBM | 0.728 (0.668-0.789) | 0.753 | 0.848 | 0.333 | 0.848 | 0.333 | 0.848 | 0.848 | 0.848 | 0.163 |
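
Most columns of Table 4 follow from a single confusion matrix, and two pairs are the same quantity under different names: precision equals PPV and recall equals sensitivity. The sketch below shows the standard definitions. The counts passed in (72 true positives, 14 false negatives, 14 true negatives, 12 false positives) are not reported in the source; they are back-calculated here as an assumption from the LR row of the validation cohort (26 benign, 86 malignant). The Brier score needs predicted probabilities, which the table does not list, so `brier_score` is shown only as a definition.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Threshold-based metrics from confusion-matrix counts (malignant = positive)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "precision": ppv,        # same quantity as PPV
        "recall": sensitivity,   # same quantity as sensitivity
        "f1": 2 * ppv * sensitivity / (ppv + sensitivity),
    }

def brier_score(labels, probabilities):
    """Mean squared difference between predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(labels, probabilities)) / len(labels)

# Assumed counts, back-calculated from the validation-cohort LR row.
metrics = confusion_metrics(tp=72, fn=14, tn=14, fp=12)
print({name: round(value, 3) for name, value in metrics.items()})
```

With these assumed counts the rounded values agree with the validation-cohort LR row (accuracy 0.768, sensitivity 0.837, specificity 0.538, PPV 0.857, NPV 0.500, F1 0.847).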
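
Table 5 DeLong test for different models and human radiologists in the validation cohort

| Models | DT | LR | RF | XGB | SVM | KNN | LGBM | Senior radiologist | Junior radiologist |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DT | - | 0.011 | 0.118 | < 0.001 | < 0.001 | 0.022 | 0.309 | 0.235 | 0.179 |
| LR | 0.011 | - | 0.137 | 0.615 | 0.833 | 0.537 | 0.093 | < 0.001 | < 0.001 |
| RF | 0.118 | 0.137 | - | 0.038 | 0.016 | 0.100 | 0.950 | 0.031 | 0.017 |
| XGB | < 0.001 | 0.615 | 0.038 | - | 0.469 | 0.825 | 0.239 | < 0.001 | < 0.001 |
| SVM | < 0.001 | 0.833 | 0.016 | 0.469 | - | 0.535 | 0.159 | < 0.001 | < 0.001 |
| KNN | 0.022 | 0.537 | 0.100 | 0.825 | 0.535 | - | 0.321 | 0.002 | < 0.001 |
| LGBM | 0.309 | 0.093 | 0.950 | 0.239 | 0.159 | 0.321 | - | 0.017 | 0.008 |
| Senior radiologist | 0.235 | < 0.001 | 0.031 | < 0.001 | < 0.001 | 0.002 | 0.017 | - | 0.810 |
| Junior radiologist | 0.179 | < 0.001 | 0.017 | < 0.001 | < 0.001 | < 0.001 | 0.008 | 0.810 | - |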
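
Each cell of Table 5 is a P value from a DeLong test comparing the AUCs of two raters on the same cases. As a rough illustration of what that test computes, the sketch below implements the standard two-sided DeLong comparison for two correlated AUCs in plain Python. It is a minimal sketch, not the software the authors used (which the table does not name); `delong_test` is a hypothetical helper, and the labels and scores in the usage example are invented for illustration.

```python
import math

def _psi(x, y):
    """Pair kernel: 1 if the positive outranks the negative, 0.5 on ties."""
    return 1.0 if x > y else 0.5 if x == y else 0.0

def _cov(u, v):
    """Sample covariance of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def delong_test(labels, scores_a, scores_b):
    """Two-sided DeLong test; returns (auc_a, auc_b, p_value)."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    m, n = len(pos), len(neg)
    aucs, v10, v01 = [], [], []
    for s in (scores_a, scores_b):
        # Structural components: one per positive case, one per negative case.
        per_pos = [sum(_psi(s[i], s[j]) for j in neg) / n for i in pos]
        per_neg = [sum(_psi(s[i], s[j]) for i in pos) / m for j in neg]
        aucs.append(sum(per_pos) / m)
        v10.append(per_pos)
        v01.append(per_neg)
    var_a = _cov(v10[0], v10[0]) / m + _cov(v01[0], v01[0]) / n
    var_b = _cov(v10[1], v10[1]) / m + _cov(v01[1], v01[1]) / n
    cov_ab = _cov(v10[0], v10[1]) / m + _cov(v01[0], v01[1]) / n
    var_diff = var_a + var_b - 2 * cov_ab
    if var_diff <= 0:
        # Degenerate case (e.g. identical score vectors): no estimable variance.
        return aucs[0], aucs[1], 1.0 if aucs[0] == aucs[1] else 0.0
    z = (aucs[0] - aucs[1]) / math.sqrt(var_diff)
    # Two-sided normal tail probability.
    return aucs[0], aucs[1], math.erfc(abs(z) / math.sqrt(2))

# Invented toy data: 3 malignant (1) and 3 benign (0) cases, two raters.
labels = [1, 1, 1, 0, 0, 0]
auc_a, auc_b, p = delong_test(labels,
                              [0.9, 0.8, 0.7, 0.3, 0.2, 0.1],
                              [0.9, 0.2, 0.7, 0.8, 0.3, 0.1])
print(auc_a, round(auc_b, 3), round(p, 3))
```

The pairwise loops are quadratic in the number of cases, which is fine for cohorts of the size reported here; faster rank-based formulations exist for larger data.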