Systematic Reviews
Copyright ©The Author(s) 2025.
World J Gastrointest Surg. Aug 27, 2025; 17(8): 109463
Published online Aug 27, 2025. doi: 10.4240/wjgs.v17.i8.109463
Table 6 Artificial intelligence based and robotics enhanced surgical education and performance assessment
| Ref. | Surgical procedure/task | AI method/model | Assessment modality | Data type/source | Performance metrics | Educational outcome |
| --- | --- | --- | --- | --- | --- | --- |
| Garfinkle et al[53], 2022 | Gastrointestinal and endoscopic surgery (priority setting) | | Survey/Delphi | Survey data from SAGES member surgeons | | Identified core needs: Video training, tech adoption |
| Huo et al[54], 2024 | Surgical decision making for GERD | ChatGPT 3.5/4, Copilot, Google Bard, Perplexity AI | Prompt based guideline comparison | Standardized clinical vignettes based on SAGES guidelines | Surgeons (accuracy): Bard 6/7 (85.7%), ChatGPT 4 5/7 (71.4%); patients (accuracy): Bard 4/5 (80.0%), ChatGPT 4 3/5 (60.0%); children: Copilot & Bard 3/3 (100.0%) | Revealed inconsistencies in LLM advice; need for medical domain training |
| Huo et al[55], 2024 | Surgical management of GERD | Generic ChatGPT 4 vs customized GPT (GTS) | Prompt based guideline comparison | 60 surgeon cases and 40 patient cases based on SAGES and UEG EAES guidelines | GTS (custom GPT): 100% accuracy for both surgeons (60/60) and patients (40/40); generic GPT 4: 66.7% (40/60) for surgeons, 47.5% (19/40) for patients | Demonstrated impact of domain customization in LLMs |
| Nasir et al[56], 2021 | Robotic rectal cancer surgery | No AI model used (robotics only) | | RCTs, observational studies, registry data | Reduced conversion to open surgery (especially in obese/male patients); improved urogenital function; no difference in long term oncologic outcomes | Reinforced need for structured robotic training (e.g., EARCS) |