Retrospective Study
Copyright ©The Author(s) 2025. Published by Baishideng Publishing Group Inc. All rights reserved.
Artif Intell Cancer. Jun 8, 2025; 6(1): 106356
Published online Jun 8, 2025. doi: 10.35713/aic.v6.i1.106356
Can ChatGPT and DeepSeek help cancer patients: A comparative study of artificial intelligence models in clinical decision support
Meng Sun, Jun Yu, Jing-Wen Zhou, Ming Ye, Fang Ye, Mei Ding
Meng Sun, Jun Yu, Jing-Wen Zhou, Ming Ye, Fang Ye, Mei Ding, Nanjing University of Chinese Medicine, Nanjing 210023, Jiangsu Province, China
Co-first authors: Meng Sun and Fang Ye.
Author contributions: Sun M and Ye F conceived and designed the study, supervised the research, and critically revised the manuscript for intellectual content; Yu J and Zhou JW contributed to data acquisition, curation, and preprocessing from TCGA and institutional databases; Ye M performed the statistical analyses (ANOVA with post hoc Tukey tests) and interpreted the results; Ding M conducted the model evaluations, including guideline compliance and readability assessments; all authors participated in drafting the manuscript, reviewed the final version, and approved its submission; Ye F, as the corresponding author, coordinated the interdisciplinary collaboration and ensured adherence to ethical standards.
Institutional review board statement: As this retrospective analysis utilized fully anonymized data from The Cancer Genome Atlas (TCGA) and institutional databases, with no direct interaction with patients or access to identifiable information, the requirement for informed consent was waived in compliance with the Declaration of Helsinki and national ethical guidelines for secondary use of de-identified clinical data. All data handling adhered to institutional and TCGA data-use policies to ensure patient confidentiality and privacy.
Informed consent statement: Informed consent was not required because the study was retrospective and used only fully de-identified data, with no access to identifiable patient information.
Conflict-of-interest statement: All authors declare no competing financial interests.
Data sharing statement: Not applicable.
Open Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/licenses/by-nc/4.0/
Corresponding author: Fang Ye, Professor, Nanjing University of Chinese Medicine, No. 138 Xianlin Avenue, Qixia District, Nanjing 210023, Jiangsu Province, China. sunareyouok@163.com
Received: February 24, 2025
Revised: April 2, 2025
Accepted: April 27, 2025
Published online: June 8, 2025
Processing time: 103 Days and 0.1 Hours
Abstract
BACKGROUND

Cancer care faces challenges due to tumor heterogeneity and rapidly evolving therapies, necessitating artificial intelligence (AI)-driven clinical decision support. While general-purpose models like ChatGPT offer adaptability, domain-specific systems (e.g., DeepSeek) may better align with clinical guidelines. However, their comparative efficacy in oncology remains underexplored. This study hypothesizes that domain-specific AI will outperform general-purpose models in technical accuracy, while the latter will excel in patient-centered communication.

AIM

To compare ChatGPT and DeepSeek in oncology decision support for diagnosis, treatment, and patient communication.

METHODS

A retrospective analysis was conducted using 1200 anonymized oncology cases (2018–2023) from The Cancer Genome Atlas (TCGA) and institutional databases, covering six cancer types. Each case included histopathology, imaging, genomic profiles, and treatment histories. Both models generated diagnostic interpretations, staging assessments, and therapy recommendations. Performance was evaluated against National Comprehensive Cancer Network (NCCN)/European Society for Medical Oncology (ESMO) guidelines and expert oncologist panels using F1-scores, Cohen's κ, Likert-scale ratings, and readability metrics. Statistical significance was assessed via analysis of variance (ANOVA) and post hoc Tukey tests.
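
The paper does not publish its evaluation code; the following is a minimal sketch of how the metrics and tests named above can be computed with scikit-learn, SciPy, and statsmodels. All variable names and the toy labels/ratings below are hypothetical stand-ins for the 1200-case dataset.

```python
# Minimal illustrative sketch only: the study's actual evaluation pipeline
# is not released. Toy arrays stand in for per-case concordance labels
# (1 = guideline-concordant, 0 = discordant) and Likert ratings.
import numpy as np
from sklearn.metrics import f1_score, cohen_kappa_score
from scipy.stats import f_oneway
from statsmodels.stats.multicomp import pairwise_tukeyhsd

reference = np.array([1, 1, 0, 1, 0, 1, 1, 0])  # expert-panel ground truth
chatgpt = np.array([1, 0, 0, 1, 1, 1, 0, 0])    # hypothetical model outputs
deepseek = np.array([1, 1, 0, 1, 0, 1, 1, 1])

# F1 against the reference standard; Cohen's kappa as chance-corrected agreement
print("F1  ChatGPT :", f1_score(reference, chatgpt))
print("F1  DeepSeek:", f1_score(reference, deepseek))
print("kappa ChatGPT :", cohen_kappa_score(reference, chatgpt))
print("kappa DeepSeek:", cohen_kappa_score(reference, deepseek))

# One-way ANOVA across hypothetical Likert ratings (1-5) from three rater
# groups, followed by a post hoc Tukey HSD test on the pooled scores
ratings_a = [4, 5, 4, 3, 4]
ratings_b = [3, 4, 3, 3, 4]
ratings_c = [5, 4, 4, 5, 4]
print(f_oneway(ratings_a, ratings_b, ratings_c))
print(pairwise_tukeyhsd(ratings_a + ratings_b + ratings_c,
                        ["a"] * 5 + ["b"] * 5 + ["c"] * 5))
```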

RESULTS

DeepSeek demonstrated superior performance in diagnostic accuracy (F1-score: 89.2% vs ChatGPT's 76.5%, P < 0.001) and treatment alignment with guidelines (κ = 0.82 vs 0.67, P = 0.003). ChatGPT exhibited strengths in patient communication, generating plain-language explanations (readability score: 8.2/10 vs DeepSeek's 6.5/10, P = 0.012). Both models showed limitations in rare cancer subtypes (e.g., cholangiocarcinoma), with accuracy dropping below 60%. Clinicians rated DeepSeek's outputs as more actionable (4.3/5 vs 3.7/5, P = 0.021) but highlighted ChatGPT's utility in palliative care discussions.
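
For reference, the two headline agreement metrics follow their standard general definitions (reproduced here for clarity; not taken from the paper itself):

```latex
\[
  F_1 = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}
             {\mathrm{precision} + \mathrm{recall}},
  \qquad
  \kappa = \frac{p_o - p_e}{1 - p_e},
\]
% p_o: observed agreement between model output and the guideline-based
% reference; p_e: agreement expected by chance alone.
```

By the commonly cited Landis and Koch benchmarks, κ = 0.82 falls in the "almost perfect" band (0.81–1.00) and κ = 0.67 in the "substantial" band (0.61–0.80).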

CONCLUSION

Domain-specific AI (DeepSeek) excels in technical precision, while general-purpose models (ChatGPT) enhance patient engagement. A hybrid system integrating both approaches may optimize oncology workflows, contingent on expanded training for rare cancers and real-time guideline updates.

Keywords: Artificial intelligence; Clinical decision support; Oncology; ChatGPT; DeepSeek; Precision medicine

Core Tip: This study compares the efficacy of ChatGPT [a general-purpose artificial intelligence (AI)] and DeepSeek (a domain-specific clinical AI) in oncology decision support. DeepSeek demonstrated superior diagnostic accuracy (F1-score: 89.2% vs 76.5%) and guideline adherence (Cohen’s κ: 0.82 vs 0.67), while ChatGPT excelled in patient communication (readability score: 8.2/10 vs 6.5/10). Both models underperformed in rare cancer subtypes (F1 < 60%), highlighting the need for hybrid systems integrating technical precision and patient-centered communication. This work advocates for AI models tailored to oncology’s heterogeneous demands, with dynamic updates to address evolving clinical guidelines and rare malignancies.