Machine learning models and over-fitting considerations

doi:10.3748/wjg.v28.i5.605

Journal Information of This Article

Publication Name: World Journal of Gastroenterology
ISSN: 1007-9327
Publisher: Baishideng Publishing Group Inc, 7041 Koll Center Parkway, Suite 160, Pleasanton, CA 94566, USA

Letter to the Editor

World J Gastroenterol. Feb 7, 2022; 28(5): 605-607
Published online Feb 7, 2022. doi: 10.3748/wjg.v28.i5.605


Paris Charilaou, Robert Battat

Paris Charilaou, Robert Battat, Jill Roberts Center for Inflammatory Bowel Disease - Division of Gastroenterology & Hepatology, Weill Cornell Medicine, New York, NY 10021, United States

Author contributions: Charilaou P and Battat R drafted and edited the manuscript, and reviewed the intellectual content.

Conflict-of-interest statement: The authors have no conflict of interest to declare.

Open-Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/

Corresponding author: Robert Battat, MD, Assistant Professor, Jill Roberts Center for Inflammatory Bowel Disease - Division of Gastroenterology & Hepatology, Weill Cornell Medicine, 1315 York Avenue, New York, NY 10021, United States. rob9175@med.cornell.edu

Received: October 26, 2021
Peer-review started: October 26, 2021
First decision: December 27, 2021
Revised: December 29, 2021
Accepted: January 14, 2022
Article in press: January 14, 2022
Published online: February 7, 2022
Processing time: 90 Days and 12.8 Hours

Abstract

Machine learning models may outperform traditional statistical regression algorithms for predicting clinical outcomes. Proper validation when building such models and tuning their underlying algorithms is necessary to avoid over-fitting and poor generalizability, to which smaller datasets are more prone. In an effort to educate readers interested in artificial intelligence and model-building based on machine-learning algorithms, we outline important details on cross-validation techniques that can enhance the performance and generalizability of such models.

Keywords: Machine learning; Over-fitting; Cross-validation; Hyper-parameter tuning

Core Tip: Machine learning models are increasingly being used in clinical medicine to predict outcomes. Proper validation techniques for these models are essential to avoid over-fitting and poor generalization to new data.
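To make the over-fitting concern concrete, the following is a minimal sketch of k-fold cross-validation in plain Python. Nothing in it comes from the letter itself: the synthetic dataset, the 30% label-noise rate, the 1-nearest-neighbour classifier and all function names are assumptions chosen only for illustration. It compares accuracy measured on the data the model was fitted to with accuracy measured on held-out folds.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def predict_1nn(train_x, train_y, x):
    """Return the label of the training point closest to x."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def accuracy(train_x, train_y, eval_x, eval_y):
    """Fraction of evaluation points whose predicted label matches the true one."""
    hits = sum(predict_1nn(train_x, train_y, x) == y for x, y in zip(eval_x, eval_y))
    return hits / len(eval_x)


def cross_validate(xs, ys, k=5, seed=0):
    """Return (mean training accuracy, mean held-out accuracy) over k folds."""
    train_scores, held_out_scores = [], []
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        test_x = [xs[i] for i in fold]
        test_y = [ys[i] for i in fold]
        # Score the same fitted model on the data it saw and on the data it did not.
        train_scores.append(accuracy(train_x, train_y, train_x, train_y))
        held_out_scores.append(accuracy(train_x, train_y, test_x, test_y))
    return mean(train_scores), mean(held_out_scores)


def make_noisy_data(n_samples=60, flip_rate=0.3, seed=1):
    """Hypothetical data: label is 1 when x > 0.5, then flipped with probability flip_rate."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_samples)]
    ys = [int(x > 0.5) ^ int(rng.random() < flip_rate) for x in xs]
    return xs, ys


xs, ys = make_noisy_data()
train_acc, held_out_acc = cross_validate(xs, ys, k=5)
print(f"training accuracy: {train_acc:.2f}")
print(f"held-out accuracy: {held_out_acc:.2f}")
```

In this sketch the training accuracy is perfect by construction, because each training point is its own nearest neighbour and the model simply memorizes the noisy labels. The held-out accuracy is the more honest estimate of how the model would behave on new data. If hyper-parameters were tuned as well, the tuning would need its own inner validation loop or a separate held-out set so that the reported estimate is not itself over-fitted.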