Minireviews Open Access
Copyright ©The Author(s) 2023. Published by Baishideng Publishing Group Inc. All rights reserved.
Artif Intell Gastroenterol. Jun 8, 2023; 4(1): 1-9
Published online Jun 8, 2023. doi: 10.35712/aig.v4.i1.1
Big data and variceal rebleeding prediction in cirrhosis patients
Quan Yuan, Department of Gastroenterology, The First Affiliated Hospital of Chongqing Medical University, Chongqing 400042, China
Wen-Long Zhao, College of Medical Informatics, Chongqing Medical University, Chongqing 400016, China
Wen-Long Zhao, Medical Data Science Academy, Chongqing 400016, China
Wen-Long Zhao, Chongqing Engineering Research Centre for Clinical Big-data and Drug Evaluation, Chongqing 400016, China
Bo Qin, Department of Infectious Diseases, The First Affiliated Hospital of Chongqing Medical University, Chongqing 400042, China
ORCID number: Quan Yuan (0000-0001-7761-4113); Bo Qin (0000-0002-7802-2854).
Author contributions: Yuan Q selected the topic and performed the majority of conception, writing, and revision of the manuscript; Zhao WL provided think tank, platform with regard to big data, site for academic discussion, and revision suggestions for the manuscript; Qin B provided administrative help and was the instigator and coordinator of the study; All authors have read and approved the final manuscript.
Conflict-of-interest statement: All the authors declare that they have no conflicts of interest.
Open-Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See:
Corresponding author: Bo Qin, MD, Professor, Department of Infectious Diseases, The First Affiliated Hospital of Chongqing Medical University, No. 1 Youyi Road, Yuzhong District, Chongqing 400042, China.
Received: January 8, 2023
Peer-review started: January 8, 2023
First decision: January 21, 2023
Revised: February 3, 2023
Accepted: March 10, 2023
Article in press: March 10, 2023
Published online: June 8, 2023


Big data has clear merits for developing risk stratification strategies for diseases. The 6 “V”s of big data, namely, volume, velocity, variety, veracity, value, and variability, have shown promise in real-world scenarios. Big data can be applied to analyze health data and to advance research in preclinical biology and medicine, especially research on disease initiation, development, and control. A study design comprises data selection, inclusion and exclusion criteria, standard confirmation and cohort establishment, follow-up strategy, and events of interest. The development and efficiency verification of a prognosis model consist of deciding on the data source, selecting candidate predictors with previous models as references, choosing appropriate statistical methods, assessing model performance, and optimizing the model. The model should be able to inform disease development and outcomes, such as predicting variceal rebleeding in patients with cirrhosis. Compared with previous studies, our work has advantages in cirrhosis patient screening and in the data source for variceal bleeding.

Key Words: Big data, Disease onset, Prognosis, Modeling, Cirrhosis, Gastrointestinal rebleeding

Core Tip: Big data have been applied in many fields including finance, traffic control, logistics, healthcare, and environmental protection. Modeling is an efficient method for completing various tasks, and verifying model validity is vital for ensuring high-quality operation and satisfactory results. Predictor screening underpins the establishment of a practical, convenient, and favorable model for prognosis prediction. A regression model trained on large amounts of data mined from real-world hospital big data can inform the onset and prognosis of a disease or condition such as variceal rebleeding, which is one of the leading causes of death in cirrhosis patients.


Many risk stratification strategies for diseases depend mainly on single-/medium-sized cohort studies or meta-analyses of them[1,2], with lead-time bias taken into consideration[3,4]. Such studies are, by design, well scheduled and well phenotyped, but the sampled population is selective and may not reflect the real-world, pan-subject profile. Real-world patients may have comorbidities, take concomitant medications, be excluded from short-term follow-up, or comply poorly with treatment. Data acquired directly from basic healthcare institutions and cohorts are more representative than limited sampling.


Although the use of large volumes of data in the medical field has a relatively long history[5-7], the term “big data” appeared only in the 1990s and quickly became popular[8-10]. “Big” is a relative term, especially when it relates to data. Big data usually refers to datasets that exceed the capabilities of commonly used software tools to store, manage, and process them within a suitable period of time[11]. The term is described by 315 characteristics[12] and fundamentally by the 6 “V”s: volume, velocity, variety, veracity, value, and variability[13-17] (Figure 1).

Figure 1  Six “V”s of big data.

Over the past decade, methods for collecting, storing, and managing big data have evolved[18-20]. We are now entering an era in which wearable devices monitor health changes through clinical indicators such as vital signs, serum glucose, lipids, sweating, and bladder fullness[11]. Variation in these indicators reflects physiological change, and constant variation or altered levels may indicate different pathological states. Here, we review the applications of big data in predicting disease onset and prognosis, especially variceal rebleeding prediction in cirrhosis patients.


Applications of big data include its use as a tool to monitor the onset of conditions and diseases. Big data have been used for this purpose in relation to hypertension[21], pediatric oncology[22], oral care[23], general practice[24], rheumatic diseases[25], renal diseases[26], mechanical ventilation management in the intensive care unit[27], and cirrhosis and hepatocellular carcinoma morbidity in the nonalcoholic fatty liver disease/nonalcoholic steatohepatitis population[28]. The commencement, development, and control of diseases can be studied and visualized using big data techniques, which is a promising and beneficial approach. In pediatric oncology, for example, large collaborative datasets created with the help of big data can lay a more solid foundation for robust data sharing and scientific discovery in predicting cancer onset. Registry-based research, by contrast, is one of the conventional research methods for pediatric cancers. Such studies have used multisite registries of pediatric patients and cover descriptive epidemiology, survivors, genomics, new registry description, data harmonization, palliative and supportive care, radiology, consensus guidelines, hereditary pediatric cancer, electronic health records, and prospective clinical trials. The limitations of registry-based research include coverage of only the most recent publication period, reliance on a single publication database, and inclusion of only those studies and registries that have yielded published, peer-reviewed papers[22]. With this study strategy, data cannot be automatically mined, cleaned, and integrated to refine an existing study. When new subjects are added, the statistical analysis must be redone, whereas modeling and machine learning in the big data scenario can perform the whole analysis process.

Healthcare data in some regions are complete and accessible for analysis. Real-world data from community primary healthcare facilities in European countries are a good resource because primary healthcare is state-covered and there are few or no co-payments. Healthcare information and data are therefore collected and stored by state-run big data centers. Most residents are registered at birth and have their complete healthcare information in electronic form, which can be accessed by regional practitioners and analyzed for real-world application scenarios[29]. However, the numerous parameters, especially administrative data, that are mined from patients’ inpatient and outpatient Hospital Information System/Electronic Medical Record systems via various algorithms carry a risk of information leakage and privacy breaches. Preliminary selection of data, especially low-dimensional administrative data, is therefore preferable to reduce information leakage and privacy invasion.

Big data boosts the depth and breadth of research in fundamental biology and clinical medicine. Impressive progress has already resulted, including in exome sequencing[30], genomics, and proteomics. Taking the coronavirus disease 2019 pandemic as an example, primary research, clinical treatment practices, and even trends in media campaigns on whether to impose lockdowns and adopt an active nucleic acid testing policy can be swiftly analyzed with big data tools to assist epidemic control[31].


Study design comprises data source selection, inclusion and exclusion criteria, standard confirmation and cohort establishment, follow-up strategy, and events of interest. To prevent outbreaks of hospital infections, a multicountry European real-world study acquired patient data for a set research period through an electronic health record data repository, which drew on central transcription, laboratories, pharmacy offices, medical insurance departments, administrative departments, and other departmental databases, together with molecular typing from molecular biology laboratories[32]. Charts can be used to analyze and interpret descriptive data. The FIB-4 score (age, aspartate aminotransferase, alanine aminotransferase, and platelets), which is composed entirely of non-invasive parameters, has been used to detect early liver fibrosis[28].
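As a worked illustration of a score built entirely from routine, non-invasive parameters, the sketch below computes FIB-4 in its commonly published form, FIB-4 = (age × AST) / (platelets × √ALT), with age in years, AST and ALT in U/L, and platelet count in 10^9/L. The function name, argument names, and input check are assumptions made for illustration and are not taken from the cited study[28].

```python
import math


def fib4_score(age_years: float, ast_u_per_l: float, alt_u_per_l: float,
               platelets_1e9_per_l: float) -> float:
    """Return the FIB-4 index: (age x AST) / (platelets x sqrt(ALT))."""
    if min(age_years, ast_u_per_l, alt_u_per_l, platelets_1e9_per_l) <= 0:
        raise ValueError("all inputs must be positive")
    return (age_years * ast_u_per_l) / (platelets_1e9_per_l * math.sqrt(alt_u_per_l))


# Hypothetical patient: 50 years old, AST 40 U/L, ALT 25 U/L, platelets 200 x 10^9/L
example = fib4_score(50, 40, 25, 200)  # (50 * 40) / (200 * 5) = 2.0
```

Interpretation cutoffs vary between publications and populations, so none are hard-coded here.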


Researchers have performed extensive work on the development and efficiency verification of disease onset and prognosis models. Model development is the process of collecting the vital parameters (risk factors) that bear on an outcome and combining them, each with its own weight coefficient, into a weighted function. This requires identifying the predominant predictors from a large number of preselected candidate predictors, assigning a proper weight to each predictor to obtain a combined risk score, and assessing the model’s predictive performance with statistical methods such as a calibration plot. Performance assessment covers calibration, discrimination, and (re)classification properties, together with an assessment of the model’s potential for generalization using internal validation techniques and, if necessary, optimization of the model to avoid overfitting. Data sources should preferably be prospective cohort(s) with a randomized controlled trial design or real-world medical record data. Preferred outcomes are those that matter to patients or individuals, such as remission time and follow-up period. Methods for outcome verification should be reported, and blinded verification is preferred.
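The weighted function and two of the performance properties named above can be sketched in a few lines. This is a minimal illustration, assuming a logistic link for the combined risk score; the predictor names, weights, and intercept are hypothetical and are not drawn from any cited model.

```python
import math
from typing import Mapping, Sequence


def combined_risk(predictors: Mapping[str, float], weights: Mapping[str, float],
                  intercept: float) -> float:
    """Weighted sum of predictors mapped to a probability with the logistic function."""
    linear = intercept + sum(weights[name] * predictors[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-linear))


def c_statistic(risks: Sequence[float], events: Sequence[int]) -> float:
    """Discrimination: share of (event, non-event) pairs in which the event case
    received the higher predicted risk; ties count as half."""
    pos = [r for r, e in zip(risks, events) if e == 1]
    neg = [r for r, e in zip(risks, events) if e == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return concordant / (len(pos) * len(neg))


def calibration_in_the_large(risks: Sequence[float], events: Sequence[int]) -> float:
    """Observed event rate minus mean predicted risk (0 means agreement on average)."""
    return sum(events) / len(events) - sum(risks) / len(risks)


# Hypothetical weights and one hypothetical patient (illustrative values only).
WEIGHTS = {"albumin_g_per_l": -0.08, "creatinine_mg_per_dl": 0.6}
risk = combined_risk({"albumin_g_per_l": 30.0, "creatinine_mg_per_dl": 1.2},
                     WEIGHTS, intercept=1.0)

auc = c_statistic([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])  # 3 of 4 pairs concordant -> 0.75
```

A calibration plot extends the last function by comparing observed and predicted risk within groups of patients rather than over the whole cohort.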

Regarding the selection of candidate predictors, a surplus of candidates should be defined and analyzed before a subset is included in the final model. Incorporation bias should be avoided by blinding. Data quality control, missing data processing, continuous predictor modeling, final model development, relative weight assignment for each predictor, and internal validation are essential in the process of creating a final prediction model[33].

Choosing appropriate statistical methods during model establishment is vital to guarantee reliability and validity. Regression analysis, including univariate and multivariate regression, is the most commonly used statistical method, especially Cox regression[34] and LASSO[35]. The hazard ratio is used to compare cohorts across different conditions and coefficients. Decision curve analysis, which uses net benefit and threshold probability to support more convenient yet reliable clinical decision making, has been used to evaluate whether or not to use a certain prediction model[36]. In this approach, the theoretical relationship between the threshold probability of a disease (the probability that the disease will occur) and the relative frequency of false positives and false negatives is examined to ensure the validity of a prediction model.
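For reference, the standard textbook forms of the regression methods named above can be written as follows. The notation (baseline hazard h_0, coefficient vector β, covariates x, partial log-likelihood ℓ, penalty λ) is introduced here for illustration and is not taken from the cited studies.

```latex
% Cox proportional hazards model for a covariate vector x
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right)

% Hazard ratio for a one-unit increase in predictor x_j, other predictors held fixed
\mathrm{HR}_j = \exp(\beta_j)

% LASSO-penalized Cox regression: partial log-likelihood minus an L1 penalty
\hat{\beta} = \arg\max_{\beta}\; \ell(\beta) - \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert ,
\qquad \lambda \ge 0
```

A larger λ shrinks more coefficients exactly to zero, which is how LASSO performs predictor selection.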

The benefit of decision curve analysis is that it indicates whether a model can be easily and effectively applied in clinical situations. Its ability to compare several different models for the same issue is another advantage[37]. The risk threshold, or “T value,” has been used to study treatment decisions in risk models. The harm-to-benefit ratio is tied to the T value: a higher threshold implies a higher accepted ratio of harm to benefit. Balancing all benefits and harms in different scenarios is key to determining which T value is reasonable[38]. The net benefit (NB) value, which is the combined “net” effect of the true positives and false positives, was introduced to evaluate the potential clinical application of an estimating tool or a risk-predicting model. Setting the decision threshold range in modeling is important because it is the boundary that determines whether a patient is judged positive for a disease[39]. However, NB does not directly account for the harms and costs of acquiring the predictors for the chosen model. The focus of NB is to derive the best tradeoff between sufficient indicators and convenience in clinical application[40].
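The net benefit calculation can be sketched as follows, assuming the usual decision curve definition NB = TP/n − (FP/n) × T/(1 − T), where T is the threshold probability. The function names and the toy data are hypothetical and serve only to show the arithmetic.

```python
from typing import Sequence


def net_benefit(risks: Sequence[float], events: Sequence[int], threshold: float) -> float:
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(events)
    true_pos = sum(1 for r, e in zip(risks, events) if r >= threshold and e == 1)
    false_pos = sum(1 for r, e in zip(risks, events) if r >= threshold and e == 0)
    harm_to_benefit = threshold / (1.0 - threshold)  # odds form of the threshold
    return true_pos / n - (false_pos / n) * harm_to_benefit


def net_benefit_treat_all(events: Sequence[int], threshold: float) -> float:
    """Reference strategy: treat everyone regardless of predicted risk."""
    return net_benefit([1.0] * len(events), events, threshold)


# Hypothetical predicted risks and observed outcomes (1 = rebleeding)
risks = [0.9, 0.6, 0.7, 0.2, 0.1]
events = [1, 1, 0, 0, 0]
nb_model = net_benefit(risks, events, 0.5)    # 2/5 - (1/5) * 1 = 0.2
nb_all = net_benefit_treat_all(events, 0.5)   # 2/5 - (3/5) * 1 = -0.2
```

Repeating the calculation over a range of thresholds and plotting the model against the treat-all and treat-none (NB = 0) strategies yields the decision curve.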

Model optimization should be conducted to reduce the number of predictors and avoid an unmanageable dataset or workload. AMSGrad, a putatively optimal method for optimization far from the minimum, is commonly used because of its low cost. Because AMSGrad has long convergence tails, switching to a direct linear method near the end of the optimization shortens convergence[41]. Among multiobjective racing algorithms with fixed confidence, SPRINT-Race was the first to be developed; it uses a nonparametric, ternary-decision, dual-sequential probability ratio test to infer a pairwise dominance or nondominance relationship. To minimize the computational effort while ensuring the quality of the models returned, it sequentially applies Holm’s step-down family-wise error rate control method, which restricts the probability of mistakenly eliminating any Pareto-optimal model or returning any clearly dominated model to a prespecified confidence level[42]. The quantification of model-to-data correspondence is pivotal for measuring a model’s performance and its future application to the problem at hand. The Drosophila melanogaster gap gene system model demonstrated the importance of error quantification, and the approach is applicable to a wide array of developmental modeling studies[43]. The support vector machine, GLM-Net, generalized linear model, partial least squares, neural network, k-nearest neighbors, random forest, and boosted tree are useful tools for establishing models to predict prognosis in patients with breast cancer[44]. Comparing their performance and optimizing the models where necessary can lead to better and more efficient application in practice.
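Holm’s step-down procedure, mentioned above in connection with SPRINT-Race, can be illustrated on its own. The sketch below is the generic textbook procedure applied to a list of p-values; it does not reproduce the SPRINT-Race algorithm itself, and the function name and example p-values are hypothetical.

```python
from typing import List, Sequence


def holm_step_down(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Return, for each hypothesis, whether Holm's procedure rejects it."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down: stop at the first non-rejection
    return reject


decisions = holm_step_down([0.01, 0.04, 0.03, 0.005], alpha=0.05)
# Thresholds are 0.0125, 0.0167, 0.025, 0.05: the two smallest p-values pass,
# 0.03 fails against 0.025, and testing stops there.
```

Because each threshold is at least as strict as the unadjusted level, the chance of any false rejection across the whole family stays at or below alpha.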


Researchers have proposed methods for predictor screening with regard to disease prognosis, such as the Model for End-stage Liver Disease (MELD) for cirrhosis-related mortality prediction and the APACHE model for critically ill patients. The clinical data of cirrhosis patients who had an early admission, including clinical and socioeconomic factors, were mined from electronic medical records and classified for risk stratification in order to predict readmission within 30 d[45]. The European Organization for Research and Treatment of Cancer (EORTC) risk tables[46], which include six clinical and pathological factors (number of tumors, tumor size, prior recurrence rate, T category, carcinoma in situ, and grade), were recommended by the European Association of Urology and used to separately predict the short-term and long-term risks of progression and recurrence in an individual patient with a non-muscle-invasive bladder tumor. The tables divide patients into four groups with individual recurrence and progression scores. However, because the EORTC risk tables overestimated recurrence in all risk groups and progression in the high-risk group, the Club Urológico Español de Tratamiento Oncológico scoring model[47] was developed. The well-known new EORTC model[48], or European Association of Urology risk groups, became popular for recurrence and progression prediction; in it, tumor diameter and extent were key predictors of progression in multistate analyses. The health belief model has been used to identify factors related to prostate cancer screening among older Jordanian adults[49]. Development and validation of a prediction model, including internal and external, temporal and geographical, and domain validation, as well as model revision, are all crucial for identifying predictors of prognosis[50].


Studies have reported several prediction models for variceal rebleeding in patients with cirrhosis. Risk indicators are the components of prediction models. Studies that identify possible risk indicators of variceal rebleeding among cirrhosis patients invariably require a long study period. Child-Pugh score and hepatic-venous pressure gradient are the most significant prognostic factors in stratifying the probability of variceal rebleeding[51]. Antiviral treatment significantly reduced rebleeding in patients with hepatitis B virus (HBV)-related cirrhosis. Timely prophylactic endoscopic treatment of upper gastrointestinal varices after first-time bleeding, including endoscopic varix ligation (EVL) and gastric fundus varix gluing, is important in postponing variceal rebleeding[52]. Tachycardia, high creatinine level, and low albumin level are independent factors associated with rebleeding, suggesting a potential predictive role. The translation of these variables into predictive scores may improve the prognosis of patients with variceal bleeding[53]. Pre-emptive transjugular intrahepatic portosystemic shunt was independently related to a lower rebleeding rate[54]. Albumin transfusion in patients with low albumin levels was associated with a decreased rebleeding rate[55]. Five studies showed a lower rebleeding rate after EVL or drug therapy (non-selective β-blockers ± isosorbide mononitrate), and four trials found decreased variceal rebleeding with combined therapy (EVL + non-selective β-blockers + isosorbide mononitrate)[56].

However, some interventions do not prevent rebleeding, and some indicators signal a higher rebleeding risk. A multicenter, double-blind, parallel study of 158 patients indicated that adding simvastatin to standard prophylaxis (rest, fluid restriction, preventing infection, regular endoscopic examination, anti-HBV therapy, non-selective β-blocker, etc.) did not decrease the rebleeding rate[57]. The rate of variceal rebleeding was not reduced after anticoagulation according to a single-center, prospective cohort study[58]. Worsened liver function or an insensitive hemodynamic response to non-selective β-blockers indicated an elevated rebleeding rate[51]. A Chinese study of 3289 hospitalized patients who underwent EVL indicated that male sex, Child-Pugh score > 7.2, and volume of blood vomited before EVL were independent risk indicators of early rebleeding, while albumin concentration > 31.5 g/L was a protective indicator[59]. Bacterial infection in patients with variceal bleeding was strongly associated with early rebleeding[60]. Acute-on-chronic liver failure is an independent risk factor for variceal rebleeding[54]. The presence of ascites or hepatic encephalopathy, MELD score > 12, or hepatic-venous pressure gradient > 20 mmHg indicated an elevated early (less than 6 wk) rebleeding rate[61].

The above indicators were then filtered and optimized by statistical methods, such as Cox regression or LASSO, and systematically integrated into a function with the help of programming or statistical software such as R, Python, SPSS, or SAS. This function constituted a preliminary prediction model.
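The filtering step described above can be illustrated with a minimal LASSO sketch. For brevity it uses a linear outcome and cyclic coordinate descent in plain Python rather than a penalized Cox model, and the toy data, function names, and penalty value are hypothetical; in practice an established statistical package would be used.

```python
from typing import List, Sequence


def soft_threshold(value: float, penalty: float) -> float:
    """Shrink value toward zero by penalty; return 0.0 if it falls inside the band."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X: Sequence[Sequence[float]], y: Sequence[float],
                             penalty: float, n_sweeps: int = 100) -> List[float]:
    """Minimize (1/2n) * ||y - X b||^2 + penalty * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            # Partial residual: remove the contribution of every predictor except j.
            partial = [y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j)
                       for i in range(n)]
            rho = sum(X[i][j] * partial[i] for i in range(n)) / n
            scale = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, penalty) / scale
    return beta


# Toy data: the outcome depends strongly on predictor 0 and only weakly on predictor 1.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [2.1, 1.9, -1.9, -2.1]
coefficients = lasso_coordinate_descent(X, y, penalty=0.5)
# The weak predictor is shrunk exactly to zero; the strong one is kept (about 1.5).
```

The predictors whose coefficients survive the penalty are the ones carried forward into the weighted function.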


Models predicting disease onset and prognosis play an essential and sometimes surprising role as convenient assistants in planning prophylactic, therapeutic, and follow-up strategies. Traditionally, medical data such as medical history, physical examination results, laboratory tests, and imaging and endoscopic information were integrated through doctors’ clinical judgment or into patient timelines drafted on paper to identify how the disease had progressed and to predict the possible prognosis from trends in medical indicators. Prediction models free doctors from sifting through the extensive medical data of patients with different diseases, complications, and physical, psychological, and socioeconomic situations. Doctors need only enter the prescribed parameters into the model, and the predicted onset and prognosis of a given disease are provided.

Prediction models are now extensively applied in the medical field to inform individuals and healthcare providers about the risk of developing a particular disease and its outcome, and to guide doctors toward better decisions in mitigating these risks. A recent Chinese study indicated that the MELD score and MELD-Na score, including the R score, were useful in predicting variceal rebleeding[62]. Another study indicated that the MELD-Na score model, which reflects liver function, was more efficient than the MELD model and Child-Pugh score model in predicting rebleeding among cirrhosis patients who underwent EVL.
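For orientation, the two scores compared above can be sketched from their widely published forms: MELD = 3.78 × ln(bilirubin) + 11.2 × ln(INR) + 9.57 × ln(creatinine) + 6.43, and MELD-Na = MELD + 1.32 × (137 − Na) − 0.033 × MELD × (137 − Na). The bounds applied below (laboratory values below 1.0 set to 1.0, creatinine capped at 4.0 mg/dL, sodium limited to 125-137 mmol/L) follow common convention; rounding, the dialysis adjustment, and any MELD cutoff for applying the sodium term are omitted as simplifying assumptions, and the cited studies may have used other variants.

```python
import math


def meld_score(bilirubin_mg_dl: float, inr: float, creatinine_mg_dl: float) -> float:
    """Unrounded MELD from bilirubin, INR, and creatinine."""
    bilirubin = max(bilirubin_mg_dl, 1.0)               # values below 1.0 are set to 1.0
    inr = max(inr, 1.0)
    creatinine = min(max(creatinine_mg_dl, 1.0), 4.0)   # creatinine is capped at 4.0 mg/dL
    return (3.78 * math.log(bilirubin) + 11.2 * math.log(inr)
            + 9.57 * math.log(creatinine) + 6.43)


def meld_na_score(meld: float, sodium_mmol_l: float) -> float:
    """MELD-Na: MELD adjusted for serum sodium bounded to 125-137 mmol/L."""
    sodium = min(max(sodium_mmol_l, 125.0), 137.0)
    deficit = 137.0 - sodium
    return meld + 1.32 * deficit - 0.033 * meld * deficit


# Hypothetical patient: bilirubin 2.0 mg/dL, INR 1.5, creatinine 1.2 mg/dL, sodium 130 mmol/L
meld = meld_score(2.0, 1.5, 1.2)
meld_na = meld_na_score(meld, 130.0)
```

With normal sodium the two scores coincide, so any difference in predictive performance comes from patients with hyponatremia.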


Last but not least, it is worth noting that models using low-dimensional administrative data performed favorably in big data analysis while reducing information leakage and privacy invasion. According to several studies, the models did not improve when high-resolution, privacy-invasive behavioral data were included[63]. De-ID software (De-ID Data) has been used to assign a study identification number to every enrolled patient. Therefore, the criteria for exemption from enrollment, which were included in the informed consent established by the research review board, were met[32]. The Drosophila melanogaster gap gene system is a good example of the significance of error quantification, in which model parameters were optimized against in situ immunofluorescence intensities. The approach can be applied to model development in various other fields.


Gastrointestinal (GI) rebleeding is a leading cause of mortality in patients with cirrhosis, as massive GI bleeding can induce hemorrhagic shock, disseminated intravascular coagulation, and opportunistic infections, especially pulmonary infection and spontaneous bacterial peritonitis. Reducing or postponing GI rebleeding is therefore clinically important. A handy tool that clinicians can operate on smartphones or other mobile devices to evaluate the GI rebleeding rate within seconds would be useful for risk grading. The clinician enters several common laboratory test indicators, clicks “go,” and receives the rebleeding rate and prognosis of a specific patient.

Our work has advantages over previous studies. According to our literature search on PubMed, there are no other studies on the prediction and prognosis analysis of GI rebleeding except for one article published last year indicating that the degree of liver stiffness is consistent with the GI rebleeding rate in cirrhosis patients[64]. However, that study has limitations. First, it was a prospective cohort study with only 289 patients enrolled in the final analysis, although PASS 15 was applied to calculate the statistically minimum sample size. In our ongoing study, which applies a big data platform to evaluate the rebleeding rate of cirrhosis patients, we obtained real-world data, including many more indicators, that were automatically collected from six hospitals. Second, our study included patients with rebleeding from esophageal and gastric fundus varices, which are the most common varices in cirrhosis patients, whereas the other study included only esophageal varix rebleeding. Finally, the previous study included only patients with HBV-related decompensated cirrhosis, while our data were collected from patients with alcohol-related cirrhosis, autoimmune-related cirrhosis, primary biliary cirrhosis, and lipogenic cirrhosis in addition to HBV-related cirrhosis. Following parameter filtering and modeling, our study used a visual nomogram to demonstrate correlations among risk indicators, occurrence, and prognosis of GI rebleeding. The nomogram shows clinicians all indicators and their effects on one page, so that they can easily and rapidly evaluate a patient and establish a strategy for further management and follow-up.


Regression-based modeling is popular and has vast applications in predicting disease occurrence and prognosis. However, modeling and its validation are not the ultimate objective with respect to healthcare providers’ clinical practice and patients’ health outcomes. Models need to be applied and to make clinical practice more convenient. Studies on the application and optimization of these models should be designed and conducted, focusing on the utilization of existing and updated models and their impact on the behavior and (self-)management of physicians, healthcare providers, and general individuals[65,66], especially for patients with decompensated cirrhosis at high risk of variceal rebleeding and mortality. For diagnostic and prognostic modeling with higher consistency and efficiency in predicting, treating, and following up decompensated cirrhosis, more comprehensive data and a clearer display mode are needed.


Provenance and peer review: Unsolicited article; Externally peer reviewed.

Peer-review model: Single blind

Specialty type: Gastroenterology and hepatology

Country/Territory of origin: China

Peer-review report’s scientific quality classification

Grade A (Excellent): 0

Grade B (Very good): B

Grade C (Good): 0

Grade D (Fair): D

Grade E (Poor): 0

P-Reviewer: Byeon H, South Korea; Leowattana W, Thailand; S-Editor: Liu JH; L-Editor: Filipodia; P-Editor: Liu JH

1. Koehler EM, Schouten JN, Hansen BE, van Rooij FJ, Hofman A, Stricker BH, Janssen HL. Prevalence and risk factors of non-alcoholic fatty liver disease in the elderly: results from the Rotterdam study. J Hepatol. 2012;57:1305-1311.
2. Younossi ZM, Koenig AB, Abdelatif D, Fazel Y, Henry L, Wymer M. Global epidemiology of nonalcoholic fatty liver disease-Meta-analytic assessment of prevalence, incidence, and outcomes. Hepatology. 2016;64:73-84.
3. Facciorusso A, Ferrusquía J, Muscatiello N. Lead time bias in estimating survival outcomes. Gut. 2016;65:538-539.
4. Jansen RJ, Alexander BH, Anderson KE, Church TR. Quantifying lead-time bias in risk factor studies of cancer through simulation. Ann Epidemiol. 2013;23:735-741.
5. Graunt J. Mathematical Demography. Berlin, Heidelberg: Springer, 1975: 11-20.
6. Dumbill E. A Revolution That Will Transform How We Live, Work, and Think: An Interview with the Authors of Big Data. Big Data. 2013;1:73-77.
7. Rothman KJ. Lessons from John Graunt. Lancet. 1996;347:37-39.
8. de Mauro A, Greco M, Grimaldi M. A formal definition of big data based on its essential features. Lib Rev. 2016;65:122-135.
9. Mashey JR. Big Data and the Next Wave of Infra Stress. USENIX: The Advanced Computing Systems Association. 1998. Available from:
10. Lohr S. The Origins of ‘Big Data’: An Etymological Detective Story. The New York Times. B4. 2013. Available from:
11. Mirchev M, Mircheva I, Kerekovska A. The Academic Viewpoint on Patient Data Ownership in the Context of Big Data: Scoping Review. J Med Internet Res. 2020;22:e22214.
12. Kapil G, Agrawal A, Khan RA. A Study of Big Data Characteristics. International Conference on Communication and Electronics Systems; ICCES'16; 2016 October 21-22, Coimbatore, India.
13. Nobanee H. A Bibliometric Review of Big Data in Finance. Big Data. 2021;9:73-78.
14. Beyer MA, Laney D. The Importance of 'Big Data': A Definition. Gartner Inc. June 21, 2021. Available from:
15. Tseng IL. Big data: related technologies, challenges and future prospects. Computing reviews, 2015, 56: 476-477. Available from:
16. Dobre C, Xhafa F. Intelligent services for big data science. Future Gener Comput Syst. 2014;37:267-281.
17. Owais SS, Hussein NS. Extract five categories CPIVW from the 9V’s characteristics of the big data. Int J Adv Comput Sci Appl. 2016;7:254-258.
18. O'Driscoll A, Daugelaite J, Sleator RD. 'Big data', Hadoop and cloud computing in genomics. J Biomed Inform. 2013;46:774-781.
19. Costa FF. Big data in biomedicine. Drug Discov Today. 2014;19:433-440.
20. Luo J, Wu M, Gopukumar D, Zhao Y. Big Data Application in Biomedical Research and Health Care: A Literature Review. Biomed Inform Insights. 2016;8:1-10.
21. Okada M. Big data and real-world data-based medicine in the management of hypertension. Hypertens Res. 2021;44:147-153.
22. Major A, Cox SM, Volchenboum SL. Using big data in pediatric oncology: Current applications and future directions. Semin Oncol. 2020;47:56-64.
23. Finkelstein J, Zhang F, Levitin SA, Cappelli D. Using big data to promote precision oral health in the context of a learning healthcare system. J Public Health Dent. 2020;80 Suppl 1:S43-S58.
24. Waschkau A, Wilfling D, Steinhäuser J. Are big data analytics helpful in caring for multimorbid patients in general practice? BMC Fam Pract. 2019;20:37.
25. Manrique de Lara A, Peláez-Ballestas I. Big data and data processing in rheumatology: bioethical perspectives. Clin Rheumatol. 2020;39:1007-1014.
26. Yang C, Kong G, Wang L, Zhang L, Zhao MH. Big data in nephrology: Are we ready for the change? Nephrology (Carlton). 2019;24:1097-1102.
27. Smallwood CD. Monitoring Big Data During Mechanical Ventilation in the ICU. Respir Care. 2020;65:894-910.
28. Alexander M, Loomis AK, van der Lei J, Duarte-Salles T, Prieto-Alhambra D, Ansell D, Pasqua A, Lapi F, Rijnbeek P, Mosseveld M, Waterworth DM, Kendrick S, Sattar N, Alazawi W. Risks and clinical predictors of cirrhosis and hepatocellular carcinoma diagnoses in adults with diagnosed NAFLD: real-world study of 18 million patients in four European cohorts. BMC Med. 2019;17:95.
29.  Kringos D, Boerma W, Bourgueil Y, Cartier T, Dedeu T, Hasvold T, Hutchinson A, Lember M, Oleszczyk M, Rotar Pavlic D, Svab I, Tedeschi P, Wilm S, Wilson A, Windak A, Van der Zee J, Groenewegen P. The strength of primary care in Europe: an international comparative study. Br J Gen Pract. 2013;63:e742-e750.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 219]  [Cited by in F6Publishing: 228]  [Article Influence: 25.3]  [Reference Citation Analysis (0)]
30.  Suwinski P, Ong C, Ling MHT, Poh YM, Khan AM, Ong HS. Advancing Personalized Medicine Through the Application of Whole Exome Sequencing and Big Data Analytics. Front Genet. 2019;10:49.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 93]  [Cited by in F6Publishing: 94]  [Article Influence: 23.5]  [Reference Citation Analysis (0)]
31.  Jung JH, Shin JI. Big Data Analysis of Media Reports Related to COVID-19. Int J Environ Res Public Health. 2020;17.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 9]  [Cited by in F6Publishing: 9]  [Article Influence: 3.0]  [Reference Citation Analysis (0)]
32.  Sundermann AJ, Miller JK, Marsh JW, Saul MI, Shutt KA, Pacey M, Mustapha MM, Ayres A, Pasculle AW, Chen J, Snyder GM, Dubrawski AW, Harrison LH. Automated data mining of the electronic health record for investigation of healthcare-associated outbreaks. Infect Control Hosp Epidemiol. 2019;40:314-319.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 28]  [Cited by in F6Publishing: 29]  [Article Influence: 7.3]  [Reference Citation Analysis (0)]
33.  Moons KG, Kengne AP, Woodward M, Royston P, Vergouwe Y, Altman DG, Grobbee DE. Risk prediction models: I. Development, internal validation, and assessing the incremental value of a new (bio)marker. Heart. 2012;98:683-690.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 524]  [Cited by in F6Publishing: 553]  [Article Influence: 50.3]  [Reference Citation Analysis (0)]
34.  In J, Lee DK. Survival analysis: part II - applied clinical data analysis. Korean J Anesthesiol. 2019;72:441-457.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 32]  [Cited by in F6Publishing: 30]  [Article Influence: 7.5]  [Reference Citation Analysis (0)]
35.  Tibshirani R. The lasso method for variable selection in the Cox model. Stat Med. 1997;16:385-395.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in F6Publishing: 15]  [Reference Citation Analysis (0)]
36.  Van Calster B, Wynants L, Verbeek JFM, Verbakel JY, Christodoulou E, Vickers AJ, Roobol MJ, Steyerberg EW. Reporting and Interpreting Decision Curve Analysis: A Guide for Investigators. Eur Urol. 2018;74:796-804.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 275]  [Cited by in F6Publishing: 435]  [Article Influence: 87.0]  [Reference Citation Analysis (0)]
37.  Vickers AJ, Elkin EB. Decision curve analysis: a novel method for evaluating prediction models. Med Decis Making. 2006;26:565-574.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 2176]  [Cited by in F6Publishing: 2365]  [Article Influence: 147.8]  [Reference Citation Analysis (0)]
38.  Pauker SG, Kassirer JP. Therapeutic decision making: a cost-benefit analysis. N Engl J Med. 1975;293:229-234.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 336]  [Cited by in F6Publishing: 338]  [Article Influence: 7.0]  [Reference Citation Analysis (0)]
39.  Vickers AJ, Van Calster B, Steyerberg EW. Net benefit approaches to the evaluation of prediction models, molecular markers, and diagnostic tests. BMJ. 2016;352:i6.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 412]  [Cited by in F6Publishing: 380]  [Article Influence: 54.3]  [Reference Citation Analysis (0)]
40.  Baker SG, Kramer BS. Evaluating Prognostic Markers Using Relative Utility Curves and Test Tradeoffs. J Clin Oncol. 2015;33:2578-2580.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 6]  [Cited by in F6Publishing: 6]  [Article Influence: 0.8]  [Reference Citation Analysis (0)]
41.  Sabzevari I, Mahajan A, Sharma S. An accelerated linear method for optimizing non-linear wavefunctions in variational Monte Carlo. J Chem Phys. 2020;152:024111.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 8]  [Cited by in F6Publishing: 9]  [Article Influence: 3.0]  [Reference Citation Analysis (0)]
42.  Zhang T, Georgiopoulos M, Anagnostopoulos GC. Pareto-Optimal Model Selection via SPRINT-Race. IEEE Trans Cybern. 2018;48:596-610.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 4]  [Reference Citation Analysis (0)]
43.  Hengenius JB, Gribskov M, Rundell AE, Umulis DM. Making models match measurements: model optimization for morphogen patterning networks. Semin Cell Dev Biol. 2014;35:109-123.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 13]  [Cited by in F6Publishing: 13]  [Article Influence: 1.4]  [Reference Citation Analysis (0)]
44.  Boughorbel S, Al-Ali R, Elkum N. Model Comparison for Breast Cancer Prognosis Based on Clinical Data. PLoS One. 2016;11:e0146413.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 22]  [Cited by in F6Publishing: 23]  [Article Influence: 3.3]  [Reference Citation Analysis (0)]
45.  Singal AG, Rahimi RS, Clark C, Ma Y, Cuthbert JA, Rockey DC, Amarasingham R. An automated model using electronic medical record data identifies patients with cirrhosis at high risk for readmission. Clin Gastroenterol Hepatol. 2013;11:1335-1341.e1.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 65]  [Cited by in F6Publishing: 67]  [Article Influence: 6.7]  [Reference Citation Analysis (0)]
46.  Sylvester RJ, van der Meijden AP, Oosterlinck W, Witjes JA, Bouffioux C, Denis L, Newling DW, Kurth K. Predicting recurrence and progression in individual patients with stage Ta T1 bladder cancer using EORTC risk tables: a combined analysis of 2596 patients from seven EORTC trials. Eur Urol. 2006;49:466-5; discussion 475.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 1856]  [Cited by in F6Publishing: 1660]  [Article Influence: 97.6]  [Reference Citation Analysis (0)]
47.  Fernandez-Gomez J, Madero R, Solsona E, Unda M, Martinez-Piñeiro L, Gonzalez M, Portillo J, Ojea A, Pertusa C, Rodriguez-Molina J, Camacho JE, Rabadan M, Astobieta A, Montesinos M, Isorna S, Muntañola P, Gimeno A, Blas M, Martinez-Piñeiro JA. Predicting nonmuscle invasive bladder cancer recurrence and progression in patients treated with bacillus Calmette-Guerin: the CUETO scoring model. J Urol. 2009;182:2195-2203.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 345]  [Cited by in F6Publishing: 379]  [Article Influence: 27.1]  [Reference Citation Analysis (0)]
48.  Cambier S, Sylvester RJ, Collette L, Gontero P, Brausi MA, van Andel G, Kirkels WJ, Silva FC, Oosterlinck W, Prescott S, Kirkali Z, Powell PH, de Reijke TM, Turkeri L, Collette S, Oddens J. EORTC Nomograms and Risk Groups for Predicting Recurrence, Progression, and Disease-specific and Overall Survival in Non-Muscle-invasive Stage Ta-T1 Urothelial Bladder Cancer Patients Treated with 1-3 Years of Maintenance Bacillus Calmette-Guérin. Eur Urol. 2016;69:60-69.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 302]  [Cited by in F6Publishing: 331]  [Article Influence: 41.4]  [Reference Citation Analysis (0)]
49.  Abuadas MH, Petro-Nustas W, Albikawi ZF. Predictors of Participation in Prostate Cancer Screening among Older Men in Jordan. Asian Pac J Cancer Prev. 2015;16:5377-5383.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 8]  [Cited by in F6Publishing: 9]  [Article Influence: 1.3]  [Reference Citation Analysis (0)]
50.  Moons KG, Kengne AP, Grobbee DE, Royston P, Vergouwe Y, Altman DG, Woodward M. Risk prediction models: II. External validation, model updating, and impact assessment. Heart. 2012;98:691-698.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 634]  [Cited by in F6Publishing: 659]  [Article Influence: 59.9]  [Reference Citation Analysis (0)]
51.  Magaz M, Baiges A, Hernández-Gea V. Precision medicine in variceal bleeding: Are we there yet? J Hepatol. 2020;72:774-784.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 24]  [Cited by in F6Publishing: 23]  [Article Influence: 7.7]  [Reference Citation Analysis (0)]
52.  He L, Ye X, Ma J, Li P, Jiang Y, Hu J, Yang J, Zhou Y, Liang X, Lin Y, Wei H. Antiviral therapy reduces rebleeding rate in patients with hepatitis B-related cirrhosis with acute variceal bleeding after endotherapy. BMC Gastroenterol. 2019;19:101.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 3]  [Cited by in F6Publishing: 3]  [Article Influence: 0.8]  [Reference Citation Analysis (0)]
53.  Jiménez Rosales R, Martínez-Cara JG, Vadillo-Calles F, Ortega-Suazo EJ, Abellán-Alfocea P, Redondo-Cerezo E. Analysis of rebleeding in cases of an upper gastrointestinal bleed in a single center series. Rev Esp Enferm Dig. 2019;111:189-192.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 3]  [Cited by in F6Publishing: 3]  [Article Influence: 0.6]  [Reference Citation Analysis (0)]
54.  Trebicka J, Gu W, Ibáñez-Samaniego L, Hernández-Gea V, Pitarch C, Garcia E, Procopet B, Giráldez Á, Amitrano L, Villanueva C, Thabut D, Silva-Junior G, Martinez J, Genescà J, Bureau C, Llop E, Laleman W, Palazon JM, Castellote J, Rodrigues S, Gluud L, Ferreira CN, Barcelo R, Cañete N, Rodríguez M, Ferlitsch A, Mundi JL, Gronbaek H, Hernández-Guerra M, Sassatelli R, Dell'Era A, Senzolo M, Abraldes JG, Romero-Gómez M, Zipprich A, Casas M, Masnou H, Primignani M, Weiss E, Catalina MV, Erasmus HP, Uschner FE, Schulz M, Brol MJ, Praktiknjo M, Chang J, Krag A, Nevens F, Calleja JL, Robic MA, Conejo I, Albillos A, Rudler M, Alvarado E, Guardascione MA, Tantau M, Bosch J, Torres F, Pavesi M, Garcia-Pagán JC, Jansen C, Bañares R; International Variceal Bleeding Observational Study Group and Baveno Cooperation. Rebleeding and mortality risk are increased by ACLF but reduced by pre-emptive TIPS. J Hepatol. 2020;73:1082-1091.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 72]  [Cited by in F6Publishing: 75]  [Article Influence: 25.0]  [Reference Citation Analysis (0)]
55.  Wang Z, Xie YW, Lu Q, Yan HL, Liu XB, Long Y, Zhang X, Yang JL. The impact of albumin infusion on the risk of rebleeding and in-hospital mortality in cirrhotic patients admitted for acute gastrointestinal bleeding: a retrospective study of a single institute. BMC Gastroenterol. 2020;20:198.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 9]  [Cited by in F6Publishing: 7]  [Article Influence: 2.3]  [Reference Citation Analysis (0)]
56.  Puente A, Hernández-Gea V, Graupera I, Roque M, Colomo A, Poca M, Aracil C, Gich I, Guarner C, Villanueva C. Drugs plus ligation to prevent rebleeding in cirrhosis: an updated systematic review. Liver Int. 2014;34:823-833.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 84]  [Cited by in F6Publishing: 86]  [Article Influence: 9.6]  [Reference Citation Analysis (0)]
57.  Abraldes JG, Villanueva C, Aracil C, Turnes J, Hernandez-Guerra M, Genesca J, Rodriguez M, Castellote J, García-Pagán JC, Torres F, Calleja JL, Albillos A, Bosch J; BLEPS Study Group. Addition of Simvastatin to Standard Therapy for the Prevention of Variceal Rebleeding Does Not Reduce Rebleeding but Increases Survival in Patients With Cirrhosis. Gastroenterology. 2016;150:1160-1170.e3.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 175]  [Cited by in F6Publishing: 180]  [Article Influence: 25.7]  [Reference Citation Analysis (0)]
58.  Amitrano L, Guardascione MA, Scaglione M, Menchise A, Martino R, Manguso F, Lanza AG, Lampasi F. Splanchnic vein thrombosis and variceal rebleeding in patients with cirrhosis. Eur J Gastroenterol Hepatol. 2012;24:1381-1385.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 33]  [Cited by in F6Publishing: 31]  [Article Influence: 2.8]  [Reference Citation Analysis (0)]
59.  Zhou JN, Wei Z, Sun ZQ. [Risk factors for early rebleeding after esophageal variceal ligation in patients with liver cirrhosis]. Zhonghua Gan Zang Bing Za Zhi. 2016;24:486-492.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in F6Publishing: 2]  [Reference Citation Analysis (0)]
60.  Boursier J, Asfar P, Joly-Guillou ML, Calès P. [Infection and variceal bleeding in cirrhosis]. Gastroenterol Clin Biol. 2007;31:27-38.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 3]  [Cited by in F6Publishing: 3]  [Article Influence: 0.2]  [Reference Citation Analysis (0)]
61.  Ardevol A, Alvarado-Tapias E, Garcia-Guix M, Brujats A, Gonzalez L, Hernández-Gea V, Aracil C, Pavel O, Cuyas B, Graupera I, Colomo A, Poca M, Torras X, Concepción M, Villanueva C. Early rebleeding increases mortality of variecal bleeders on secondary prophylaxis with β-blockers and ligation. Dig Liver Dis. 2020;52:1017-1025.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 5]  [Cited by in F6Publishing: 4]  [Article Influence: 1.3]  [Reference Citation Analysis (0)]
62.  Ma JL, Chen X, He LL, Wei HS, Li P.   Predictive value of child Pugh score, MELD score, MELD-Na score, APASAL score and R score in rebleeding and death of liver cirrhosis with esophagogastric varices. JCTH 2020; 36: 1278-1283. Available from:  [PubMed]  [DOI]  [Cited in This Article: ]
63.  Bjerre-Nielsen A, Kassarnig V, Lassen DD, Lehmann S. Task-specific information outperforms surveillance-style big data in predictive analytics. Proc Natl Acad Sci USA. 2021;118.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 2]  [Cited by in F6Publishing: 2]  [Article Influence: 1.0]  [Reference Citation Analysis (0)]
64.  Liu L, Liu Q, Xiao N, Zhang Y, Nie Y, Zhu X. A Liver Stiffness Measurement-Based Nomogram Predicts Variceal Rebleeding in Hepatitis B-Related Cirrhosis. Dis Markers. 2022;2022:4107877.  [PubMed]  [DOI]  [Cited in This Article: ]  [Reference Citation Analysis (0)]
65.  Sourabh D. Clinical Epidemiology: Principles, Methods and Applications for Clinical Research. D E Grobbee and A W Hoes. Int J Epidemiol. 2010;39:318-319.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 1]  [Cited by in F6Publishing: 1]  [Article Influence: 0.1]  [Reference Citation Analysis (0)]
66.  Reilly BM, Evans AT. Translating clinical research into clinical practice: impact of using prediction rules to make decisions. Ann Intern Med. 2006;144:201-209.  [PubMed]  [DOI]  [Cited in This Article: ]  [Cited by in Crossref: 508]  [Cited by in F6Publishing: 514]  [Article Influence: 30.2]  [Reference Citation Analysis (0)]