1. Pan J, Zhang Z, Peters SR, Vatanpour S, Walker RL, Lee S, Martin EA, Quan H. Cerebrovascular disease case identification in inpatient electronic medical record data using natural language processing. Brain Inform 2023; 10:22. [PMID: 37658963 PMCID: PMC10474977 DOI: 10.1186/s40708-023-00203-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Accepted: 08/14/2023] [Indexed: 09/05/2023] Open
Abstract
BACKGROUND Abstracting cerebrovascular disease (CeVD) from inpatient electronic medical records (EMRs) through natural language processing (NLP) is pivotal for automated disease surveillance and improving patient outcomes. Existing methods rely on coders' abstraction, which has time delays and under-coding issues. This study sought to develop an NLP-based method to detect CeVD using EMR clinical notes. METHODS CeVD status was confirmed through a chart review on randomly selected hospitalized patients who were 18 years or older and discharged from 3 hospitals in Calgary, Alberta, Canada, between January 1 and June 30, 2015. These patients' chart data were linked to administrative discharge abstract database (DAD) and Sunrise™ Clinical Manager (SCM) EMR database records by Personal Health Number (a unique lifetime identifier) and admission date. We trained multiple NLP predictive models by combining two clinical concept extraction methods and two supervised machine learning (ML) methods: random forest and XGBoost. Using chart review as the reference standard, we compared the model performances with those of the commonly applied International Classification of Diseases (ICD-10-CA) codes, on the metrics of sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). RESULTS Of the study sample (n = 3036), the prevalence of CeVD was 11.8% (n = 360); the median patient age was 63; and females accounted for 50.3% (n = 1528) based on chart data. Among 49 extracted clinical documents from the EMR, four document types were identified as the most influential text sources for identifying CeVD ("nursing transfer report," "discharge summary," "nursing notes," and "inpatient consultation"). The best performing NLP model was XGBoost, combining the Unified Medical Language System concepts extracted by cTAKES (e.g., top-ranked concepts, "Cerebrovascular accident" and "Transient ischemic attack"), and the term frequency-inverse document frequency vectorizer. Compared with ICD codes, the model achieved higher validity overall (values given as ICD codes vs. model): sensitivity 25.0% vs 70.0%, specificity 99.3% vs 99.1%, PPV 82.6% vs 87.8%, and NPV 90.8% vs 97.1%. CONCLUSION The NLP algorithm developed in this study performed better than the ICD code algorithm in detecting CeVD. The NLP models could result in an automated EMR tool for identifying CeVD cases and be applied for future studies such as surveillance and longitudinal studies.
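As an illustration of the four validity metrics reported in this abstract, the following minimal sketch computes them from confusion-matrix counts against a reference standard. The function name `validity_metrics` and the counts are hypothetical, chosen only for the example; they are not the study's data.

```python
def validity_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from confusion-matrix counts.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives against the reference standard.
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases that were flagged
        "specificity": tn / (tn + fp),  # share of non-cases correctly left unflagged
        "ppv": tp / (tp + fp),          # share of flagged records that are true cases
        "npv": tn / (tn + fn),          # share of unflagged records that are non-cases
    }


# Hypothetical counts for illustration only (not taken from the study).
example = validity_metrics(tp=70, fp=10, fn=30, tn=890)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because PPV and NPV depend on prevalence, the same sensitivity and specificity would give different predictive values in a sample with a different share of cases.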
Affiliation(s)
- Jie Pan
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Zilong Zhang
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Steven Ray Peters
  - Department of Clinical Neurosciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Shabnam Vatanpour
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
- Robin L Walker
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
  - Alberta Health Services, Edmonton, AB, Canada
- Seungwon Lee
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
  - Alberta Health Services, Edmonton, AB, Canada
- Elliot A Martin
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
  - Alberta Health Services, Edmonton, AB, Canada
- Hude Quan
  - Centre for Health Informatics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
  - Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
2. Escamilla-Ocañas CE, Torrealba-Acosta G, Mandava P, Qasim MS, Gutiérrez-Flores B, Bershad E, Hirzallah M, Venkatasubba Rao CP, Damani R. Implementation of systematic safety checklists in a neurocritical care unit: a quality improvement study. BMJ Open Qual 2022; 11:bmjoq-2022-001824. [PMID: 36588320 PMCID: PMC9743379 DOI: 10.1136/bmjoq-2022-001824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2022] [Accepted: 09/16/2022] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND AND OBJECTIVES Structured and systematised checklists have been shown to prevent complications and improve patient care. We evaluated the implementation of systematic safety checklists in our neurocritical care unit (NCCU) and assessed its effect on patient outcomes. DESIGN/METHODS This quality improvement project followed a Plan-Do-Study-Act (PDSA) methodology. A checklist for medication reconciliation, thromboembolic prophylaxis, glycaemic control, daily spontaneous awakening, breathing trial, diet, catheter/lines duration monitoring and antibiotics de-escalation was implemented during daily patient rounds. Main outcomes included the rate of new infections, mortality and NCCU-length of stay (LOS). Intervened patients were compared with historical controls after propensity score and Euclidean distance matching to balance baseline covariates. RESULTS After several PDSA iterations, we applied checklists to 411 patients; the overall average age was 61.34 (17.39). The main reason for admission included tumour resection (31.39%), ischaemic stroke (26.76%) and intracerebral haemorrhage (10.95%); the mean Sequential Organ Failure Assessment (SOFA) score was 2.58 (2.68). At the end of the study, the checklist compliance rate throughout the full NCCU stays reached 97.11%. After controlling for SOFA score, age, sex and primary admitting diagnosis, the implementation of systematic checklists significantly correlated with a reduced LOS (β=-0.15, 95% CI -0.24 to -0.06), reduced rate of any new infections (OR 0.59, 95% CI 0.40 to 0.87) and reduced urinary tract infections (UTIs) (OR 0.23, 95% CI 0.09 to 0.55). Propensity score and Euclidean distance matching yielded 382 and 338 pairs with excellent covariate balance. After matching, outcomes remained significant.
DISCUSSION The implementation of safety checklists in the NCCU proved feasible, easy to incorporate into the NCCU workflow, and a helpful tool to improve adherence to practice guidelines and quality of care measurements. Furthermore, our intervention resulted in a reduced NCCU-LOS, rate of new infections and rate of UTIs compared with propensity score and Euclidean distance matched historical controls.
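The Euclidean distance matching mentioned in this abstract can be sketched as a greedy 1:1 nearest-neighbour pairing without replacement. This is only an illustrative assumption about how such matching might be done, not the authors' procedure; the function name `euclidean_match`, the optional `caliper` parameter, and the toy covariate vectors are hypothetical. Covariates are assumed to be standardized beforehand so that no single variable dominates the distance.

```python
import math


def euclidean_match(treated, controls, caliper=None):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, controls: lists of equal-length covariate tuples (already standardized).
    caliper: optional maximum distance allowed for a pair.
    Returns a list of (treated_index, control_index, distance) tuples.
    """
    available = dict(enumerate(controls))  # controls not yet used in a pair
    pairs = []
    for t_idx, t in enumerate(treated):
        if not available:
            break
        # Closest remaining control for this treated patient.
        c_idx = min(available, key=lambda i: math.dist(t, available[i]))
        d = math.dist(t, available[c_idx])
        if caliper is None or d <= caliper:
            pairs.append((t_idx, c_idx, d))
            del available[c_idx]  # each control is used at most once
    return pairs


# Hypothetical standardized covariates, e.g. (age, SOFA score).
treated = [(0.1, 0.2), (1.0, 1.1)]
controls = [(1.0, 1.0), (0.0, 0.2), (5.0, 5.0)]
pairs = euclidean_match(treated, controls)
print(pairs)
```

Greedy matching depends on the order of the treated patients; optimal matching algorithms avoid this at higher computational cost.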
Affiliation(s)
- Pitchaiah Mandava
  - Neurology, Baylor College of Medicine, Houston, Texas, USA
  - Analytical Software and Engineering Research Laboratory, Michael E DeBakey VA Medical Center, Houston, Texas, USA
- Eric Bershad
  - Neurology, Baylor College of Medicine, Houston, Texas, USA
- Rahul Damani
  - Neurology, Baylor College of Medicine, Houston, Texas, USA
3. Miller MI, Orfanoudaki A, Cronin M, Saglam H, So Yeon Kim I, Balogun O, Tzalidi M, Vasilopoulos K, Fanaropoulou G, Fanaropoulou NM, Kalin J, Hutch M, Prescott BR, Brush B, Benjamin EJ, Shin M, Mian A, Greer DM, Smirnakis SM, Ong CJ. Natural Language Processing of Radiology Reports to Detect Complications of Ischemic Stroke. Neurocrit Care 2022; 37:291-302. [PMID: 35534660 PMCID: PMC9986939 DOI: 10.1007/s12028-022-01513-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 12/02/2021] [Accepted: 04/05/2022] [Indexed: 02/01/2023]
Abstract
BACKGROUND Abstraction of critical data from unstructured radiologic reports using natural language processing (NLP) is a powerful tool to automate the detection of important clinical features and enhance research efforts. We present a set of NLP approaches to identify critical findings in patients with acute ischemic stroke from radiology reports of computed tomography (CT) and magnetic resonance imaging (MRI). METHODS We trained machine learning classifiers to identify categorical outcomes of edema, midline shift (MLS), hemorrhagic transformation, and parenchymal hematoma, as well as rule-based systems (RBS) to identify intraventricular hemorrhage (IVH) and continuous MLS measurements within CT/MRI reports. Using a derivation cohort of 2289 reports from 550 individuals with acute middle cerebral artery territory ischemic strokes, we externally validated our models on reports from a separate institution as well as from patients with ischemic strokes in any vascular territory. RESULTS In all data sets, a deep neural network with pretrained biomedical word embeddings (BioClinicalBERT) achieved the highest discrimination performance for binary prediction of edema (area under precision recall curve [AUPRC] > 0.94), MLS (AUPRC > 0.98), hemorrhagic conversion (AUPRC > 0.89), and parenchymal hematoma (AUPRC > 0.76). BioClinicalBERT outperformed lasso regression (p < 0.001) for all outcomes except parenchymal hematoma (p = 0.755). Tailored RBS for IVH and continuous MLS outperformed BioClinicalBERT (p < 0.001) and linear regression, respectively (p < 0.001). CONCLUSIONS Our study demonstrates robust performance and external validity of a core NLP tool kit for identifying both categorical and continuous outcomes of ischemic stroke from unstructured radiographic text data. 
Medically tailored NLP methods have multiple important big data applications, including scalable electronic phenotyping, augmentation of clinical risk prediction models, and facilitation of automatic alert systems in the hospital setting.
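To make the idea of a rule-based system for continuous midline shift (MLS) concrete, here is a minimal regular-expression sketch. It is not the authors' system: the patterns, the function name `extract_mls_mm`, and the sample sentences are hypothetical, and the negation handling is deliberately simplistic (any "no midline shift" phrase returns 0 even if a measurement appears elsewhere in the report).

```python
import re

# Hypothetical pattern: the term "midline shift" or "MLS", up to 40 non-digit
# characters, then a number followed by a unit (mm or cm).
MLS_PATTERN = re.compile(
    r"(?:midline\s+shift|MLS)\D{0,40}?(\d+(?:\.\d+)?)\s*(mm|cm)",
    re.IGNORECASE,
)
# Hypothetical negation rule: "no (significant) midline shift".
NEGATION_PATTERN = re.compile(
    r"\bno\s+(?:significant\s+)?(?:midline\s+shift|MLS)\b",
    re.IGNORECASE,
)


def extract_mls_mm(report):
    """Return midline shift in millimetres, 0.0 if negated, None if not mentioned."""
    if NEGATION_PATTERN.search(report):
        return 0.0
    match = MLS_PATTERN.search(report)
    if match is None:
        return None
    value = float(match.group(1))
    # Convert centimetres to millimetres so all outputs share one unit.
    return value * 10 if match.group(2).lower() == "cm" else value


print(extract_mls_mm("Rightward midline shift of 6.5 mm at the septum pellucidum."))
print(extract_mls_mm("No midline shift."))
print(extract_mls_mm("Ventricles are normal in size."))
```

A production rule set would need to handle ranges, comparisons with prior studies, and measurements stated before the term, which this sketch ignores.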
Affiliation(s)
- Matthew I Miller
  - Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA
- Michael Cronin
  - Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA
- Hanife Saglam
  - Department of Neurology, West Virginia University School of Medicine, Morgantown, WV, USA
- Oluwafemi Balogun
  - Boston Medical Center, Boston, MA, USA
  - Boston University School of Public Health, Boston, MA, USA
- Maria Tzalidi
  - School of Medicine, University of Crete, Heraklion, Greece
- Nina M Fanaropoulou
  - School of Medicine, Faculty of Health Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Jack Kalin
  - Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA
- Meghan Hutch
  - Department of Preventive Medicine, Northwestern University, Chicago, IL, USA
  - Department of Neurology, Brigham and Women's Hospital, Boston, MA, USA
- Benjamin Brush
  - Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
- Emelia J Benjamin
  - Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA
  - Boston University School of Public Health, Boston, MA, USA
- Min Shin
  - Department of Computer Science, University of North Carolina at Charlotte, Charlotte, NC, USA
- Asim Mian
  - Department of Radiology, Boston Medical Center, Boston, MA, USA
- David M Greer
  - Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA
  - Boston Medical Center, Boston, MA, USA
- Stelios M Smirnakis
  - Department of Neurology, Brigham and Women's Hospital, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
  - Jamaica Plain Veterans Administration Hospital, Boston, MA, USA
- Charlene J Ong
  - Department of Neurology, Boston University School of Medicine, 85 E. Concord St., Suite 1116, Boston, MA, 02118, USA
  - Boston Medical Center, Boston, MA, USA
  - Department of Neurology, Brigham and Women's Hospital, Boston, MA, USA
  - Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
4. Taylor-Rowan M, Wilson A, Dawson J, Quinn TJ. Functional Assessment for Acute Stroke Trials: Properties, Analysis, and Application. Front Neurol 2018; 9:191. [PMID: 29632511 PMCID: PMC5879151 DOI: 10.3389/fneur.2018.00191] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Received: 12/21/2017] [Accepted: 03/12/2018] [Indexed: 11/13/2022] Open
Abstract
A measure of treatment effect is needed to assess the utility of any novel intervention in acute stroke. For a potentially disabling condition such as stroke, outcomes of interest should include some measure of functional recovery. There are many functional outcome assessments that can be used after stroke. In this narrative review, we discuss exemplars of assessments that describe impairment, activity, participation, and quality of life. We will consider the psychometric properties of assessment scales in the context of stroke trials, focusing on validity, reliability, responsiveness, and feasibility. We will consider approaches to the analysis of functional outcome measures, including novel statistical approaches. Finally, we will discuss how advances in audiovisual and information technology could further improve outcome assessment in trials.
Affiliation(s)
- Martin Taylor-Rowan
  - Institute of Cardiovascular and Medical Sciences, University of Glasgow, Glasgow, United Kingdom
- Alastair Wilson
  - Institute of Cardiovascular and Medical Sciences, University of Glasgow, Glasgow, United Kingdom
- Jesse Dawson
  - Institute of Cardiovascular and Medical Sciences, University of Glasgow, Glasgow, United Kingdom
- Terence J Quinn
  - Institute of Cardiovascular and Medical Sciences, University of Glasgow, Glasgow, United Kingdom
5. Mishra NK, Mandava P, Chen C, Grotta J, Lees KR, Kent TA. Influence of racial differences on outcomes after thrombolytic therapy in acute ischemic stroke. Int J Stroke 2014; 9:613-7. [PMID: 24148895 DOI: 10.1111/ijs.12162] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/19/2012] [Accepted: 05/22/2013] [Indexed: 12/01/2022]
Abstract
BACKGROUND The National Institute of Neurological Disorders and Stroke and the European Cooperative Acute Stroke Study III trials enrolled a largely Caucasian population, but the results are often extrapolated onto non-Caucasians. A limited number of nonrandomized studies have proposed that non-Caucasian patients show differential response to tissue plasminogen activator. AIMS AND/OR HYPOTHESIS We examined if non-Caucasian patients of mixed national origin within the Virtual International Stroke Trials Archive neuroprotection trials responded differently to tissue plasminogen activator compared with Caucasians. METHODS We matched patients within each race-subtype for age, baseline National Institutes of Health Stroke Scale, and diabetes status, and excluded outliers. We tested for an interaction of race ethnicity with tissue plasminogen activator on predicting outcomes at α = 0·05. We compared 90-day ordinal outcome (modified Rankin Scale; primary analysis) and dichotomized outcomes (modified Rankin Scale 0-1; modified Rankin Scale 0-2; survival) within individual race ethnicity. RESULTS One thousand nine hundred forty-six thrombolysed patients (125 Blacks, 39 Asians, and 1821 Caucasians) were matched with 1946 non-thrombolysed patients in each race ethnicity group. Postmatching, there were no imbalances in baseline National Institutes of Health Stroke Scale and age between the groups (P > 0·05). The interaction of tissue plasminogen activator with race ethnicity was nonsignificant in ordinal (P = 0·4) and in dichotomized outcome models (P > 0·05). Ordinal odds for improved outcomes were 1·5 for all patients (P < 0·05). Ordinal odds for Caucasians were 1·5 (P < 0·05); for Blacks, 2·1 (P < 0·05); and for Asians, 1·2 (P > 0·05; 1·6 after 1:2 matching with nonthrombolysed, because of small numbers).
Dichotomized functional outcomes improved after thrombolysis overall, in Caucasians, in Blacks (modified Rankin Scale 0-2 only), and in Asians (after 1:2 matching; P > 0·05). Odds for survival were consistent across all groups. CONCLUSIONS These results do not suggest a differential response to tissue plasminogen activator based on race ethnicity. Among Asians, data were particularly sparse, and results should be interpreted with caution.
Affiliation(s)
- Nishant K Mishra
  - Stanford Stroke Center, Stanford University Medical Center, Palo Alto, CA, USA
  - Western Infirmary and Faculty of Medicine, University of Glasgow, Glasgow, Scotland
  - Department of Neurology, University of Texas Health Sciences Centre, Houston, TX, USA
6. Hyperglycemia Worsens Outcome After rt-PA Primarily in the Large-Vessel Occlusive Stroke Subtype. Transl Stroke Res 2014; 5:519-25. [DOI: 10.1007/s12975-014-0338-x] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 10/27/2013] [Revised: 02/28/2014] [Accepted: 03/04/2014] [Indexed: 01/04/2023]
7. Quantification of errors in ordinal outcome scales using Shannon entropy: effect on sample size calculations. PLoS One 2013; 8:e67754. [PMID: 23861800 PMCID: PMC3702531 DOI: 10.1371/journal.pone.0067754] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 01/28/2013] [Accepted: 05/22/2013] [Indexed: 01/19/2023] Open
Abstract
Objective Clinical trial outcomes often involve an ordinal scale of subjective functional assessments, but the optimal way to quantify results is not clear. In stroke, for the most commonly used scale, the modified Rankin Score (mRS), analysis of the full range of scores (“Shift”) is proposed as superior to dichotomization because of greater information transfer. The influence of known uncertainties in mRS assessment has not been quantified. We hypothesized that errors caused by uncertainties could be quantified by applying information theory. Using Shannon’s model, we quantified errors of the “Shift” compared to dichotomized outcomes using published distributions of mRS uncertainties and applied this model to clinical trials. Methods We identified 35 randomized stroke trials that met inclusion criteria. Each trial’s mRS distribution was multiplied with the noise distribution from published mRS inter-rater variability to generate an error percentage for “shift” and dichotomized cut-points. For the SAINT I neuroprotectant trial, considered positive by “shift” mRS while the larger follow-up SAINT II trial was negative, we recalculated the sample size required if classification uncertainty was taken into account. Results Considering the full mRS range, error rate was 26.1%±5.31 (Mean±SD). Error rates were lower for all dichotomizations tested using cut-points (e.g. mRS 1; 6.8%±2.89; overall p<0.001). Taking errors into account, SAINT I would have required 24% more subjects than were randomized. Conclusion We show that when uncertainty in assessments is considered, the lowest error rates are with dichotomization. While using the full range of mRS is conceptually appealing, a gain of information is counter-balanced by a decrease in reliability. The resultant errors need to be considered since sample size may otherwise be underestimated. In principle, we have outlined an approach to error estimation for any condition in which there are uncertainties in outcome assessment.
We provide the user with programs to calculate and incorporate errors into sample size estimation.
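The step of multiplying an mRS distribution by an inter-rater noise distribution can be sketched as follows. This is one reading of the abstract, not the authors' published program: the function names, the toy noise matrix (a rater slips to each adjacent category with a fixed probability) and the mRS distribution are all hypothetical. Here `noise[i][j]` is the assumed probability that a patient whose true score is `i` is rated `j`.

```python
def misclassification_rates(dist, noise, cut):
    """Expected error rates for full-range ("shift") and dichotomized analysis.

    dist:  probabilities of each true mRS category (0..6), summing to 1.
    noise: noise[i][j] = probability that true category i is rated as j.
    cut:   dichotomization cut-point; "good outcome" means mRS <= cut.
    """
    n = len(dist)
    # Shift analysis: any rating that differs from the true category is an error.
    shift_error = sum(dist[i] * (1.0 - noise[i][i]) for i in range(n))
    # Dichotomized analysis: only ratings that cross the cut-point are errors.
    dichot_error = sum(
        dist[i] * noise[i][j]
        for i in range(n)
        for j in range(n)
        if (i <= cut) != (j <= cut)
    )
    return shift_error, dichot_error


def toy_noise(n=7, slip=0.1):
    """Hypothetical noise matrix: slip to each adjacent category with probability `slip`."""
    rows = []
    for i in range(n):
        row = [0.0] * n
        if i > 0:
            row[i - 1] = slip
        if i < n - 1:
            row[i + 1] = slip
        row[i] = 1.0 - sum(row)  # remaining probability: rating is correct
        rows.append(row)
    return rows


# Hypothetical mRS distribution over categories 0..6 (illustration only).
dist = [0.10, 0.15, 0.15, 0.20, 0.20, 0.10, 0.10]
shift_err, dich_err = misclassification_rates(dist, toy_noise(), cut=1)
print(f"shift error: {shift_err:.3f}, dichotomized (mRS 0-1) error: {dich_err:.3f}")
```

With this toy input the dichotomized error is lower than the full-range error, because only misratings adjacent to the cut-point change the dichotomized outcome. The toy matrix is unrealistic in places (for example, it allows slips into and out of mRS 6, death); a real analysis would use published inter-rater data.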
8. Mandava P, Murthy SB, Munoz M, McGuire D, Simon RP, Alexandrov AV, Albright KC, Boehme AK, Martin-Schild S, Martini S, Kent TA. Explicit consideration of baseline factors to assess recombinant tissue-type plasminogen activator response with respect to race and sex. Stroke 2013; 44:1525-31. [PMID: 23674524 PMCID: PMC5535075 DOI: 10.1161/strokeaha.113.001116] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 02/11/2013] [Accepted: 03/26/2013] [Indexed: 12/24/2022]
Abstract
BACKGROUND AND PURPOSE Sex and race reportedly influence outcome after recombinant tissue-type plasminogen activator (rtPA). It is, however, unclear whether baseline imbalances (eg, stroke severity) or lack of response to thrombolysis is responsible. We applied balancing methods to test the hypothesis that race and sex influence outcome after rtPA independent of baseline conditions. METHODS We mapped group outcomes from the National Institute of Neurological Disorders and Stroke (NINDS) dataset based on race and sex onto a surrogate-control function to assess differences from expected outcomes at their respective National Institutes of Health Stroke Scale and age. Outcomes were also compared for subjects matched individually on key baseline factors using NINDS and 2 recent datasets from southeastern United States. RESULTS At similar National Institutes of Health Stroke Scale and age, 90-day good outcomes (modified Rankin Score, 0-2) in NINDS were similarly improved after rtPA for white men and women. There was a strong trend for improvement in black men. Conversely, black women treated with rtPA showed response rates no different from the controls. After baseline matching, there were nonsignificant trends in outcomes except for significantly fewer good outcomes in black versus matched white women (37% versus 63%; P=0.027). Pooling the 3 datasets showed a similar trend for poorer short-term outcome for black women (P=0.054; modified Rankin Score, 0-1). CONCLUSIONS Matching for key baseline factors indicated that race and sex influence outcome most strikingly in black women who demonstrated poorest outcomes after rtPA. This finding supports the hypothesis that poor response to rtPA, rather than differences in baseline conditions, contributes to the worse outcome. This finding requires prospective confirmation.
Affiliation(s)
- Pitchaiah Mandava
  - Stroke Outcomes Laboratory, Department of Neurology, Baylor College of Medicine, The Michael E. DeBakey VA Medical Center, 2002 Holcombe Blvd (127), Houston, TX 77030, USA