1
Jian X, Zhang D, Yu Z, Xu H, Bian J, Wu Y, Tong J, Chen Y. Leveraging undecided cases in chart-reviewed phenotypes to enhance EHR-based association studies. J Biomed Inform 2025; 166:104839. [PMID: 40316004 DOI: 10.1016/j.jbi.2025.104839] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2024] [Revised: 03/25/2025] [Accepted: 04/23/2025] [Indexed: 05/04/2025]
Abstract
OBJECTIVES In electronic health record (EHR)-based association studies, phenotyping algorithms efficiently classify patient clinical outcomes into binary categories but are susceptible to misclassification errors. The gold standard, manual chart review, involves clinicians determining the true disease status based on their assessment of health records. These clinician-labeled phenotypes are labor-intensive and typically limited to a small subset of patients, potentially introducing a third "undecided" category when phenotypes are indeterminate. We aim to effectively integrate the algorithm-derived and chart-reviewed outcomes when both are available in EHR-based association studies. MATERIAL AND METHODS We propose an augmented estimation method that combines the binary algorithm-derived phenotypes for the entire cohort with the trinary chart-reviewed phenotypes for a small, selected subset. Additionally, a cost-effective outcome-dependent sampling strategy is used to address rare disease scenarios. The proposed trinary chart-reviewed phenotype integrated cost-effective augmented estimation (TriCA) was evaluated across a wide range of simulation settings and real-world applications, including EHR data on Alzheimer's disease and related dementias (ADRD) from the OneFlorida+ Clinical Research Network and cohort data on second breast cancer events (SBCE) from Kaiser Permanente Washington. RESULTS Compared to estimation based on random sampling, our augmented method improved mean square error by up to 28.3% in simulation studies; compared to estimation using only trinary chart-reviewed phenotypes, our method improved efficiency by up to 33.3% in ADRD data and 50.8% in SBCE data. DISCUSSION Our simulation studies and real-world applications demonstrate that, compared to existing methods, the proposed method provides unbiased estimates with higher statistical efficiency.
CONCLUSION The proposed method effectively combined binary algorithm-derived phenotypes for the whole cohort with trinary chart-reviewed outcomes for a limited validation set, making it applicable to a broader range of applications and enhancing risk factor identification in EHR-based association studies.
Affiliation(s)
- Xinyao Jian
- The Center for Health Analytics and Synthesis of Evidence (CHASE), University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA; Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, The University of Pennsylvania, Philadelphia, PA, USA
- Dazheng Zhang
- The Center for Health Analytics and Synthesis of Evidence (CHASE), University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA; Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, The University of Pennsylvania, Philadelphia, PA, USA
- Zehao Yu
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
- Hua Xu
- Department of Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, CT, USA
- Jiang Bian
- Department of Biostatistics and Health Data Science, School of Medicine, Indiana University, Indianapolis, IN, USA; Regenstrief Institute, Indianapolis, IN, USA
- Yonghui Wu
- Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA; Cancer Informatics Shared Resource, University of Florida Health Cancer Center, Gainesville, FL, USA
- Jiayi Tong
- The Center for Health Analytics and Synthesis of Evidence (CHASE), University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA; Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, The University of Pennsylvania, Philadelphia, PA, USA; Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Yong Chen
- The Center for Health Analytics and Synthesis of Evidence (CHASE), University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA; Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, The University of Pennsylvania, Philadelphia, PA, USA; The Graduate Group in Applied Mathematics and Computational Science, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, USA; Leonard Davis Institute of Health Economics, Philadelphia, PA, USA; Penn Medicine Center for Evidence-based Practice (CEP), Philadelphia, PA, USA; Penn Institute for Biomedical Informatics (IBI), Philadelphia, PA, USA.
2
Köse AM, Petzold P, Zocholl D, Kostoulas P, Rose M, Fischer F. Prevalence Estimation Using a Depression Screening Tool in the National Health and Nutrition Examination Survey: Comparison of Different Cutoffs. Int J Methods Psychiatr Res 2025; 34:e70019. [PMID: 40178057 PMCID: PMC11966556 DOI: 10.1002/mpr.70019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2024] [Revised: 02/25/2025] [Accepted: 03/12/2025] [Indexed: 04/05/2025] Open
Abstract
OBJECTIVES The National Health and Nutrition Examination Survey (NHANES) in the US relies on the depression screening tool PHQ-9 to assess depressive symptoms in the general population. For prevalence estimation, the PHQ-9's imperfect diagnostic accuracy can be modeled with a Bayesian Latent Class Model. We investigate the impact of different cutoffs on prevalence estimation. METHODS We used data from the 16th wave of the National Health and Nutrition Examination Survey (NHANES). We evaluated the joint posterior distribution to assess the prevalence of major depression as well as the sensitivity and specificity of the PHQ-9 at cutoffs 5 to 15. We also assessed the impact of weakly and strongly informative prevalence priors. RESULTS Data from 9693 participants of the NHANES Wave 2019-2020 were analyzed. Under weakly informative prevalence priors, prevalence estimates ranged from 16.0% (95% CrI: 0.3%-87.8%) when using a cut-off of 5 to 3.9% (0.2%-12.7%) at 13. More informative prevalence priors led to narrower credible intervals, but the observed data was still in accordance with a wide range of possible MDD prevalence estimates. CONCLUSIONS Regardless of the cutoff and the prevalence prior chosen, prevalence estimation of major depressive disorders in the NHANES based on the PHQ-9 is imprecise.
Affiliation(s)
- Ali Mertcan Köse
- Department of Computer Programming, Istanbul Ticaret University, Istanbul, Turkey
- Paul Petzold
- Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Medizinische Klinik Mit Schwerpunkt für Psychosomatik, Center for Patient‐Centered Outcomes Research, Berlin, Germany
- Dario Zocholl
- Institute of Medical Biometry, Informatics and Epidemiology, University Hospital Bonn, Bonn, Germany
- Polychronis Kostoulas
- Laboratory of Epidemiology & Artificial Intelligence, Faculty of Public & One Health, University of Thessaly, Karditsa, Greece
- Matthias Rose
- Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Medizinische Klinik Mit Schwerpunkt für Psychosomatik, Center for Patient‐Centered Outcomes Research, Berlin, Germany
- German Center for Mental Health (DZPG), Berlin, Germany
- Felix Fischer
- Charité – Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Medizinische Klinik Mit Schwerpunkt für Psychosomatik, Center for Patient‐Centered Outcomes Research, Berlin, Germany
- German Center for Mental Health (DZPG), Berlin, Germany
3
Ni H, Koop G, Klugkist I, Nielen M. Evaluation of Bayesian Hui-Walter and logistic regression latent class models to estimate diagnostic test characteristics with simulated data. Prev Vet Med 2023; 217:105972. [PMID: 37499309 DOI: 10.1016/j.prevetmed.2023.105972] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2022] [Revised: 05/08/2023] [Accepted: 06/28/2023] [Indexed: 07/29/2023]
Abstract
Estimation of the accuracy of diagnostic tests in the absence of a gold standard is an important research subject in epidemiology (Dohoo et al., 2009). One of the most used methods over the last few decades is the Bayesian Hui-Walter (HW) latent class model (Hui and Walter, 1980). However, the classic HW models aggregate the observed individual test results to the population level, and as a result, potentially valuable information from the lower level(s) is not fully incorporated. An alternative approach is the Bayesian logistic regression (LR) latent class model that allows inclusion of individual level covariates (McInturff et al., 2004). In this study, we explored both classic HW and individual level LR latent class models using Bayesian methodology within a simulation context where true disease status and true test properties were predefined. Population prevalences and test characteristics that were realistic for paratuberculosis in cattle (Toft et al., 2005) were used for the simulation. Individual animals were generated to be clustered within herds in two regions. Two tests with binary outcomes were simulated with constant test characteristics across the two regions. On top of the prevalence properties and test characteristics, one animal level binary risk factor was added to the data. The main objective was to compare the performance of Bayesian HW and LR approaches in estimating test sensitivity and specificity in simulated datasets with different population characteristics. Results from various settings showed that LR models provided posterior estimates that were closer to the true values. The LR models that incorporated herd level clustering effects provided the most accurate estimates, in terms of being closest to the true values and having smaller estimation intervals. This work illustrates that individual level LR models are in many situations preferable over classic HW models for estimation of test characteristics in the absence of a gold standard.
Affiliation(s)
- Haifang Ni
- Department Population Health Sciences, Faculty of Veterinary Medicine, Utrecht University, 3508 TD Utrecht, the Netherlands; Department of Methodology and Statistics, Faculty of Social and Behavioral Sciences, Utrecht University, 3508 TC Utrecht, the Netherlands.
- Gerrit Koop
- Department Population Health Sciences, Faculty of Veterinary Medicine, Utrecht University, 3508 TD Utrecht, the Netherlands
- Irene Klugkist
- Department of Methodology and Statistics, Faculty of Social and Behavioral Sciences, Utrecht University, 3508 TC Utrecht, the Netherlands
- Mirjam Nielen
- Department Population Health Sciences, Faculty of Veterinary Medicine, Utrecht University, 3508 TD Utrecht, the Netherlands
4
Fischer F, Zocholl D, Rauch G, Levis B, Benedetti A, Thombs B, Rose M, Kostoulas P. Prevalence estimates of major depressive disorder in 27 European countries from the European Health Interview Survey: accounting for imperfect diagnostic accuracy of the PHQ-8. BMJ MENTAL HEALTH 2023; 26:e300675. [PMID: 37024144 PMCID: PMC10083787 DOI: 10.1136/bmjment-2023-300675] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/10/2023] [Accepted: 03/06/2023] [Indexed: 04/08/2023]
Abstract
BACKGROUND Cut-offs on self-report depression screening tools are designed to identify many more people than those who meet criteria for major depressive disorder. In a recent analysis of the European Health Interview Survey (EHIS), the percentage of participants with Patient Health Questionnaire-8 (PHQ-8) scores ≥10 was reported as major depression prevalence. OBJECTIVE We used a Bayesian framework to re-analyse EHIS PHQ-8 data, accounting for the imperfect diagnostic accuracy of the PHQ-8. METHODS The EHIS is a cross-sectional, population-based survey in 27 countries across Europe with 258 888 participants from the general population. We incorporated evidence from a comprehensive individual participant data meta-analysis on the accuracy of the PHQ-8 cut-off of ≥10. We evaluated the joint posterior distribution to estimate the major depression prevalence, prevalence differences between countries and compared with previous EHIS results. FINDINGS Overall, major depression prevalence was 2.1% (95% credible interval (CrI) 1.0% to 3.8%). Mean posterior prevalence estimates ranged from 0.6% (0.0% to 1.9%) in the Czech Republic to 4.2% (0.2% to 11.3%) in Iceland. Accounting for the imperfect diagnostic accuracy resulted in insufficient power to establish prevalence differences. 76.4% (38.0% to 96.0%) of observed positive tests were estimated to be false positives. Prevalence was lower than the 6.4% (95% CI 6.2% to 6.5%) estimated previously. CONCLUSIONS Prevalence estimation needs to account for imperfect diagnostic accuracy. CLINICAL IMPLICATIONS Major depression prevalence in European countries is likely lower than previously reported on the basis of the EHIS survey.
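As a rough illustration of why accounting for imperfect accuracy lowers a prevalence estimate, the sketch below applies the classical Rogan-Gladen correction. This is a simple frequentist point estimator swapped in for illustration; it is not the Bayesian model used in the study. Only the 6.4% apparent prevalence comes from the abstract. The sensitivity and specificity values (0.85 and 0.95) are illustrative assumptions, and the function names are hypothetical.

```python
def rogan_gladen(apparent_prev, sensitivity, specificity):
    """Correct a test-positive (apparent) prevalence for imperfect accuracy."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("test must be better than chance (Se + Sp > 1)")
    estimate = (apparent_prev + specificity - 1.0) / denom
    # Truncate to the valid probability range.
    return min(1.0, max(0.0, estimate))


def false_positive_share(true_prev, sensitivity, specificity):
    """Share of positive tests that are false positives (1 - PPV)."""
    true_pos = true_prev * sensitivity
    false_pos = (1.0 - true_prev) * (1.0 - specificity)
    return false_pos / (true_pos + false_pos)


# Assumed accuracy values, for illustration only (not taken from the paper).
SE_ASSUMED, SP_ASSUMED = 0.85, 0.95
APPARENT = 0.064  # share of PHQ-8 scores >= 10 reported in the abstract

corrected = rogan_gladen(APPARENT, SE_ASSUMED, SP_ASSUMED)
fp_share = false_positive_share(corrected, SE_ASSUMED, SP_ASSUMED)
print(f"corrected prevalence: {corrected:.4f}")
print(f"false-positive share: {fp_share:.4f}")
```

Under these assumed values the corrected prevalence is about 1.75%, and roughly 77% of positive screens would be false positives. A Bayesian analysis such as the one in the abstract additionally propagates uncertainty about sensitivity and specificity, which this point estimate ignores.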
Affiliation(s)
- Felix Fischer
- Department of Psychosomatic Medicine, Center for Internal Medicine and Dermatology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Dario Zocholl
- Institute of Biometry and Clinical Epidemiology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Geraldine Rauch
- Institute of Biometry and Clinical Epidemiology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Brooke Levis
- Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec, Canada
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montréal, Québec, Canada
- Andrea Benedetti
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montréal, Québec, Canada
- Respiratory Epidemiology and Clinical Research Unit, McGill University Health Centre, Montréal, Québec, Canada
- Department of Medicine, McGill University, Montréal, Québec, Canada
- Brett Thombs
- Lady Davis Institute for Medical Research, Jewish General Hospital, Montréal, Québec, Canada
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montréal, Québec, Canada
- Department of Medicine, McGill University, Montréal, Québec, Canada
- Department of Psychiatry, McGill University, Montréal, Québec, Canada
- Department of Psychology, McGill University, Montréal, Québec, Canada
- Biomedical Ethics Unit, McGill University, Montréal, Québec, Canada
- Matthias Rose
- Department of Psychosomatic Medicine, Center for Internal Medicine and Dermatology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany
- Department of Quantitative Health Sciences, University of Massachusetts Medical School, Worcester, Massachusetts, USA
5
Berman J, Francoz D, Abdallah A, Dufour S, Buczinski S. Development and validation of a clinical respiratory disease scoring system for guiding treatment decisions in veal calves using a Bayesian framework. J Dairy Sci 2022; 105:9917-9933. [DOI: 10.3168/jds.2021-21695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2021] [Accepted: 07/17/2022] [Indexed: 11/07/2022]
6
Samara I, Roth TS, Nikolic M, Prochazkova E, Kret ME. Can third-party observers detect attraction in others based on subtle nonverbal cues? CURRENT PSYCHOLOGY 2022; 42:1-15. [PMID: 35431520 PMCID: PMC8990491 DOI: 10.1007/s12144-022-02927-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/15/2022] [Indexed: 11/20/2022]
Abstract
In a series of three studies, we examined whether third-party observers can detect attraction in others based on subtle nonverbal cues. We employed video segments of dates collected from a speed-dating experiment, in which daters went on a brief (approx. 4 min) blind-date and indicated whether they would like to go on another date with their brief interaction partner or not. We asked participants to view these stimuli and indicate whether or not each couple member is attracted to their partner. Our results show that participants could not reliably detect attraction, and this ability was not influenced by the age of the observer, video segment location (beginning or middle of the date), video duration, or general emotion recognition capacity. Contrary to previous research findings, our findings suggest that third-party observers cannot reliably detect attraction in others. However, there was one exception: Recognition rose above chance level when the daters were both interested in their partners compared to when they were not interested. Supplementary Information The online version contains supplementary material available at 10.1007/s12144-022-02927-0.
Affiliation(s)
- Iliana Samara
- Cognitive Psychology Unit, Department of Psychology, Leiden University, Wassenaarseweg 52, 2333 AK Leiden, the Netherlands
- Leiden Institute for Brain and Cognition (LIBC), Leiden, the Netherlands
- Tom S. Roth
- Cognitive Psychology Unit, Department of Psychology, Leiden University, Wassenaarseweg 52, 2333 AK Leiden, the Netherlands
- Apenheul Primate Park, Apeldoorn, the Netherlands
- Milica Nikolic
- Research Institute of Child Development and Education, University of Amsterdam, Amsterdam, the Netherlands
- Eliska Prochazkova
- Cognitive Psychology Unit, Department of Psychology, Leiden University, Wassenaarseweg 52, 2333 AK Leiden, the Netherlands
- Leiden Institute for Brain and Cognition (LIBC), Leiden, the Netherlands
- Mariska E. Kret
- Cognitive Psychology Unit, Department of Psychology, Leiden University, Wassenaarseweg 52, 2333 AK Leiden, the Netherlands
- Leiden Institute for Brain and Cognition (LIBC), Leiden, the Netherlands
7
Buczinski S, Boccardo A, Pravettoni D. Clinical Scores in Veterinary Medicine: What Are the Pitfalls of Score Construction, Reliability, and Validation? A General Methodological Approach Applied in Cattle. Animals (Basel) 2021; 11:ani11113244. [PMID: 34827976 PMCID: PMC8614512 DOI: 10.3390/ani11113244] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 09/27/2021] [Revised: 11/08/2021] [Accepted: 11/09/2021] [Indexed: 12/18/2022] Open
Abstract
Simple Summary Clinical scores are practical tools that can be used in the daily management of cattle. Score building and validation are a challenge involving various methodological and statistical issues. This article provides a specific framework for clinical score building where the target condition can be assessed directly or indirectly. Practical examples are given throughout the manuscript in order to build new scores or to assess score robustness. Abstract Clinical scores are commonly used for cattle. They generally contain a mix of categorical and numerical variables that need to be assessed by scorers, such as farmers, animal caretakers, scientists, and veterinarians. This article examines the key concepts that need to be accounted for when developing the test for optimal outcomes. First, the target condition or construct that the scale is supposed to measure should be defined, and if possible, an adequate proxy used for classification should be determined. Then, items (e.g., clinical signs) of interest that are either caused by the target condition (reflective items) or that caused the target condition (formative items) are listed, and reliable items (inter and intra-rater reliability) are kept for the next step. A model is then developed to determine the relative weight of the items associated with the target condition. A scale is then built after validating the model and determining the optimal threshold in terms of sensitivity (ability to detect the target condition) and specificity (ability to detect the absence of the target condition). Its robustness to various scenarios of the target condition prevalence and the impact of the relative cost of false negatives to false positives can also be assessed to tailor the scale used based on specific application conditions.
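The final step described above, choosing a threshold from sensitivity, specificity, prevalence, and the relative cost of false negatives to false positives, can be sketched as a small search over candidate cut-offs. This is a minimal illustration and not the authors' procedure: the cost function (expected misclassification cost per animal, with a false positive as the unit cost), the function names, and the toy scores are all assumptions.

```python
def se_sp(scores, labels, threshold):
    """Sensitivity and specificity when 'score >= threshold' means positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def best_threshold(scores, labels, prevalence, cost_fn_over_fp):
    """Return (threshold, expected cost) minimizing misclassification cost.

    cost = ratio * prevalence * (1 - Se) + (1 - prevalence) * (1 - Sp),
    where ratio is the cost of a false negative relative to a false positive.
    """
    best = None
    for t in sorted(set(scores)):
        se, sp = se_sp(scores, labels, t)
        cost = (cost_fn_over_fp * prevalence * (1.0 - se)
                + (1.0 - prevalence) * (1.0 - sp))
        if best is None or cost < best[1]:
            best = (t, cost)
    return best


# Toy data (hypothetical): clinical scores and target-condition status.
scores = [0, 1, 2, 3, 4, 5, 6, 7]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

# Missing a case is 5 times as costly as a false alarm.
print(best_threshold(scores, labels, prevalence=0.2, cost_fn_over_fp=5.0))
# A false alarm is 5 times as costly as missing a case.
print(best_threshold(scores, labels, prevalence=0.2, cost_fn_over_fp=0.2))
```

With these toy values the preferred cut-off moves from 3 to 5 as false positives become relatively more expensive, which is the kind of robustness check across prevalence and cost scenarios that the abstract describes.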
Affiliation(s)
- Sébastien Buczinski
- Département des Sciences Cliniques, Faculté de Médecine Vétérinaire, Université de Montréal, Saint-Hyacinthe, QC J2S 2M2, Canada
- Centre d’Expertise et de Recherche Clinique en Santé et Bien-Etre Animal (CERCL), Faculté de Médecine Vétérinaire, Université de Montréal, Saint-Hyacinthe, QC J2S 2M2, Canada
- Correspondence: Tel.: +1-450-773-8521 (ext. 8675)
- Antonio Boccardo
- Dipartimento di Medicina Veterinaria, Università degli Studi di Milano, via dell’Università 6, 26900 Lodi, Italy
- Davide Pravettoni
- Dipartimento di Medicina Veterinaria, Università degli Studi di Milano, via dell’Università 6, 26900 Lodi, Italy
8
Samara I, Roth TS, Kret ME. The Role of Emotion Projection, Sexual Desire, and Self-Rated Attractiveness in the Sexual Overperception Bias. ARCHIVES OF SEXUAL BEHAVIOR 2021; 50:2507-2516. [PMID: 34389894 PMCID: PMC8416843 DOI: 10.1007/s10508-021-02017-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/26/2020] [Revised: 02/22/2021] [Accepted: 04/16/2021] [Indexed: 06/13/2023]
Abstract
A consistent finding in the literature is that men overperceive sexual interest in women (i.e., sexual overperception bias). Several potential mechanisms have been proposed for this bias, including projecting one's own interest onto a given partner, sexual desire, and self-rated attractiveness. Here, we examined the influence of these factors in attraction detection accuracy during speed-dates. Sixty-seven participants (34 women) split in four groups went on a total of 10 speed-dates with all opposite-sex members of their group, resulting in 277 dates. The results showed that attraction detection accuracy was reliably predicted by projection of own interest in combination with participant sex. Specifically, men were more accurate than women in detecting attraction when they were not interested in their partner compared to when they were interested. These results are discussed in the wider context of arousal influencing detection of partner attraction.
Affiliation(s)
- Iliana Samara
- Cognitive Psychology Unit, Department of Psychology, Leiden University, Wassenaarseweg 52, 2333 AK, Leiden, the Netherlands.
- Leiden Institute for Brain and Cognition, Leiden, the Netherlands.
- Tom S Roth
- Cognitive Psychology Unit, Department of Psychology, Leiden University, Wassenaarseweg 52, 2333 AK, Leiden, the Netherlands
- Apenheul Primate Park, Apeldoorn, the Netherlands
- Mariska E Kret
- Cognitive Psychology Unit, Department of Psychology, Leiden University, Wassenaarseweg 52, 2333 AK, Leiden, the Netherlands
- Leiden Institute for Brain and Cognition, Leiden, the Netherlands
9
de Oliveira Valadares DG, Costa Quinino R, Carvalho Pires M. The need to conduct repeated classifications in a logistic regression model with misclassification in the dependent variable. COMMUN STAT-SIMUL C 2021. [DOI: 10.1080/03610918.2019.1584301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
Affiliation(s)
- Roberto Costa Quinino
- Department of Statistics, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
- Magda Carvalho Pires
- Department of Statistics, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
10
Keogh RH, Shaw PA, Gustafson P, Carroll RJ, Deffner V, Dodd KW, Küchenhoff H, Tooze JA, Wallace MP, Kipnis V, Freedman LS. STRATOS guidance document on measurement error and misclassification of variables in observational epidemiology: Part 1-Basic theory and simple methods of adjustment. Stat Med 2020; 39:2197-2231. [PMID: 32246539 PMCID: PMC7450672 DOI: 10.1002/sim.8532] [Citation(s) in RCA: 86] [Impact Index Per Article: 17.2] [Received: 12/09/2018] [Revised: 02/25/2020] [Accepted: 02/28/2020] [Indexed: 11/11/2022]
Abstract
Measurement error and misclassification of variables frequently occur in epidemiology and involve variables important to public health. Their presence can impact strongly on results of statistical analyses involving such variables. However, investigators commonly fail to pay attention to biases resulting from such mismeasurement. We provide, in two parts, an overview of the types of error that occur, their impacts on analytic results, and statistical methods to mitigate the biases that they cause. In this first part, we review different types of measurement error and misclassification, emphasizing the classical, linear, and Berkson models, and on the concepts of nondifferential and differential error. We describe the impacts of these types of error in covariates and in outcome variables on various analyses, including estimation and testing in regression models and estimating distributions. We outline types of ancillary studies required to provide information about such errors and discuss the implications of covariate measurement error for study design. Methods for ascertaining sample size requirements are outlined, both for ancillary studies designed to provide information about measurement error and for main studies where the exposure of interest is measured with error. We describe two of the simpler methods, regression calibration and simulation extrapolation (SIMEX), that adjust for bias in regression coefficients caused by measurement error in continuous covariates, and illustrate their use through examples drawn from the Observing Protein and Energy (OPEN) dietary validation study. Finally, we review software available for implementing these methods. The second part of the article deals with more advanced topics.
Affiliation(s)
- Ruth H Keogh
- Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, UK
- Pamela A Shaw
- Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania, USA
- Paul Gustafson
- Department of Statistics, University of British Columbia, Vancouver, British Columbia, Canada
- Raymond J Carroll
- Department of Statistics, Texas A&M University, College Station, Texas, USA
- School of Mathematical and Physical Sciences, University of Technology Sydney, Broadway, New South Wales, Australia
- Veronika Deffner
- Statistical Consulting Unit StaBLab, Department of Statistics, Ludwig-Maximilians-Universität, Munich, Germany
- Kevin W Dodd
- Biometry Research Group, Division of Cancer Prevention, National Cancer Institute, Bethesda, Maryland, USA
- Helmut Küchenhoff
- Department of Statistics, Statistical Consulting Unit StaBLab, Ludwig-Maximilians-Universität, Munich, Germany
- Janet A Tooze
- Department of Biostatistics and Data Science, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA
- Michael P Wallace
- Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
- Victor Kipnis
- Biometry Research Group, Division of Cancer Prevention, National Cancer Institute, Bethesda, Maryland, USA
- Laurence S Freedman
- Biostatistics and Biomathematics Unit, Gertner Institute for Epidemiology and Health Policy Research, Tel Hashomer, Israel
- Information Management Services Inc., Rockville, Maryland, USA
11
Buczinski S, Pardon B. Bovine Respiratory Disease Diagnosis: What Progress Has Been Made in Clinical Diagnosis? Vet Clin North Am Food Anim Pract 2020; 36:399-423. [PMID: 32451033 DOI: 10.1016/j.cvfa.2020.03.004] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Indexed: 12/23/2022] Open
Abstract
Bovine respiratory disease (BRD) complex is a worldwide health problem in cattle and is a major reason for antimicrobial use in young cattle. Several challenges may explain why it is difficult to make progress in the management of this disease. This article defines the limitations of BRD complex nomenclature, which may not easily distinguish upper versus lower respiratory tract infection and infectious bronchopneumonia versus other types of respiratory diseases. It then discusses the obstacles to clinical diagnosis and reviews the current knowledge of readily available diagnostic tests to reach a diagnosis of infectious bronchopneumonia.
Affiliation(s)
- Sébastien Buczinski
- Département des Sciences Cliniques, Faculté de Médecine Vétérinaire, Université de Montréal, 3200 Rue Sicotte, St-Hyacinthe, Québec J2S 2M2, Canada.
- Bart Pardon
- Department of Large Animal Internal Medicine, Faculty of Veterinary Medicine, Ghent University, Salisburylaan 133, Merelbeke 9820, Belgium
12
Verdugo C, Valdes MF, Salgado M. Herd level risk factors for Mycobacterium avium subsp. paratuberculosis infection and clinical incidence in dairy herds in Chile. Prev Vet Med 2020; 176:104888. [DOI: 10.1016/j.prevetmed.2020.104888] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/22/2019] [Revised: 01/08/2020] [Accepted: 01/08/2020] [Indexed: 11/26/2022]
13
Tong J, Huang J, Chubak J, Wang X, Moore JH, Hubbard RA, Chen Y. An augmented estimation procedure for EHR-based association studies accounting for differential misclassification. J Am Med Inform Assoc 2020; 27:244-253. [PMID: 31617899 DOI: 10.1093/jamia/ocz180] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/07/2019] [Revised: 08/14/2019] [Accepted: 09/15/2019] [Indexed: 11/12/2022] Open
Abstract
OBJECTIVES The ability to identify novel risk factors for health outcomes is a key strength of electronic health record (EHR)-based research. However, the validity of such studies is limited by error in EHR-derived phenotypes. The objective of this study was to develop a novel procedure for reducing bias in estimated associations between risk factors and phenotypes in EHR data. MATERIALS AND METHODS The proposed method combines the strengths of a gold-standard phenotype obtained through manual chart review for a small validation set of patients and an automatically-derived phenotype that is available for all patients but is potentially error-prone (hereafter referred to as the algorithm-derived phenotype). An augmented estimator of associations is obtained by optimally combining these 2 phenotypes. We conducted simulation studies to evaluate the performance of the augmented estimator and conducted an analysis of risk factors for second breast cancer events using data on a cohort from Kaiser Permanente Washington. RESULTS The proposed method was shown to reduce bias relative to an estimator using only the algorithm-derived phenotype and reduce variance compared to an estimator using only the validation data. DISCUSSION Our simulation studies and real data application demonstrate that, compared to the estimator using validation data only, the augmented estimator has lower variance (ie, higher statistical efficiency). Compared to the estimator using error-prone EHR-derived phenotypes, the augmented estimator has smaller bias. CONCLUSIONS The proposed estimator can effectively combine an error-prone phenotype with gold-standard data from a limited chart review in order to improve analyses of risk factors using EHR data.
Affiliation(s)
- Jiayi Tong
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, The University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Jing Huang
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, The University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Jessica Chubak
- Department of Epidemiology, Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA
- Xuan Wang
- Department of Statistics, School of Mathematical Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- Jason H Moore
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, The University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Rebecca A Hubbard
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, The University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Yong Chen
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, The University of Pennsylvania, Philadelphia, Pennsylvania, USA
|
14
|
Duan R, Cao M, Ning Y, Zhu M, Zhang B, McDermott A, Chu H, Zhou X, Moore JH, Ibrahim JG, Scharfstein DO, Chen Y. Global identifiability of latent class models with applications to diagnostic test accuracy studies: A Gröbner basis approach. Biometrics 2019; 76:98-108. [PMID: 31444807 DOI: 10.1111/biom.13133] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 12/05/2018] [Accepted: 07/25/2019] [Indexed: 11/30/2022]
Abstract
Identifiability of statistical models is a fundamental regularity condition that is required for valid statistical inference. Investigation of model identifiability is mathematically challenging for complex models such as latent class models. Jones et al. used Goodman's technique to investigate the identifiability of latent class models with applications to diagnostic tests in the absence of a gold standard test. The tool they used was based on examining the singularity of the Jacobian or the Fisher information matrix, in order to obtain insights into local identifiability (ie, there exists a neighborhood of a parameter such that no other parameter in the neighborhood leads to the same probability distribution as the parameter). In this paper, we investigate a stronger condition: global identifiability (ie, no two parameters in the parameter space give rise to the same probability distribution), by introducing a powerful mathematical tool from computational algebra: the Gröbner basis. With several existing well-known examples, we argue that the Gröbner basis method is easy to implement and powerful to study global identifiability of latent class models, and is an attractive alternative to the information matrix analysis by Rothenberg and the Jacobian analysis by Goodman and Jones et al.
Affiliation(s)
- Rui Duan
- Department of Biostatistics, Epidemiology, and Informatics, The University of Pennsylvania, Philadelphia, Pennsylvania
- Ming Cao
- Department of Data and Analytics, Klynveld Peat Marwick Goerdeler US, New York, New York
- Yang Ning
- Department of Statistical Science, Cornell University, Ithaca, New York
- Mingfu Zhu
- Department of Research, Panorama Medicine Inc, Philadelphia, Pennsylvania
- Bin Zhang
- Division of Biostatistics and Epidemiology, Cincinnati Children's Hospital, Cincinnati, Ohio
- Aidan McDermott
- Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland
- Haitao Chu
- Division of Biostatistics, University of Minnesota, Minneapolis, Minnesota
- Xiaohua Zhou
- Department of Biostatistics and Beijing International Center for Mathematical Research, Peking University, Beijing, China
- Jason H Moore
- Department of Biostatistics, Epidemiology, and Informatics, The University of Pennsylvania, Philadelphia, Pennsylvania
- Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina
- Yong Chen
- Department of Biostatistics, Epidemiology, and Informatics, The University of Pennsylvania, Philadelphia, Pennsylvania
|
15
|
Hajizadeh N, Baghestani AR, Pourhoseingholi MA, Ashtari S, Najafimehr H, Busani L, Zali MR. Trend of Gastric Cancer after Bayesian Correction of Misclassification Error in Neighboring Provinces of Iran. Galen Med J 2019; 8:e1223. [PMID: 34466473 PMCID: PMC8344079 DOI: 10.31661/gmj.v0i0.1223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2018] [Revised: 07/07/2018] [Accepted: 07/30/2018] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND Some errors may occur in the disease registry system. One of them is misclassification error in cancer registration. It occurs because some of the patients from deprived provinces travel to their adjacent provinces to receive better healthcare without mentioning their permanent residence. The aim of this study was to re-estimate the incidence of gastric cancer using the Bayesian correction for misclassification across Iranian provinces. MATERIALS AND METHODS Data of gastric cancer incidence were adapted from the Iranian national cancer registration reports from 2004 to 2008. Bayesian analysis was performed to estimate the misclassification rate with a beta prior distribution for misclassification parameter. Parameters of beta distribution were selected according to the expected coverage of new cancer cases in each medical university of the country. RESULTS There was a remarkable misclassification with reference to the registration of cancer cases across the provinces of the country. The average estimated misclassification rate was between 15% and 68%, and higher rates were estimated for more deprived provinces. CONCLUSION Misclassification error reduces the accuracy of the registry data, in turn causing underestimation and overestimation in the assessment of the risk of cancer in different areas. In conclusion, correcting the regional misclassification in cancer registry data is essential for discerning high-risk regions and making plans for cancer control and prevention.
Affiliation(s)
- Nastaran Hajizadeh
- Department of Biostatistics, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Ahmad Reza Baghestani
- Physiotherapy Research Centre, Department of Biostatistics, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mohamad Amin Pourhoseingholi
- Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Sara Ashtari
- Department of Biostatistics, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Hadis Najafimehr
- Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Luca Busani
- Department of Infectious Diseases, Istituto Superiore di Sanità, Roma, Italy
- Mohammad Reza Zali
- Department of Biostatistics, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
|
16
|
Johnson WO, Jones G, Gardner IA. Gold standards are out and Bayes is in: Implementing the cure for imperfect reference tests in diagnostic accuracy studies. Prev Vet Med 2019; 167:113-127. [DOI: 10.1016/j.prevetmed.2019.01.010] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 03/12/2018] [Accepted: 01/24/2019] [Indexed: 11/16/2022]
|
17
|
Naranjo L, Pérez CJ, Martín J, Mutsvari T, Lesaffre E. A Bayesian approach for misclassified ordinal response data. J Appl Stat 2019. [DOI: 10.1080/02664763.2019.1582613] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/27/2022]
Affiliation(s)
- Lizbeth Naranjo
- Departamento de Matemáticas, Facultad de Ciencias, Universidad Nacional Autónoma de México (UNAM), México D.F., Mexico
- Carlos J. Pérez
- Departamento de Matemáticas, Facultad de Ciencias, Universidad de Extremadura, Badajoz, Spain
- Jacinto Martín
- Departamento de Matemáticas, Facultad de Ciencias, Universidad de Extremadura, Badajoz, Spain
|
18
|
Shojaee S, Hajizadeh N, Najafimehr H, Busani L, Pourhoseingholi MA, Baghestani AR, Nasserinejad M, Ashtari S, Zali MR. Bayesian adjustment for trend of colorectal cancer incidence in misclassified registering across Iranian provinces. PLoS One 2018; 13:e0199273. [PMID: 30543626 PMCID: PMC6292591 DOI: 10.1371/journal.pone.0199273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2018] [Accepted: 11/25/2018] [Indexed: 12/02/2022] Open
Abstract
Misclassification error is a common problem of cancer registries in developing countries that leads to biased cancer rates. The purpose of this research is to use a Bayesian method for correcting misclassification in registered cancer incidence of eighteen provinces in Iran. Incidence data of patients with colorectal cancer were extracted from Iranian annual national cancer registration reports from 2005 to 2008. A province with proper medical facilities can always be compared to its neighbors. Almost 28% of the misclassification was estimated between the provinces of East Azarbaijan and West Azarbaijan, 56% between Fars and Hormozgan, 43% between Isfahan and Charmahal and Bakhtyari, 46% between Isfahan and Lorestan, 58% between Razavi Khorasan and North Khorasan, 50% between Razavi Khorasan and South Khorasan, 74% between Razavi Khorasan and Sistan and Balochestan, 43% between Mazandaran and Golestan, 37% between Tehran and Qazvin, 45% between Tehran and Markazi, 42% between Tehran and Qom, and 47% between Tehran and Zanjan. Correcting the regional misclassification and obtaining the correct rates of cancer incidence in different regions is necessary for making cancer control and prevention programs and for healthcare resource allocation.
Affiliation(s)
- Sajad Shojaee
- Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Nastaran Hajizadeh
- Physiotherapy Research Center, Department of Biostatistics, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Hadis Najafimehr
- Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Luca Busani
- Department of Infectious Diseases, Istituto Superiore di Sanità, Roma, Italy
- Mohamad Amin Pourhoseingholi
- Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Ahmad Reza Baghestani
- Physiotherapy Research Center, Department of Biostatistics, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Maryam Nasserinejad
- Physiotherapy Research Center, Department of Biostatistics, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Sara Ashtari
- Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mohammad Reza Zali
- Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
|
19
|
Ni J, Dasgupta K, Kahn SR, Talbot D, Lefebvre G, Lix LM, Berry G, Burman M, Dimentberg R, Laflamme Y, Cirkovic A, Rahme E. Comparing external and internal validation methods in correcting outcome misclassification bias in logistic regression: A simulation study and application to the case of postsurgical venous thromboembolism following total hip and knee arthroplasty. Pharmacoepidemiol Drug Saf 2018; 28:217-226. [PMID: 30515908 DOI: 10.1002/pds.4693] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 12/30/2017] [Revised: 09/10/2018] [Accepted: 10/03/2018] [Indexed: 12/11/2022]
Abstract
PURPOSE We assessed the validity of postsurgery venous thromboembolism (VTE) diagnoses identified from administrative databases and compared Bayesian and multiple imputation (MI) approaches in correcting for outcome misclassification in logistic regression models. METHODS Sensitivity and specificity of postsurgery VTE among patients undergoing total hip or knee replacement (THR/TKR) were assessed against chart review in six Montreal hospitals in 2009 to 2010. Administrative data on all THR/TKR Quebec patients in 2009 to 2010 were obtained. The performance of Bayesian external, Bayesian internal, and MI approaches to correct the odds ratio (OR) of postsurgery VTE in tertiary versus community hospitals was assessed using simulations. Bayesian external approach used prior information from external sources, while Bayesian internal and MI approaches used chart review. RESULTS In total, 17 319 patients were included, 2136 in participating hospitals, among whom 75 had VTE in administrative data versus 81 in chart review. VTE sensitivity was 0.59 (95% confidence interval, 0.48-0.69) and specificity was 0.99 (0.98-0.99), overall. The adjusted OR of VTE in tertiary versus community hospitals was 1.35 (1.12-1.64) using administrative data, 1.45 (0.97-2.19) when MI was used for misclassification correction, and 1.53 (0.83-2.87) and 1.57 (0.39-5.24) when Bayesian internal and external approaches were used, respectively. In simulations, all three approaches reduced the OR bias and had appropriate coverage for both nondifferential and differential misclassification. CONCLUSION VTE identified from administrative data had low sensitivity and high specificity. The Bayesian external approach was useful to reduce outcome misclassification bias in logistic regression; however, it required accurate specification of the misclassification properties and should be used with caution.
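The sensitivity and specificity reported in the abstract above are proportions taken from a 2x2 cross-tabulation of administrative diagnoses against chart review. The following is a minimal sketch of that calculation with a Wilson score interval. The counts are hypothetical (the abstract does not give the full table), the function names are illustrative, and the interval method is an assumption, since the abstract does not say which one the authors used.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


def accuracy_vs_reference(tp, fn, fp, tn):
    """Sensitivity and specificity of an index test against a reference standard.

    tp/fn: reference-positive subjects flagged / missed by the index test.
    fp/tn: reference-negative subjects flagged / correctly not flagged.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": (sensitivity, wilson_interval(tp, tp + fn)),
        "specificity": (specificity, wilson_interval(tn, tn + fp)),
    }


# Hypothetical counts, NOT the study data.
result = accuracy_vs_reference(tp=60, fn=40, fp=10, tn=890)
for name, (estimate, (low, high)) in result.items():
    print(f"{name}: {estimate:.2f} ({low:.2f}-{high:.2f})")
```

With few reference-positive subjects the sensitivity interval is wide even when the specificity interval is tight, which matches the pattern of intervals quoted in the abstract.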
Affiliation(s)
- Jiayi Ni
- Centre for Outcomes Research and Evaluation, Research Institute of the McGill University Health Centre, Montreal, QC, Canada
- Kaberi Dasgupta
- Centre for Outcomes Research and Evaluation, Research Institute of the McGill University Health Centre, Montreal, QC, Canada; Department of Medicine, Division of Clinical Epidemiology, McGill University, Montreal, QC, Canada
- Suzan R Kahn
- Department of Medicine, Division of Clinical Epidemiology, McGill University, Montreal, QC, Canada; Center for Clinical Epidemiology & Community Studies, Jewish General Hospital, Montreal, QC, Canada
- Denis Talbot
- Research Center of the Centre Hospitalier Universitaire de Québec, Université Laval, Québec City, QC, Canada; Department of Social and Preventive Medicine, Faculty of Medicine, Université Laval, Québec City, QC, Canada
- Geneviève Lefebvre
- Département de Mathématiques, Université du Québec à Montréal, Montreal, QC, Canada
- Lisa M Lix
- Department of Community Health Sciences, University of Manitoba, Winnipeg, MB, Canada
- Greg Berry
- Division of Orthopaedic Surgery, McGill University Health Centre-Montreal General Hospital, Montreal, QC, Canada
- Mark Burman
- Division of Orthopaedic Surgery, McGill University Health Centre-Montreal General Hospital, Montreal, QC, Canada
- Ronald Dimentberg
- Division of Orthopaedic Surgery, St. Mary's Hospital Center, Montreal, QC, Canada
- Yves Laflamme
- Division of Orthopaedic Surgery, Université de Montréal, Hôpital du Sacré-Coeur, Montreal, QC, Canada
- Alain Cirkovic
- Orthopedic Surgery, Hôpital de Verdun, Verdun, QC, Canada
- Elham Rahme
- Centre for Outcomes Research and Evaluation, Research Institute of the McGill University Health Centre, Montreal, QC, Canada; Department of Medicine, Division of Clinical Epidemiology, McGill University, Montreal, QC, Canada
|
20
|
Gardner IA, Colling A, Greiner M. Design, statistical analysis and reporting standards for test accuracy studies for infectious diseases in animals: Progress, challenges and recommendations. Prev Vet Med 2018; 162:46-55. [PMID: 30621898 DOI: 10.1016/j.prevetmed.2018.10.023] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 03/12/2018] [Revised: 08/13/2018] [Accepted: 10/28/2018] [Indexed: 02/04/2023]
Abstract
The quality of diagnostic accuracy studies (DAS) for infectious diseases of animals has improved over the last 20 years because of international educational efforts, use of design and reporting standards to guide researchers and test developers, and acceptance of the use of latent class models to account for imperfect reference tests. In this review, we focus on measurement of diagnostic sensitivity and specificity as a measure of clinical validity, describe the leadership role of the World Organisation for Animal Health (OIE) in setting standards for test validation in the context of fitness-for-purpose, and describe how design and reporting quality have facilitated the increased use of systematic reviews and meta-analysis of DAS. Ongoing challenges for design, conduct, analysis and reporting of DAS are identified, and we make recommendations for improvements in these areas for OIE-listed and non-listed infectious diseases.
Affiliation(s)
- Ian A Gardner
- Department of Health Management, Atlantic Veterinary College, University of Prince Edward Island, Charlottetown, C1A 4P3, Canada.
- Axel Colling
- CSIRO Australian Animal Health Laboratory, Private Bag 24, Geelong, VIC, 3220, Australia.
- Matthias Greiner
- Federal Institute for Risk Assessment, Department of Exposure, Berlin and University of Veterinary Medicine Hannover, Foundation, Germany.
|
21
|
Krug C, Morin PA, Lacasse P, Roy JP, Dubuc J, Dufour S. Effect of incomplete milking during the first 5 days in milk on udder and reproductive tract health: Results from a randomized controlled trial. J Dairy Sci 2018; 101:9275-9286. [PMID: 30077449 DOI: 10.3168/jds.2018-14713] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 03/07/2018] [Accepted: 06/12/2018] [Indexed: 11/19/2022]
Abstract
The aim of this study was to investigate the effect of an incomplete milking on risk of mastitis and reproductive tract disease. Multiparous dairy cows (n = 878) from 13 commercial herds were enrolled in a randomized controlled trial. Cows were randomly assigned to either a control (milked conventionally) or a treatment group, which consisted of an incomplete milking (10-14 L of milk collected/d) from 1 to 5 d in milk (DIM). Quarter milk samples were collected at approximately 11 and 18 DIM to measure somatic cell count (SCC). Quarters were considered negative for intramammary infection if SCC was <100,000 cells/mL and positive if SCC was ≥200,000 cells/mL. To calculate intramammary infection incidence, negative quarters of the initial samples collected were tested again 1 wk later. This was done to detect incidence of positive quarters. To calculate elimination rate, positive quarters were tested again 1 wk later to detect mastitis elimination. Farmers recorded clinical mastitis events. Cows were also examined at approximately 35 DIM with a Metricheck device (Simcro, Hamilton, New Zealand) for detection of purulent vaginal discharge (PVD) and with an endometrial cytobrush for presence of leukocytes [endometrial cytology for smear (ENDO) and for leukocyte esterase test (LE)]. A threshold ≥3 was used to define a positive PVD or LE test, whereas a polymorphonuclear cell count ≥6% was used to define a positive ENDO. Five generalized mixed models with cow or herd as random intercepts were used to determine the effects of incomplete milking on odds of new intramammary infection, odds of intramammary infection elimination, and odds of a positive PVD, LE, or ENDO status. To investigate time until first clinical mastitis event, a Cox model with a herd frailty term was used. 
The odds of new intramammary infection and intramammary infection elimination for incompletely milked cows were 0.90 [95% confidence interval (CI): 0.49, 1.7] and 2.9 (95% CI: 1.4, 6.0) times those of conventionally milked cows, respectively. The hazard of clinical mastitis in incompletely milked cows was 0.96 (95% CI: 0.59, 1.6) times that of conventionally milked cows. The odds of PVD, LE, and ENDO for incompletely milked cows were 1.4 (95% CI: 0.89, 2.1), 1.3 (95% CI: 0.88, 1.8), and 1.2 (95% CI: 0.81, 1.7) times those of conventionally milked cows. These results suggest that incomplete milking during the first 5 DIM increases the odds of a decrease in SCC from 11 to 18 DIM but does not affect odds of increase in SCC in the same period. The incomplete milking had no effect on clinical mastitis incidence in the first 90 DIM or on reproductive tract health at 35 DIM.
Affiliation(s)
- C Krug
- Département de Pathologie et Microbiologie, Québec, J2S 2M2, Canada; Canadian Bovine Mastitis and Milk Quality Research Network, Québec, J2S 2M2, Canada
- P-A Morin
- Département de Sciences Cliniques, Faculté de Médecine Vétérinaire, Université de Montréal, 3200 Rue Sicotte, St-Hyacinthe, Québec, J2S 2M2, Canada
- P Lacasse
- Canadian Bovine Mastitis and Milk Quality Research Network, Québec, J2S 2M2, Canada; Sherbrooke Research and Development Centre, Agriculture and Agri-Food Canada, 2000 College, Sherbrooke, Québec, J1M 0C8, Canada
- J-P Roy
- Canadian Bovine Mastitis and Milk Quality Research Network, Québec, J2S 2M2, Canada; Département de Sciences Cliniques, Faculté de Médecine Vétérinaire, Université de Montréal, 3200 Rue Sicotte, St-Hyacinthe, Québec, J2S 2M2, Canada
- J Dubuc
- Département de Sciences Cliniques, Faculté de Médecine Vétérinaire, Université de Montréal, 3200 Rue Sicotte, St-Hyacinthe, Québec, J2S 2M2, Canada
- S Dufour
- Département de Pathologie et Microbiologie, Québec, J2S 2M2, Canada; Canadian Bovine Mastitis and Milk Quality Research Network, Québec, J2S 2M2, Canada.
|
22
|
Pires MC, Quinino RDC. Repeated responses in misclassification binary regression: A Bayesian approach. STAT MODEL 2018. [DOI: 10.1177/1471082x18773394] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/15/2022]
Abstract
Binary regression models generally assume that the response variable is measured perfectly. However, in some situations, the outcome is subject to misclassification: a success may be erroneously classified as a failure or vice versa. Many methods, described in existing literature, have been developed to deal with misclassification, but we demonstrate that these methods may lead to serious inferential problems when only a single evaluation of the individual is taken. Thus, this study proposes to incorporate repeated and independent responses in misclassification binary regression models, considering the total number of successes obtained or even the simple majority classification. We use subjective prior distributions, as our conditional means prior, to evaluate and compare models. A data augmentation approach, Gibbs sampling, and Adaptive Rejection Metropolis Sampling are used for posterior inferences. Simulation studies suggested that repeated measures significantly improve the posterior estimates, in that these estimates are closer to those obtained in a case with no misclassifications with a lower standard deviation. Finally, we illustrate the usefulness of the new methodology with the analysis about defects in eyeglass lenses.
Affiliation(s)
- Magda Carvalho Pires
- Department of Statistics, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
|
23
|
Haine D, Dohoo I, Dufour S. Selection and Misclassification Biases in Longitudinal Studies. Front Vet Sci 2018; 5:99. [PMID: 29892604 PMCID: PMC5985700 DOI: 10.3389/fvets.2018.00099] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 02/05/2018] [Accepted: 04/20/2018] [Indexed: 01/19/2023] Open
Abstract
Using imperfect tests may lead to biased estimates of disease frequency and measures of association. Many studies have looked into the effect of misclassification on statistical inferences. These evaluations were either within a cross-sectional study framework, assessing biased prevalence, or for cohort study designs, evaluating biased incidence rate or risk ratio estimates based on misclassification at one of the two time-points (initial assessment or follow-up). However, both observations at risk and incident cases can be wrongly identified in longitudinal studies, leading to selection and misclassification biases, respectively. The objective of this paper was to evaluate the relative impact of selection and misclassification biases resulting from misclassification, together, on measures of incidence and risk ratio. To investigate impact on measure of disease frequency, data sets from a hypothetical cohort study with two samples collected one month apart were simulated and analyzed based on specific test and disease characteristics, with no elimination of disease during the sampling interval or clustering of observations. Direction and magnitude of bias due to selection, misclassification, and total bias were assessed for diagnostic test sensitivity and specificity ranging from 0.7 to 1.0 and 0.8 to 1.0, respectively, and for specific disease contexts, i.e., disease prevalences of 5 and 20%, and disease incidences of 0.01, 0.05, and 0.1 cases/animal-month. A hypothetical exposure with known strength of association was also generated. A total of 1,000 cohort studies of 1,000 observations each were simulated for these six disease contexts where the same diagnostic test was used to identify observations at risk at beginning of the cohort and incident cases at its end. Our results indicated that the departure of the estimates of disease incidence and risk ratio from their true value was mainly a function of test specificity, and disease prevalence and incidence. The combination of the two biases, at baseline and follow-up, revealed the importance of a good to excellent specificity relative to sensitivity for the diagnostic test. Small divergence from perfect specificity extended quickly to disease incidence over-estimation as true prevalence increased and true incidence decreased. A highly sensitive test to exclude diseased subjects at baseline was of less importance to minimize bias than using a highly specific one at baseline. Near perfect diagnostic test attributes were even more important to obtain a measure of association close to the true risk ratio, according to specific disease characteristics, especially its prevalence. A low-prevalence, high-incidence disease leads to minimal bias if the disease is diagnosed with high sensitivity and close to perfect specificity at baseline and follow-up. For more prevalent diseases we observed large risk ratio biases towards the null value, even with near perfect diagnosis.
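The mechanism described in the abstract above, in which the same imperfect test both selects the at-risk group and identifies incident cases, can be illustrated with a closed-form expectation. This is a simplified sketch and not the authors' simulation: it assumes the two test results are conditionally independent given true status, treats the monthly incidence as a risk over the interval, and uses hypothetical parameter names.

```python
def apparent_incidence(prevalence, incidence, se, sp):
    """Expected apparent incidence risk when one imperfect test is applied
    at baseline (to select subjects at risk) and again at follow-up.

    Assumptions (ours, for illustration): no recovery between samplings,
    and test errors independent across the two samplings given true status.
    """
    # Subjects testing negative at baseline are enrolled as "at risk".
    missed_prevalent = prevalence * (1 - se)      # truly diseased, test-negative
    truly_at_risk = (1 - prevalence) * sp         # truly disease-free, test-negative
    enrolled = missed_prevalent + truly_at_risk

    # Enrolled subjects testing positive at follow-up count as "incident".
    apparent_cases = (
        missed_prevalent * se                         # prevalent case detected late
        + truly_at_risk * incidence * se              # true incident case detected
        + truly_at_risk * (1 - incidence) * (1 - sp)  # false positive at follow-up
    )
    return apparent_cases / enrolled


# One disease context from the abstract: 5% prevalence, 0.01 incidence.
perfect = apparent_incidence(0.05, 0.01, se=1.0, sp=1.0)
imperfect = apparent_incidence(0.05, 0.01, se=1.0, sp=0.95)
print(f"perfect test: {perfect:.4f}, Sp = 0.95: {imperfect:.4f}")
```

Under these assumptions a specificity of 0.95 alone inflates a true incidence of 0.01 several-fold, which is consistent with the abstract's finding that bias was driven mainly by specificity.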
Affiliation(s)
- Denis Haine
- Faculté de médecine vétérinaire, Université de Montréal, Montreal, QC, Canada; Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, QC, Canada
- Ian Dohoo
- Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, QC, Canada; Centre for Veterinary Epidemiological Research, Atlantic Veterinary College, University of Prince Edward Island, Charlottetown, PE, Canada
- Simon Dufour
- Faculté de médecine vétérinaire, Université de Montréal, Montreal, QC, Canada; Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, QC, Canada
|
24
|
Buczinski S, Fecteau G, Dubuc J, Francoz D. Validation of a clinical scoring system for bovine respiratory disease complex diagnosis in preweaned dairy calves using a Bayesian framework. Prev Vet Med 2018; 156:102-112. [PMID: 29891139 PMCID: PMC7114123 DOI: 10.1016/j.prevetmed.2018.05.004] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 01/12/2018] [Revised: 05/01/2018] [Accepted: 05/02/2018] [Indexed: 12/22/2022]
Abstract
A prediction rule using thoracic ultrasound as an imperfect test for BRD diagnosis was modeled. Selection of the optimal threshold for case definition was proposed accounting for misclassification cost term analysis. Probability of active infection of the lower respiratory tract was determined for all 64 combinations of clinical signs. Bovine respiratory disease complex is a major cause of illness in dairy calves. The diagnosis of active infection of the lower respiratory tract is challenging on a daily basis in the absence of accurate clinical signs. Clinical scoring systems, such as the Californian scoring system, are appealing but were developed without considering the imperfection of reference standard tests used for case definition. This study used a Bayesian latent class model to update Californian prediction rules. The results of clinical examination and ultrasound findings of 608 preweaned dairy calves were used. A model accounting for imperfect accuracy of thoracic ultrasound examination was used to obtain updated weights for the clinical signs included in the Californian scoring system. There were 20 points (95% Bayesian credible interval [BCI]: 11–29) for abnormal breathing pattern, 16 points (95% BCI: 4–29) for ear drop/head tilt, 16 points (95% BCI: 9–25) for cough, 10 points (95% BCI: 3–18) for the presence of nasal discharge, 7 points (95% BCI: −1 to 8) for rectal temperature ≥39.2 °C, and −1 points (95% BCI: −9 to 8) for the presence of ocular discharge. The optimal cut-offs were determined using the misclassification cost-term (MCT) approach with different possible scenarios of expected prevalence and different plausible ratios of false negative costs/false positive costs. The predicted probabilities of active infection of the lower respiratory tract were also obtained using posterior densities of the main logistic regression model. Depending on the context, cut-offs varying from 9 to 16 can minimize the MCT. The optimal cut-off decreased when the expected prevalence of disease and the false negative/false positive cost ratio increased.
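The cut-off selection described in this abstract can be illustrated with a short sketch. It assumes the commonly used form of the misclassification cost term, MCT = (1 − p)(1 − Sp) + r · p · (1 − Se), where p is the expected prevalence and r is the ratio of false negative to false positive costs. The sensitivity and specificity values per cut-off below are hypothetical placeholders chosen for illustration; they are not results from the study.

```python
def misclassification_cost_term(se, sp, prevalence, cost_ratio):
    """MCT = (1 - p) * (1 - Sp) + r * p * (1 - Se), with r = FN cost / FP cost."""
    return (1 - prevalence) * (1 - sp) + cost_ratio * prevalence * (1 - se)


# Hypothetical (cut-off, Se, Sp) triples for a clinical score.
# These numbers are illustrative assumptions, NOT values reported by the study.
ACCURACY_BY_CUTOFF = [
    (5, 0.95, 0.40),
    (10, 0.85, 0.65),
    (15, 0.70, 0.85),
    (20, 0.50, 0.95),
]


def optimal_cutoff(prevalence, cost_ratio):
    """Return the cut-off whose (Se, Sp) pair minimizes the MCT."""
    best = min(
        ACCURACY_BY_CUTOFF,
        key=lambda row: misclassification_cost_term(
            row[1], row[2], prevalence, cost_ratio
        ),
    )
    return best[0]


if __name__ == "__main__":
    for prevalence, cost_ratio in [(0.1, 1.0), (0.5, 5.0)]:
        cutoff = optimal_cutoff(prevalence, cost_ratio)
        print(f"prevalence={prevalence}, FN/FP cost ratio={cost_ratio}: cut-off {cutoff}")
```

With these made-up accuracy values, the selected cut-off falls as prevalence and the cost ratio rise, which is the same direction of effect the abstract reports.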
Affiliation(s)
- S Buczinski
  - Faculté de médecine vétérinaire, Université de Montréal, Saint-Hyacinthe, J2S 2M2, Québec, Canada
- G Fecteau
  - Faculté de médecine vétérinaire, Université de Montréal, Saint-Hyacinthe, J2S 2M2, Québec, Canada
- J Dubuc
  - Faculté de médecine vétérinaire, Université de Montréal, Saint-Hyacinthe, J2S 2M2, Québec, Canada
- D Francoz
  - Faculté de médecine vétérinaire, Université de Montréal, Saint-Hyacinthe, J2S 2M2, Québec, Canada
|
25
|
CHLAMYDIA PSITTACI IN FERAL ROSY-FACED LOVEBIRDS (AGAPORNIS ROSEICOLLIS) AND OTHER BACKYARD BIRDS IN MARICOPA COUNTY, ARIZONA, USA. J Wildl Dis 2018; 54:248-260. [PMID: 29369723 DOI: 10.7589/2017-06-145] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/20/2022]
Abstract
In 2013, a mortality event of nonnative, feral Rosy-faced Lovebirds (Agapornis roseicollis) in residential backyards in Maricopa County, Arizona, US, was attributed to infection with Chlamydia psittaci. In June 2014, additional mortality occurred in the same region. Accordingly, in August 2014 we sampled live lovebirds and sympatric bird species visiting backyard bird feeders to determine the prevalence of DNA and the seroprevalence of antibodies to C. psittaci using real-time PCR-based testing and elementary body agglutination, respectively. Chlamydia psittaci DNA was present in conjunctival-choanal or cloacal swabs in 93% (43/46) of lovebirds and 10% (14/142) of sympatric birds. Antibodies to C. psittaci were detected in 76% (31/41) of lovebirds and 7% (7/102) of sympatric birds. Among the sympatric birds, Rock Doves (Columba livia) had the highest prevalence of C. psittaci DNA (75%; 6/8) and seroprevalence (25%; 2/8). Psittacine circovirus 1 DNA was also identified, using real-time PCR-based testing, from the same swab samples in 69% (11/16) of species sampled, with a prevalence of 80% (37/46) in lovebirds and 27% (38/142) in sympatric species. The presence of either Rosy-faced Lovebirds or Rock Doves at residential bird feeders may be cause for concern for epizootic and zoonotic transmission of C. psittaci in this region.
|
26
|
Högg T, Petkau J, Zhao Y, Gustafson P, Wijnands JM, Tremlett H. Bayesian analysis of pair-matched case-control studies subject to outcome misclassification. Stat Med 2017; 36:4196-4213. [PMID: 28783882 DOI: 10.1002/sim.7427] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/20/2016] [Revised: 05/03/2017] [Accepted: 06/29/2017] [Indexed: 11/06/2022]
Abstract
We examine the impact of nondifferential outcome misclassification on odds ratios estimated from pair-matched case-control studies and propose a Bayesian model to adjust these estimates for misclassification bias. The model relies on access to a validation subgroup with confirmed outcome status for all case-control pairs as well as prior knowledge about the positive and negative predictive value of the classification mechanism. We illustrate the model's performance on simulated data and apply it to a database study examining the presence of ten morbidities in the prodromal phase of multiple sclerosis.
Affiliation(s)
- Tanja Högg
  - Department of Statistics, University of British Columbia, 2207 Main Mall, Vancouver, V6T 1Z4, British Columbia, Canada
- John Petkau
  - Department of Statistics, University of British Columbia, 2207 Main Mall, Vancouver, V6T 1Z4, British Columbia, Canada
- Yinshan Zhao
  - Department of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
  - BC Centre for Improved Cardiovascular Health, Vancouver, British Columbia, Canada
- Paul Gustafson
  - Department of Statistics, University of British Columbia, 2207 Main Mall, Vancouver, V6T 1Z4, British Columbia, Canada
- José Ma Wijnands
  - Department of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Helen Tremlett
  - Department of Medicine, University of British Columbia, Vancouver, British Columbia, Canada
|
27
|
Diagnosing intramammary infection: Controlling misclassification bias in longitudinal udder health studies. Prev Vet Med 2017; 150:162-167. [PMID: 29169686 DOI: 10.1016/j.prevetmed.2017.11.010] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 01/31/2017] [Revised: 10/31/2017] [Accepted: 11/09/2017] [Indexed: 12/20/2022]
Abstract
Using imperfect tests may lead to biased estimates of disease frequency and of associations between risk factors and disease. For instance, in longitudinal udder health studies, both quarters at risk and incident intramammary infections (IMI) can be wrongly identified, resulting in selection and misclassification bias, respectively. Diagnostic accuracy can possibly be improved by using duplicate or triplicate samples for identifying quarters at risk and, subsequently, incident IMI. The objectives of this study were to evaluate the relative impact of selection and misclassification biases resulting from IMI misclassification on measures of disease frequency (incidence) and of association with hypothetical exposures. The effect of improving the sampling strategy by collecting duplicate or triplicate samples at first or second sampling was also assessed. Data sets from a hypothetical cohort study were simulated and analyzed based on a separate scenario for two common mastitis pathogens representing two distinct prevailing patterns. Staphylococcus aureus, a relatively uncommon pathogen with a low incidence, is identified with excellent sensitivity (Se) and almost perfect specificity (Sp). Coagulase negative staphylococci (CNS) are more prevalent, with a high incidence, and with milk bacteriological culture having fair Se but excellent Sp. The generated data sets for each scenario emulated a longitudinal cohort study with two milk samples collected one month apart from each quarter of a random sample of 30 cows/herd, from 100 herds, with a herd-level exposure having a known strength of association. Incidence of IMI and measure of association with exposure (odds ratio; OR) were estimated using Markov Chain Monte Carlo (MCMC) for each data set and using different sampling strategies (single, duplicate, triplicate samples with series or parallel interpretation) for identifying quarters at risk and incident IMI. For S. aureus, biases were small, with an observed incidence of 0.29 versus a true incidence of 0.25 IMI/100 quarter-month. In the CNS scenario, diagnostic errors in the two samples led to important selection (40 IMI/100 quarter-month) and misclassification (23 IMI/100 quarter-month) biases for estimation of IMI incidence, respectively. These biases were in opposite directions and therefore the incidence measure obtained using single sampling on both the first and second test (29 IMI/100 quarter-month) was exactly the true value. In the S. aureus scenario the OR for association with exposure showed little bias (observed OR of 3.1 versus true OR of 3.2). The CNS scenario revealed the presence of a large misclassification bias moving the association towards the null value (OR of 1.7 versus true OR of 2.6). Little improvement could be brought using different sampling strategies aiming at improving Se and/or Sp on first and/or second sampling or using a two out of three interpretation for IMI definition. Increasing the number of samples or tests can prevent bias in some situations but efforts can be spared by holding to a single sampling approach in others. When designing longitudinal studies, evaluating potential biases and the best sampling strategy is as critical as the choice of test.
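The attenuation of the odds ratio toward the null reported for the CNS scenario can be reproduced in miniature with plain arithmetic. The sketch below applies nondifferential outcome misclassification (the same Se and Sp in both exposure groups) to a hypothetical 2×2 table. All counts and accuracy values are illustrative assumptions, and the sketch does not reproduce the study's simulation, its selection bias component, or its MCMC estimation.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Cross-product odds ratio of a 2x2 table."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


def misclassify(cases, noncases, se, sp):
    """Expected observed counts when the outcome is measured with Se and Sp."""
    observed_cases = se * cases + (1 - sp) * noncases
    observed_noncases = (1 - se) * cases + sp * noncases
    return observed_cases, observed_noncases


# Hypothetical true counts (assumptions for illustration only).
TRUE_EXPOSED = (200, 800)    # (cases, non-cases) among the exposed
TRUE_UNEXPOSED = (100, 900)  # (cases, non-cases) among the unexposed
SE, SP = 0.60, 0.95          # fair sensitivity, excellent specificity

true_or = odds_ratio(*TRUE_EXPOSED, *TRUE_UNEXPOSED)
observed_or = odds_ratio(
    *misclassify(*TRUE_EXPOSED, SE, SP),
    *misclassify(*TRUE_UNEXPOSED, SE, SP),
)

if __name__ == "__main__":
    print(f"true OR = {true_or:.2f}, observed OR = {observed_or:.2f}")
```

Because the error rates are identical in both exposure groups, the observed odds ratio lies between 1 and the true value, which is the pattern the abstract describes for the association with exposure.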
|
28
|
Gravel CA, Platt RW. Weighted estimation for confounded binary outcomes subject to misclassification. Stat Med 2017; 37:425-436. [DOI: 10.1002/sim.7522] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 08/01/2016] [Revised: 08/18/2017] [Accepted: 09/13/2017] [Indexed: 11/07/2022]
Affiliation(s)
- Christopher A. Gravel
  - Department of Epidemiology, Biostatistics and Occupational Health; McGill University; Montreal Quebec Canada
  - McLaughlin Centre for Population Health Risk Assessment; University of Ottawa; Ottawa Ontario Canada
- Robert W. Platt
  - Department of Epidemiology, Biostatistics and Occupational Health; McGill University; Montreal Quebec Canada
  - Department of Pediatrics; McGill University; Montreal Quebec Canada
|
29
|
Hajizadeh N, Baghestani AR, Pourhoseingholi MA, Ashtari S, Fazeli Z, Vahedi M, Zali MR. Trend of hepatocellular carcinoma incidence after Bayesian correction for misclassified data in Iranian provinces. World J Hepatol 2017; 9:704-710. [PMID: 28596818 PMCID: PMC5440774 DOI: 10.4254/wjh.v9.i15.704] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 01/25/2017] [Revised: 03/16/2017] [Accepted: 04/23/2017] [Indexed: 02/06/2023] Open
Abstract
AIM To study the trend of hepatocellular carcinoma incidence across Iranian provinces after correcting for misclassification in cancer registry data. METHODS Incidence data for hepatocellular carcinoma were extracted from the Iranian annual national cancer registration reports for 2004 to 2008. A Bayesian method was implemented to estimate the rate at which cancer cases were misclassified as incident in a neighboring province. A beta prior was used for the misclassification parameter. Each time, two neighboring provinces were selected for entry into the Bayesian model based on their expected coverage of cancer cases, as reported by the medical university of each province. It was assumed that some cancer cases from a province with an expected coverage below 100% were registered in a neighboring province with more medical facilities and an expected coverage above 100%. RESULTS The rate of hepatocellular carcinoma in Iran increased. Of the 30 provinces of Iran, 21 were entered into the Bayesian model to correct the existing misclassification. The provinces with more medical facilities (Tehran, the capital; Razavi Khorasan in the north-east; East Azerbaijan in the north-west; Isfahan in the central part, near Tehran; Khuzestan and Fars in the south; and Mazandaran in the north) had a coverage higher than expected. Those provinces had significantly higher rates of hepatocellular carcinoma than their neighboring provinces. From 2004 to 2008, the estimated average misclassification was 34% between North Khorasan province and Razavi Khorasan, 43% between South Khorasan province and Razavi Khorasan, 47% between Sistan and Balochestan province and Razavi Khorasan, 23% between West Azerbaijan province and East Azerbaijan province, 25% between Ardebil province and East Azerbaijan province, 41% between Hormozgan province and Fars province, 22% between Chaharmahal and Bakhtyari province and Isfahan province, 22% between Kogiloye and Boyerahmad province and Isfahan, 22% between Golestan province and Mazandaran province, 43% between Bushehr province and Khuzestan province, 41% between Ilam province and Khuzestan province, 42% between Qazvin province and Tehran province, 44% between Markazi province and Tehran, and 30% between Qom province and Tehran. CONCLUSION Accounting for and correcting regional misclassification is necessary for identifying high-risk areas and for planning to reduce cancer incidence.
Affiliation(s)
- Nastaran Hajizadeh
  - Nastaran Hajizadeh, Sara Ashtari, Mohammad Reza Zali, Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran 1985717413, Iran
- Ahmad Reza Baghestani
  - Nastaran Hajizadeh, Sara Ashtari, Mohammad Reza Zali, Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran 1985717413, Iran
- Mohamad Amin Pourhoseingholi
  - Nastaran Hajizadeh, Sara Ashtari, Mohammad Reza Zali, Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran 1985717413, Iran
- Sara Ashtari
  - Nastaran Hajizadeh, Sara Ashtari, Mohammad Reza Zali, Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran 1985717413, Iran
- Zeinab Fazeli
  - Nastaran Hajizadeh, Sara Ashtari, Mohammad Reza Zali, Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran 1985717413, Iran
- Mohsen Vahedi
  - Nastaran Hajizadeh, Sara Ashtari, Mohammad Reza Zali, Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran 1985717413, Iran
- Mohammad Reza Zali
  - Nastaran Hajizadeh, Sara Ashtari, Mohammad Reza Zali, Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran 1985717413, Iran
|
30
|
Condas LAZ, De Buck J, Nobrega DB, Carson DA, Naushad S, De Vliegher S, Zadoks RN, Middleton JR, Dufour S, Kastelic JP, Barkema HW. Prevalence of non-aureus staphylococci species causing intramammary infections in Canadian dairy herds. J Dairy Sci 2017; 100:5592-5612. [PMID: 28527793 DOI: 10.3168/jds.2016-12478] [Citation(s) in RCA: 65] [Impact Index Per Article: 8.1] [Received: 12/17/2016] [Accepted: 03/25/2017] [Indexed: 01/26/2023]
Abstract
Non-aureus staphylococci (NAS), the microorganisms most frequently isolated from bovine milk worldwide, are a heterogeneous group of numerous species. To establish their importance as a group, the distribution of individual species needs to be determined. In the present study, NAS intramammary infection (IMI) was defined as a milk sample containing ≥1,000 cfu/mL in pure or mixed culture that was obtained from a cohort of cows assembled by the Canadian Bovine Mastitis Research Network. Overall, 6,213 (6.3%) of 98,233 quarter-milk samples from 5,149 cows and 20,305 udder quarters were associated with an NAS IMI. Of the 6,213 phenotypically identified NAS isolates, 5,509 (89%) were stored by the Canadian Bovine Mastitis Research Network Mastitis Pathogen Collection and characterized using partial sequencing of the rpoB housekeeping gene, confirming 5,434 isolates as NAS. Prevalence of each NAS species IMI was estimated using Bayesian models, with presence of a specific NAS species as the outcome. Overall quarter-level NAS IMI prevalence was 26%. The most prevalent species causing IMI were Staphylococcus chromogenes (13%), Staphylococcus simulans (4%), Staphylococcus haemolyticus (3%), Staphylococcus xylosus (2%), and Staphylococcus epidermidis (1%). The prevalence of NAS IMI as a group was highest in first-parity heifers and was evenly distributed throughout cows in parities ≥2. The IMI prevalence of some species such as S. chromogenes, S. simulans, and S. epidermidis differed among parities. Overall prevalence of NAS IMI was 35% at calving, decreased over the next 10 d, and then gradually increased until the end of lactation. The prevalence of S. chromogenes, Staphylococcus gallinarum, Staphylococcus cohnii, and Staphylococcus capitis was highest at calving, whereas the prevalence of S. chromogenes, S. haemolyticus, S. xylosus, and S. cohnii increased during lactation. Although the overall prevalence of NAS IMI was similar across barn types, the prevalence of S. simulans, S. xylosus, S. cohnii, Staphylococcus saprophyticus, S. capitis, and Staphylococcus arlettae IMI was higher in tiestall barns; the prevalence of S. epidermidis IMI was lowest; and the prevalence of S. chromogenes and Staphylococcus sciuri IMI was highest in bedded-pack barns. Staphylococcus simulans, S. epidermidis, S. xylosus, and S. cohnii IMI were more prevalent in herds with intermediate to high bulk milk somatic cell count (BMSCC) and S. haemolyticus IMI was more prevalent in herds with high BMSCC, whereas other common NAS species IMI were equally prevalent in all 3 BMSCC categories. Distribution of NAS species IMI differed among the 4 regions of Canada. In conclusion, distribution differed considerably among NAS species IMI; therefore, accurate identification (species level) is essential for studying NAS epidemiology.
Affiliation(s)
- Larissa A Z Condas
  - Department of Production Animal Health, Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta T2N 4N1, Canada; Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, Québec J2S 7C6, Canada
- Jeroen De Buck
  - Department of Production Animal Health, Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta T2N 4N1, Canada; Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, Québec J2S 7C6, Canada
- Diego B Nobrega
  - Department of Production Animal Health, Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta T2N 4N1, Canada; Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, Québec J2S 7C6, Canada
- Domonique A Carson
  - Department of Production Animal Health, Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta T2N 4N1, Canada; Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, Québec J2S 7C6, Canada
- Sohail Naushad
  - Department of Production Animal Health, Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta T2N 4N1, Canada; Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, Québec J2S 7C6, Canada
- Sarne De Vliegher
  - M-Team and Mastitis and Milk Quality Research Unit, Department of Reproduction, Obstetrics and Herd Health, Faculty of Veterinary Medicine, Ghent University, Salisburylaan 133, 9820 Merelbeke, Belgium
- Ruth N Zadoks
  - Institute of Biodiversity, Animal Health and Comparative Medicine, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, G61 1QH, Scotland, United Kingdom
- John R Middleton
  - Department of Veterinary Medicine and Surgery, University of Missouri, Columbia 65211
- Simon Dufour
  - Department of Pathology and Microbiology, Faculty of Veterinary Medicine, University of Montreal, C. P. 5000, St-Hyacinthe, Québec J2S 7C6, Canada; Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, Québec J2S 7C6, Canada
- John P Kastelic
  - Department of Production Animal Health, Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta T2N 4N1, Canada
- Herman W Barkema
  - Department of Production Animal Health, Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta T2N 4N1, Canada; Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, Québec J2S 7C6, Canada
|
31
|
Condas LAZ, De Buck J, Nobrega DB, Carson DA, Roy JP, Keefe GP, DeVries TJ, Middleton JR, Dufour S, Barkema HW. Distribution of non-aureus staphylococci species in udder quarters with low and high somatic cell count, and clinical mastitis. J Dairy Sci 2017; 100:5613-5627. [PMID: 28456402 DOI: 10.3168/jds.2016-12479] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Received: 12/17/2016] [Accepted: 03/11/2017] [Indexed: 01/01/2023]
Abstract
The effect of non-aureus staphylococci (NAS) in bovine mammary health is controversial. Overall, NAS intramammary infections (IMI) increase somatic cell count (SCC), with an effect categorized as mild, mostly causing subclinical or mild to moderate clinical mastitis. However, based on recent studies, specific NAS may affect the udder more severely. Some of these apparent discrepancies could be attributed to the large number of species that compose the NAS group. The objectives of this study were to determine (1) the SCC of quarters infected by individual NAS species compared with NAS as a group, culture-negative, and major pathogen-infected quarters; (2) the distribution of NAS species isolated from quarters with low SCC (<200,000 cells/mL) and high SCC (≥200,000 cells/mL), and clinical mastitis; and (3) the prevalence of NAS species across quarters with low and high SCC. A total of 5,507 NAS isolates, 3,561 from low SCC quarters, 1,873 from high SCC quarters, and 73 from clinical mastitis cases, were obtained from the National Cohort of Dairy Farms of the Canadian Bovine Mastitis Research Network. Of quarters with low SCC, high SCC, or clinical mastitis, 7.6, 18.5, and 4.3% were NAS positive, respectively. The effect of NAS IMI on SCC was estimated using mixed-effect linear regression; prevalence of NAS IMI was estimated using Bayesian analyses. Mean SCC of NAS-positive quarters was 70,000 cells/mL, which was higher than culture-negative quarters (32,000 cells/mL) and lower than major pathogen-positive quarters (129,000 to 183,000 cells/mL). Compared with other NAS species, SCC was highest in quarters positive for Staphylococcus capitis, Staphylococcus gallinarum, Staphylococcus hyicus, Staphylococcus agnetis, or Staphylococcus simulans. 
In NAS-positive quarters, Staphylococcus xylosus (12.6%), Staphylococcus cohnii (3.1%), and Staphylococcus equorum (0.6%) were more frequently isolated from quarters with low SCC than other NAS species, whereas Staphylococcus sciuri (14%) was most frequently isolated from clinical mastitis cases. Finally, in NAS-positive quarters, Staphylococcus chromogenes, S. simulans, Staphylococcus epidermidis, and Staphylococcus haemolyticus were isolated with similar frequency from among low SCC and high SCC quarters and clinical mastitis cases. Staphylococcus chromogenes, S. simulans, S. xylosus, S. haemolyticus, S. epidermidis, S. agnetis, Staphylococcus arlettae, S. capitis, S. gallinarum, S. sciuri, and Staphylococcus warneri were more prevalent in high than in low SCC quarters. Because the NAS are a large, heterogeneous group, considering them as a single group rather than at the species, or even subspecies level, has undoubtedly contributed to apparent discrepancies among studies as to their distribution and importance in IMI and mastitis.
Affiliation(s)
- Larissa A Z Condas
  - Department of Production Animal Health, Faculty of Veterinary Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada; Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, QC, J2S 7C6, Canada
- Jeroen De Buck
  - Department of Production Animal Health, Faculty of Veterinary Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada; Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, QC, J2S 7C6, Canada
- Diego B Nobrega
  - Department of Production Animal Health, Faculty of Veterinary Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada; Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, QC, J2S 7C6, Canada
- Domonique A Carson
  - Department of Production Animal Health, Faculty of Veterinary Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada; Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, QC, J2S 7C6, Canada
- Jean-Philippe Roy
  - Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, QC, J2S 7C6, Canada; Department of Clinical Sciences, Faculty of Veterinary Medicine, University of Montreal, C.P. 5000, St-Hyacinthe, QC J2S 7C6, Canada
- Greg P Keefe
  - Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, QC, J2S 7C6, Canada; Department of Health Management, Atlantic Veterinary College, University of Prince Edward Island, Charlottetown, PE C1A 4P3, Canada
- Trevor J DeVries
  - Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, QC, J2S 7C6, Canada; Department of Animal Biosciences, Ontario Agricultural College, University of Guelph, Guelph, ON N1G 2W1, Canada
- John R Middleton
  - Department of Veterinary Medicine and Surgery, University of Missouri, Columbia 65211
- Simon Dufour
  - Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, QC, J2S 7C6, Canada; Department of Pathology and Microbiology, Faculty of Veterinary Medicine, University of Montreal, C.P. 5000, St-Hyacinthe, QC J2S 7C6, Canada
- Herman W Barkema
  - Department of Production Animal Health, Faculty of Veterinary Medicine, University of Calgary, Calgary, AB T2N 4N1, Canada; Canadian Bovine Mastitis and Milk Quality Research Network, St-Hyacinthe, QC, J2S 7C6, Canada
|
32
|
Hajizadeh N, Pourhoseingholi MA, Baghestani AR, Abadi A, Zali MR. Bayesian adjustment of gastric cancer mortality rate in the presence of misclassification. World J Gastrointest Oncol 2017; 9:160-165. [PMID: 28451063 PMCID: PMC5390301 DOI: 10.4251/wjgo.v9.i4.160] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/25/2016] [Revised: 12/24/2016] [Accepted: 01/11/2017] [Indexed: 02/05/2023] Open
Abstract
AIM To correct for misclassification error in registered causes of death in the Iranian death registry using a Bayesian method. METHODS National death statistics for gastric cancer from 2006 to 2010, reported annually by the Ministry of Health and Medical Education, were included in this study. To correct the gastric cancer mortality rate by reassigning deaths due to gastric cancer that were registered as cancer without detail, a Bayesian method was implemented with Poisson count regression and a beta prior for the misclassification parameter, assuming 20% misclassification in registering causes of death in Iran. RESULTS Registered mortality due to gastric cancer from 2006 to 2010 was considered in this study. According to the Bayesian re-estimate, about 3%-7% of deaths due to gastric cancer were registered as cancer without further detail, which leads to an undercount of gastric cancer mortality in the Iranian population. The number and age-standardized rate of gastric cancer deaths were estimated to be 5805 (10.17 per 100000 population), 5862 (10.51 per 100000 population), 5731 (10.23 per 100000 population), 5946 (10.44 per 100000 population), and 6002 (10.35 per 100000 population), respectively, for the years 2006 to 2010. CONCLUSION Gastric cancer mortality is undercounted in Iranian registry data, and researchers and authorities should take this into account in subsequent estimates and policy making.
|
33
|
Zawistowski M, Sussman JB, Hofer TP, Bentley D, Hayward RA, Wiitala WL. Corrected ROC analysis for misclassified binary outcomes. Stat Med 2017; 36:2148-2160. [PMID: 28245528 DOI: 10.1002/sim.7260] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 05/06/2016] [Revised: 01/25/2017] [Accepted: 01/26/2017] [Indexed: 11/06/2022]
Abstract
Creating accurate risk prediction models from Big Data resources such as Electronic Health Records (EHRs) is a critical step toward achieving precision medicine. A major challenge in developing these tools is accounting for imperfect aspects of EHR data, particularly the potential for misclassified outcomes. Misclassification, the swapping of case and control outcome labels, is well known to bias effect size estimates for regression prediction models. In this paper, we study the effect of misclassification on accuracy assessment for risk prediction models and find that it leads to bias in the area under the curve (AUC) metric from standard ROC analysis. The extent of the bias is determined by the false positive and false negative misclassification rates as well as disease prevalence. Notably, we show that simply correcting for misclassification while building the prediction model is not sufficient to remove the bias in AUC. We therefore introduce an intuitive misclassification-adjusted ROC procedure that accounts for uncertainty in observed outcomes and produces bias-corrected estimates of the true AUC. The method requires that misclassification rates are either known or can be estimated, quantities typically required for the modeling step. The computational simplicity of our method is a key advantage, making it ideal for efficiently comparing multiple prediction models on very large datasets. Finally, we apply the correction method to a hospitalization prediction model from a cohort of over 1 million patients from the Veterans Health Administration's EHR. Implementations of the ROC correction are provided for Stata and R. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
Affiliation(s)
- Matthew Zawistowski
  - Veterans Affairs Center for Clinical Management Research, Ann Arbor, 48105, MI, U.S.A.
  - Department of Biostatistics, University of Michigan, Ann Arbor, 48109, MI, U.S.A.
- Jeremy B Sussman
  - Veterans Affairs Center for Clinical Management Research, Ann Arbor, 48105, MI, U.S.A.
  - Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, 48109, MI, U.S.A.
- Timothy P Hofer
  - Veterans Affairs Center for Clinical Management Research, Ann Arbor, 48105, MI, U.S.A.
  - Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, 48109, MI, U.S.A.
- Douglas Bentley
  - Veterans Affairs Center for Clinical Management Research, Ann Arbor, 48105, MI, U.S.A.
- Rodney A Hayward
  - Veterans Affairs Center for Clinical Management Research, Ann Arbor, 48105, MI, U.S.A.
  - Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, 48109, MI, U.S.A.
- Wyndy L Wiitala
  - Veterans Affairs Center for Clinical Management Research, Ann Arbor, 48105, MI, U.S.A.
|
34
|
Hajizadeh N, Pourhoseingholi MA, Baghestani AR, Abadi A, Zali MR. Bayesian adjustment for over-estimation and under-estimation of gastric cancer incidence across Iranian provinces. World J Gastrointest Oncol 2017; 9:87-93. [PMID: 28255430 PMCID: PMC5314205 DOI: 10.4251/wjgo.v9.i2.87] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 08/17/2016] [Revised: 11/02/2016] [Accepted: 11/27/2016] [Indexed: 02/05/2023] Open
Abstract
AIM To correct the misclassification in registered gastric cancer incidence across Iranian provinces in cancer registry data. METHODS Gastric cancer data were extracted from the Iranian annual national cancer registration report for 2008. A Bayesian method with a beta prior was implemented to estimate the rate of misclassification that arises when a patient's permanent residence is registered in a neighboring province. Each time, two neighboring provinces, one with lower and one with higher than 100% expected coverage of cancer cases, were selected for entry into the model. The expected coverage of cancer patients is reported by the medical university of each province. It was assumed that some cancer cases from a province with a lower than 100% expected coverage were registered in a neighboring province with more than 100% expected coverage. RESULTS The condition was true for 21 of the 30 provinces of Iran. It was estimated that 43% of gastric cancer cases from North and South Khorasan provinces in the north-east of Iran were registered in Razavi Khorasan, the neighboring province with more medical facilities; in addition, 72% misclassification was estimated between Sistan and Balochestan province and Razavi Khorasan. The misclassification rate was estimated to be 36% between West Azerbaijan province and East Azerbaijan province, 21% between Ardebil province and East Azerbaijan, 63% between Hormozgan province and Fars province, 8% between Chaharmahal and Bakhtyari province and Isfahan province, 8% between Kogiloye and Boyerahmad province and Isfahan, 43% between Golestan province and Mazandaran province, 54% between Bushehr province and Khuzestan province, 26% between Ilam province and Khuzestan province, 32% between Qazvin province and Tehran province (capital of Iran), 43% between Markazi province and Tehran, and 37% between Qom province and Tehran. CONCLUSION Policy makers should consider regional misclassification when planning cancer control, prevention and resource allocation.
|
35
|
Abstract
We consider the problem of variable selection for logistic regression when the dependent variable is measured imperfectly, under both differential and non-differential misclassification. An MCMC sampling scheme is designed, incorporating uncertainty about which explanatory variables affect the dependent variable and which affect the probability of misclassification. We assume that a small gold standard perfectly measured sample is available to augment the imperfectly measured sample, under the differential misclassification framework. A simulation study illustrates favourable results both in terms of variable selection and parameter estimation. Examples analysing the risk of violence against young women by their partner and the risk of injury in highway motor accidents are considered.
Affiliation(s)
- Richard Gerlach
- Richard Gerlach, Discipline of Econometrics and Business Statistics, Faculty of Economics and Business, University of Sydney, H04, Sydney, NSW, Australia, 2006
|
36
|
Maheu-Giroux M, Filippi V, Maulet N, Samadoulougou S, Castro MC, Meda N, Pouliot M, Kirakoya-Samadoulougou F. Risk factors for vaginal fistula symptoms in Sub-Saharan Africa: a pooled analysis of national household survey data. BMC Pregnancy Childbirth 2016; 16:82. [PMID: 27098261 PMCID: PMC4839076 DOI: 10.1186/s12884-016-0871-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 01/06/2016] [Accepted: 04/14/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Vaginal fistula (VF) is one of the most severe maternal morbidities with the immediate consequence of chronic urinary and/or fecal incontinence. The epidemiological evidence regarding risk factors for VF is dominated by facility-based studies. Our aim is to estimate the effect size of selected risk factors for VF using population-based survey data. METHODS We pooled all available Demographic and Health Surveys and Multiple Indicators Cluster Surveys carried out in sub-Saharan Africa that collected information on VF symptoms. Bayesian matched logistic regression models that accounted for the imperfect sensitivity and specificity of self-reports of VF symptoms were used for effect size estimation. RESULTS Up to 27 surveys were pooled, including responses from 332,889 women. Being able to read decreased the odds of VF by 13% (95% Credible Intervals (CrI): 1% to 23%), while higher odds of VF symptoms were observed for women of short stature (<150 cm) (Odds Ratio (OR) = 1.31; 95% CrI: 1.02-1.68), those that had experienced intimate partner sexual violence (OR = 2.13; 95% CrI: 1.60-2.86), those that reported sexual debut before the age of 14 (OR = 1.41; 95% CrI: 1.16-1.71), and those that reported a first birth before the age of 14 (OR = 1.39; 95% CrI: 1.04-1.82). The effect of post-primary education, female genital mutilation, and having problems obtaining permission to seek health care were not statistically significant. CONCLUSIONS Increasing literacy, delaying age at first sex/birth, and preventing sexual violence could contribute to the elimination of obstetric fistula. Concomitant improvements in access to quality sexual and reproductive healthcare are, however, required to end fistula in sub-Saharan Africa.
Affiliation(s)
- Mathieu Maheu-Giroux
- Department of Infectious Disease Epidemiology, Imperial College London, St Mary’s Hospital, London, UK
| | - Véronique Filippi
- Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, UK
| | - Nathalie Maulet
- Institute of Health and Society, Université Catholique de Louvain, Clos Chapelle-aux-champs, Brussels, Belgium
| | - Sékou Samadoulougou
- Pôle Épidemiologie et Biostatistique, Institute de recherche expérimentale et Clinique, Université Catholique de Louvain, Clos Chapelle-aux-champs, Brussels, Belgium
| | - Marcia C. Castro
- Department of Global Health and Population, Harvard TH Chan School of Public Health, Boston, MA USA
| | - Nicolas Meda
- Centre Muraz, Ministry of Health, Bobo-Dioulasso, Burkina Faso
- UFR Sciences de la Santé, Université de Ouagadougou, Ouagadougou, Burkina Faso
| | - Mariève Pouliot
- Institute of Food and Resources Economics, Section for Global Development, University of Copenhagen, Copenhagen, Denmark
|
37
|
Burgos JL, Patterson TL, Graff-Zivin JS, Kahn JG, Rangel MG, Lozada MR, Staines H, Strathdee SA. Cost-Effectiveness of Combined Sexual and Injection Risk Reduction Interventions among Female Sex Workers Who Inject Drugs in Two Very Distinct Mexican Border Cities. PLoS One 2016; 11:e0147719. [PMID: 26890001 PMCID: PMC4758635 DOI: 10.1371/journal.pone.0147719] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 11/05/2014] [Accepted: 01/07/2016] [Indexed: 12/31/2022] Open
Abstract
Background We evaluated the cost-effectiveness of a combined single-session brief behavioral intervention, either didactic or interactive (Mujer Mas Segura, MMS), to promote safer-sex and safer-injection practices among female sex workers who inject drugs (FSW-IDUs) in Tijuana (TJ) and Ciudad-Juarez (CJ), Mexico. Data for this analysis were obtained from a factorial RCT in 2008–2010 coinciding with expansion of needle exchange programs (NEP) in TJ, but not in CJ. Methods A Markov model was developed to estimate the incremental cost per quality adjusted life year (QALY) gained over a lifetime time frame among a hypothetical cohort of 1,000 FSW-IDUs, comparing a less intensive didactic vs. a more intensive interactive format of the MMS, separately for safer sex and safer injection combined behavioral interventions. The costs of antiretroviral therapy were not included in the model. We applied a societal perspective, a discount rate of 3% per year, and currency adjusted to 2014 US$. A multivariate sensitivity analysis was performed. The combined and individual components of the MMS interactive behavioral intervention were compared with the didactic formats by calculating the incremental cost-effectiveness ratios (ICER), defined as incremental unit of cost per additional health benefit (e.g., HIV/STI cases averted, QALYs) compared to the next least costly strategy. Following guidelines from the World Health Organization, a combined strategy was considered highly cost-effective if the incremental cost per QALY gained fell below the gross domestic product (GDP) per capita in Mexico (equivalent to US$10,300). Findings For CJ, the mixed intervention approach of interactive safer sex/didactic safer injection had an ICER of US$4,360 ($310–$7,200) per QALY gained compared with a dually didactic strategy. Using the dually interactive strategy had an ICER of US$5,874 ($310–$7,200) compared with the mixed approach.
For TJ, the combination of interactive safer sex/didactic safer injection had an ICER of US$5,921 ($104–$9,500) per QALY compared with dually didactic. Strategies using the interactive safe injection intervention were dominated due to lack of efficacy advantage. The multivariate sensitivity analysis showed a 95% certainty that in both CJ and TJ the ICER for the mixed approach (interactive safer sex/didactic safer injection intervention) was less than the GDP per capita for Mexico. The dual interactive approach met this threshold consistently in CJ, but not in TJ. Interpretation In the absence of an expanded NEP in CJ, the combined-interactive formats of the MMS behavioral intervention are highly cost-effective. In contrast, in TJ, where NEP expansion suggests that improved access to sterile syringes significantly reduced injection-related risks, the interactive safer-sex combined with didactic safer-injection format was highly cost-effective compared with the combined didactic versions of the safer-sex and safer-injection formats of the MMS, with no added benefit from the interactive safer-injection component.
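The ICER definition quoted in this abstract (incremental cost per additional QALY relative to the next least costly strategy, judged against a GDP-per-capita threshold) can be illustrated with a minimal Python sketch. The function name `icer` and all cost and QALY figures below are hypothetical and chosen only for illustration; they are not taken from the study's Markov model. Only the US$10,300 threshold comes from the abstract.

```python
from typing import Optional


def icer(cost_new: float, qaly_new: float,
         cost_ref: float, qaly_ref: float) -> Optional[float]:
    """Incremental cost per QALY gained versus a reference strategy.

    Returns None when the new strategy yields no additional QALYs,
    because no meaningful ratio exists in that case (the strategy is
    dominated if it also costs more).
    """
    delta_qaly = qaly_new - qaly_ref
    if delta_qaly <= 0:
        return None
    return (cost_new - cost_ref) / delta_qaly


# Threshold quoted in the abstract: GDP per capita in Mexico (US$).
GDP_PER_CAPITA = 10_300.0

# Hypothetical per-person lifetime costs (US$) and QALYs, illustration only.
didactic = {"cost": 1_000.0, "qaly": 10.00}
mixed = {"cost": 1_400.0, "qaly": 10.10}

ratio = icer(mixed["cost"], mixed["qaly"], didactic["cost"], didactic["qaly"])
highly_cost_effective = ratio is not None and ratio < GDP_PER_CAPITA
print(f"ICER: {ratio:,.0f} US$ per QALY; below threshold: {highly_cost_effective}")
```

With these made-up inputs the extra US$400 buys about 0.10 QALY, so the ratio is roughly US$4,000 per QALY and falls below the threshold. A strategy that adds cost without adding QALYs returns `None`, mirroring the "dominated" strategies mentioned in the findings.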
Affiliation(s)
- Jose L. Burgos
- University of California San Diego, Department of Medicine, Division of Global Public Health, La Jolla, California, United States of America
- Universidad Autonoma de Baja California, Facultad de Medicina y Psicología, Tijuana, Baja California, México
| | - Thomas L. Patterson
- University of California San Diego, Department of Psychiatry, La Jolla, California, United States of America
| | - Joshua S. Graff-Zivin
- University of California San Diego, School of Global Policy and Strategy, La Jolla, California, United States of America
| | - James G. Kahn
- University of California San Francisco, Department of Epidemiology and Biostatistics, Philip R. Lee Institute for Health Policy Studies, Global Health Sciences, San Francisco, California, United States of America
| | - M. Gudelia Rangel
- Secretaria de Salud de México, Comision de Salud Fronteriza Mexico-Estados Unidos Sección México, Tijuana, Baja California, México
| | - M. Remedios Lozada
- Instituto de Servicios de Salud Pública del Estado de Baja California, Mexicali, Baja California, Mexico
| | - Hugo Staines
- Universidad Autonoma de Ciudad Juarez, Facultad de Medicina, Ciudad Juárez, Chihuahua, México
| | - Steffanie A. Strathdee
- University of California San Diego, Department of Medicine, Division of Global Public Health, La Jolla, California, United States of America
|
38
|
Luo S, Chan W, Detry MA, Massman PJ, Doody RS. Binomial regression with a misclassified covariate and outcome. Stat Methods Med Res 2016; 25:101-17. [PMID: 22421539 PMCID: PMC3883897 DOI: 10.1177/0962280212441965] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 11/16/2022]
Abstract
Misclassification occurring in either outcome variables or categorical covariates or both is a common issue in medical science. It leads to biased results and distorted disease-exposure relationships. Moreover, it is often of clinical interest to obtain the estimates of sensitivity and specificity of some diagnostic methods even when neither gold standard nor prior knowledge about the parameters exists. We present a novel Bayesian approach in binomial regression when both the outcome variable and one binary covariate are subject to misclassification. Extensive simulation results under various scenarios and a real clinical example are given to illustrate the proposed approach. This approach is motivated and applied to a dataset from the Baylor Alzheimer's Disease and Memory Disorders Center.
Affiliation(s)
- Sheng Luo
- Division of Biostatistics, The University of Texas Health Science Center at Houston, Houston, USA
| | - Wenyaw Chan
- Division of Biostatistics, The University of Texas Health Science Center at Houston, Houston, USA
| | - Michelle A Detry
- Department of Biostatistics and Medical Informatics, The University of Wisconsin-Madison, Madison, USA
| | - Paul J Massman
- Department of Psychology, University of Houston, Houston, USA
| | - Rachelle S Doody
- Department of Neurology, Baylor College of Medicine, Houston, USA
|
39
|
Karim ME, Gustafson P. Hypothesis Testing for an Exposure–Disease Association in Case–Control Studies Under Nondifferential Exposure Misclassification in the Presence of Validation Data: Bayesian and Frequentist Adjustments. Statistics in Biosciences 2016. [DOI: 10.1007/s12561-015-9141-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/22/2022]
|
40
|
Kazembe LN, Kamndaya MS. Hierarchical spatial modelling of pneumonia prevalence when response outcome has misclassification error: Applications to household data from Malawi. Spat Spatiotemporal Epidemiol 2015; 16:35-42. [PMID: 26919753 DOI: 10.1016/j.sste.2015.11.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 04/23/2015] [Revised: 10/22/2015] [Accepted: 11/04/2015] [Indexed: 01/05/2023]
Abstract
Pneumonia remains a major cause of child mortality in less developed countries. However, the accuracy of its prevalence and burden remains a challenge because disease data are often based on self-reports, resulting in measurement error in the form of under- and over-reporting. We propose hierarchical disease mapping approaches that permit measurement error, through different prior distributions of sensitivity and specificity. The proposed models were used to evaluate spatial variation of risk of pneumonia in children in Malawi. Results show that the true prevalence was 0.50 (95% CI: 0.4-0.66); however, estimates were dependent on sensitivity and specificity parameters. The estimated sensitivity was 0.76 (95% CI: 0.68-0.95), whereas specificity was 0.84 (95% CI: 0.72-0.93). A lower specificity underestimated the true prevalence, while sensitivity and specificity greater than or equal to 0.75 provided reliable and stable prevalence estimates. The spatial variation in disease risk changed little; however, misclassification of areas as high risk was visible.
Affiliation(s)
- Lawrence N Kazembe
- Department of Statistics and Population Studies, University of Namibia, Private Bag 13301 Windhoek, 340 Mandume Ndemufayo Avenue, Pionerspark, Namibia.
| | - Mphatso S Kamndaya
- School of Public Health, Faculty of Health Sciences, University of Witwatersrand, Johannesburg, South Africa
|
41
|
Valle D, Lima JMT, Millar J, Amratia P, Haque U. Bias in logistic regression due to imperfect diagnostic test results and practical correction approaches. Malar J 2015; 14:434. [PMID: 26537373 PMCID: PMC4634725 DOI: 10.1186/s12936-015-0966-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 08/20/2015] [Accepted: 10/24/2015] [Indexed: 11/14/2022] Open
Abstract
Background Logistic regression is a statistical model widely used in cross-sectional and cohort studies to identify and quantify the effects of potential disease risk factors. However, the impact of imperfect tests on adjusted odds ratios (and thus on the identification of risk factors) is under-appreciated. The purpose of this article is to draw attention to the problem associated with modelling imperfect diagnostic tests, and propose simple Bayesian models to adequately address this issue. Methods A systematic literature review was conducted to determine the proportion of malaria studies that appropriately accounted for false-negatives/false-positives in a logistic regression setting. Inference from the standard logistic regression was also compared with that from three proposed Bayesian models using simulations and malaria data from the western Brazilian Amazon. Results A systematic literature review suggests that malaria epidemiologists are largely unaware of the problem of using logistic regression to model imperfect diagnostic test results. Simulation results reveal that statistical inference can be substantially improved when using the proposed Bayesian models versus the standard logistic regression. Finally, analysis of original malaria data with one of the proposed Bayesian models reveals that microscopy sensitivity is strongly influenced by how long people have lived in the study region, and an important risk factor (i.e., participation in forest extractivism) is identified that would have been missed by standard logistic regression. Conclusion Given the numerous diagnostic methods employed by malaria researchers and the ubiquitous use of logistic regression to model the results of these diagnostic tests, this paper provides critical guidelines to improve data analysis practice in the presence of misclassification error. 
Easy-to-use code that can be readily adapted to WinBUGS is provided, enabling straightforward implementation of the proposed Bayesian models.
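The bias described in this abstract can be illustrated without any model fitting. When a test with sensitivity Se and specificity Sp replaces the true status, the probability of a positive result is Se·p + (1 − Sp)·(1 − p), and the odds ratio computed from test results is pulled toward 1. The Python sketch below is not the authors' WinBUGS code or any of their Bayesian models. It only evaluates that observation equation for hypothetical risks and test characteristics, and the function names are likewise assumptions made for illustration.

```python
def observed_prob(p_true: float, sensitivity: float, specificity: float) -> float:
    """Probability of a positive test given the true disease probability."""
    return sensitivity * p_true + (1.0 - specificity) * (1.0 - p_true)


def odds_ratio(p_exposed: float, p_unexposed: float) -> float:
    """Odds ratio comparing two outcome probabilities."""
    return (p_exposed / (1.0 - p_exposed)) / (p_unexposed / (1.0 - p_unexposed))


# Hypothetical true infection risks and test characteristics (illustration only).
p_unexposed, p_exposed = 0.10, 0.25
se, sp = 0.80, 0.95

true_or = odds_ratio(p_exposed, p_unexposed)
naive_or = odds_ratio(
    observed_prob(p_exposed, se, sp),
    observed_prob(p_unexposed, se, sp),
)
print(f"true OR = {true_or:.2f}, OR from imperfect test = {naive_or:.2f}")
```

Under these assumed values the true odds ratio of 3.0 shrinks to about 2.18 when the imperfect test results are analysed as if they were the truth. A standard logistic regression fitted to such results would estimate the attenuated quantity.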
Affiliation(s)
- Denis Valle
- School of Forest Resources and Conservation, University of Florida, Gainesville, USA.
| | - Joanna M Tucker Lima
- School of Forest Resources and Conservation, University of Florida, Gainesville, USA.
| | - Justin Millar
- School of Forest Resources and Conservation, University of Florida, Gainesville, USA.
| | - Punam Amratia
- School of Forest Resources and Conservation, University of Florida, Gainesville, USA.
| | - Ubydul Haque
- Emerging Pathogens Institute, University of Florida, Gainesville, USA; Geography Department, University of Florida, Gainesville, USA.
|
42
|
Branscum AJ, Johnson WO, Hanson TE, Baron AT. Flexible regression models for ROC and risk analysis, with or without a gold standard. Stat Med 2015; 34:3997-4015. [DOI: 10.1002/sim.6610] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 12/16/2014] [Accepted: 07/06/2015] [Indexed: 11/07/2022]
Affiliation(s)
- Adam J. Branscum
- Biostatistics Program; Oregon State University; Corvallis 97331 Oregon U.S.A
| | | | - Timothy E. Hanson
- Department of Statistics; University of South Carolina; Columbia SC U.S.A
|
43
|
Gradassi M, Caminiti A, Galletti G, Santi A, Paternoster G, Tamba M, Zanoni M, Tagliabue S, Alborali GL, Trevisani M. Suitability of a Salmonella control programme based on serology in slaughter heavy pigs. Res Vet Sci 2015; 101:154-60. [DOI: 10.1016/j.rvsc.2015.06.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 11/04/2014] [Revised: 06/09/2015] [Accepted: 06/27/2015] [Indexed: 10/23/2022]
|
44
|
Buczinski S, Ollivett TL, Dendukuri N. Bayesian estimation of the accuracy of the calf respiratory scoring chart and ultrasonography for the diagnosis of bovine respiratory disease in pre-weaned dairy calves. Prev Vet Med 2015; 119:227-31. [DOI: 10.1016/j.prevetmed.2015.02.018] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.3] [Received: 09/11/2014] [Revised: 02/16/2015] [Accepted: 02/17/2015] [Indexed: 10/24/2022]
|
45
|
Buczinski S, Rademacher RD, Tripp HM, Edmonds M, Johnson EG, Dufour S. Assessment of L-lactatemia as a predictor of respiratory disease recognition and severity in feedlot steers. Prev Vet Med 2014; 118:306-18. [PMID: 25537763 DOI: 10.1016/j.prevetmed.2014.12.003] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 04/08/2014] [Revised: 11/03/2014] [Accepted: 12/05/2014] [Indexed: 10/24/2022]
Abstract
The bovine respiratory disease complex (BRD) is a major health issue in feedlot cattle and one of the primary reasons for antimicrobial use in the North American feedlot industry. The purpose of the present study was to assess blood L-lactate levels of feedlot steers at high risk of developing BRD during the early feeding period. Blood samples were obtained at initial processing and again after BRD confirmation (using bronchial lavage or thoracic ultrasound exam). The study involved 232 recently weaned steers received at a single research feedlot that were processed without metaphylactic antimicrobial treatment. Blood samples were obtained for determination of L-lactatemia and temperament scores (very quiet or stoic [score 1], average [score 2] and very excited [score 3]) were systematically assigned at initial processing. A subsample of calves that were later confirmed as cases of BRD were sampled at first pull (day 0), and at subsequent observation points on days 3, 6, 9 and 15 following initial BRD diagnosis for blood lactate determination as a potential indicator of subsequent death. The clinical BRD cumulative incidence in the cohort was 38% (87/232). Temperament was associated with the probability of becoming a BRD case during the early feeding period. Stoic or very excited calves showed 2.2 times higher odds (95%CI: 1.3, 3.8) of becoming BRD cases compared to calves with average temperament. The impact of L-lactatemia differed by temperament strata. In calves with a temperament score of 2 (average temperament) every 1-log unit increase of lactatemia at processing resulted in 1.9 times higher odds (95% CI: 1.2, 3.1) of becoming a BRD case; this relationship was not significant in calves with a score of either 1 or 3. Twenty-nine confirmed BRD cases were studied for the dynamic lactate assessment analysis. 
L-lactate at first pull was not significantly different between survivors (median 3.3 mmol/L; range: 0.8-7.8 mmol/L) and non-survivors (median 2.7 mmol/L; range: 1.6-5.4 mmol/L). However, the dynamic assessment of L-lactatemia was associated with the hazard of death using Cox proportional hazard survival analysis. A 1-log increase of lactatemia increased the hazard of dying prior to the next observation by a factor of 36.5 (95% CI: 3.5-381.6). For calves showing a normal temperament score (i.e. temperament score of 2), a misclassification cost term analysis was conducted to identify potential L-lactate test thresholds for identifying future BRD steers. When the planned use of the test was to inform the decision whether to administer a metaphylactic treatment at processing, experts agreed that the false-negative (not treating a calf that would have benefited from treatment) to false-positive (wrongfully treating a calf that would have remained healthy) health costs ratio ranged from 8:1 to 20:1. In this situation, a threshold of 5 mmol/L would have best informed the treatment decision. When using L-lactate to inform the type of antimicrobial used at processing, a false-negative to false-positive health costs ratio ranging from 1:1 to 3:1 could be expected and, again, an L-lactate threshold of 5.0 mmol/L would have minimized the costs associated with calves' misclassification and could be used to identify calves that would benefit from a more efficient metaphylactic treatment. This study provides an interesting perspective on the potential application of chute-side markers or diagnostic tests to stratify the risk of future pull for BRD in cattle during processing in order to adapt antimicrobial treatments accordingly.
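The threshold search described above can be sketched in Python under one common formulation of a misclassification cost term: the average cost per animal when a false positive costs 1 unit and a false negative costs r units, which equals (1 − p)(1 − Sp) + r·p·(1 − Se) for prevalence p. Whether the study used exactly this formulation is an assumption. The lactate values, outcomes, candidate thresholds, and function names below are hypothetical and do not reproduce the study's data or its 5 mmol/L result.

```python
from typing import Sequence


def misclassification_cost(values: Sequence[float], is_case: Sequence[bool],
                           threshold: float, fn_to_fp_ratio: float) -> float:
    """Average cost per animal when 'value >= threshold' is called test-positive.

    A false positive costs 1 unit; a false negative costs fn_to_fp_ratio units.
    """
    false_neg = sum(1 for v, c in zip(values, is_case) if c and v < threshold)
    false_pos = sum(1 for v, c in zip(values, is_case) if not c and v >= threshold)
    return (false_pos + fn_to_fp_ratio * false_neg) / len(values)


def best_threshold(values: Sequence[float], is_case: Sequence[bool],
                   candidates: Sequence[float], fn_to_fp_ratio: float) -> float:
    """Candidate threshold with the lowest cost (first one wins on ties)."""
    return min(candidates,
               key=lambda t: misclassification_cost(values, is_case, t, fn_to_fp_ratio))


# Hypothetical L-lactate values (mmol/L) and later BRD status, illustration only.
lactate = [2.0, 3.0, 4.0, 4.5, 5.5, 6.0, 3.5, 7.0]
became_case = [False, False, False, True, True, True, False, True]
candidates = [3.0, 4.0, 5.0, 6.0]

for ratio in (8.0, 0.5):
    chosen = best_threshold(lactate, became_case, candidates, ratio)
    print(f"FN:FP cost ratio {ratio}: chosen threshold {chosen} mmol/L")
```

In this toy data set a high false-negative cost pushes the chosen threshold down (more animals test positive), while a low false-negative cost pushes it up. That is the general mechanism by which the assumed cost ratio drives the choice of cut-off.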
Affiliation(s)
- S Buczinski
- Faculté de médecine vétérinaire, Université de Montréal, CP 5000, Saint-Hyacinthe, Québec J2S 7C6, Canada.
| | - R D Rademacher
- College of Veterinary Medicine, Oregon State University, Corvallis, OR 97331, United States
| | - H M Tripp
- Center for Veterinary Health Sciences, Oklahoma State University, Stillwater, OK 74078, United States
| | - M Edmonds
- Johnson Research LLC, Parma, ID, United States
| | - E G Johnson
- Johnson Research LLC, Parma, ID, United States
| | - S Dufour
- Faculté de médecine vétérinaire, Université de Montréal, CP 5000, Saint-Hyacinthe, Québec J2S 7C6, Canada
|
46
|
Angelidou E, Kostoulas P, Leontides L. Flock-level factors associated with the risk of Mycobacterium avium subsp. paratuberculosis (MAP) infection in Greek dairy goat flocks. Prev Vet Med 2014; 117:233-41. [DOI: 10.1016/j.prevetmed.2014.09.002] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 06/12/2014] [Revised: 09/03/2014] [Accepted: 09/03/2014] [Indexed: 11/28/2022]
|
47
|
Tang L, Lyles RH, King CC, Hogan JW, Lo Y. Regression Analysis for Differentially Misclassified Correlated Binary Outcomes. J R Stat Soc Ser C Appl Stat 2014; 64:433-449. [PMID: 26005223 DOI: 10.1111/rssc.12081] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Abstract
In many epidemiological and clinical studies, misclassification may arise in one or several variables, resulting in potentially invalid analytic results (e.g., estimates of odds ratios of interest) when no correction is made. Here we consider the situation in which correlated binary response variables are subject to misclassification. Building upon prior work, we provide an approach to adjust for potentially complex differential misclassification via internal validation sampling applied at multiple study time points. We seek to estimate the parameters of a primary generalized linear mixed model (GLMM) that accounts for baseline and/or time-dependent covariates. The misclassification process is modeled via a second generalized linear model that captures variations in sensitivity and specificity parameters according to time and a set of subject-specific covariates that may or may not overlap with those in the primary model. Simulation studies demonstrate the precision and validity of the proposed method. An application is presented based on longitudinal assessments of bacterial vaginosis conducted in the HIV Epidemiology Research (HER) Study.
Affiliation(s)
- Li Tang
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, U.S.A
| | - Robert H Lyles
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health of Emory University, Atlanta, Georgia 30322, U.S.A
| | - Caroline C King
- Division of Reproductive Health, Centers for Disease Control and Prevention, Atlanta, Georgia 30322, U.S.A
| | - Joseph W Hogan
- Center for Statistical Sciences, Program in Public Health, Brown University, Providence, RI 02912, U.S.A
| | - Yungtai Lo
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, New York 10461, U.S.A
|
48
|
Gilbert R, Martin RM, Donovan J, Lane JA, Hamdy F, Neal DE, Metcalfe C. Misclassification of outcome in case-control studies: Methods for sensitivity analysis. Stat Methods Med Res 2014; 25:2377-2393. [PMID: 25217446 DOI: 10.1177/0962280214523192] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Indexed: 11/16/2022]
Abstract
Case-control studies are potentially open to misclassification of disease outcome which may be unrelated to risk factor exposure (non-differential), thus underestimating associations, or related to risk factor exposure (differential), thus causing more serious bias. We conducted a systematic literature review for methods of adjusting for outcome misclassification in case-control studies. We also applied methods to simulated data with known outcome misclassification to assess performance of these methods. Finally, real data from the Prostate Testing for Cancer and Treatment (ProtecT) randomised controlled trial gauged the usefulness of these methods. Adjustment methods range from recalculating cell frequencies to probabilistic sensitivity modelling and Bayesian models, which incorporate uncertainty in sensitivity and specificity estimates. Simulated data indicated that substantial bias in either direction resulted from differential misclassification. More sophisticated methods, incorporating uncertainty into estimates of misclassification, provided appropriately wide confidence intervals for corrected estimates of risk factor-disease association. Method choice depends on whether the objective is to assess if an observed association can be explained by bias, or to provide a 'corrected' estimate for the primary analysis. Accurate estimation of the degree of misclassification is important for the latter; otherwise further bias may be introduced.
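The simplest adjustment mentioned in this abstract, recalculating cell frequencies, can be sketched in Python for a 2×2 table with fixed sensitivity and specificity. The sketch inverts the relation observed cases = Se·(true cases) + (1 − Sp)·(true non-cases) within each exposure group. It assumes non-differential misclassification with known Se and Sp, uses hypothetical counts, and does not represent the probabilistic or Bayesian methods the review also covers. The function names are assumptions made for illustration.

```python
def corrected_cases(observed_cases: float, group_total: float,
                    sensitivity: float, specificity: float) -> float:
    """Back-calculate the expected number of true cases in one exposure group."""
    youden = sensitivity + specificity - 1.0
    if youden <= 0:
        raise ValueError("Correction requires sensitivity + specificity > 1")
    return (observed_cases - (1.0 - specificity) * group_total) / youden


def odds_ratio(cases_exp: float, total_exp: float,
               cases_unexp: float, total_unexp: float) -> float:
    """Odds ratio from case counts and group totals."""
    return (cases_exp / (total_exp - cases_exp)) / (cases_unexp / (total_unexp - cases_unexp))


# Hypothetical observed table and assumed test characteristics (illustration only).
se, sp = 0.80, 0.90
obs_exposed, n_exposed = 275, 1000
obs_unexposed, n_unexposed = 170, 1000

observed_or = odds_ratio(obs_exposed, n_exposed, obs_unexposed, n_unexposed)
adj_exposed = corrected_cases(obs_exposed, n_exposed, se, sp)
adj_unexposed = corrected_cases(obs_unexposed, n_unexposed, se, sp)
corrected_or = odds_ratio(adj_exposed, n_exposed, adj_unexposed, n_unexposed)
print(f"observed OR = {observed_or:.2f}, corrected OR = {corrected_or:.2f}")
```

With these assumed inputs the observed odds ratio of about 1.85 is corrected to about 3.0. Because Se and Sp are treated as exactly known, the corrected estimate carries no extra uncertainty, which is the limitation the abstract attributes to the simpler methods.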
Affiliation(s)
- Rebecca Gilbert
- School of Social and Community Medicine, University of Bristol, Bristol, UK
| | - Richard M Martin
- School of Social and Community Medicine, University of Bristol, Bristol, UK
| | - Jenny Donovan
- School of Social and Community Medicine, University of Bristol, Bristol, UK
| | - J Athene Lane
- School of Social and Community Medicine, University of Bristol, Bristol, UK
| | - Freddie Hamdy
- Nuffield Department of Surgery, University of Oxford, Oxford, UK
| | - David E Neal
- Department of Oncology, University of Cambridge, Cambridge, UK
| | - Chris Metcalfe
- School of Social and Community Medicine, University of Bristol, Bristol, UK
|
49
|
Frénay B, Verleysen M. Classification in the presence of label noise: a survey. IEEE Transactions on Neural Networks and Learning Systems 2014; 25:845-869. [PMID: 24808033 DOI: 10.1109/tnnls.2013.2292894] [Citation(s) in RCA: 360] [Impact Index Per Article: 32.7] [Indexed: 06/03/2023]
Abstract
Label noise is an important issue in classification, with many potential negative consequences. For example, the accuracy of predictions may decrease, whereas the complexity of inferred models and the number of necessary training samples may increase. Many works in the literature have been devoted to the study of label noise and the development of techniques to deal with label noise. However, the field lacks a comprehensive survey on the different types of label noise, their consequences and the algorithms that consider label noise. This paper proposes to fill this gap. First, the definitions and sources of label noise are considered and a taxonomy of the types of label noise is proposed. Second, the potential consequences of label noise are discussed. Third, label noise-robust, label noise cleansing, and label noise-tolerant algorithms are reviewed. For each category of approaches, a short discussion is proposed to help practitioners choose the most suitable technique for their own particular field of application. Finally, the design of experiments is also discussed, which may interest researchers who would like to test their own algorithms. In this paper, label noise consists of mislabeled instances: no additional information, such as confidences on labels, is assumed to be available.
|
50
|
Bihrmann K, Toft N, Nielsen SS, Ersbøll AK. Spatial correlation in Bayesian logistic regression with misclassification. Spat Spatiotemporal Epidemiol 2014; 9:1-12. [PMID: 24889989 DOI: 10.1016/j.sste.2014.02.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 01/07/2013] [Revised: 02/14/2014] [Accepted: 02/20/2014] [Indexed: 10/25/2022]
Abstract
Standard logistic regression assumes that the outcome is measured perfectly. In practice, this is often not the case, which could lead to biased estimates if not accounted for. This study presents Bayesian logistic regression with adjustment for misclassification of the outcome applied to data with spatial correlation. The models assessed include a fixed effects model, an independent random effects model, and models with spatially correlated random effects modelled using conditional autoregressive prior distributions (ICAR and ICAR(ρ)). Performance of these models was evaluated in a simulation study. Parameters were estimated by Markov Chain Monte Carlo methods, using slice sampling to improve convergence. The results demonstrated that adjustment for misclassification must be included to produce unbiased regression estimates. With strong correlation the ICAR model performed best. With weak or moderate correlation the ICAR(ρ) performed best. With unknown spatial correlation the recommended model would be the ICAR(ρ), assuming convergence can be obtained.
Affiliation(s)
- Kristine Bihrmann
- Faculty of Medical and Health Sciences, University of Copenhagen, Grønnegårdsvej 8, DK-1870 Frederiksberg C, Denmark.
| | - Nils Toft
- Faculty of Medical and Health Sciences, University of Copenhagen, Grønnegårdsvej 8, DK-1870 Frederiksberg C, Denmark
| | - Søren Saxmose Nielsen
- Faculty of Medical and Health Sciences, University of Copenhagen, Grønnegårdsvej 8, DK-1870 Frederiksberg C, Denmark
| | - Annette Kjær Ersbøll
- National Institute of Public Health, University of Southern Denmark, Øster Farimagsgade 5A, 2, DK-1353 Copenhagen K, Denmark
|