1
Bannister MH, Peng XP. Clinical Genetics and Genomics for the Immunologist: A Primer. Immunol Allergy Clin North Am 2025; 45:153-171. [PMID: 40287166 DOI: 10.1016/j.iac.2025.01.002] [Indexed: 04/29/2025]
Abstract
We are just beginning to understand the architectures, landscapes, and paradigms underlying genetically driven immune disorders (GIDs), though have already benefited greatly from the evolution of increasingly sophisticated sequencing technologies. Genetic diagnostic strategies are chosen by matching the most appropriate molecular assays and analytical tools to the relevant genetic and genomic features of a patient's differential. This review provides a practical guide for such decision-making. The authors review GID-specific paradigms, compare available and emerging genomic technologies and assays, delineate a typical clinical genomic diagnostic process, and discuss the implications of the current variant classification framework for GIDs.
Affiliation(s)
- Maxwell H Bannister
- Medical Scientist Training Program, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Xiao P Peng
- Genetics of Blood and Immunity, Montefiore Einstein; New York Center for Rare Diseases; Division of Pediatric Genetic Medicine, Department of Pediatrics, The Children's Hospital at Montefiore, The University Hospital for Albert Einstein College of Medicine, 3411 Wayne Avenue, 9th Floor, Bronx, NY 10467, USA.
2
Tafazoli A, Hemmati M, Rafigh M, Alimardani M, Khaghani F, Korostyński M, Karnes JH. Leveraging long-read sequencing technologies for pharmacogenomic testing: applications, analytical strategies, challenges, and future perspectives. Front Genet 2025; 16:1435416. [PMID: 40370700 PMCID: PMC12075302 DOI: 10.3389/fgene.2025.1435416] [Received: 05/20/2024] [Accepted: 04/07/2025] [Indexed: 05/16/2025]
Abstract
Long-read sequencing (LRS) was introduced as the third generation of next-generation sequencing technologies with a high accuracy rate in genomic variant identification for some of its platforms. Due to the structural complexity of many pharmacogenes, the presence of rare variants, and the limitations of genotyping and short-read sequencing approaches in detecting pharmacovariants, LRS methods are likely to become increasingly utilized in the near future. In this review, we aim to provide a comprehensive discussion of current and future applications of long-read genotyping methods by introducing the opportunities and advantages as well as the challenges and disadvantages of state-of-the-art LRS platforms for the implementation of pharmacogenomic tests in clinical and research settings. New approaches to data processing, as well as the challenges and pitfalls of performing such tests in daily practice, will be explored in detail. We provide references to resources for those who are interested or intend to employ LRS in pharmacogenomics screening, both in clinical and research settings.
Affiliation(s)
- Alireza Tafazoli
- Department of Pharmacology and Toxicology, University of Toronto, Toronto, ON, Canada
- Mahboobeh Hemmati
- Department of Medical Genetics and Molecular Medicine, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Mahboobeh Rafigh
- Medical Genetics Research Center, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Maliheh Alimardani
- Department of Medical Genetics and Molecular Medicine, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Student Research Committee, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Faeze Khaghani
- Department of Pharmaceutical Biotechnology, School of Pharmacy, Guilan University of Medical Sciences, Rasht, Iran
- Michał Korostyński
- Laboratory of Pharmacogenomics, Department of Molecular Neuropharmacology, Maj Institute of Pharmacology Polish Academy of Sciences, Kraków, Poland
- Jason H. Karnes
- Department of Pharmacy Practice and Science, University of Arizona R. Ken Coit College of Pharmacy, Tucson, AZ, United States
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
3
Abdelwahab O, Torkamaneh D. Artificial intelligence in variant calling: a review. Front Bioinform 2025; 5:1574359. [PMID: 40337525 PMCID: PMC12055765 DOI: 10.3389/fbinf.2025.1574359] [Received: 02/10/2025] [Accepted: 04/08/2025] [Indexed: 05/09/2025]
Abstract
Artificial intelligence (AI) has revolutionized numerous fields, including genomics, where it has significantly impacted variant calling, a crucial process in genomic analysis. Variant calling involves the detection of genetic variants such as single nucleotide polymorphisms (SNPs), insertions/deletions (InDels), and structural variants from high-throughput sequencing data. Traditionally, statistical approaches have dominated this task, but the advent of AI led to the development of sophisticated tools that promise higher accuracy, efficiency, and scalability. This review explores the state-of-the-art AI-based variant calling tools, including DeepVariant, DNAscope, DeepTrio, Clair, Clairvoyante, Medaka, and HELLO. We discuss their underlying methodologies, strengths, limitations, and performance metrics across different sequencing technologies, alongside their computational requirements, focusing primarily on SNP and InDel detection. By comparing these AI-driven techniques with conventional methods, we highlight the transformative advancements AI has introduced and its potential to further enhance genomic research.
Affiliation(s)
- Omar Abdelwahab
- Département de Phytologie, Université Laval, Québec City, QC, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec City, QC, Canada
- Centre de recherche et d’innovation sur les végétaux (CRIV), Université Laval, Québec City, QC, Canada
- Institut intelligence et données (IID), Université Laval, Québec City, QC, Canada
- Davoud Torkamaneh
- Département de Phytologie, Université Laval, Québec City, QC, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec City, QC, Canada
- Centre de recherche et d’innovation sur les végétaux (CRIV), Université Laval, Québec City, QC, Canada
- Institut intelligence et données (IID), Université Laval, Québec City, QC, Canada
4
Wong M, Liew B, Hum M, Lee NY, Lee ASG. Benchmarking of variant calling software for whole-exome sequencing using gold standard datasets. Sci Rep 2025; 15:13697. [PMID: 40258889 PMCID: PMC12012014 DOI: 10.1038/s41598-025-97047-7] [Received: 08/26/2024] [Accepted: 04/02/2025] [Indexed: 04/23/2025]
Abstract
Accurate variant calling from whole-exome sequencing (WES) data is vital for understanding genetic diseases. Recently, commercial variant calling software have emerged that do not require bioinformatics or programming expertise, hence enabling independent analysis of WES data by smaller laboratories and clinics and circumventing the need for dedicated and expensive computers and bioinformatics staff. This study benchmarks four non-programming variant calling software namely, Illumina BaseSpace Sequence Hub (Illumina), CLC Genomics Workbench (CLC), Partek Flow, and Varsome Clinical, for the variant calling of three Genome in a Bottle (GIAB) whole-exome sequencing datasets (HG001, HG002 and HG003). Following alignment of sequence reads to the human reference genome GRCh38, variants were compared against high-confidence regions from GIAB datasets and assessed using the Variant Calling Assessment Tool (VCAT). Illumina's DRAGEN Enrichment achieved the highest precision and recall scores for single nucleotide variant (SNV) and insertions/deletion (indel) calling at over 99% for SNVs and 96% for indels while Partek Flow using unionised variant calls from Freebayes and Samtools had the lowest indel calling performance. Illumina had the highest true positives (TP) variant counts for all samples and all four software shared 98-99% similarity of TP variants. Run times were shortest for CLC and Illumina ranging from 6 to 25 min and 29 to 36 min respectively, while Partek Flow took the longest (3.6 to 29.7 h). This study provides information for clinicians and biologists without programming expertise in their selection of software for variant analysis that balance accuracy, sensitivity, and runtime.
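The precision and recall figures quoted in this abstract derive from true-positive, false-positive, and false-negative counts against a truth set. The following is a minimal sketch of that arithmetic, assuming variants are compared by exact match on `(chrom, pos, ref, alt)` tuples and that both sets are already restricted to high-confidence regions. The function name `precision_recall` and the toy variants are hypothetical; this is not the VCAT implementation, and real benchmarking tools also normalize variant representation before matching.

```python
import math


def precision_recall(called, truth):
    """Return (precision, recall) for a call set against a truth set.

    Both arguments are iterables of hashable variant keys, here assumed to be
    (chrom, pos, ref, alt) tuples. Matching is by exact equality only.
    """
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # in the truth set but not called
    precision = tp / (tp + fp) if (tp + fp) else math.nan
    recall = tp / (tp + fn) if (tp + fn) else math.nan
    return precision, recall


# Toy data, invented for illustration only.
truth_set = {
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr1", 300, "G", "GA"),
}
call_set = {
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr1", 400, "T", "C"),
}

precision, recall = precision_recall(call_set, truth_set)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Computing the two metrics separately for SNVs and indels, as the study reports them, would only require splitting the sets by variant type before calling the function.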
Affiliation(s)
- Matthew Wong
- Division of Cellular and Molecular Research, Humphrey Oei Institute of Cancer Research, National Cancer Centre Singapore, 30 Hospital Boulevard, Singapore, 168583, Singapore
- Bryan Liew
- Division of Cellular and Molecular Research, Humphrey Oei Institute of Cancer Research, National Cancer Centre Singapore, 30 Hospital Boulevard, Singapore, 168583, Singapore
- Melissa Hum
- Division of Cellular and Molecular Research, Humphrey Oei Institute of Cancer Research, National Cancer Centre Singapore, 30 Hospital Boulevard, Singapore, 168583, Singapore
- Ning Yuan Lee
- Division of Cellular and Molecular Research, Humphrey Oei Institute of Cancer Research, National Cancer Centre Singapore, 30 Hospital Boulevard, Singapore, 168583, Singapore
- Ann S G Lee
- Division of Cellular and Molecular Research, Humphrey Oei Institute of Cancer Research, National Cancer Centre Singapore, 30 Hospital Boulevard, Singapore, 168583, Singapore.
- SingHealth Duke-NUS Oncology Academic Clinical Programme (ONCO ACP), Duke-NUS Medical School, 8 College Road, Singapore, 169857, Singapore.
- Department of Physiology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117593, Singapore.
5
Furtado LV, Ikemura K, Benkli CY, Moncur JT, Huang RSP, Zehir A, Stellato K, Vasalos P, Sadri N, Suarez CJ. General Applicability of Existing College of American Pathologists Accreditation Requirements to Clinical Implementation of Machine Learning-Based Methods in Molecular Oncology Testing. Arch Pathol Lab Med 2025; 149:319-327. [PMID: 38871357 DOI: 10.5858/arpa.2024-0037-cp] [Accepted: 05/07/2024] [Indexed: 06/15/2024]
Abstract
CONTEXT.— The College of American Pathologists (CAP) accreditation requirements for clinical laboratory testing help ensure laboratories implement and maintain systems and processes that are associated with quality. Machine learning (ML)-based models share some features of conventional laboratory testing methods. Accreditation requirements that specifically address clinical laboratories' use of ML remain in the early stages of development. OBJECTIVE.— To identify relevant CAP accreditation requirements that may be applied to the clinical adoption of ML-based molecular oncology assays, and to provide examples of current and emerging ML applications in molecular oncology testing. DESIGN.— CAP accreditation checklists related to molecular pathology and general laboratory practices (Molecular Pathology, All Common and Laboratory General) were reviewed. Examples of checklist requirements that are generally applicable to validation, revalidation, quality management, infrastructure, and analytical procedures of ML-based molecular oncology assays were summarized. Instances of ML use in molecular oncology testing were assessed from literature review. RESULTS.— Components of the general CAP accreditation framework that exist for traditional molecular oncology assay validation and maintenance are also relevant for implementing ML-based tests in a clinical laboratory. Current and emerging applications of ML in molecular oncology testing include DNA methylation profiling for central nervous system tumor classification, variant calling, microsatellite instability testing, mutational signature analysis, and variant prediction from histopathology images. CONCLUSIONS.— Currently, much of the ML activity in molecular oncology is within early clinical implementation. Despite specific considerations that apply to the adoption of ML-based methods, existing CAP requirements can serve as general guidelines for the clinical implementation of ML-based assays in molecular oncology testing.
Affiliation(s)
- Larissa V Furtado
- From the Department of Pathology, St. Jude Children's Research Hospital, Memphis, Tennessee (Furtado)
- Kenji Ikemura
- the Department of Pathology, Mass General Brigham, Boston, Massachusetts (Ikemura)
- Cagla Y Benkli
- the Department of Pathology, Baylor College of Medicine, Houston, Texas (Benkli)
- Joel T Moncur
- Office of the Director, The Joint Pathology Center, Silver Spring, Maryland (Moncur)
- Richard S P Huang
- Clinical Development, Foundation Medicine Inc, Cambridge, Massachusetts (Huang)
- Ahmet Zehir
- Precision Medicine & Biosamples, AstraZeneca, New York, New York (Zehir)
- Katherine Stellato
- Proficiency Testing, College of American Pathologists, Northfield, Illinois (Stellato, Vasalos)
- Patricia Vasalos
- Proficiency Testing, College of American Pathologists, Northfield, Illinois (Stellato, Vasalos)
- Navid Sadri
- the Department of Pathology, University Hospitals Cleveland Medical Center, Cleveland, Ohio (Sadri)
- Carlos J Suarez
- the Department of Pathology, Stanford University School of Medicine, Palo Alto, California (Suarez)
6
Song Q, Hu T, Liang B, Li S, Li Y, Wu J, Wang S, Zhou X. cascAGS: Comparative Analysis of SNP Calling Methods for Human Genome Data in the Absence of Gold Standard. Interdiscip Sci 2025; 17:1-11. [PMID: 39443427 DOI: 10.1007/s12539-024-00653-8] [Received: 04/09/2024] [Revised: 08/14/2024] [Accepted: 08/19/2024] [Indexed: 10/25/2024]
Abstract
The development of third-generation sequencing has accelerated the boom of single nucleotide polymorphism (SNP) calling methods, but evaluating accuracy remains challenging owing to the absence of the SNP gold standard. The definitions for without-gold-standard and performance metrics and their estimation are urgently needed. Additionally, the possible correlations between different SNP loci should also be further explored. To address these challenges, we first introduced the concept of a gold standard and imperfect gold standard under the consistency framework and gave the corresponding definitions of sensitivity and specificity. A latent class model (LCM) was established to estimate the sensitivity and specificity of callers. Furthermore, we incorporated different dependency structures into LCM to investigate their impact on sensitivity and specificity. The performance of LCM was illustrated by comparing the accuracy of BCFtools, DeepVariant, FreeBayes, and GATK on various datasets. Through estimations across multiple datasets, the results indicate that LCM is well-suitable for evaluating callers without the SNP gold standard, and accurate inclusion of the dependency between variations is crucial for better performance ranking. DeepVariant has a higher sum of sensitivity and specificity than other callers, followed by GATK and BCFtools. FreeBayes has low sensitivity but high specificity. Notably, appropriate sequencing coverage is another important factor for precise callers' evaluation. Most importantly, a web interface for assessing and comparing different callers was developed to simplify the evaluation process.
Affiliation(s)
- Qianqian Song
- Department of Biostatistics, School of Public Health, Peking University, Beijing, 100083, China
- Taobo Hu
- Department of Breast Surgery, Peking University People's Hospital, Beijing, 100044, China
- Baosheng Liang
- Department of Biostatistics, School of Public Health, Peking University, Beijing, 100083, China
- Shihai Li
- Chongqing Big Data Research Institute, Peking University, Chongqing, 401147, China
- Yang Li
- Chongqing Big Data Research Institute, Peking University, Chongqing, 401147, China
- Jinbo Wu
- Department of Breast Surgery, Peking University People's Hospital, Beijing, 100044, China
- Shu Wang
- Department of Breast Surgery, Peking University People's Hospital, Beijing, 100044, China.
- Xiaohua Zhou
- Department of Biostatistics, School of Public Health, Peking University, Beijing, 100083, China.
7
Chaabane F, Pillonel T, Bertelli C. MeSS and assembly_finder: a toolkit for in silico metagenomic sample generation. Bioinformatics 2024; 41:btae760. [PMID: 39739308 PMCID: PMC11755095 DOI: 10.1093/bioinformatics/btae760] [Received: 05/26/2024] [Revised: 11/17/2024] [Accepted: 12/30/2024] [Indexed: 01/02/2025]
Abstract
SUMMARY: The intrinsic complexity of the microbiota combined with technical variability render shotgun metagenomics challenging to analyze for routine clinical or research applications. In silico data generation offers a controlled environment allowing for example to benchmark bioinformatics tools, to optimize study design, statistical power, or to validate targeted applications. Here, we propose assembly_finder and the Metagenomic Sequence Simulator (MeSS), two easy-to-use Bioconda packages, as part of a benchmarking toolkit to download genomes and simulate shotgun metagenomics samples, respectively. Outperforming existing tools in speed while requiring less memory, MeSS reproducibly generates accurate complex communities based on a list of taxonomic ranks and their abundance. AVAILABILITY AND IMPLEMENTATION: All code is released under MIT License and is available on https://github.com/metagenlab/MeSS and https://github.com/metagenlab/assembly_finder.
Affiliation(s)
- Farid Chaabane
- Institute of Microbiology, Lausanne University Hospital and University of Lausanne, Lausanne, 1011, Switzerland
- Trestan Pillonel
- Institute of Microbiology, Lausanne University Hospital and University of Lausanne, Lausanne, 1011, Switzerland
- Claire Bertelli
- Institute of Microbiology, Lausanne University Hospital and University of Lausanne, Lausanne, 1011, Switzerland
8
Bonfiglio F, Legati A, Lasorsa VA, Palombo F, De Riso G, Isidori F, Russo S, Furini S, Merla G, Coppedè F, Tartaglia M, Bruselles A, Pippucci T, Ciolfi A, Pinelli M, Capasso M. Best practices for germline variant and DNA methylation analysis of second- and third-generation sequencing data. Hum Genomics 2024; 18:120. [PMID: 39501379 PMCID: PMC11536923 DOI: 10.1186/s40246-024-00684-8] [Received: 08/02/2024] [Accepted: 10/11/2024] [Indexed: 11/09/2024]
Abstract
This comprehensive review provides insights and suggested strategies for the analysis of germline variants using second- and third-generation sequencing technologies (SGS and TGS). It addresses the critical stages of data processing, starting from alignment and preprocessing to quality control, variant calling, and the removal of artifacts. The document emphasizes the importance of meticulous data handling, highlighting advanced methodologies for annotating variants and identifying structural variations and methylated DNA sites. Special attention is given to the inspection of problematic variants, a step that is crucial for ensuring the accuracy of the analysis, particularly in clinical settings where genetic diagnostics can inform patient care. Additionally, the document covers the use of various bioinformatics tools and software that enhance the precision and reliability of these analyses. It outlines best practices for the annotation of variants, including considerations for problematic genetic alterations such as those in the human leukocyte antigen region, runs of homozygosity, and mitochondrial DNA alterations. The document also explores the complexities associated with identifying structural variants and copy number variations, underscoring the challenges posed by these large-scale genomic alterations. The objective is to offer a comprehensive framework for researchers and clinicians, ensuring that genetic analyses conducted with SGS and TGS are both accurate and reproducible. By following these best practices, the document aims to increase the diagnostic accuracy for hereditary diseases, facilitating early diagnosis, prevention, and personalized treatment strategies. This review serves as a valuable resource for both novices and experts in the field, providing insights into the latest advancements and methodologies in genetic analysis. It also aims to encourage the adoption of these practices in diverse research and clinical contexts, promoting consistency and reliability across studies.
Affiliation(s)
- Ferdinando Bonfiglio
- Department of Molecular Medicine and Medical Biotechnology, University of Naples Federico II, Naples, Italy
- CEINGE Advanced Biotechnology Franco Salvatore, Naples, Italy
- Andrea Legati
- Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy
- Flavia Palombo
- Programma Di Neurogenetica, IRCCS Istituto Delle Scienze Neurologiche Di Bologna, Bologna, Italy
- Giulia De Riso
- Department of Molecular Medicine and Medical Biotechnology, University of Naples Federico II, Naples, Italy
- CEINGE Advanced Biotechnology Franco Salvatore, Naples, Italy
- Federica Isidori
- IRCCS Azienda Ospedaliero-Universitaria Di Bologna, Bologna, Italy
- Silvia Russo
- Research Laboratory of Medical Cytogenetics and Molecular Genetics, IRCCS Istituto Auxologico Italiano, Milan, Italy
- Laboratorio di Ricerca di Citogenetica Medica e Genetica Molecolare, Istituto Auxologico Italiano, IRCCS, 20145, Milano, Italy
- Simone Furini
- Department of Electrical, Electronic and Information Engineering "Guglielmo Marconi", University of Bologna, Bologna, Italy
- Giuseppe Merla
- Department of Molecular Medicine and Medical Biotechnology, University of Naples Federico II, Naples, Italy
- Fabio Coppedè
- Department of Translational Research and of New Surgical and Medical Technologies, University of Pisa, Pisa, Italy
- Marco Tartaglia
- Molecular Genetics and Functional Genomics, Bambino Gesù Children's Hospital, IRCCS, Rome, Italy
- Alessandro Bruselles
- Department of Oncology and Molecular Medicine, Istituto Superiore Di Sanità, Rome, Italy
- Tommaso Pippucci
- IRCCS Azienda Ospedaliero-Universitaria Di Bologna, Bologna, Italy
- Andrea Ciolfi
- Molecular Genetics and Functional Genomics, Bambino Gesù Children's Hospital, IRCCS, Rome, Italy
- Michele Pinelli
- Department of Molecular Medicine and Medical Biotechnology, University of Naples Federico II, Naples, Italy
- CEINGE Advanced Biotechnology Franco Salvatore, Naples, Italy
- Mario Capasso
- Department of Molecular Medicine and Medical Biotechnology, University of Naples Federico II, Naples, Italy.
- CEINGE Advanced Biotechnology Franco Salvatore, Naples, Italy.
9
Menzel M, Martis-Thiele M, Goldschmid H, Ott A, Romanovsky E, Siemanowski-Hrach J, Seillier L, Brüchle NO, Maurer A, Lehmann KV, Begemann M, Elbracht M, Meyer R, Dintner S, Claus R, Meier-Kolthoff JP, Blanc E, Möbs M, Joosten M, Benary M, Basitta P, Hölscher F, Tischler V, Groß T, Kutz O, Prause R, William D, Horny K, Goering W, Sivalingam S, Borkhardt A, Blank C, Junk SV, Yasin L, Moskalev EA, Carta MG, Ferrazzi F, Tögel L, Wolter S, Adam E, Matysiak U, Rosenthal T, Dönitz J, Lehmann U, Schmidt G, Bartels S, Hofmann W, Hirsch S, Dikow N, Göbel K, Banan R, Hamelmann S, Fink A, Ball M, Neumann O, Rehker J, Kloth M, Murtagh J, Hartmann N, Jurmeister P, Mock A, Kumbrink J, Jung A, Mayr EM, Jacob A, Trautmann M, Kirmse S, Falkenberg K, Ruckert C, Hirsch D, Immel A, Dietmaier W, Haack T, Marienfeld R, Fürstberger A, Niewöhner J, Gerstenmaier U, Eberhardt T, Greif PA, Appenzeller S, Maurus K, Doll J, Jelting Y, Jonigk D, Märkl B, Beule D, Horst D, Wulf AL, Aust D, Werner M, Reuter-Jessen K, Ströbel P, Auber B, Sahm F, Merkelbach-Bruse S, Siebolts U, Roth W, Lassmann S, Klauschen F, Gaisa NT, Weichert W, Evert M, Armeanu-Ebinger S, Ossowski S, Schroeder C, Schaaf CP, Malek N, Schirmacher P, Kazdal D, Pfarr N, Budczies J, Stenzinger A. Benchmarking whole exome sequencing in the German network for personalized medicine. Eur J Cancer 2024; 211:114306. [PMID: 39293347 DOI: 10.1016/j.ejca.2024.114306] [Received: 06/20/2024] [Revised: 08/23/2024] [Accepted: 08/23/2024] [Indexed: 09/20/2024]
Abstract
INTRODUCTION: Whole Exome Sequencing (WES) has emerged as an efficient tool in clinical cancer diagnostics to broaden the scope from panel-based diagnostics to screening of all genes and enabling robust determination of complex biomarkers in a single analysis. METHODS: To assess concordance, six formalin-fixed paraffin-embedded (FFPE) tissue specimens and four commercial reference standards were analyzed by WES as matched tumor-normal DNA at 21 NGS centers in Germany, each employing local wet-lab and bioinformatics. Somatic and germline variants, copy-number alterations (CNAs), and complex biomarkers were investigated. Somatic variant calling was performed in 494 diagnostically relevant cancer genes. The raw data were collected and re-analyzed with a central bioinformatic pipeline to separate wet- and dry-lab variability. RESULTS: The mean positive percentage agreement (PPA) of somatic variant calling was 76% while the positive predictive value (PPV) was 89% in relation to a consensus list of variants found by at least five centers. Variant filtering was identified as the main cause for divergent variant calls. Adjusting filter criteria and re-analysis increased the PPA to 88% for all and 97% for the clinically relevant variants. CNA calls were concordant for 82% of genomic regions. Homologous recombination deficiency (HRD), tumor mutational burden (TMB), and microsatellite instability (MSI) status were concordant for 94%, 93%, and 93% of calls, respectively. Variability of CNAs and complex biomarkers did not decrease considerably after harmonization of the bioinformatic processing and was hence attributed mainly to wet-lab differences. CONCLUSION: Continuous optimization of bioinformatic workflows and participating in round robin tests are recommended.
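The concordance metrics in this abstract are defined relative to a consensus list of variants found by at least five centers. The following is a minimal sketch of how such a consensus, and each center's PPA and PPV against it, could be computed. The function names, the center labels, and the toy variant identifiers are hypothetical, and the study's actual pipeline (variant normalization, filtering, per-sample handling) is not reproduced here.

```python
from collections import Counter


def consensus_variants(calls_by_center, min_centers=5):
    """Variants reported by at least `min_centers` centers."""
    counts = Counter(
        variant
        for calls in calls_by_center.values()
        for variant in set(calls)  # count each center at most once per variant
    )
    return {variant for variant, n in counts.items() if n >= min_centers}


def ppa_ppv(calls, consensus):
    """Positive percentage agreement and positive predictive value.

    PPA = shared / consensus size; PPV = shared / number of calls.
    """
    calls = set(calls)
    shared = calls & consensus
    ppa = len(shared) / len(consensus) if consensus else float("nan")
    ppv = len(shared) / len(calls) if calls else float("nan")
    return ppa, ppv


# Toy data, invented for illustration: six centers, four candidate variants.
calls_by_center = {
    "c1": {"A", "B", "C"},
    "c2": {"A", "B"},
    "c3": {"A", "B", "D"},
    "c4": {"A", "B"},
    "c5": {"A", "B", "C"},
    "c6": {"A"},
}

consensus = consensus_variants(calls_by_center, min_centers=5)
per_center = {name: ppa_ppv(calls, consensus) for name, calls in calls_by_center.items()}
mean_ppa = sum(ppa for ppa, _ in per_center.values()) / len(per_center)
mean_ppv = sum(ppv for _, ppv in per_center.values()) / len(per_center)
print(sorted(consensus), round(mean_ppa, 3), round(mean_ppv, 3))
```

Because the consensus is built from the centers' own calls rather than an external truth set, PPA and PPV here measure agreement between centers, not absolute accuracy.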
Affiliation(s)
- Michael Menzel
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany; Centers for Personalized Medicine (ZPM), Germany.
- Mihaela Martis-Thiele
- Institute of Pathology, TUM School of Medicine and Health, Technical University of Munich, Germany
- Hannah Goldschmid
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany; Centers for Personalized Medicine (ZPM), Germany
- Alexander Ott
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- Eva Romanovsky
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany; Centers for Personalized Medicine (ZPM), Germany
- Janna Siemanowski-Hrach
- Institute of Pathology, University of Cologne, Faculty of Medicine and University Hospital Cologne, Cologne, Germany
- Lancelot Seillier
- Institute of Pathology, University Hospital RWTH Aachen, Aachen, Germany; Joint Research Center Computational Biomedicine, University Hospital RWTH Aachen, Aachen, Germany; Center for Integrated Oncology Aachen Bonn Cologne Düsseldorf (CIO ABCD), Germany
- Nadina Ortiz Brüchle
- Institute of Pathology, University Hospital RWTH Aachen, Aachen, Germany; Center for Integrated Oncology Aachen Bonn Cologne Düsseldorf (CIO ABCD), Germany
- Angela Maurer
- Institute of Pathology, University Hospital RWTH Aachen, Aachen, Germany
- Kjong-Van Lehmann
- Joint Research Center Computational Biomedicine, University Hospital RWTH Aachen, Aachen, Germany; Center for Integrated Oncology Aachen Bonn Cologne Düsseldorf (CIO ABCD), Germany; Cancer Research Center Cologne-Essen, University Hospital Cologne, Germany; Machine Learning in Cancer Genetics and Precision Medicine, University RWTH Aachen, Aachen, Germany
- Matthias Begemann
- Center for Integrated Oncology Aachen Bonn Cologne Düsseldorf (CIO ABCD), Germany; Institute for Human Genetics and Genomic Medicine, University Hospital RWTH Aachen, Aachen, Germany; NGS diagnostic centre, University Hospital RWTH Aachen, Aachen, Germany
- Miriam Elbracht
- Center for Integrated Oncology Aachen Bonn Cologne Düsseldorf (CIO ABCD), Germany; Institute for Human Genetics and Genomic Medicine, University Hospital RWTH Aachen, Aachen, Germany
- Robert Meyer
- Center for Integrated Oncology Aachen Bonn Cologne Düsseldorf (CIO ABCD), Germany
- Rainer Claus
- Pathology, Faculty of Medicine, University of Augsburg, Germany; Comprehensive Cancer Center, Faculty of Medicine, University of Augsburg, Germany
- Jan P Meier-Kolthoff
- Chair of Biomedical Informatics, Data Mining and Data Analytics, Faculty of Applied Computer Science, University of Augsburg, Germany
- Eric Blanc
- Core Unit Bioinformatics, Berlin Institute of Health at Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, Berlin, Germany
- Markus Möbs
- Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, Berlin, Germany
- Maria Joosten
- Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, Berlin, Germany
- Manuela Benary
- Core Unit Bioinformatics, Berlin Institute of Health at Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, Berlin, Germany; Charité Comprehensive Cancer Center, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, Berlin, Germany
- Patrick Basitta
- Universitätsklinikum Bonn, Molekularpathologische Diagnostik, Institut für Pathologie, Venusberg Campus 1, 53127 Bonn, Germany
- Florian Hölscher
- Universitätsklinikum Bonn, Molekularpathologische Diagnostik, Institut für Pathologie, Venusberg Campus 1, 53127 Bonn, Germany
- Verena Tischler
- Universitätsklinikum Bonn, Molekularpathologische Diagnostik, Institut für Pathologie, Venusberg Campus 1, 53127 Bonn, Germany
- Thomas Groß
- Core Unit for Molecular Tumor Diagnostics (CMTD), National Center for Tumor Diseases Dresden (NCT), NCT/UCC Dresden, a partnership between German Cancer Research Center (DKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology and Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Germany
- Oliver Kutz
- Institute for Clinical Genetics, University Hospital Carl Gustav Carus at TUD Dresden University of Technology and Faculty of Medicine of TUD Dresden University of Technology, Dresden, Germany; ERN GENTURIS, Hereditary Cancer Syndrome Center Dresden, Germany; National Center for Tumor Diseases (NCT), NCT/UCC Dresden, a partnership between German Cancer Research Center (DKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology and Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Germany; German Cancer Consortium (DKTK), Dresden, Germany; German Cancer Research Center (DKFZ), Heidelberg, Germany; Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
| | - Rebecca Prause
- Core Unit for Molecular Tumor Diagnostics (CMTD), National Center for Tumor Diseases Dresden (NCT), NCT/UCC Dresden, a partnership between German Cancer Research Center (DKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology and Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Germany
| | - Doreen William
- Core Unit for Molecular Tumor Diagnostics (CMTD), National Center for Tumor Diseases Dresden (NCT), NCT/UCC Dresden, a partnership between German Cancer Research Center (DKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology and Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Germany; Institute for Clinical Genetics, University Hospital Carl Gustav Carus at TUD Dresden University of Technology and Faculty of Medicine of TUD Dresden University of Technology, Dresden, Germany; ERN GENTURIS, Hereditary Cancer Syndrome Center Dresden, Germany; National Center for Tumor Diseases (NCT), NCT/UCC Dresden, a partnership between German Cancer Research Center (DKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology and Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Germany; German Cancer Consortium (DKTK), Dresden, Germany; German Cancer Research Center (DKFZ), Heidelberg, Germany; Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
| | - Kai Horny
- Center for Personalized Medicine Oncology, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Germany; Core Unit Bioinformatics, Medical Faculty and University Hospital Düsseldorf, Heinrich Heine University Düsseldorf, Germany
| | | | - Sugirthan Sivalingam
- Institute of Human Genetics, Medical Faculty, University Hospital of Düsseldorf, Heinrich Heine University of Düsseldorf, Düsseldorf, Germany
| | - Arndt Borkhardt
- Center for Integrated Oncology Aachen Bonn Cologne Düsseldorf (CIO ABCD), Germany; Department of Pediatric Oncology, Hematology and Clinical Immunology, Medical Faculty, HHU Düsseldorf, Germany; German Cancer Consortium (DKTK), partner site Essen-Düsseldorf, Germany
| | - Cornelia Blank
- Institute of Human Genetics, Medical Faculty, University Hospital of Düsseldorf, Heinrich Heine University of Düsseldorf, Düsseldorf, Germany
| | - Stefanie V Junk
- Center for Integrated Oncology Aachen Bonn Cologne Düsseldorf (CIO ABCD), Germany; Department of Pediatric Oncology, Hematology and Clinical Immunology, Medical Faculty, HHU Düsseldorf, Germany
| | - Layal Yasin
- Center for Integrated Oncology Aachen Bonn Cologne Düsseldorf (CIO ABCD), Germany; Department of Pediatric Oncology, Hematology and Clinical Immunology, Medical Faculty, HHU Düsseldorf, Germany
| | - Evgeny A Moskalev
- Institute of Pathology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany; Center for Personalized Medicine (ZPM), Erlangen, Germany; Comprehensive Cancer Center Erlangen-EMN (CCC ER-EMN), Erlangen, Germany; Bavarian Cancer Research Center (BZKF), Erlangen, Germany
| | - Maria Giulia Carta
- Institute of Pathology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany; Center for Personalized Medicine (ZPM), Erlangen, Germany; Comprehensive Cancer Center Erlangen-EMN (CCC ER-EMN), Erlangen, Germany; Bavarian Cancer Research Center (BZKF), Erlangen, Germany
| | - Fulvia Ferrazzi
- Institute of Pathology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany; Center for Personalized Medicine (ZPM), Erlangen, Germany; Comprehensive Cancer Center Erlangen-EMN (CCC ER-EMN), Erlangen, Germany; Bavarian Cancer Research Center (BZKF), Erlangen, Germany; Department of Nephropathology, Institute of Pathology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
| | - Lars Tögel
- Institute of Pathology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany; Center for Personalized Medicine (ZPM), Erlangen, Germany; Comprehensive Cancer Center Erlangen-EMN (CCC ER-EMN), Erlangen, Germany; Bavarian Cancer Research Center (BZKF), Erlangen, Germany
| | - Steffen Wolter
- Institute for Surgical Pathology, Medical Center, University of Freiburg, Germany; Center for Personalized Medicine (ZPM), partner site Freiburg, Germany; Comprehensive Cancer Center Freiburg (CCCF), Medical Center, Freiburg, Germany
| | - Eugen Adam
- Institute for Surgical Pathology, Medical Center, University of Freiburg, Germany; Center for Personalized Medicine (ZPM), partner site Freiburg, Germany; Comprehensive Cancer Center Freiburg (CCCF), Medical Center, Freiburg, Germany
| | - Uta Matysiak
- Institute for Surgical Pathology, Medical Center, University of Freiburg, Germany; Center for Personalized Medicine (ZPM), partner site Freiburg, Germany; Comprehensive Cancer Center Freiburg (CCCF), Medical Center, Freiburg, Germany
| | - Tessa Rosenthal
- Institut für Pathologie, Universitätsmedizin Göttingen, Germany
| | - Jürgen Dönitz
- Institut für Bioinformatik, Universitätsmedizin Göttingen, Germany
| | - Ulrich Lehmann
- Institute of Pathology, Hannover Medical School, Hannover, Germany
| | - Gunnar Schmidt
- Department of Human Genetics, Hannover Medical School, Hannover, Germany
| | - Stephan Bartels
- Institute of Pathology, Hannover Medical School, Hannover, Germany
| | - Winfried Hofmann
- Department of Human Genetics, Hannover Medical School, Hannover, Germany
| | - Steffen Hirsch
- Institute of Human Genetics, Heidelberg University, Heidelberg, Germany
| | - Nicola Dikow
- Institute of Human Genetics, Heidelberg University, Heidelberg, Germany
| | - Kirsten Göbel
- Department of Neuropathology, University Hospital Heidelberg, Germany
| | - Rouzbeh Banan
- Department of Neuropathology, University Hospital Heidelberg, Germany
| | - Stefan Hamelmann
- Department of Neuropathology, University Hospital Heidelberg, Germany
| | - Annette Fink
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany; Centers for Personalized Medicine (ZPM), Germany
| | - Markus Ball
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany; Translational Lung Research Center Heidelberg (TLRC), Member of the German Center for Lung Research (DZL), Heidelberg, Germany
| | - Olaf Neumann
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany; Centers for Personalized Medicine (ZPM), Germany
| | - Jan Rehker
- Institute of Pathology, University of Cologne, Faculty of Medicine and University Hospital Cologne, Cologne, Germany
| | - Michael Kloth
- Institut für Pathologie, Universitätsmedizin Mainz, Germany
| | - Justin Murtagh
- Institut für Pathologie, Universitätsmedizin Mainz, Germany
| | - Nils Hartmann
- Institut für Pathologie, Universitätsmedizin Mainz, Germany
| | - Phillip Jurmeister
- Institute of Pathology, Faculty of Medicine, Ludwig-Maximilians-Universität München, Munich, Germany; German Cancer Consortium, German Cancer Research Center (DKTK/DKFZ), Munich, Partner Site, Munich, Germany
| | - Andreas Mock
- Institute of Pathology, Faculty of Medicine, Ludwig-Maximilians-Universität München, Munich, Germany; German Cancer Consortium, German Cancer Research Center (DKTK/DKFZ), Munich, Partner Site, Munich, Germany
| | - Jörg Kumbrink
- Institute of Pathology, Faculty of Medicine, Ludwig-Maximilians-Universität München, Munich, Germany; German Cancer Consortium, German Cancer Research Center (DKTK/DKFZ), Munich, Partner Site, Munich, Germany
| | - Andreas Jung
- Institute of Pathology, Faculty of Medicine, Ludwig-Maximilians-Universität München, Munich, Germany; German Cancer Consortium, German Cancer Research Center (DKTK/DKFZ), Munich, Partner Site, Munich, Germany
| | - Eva-Maria Mayr
- Institute of Pathology, TUM School of Medicine and Health, Technical University of Munich, Germany
| | - Anne Jacob
- Institute of Pathology, TUM School of Medicine and Health, Technical University of Munich, Germany
| | - Marcel Trautmann
- Gerhard-Domagk-Institute of Pathology, University Hospital Münster, Münster, Germany; West German Cancer Center, University Hospital Münster, Münster, Germany
| | - Santina Kirmse
- Gerhard-Domagk-Institute of Pathology, University Hospital Münster, Münster, Germany; West German Cancer Center, University Hospital Münster, Münster, Germany
| | - Kim Falkenberg
- Gerhard-Domagk-Institute of Pathology, University Hospital Münster, Münster, Germany; West German Cancer Center, University Hospital Münster, Münster, Germany
| | - Christian Ruckert
- Centre of Medical Genetics, Department of Medical Genetics, University and University Hospital Münster, Münster, Germany
| | - Daniela Hirsch
- Institute of Pathology, University of Regensburg, Germany
| | - Alexander Immel
- Institute of Pathology, University of Regensburg, Germany; Centrum für Translationale Onkologie, Universitätsklinikum Regensburg, Germany
| | | | - Tobias Haack
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
| | - Ralf Marienfeld
- Institute of Pathology, University Hospital Ulm, Germany; Centers for Personalized Medicine (ZPM), Ulm, Germany
| | - Axel Fürstberger
- Institute of Pathology, University Hospital Ulm, Germany; Centers for Personalized Medicine (ZPM), Ulm, Germany
| | - Jakob Niewöhner
- Institute of Pathology, University Hospital Ulm, Germany; Centers for Personalized Medicine (ZPM), Ulm, Germany
| | - Uwe Gerstenmaier
- Institute of Pathology, University Hospital Ulm, Germany; Centers for Personalized Medicine (ZPM), Ulm, Germany
| | - Timo Eberhardt
- Centers for Personalized Medicine (ZPM), Ulm, Germany; Department of Medicine III, University Hospital, LMU Munich, Munich, Germany
| | - Philipp A Greif
- German Cancer Consortium, German Cancer Research Center (DKTK/DKFZ), Munich, Partner Site, Munich, Germany; Department of Medicine III, University Hospital, LMU Munich, Munich, Germany; Institute of Human Genetics, University Hospital, LMU Munich, Munich, Germany
| | - Silke Appenzeller
- Comprehensive Cancer Center Mainfranken, University Hospital Wuerzburg, Germany
| | - Katja Maurus
- Institute of Pathology, University of Wuerzburg, Germany
| | - Julia Doll
- Institute of Pathology, University of Wuerzburg, Germany
| | - Yvonne Jelting
- Institute of Human Genetics, University of Wuerzburg, Germany
| | - Danny Jonigk
- Institute of Pathology, University Hospital RWTH Aachen, Aachen, Germany; Center for Integrated Oncology Aachen Bonn Cologne Düsseldorf (CIO ABCD), Germany; Biomedical Research in End-stage and Obstructive Lung Disease Hannover (BREATH), German Lung Research Centre (DZL), Hannover, Germany
| | - Bruno Märkl
- Pathology, Faculty of Medicine, University of Augsburg, Germany
| | - Dieter Beule
- Core Unit Bioinformatics, Berlin Institute of Health at Charité-Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, Berlin, Germany
| | - David Horst
- Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, Berlin, Germany
| | - Anna-Lena Wulf
- Universitätsklinikum Bonn, Molekularpathologische Diagnostik, Institut für Pathologie, Venusberg Campus 1, 53127 Bonn, Germany
| | - Daniela Aust
- Core Unit for Molecular Tumor Diagnostics (CMTD), National Center for Tumor Diseases Dresden (NCT), NCT/UCC Dresden, a partnership between German Cancer Research Center (DKFZ), Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology and Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Germany; Institut für Pathologie, Universitätsklinikum Carl Gustav Carus der TU Dresden, Fetscherstr. 74, 01307 Dresden, Germany
| | - Martin Werner
- Institute for Surgical Pathology, Medical Center, University of Freiburg, Germany; Center for Personalized Medicine (ZPM), partner site Freiburg, Germany; Comprehensive Cancer Center Freiburg (CCCF), Medical Center, Freiburg, Germany; German Cancer Consortium (DKTK), Partner Site Freiburg, Germany
| | | | - Philipp Ströbel
- Institut für Pathologie, Universitätsmedizin Göttingen, Germany
| | - Bernd Auber
- Department of Human Genetics, Hannover Medical School, Hannover, Germany
| | - Felix Sahm
- Department of Neuropathology, University Hospital Heidelberg, Germany; CCU Neuropathology, German Consortium for Translational Cancer Research (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Sabine Merkelbach-Bruse
- Institute of Pathology, University of Cologne, Faculty of Medicine and University Hospital Cologne, Cologne, Germany
| | - Udo Siebolts
- Institute of Pathology, University of Cologne, Faculty of Medicine and University Hospital Cologne, Cologne, Germany
| | - Wilfried Roth
- Institut für Pathologie, Universitätsmedizin Mainz, Germany
| | - Silke Lassmann
- Institute for Surgical Pathology, Medical Center, University of Freiburg, Freiburg, Germany; Center for Personalized Medicine (ZPM), Freiburg, Germany
| | - Frederick Klauschen
- Department of Human Genetics, Hannover Medical School, Hannover, Germany; Institute of Human Genetics, Heidelberg University, Heidelberg, Germany
| | - Nadine T Gaisa
- Institute of Pathology, University Hospital RWTH Aachen, Aachen, Germany; Center for Integrated Oncology Aachen Bonn Cologne Düsseldorf (CIO ABCD), Germany; Gerhard-Domagk-Institute of Pathology, University Hospital Münster, Münster, Germany
| | - Wilko Weichert
- Institute of Pathology, TUM School of Medicine and Health, Technical University of Munich, Germany
| | - Matthias Evert
- Institute of Pathology, University of Regensburg, Germany
| | - Sorin Armeanu-Ebinger
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
| | - Stephan Ossowski
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
| | - Christopher Schroeder
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
| | | | - Nisar Malek
- Centers for Personalized Medicine (ZPM), Germany; Department of Gastroenterology, Tübingen University Hospital, Tübingen, Germany
| | - Peter Schirmacher
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany; Centers for Personalized Medicine (ZPM), Germany
| | - Daniel Kazdal
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany; Translational Lung Research Center Heidelberg (TLRC), Member of the German Center for Lung Research (DZL), Heidelberg, Germany
| | - Nicole Pfarr
- Institute of Pathology, TUM School of Medicine and Health, Technical University of Munich, Germany
| | - Jan Budczies
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany; Centers for Personalized Medicine (ZPM), Germany
| | - Albrecht Stenzinger
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany; Centers for Personalized Medicine (ZPM), Germany; Translational Lung Research Center Heidelberg (TLRC), Member of the German Center for Lung Research (DZL), Heidelberg, Germany; German Cancer Consortium (DKTK), Germany.
|
10
|
Skitchenko R, Smirnov S, Krapivin M, Smirnova A, Artomov M, Loboda A, Dinikina Y. Case report: A case study of variant calling pipeline selection effect on the molecular diagnostics outcome. Front Oncol 2024; 14:1422811. [PMID: 39544296 PMCID: PMC11560904 DOI: 10.3389/fonc.2024.1422811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2024] [Accepted: 09/27/2024] [Indexed: 11/17/2024]
Abstract
Next-generation sequencing technologies have not only defined a breakthrough in medical genetics, but also been able to enter routine clinical practice to determine individual genetic susceptibilities. Modern technological developments are routinely introduced to genetic analysis overtaking the established approaches, potentially raising a number of challenges. To what extent is the advantage of new methodologies in synthetic metrics, such as precision and recall, more important than stability and reproducibility? Could differences in the technical protocol for calling variants be crucial to the diagnosis and, by extension, the patient's treatment strategy? A regulatory review process may delay the incorporation of potentially beneficial technologies, resulting in missed opportunities to make the right medical decisions. On the other hand, a blind adoption of new technologies based solely on synthetic metrics of precision and recall can lead to incorrect conclusions and adverse outcomes for the specific patient. Here, we use the example of a patient with a WHO-diagnosed desmoplastic/nodular SHH-medulloblastoma to explore how the choice of DNA variant search protocol affects the genetic diagnostics outcome.
Affiliation(s)
- Rostislav Skitchenko
- Laboratory of Computer Modelling and Artificial Intelligence, Almazov National Medical Research Centre, St. Petersburg, Russia
- Computer Technologies Laboratory, National Research University of Information Technologies, Mechanics and Optics, St. Petersburg, Russia
| | - Sergey Smirnov
- Laboratory of Computer Modelling and Artificial Intelligence, Almazov National Medical Research Centre, St. Petersburg, Russia
| | - Mikhail Krapivin
- Laboratory of Computer Modelling and Artificial Intelligence, Almazov National Medical Research Centre, St. Petersburg, Russia
| | - Anna Smirnova
- Laboratory of Computer Modelling and Artificial Intelligence, Almazov National Medical Research Centre, St. Petersburg, Russia
| | - Mykyta Artomov
- The Institute for Genomic Medicine, Nationwide Children’s Hospital, Columbus, OH, United States
- Department of Pediatrics, The Ohio State University, Columbus, OH, United States
| | - Alexander Loboda
- Laboratory of Computer Modelling and Artificial Intelligence, Almazov National Medical Research Centre, St. Petersburg, Russia
- Computer Technologies Laboratory, National Research University of Information Technologies, Mechanics and Optics, St. Petersburg, Russia
| | - Yulia Dinikina
- Laboratory of Computer Modelling and Artificial Intelligence, Almazov National Medical Research Centre, St. Petersburg, Russia
|
11
|
de Bruin DDSH, Haagmans MA, van der Gaag KJ, Hoogenboom J, Weiler NEC, Tesi N, Salazar A, Zhang Y, Holstege H, Reinders M, M'charek AA, Sijen T, Henneman P. Exploring nanopore direct sequencing performance of forensic STRs, SNPs, InDels, and DNA methylation markers in a single assay. Forensic Sci Int Genet 2024; 74:103154. [PMID: 39426120 DOI: 10.1016/j.fsigen.2024.103154] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Revised: 10/02/2024] [Accepted: 10/04/2024] [Indexed: 10/21/2024]
Abstract
INTRODUCTION The field of forensic DNA analysis has undergone rapid advancements in recent decades. The integration of massively parallel sequencing (MPS) has notably expanded the forensic toolkit, moving beyond identity matching to predicting phenotypic traits and biogeographical ancestry. This shift is of particular significance in cases where conventional DNA profiling fails to identify a single suspect. Supplementing forensic analyses with estimated biological age may be valuable but involves a complex and time-consuming DNA methylation analysis. This study explores and validates the performance of a comprehensive forensic third-generation sequencing assay utilizing Oxford Nanopore Technologies (ONT) in an adaptive and direct sequencing approach. We incorporated the most widely used forensic markers, i.e., STRs, SNPs, InDels, mitochondrial DNA (mtDNA), and two methylation-based clock classifiers, thereby combining forensic genetic and epigenetic analysis in one single workflow. METHODS AND RESULTS In our investigation, DNA from six anonymous individuals was sequenced using the ONT standard adaptive direct sequencing approach, reaching a mean percentage of on-target reads ranging from 6.6 % to 7.7 % per sample. ONT data was compared to standard MPS data and Illumina EPIC DNA methylation profiles. Basecalling employed recommended ONT software packages. TREAT was used for ONT-based analysis of autosomal and Y-chromosome STRs, achieving 90-92 % correct calls depending on allelic read depth thresholds. InDel analyses for two lower-quality samples proved challenging due to inadequate read depth, while the remaining four samples significantly contributed to the observed percentage markers (60.9 %) and correct calls (97.8 %). SNP analysis achieved a 98 % call rate, with only two mismatches and two missed alleles. ONT-generated DNA methylation data demonstrated Pearson's correlation coefficients with EPIC data ranging from 0.67 to 0.97 for Horvath's clock. 
Additional age-associated markers exhibited Pearson's correlation coefficients with chronological age between 0.14 (ELOVL2) and 0.96 (FHL2) at read depths of <30 and <20, respectively. Despite excluding mtDNA from our targeted sequencing approach, adaptive proof-reading fragments covered the complete mtDNA with an average read depth of 21-72, showing 100 % concordance with reference data. DISCUSSION Our exploratory study using ONT adaptive sequencing for conventional forensic and age associated DNA methylation markers showed high sequencing accuracy for a significant number of markers, showcasing ONT as a promising (epi)genetic forensic method. Future studies must address three critical aspects: determining clear quantity and quality measures and detection thresholds for accuracy, optimizing input DNA quantity for forensic casework expectations, and addressing ethical considerations associated with phenotype and ancestry analysis to prevent ethnic biases.
Affiliation(s)
- Desiree D S H de Bruin
- Department of Human Genetics, Amsterdam University Medical Center, University of Amsterdam, Amsterdam, The Netherlands; CLHC, Amsterdam Center for Forensic Science and Medicine, University of Amsterdam, Amsterdam, The Netherlands.
| | - Martin A Haagmans
- Department of Human Genetics, Amsterdam University Medical Center, University of Amsterdam, Amsterdam, The Netherlands.
| | | | - Jerry Hoogenboom
- Netherlands Forensic Institute, Biological Traces, Den Haag, The Netherlands.
| | - Natalie E C Weiler
- Netherlands Forensic Institute, Biological Traces, Den Haag, The Netherlands.
| | - Niccoló Tesi
- Department of Human Genetics, Genomics of Neurodegenerative Diseases and Aging, Vrije Universiteit Amsterdam, Amsterdam University Medical Center, Amsterdam, The Netherlands; Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands.
| | - Alex Salazar
- Department of Human Genetics, Genomics of Neurodegenerative Diseases and Aging, Vrije Universiteit Amsterdam, Amsterdam University Medical Center, Amsterdam, The Netherlands.
| | - Yaran Zhang
- Department of Human Genetics, Genomics of Neurodegenerative Diseases and Aging, Vrije Universiteit Amsterdam, Amsterdam University Medical Center, Amsterdam, The Netherlands.
| | - Henne Holstege
- Department of Human Genetics, Genomics of Neurodegenerative Diseases and Aging, Vrije Universiteit Amsterdam, Amsterdam University Medical Center, Amsterdam, The Netherlands; Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands.
| | - Marcel Reinders
- Department of Human Genetics, Genomics of Neurodegenerative Diseases and Aging, Vrije Universiteit Amsterdam, Amsterdam University Medical Center, Amsterdam, The Netherlands; Delft Bioinformatics Lab, Delft University of Technology, Delft, The Netherlands.
| | | | - Titia Sijen
- Netherlands Forensic Institute, Biological Traces, Den Haag, The Netherlands; University of Amsterdam, Swammerdam Institute for Life Sciences, Amsterdam, The Netherlands.
| | - Peter Henneman
- Department of Human Genetics, Amsterdam University Medical Center, University of Amsterdam, Amsterdam, The Netherlands; Amsterdam Reproduction and Development research Institute, Amsterdam, The Netherlands; Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam, The Netherlands.
|
12
|
Yu L, Zhang Y, Wang D, Li L, Zhang R, Li J. Harmonizing tumor mutational burden analysis: Insights from a multicenter study using in silico reference data sets in clinical whole-exome sequencing (WES). Am J Clin Pathol 2024; 162:408-419. [PMID: 38733635 DOI: 10.1093/ajcp/aqae056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2024] [Accepted: 04/13/2024] [Indexed: 05/13/2024]
Abstract
OBJECTIVES Tumor mutational burden (TMB) is a significant biomarker for predicting immune checkpoint inhibitor response, but the clinical performance of whole-exome sequencing (WES)-based TMB estimation has received less attention compared to panel-based methods. This study aimed to assess the reliability and comparability of WES-based TMB analysis among laboratories under routine testing conditions. METHODS A multicenter study was conducted involving 24 laboratories in China using in silico reference data sets. The accuracy and comparability of TMB estimation were evaluated using matched tumor-normal data sets. Factors such as accuracy of variant calls, limit of detection (LOD) of WES test, size of regions of interest (ROIs) used for TMB calculation, and TMB cutoff points were analyzed. RESULTS The laboratories consistently underestimated the expected TMB scores in matched tumor-normal samples, with only 50% falling within the ±30% TMB interval. Samples with low TMB score (<2.5) received the consensus interpretation. Accuracy of variant calls, LOD of the WES test, ROI, and TMB cutoff points were important factors causing interlaboratory deviations. CONCLUSIONS This study highlights real-world challenges in WES-based TMB analysis that need to be improved and optimized. This research will aid in the selection of more reasonable analytical procedures to minimize potential methodologic biases in estimating TMB in clinical exome sequencing tests. Harmonizing TMB estimation in clinical testing conditions is crucial for accurately evaluating patients' response to immunotherapy.
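The ±30% interval criterion mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common convention of expressing TMB as somatic mutations per megabase of the region of interest (ROI). The function names, mutation counts, and ROI size are hypothetical and are not taken from the study.

```python
def tmb_score(n_mutations: int, roi_size_bp: int) -> float:
    """TMB as mutations per megabase of the region of interest (ROI)."""
    if roi_size_bp <= 0:
        raise ValueError("roi_size_bp must be positive")
    return n_mutations / (roi_size_bp / 1_000_000)


def within_interval(reported: float, expected: float, tolerance: float = 0.30) -> bool:
    """True if `reported` lies within expected +/- tolerance * expected."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return abs(reported - expected) <= tolerance * expected


# Hypothetical numbers for illustration only (not data from the study).
expected = tmb_score(n_mutations=350, roi_size_bp=35_000_000)  # 10.0 mut/Mb
reported = tmb_score(n_mutations=230, roi_size_bp=35_000_000)  # about 6.57 mut/Mb
print(within_interval(reported, expected))  # False: underestimated by more than 30%
```

Because the ROI size is the denominator, two laboratories that call the same variants but use different ROIs will report different scores, which is one of the deviation factors the abstract lists.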
Affiliation(s)
- Lijia Yu
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, China
| | - Yuanfeng Zhang
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, China
| | - Duo Wang
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, China
| | - Lin Li
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, China
| | - Rui Zhang
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, China
| | - Jinming Li
- National Center for Clinical Laboratories, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing Hospital/National Center of Gerontology, Beijing, China
- National Center for Clinical Laboratories, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, China
|
13
|
Hanssen F, Gabernet G, Bäuerle F, Stöcker B, Wiegand F, Smith NH, Mertes C, Neogi AG, Brandhoff L, Ossowski A, Altmueller J, Becker K, Petzold A, Sturm M, Stöcker T, Sivalingam S, Brand F, Schmidt A, Buness A, Probst AJ, Motameny S, Köster J. NCBench: providing an open, reproducible, transparent, adaptable, and continuous benchmark approach for DNA-sequencing-based variant calling. F1000Res 2024; 12:1125. [PMID: 39345270 PMCID: PMC11428021 DOI: 10.12688/f1000research.140344.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Accepted: 08/27/2024] [Indexed: 10/01/2024]
Abstract
We present the results of the human genomic small variant calling benchmarking initiative of the German Research Foundation (DFG) funded Next Generation Sequencing Competence Network (NGS-CN) and the German Human Genome-Phenome Archive (GHGA). In this effort, we developed NCBench, a continuous benchmarking platform for the evaluation of small genomic variant callsets in terms of recall, precision, and false positive/negative error patterns. NCBench is implemented as a continuously re-evaluated open-source repository. We show that it is possible to entirely rely on public free infrastructure (Github, Github Actions, Zenodo) in combination with established open-source tools. NCBench is agnostic of the used dataset and can evaluate an arbitrary number of given callsets, while reporting the results in a visual and interactive way. We used NCBench to evaluate over 40 callsets generated by various variant calling pipelines available in the participating groups that were run on three exome datasets from different enrichment kits and at different coverages. While all pipelines achieve high overall quality, subtle systematic differences between callers and datasets exist and are made apparent by NCBench. These insights are useful to improve existing pipelines and develop new workflows. NCBench is meant to be open for the contribution of any given callset. Most importantly, for authors, it will enable the omission of repeated re-implementation of paper-specific variant calling benchmarks for the publication of new tools or pipelines, while readers will benefit from being able to (continuously) observe the performance of tools and pipelines at the time of reading instead of at the time of writing.
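The recall and precision metrics named in the abstract can be sketched as follows. This is a deliberately simplified illustration that matches variants by exact (chromosome, position, ref, alt) identity. It is an assumption for demonstration purposes and not how NCBench itself compares callsets. The `evaluate` function, the `Metrics` type, and the example variants are all hypothetical.

```python
from typing import NamedTuple, Set, Tuple

# A variant keyed by (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


class Metrics(NamedTuple):
    tp: int
    fp: int
    fn: int
    precision: float
    recall: float


def evaluate(callset: Set[Variant], truth: Set[Variant]) -> Metrics:
    """Compare a callset to a truth set by exact variant identity."""
    tp = len(callset & truth)   # called and present in truth
    fp = len(callset - truth)   # called but absent from truth
    fn = len(truth - callset)   # in truth but missed by the caller
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return Metrics(tp, fp, fn, precision, recall)


# Hypothetical toy data for illustration only.
truth = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
         ("chr2", 300, "G", "A"), ("chr2", 900, "T", "TA")}
calls = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
         ("chr2", 300, "G", "A"), ("chr3", 50, "C", "G")}

m = evaluate(calls, truth)
print(m.precision, m.recall)  # 0.75 0.75
```

Exact-identity matching penalizes equivalent variants that are merely represented differently, for example an indel that is left-aligned in one callset but not the other, so this sketch should be read only as a definition of the two metrics.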
Affiliation(s)
- Friederike Hanssen
  - Quantitative Biology Center, Eberhard Karls University Tübingen, Tübingen, Germany
- Gisela Gabernet
  - Quantitative Biology Center, Eberhard Karls University Tübingen, Tübingen, Germany
- Famke Bäuerle
  - Quantitative Biology Center, Eberhard Karls University Tübingen, Tübingen, Germany
  - M3 Research Center, University Hospital, Tübingen, Germany
  - Institute for Translational Bioinformatics, University Medical Center, Tübingen, Germany
  - Institute for Bioinformatics and Medical Informatics (IBMI), Eberhard-Karls University of Tübingen, Tübingen, Germany
- Bianca Stöcker
  - Bioinformatics and Computational Oncology, Institute for Artificial Intelligence in Medicine (IKIM), University Medicine Essen, University of Duisburg-Essen, Essen, Germany
- Felix Wiegand
  - Bioinformatics and Computational Oncology, Institute for Artificial Intelligence in Medicine (IKIM), University Medicine Essen, University of Duisburg-Essen, Essen, Germany
- Nicholas H. Smith
  - TUM School of Computation, Information and Technology, Technical University of Munich, Munich, Germany
- Christian Mertes
  - TUM School of Computation, Information and Technology, Technical University of Munich, Munich, Germany
  - Munich Data Science Institute, Technical University of Munich, Munich, Germany
  - Institute of Human Genetics, Klinikum rechts der Isar, School of Medicine, Technical University of Munich, Munich, Germany
- Avirup Guha Neogi
  - Cologne Center for Genomics, University of Cologne, Cologne, Germany
- Leon Brandhoff
  - Cologne Center for Genomics, University of Cologne, Cologne, Germany
  - West German Genome Center - Cologne, University of Cologne, Cologne, Germany
- Anna Ossowski
  - Cologne Center for Genomics, University of Cologne, Cologne, Germany
- Janine Altmueller
  - Cologne Center for Genomics, University of Cologne, Cologne, Germany
  - Core Facility Genomics, Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Berlin, Germany
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
- Kerstin Becker
  - Cologne Center for Genomics, University of Cologne, Cologne, Germany
- Andreas Petzold
  - DRESDEN-concept Genome Center, TUD Dresden University of Technology, Dresden, Germany
- Marc Sturm
  - Institute of Medical Genetics and Applied Genomics, University Hospital Tuebingen, Tübingen, Germany
- Tyll Stöcker
  - Institute of Crop Science and Resource Conservation, University of Bonn, Bonn, Germany
- Sugirthan Sivalingam
  - Institute of Human Genetics, Medical Faculty and University Hospital Düsseldorf, Heinrich-Heine-University Düsseldorf, Düsseldorf, Germany
- Fabian Brand
  - Institute for Genomic Statistics and Bioinformatics, Medical Faculty, University of Bonn, Bonn, Germany
- Axel Schmidt
  - Institute of Human Genetics, University Hospital of Bonn, Bonn, Germany
- Andreas Buness
  - Core Unit for Bioinformatics Analysis, University Hospital Bonn, Bonn, Germany
- Alexander J. Probst
  - Environmental Metagenomics, Research Center One Health Ruhr, University Alliance Ruhr, Faculty of Chemistry, University of Duisburg-Essen, Essen, Germany
- Susanne Motameny
  - Cologne Center for Genomics, University of Cologne, Cologne, Germany
  - West German Genome Center - Cologne, University of Cologne, Cologne, Germany
- Johannes Köster
  - Bioinformatics and Computational Oncology, Institute for Artificial Intelligence in Medicine (IKIM), University Medicine Essen, University of Duisburg-Essen, Essen, Germany
  - German Cancer Consortium, Essen, Germany
14
Rothschild D, Susanto TT, Sui X, Spence JP, Rangan R, Genuth NR, Sinnott-Armstrong N, Wang X, Pritchard JK, Barna M. Diversity of ribosomes at the level of rRNA variation associated with human health and disease. CELL GENOMICS 2024; 4:100629. [PMID: 39111318 PMCID: PMC11480859 DOI: 10.1016/j.xgen.2024.100629] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 05/07/2024] [Accepted: 07/14/2024] [Indexed: 09/14/2024]
Abstract
Human cells carry hundreds of copies of rDNA, yet it is unknown whether these copies possess sequence variations that form different types of ribosomes. Here, we developed an algorithm for long-read variant calling, termed RGA, which revealed that variations in human rDNA loci are predominantly insertion-deletion (indel) variants. We developed full-length rRNA sequencing (RIBO-RT) and in situ sequencing (SWITCH-seq), which showed that translating ribosomes possess variation in rRNA. Over 1,000 variants are expressed at low levels. However, tens of variants are abundant and form distinct rRNA subtypes with different structures near indels, as revealed by long-read rRNA structure probing coupled to dimethyl sulfate sequencing. rRNA subtypes show differential expression in endoderm/ectoderm-derived tissues, and in cancer, low-abundance rRNA variants can become highly expressed. Together, this study identifies the diversity of ribosomes at the level of rRNA variants, their chromosomal location, and unique structure, as well as the association of ribosome variation with tissue-specific biology and cancer.
Affiliation(s)
- Daphna Rothschild
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Xin Sui
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Jeffrey P Spence
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Ramya Rangan
  - Biophysics Program, Stanford University, Stanford, CA 94305, USA
- Naomi R Genuth
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Biology, Stanford University, Stanford, CA 94305, USA
- Xiao Wang
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Jonathan K Pritchard
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Biology, Stanford University, Stanford, CA 94305, USA
- Maria Barna
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
15
Ong SS, Ho PJ, Khng AJ, Tan BKT, Tan QT, Tan EY, Tan SM, Putti TC, Lim SH, Tang ELS, Li J, Hartman M. Genomic Insights into Idiopathic Granulomatous Mastitis through Whole-Exome Sequencing: A Case Report of Eight Patients. Int J Mol Sci 2024; 25:9058. [PMID: 39201744 PMCID: PMC11354296 DOI: 10.3390/ijms25169058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2024] [Revised: 08/17/2024] [Accepted: 08/19/2024] [Indexed: 09/03/2024] Open
Abstract
Idiopathic granulomatous mastitis (IGM) is a rare condition characterised by chronic inflammation and granuloma formation in the breast. The aetiology of IGM is unclear. By focusing on the protein-coding regions of the genome, where most disease-related mutations occur, whole-exome sequencing (WES) is a powerful approach for investigating rare and complex conditions like IGM. We report WES results on paired blood and tissue samples from eight IGM patients. Samples were processed using standard genomic protocols. Somatic variants were called with two analytical pipelines: nf-core/sarek with Strelka2 and GATK4 with Mutect2. Our WES study of eight patients did not find evidence supporting a clear genetic component. The discrepancies between variant calling algorithms, along with the considerable genetic heterogeneity observed amongst the eight IGM cases, indicate that common genetic drivers are not readily identifiable. With only three genes, CHIT1, CEP170, and CTR9, recurrently altered in multiple cases, the genetic basis of IGM remains uncertain. The absence of validation for somatic variants by Sanger sequencing raises further questions about the role of genetic mutations in the disease. Other potential contributors to the disease should be explored.
Affiliation(s)
- Seeu Si Ong
  - Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore 138672, Singapore
  - Department of Surgery, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119228, Singapore
- Peh Joo Ho
  - Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore 138672, Singapore
  - Department of Surgery, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119228, Singapore
  - Saw Swee Hock School of Public Health, National University of Singapore, Singapore 117597, Singapore
- Alexis Jiaying Khng
  - Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore 138672, Singapore
- Benita Kiat Tee Tan
  - Department of General Surgery, Sengkang General Hospital, Singapore 544886, Singapore
  - Department of Breast Surgery, Singapore General Hospital, Singapore 169608, Singapore
  - Division of Surgical Oncology, National Cancer Centre, Singapore 169610, Singapore
- Qing Ting Tan
  - Breast Department, KK Women’s and Children’s Hospital, Singapore 229899, Singapore
- Ern Yu Tan
  - Department of General Surgery, Tan Tock Seng Hospital, Singapore 308433, Singapore
  - Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore 308232, Singapore
  - Institute of Molecular and Cell Biology (IMCB), Agency for Science, Technology and Research (A*STAR), Singapore 138673, Singapore
- Su-Ming Tan
  - Division of Breast Surgery, Changi General Hospital, Singapore 529889, Singapore
- Thomas Choudary Putti
  - Department of Pathology, National University Health System, Singapore 119228, Singapore
- Swee Ho Lim
  - Breast Department, KK Women’s and Children’s Hospital, Singapore 229899, Singapore
- Jingmei Li
  - Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore 138672, Singapore
  - Department of Surgery, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119228, Singapore
- Mikael Hartman
  - Department of Surgery, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 119228, Singapore
  - Saw Swee Hock School of Public Health, National University of Singapore, Singapore 117597, Singapore
  - Department of Surgery, University Surgical Cluster, National University Health System, Singapore 119228, Singapore
16
Lazareva TE, Barbitoff YA, Nasykhova YA, Glotov AS. Major Causes of Conflicting Interpretations of Variant Pathogenicity in Rare Disease: A Systematic Analysis. J Pers Med 2024; 14:864. [PMID: 39202055 PMCID: PMC11355203 DOI: 10.3390/jpm14080864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2024] [Revised: 07/31/2024] [Accepted: 08/12/2024] [Indexed: 09/03/2024] Open
Abstract
The identification of the genetic causes of inherited disorders from next-generation sequencing (NGS) data remains a complicated process, in particular due to challenges in interpreting the vast amount of generated data and the hundreds of candidate variants identified. Inconsistencies in variant classification, where genetic centers classify the same variant differently, can hinder accurate diagnoses for rare diseases. Publicly available databases that collect data on human genetic variations and their association with diseases provide ample opportunities to discover conflicts in variant interpretation worldwide. In this study, we explored patterns of variant classification discrepancies using data from ClinVar, a public archive of variant interpretations. We found that 5.7% of variants have conflicting interpretations (COIs) reported, and the vast majority of interpretation conflicts arise for variants of uncertain significance (VUS). As many as 78% of clinically relevant genes harbor variants with COIs, and genes with high COI rates tended to have more exons and longer transcripts, with a greater proportion of genes linked to several distinct conditions. The enrichment analysis of COI-enriched genes revealed that the products of these genes are involved in cardiac disorders and in muscle development and function. To improve diagnoses, we believe that specific variant interpretation rules could be developed for such genes. Additionally, our findings underscore the need for the publication of variant pathogenicity evidence and the importance of considering every variant as VUS unless proven otherwise.
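To make the notion of a per-gene COI rate concrete, the sketch below counts variants whose submitted classifications fall into more than one tier. It is an illustration under stated assumptions, not the authors' pipeline: the three-tier grouping (P/LP, VUS, B/LB), the `has_conflict` and `coi_rates` helpers, and the gene names and records are all hypothetical.

```python
from collections import defaultdict

# Assumption: classifications are grouped into three tiers, and a variant is
# "conflicting" when its submissions span more than one tier.
TIER = {
    "Pathogenic": "P/LP",
    "Likely pathogenic": "P/LP",
    "Uncertain significance": "VUS",
    "Likely benign": "B/LB",
    "Benign": "B/LB",
}


def has_conflict(classifications):
    """True if the submitted classifications span more than one tier."""
    return len({TIER[c] for c in classifications}) > 1


def coi_rates(records):
    """Fraction of variants with conflicting interpretations, per gene.

    `records` is an iterable of (gene, [classification, ...]) pairs,
    one pair per variant.
    """
    totals = defaultdict(int)
    conflicts = defaultdict(int)
    for gene, classifications in records:
        totals[gene] += 1
        if has_conflict(classifications):
            conflicts[gene] += 1
    return {gene: conflicts[gene] / totals[gene] for gene in totals}


# Invented records for illustration; gene names are placeholders.
records = [
    ("GENE_A", ["Pathogenic", "Likely pathogenic"]),       # same tier
    ("GENE_A", ["Pathogenic", "Uncertain significance"]),  # conflict
    ("GENE_B", ["Benign"]),
    ("GENE_B", ["Likely benign", "Benign"]),               # same tier
]
rates = coi_rates(records)
print(rates)
```

Grouping by tier means that a Pathogenic versus Likely pathogenic disagreement is not counted as a conflict here; a stricter definition would compare the raw labels instead.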
Affiliation(s)
- Tatyana E. Lazareva
  - Department of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya Line 3, 199034 St. Petersburg, Russia
- Yury A. Barbitoff
  - Department of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya Line 3, 199034 St. Petersburg, Russia
  - Bioinformatics Institute, Kantemirovskaya St. 2A, 197342 St. Petersburg, Russia
- Yulia A. Nasykhova
  - Department of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya Line 3, 199034 St. Petersburg, Russia
- Andrey S. Glotov
  - Department of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya Line 3, 199034 St. Petersburg, Russia
17
Kong J, Yao Z, Chen J, Zhao Q, Li T, Dong M, Bai Y, Liu Y, Lin Z, Xie Q, Zhang X. Comparative Transcriptome Analysis Unveils Regulatory Factors Influencing Fatty Liver Development in Lion-Head Geese under High-Intake Feeding Compared to Normal Feeding. Vet Sci 2024; 11:366. [PMID: 39195820 PMCID: PMC11359645 DOI: 10.3390/vetsci11080366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2024] [Revised: 07/13/2024] [Accepted: 08/01/2024] [Indexed: 08/29/2024] Open
Abstract
The lion-head goose is the only large goose species in China, and it is one of the largest goose species in the world. Lion-head geese have a strong tolerance for massive energy intake and, under special feeding, preferentially accumulate fat in liver tissue. Therefore, the aim of this study was to investigate the impact of high feed intake, compared to normal feeding conditions, on the transcriptome changes associated with fatty liver development in lion-head geese. In this study, 20 healthy adult lion-head geese were randomly assigned to a control group (CONTROL, n = 10) and a high-intake-fed group (CASE, n = 10). After 38 d of treatment, all geese were sacrificed, and liver samples were collected. Three geese were randomly selected from each of the CONTROL and CASE groups for whole-transcriptome analysis to identify the key regulatory genes. We identified 716 differentially expressed mRNAs, 145 differentially expressed circRNAs, and 39 differentially expressed lncRNAs, including upregulated and downregulated genes. GO enrichment analysis showed that these genes were significantly enriched in molecular function. The node degree analysis and centrality metrics of the mRNA-lncRNA-circRNA triple regulatory network indicate the presence of crucial functional nodes in the network. We identified differentially expressed genes, including HSPB9, Pgk1, Hsp70, ME2, malic enzyme, HSP90, FADS1, transferrin, FABP, PKM2, Serpin2, and PKS, and we additionally confirmed the accuracy of sequencing at the RNA level. This is the first study of the important differentially expressed genes that regulate fatty liver in lion-head geese under high-intake feeding. In summary, these differentially expressed genes may play important roles in fatty liver development in the lion-head goose, and their functions and mechanisms should be investigated in future studies.
Affiliation(s)
- Jie Kong
  - State Key Laboratory of Swine and Poultry Breeding Industry & Heyuan Branch, Guangdong Provincial Laboratory of Lingnan Modern Agricultural Science and Technology, College of Animal Science, South China Agricultural University, Guangzhou 510642, China
  - Guangdong Provincial Key Lab of AgroAnimal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou 510642, China
  - Guangdong Engineering Research Center for Vector Vaccine of Animal Virus, Guangzhou 510642, China
  - Zhongshan Innovation Center, South China Agricultural University, Zhongshan 528400, China
- Ziqi Yao
  - State Key Laboratory of Swine and Poultry Breeding Industry & Heyuan Branch, Guangdong Provincial Laboratory of Lingnan Modern Agricultural Science and Technology, College of Animal Science, South China Agricultural University, Guangzhou 510642, China
  - Guangdong Provincial Key Lab of AgroAnimal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou 510642, China
  - Guangdong Engineering Research Center for Vector Vaccine of Animal Virus, Guangzhou 510642, China
  - Zhongshan Innovation Center, South China Agricultural University, Zhongshan 528400, China
- Junpeng Chen
  - Shantou Baisha Research Institute of Original Species of Poultry and Stock, Shantou 515000, China
- Qiqi Zhao
  - State Key Laboratory of Swine and Poultry Breeding Industry & Heyuan Branch, Guangdong Provincial Laboratory of Lingnan Modern Agricultural Science and Technology, College of Animal Science, South China Agricultural University, Guangzhou 510642, China
  - Guangdong Provincial Key Lab of AgroAnimal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou 510642, China
  - Guangdong Engineering Research Center for Vector Vaccine of Animal Virus, Guangzhou 510642, China
  - Zhongshan Innovation Center, South China Agricultural University, Zhongshan 528400, China
- Tong Li
  - State Key Laboratory of Swine and Poultry Breeding Industry & Heyuan Branch, Guangdong Provincial Laboratory of Lingnan Modern Agricultural Science and Technology, College of Animal Science, South China Agricultural University, Guangzhou 510642, China
  - Guangdong Provincial Key Lab of AgroAnimal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou 510642, China
  - Guangdong Engineering Research Center for Vector Vaccine of Animal Virus, Guangzhou 510642, China
  - Zhongshan Innovation Center, South China Agricultural University, Zhongshan 528400, China
- Mengyue Dong
  - State Key Laboratory of Swine and Poultry Breeding Industry & Heyuan Branch, Guangdong Provincial Laboratory of Lingnan Modern Agricultural Science and Technology, College of Animal Science, South China Agricultural University, Guangzhou 510642, China
  - Guangdong Provincial Key Lab of AgroAnimal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou 510642, China
  - Guangdong Engineering Research Center for Vector Vaccine of Animal Virus, Guangzhou 510642, China
  - Zhongshan Innovation Center, South China Agricultural University, Zhongshan 528400, China
- Yuhang Bai
  - State Key Laboratory of Swine and Poultry Breeding Industry & Heyuan Branch, Guangdong Provincial Laboratory of Lingnan Modern Agricultural Science and Technology, College of Animal Science, South China Agricultural University, Guangzhou 510642, China
  - Guangdong Provincial Key Lab of AgroAnimal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou 510642, China
  - Guangdong Engineering Research Center for Vector Vaccine of Animal Virus, Guangzhou 510642, China
  - Zhongshan Innovation Center, South China Agricultural University, Zhongshan 528400, China
- Yuanjia Liu
  - College of Coastal Agricultural Sciences, Guangdong Ocean University, Zhanjiang 524088, China
- Zhenping Lin
  - Shantou Baisha Research Institute of Original Species of Poultry and Stock, Shantou 515000, China
- Qingmei Xie
  - State Key Laboratory of Swine and Poultry Breeding Industry & Heyuan Branch, Guangdong Provincial Laboratory of Lingnan Modern Agricultural Science and Technology, College of Animal Science, South China Agricultural University, Guangzhou 510642, China
  - Guangdong Provincial Key Lab of AgroAnimal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou 510642, China
  - Guangdong Engineering Research Center for Vector Vaccine of Animal Virus, Guangzhou 510642, China
  - Zhongshan Innovation Center, South China Agricultural University, Zhongshan 528400, China
- Xinheng Zhang
  - State Key Laboratory of Swine and Poultry Breeding Industry & Heyuan Branch, Guangdong Provincial Laboratory of Lingnan Modern Agricultural Science and Technology, College of Animal Science, South China Agricultural University, Guangzhou 510642, China
  - Guangdong Provincial Key Lab of AgroAnimal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou 510642, China
  - Guangdong Engineering Research Center for Vector Vaccine of Animal Virus, Guangzhou 510642, China
  - Zhongshan Innovation Center, South China Agricultural University, Zhongshan 528400, China
18
Ricci CA, Crysup B, Phillips NR, Ray WC, Santillan MK, Trask AJ, Woerner AE, Goulopoulou S. Machine learning: a new era for cardiovascular pregnancy physiology and cardio-obstetrics research. Am J Physiol Heart Circ Physiol 2024; 327:H417-H432. [PMID: 38847756 PMCID: PMC11442027 DOI: 10.1152/ajpheart.00149.2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Revised: 05/31/2024] [Accepted: 05/31/2024] [Indexed: 06/10/2024]
Abstract
The maternal cardiovascular system undergoes functional and structural adaptations during pregnancy and postpartum to support increased metabolic demands of offspring and placental growth, labor, and delivery, as well as recovery from childbirth. Thus, pregnancy imposes physiological stress upon the maternal cardiovascular system, and in the absence of an appropriate response it imparts potential risks for cardiovascular complications and adverse outcomes. The proportion of pregnancy-related maternal deaths from cardiovascular events has been steadily increasing, contributing to high rates of maternal mortality. Despite advances in cardiovascular physiology research, there is still no comprehensive understanding of maternal cardiovascular adaptations in healthy pregnancies. Furthermore, current approaches for the prognosis of cardiovascular complications during pregnancy are limited. Machine learning (ML) offers new and effective tools for investigating mechanisms involved in pregnancy-related cardiovascular complications as well as the development of potential therapies. The main goal of this review is to summarize existing research that uses ML to understand mechanisms of cardiovascular physiology during pregnancy and develop prediction models for clinical application in pregnant patients. We also provide an overview of ML platforms that can be used to comprehensively understand cardiovascular adaptations to pregnancy and discuss the interpretability of ML outcomes, the consequences of model bias, and the importance of ethical consideration in ML use.
Affiliation(s)
- Contessa A Ricci
  - College of Nursing, Washington State University, Spokane, Washington, United States
  - IREACH: Institute for Research and Education to Advance Community Health, Washington State University, Seattle, Washington, United States
  - Elson S. Floyd College of Medicine, Washington State University, Spokane, Washington, United States
- Benjamin Crysup
  - Department of Microbiology, Immunology and Genetics, University of North Texas Health Science, Fort Worth, Texas, United States
  - Center for Human Identification, University of North Texas Health Science Center, Fort Worth, Texas, United States
- Nicole R Phillips
  - Department of Microbiology, Immunology and Genetics, University of North Texas Health Science, Fort Worth, Texas, United States
- William C Ray
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, Ohio, United States
- Mark K Santillan
  - Department of Obstetrics and Gynecology, University of Iowa Carver College of Medicine, Iowa City, Iowa, United States
- Aaron J Trask
  - Center for Cardiovascular Research, The Abigail Wexner Research Institute at Nationwide Children's Hospital, Columbus, Ohio, United States
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, Ohio, United States
- August E Woerner
  - Department of Microbiology, Immunology and Genetics, University of North Texas Health Science, Fort Worth, Texas, United States
  - Center for Human Identification, University of North Texas Health Science Center, Fort Worth, Texas, United States
- Styliani Goulopoulou
  - Lawrence D. Longo Center for Perinatal Biology, Departments of Basic Sciences, Gynecology and Obstetrics, Loma Linda University, Loma Linda, California, United States
19
Yang XT, Yang WL, Lau YL. NGS data analysis for molecular diagnosis of Inborn Errors of Immunity. Semin Immunol 2024; 74-75:101901. [PMID: 39509871 DOI: 10.1016/j.smim.2024.101901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2024] [Revised: 10/01/2024] [Accepted: 10/30/2024] [Indexed: 11/15/2024]
Abstract
Inborn errors of immunity (IEI) encompass a group of disorders with a strong genetic component. Prompt and accurate diagnosis of these disorders is essential for effective clinical management. Next-generation sequencing (NGS) has significantly enhanced the diagnostic process by offering a comprehensive and scalable approach for identifying genomic variations causal for these disorders. Nevertheless, the bioinformatics analysis of NGS data poses several challenges. In this review, we explore these challenges and share our insights on addressing them, aiming to improve the overall diagnostic yield.
Affiliation(s)
- X T Yang
  - Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- W L Yang
  - Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Y L Lau
  - Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
20
Sun Y, Zhao X, Fan X, Wang M, Li C, Liu Y, Wu P, Yan Q, Sun L. Assessing the impact of sequencing platforms and analytical pipelines on whole-exome sequencing. Front Genet 2024; 15:1334075. [PMID: 38818042 PMCID: PMC11137314 DOI: 10.3389/fgene.2024.1334075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 04/30/2024] [Indexed: 06/01/2024] Open
Affiliation(s)
- Yanping Sun
  - GeneMind Biosciences Company Limited, Shenzhen, China
- Xiaochao Zhao
  - GeneMind Biosciences Company Limited, Shenzhen, China
- Xue Fan
  - Clinical Research Institute, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Miao Wang
  - GeneMind Biosciences Company Limited, Shenzhen, China
- Chaoyang Li
  - GeneMind Biosciences Company Limited, Shenzhen, China
- Yongfeng Liu
  - GeneMind Biosciences Company Limited, Shenzhen, China
- Ping Wu
  - GeneMind Biosciences Company Limited, Shenzhen, China
- Qin Yan
  - GeneMind Biosciences Company Limited, Shenzhen, China
- Lei Sun
  - GeneMind Biosciences Company Limited, Shenzhen, China
21
Sergi A, Beltrame L, Marchini S, Masseroli M. Integrated approach to generate artificial samples with low tumor fraction for somatic variant calling benchmarking. BMC Bioinformatics 2024; 25:180. [PMID: 38720249 PMCID: PMC11077792 DOI: 10.1186/s12859-024-05793-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 04/19/2024] [Indexed: 05/12/2024] Open
Abstract
BACKGROUND: High-throughput sequencing (HTS) has become the gold standard approach for variant analysis in cancer research. However, somatic variants may occur at low fractions due to contamination from normal cells or tumor heterogeneity; this poses a significant challenge for standard HTS analysis pipelines. The problem is exacerbated in scenarios with minimal tumor DNA, such as circulating tumor DNA in plasma. Assessing the sensitivity and detection capability of HTS approaches in such cases is paramount, but time-consuming and expensive: specialized experimental protocols and a sufficient quantity of samples are required for processing and analysis. To overcome these limitations, we propose a new computational approach specifically designed for the generation of artificial datasets suitable for this task, simulating ultra-deep targeted sequencing data with low-fraction variants and demonstrating their effectiveness in benchmarking low-fraction variant calling.
RESULTS: Our approach enables the generation of artificial raw reads that mimic real data without relying on pre-existing data by using NEAT, a fine-grained read simulator that generates artificial datasets using models learned from multiple different datasets. Then, it incorporates low-fraction variants to simulate somatic mutations in samples with minimal tumor DNA content. To prove the suitability of the created artificial datasets for low-fraction variant calling benchmarking, we used them as ground truth to evaluate the performance of widely used variant calling algorithms: they allowed us to define tuned parameter values of major variant callers, considerably improving their detection of very low-fraction variants.
CONCLUSIONS: Our findings highlight both the pivotal role of our approach in creating adequate artificial datasets with low tumor fraction, facilitating rapid prototyping and benchmarking of algorithms for this type of dataset, and the important need to advance low-fraction variant calling techniques.
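Why low tumor fraction calls for ultra-deep sequencing can be seen from a back-of-the-envelope model. The sketch below is not the authors' simulation approach (they use the NEAT read simulator); it assumes a clonal heterozygous variant in a diploid region, no sequencing error, and independent binomial sampling of reads. The function names and the three-read calling threshold are hypothetical.

```python
from math import comb


def expected_vaf(tumor_fraction, clonal_fraction=1.0):
    """Expected variant allele fraction of a heterozygous somatic variant.

    Assumes a diploid locus: only one of two alleles in each tumor cell
    carries the variant, hence the division by 2.
    """
    return tumor_fraction * clonal_fraction / 2


def detection_probability(depth, vaf, min_alt_reads):
    """P(at least `min_alt_reads` variant-supporting reads) under Binomial(depth, vaf)."""
    p_fewer = sum(
        comb(depth, k) * vaf**k * (1 - vaf) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_fewer


vaf = expected_vaf(0.01)  # 1% tumor DNA -> about 0.5% VAF under these assumptions
p_shallow = detection_probability(depth=100, vaf=vaf, min_alt_reads=3)
p_deep = detection_probability(depth=5000, vaf=vaf, min_alt_reads=3)
print(f"VAF={vaf:.4f}  P(detect | 100x)={p_shallow:.3f}  P(detect | 5000x)={p_deep:.3f}")
```

Real callers also have to separate true low-fraction variants from sequencing errors occurring at similar frequencies, which this model deliberately leaves out.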
Affiliation(s)
- Aldo Sergi
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, 20133, Milan, Italy
  - IRCCS Humanitas Research Hospital, Via Manzoni 56, 20089, Milan, Rozzano, Italy
- Luca Beltrame
  - IRCCS Humanitas Research Hospital, Via Manzoni 56, 20089, Milan, Rozzano, Italy
- Sergio Marchini
  - IRCCS Humanitas Research Hospital, Via Manzoni 56, 20089, Milan, Rozzano, Italy
- Marco Masseroli
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, 20133, Milan, Italy
| |
|
22
|
Laguna JC, Pastor B, Nalda I, Hijazo-Pechero S, Teixido C, Potrony M, Puig-Butillé JA, Mezquita L. Incidental pathogenic germline alterations detected through liquid biopsy in patients with solid tumors: prevalence, clinical utility and implications. Br J Cancer 2024; 130:1420-1431. [PMID: 38532104 PMCID: PMC11059286 DOI: 10.1038/s41416-024-02607-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 01/14/2024] [Accepted: 01/25/2024] [Indexed: 03/28/2024] Open
Abstract
Liquid biopsy, a minimally invasive approach for detecting tumor biomarkers in blood, has emerged as a leading-edge technique in cancer precision medicine. New evidence has shown that liquid biopsies can incidentally detect pathogenic germline variants (PGVs) associated with cancer predisposition, including in patients with a cancer for which genetic testing is not recommended. The ability to detect these incidental PGVs in cancer patients through liquid biopsy raises important questions regarding the management of this information and its clinical implications. This incidental identification of PGVs raises concerns about cancer predisposition and the potential impact on patient management, not only in terms of providing access to treatment based on tumor molecular profiling, but also in terms of disclosing genetic predisposition to patients and families. Understanding how to interpret this information is essential to ensure proper decision-making and to optimize cancer treatment and prevention strategies. In this review we provide a comprehensive summary of current evidence of incidental PGVs in cancer predisposition genes identified by liquid biopsy in patients with cancer. We critically review the methodological considerations of liquid biopsy as a tool for germline diagnosis, clinical utility and potential implications for cancer prevention, treatment, and research.
Affiliation(s)
- Juan Carlos Laguna
- Medical Oncology Department, Hospital Clinic of Barcelona, Barcelona, Spain
- Laboratory of Translational Genomics and Targeted Therapies in Solid Tumors, IDIBAPS, Barcelona, Spain
| | - Belén Pastor
- Medical Oncology Department, Hospital Clinic of Barcelona, Barcelona, Spain
| | - Irene Nalda
- Medical Oncology Department, Hospital Clinic of Barcelona, Barcelona, Spain
- Laboratory of Translational Genomics and Targeted Therapies in Solid Tumors, IDIBAPS, Barcelona, Spain
| | - Sara Hijazo-Pechero
- Preclinical and Experimental Research in Thoracic Tumors (PRETT), Oncobell, Bellvitge Biomedical Research Institute (IDIBELL), l'Hospitalet de Llobregat, Barcelona, Spain
| | - Cristina Teixido
- Laboratory of Translational Genomics and Targeted Therapies in Solid Tumors, IDIBAPS, Barcelona, Spain
- Department of Medicine, University of Barcelona, Barcelona, Spain
- Department of Pathology, Hospital Clinic of Barcelona, Barcelona, Spain
| | - Miriam Potrony
- Biochemistry and Molecular Genetics Department, Hospital Clínic of Barcelona, IDIBAPS, Barcelona, Spain
- CIBER of Rare Diseases (CIBERER), Barcelona, Spain
| | - Joan Antón Puig-Butillé
- CIBER of Rare Diseases (CIBERER), Barcelona, Spain
- Molecular Biology CORE, Hospital Clínic of Barcelona, IDIBAPS, Barcelona, Spain
| | - Laura Mezquita
- Medical Oncology Department, Hospital Clinic of Barcelona, Barcelona, Spain.
- Laboratory of Translational Genomics and Targeted Therapies in Solid Tumors, IDIBAPS, Barcelona, Spain.
- Department of Medicine, University of Barcelona, Barcelona, Spain.
| |
|
23
|
Kosugi S, Terao C. Comparative evaluation of SNVs, indels, and structural variations detected with short- and long-read sequencing data. Hum Genome Var 2024; 11:18. [PMID: 38632226 PMCID: PMC11024196 DOI: 10.1038/s41439-024-00276-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2024] [Revised: 03/12/2024] [Accepted: 03/20/2024] [Indexed: 04/19/2024] Open
Abstract
Short- and long-read sequencing technologies are routinely used to detect DNA variants, including SNVs, indels, and structural variations (SVs). However, the differences in the quality and quantity of variants detected between short- and long-read data are not fully understood. In this study, we comprehensively evaluated the variant calling performance of short- and long-read-based SNV, indel, and SV detection algorithms (6 for SNVs, 12 for indels, and 13 for SVs) using a novel evaluation framework incorporating manual visual inspection. The results showed that indel-insertion calls greater than 10 bp were poorly detected by short-read-based detection algorithms compared to long-read-based algorithms; however, the recall and precision of SNV and indel-deletion detection were similar between short- and long-read data. The recall of SV detection with short-read-based algorithms was significantly lower in repetitive regions, especially for small- to intermediate-sized SVs, than that detected with long-read-based algorithms. In contrast, the recall and precision of SV detection in nonrepetitive regions were similar between short- and long-read data. These findings suggest the need for refined strategies, such as incorporating multiple variant detection algorithms, to generate a more complete set of variants using short-read data.
Affiliation(s)
- Shunichi Kosugi
- Center for Genome Informatics, Research Organization of Information and Systems, Joint Support-Center for Data Science Research, Shizuoka, Japan.
- Advanced Genomics Center, National Institute of Genetics, Shizuoka, Japan.
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan.
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan.
| | - Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
- Clinical Research Center, Shizuoka General Hospital, Shizuoka, Japan
- The Department of Applied Genetics, The School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
| |
|
24
|
Unger M, Kather JN. Deep learning in cancer genomics and histopathology. Genome Med 2024; 16:44. [PMID: 38539231 PMCID: PMC10976780 DOI: 10.1186/s13073-024-01315-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/15/2023] [Accepted: 03/13/2024] [Indexed: 07/08/2024] Open
Abstract
Histopathology and genomic profiling are cornerstones of precision oncology and are routinely obtained for patients with cancer. Traditionally, histopathology slides are manually reviewed by highly trained pathologists. Genomic data, on the other hand, is evaluated by engineered computational pipelines. In both applications, the advent of modern artificial intelligence methods, specifically machine learning (ML) and deep learning (DL), has opened up a fundamentally new way of extracting actionable insights from raw data, which could augment and potentially replace some aspects of traditional evaluation workflows. In this review, we summarize current and emerging applications of DL in histopathology and genomics, including basic diagnostic as well as advanced prognostic tasks. Based on a growing body of evidence, we suggest that DL could be the groundwork for a new kind of workflow in oncology and cancer research. However, we also point out that DL models can have biases and other flaws that users in healthcare and research need to know about, and we propose ways to address them.
Affiliation(s)
- Michaela Unger
- Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany.
| | - Jakob Nikolas Kather
- Else Kroener Fresenius Center for Digital Health, Medical Faculty Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany.
- Department of Medicine I, University Hospital Dresden, Dresden, Germany.
- Medical Oncology, National Center for Tumor Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany.
| |
|
25
|
Charron P, Kang M. VariantDetective: an accurate all-in-one pipeline for detecting consensus bacterial SNPs and SVs. Bioinformatics 2024; 40:btae066. [PMID: 38366603 PMCID: PMC10898327 DOI: 10.1093/bioinformatics/btae066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Revised: 01/16/2024] [Accepted: 02/14/2024] [Indexed: 02/18/2024] Open
Abstract
MOTIVATION Genomic variations comprise a spectrum of alterations, ranging from single nucleotide polymorphisms (SNPs) to large-scale structural variants (SVs), which play crucial roles in bacterial evolution and species diversification. Accurately identifying SNPs and SVs is beneficial for subsequent evolutionary and epidemiological studies. This study presents VariantDetective (VD), a novel, user-friendly, and all-in-one pipeline combining SNP and SV calling to generate consensus genomic variants using multiple tools. RESULTS The VD pipeline accepts various file types as input to initiate SNP and/or SV calling, and benchmarking results demonstrate VD's robustness and high accuracy across multiple tested datasets when compared to existing variant calling approaches. AVAILABILITY AND IMPLEMENTATION The source code, test data, and relevant information for VD are freely accessible at https://github.com/OLF-Bioinformatics/VariantDetective under the MIT License.
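The idea of a consensus call set across several tools can be sketched as simple majority voting. The example below is a generic illustration and not VariantDetective's actual implementation: the `Variant` tuple layout, the function name, and the `min_support` default are all assumptions, and real pipelines would also need to normalize variant representations before comparing them.

```python
from collections import Counter
from typing import Iterable

# Hypothetical variant key: (contig, 1-based position, ref allele, alt allele).
Variant = tuple[str, int, str, str]


def consensus_variants(
    call_sets: Iterable[set[Variant]], min_support: int = 2
) -> set[Variant]:
    """Keep variants reported by at least `min_support` of the callers."""
    counts: Counter = Counter()
    for calls in call_sets:
        # Each caller contributes at most one vote per variant.
        counts.update(set(calls))
    return {variant for variant, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    caller_a = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")}
    caller_b = {("chr1", 100, "A", "G"), ("chr1", 900, "G", "A")}
    caller_c = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")}
    print(sorted(consensus_variants([caller_a, caller_b, caller_c])))
```

Raising `min_support` trades recall for precision: a variant seen by only one caller is dropped, while one seen by every caller always survives.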
Affiliation(s)
- Philippe Charron
- Ottawa Laboratory-Fallowfield, Canadian Food Inspection Agency, 3851 Fallowfield Road, Nepean, Ontario K2J 4S1, Canada
| | - Mingsong Kang
- Ottawa Laboratory-Fallowfield, Canadian Food Inspection Agency, 3851 Fallowfield Road, Nepean, Ontario K2J 4S1, Canada
| |
|
26
|
Barbitoff YA, Ushakov MO, Lazareva TE, Nasykhova YA, Glotov AS, Predeus AV. Bioinformatics of germline variant discovery for rare disease diagnostics: current approaches and remaining challenges. Brief Bioinform 2024; 25:bbad508. [PMID: 38271481 PMCID: PMC10810331 DOI: 10.1093/bib/bbad508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 11/18/2023] [Accepted: 12/12/2023] [Indexed: 01/27/2024] Open
Abstract
Next-generation sequencing (NGS) has revolutionized the field of rare disease diagnostics. Whole exome and whole genome sequencing are now routinely used for diagnostic purposes; however, the overall diagnosis rate remains lower than expected. In this work, we review current approaches used for calling and interpretation of germline genetic variants in the human genome, and discuss the most important challenges that persist in the bioinformatic analysis of NGS data in medical genetics. We describe and attempt to quantitatively assess the remaining problems, such as the quality of the reference genome sequence, reproducible coverage biases, or variant calling accuracy in complex regions of the genome. We also discuss the prospects of switching to the complete human genome assembly or the human pan-genome and important caveats associated with such a switch. We touch on arguably the hardest problem of NGS data analysis for medical genomics, namely, the annotation of genetic variants and their subsequent interpretation. We highlight the most challenging aspects of annotation and prioritization of both coding and non-coding variants. Finally, we demonstrate the persistent prevalence of pathogenic variants in the coding genome, and outline research directions that may enhance the efficiency of NGS-based disease diagnostics.
Affiliation(s)
- Yury A Barbitoff
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
- Bioinformatics Institute, Kentemirovskaya st. 2A, 197342, St. Petersburg, Russia
| | - Mikhail O Ushakov
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
| | - Tatyana E Lazareva
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
| | - Yulia A Nasykhova
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
| | - Andrey S Glotov
- Dpt. of Genomic Medicine, D.O. Ott Research Institute of Obstetrics, Gynaecology, and Reproductology, Mendeleevskaya line 3, 199034, St. Petersburg, Russia
| | - Alexander V Predeus
- Bioinformatics Institute, Kentemirovskaya st. 2A, 197342, St. Petersburg, Russia
| |
|
27
|
Abdelwahab O, Belzile F, Torkamaneh D. Performance analysis of conventional and AI-based variant callers using short and long reads. BMC Bioinformatics 2023; 24:472. [PMID: 38097928 PMCID: PMC10720095 DOI: 10.1186/s12859-023-05596-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/30/2023] [Accepted: 12/04/2023] [Indexed: 12/18/2023] Open
Abstract
BACKGROUND The accurate detection of variants is essential for genomics-based studies. Currently, there are various tools designed to detect genomic variants; however, it has always been a challenge to decide which tool to use, especially when various major genome projects have chosen to use different tools. Thus far, most of the existing tools were mainly developed to work on short-read data (i.e., Illumina); however, other sequencing technologies (e.g., PacBio and Oxford Nanopore) have recently shown that they can also be used for variant calling. In addition, with the emergence of artificial intelligence (AI)-based variant calling tools, there is a pressing need to compare these tools in terms of efficiency, accuracy, computational power, and ease of use. RESULTS In this study, we evaluated five of the most widely used conventional and AI-based variant calling tools (BCFTools, GATK4, Platypus, DNAscope, and DeepVariant) in terms of accuracy and computational cost using both short-read and long-read data derived from three different sequencing technologies (Illumina, PacBio HiFi, and ONT) for the same set of samples from the Genome In A Bottle project. The analysis showed that AI-based variant calling tools outperform conventional ones for calling SNVs and INDELs using both long and short reads in most aspects. In addition, we demonstrate the advantages and drawbacks of each tool while ranking them in each aspect of these comparisons. CONCLUSION This study provides best practices for variant calling using AI-based and conventional variant callers with different types of sequencing data.
Affiliation(s)
- Omar Abdelwahab
- Département de Phytologie, Université Laval, Québec, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
- Centre de recherche et d'innovation sur les végétaux (CRIV), Université Laval, Québec, Canada
- Institut intelligence et données (IID), Université Laval, Québec, Canada
| | - François Belzile
- Département de Phytologie, Université Laval, Québec, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada
- Centre de recherche et d'innovation sur les végétaux (CRIV), Université Laval, Québec, Canada
| | - Davoud Torkamaneh
- Département de Phytologie, Université Laval, Québec, Canada.
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, Canada.
- Centre de recherche et d'innovation sur les végétaux (CRIV), Université Laval, Québec, Canada.
- Institut intelligence et données (IID), Université Laval, Québec, Canada.
| |
|
28
|
Rice ES, Alberdi A, Alfieri J, Athrey G, Balacco JR, Bardou P, Blackmon H, Charles M, Cheng HH, Fedrigo O, Fiddaman SR, Formenti G, Frantz LAF, Gilbert MTP, Hearn CJ, Jarvis ED, Klopp C, Marcos S, Mason AS, Velez-Irizarry D, Xu L, Warren WC. A pangenome graph reference of 30 chicken genomes allows genotyping of large and complex structural variants. BMC Biol 2023; 21:267. [PMID: 37993882 PMCID: PMC10664547 DOI: 10.1186/s12915-023-01758-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/14/2023] [Accepted: 11/02/2023] [Indexed: 11/24/2023] Open
Abstract
BACKGROUND The red junglefowl, the wild outgroup of domestic chickens, has historically served as a reference for genomic studies of domestic chickens. These studies have provided insight into the etiology of traits of commercial importance. However, the use of a single reference genome does not capture diversity present among modern breeds, many of which have accumulated molecular changes due to drift and selection. While reference-based resequencing is well-suited to cataloging simple variants such as single-nucleotide changes and short insertions and deletions, it is mostly inadequate to discover more complex structural variation in the genome. METHODS We present a pangenome for the domestic chicken consisting of thirty assemblies of chickens from different breeds and research lines. RESULTS We demonstrate how this pangenome can be used to catalog structural variants present in modern breeds and untangle complex nested variation. We show that alignment of short reads from 100 diverse wild and domestic chickens to this pangenome reduces reference bias by 38%, which affects downstream genotyping results. This approach also allows for the accurate genotyping of a large and complex pair of structural variants at the K feathering locus using short reads, which would not be possible using a linear reference. CONCLUSIONS We expect that this new paradigm of genomic reference will allow better pinpointing of exact mutations responsible for specific phenotypes, which will in turn be necessary for breeding chickens that meet new sustainability criteria and are resilient to quickly evolving pathogen threats.
Affiliation(s)
- Edward S Rice
- Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
- Faculty of Veterinary Medicine, Ludwig-Maximilians-Universität, Munich, Germany
| | - Antton Alberdi
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen (UCPH), Copenhagen, Denmark
| | - James Alfieri
- Department of Ecology & Evolutionary Biology, Texas A&M University, College Station, TX, USA
| | - Giridhar Athrey
- Department of Poultry Science, Texas A&M University, College Station, TX, USA
| | - Jennifer R Balacco
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Philippe Bardou
- Sigenae, GenPhySE, Université de Toulouse, INRAE, ENVT, Castanet Tolosan, 31326, France
| | - Heath Blackmon
- Department of Biology, Texas A&M University, College Station, TX, USA
| | - Mathieu Charles
- University Paris-Saclay, INRAE, AgroParisTech, GABI, Sigenae, Jouy-en-Josas, France
| | - Hans H Cheng
- Avian Disease and Oncology Laboratory, USDA, ARS, USNPRC, East Lansing, MI, USA
| | - Olivier Fedrigo
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | | | - Giulio Formenti
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
| | - Laurent A F Frantz
- Faculty of Veterinary Medicine, Ludwig-Maximilians-Universität, Munich, Germany
- School of Biological and Behavioural Sciences, Queen Mary University of London, London, E1 4DQ, UK
| | - M Thomas P Gilbert
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen (UCPH), Copenhagen, Denmark
| | - Cari J Hearn
- Avian Disease and Oncology Laboratory, USDA, ARS, USNPRC, East Lansing, MI, USA
| | - Erich D Jarvis
- Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- The Howard Hughes Medical Institute, Chevy Chase, MD, USA
| | - Christophe Klopp
- Sigenae, Genotoul Bioinfo, MIAT UR875, INRAE, Castanet Tolosan, France
| | - Sofia Marcos
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen (UCPH), Copenhagen, Denmark
- Applied Genomics and Bioinformatics, University of the Basque Country (UPV/EHU), Leioa, Bilbao, Spain
| | | | | | - Luohao Xu
- Key Laboratory of Freshwater Fish Reproduction and Development (Ministry of Education), Key Laboratory of Aquatic Science of Chongqing, School of Life Sciences, Southwest University, Chongqing, 400715, China
| | - Wesley C Warren
- Department of Animal Sciences, University of Missouri, Columbia, MO, USA.
| |
|
29
|
Xiang X, Lu B, Song D, Li J, Shu K, Pu D. Evaluating the performance of low-frequency variant calling tools for the detection of variants from short-read deep sequencing data. Sci Rep 2023; 13:20444. [PMID: 37993475 PMCID: PMC10665316 DOI: 10.1038/s41598-023-47135-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 11/09/2023] [Indexed: 11/24/2023] Open
Abstract
Detection of low-frequency variants with high accuracy plays an important role in biomedical research and clinical practice. However, it is challenging to do so with next-generation sequencing (NGS) approaches due to the high error rates of NGS. To accurately distinguish low-level true variants from these errors, many statistical variant calling tools for low-frequency variants have been proposed, but a systematic performance comparison of these tools has not yet been performed. Here, we evaluated four raw-reads-based variant callers (SiNVICT, outLyzer, Pisces, and LoFreq) and four UMI-based variant callers (DeepSNVMiner, MAGERI, smCounter2, and UMI-VarCal) considering their capability to call single nucleotide variants (SNVs) with allelic frequency as low as 0.025% in deep sequencing data. We analyzed a total of 54 simulated datasets with various sequencing depths and variant allele frequencies (VAFs), two reference datasets, and Horizon Tru-Q sample data. The results showed that the UMI-based callers, except smCounter2, outperformed the raw-reads-based callers regarding detection limit. Sequencing depth had almost no effect on the UMI-based callers but significantly influenced the raw-reads-based callers. Regardless of the sequencing depth, MAGERI showed the fastest analysis, while smCounter2 consistently took the longest to finish the variant calling process. Overall, DeepSNVMiner and UMI-VarCal performed the best, with sensitivity and precision of 88% and 100% for DeepSNVMiner and 84% and 100% for UMI-VarCal. In conclusion, the UMI-based callers, except smCounter2, outperformed the raw-reads-based callers in terms of sensitivity and precision. We recommend using DeepSNVMiner and UMI-VarCal for low-frequency variant detection. The results provide important information regarding future directions for reliable low-frequency variant detection and algorithm development, which is critical in genetics-based medical research and clinical applications.
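The reason UMI-based callers can suppress sequencing errors is that reads sharing a unique molecular identifier (UMI) derive from the same original molecule, so disagreements within a family are likely errors. The sketch below shows that idea for a single genomic position. It is a generic illustration and does not reproduce the algorithm of any tool named in the abstract; the function name, the `(umi, base)` input layout, and both thresholds are assumptions.

```python
from collections import Counter, defaultdict
from typing import Iterable


def umi_consensus_bases(
    observations: Iterable[tuple[str, str]],
    min_family_size: int = 3,
    min_agreement: float = 0.9,
) -> dict[str, str]:
    """Collapse per-read base calls at one position into one base per UMI family.

    observations: (umi, base) pairs, one per read covering the position.
    Families that are too small or too discordant are discarded.
    """
    families: dict[str, list[str]] = defaultdict(list)
    for umi, base in observations:
        families[umi].append(base)

    consensus: dict[str, str] = {}
    for umi, bases in families.items():
        if len(bases) < min_family_size:
            continue  # too few reads to trust a consensus
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_agreement:
            consensus[umi] = base
    return consensus


if __name__ == "__main__":
    reads = (
        [("AAA", "T")] * 5                      # concordant family supporting T
        + [("CCC", "C")] * 4 + [("CCC", "T")]   # one discordant read, likely an error
        + [("GGG", "C")] * 2                    # family too small
    )
    print(umi_consensus_bases(reads))
    print(umi_consensus_bases(reads, min_agreement=0.75))
```

A variant allele frequency estimated from the collapsed families counts molecules rather than reads, which is one plausible reason such estimates depend less on raw sequencing depth.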
Affiliation(s)
- Xudong Xiang
- Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
| | - Bowen Lu
- Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
| | - Dongyang Song
- Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
| | - Jie Li
- Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
| | - Kunxian Shu
- Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China.
| | - Dan Pu
- Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China.
| |
|
30
|
Menzel M, Ossowski S, Kral S, Metzger P, Horak P, Marienfeld R, Boerries M, Wolter S, Ball M, Neumann O, Armeanu-Ebinger S, Schroeder C, Matysiak U, Goldschmid H, Schipperges V, Fürstberger A, Allgäuer M, Eberhardt T, Niewöhner J, Blaumeiser A, Ploeger C, Haack TB, Tay TKY, Kelemen O, Pauli T, Kirchner M, Kluck K, Ott A, Renner M, Admard J, Gschwind A, Lassmann S, Kestler H, Fend F, Illert AL, Werner M, Möller P, Seufferlein TTW, Malek N, Schirmacher P, Fröhling S, Kazdal D, Budczies J, Stenzinger A. Multicentric pilot study to standardize clinical whole exome sequencing (WES) for cancer patients. NPJ Precis Oncol 2023; 7:106. [PMID: 37864096 PMCID: PMC10589320 DOI: 10.1038/s41698-023-00457-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 04/03/2023] [Accepted: 09/26/2023] [Indexed: 10/22/2023] Open
Abstract
A growing number of druggable targets and national initiatives for precision oncology necessitate broad genomic profiling for many cancer patients. Whole exome sequencing (WES) offers unbiased analysis of the entire coding sequence, segmentation-based detection of copy number alterations (CNAs), and accurate determination of complex biomarkers including tumor mutational burden (TMB), homologous recombination repair deficiency (HRD), and microsatellite instability (MSI). To assess the inter-institution variability of clinical WES, we performed a comparative pilot study between German Centers of Personalized Medicine (ZPMs) from five participating institutions. Tumor and matched normal DNA from 30 patients were analyzed using custom sequencing protocols and bioinformatic pipelines. Calling of somatic variants was highly concordant with a positive percentage agreement (PPA) between 91 and 95% and a positive predictive value (PPV) between 82 and 95% compared with a three-institution consensus and full agreement for 16 of 17 druggable targets. Explanations for deviations included low VAF or coverage, differing annotations, and different filter protocols. CNAs showed overall agreement in 76% for the genomic sequence with high wet-lab variability. Complex biomarkers correlated strongly between institutions (HRD: 0.79-1, TMB: 0.97-0.99) and all institutions agreed on microsatellite instability. This study will contribute to the development of quality control frameworks for comprehensive genomic profiling and sheds light onto parameters that require stringent standardization.
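The two agreement metrics quoted above follow from simple set arithmetic once a consensus call set is fixed: positive percentage agreement (PPA) is the share of consensus variants that an institution also called, and positive predictive value (PPV) is the share of an institution's calls that are in the consensus. The sketch below assumes variants are hashable keys and that the consensus has already been built; the function name and the example data are hypothetical, and this is not the study's pipeline.

```python
import math


def ppa_ppv(calls: set, consensus: set) -> tuple[float, float]:
    """Return (PPA, PPV) of one institution's calls against a consensus set.

    PPA = |calls ∩ consensus| / |consensus|
    PPV = |calls ∩ consensus| / |calls|
    NaN is returned for a metric whose denominator is zero.
    """
    shared = len(calls & consensus)
    ppa = shared / len(consensus) if consensus else math.nan
    ppv = shared / len(calls) if calls else math.nan
    return ppa, ppv


if __name__ == "__main__":
    # Hypothetical variant identifiers, for illustration only.
    consensus = {"v1", "v2", "v3", "v4", "v5"}
    institution_calls = {"v1", "v2", "v3", "v9"}
    ppa, ppv = ppa_ppv(institution_calls, consensus)
    print(f"PPA={ppa:.2f} PPV={ppv:.2f}")
```

Because the consensus stands in for an unknown truth, PPA and PPV measure agreement rather than accuracy, which is why these terms are used instead of sensitivity and precision.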
Affiliation(s)
- Michael Menzel
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
- Center for Personalized Medicine (ZPM), Heidelberg, Germany
| | - Stephan Ossowski
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- Center for Personalized Medicine (ZPM), Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics (IBMI), University of Tübingen, Tübingen, Germany
| | - Sebastian Kral
- Institute for Surgical Pathology, Medical Center, University of Freiburg, Freiburg, Germany
- Center for Personalized Medicine (ZPM), Freiburg, Germany
| | - Patrick Metzger
- Center for Personalized Medicine (ZPM), Freiburg, Germany
- Institute of Medical Bioinformatics and Systems Medicine (IBSM), Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Peter Horak
- Center for Personalized Medicine (ZPM), Heidelberg, Germany
- Division of Translational Medical Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg, Germany
- German Cancer Consortium (DKTK), Heidelberg, Germany
| | - Ralf Marienfeld
- Institute of Pathology, University Hospital Ulm, Ulm, Germany
- Center for Personalized Medicine (ZPM), Ulm, Germany
| | - Melanie Boerries
- Center for Personalized Medicine (ZPM), Freiburg, Germany
- Institute of Medical Bioinformatics and Systems Medicine (IBSM), Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- Comprehensive Cancer Center Freiburg (CCCF), Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- German Cancer Consortium (DKTK) Partner Site Freiburg, and German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Steffen Wolter
- Institute for Surgical Pathology, Medical Center, University of Freiburg, Freiburg, Germany
- Center for Personalized Medicine (ZPM), Freiburg, Germany
| | - Markus Ball
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
- Center for Personalized Medicine (ZPM), Heidelberg, Germany
| | - Olaf Neumann
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
- Center for Personalized Medicine (ZPM), Heidelberg, Germany
| | - Sorin Armeanu-Ebinger
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- Center for Personalized Medicine (ZPM), Tübingen, Germany
| | - Christopher Schroeder
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- Center for Personalized Medicine (ZPM), Tübingen, Germany
| | - Uta Matysiak
- Institute for Surgical Pathology, Medical Center, University of Freiburg, Freiburg, Germany
- Center for Personalized Medicine (ZPM), Freiburg, Germany
| | - Hannah Goldschmid
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
- Center for Personalized Medicine (ZPM), Heidelberg, Germany
| | - Vincent Schipperges
- Center for Personalized Medicine (ZPM), Freiburg, Germany
- Institute of Medical Bioinformatics and Systems Medicine (IBSM), Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Axel Fürstberger
- Institute of Pathology, University Hospital Ulm, Ulm, Germany
- Center for Personalized Medicine (ZPM), Ulm, Germany
- Institute of Medical Systems Biology, Ulm University, Ulm, Germany
| | - Michael Allgäuer
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
- Center for Personalized Medicine (ZPM), Heidelberg, Germany
| | - Timo Eberhardt
- Institute of Pathology, University Hospital Ulm, Ulm, Germany
- Center for Personalized Medicine (ZPM), Ulm, Germany
| | - Jakob Niewöhner
- Institute of Pathology, University Hospital Ulm, Ulm, Germany
| | - Andreas Blaumeiser
- Center for Personalized Medicine (ZPM), Freiburg, Germany
- Institute of Medical Bioinformatics and Systems Medicine (IBSM), Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
- German Cancer Consortium (DKTK) Partner Site Freiburg, and German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Carolin Ploeger
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
- Center for Personalized Medicine (ZPM), Heidelberg, Germany
| | - Tobias Bernd Haack
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- Center for Personalized Medicine (ZPM), Tübingen, Germany
| | - Timothy Kwang Yong Tay
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
- Center for Personalized Medicine (ZPM), Heidelberg, Germany
- Department of Anatomical Pathology, Singapore General Hospital, Singapore, Singapore
| | - Olga Kelemen
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- Center for Personalized Medicine (ZPM), Tübingen, Germany
| | - Thomas Pauli
- Center for Personalized Medicine (ZPM), Freiburg, Germany
- Institute of Medical Bioinformatics and Systems Medicine (IBSM), Medical Center - University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Martina Kirchner
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
- Center for Personalized Medicine (ZPM), Heidelberg, Germany
| | - Klaus Kluck
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
- Center for Personalized Medicine (ZPM), Heidelberg, Germany
| | - Alexander Ott
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- Center for Personalized Medicine (ZPM), Tübingen, Germany
| | - Marcus Renner
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
- Center for Personalized Medicine (ZPM), Heidelberg, Germany
- Division of Translational Medical Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg, Germany
| | - Jakob Admard
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- Center for Personalized Medicine (ZPM), Tübingen, Germany
| | - Axel Gschwind
- Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- Center for Personalized Medicine (ZPM), Tübingen, Germany
| | - Silke Lassmann
- Institute for Surgical Pathology, Medical Center, University of Freiburg, Freiburg, Germany
- Center for Personalized Medicine (ZPM), Freiburg, Germany
| | - Hans Kestler
- Institute of Pathology, University Hospital Ulm, Ulm, Germany
- Center for Personalized Medicine (ZPM), Ulm, Germany
| | - Falko Fend
- Institute of Pathology and Neuropathology, University Hospital Tübingen, Tübingen, Germany
| | - Anna Lena Illert
- Department of Medicine I, Medical Center-University of Freiburg, Faculty of Medicine, University of Freiburg, 79085, Freiburg, Germany
- Medical Department for Hematology and Oncology, Klinikum Rechts der Isar, Technische Universität München, 80333, Munich, Germany
- German Cancer Consortium (DKTK) Partner Site Munich, and German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Martin Werner
- Institute for Surgical Pathology, Medical Center, University of Freiburg, Freiburg, Germany
- Center for Personalized Medicine (ZPM), Freiburg, Germany
- German Cancer Consortium (DKTK) Partner Site Freiburg, and German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Peter Möller
- Institute of Pathology, University Hospital Ulm, Ulm, Germany
| | | | - Nisar Malek
- Center for Personalized Medicine (ZPM), Tübingen, Germany
- Department of Internal Medicine I, University Hospital Tübingen, Tübingen, Germany
| | - Peter Schirmacher
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
- Center for Personalized Medicine (ZPM), Heidelberg, Germany
- German Cancer Consortium (DKTK), Heidelberg, Germany
| | - Stefan Fröhling
- Center for Personalized Medicine (ZPM), Heidelberg, Germany
- Division of Translational Medical Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg, Germany
- German Cancer Consortium (DKTK), Heidelberg, Germany
| | - Daniel Kazdal
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany
- Center for Personalized Medicine (ZPM), Heidelberg, Germany
| | - Jan Budczies
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany.
- Center for Personalized Medicine (ZPM), Heidelberg, Germany.
- German Cancer Consortium (DKTK), Heidelberg, Germany.
| | - Albrecht Stenzinger
- Institute of Pathology, Heidelberg University Hospital, Heidelberg, Germany.
- Center for Personalized Medicine (ZPM), Heidelberg, Germany.
- German Cancer Consortium (DKTK), Heidelberg, Germany.
| |
|
31
|
Zhang B, Bassani-Sternberg M. Current perspectives on mass spectrometry-based immunopeptidomics: the computational angle to tumor antigen discovery. J Immunother Cancer 2023; 11:e007073. [PMID: 37899131 PMCID: PMC10619091 DOI: 10.1136/jitc-2023-007073] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Accepted: 07/21/2023] [Indexed: 10/31/2023] Open
Abstract
Identification of tumor antigens presented by the human leucocyte antigen (HLA) molecules is essential for the design of effective and safe cancer immunotherapies that rely on T cell recognition and killing of tumor cells. Mass spectrometry (MS)-based immunopeptidomics enables high-throughput, direct identification of HLA-bound peptides from a variety of cell lines, tumor tissues, and healthy tissues. It involves immunoaffinity purification of HLA complexes followed by MS profiling of the extracted peptides using data-dependent acquisition, data-independent acquisition, or targeted approaches. By incorporating DNA, RNA, and ribosome sequencing data into immunopeptidomics data analysis, the proteogenomic approach provides a powerful means for identifying tumor antigens encoded within the canonical open reading frames of annotated coding genes and non-canonical tumor antigens derived from presumably non-coding regions of our genome. We discuss emerging computational challenges in immunopeptidomics data analysis and tumor antigen identification, highlighting key considerations in the proteogenomics-based approach, including accurate DNA, RNA, and ribosome sequencing data analysis, careful incorporation of predicted novel protein sequences into the reference protein database, special quality control in MS data analysis due to the expanded and heterogeneous search space, cancer-specificity determination, and immunogenicity prediction. Advances in technology and computation are continually enabling us to identify tumor antigens with higher sensitivity and accuracy, paving the way toward the development of more effective cancer immunotherapies.
Affiliation(s)
- Bing Zhang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
| | - Michal Bassani-Sternberg
- Ludwig Institute for Cancer Research, University of Lausanne, Lausanne, Switzerland
- Department of Oncology, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland
- Agora Cancer Research Centre, Lausanne, Switzerland
| |
|
32
|
O’Sullivan B, Seoighe C. Comprehensive and realistic simulation of tumour genomic sequencing data. NAR Cancer 2023; 5:zcad051. [PMID: 37746635 PMCID: PMC10516706 DOI: 10.1093/narcan/zcad051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 08/25/2023] [Accepted: 09/08/2023] [Indexed: 09/26/2023] Open
Abstract
Accurate identification of somatic mutations and allele frequencies in cancer has critical research and clinical applications. Several computational tools have been developed for this purpose but, in the absence of comprehensive 'ground truth' data, assessing the accuracy of these methods is challenging. We created a computational framework to simulate tumour and matched normal sequencing data for which the source of all loci that contain non-reference bases is known, based on a phased, personalized genome. Unlike existing methods, we account for sampling errors inherent in the sequencing process. Using this framework, we assess accuracy and biases in inferred mutations and their frequencies in an established somatic mutation calling pipeline. We demonstrate bias in existing methods of mutant allele frequency estimation and show, for the first time, the observed mutation frequency spectrum corresponding to a theoretical model of tumour evolution. We highlight the impact of quality filters on detection sensitivity of clinically actionable variants and provide definitive assessment of false positive and false negative mutation calls. Our simulation framework provides an improved means to assess the accuracy of somatic mutation calling pipelines and a detailed picture of the effects of technical parameters and experimental factors on somatic mutation calling in cancer samples.
Affiliation(s)
- Brian O’Sullivan
- School of Mathematical and Statistical Sciences, University of Galway, University Road, Galway H91 TK33, Ireland
| | - Cathal Seoighe
- School of Mathematical and Statistical Sciences, University of Galway, University Road, Galway H91 TK33, Ireland
| |
|
33
|
Rollin J, Bester R, Brostaux Y, Caglayan K, De Jonghe K, Eichmeier A, Foucart Y, Haegeman A, Koloniuk I, Kominek P, Maree H, Onder S, Posada Céspedes S, Roumi V, Šafářová D, Schumpp O, Ulubas Serce C, Sõmera M, Tamisier L, Vainio E, van der Vlugt RAA, Massart S. Detection of single nucleotide polymorphisms in virus genomes assembled from high-throughput sequencing data: large-scale performance testing of sequence analysis strategies. PeerJ 2023; 11:e15816. [PMID: 37601254 PMCID: PMC10439718 DOI: 10.7717/peerj.15816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Accepted: 07/10/2023] [Indexed: 08/22/2023] Open
Abstract
Recent developments in high-throughput sequencing (HTS) technologies and bioinformatics have drastically changed research in virology, especially for virus discovery. Indeed, proper monitoring of the viral population requires information on the different isolates circulating in the studied area. For this purpose, HTS has greatly facilitated the sequencing of new genomes of detected viruses and their comparison. However, the bioinformatics analyses that allow reconstruction of genome sequences and detection of single nucleotide polymorphisms (SNPs) can potentially create bias, and this issue has not been widely addressed so far. Therefore, more knowledge is required on the limitations of predicting SNPs based on HTS-generated sequence samples. To address this issue, we compared the ability of 14 plant virology laboratories, each employing a different bioinformatics pipeline, to detect 21 variants of pepino mosaic virus (PepMV) in three samples through large-scale performance testing (PT) using three artificially designed datasets. To evaluate the impact of bioinformatics analyses, they were divided into three key steps: reads pre-processing, virus-isolate identification, and variant calling. Each step was evaluated independently through an original PT design including discussion and validation between participants at each step. Overall, this work underlines key parameters influencing SNP detection and proposes recommendations for reliable variant calling for plant viruses. The identification of the closest reference, the mapping parameters, and manual validation of the detection were recognized as the analysis steps with the greatest impact on successful SNP detection. Strategies to improve the prediction of SNPs are also discussed.
Affiliation(s)
- Johan Rollin
- Laboratory of Plant Pathology—TERRA—Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
| | - Rachelle Bester
- Citrus Research International, Matieland, South Africa
- Department of Genetics, Stellenbosch University, Matieland, South Africa
| | - Yves Brostaux
- Laboratory of Statistics, Computer Science and Modelling Applied to Bioengineering, TERRA, Gembloux Agro-Bio Tech, Teaching and Research Centre, University of Liège, Gembloux, Belgium
| | - Kadriye Caglayan
- Plant Protection Department, Agricultural Faculty, Hatay Mustafa Kemal University, Hatay, Turkey
| | - Kris De Jonghe
- Fisheries and Food (ILVO), Plant Sciences Unit, Flanders Research Institute for Agriculture, Merelbeke, Belgium
| | - Ales Eichmeier
- Mendeleum—Institute of Genetics, Faculty of Horticulture, Mendel University in Brno, Lednice, Czech Republic
| | - Yoika Foucart
- Fisheries and Food (ILVO), Plant Sciences Unit, Flanders Research Institute for Agriculture, Merelbeke, Belgium
| | - Annelies Haegeman
- Fisheries and Food (ILVO), Plant Sciences Unit, Flanders Research Institute for Agriculture, Merelbeke, Belgium
| | - Igor Koloniuk
- Biology Centre CAS, Ceske Budejovice, Czech Republic
| | | | - Hans Maree
- Citrus Research International, Matieland, South Africa
- Department of Genetics, Stellenbosch University, Matieland, South Africa
| | - Serkan Onder
- Department of Plant Protection, Faculty of Agriculture, Eskişehir Osmangazi University, Eskişehir, Turkey
| | - Susana Posada Céspedes
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, 4058, Switzerland
- Swiss Institute of Bioinformatics (SIB), Basel, Switzerland
| | - Vahid Roumi
- Plant Protection Department, Faculty of Agriculture, University of Maragheh, Maragheh, Iran
| | - Dana Šafářová
- Department of Cell Biology and Genetics, Faculty of Science, Palacký University Olomouc, Olomouc, Czech Republic
| | | | - Cigdem Ulubas Serce
- Plant Production and Technologies Department, Ayhan Şahenk Faculty of Agricultural Science and Technologies, Niğde Ömer Halisdemir University, Niğde, Turkey
| | - Merike Sõmera
- Department of Chemistry and Biotechnology, Tallinn University of Technology, Tallinn, Estonia
| | - Lucie Tamisier
- Pathologie Végétale, Institut National de la Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), Montfavet, France
- GAFL, Institut National de la Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), Montfavet, France
| | - Eeva Vainio
- Natural Resources Institute Finland, Helsinki, Finland
| | | | - Sebastien Massart
- Laboratory of Plant Pathology—TERRA—Gembloux Agro-Bio Tech, University of Liège, Gembloux, Belgium
| |
|
34
|
Wilton R, Szalay AS. Short-read aligner performance in germline variant identification. Bioinformatics 2023; 39:btad480. [PMID: 37527006 PMCID: PMC10421969 DOI: 10.1093/bioinformatics/btad480] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/07/2023] [Revised: 06/01/2023] [Accepted: 07/31/2023] [Indexed: 08/03/2023] Open
Abstract
MOTIVATION: Read alignment is an essential first step in the characterization of DNA sequence variation. The accuracy of variant-calling results depends not only on the quality of read alignment and variant-calling software but also on the interaction between these complex software tools. RESULTS: In this review, we evaluate short-read aligner performance with the goal of optimizing germline variant-calling accuracy. We examine the performance of three general-purpose short-read aligners (BWA-MEM, Bowtie 2, and Arioc) in conjunction with three germline variant callers: DeepVariant, FreeBayes, and GATK HaplotypeCaller. We discuss the behavior of the read aligners with regard to the data elements on which the variant callers rely, and illustrate how the runtime configurations of these software tools combine to affect variant-calling performance.
Affiliation(s)
- Richard Wilton
- Department of Physics and Astronomy, Johns Hopkins University, Baltimore, MD 21218, United States
| | - Alexander S Szalay
- Department of Physics and Astronomy, Johns Hopkins University, Baltimore, MD 21218, United States
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, United States
| |
|
35
|
Performance evaluation of six popular short-read simulators. Heredity (Edinb) 2023; 130:55-63. [PMID: 36496447 PMCID: PMC9905089 DOI: 10.1038/s41437-022-00577-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 07/06/2022] [Revised: 11/10/2022] [Accepted: 11/11/2022] [Indexed: 12/14/2022] Open
Abstract
High-throughput sequencing data enables the comprehensive study of genomes and the variation therein. Essential for the interpretation of this genomic data is a thorough understanding of the computational methods used for processing and analysis. Whereas "gold-standard" empirical datasets exist for this purpose in humans, synthetic (i.e., simulated) sequencing data can offer important insights into the capabilities and limitations of computational pipelines for any arbitrary species and/or study design; yet the ability of read simulator software to emulate genomic characteristics of empirical datasets remains poorly understood. We here compare the performance of six popular short-read simulators (ART, DWGSIM, InSilicoSeq, Mason, NEAT, and wgsim) and discuss important considerations for selecting suitable models for benchmarking.
|
36
|
Evaluation of the Available Variant Calling Tools for Oxford Nanopore Sequencing in Breast Cancer. Genes (Basel) 2022; 13:genes13091583. [PMID: 36140751 PMCID: PMC9498802 DOI: 10.3390/genes13091583] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/10/2022] [Revised: 08/30/2022] [Accepted: 08/31/2022] [Indexed: 11/23/2022] Open
Abstract
The goal of biomarker testing in personalized medicine is to guide treatment toward the best possible result for each patient. Accurate and reliable identification of each individual's genome variants is essential for the success of clinical genomics based on third-generation sequencing. Different variant calling techniques have been used and recommended by both Oxford Nanopore Technologies (ONT) and the Nanopore community. A thorough examination of the variant callers might give critical guidance for third-generation sequencing-based clinical genomics. In this study, two reference genome sample datasets (NA12878 and NA24385) and the set of high-confidence variant calls provided by the Genome in a Bottle (GIAB) consortium were used to evaluate the performance of six variant calling tools (Human-SNP-wf, Clair3, Clair, NanoCaller, Longshot, and Medaka) as an integral step in the in-house variant detection workflow. Of the six variant callers under study, Clair3 and Human-SNP-wf, which incorporates Clair3, achieved the highest performance. Results were expressed in terms of precision, recall, and F1-score, computed with the hap.py tool. In conclusion, our findings give important insights for identifying accurate variants from third-generation sequencing of personal genomes using the different variant detection tools available for long-read sequencing.
|
37
|
Abstract
Whole Exome Sequencing (WES) is used for querying DNA variants using the protein coding parts of genomes (exomes). However, WES analysis can be challenging because of the complexity of the data. Here, we describe a consolidated protocol for unbiased WES analysis. The protocol uses three variant callers (HaplotypeCaller, FreeBayes, and DeepVariant), which have different underlying models. We provide detailed execution steps, as well as basic variant filtering, annotation, visualization, and consolidation aspects.
- Protocol to enable whole exome data analysis in an unbiased approach
- A protocol for unbiased analysis using 3 variant callers with different underlying models
- From raw data to filtered, consolidated, and annotated DNA variant calls
Publisher’s note: Undertaking any experimental protocol requires adherence to local institutional guidelines for laboratory safety and ethics.
|
38
|
Lei Y, Meng Y, Guo X, Ning K, Bian Y, Li L, Hu Z, Anashkina AA, Jiang Q, Dong Y, Zhu X. Overview of structural variation calling: Simulation, identification, and visualization. Comput Biol Med 2022; 145:105534. [DOI: 10.1016/j.compbiomed.2022.105534] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/20/2022] [Revised: 04/09/2022] [Accepted: 04/14/2022] [Indexed: 12/11/2022]
|