1
Berelson MFG, Heavens D, Nicholson P, Clark MD, Leggett RM. From air to insight: the evolution of airborne DNA sequencing technologies. Microbiology (Reading, England) 2025; 171:001564. [PMID: 40434822 PMCID: PMC12120143 DOI: 10.1099/mic.0.001564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2024] [Accepted: 05/01/2025] [Indexed: 05/29/2025]
Abstract
Historically, the analysis of airborne biological organisms relied on microscopy and culture-based techniques. However, technological advances such as PCR and next-generation sequencing now provide researchers with the ability to gather vast amounts of data on airborne environmental DNA (eDNA). Studies typically involve capturing airborne biological material, followed by nucleic acid extraction, library preparation, sequencing and taxonomic identification to characterize the eDNA at a given location. These methods have diverse applications, including pathogen detection in agriculture and human health, air quality monitoring, bioterrorism detection and biodiversity monitoring. A variety of methods are used for airborne eDNA analysis, as no single pipeline meets all needs. This review outlines current methods for sampling, extraction, sequencing and bioinformatic analysis, highlighting how different approaches can influence the resulting data and their suitability for specific use cases. It also explores current applications of airborne eDNA sampling and identifies research gaps in the field.
Affiliation(s)
- Darren Heavens
  - Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
- Paul Nicholson
  - John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
- Richard M. Leggett
  - Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
  - Centre for Microbial Interactions, Norwich Research Park, Norwich NR4 7UG, UK
2
Dakal TC, Xu C, Kumar A. Advanced computational tools, artificial intelligence and machine-learning approaches in gut microbiota and biomarker identification. Frontiers in Medical Technology 2025; 6:1434799. [PMID: 40303946 PMCID: PMC12037385 DOI: 10.3389/fmedt.2024.1434799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2024] [Accepted: 10/16/2024] [Indexed: 05/02/2025] Open
Abstract
The microbiome of the gut is a complex ecosystem that contains a wide variety of microbial species and functional capabilities. The microbiome has a significant impact on health and disease by affecting endocrinology, physiology, and neurology. It can change the progression of certain diseases and enhance treatment responses and tolerance. The gut microbiota plays a pivotal role in human health, influencing a wide range of physiological processes. Recent advances in computational tools and artificial intelligence (AI) have revolutionized the study of gut microbiota, enabling the identification of biomarkers that are critical for diagnosing and treating various diseases. This review surveys cutting-edge computational methodologies that integrate multi-omics data (such as metagenomics, metaproteomics, and metabolomics), providing a comprehensive understanding of the gut microbiome's composition and function. Additionally, machine learning (ML) approaches, including deep learning and network-based methods, are explored for their ability to uncover complex patterns within microbiome data, offering unprecedented insights into microbial interactions and their link to host health. By highlighting the synergy between traditional bioinformatics tools and advanced AI techniques, this review underscores the potential of these approaches in enhancing biomarker discovery and developing personalized therapeutic strategies. The convergence of computational advancements and microbiome research marks a significant step forward in precision medicine, paving the way for novel diagnostics and treatments tailored to individual microbiome profiles. Investigators have the ability to discover connections between the composition of microorganisms, the expression of genes, and the profiles of metabolites. Individual reactions to medicines that target gut microbes can be predicted by models driven by artificial intelligence.
It is possible to obtain personalized and precision medicine by first gaining an understanding of the impact that the gut microbiota has on the development of disease. The application of machine learning allows for the customization of treatments to the specific microbial environment of an individual.
Affiliation(s)
- Tikam Chand Dakal
  - Genome and Computational Biology Lab, Department of Biotechnology, Mohanlal Sukhadia University, Udaipur, India
- Caiming Xu
  - Beckman Research Institute of City of Hope, Monrovia, CA, United States
  - Department of General Surgery, The First Affiliated Hospital of Dalian Medical University, Dalian, China
- Abhishek Kumar
  - Manipal Academy of Higher Education (MAHE), Manipal, India
  - Institute of Bioinformatics, International Technology Park, Bangalore, India
3
Wu Q, Gao J, Sa B, Cong H, Deng W, Zhang Y, Zhong X, Zhang J, Wang L, Liu H, Yan Y, Zhang Y, Liu D, Yan W. Genomes of Prochlorococcus, Synechococcus, bacteria, and viruses recovered from marine picocyanobacteria cultures based on Illumina and Qitan nanopore sequencing. Sci Data 2025; 12:612. [PMID: 40221485 PMCID: PMC11993695 DOI: 10.1038/s41597-025-04762-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2024] [Accepted: 03/05/2025] [Indexed: 04/14/2025] Open
Abstract
Prochlorococcus and Synechococcus are key contributors to marine primary production and play essential roles in global biogeochemical cycles. Despite the ecological importance of these two picocyanobacterial genera, current genomic datasets still lack comprehensive representation of under-sampled ocean regions, associated bacteria and viruses. To address this gap, we used a combination of second- and third-generation sequencing technologies to assemble comprehensive genomic data from 105 picocyanobacterial enrichment cultures isolated from the Indian Ocean, the South China Sea, and the western Pacific Ocean. This dataset includes 55 Prochlorococcus and 50 Synechococcus genomes with high completeness (>98%) and low contamination (<2%), along with 308 non-redundant associated bacterial genomes derived from 1,457 medium- and high-quality non-cyanobacteria metagenome-assembled genomes (MAGs, completeness ≥50% and contamination ≤10%). Additionally, 2,113 non-redundant viral operational taxonomic units (vOTUs) were derived from a total of 7,632 qualified viral contigs. This dataset provides a valuable resource for improving our understanding of the complex interactions among Prochlorococcus, Synechococcus, and their associated bacteria and viruses in marine ecosystems, offering a foundation to study their ecological roles and evolutionary dynamics.
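As a hedged illustration of the quality tier quoted in this abstract, the sketch below filters genome bins by the stated thresholds (completeness ≥50% and contamination ≤10%). The record layout, the bin names and the `passes_mag_threshold` helper are assumptions made for illustration; they are not taken from the dataset or its pipeline.

```python
# Thresholds quoted in the abstract for medium- and high-quality MAGs.
MIN_COMPLETENESS = 50.0   # percent
MAX_CONTAMINATION = 10.0  # percent


def passes_mag_threshold(completeness: float, contamination: float) -> bool:
    """Return True when a bin meets both quoted thresholds."""
    return completeness >= MIN_COMPLETENESS and contamination <= MAX_CONTAMINATION


# Hypothetical bins: (name, completeness %, contamination %).
bins = [
    ("bin_001", 98.7, 1.2),
    ("bin_002", 50.0, 10.0),  # exactly on both boundaries, so it is kept
    ("bin_003", 72.4, 12.5),  # too contaminated
    ("bin_004", 41.0, 0.5),   # too incomplete
]

kept = [name for name, comp, cont in bins if passes_mag_threshold(comp, cont)]
print(kept)
```

Both comparisons are inclusive because the abstract states the limits with ≥ and ≤; a bin sitting exactly on a boundary therefore passes.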
Affiliation(s)
- Qingtao Wu
  - College of Marine Science and Technology, China University of Geosciences, Wuhan, 430074, China
  - Computational Virology Group, Etiology Research Center, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, 430071, China
  - College of Animal Science and Technology, Guangxi University, Nanning, 530004, China
- Jie Gao
  - Computational Virology Group, Etiology Research Center, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, 430071, China
  - University of Chinese Academy of Sciences, Beijing, 101408, China
- Boxuan Sa
  - College of Marine Science and Technology, China University of Geosciences, Wuhan, 430074, China
- Hongtao Cong
  - State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, 361102, China
  - Carbon Neutral Innovation Research Center, Xiamen University, Global ONCE Program, Xiamen, 361005, China
- Wenjie Deng
  - College of Marine Science and Technology, China University of Geosciences, Wuhan, 430074, China
  - State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, 361102, China
  - Carbon Neutral Innovation Research Center, Xiamen University, Global ONCE Program, Xiamen, 361005, China
- Ying Zhang
  - Computational Virology Group, Etiology Research Center, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, 430071, China
  - School of Life and Health Science, Hunan University of Science and Technology, Xiangtan, 411201, China
- Xiaojie Zhong
  - State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen, 361102, China
  - Carbon Neutral Innovation Research Center, Xiamen University, Global ONCE Program, Xiamen, 361005, China
- Jinyu Zhang
  - College of Marine Science and Technology, China University of Geosciences, Wuhan, 430074, China
- Liduo Wang
  - College of Marine Science and Technology, China University of Geosciences, Wuhan, 430074, China
- Haizhou Liu
  - Computational Virology Group, Etiology Research Center, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, 430071, China
- Yi Yan
  - Computational Virology Group, Etiology Research Center, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, 430071, China
  - University of Chinese Academy of Sciences, Beijing, 101408, China
- Yifei Zhang
  - School of Life and Health Science, Hunan University of Science and Technology, Xiangtan, 411201, China
- Di Liu
  - Computational Virology Group, Etiology Research Center, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, 430071, China
  - College of Animal Science and Technology, Guangxi University, Nanning, 530004, China
  - University of Chinese Academy of Sciences, Beijing, 101408, China
  - School of Life and Health Science, Hunan University of Science and Technology, Xiangtan, 411201, China
- Wei Yan
  - College of Marine Science and Technology, China University of Geosciences, Wuhan, 430074, China
  - Carbon Neutral Innovation Research Center, Xiamen University, Global ONCE Program, Xiamen, 361005, China
4
Trunfio M, Scutari R, Fox V, Vuaran E, Dastgheyb RM, Fini V, Granaglia A, Balbo F, Tortarolo D, Bonora S, Perno CF, Di Perri G, Alteri C, Calcagno A. The cerebrospinal fluid virome in people with HIV: links to neuroinflammation and cognition. bioRxiv: the preprint server for biology 2025:2025.02.28.640732. [PMID: 40060671 PMCID: PMC11888432 DOI: 10.1101/2025.02.28.640732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/16/2025]
Abstract
Despite effective HIV suppression, neuroinflammation and neurocognitive issues are prevalent in people with HIV (PWH) yet poorly understood. HIV infection alters the human virome, and virome perturbations have been linked to neurocognitive issues in people without HIV. Once thought to be sterile, the cerebrospinal fluid (CSF) hosts a recently discovered virome, presenting an unexplored avenue for understanding brain and mental health in PWH. This cross-sectional study analyzed 85 CSF samples (74 from PWH on suppressive antiretroviral therapy, and 11 from controls without HIV, CWH) through shotgun metagenomics for DNA/RNA viruses. Taxonomic composition (reads and contigs), α and β diversity, and relative abundance (RA) of prokaryotic (PV), human eukaryotic (hEV), and non-human eukaryotic viruses (nhEV) were evaluated in relation to HIV infection, markers of neuroinflammation and neurodegeneration, cognitive functions, and depressive symptoms. Sensitivity analyses and post-hoc cluster analysis on the RA of viral groups and blood-brain barrier permeability were also performed. Of 46 read-positive CSF samples, 93.5% contained PV sequences, 47.8% hEV, and 45.6% nhEV. Alpha diversity was lower in PWH than in CWH, although the difference was not significant (p>0.05). In the β diversity analysis, HIV status explained 3.3% of the variation in viral composition (p=0.016). Contigs retained 13 samples positive for 8 hEV, 2 nhEV, and 6 PV. Higher RA of PV was correlated with higher CSF S100β (p=0.002) and β-Amyloid 1-42 fragment (βA-42, p=0.026), while higher RA of nhEV was correlated with poorer cognitive performance (p=0.022). Conversely, higher RA of hEV correlated with better cognition (p=0.003) and lower βA-42 (p=0.012). Sensitivity analyses in virome-positive samples only confirmed these findings. Three CSF clusters were identified and showed differences in astrocytosis, βA-42, tau protein, and cognitive functions.
Participants with hEV-enriched CSF showed better cognitive performance compared to those with virus-devoid and nhEV-enriched CSF (models' p<0.05). This study provides the first comprehensive description of the CSF virome in PWH, revealing associations with neuroinflammation and cognition. These findings highlight the potential involvement of the CSF virome in brain health and inform about its composition, origin, and potential clinical implications in people with and without HIV.
Affiliation(s)
- Mattia Trunfio
  - Unit of Infectious Diseases, Amedeo di Savoia hospital, Department of Medical Sciences, University of Turin, Turin 10149, Italy
  - HIV Neurobehavioral Research Program, Departments of Neurosciences and Psychiatry, University of California San Diego, CA 92103, USA
  - Division of Infectious Diseases and Global Health, Department of Medicine, University of California San Diego, CA 92037, USA
- Rossana Scutari
  - Multimodal Laboratory Research Unit, Bambino Gesù Children’s Hospital IRCCS, Rome 00165, Italy
- Valeria Fox
  - Multimodal Laboratory Research Unit, Bambino Gesù Children’s Hospital IRCCS, Rome 00165, Italy
  - Department of Oncology and Hemato-Oncology, University of Milan, Milan 20122, Italy
- Elisa Vuaran
  - Unit of Infectious Diseases, Amedeo di Savoia hospital, Department of Medical Sciences, University of Turin, Turin 10149, Italy
- Raha Maryam Dastgheyb
  - Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Vanessa Fini
  - Multimodal Laboratory Research Unit, Bambino Gesù Children’s Hospital IRCCS, Rome 00165, Italy
- Annarita Granaglia
  - Multimodal Laboratory Research Unit, Bambino Gesù Children’s Hospital IRCCS, Rome 00165, Italy
- Francesca Balbo
  - Unit of Infectious Diseases, Amedeo di Savoia hospital, Department of Medical Sciences, University of Turin, Turin 10149, Italy
- Dora Tortarolo
  - Department of Informatics, University of Turin, Turin 10149, Italy
- Stefano Bonora
  - Unit of Infectious Diseases, Amedeo di Savoia hospital, Department of Medical Sciences, University of Turin, Turin 10149, Italy
- Carlo Federico Perno
  - Multimodal Laboratory Research Unit, Bambino Gesù Children’s Hospital IRCCS, Rome 00165, Italy
  - UniCamillus International Medical University, Rome 00131, Italy
- Giovanni Di Perri
  - Unit of Infectious Diseases, Amedeo di Savoia hospital, Department of Medical Sciences, University of Turin, Turin 10149, Italy
- Claudia Alteri
  - Department of Oncology and Hemato-Oncology, University of Milan, Milan 20122, Italy
  - Microbiology and Virology Unit, IRCCS Fondazione Ca’ Granda Ospedale Maggiore Policlinico, Milan 20122, Italy
- Andrea Calcagno
  - Unit of Infectious Diseases, Amedeo di Savoia hospital, Department of Medical Sciences, University of Turin, Turin 10149, Italy
5
Chen N, Liu L, Wang J, Mao D, Lu H, Shishido TK, Zhi S, Chen H, He S. Novel Gene Clusters for Secondary Metabolite Synthesis in Mesophotic Sponge-Associated Bacteria. Microb Biotechnol 2025; 18:e70107. [PMID: 39962733 PMCID: PMC11832590 DOI: 10.1111/1751-7915.70107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2024] [Revised: 01/08/2025] [Accepted: 01/30/2025] [Indexed: 02/21/2025] Open
Abstract
Mesophotic coral ecosystems (MCEs) host a diverse array of sponge species, which represent a promising source of bioactive compounds. Increasing evidence suggests that sponge-associated bacteria may be the primary producers of these compounds. However, cultivating these bacteria under laboratory conditions remains a significant challenge. To investigate the rich resource of bioactive compounds synthesised by mesophotic sponge-associated bacteria, we retrieved 429 metagenome-assembled genomes (MAGs) from 15 mesophotic sponges, revealing a strong correlation between bacterial diversity and sponge species. Furthermore, we identified 1637 secondary metabolite biosynthetic gene clusters (BGCs) within these MAGs. Among the identified BGCs, terpenes were the most abundant (495), followed by 369 polyketide synthases (PKSs), 293 ribosomally synthesised and post-translationally modified peptides (RiPPs) and 135 nonribosomal peptide synthetases (NRPSs). The BGCs were classified into 1086 gene cluster families (GCFs) based on sequence similarity. Notably, only five GCFs included experimentally validated reference BGCs from the Minimum Information about a Biosynthetic Gene cluster database (MIBiG). Additionally, an unusual abundance of BGCs was detected in Entotheonella sp. (s191209.Bin93) from the Tectomicrobia phylum. In contrast, members of Proteobacteria and Acidobacteriota harboured fewer BGCs (6-7 on average), yet their high abundance in MCE sponges suggests a potentially rich reservoir of BGCs. Analysis of the BGC distribution patterns revealed that a subset of BGCs, including terpene GCFs (FAM_00447 and FAM_01046), PKS GCF (FAM_00235), and RiPPs GCF (FAM_01143), were widespread across mesophotic sponges. Furthermore, 32 GCFs were consistently present in the same MAGs across different sponges, highlighting their potential key biological roles and capacity to yield novel bioactive compounds. 
This study not only underscores the untapped potential of mesophotic sponge-associated bacteria as a source of bioactive compounds but also provides valuable insights into the intricate interactions between sponges and their symbiotic microbial communities.
Affiliation(s)
- Nuo Chen
  - Li Dak Sum Yip Yio Chin Kenneth Li Marine Biopharmaceutical Research Center, Health Science Center, Ningbo University, Ningbo, Zhejiang, China
  - College of Food Science and Engineering, Ningbo University, Ningbo, Zhejiang, China
- Liwei Liu
  - Li Dak Sum Yip Yio Chin Kenneth Li Marine Biopharmaceutical Research Center, Health Science Center, Ningbo University, Ningbo, Zhejiang, China
- Jingxuan Wang
  - Li Dak Sum Yip Yio Chin Kenneth Li Marine Biopharmaceutical Research Center, Health Science Center, Ningbo University, Ningbo, Zhejiang, China
  - College of Food Science and Engineering, Ningbo University, Ningbo, Zhejiang, China
- Deqiang Mao
  - Li Dak Sum Yip Yio Chin Kenneth Li Marine Biopharmaceutical Research Center, Health Science Center, Ningbo University, Ningbo, Zhejiang, China
  - College of Food Science and Engineering, Ningbo University, Ningbo, Zhejiang, China
- Hongmei Lu
  - Li Dak Sum Yip Yio Chin Kenneth Li Marine Biopharmaceutical Research Center, Health Science Center, Ningbo University, Ningbo, Zhejiang, China
  - College of Food Science and Engineering, Ningbo University, Ningbo, Zhejiang, China
- Shuai Zhi
  - School of Public Health, Ningbo University, Ningbo, Zhejiang, China
- Hua Chen
  - Mingke Biotechnology Co., Ltd., Hangzhou, China
- Shan He
  - Li Dak Sum Yip Yio Chin Kenneth Li Marine Biopharmaceutical Research Center, Health Science Center, Ningbo University, Ningbo, Zhejiang, China
  - Ningbo Institute of Marine Medicine, Peking University, Ningbo, Zhejiang, China
6
Sena F, Ingervo E, Khan S, Prjibelski A, Schmidt S, Tomescu A. Flowtigs: Safety in flow decompositions for assembly graphs. iScience 2024; 27:111208. [PMID: 39759024 PMCID: PMC11700653 DOI: 10.1016/j.isci.2024.111208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2024] [Revised: 09/30/2024] [Accepted: 10/15/2024] [Indexed: 01/07/2025] Open
Abstract
A decomposition of a network flow is a set of weighted walks whose superposition equals the flow. In this article, we give a simple and linear-time-verifiable complete characterization (flowtigs) of walks that are safe in such general flow decompositions, i.e., that are subwalks of any possible flow decomposition. We provide an O(mn)-time algorithm that identifies all maximal flowtigs and represents them inside a compact structure. On the practical side, we study flowtigs in the use-case of metagenomic assembly. By using the species abundances as flow values of the metagenomic assembly graph, we can model the possible assembly solutions as flow decompositions into weighted closed walks. On simulated data, compared to reporting unitigs or maximal safe walks based only on the graph structure, reporting flowtigs results in a notably more contiguous assembly. On real data, we frame flowtigs as a heuristic and provide an algorithm that is guided by this heuristic.
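The definition in the first sentence of this abstract can be checked mechanically. The following is a minimal sketch, not the authors' algorithm: it only sums weighted closed walks edge by edge and compares the result with a flow. The toy graph, the flow values and the function names are assumptions chosen for illustration.

```python
from collections import Counter


def superpose(weighted_walks):
    """Add each closed walk's weight to every edge the walk traverses."""
    total = Counter()
    for weight, nodes in weighted_walks:
        # A closed walk returns from its last node to its first node.
        for u, v in zip(nodes, nodes[1:] + nodes[:1]):
            total[(u, v)] += weight
    return dict(total)


def is_decomposition(flow, weighted_walks):
    """True when the superposition of the walks equals the flow exactly."""
    return superpose(weighted_walks) == flow


# Hypothetical flow on a small directed graph (edge -> flow value).
flow = {
    ("a", "b"): 2,
    ("b", "c"): 3,
    ("c", "a"): 2,
    ("c", "d"): 1,
    ("d", "b"): 1,
}

# Two different sets of weighted closed walks, written as (weight, node list).
decomposition_a = [(2, ["a", "b", "c"]), (1, ["b", "c", "d"])]
decomposition_b = [(1, ["a", "b", "c", "d", "b", "c"]), (1, ["a", "b", "c"])]

print(is_decomposition(flow, decomposition_a))
print(is_decomposition(flow, decomposition_b))
```

Both walk sets superpose to the same flow, so a flow can admit more than one decomposition. That non-uniqueness is why the abstract restricts attention to walks that are subwalks of every possible decomposition.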
Affiliation(s)
- Shahbaz Khan
  - Indian Institute of Technology Roorkee, Roorkee, India
7
Han Y, He J, Li M, Peng Y, Jiang H, Zhao J, Li Y, Deng F. Unlocking the Potential of Metagenomics with the PacBio High-Fidelity Sequencing Technology. Microorganisms 2024; 12:2482. [PMID: 39770685 PMCID: PMC11728442 DOI: 10.3390/microorganisms12122482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2024] [Revised: 11/28/2024] [Accepted: 11/29/2024] [Indexed: 01/16/2025] Open
Abstract
Traditional methods for studying microbial communities have been limited due to difficulties in culturing and sequencing all microbial species. Recent advances in third-generation sequencing technologies, particularly PacBio's high-fidelity (HiFi) sequencing, have significantly advanced metagenomics by providing accurate long-read sequences. This review explores the role of HiFi sequencing in overcoming the limitations of previous sequencing methods, including high error rates and fragmented assemblies. We discuss the benefits and applications of HiFi sequencing across various environments, such as the human gut and soil, which provides broader context for further exploration. Key studies are discussed to highlight HiFi sequencing's ability to recover complete and coherent microbial genomes from complex microbiomes, showcasing its superior accuracy and continuity compared to other sequencing technologies. Additionally, we explore the potential applications of HiFi sequencing in quantitative microbial analysis, as well as the detection of single nucleotide variations (SNVs) and structural variations (SVs). PacBio HiFi sequencing is establishing a new benchmark in metagenomics, with the potential to significantly enhance our understanding of microbial ecology and drive forward advancements in both environmental and clinical applications.
Affiliation(s)
- Yanhua Han
  - Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, College of Life Science and Engineering, Foshan University, Foshan 528225, China
  - School of Life Science and Engineering, Foshan University, Foshan 528225, China
- Jinling He
  - Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, College of Life Science and Engineering, Foshan University, Foshan 528225, China
  - School of Life Science and Engineering, Foshan University, Foshan 528225, China
- Minghui Li
  - Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, College of Life Science and Engineering, Foshan University, Foshan 528225, China
  - School of Life Science and Engineering, Foshan University, Foshan 528225, China
- Yunjuan Peng
  - College of Animal Science, South China Agricultural University, Guangzhou 510642, China
- Hui Jiang
  - Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, College of Life Science and Engineering, Foshan University, Foshan 528225, China
  - School of Life Science and Engineering, Foshan University, Foshan 528225, China
- Jiangchao Zhao
  - College of Animal Science, South China Agricultural University, Guangzhou 510642, China
- Ying Li
  - Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, College of Life Science and Engineering, Foshan University, Foshan 528225, China
  - School of Life Science and Engineering, Foshan University, Foshan 528225, China
- Feilong Deng
  - Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, College of Life Science and Engineering, Foshan University, Foshan 528225, China
  - School of Life Science and Engineering, Foshan University, Foshan 528225, China
8
Barchi A, Massimino L, Mandarino FV, Vespa E, Sinagra E, Almolla O, Passaretti S, Fasulo E, Parigi TL, Cagliani S, Spanò S, Ungaro F, Danese S. Microbiota profiling in esophageal diseases: Novel insights into molecular staining and clinical outcomes. Comput Struct Biotechnol J 2024; 23:626-637. [PMID: 38274997 PMCID: PMC10808859 DOI: 10.1016/j.csbj.2023.12.026] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/22/2023] [Revised: 12/22/2023] [Accepted: 12/23/2023] [Indexed: 01/27/2024] Open
Abstract
The gut microbiota is now recognized as one of the key players in the development of several gastrointestinal diseases. Early studies focused mainly on healthy subjects, staining the main bacterial species via culture-based techniques. Subsequently, many studies focusing on the principal esophageal diseases expanded knowledge of the esophageal microbial environment and its role in pathogenesis. Gastroesophageal reflux disease (GERD), the most widespread esophageal condition, appears to be related to a degree of mucosal inflammation mediated by interleukin (IL) 8 and potentially enhanced by bacterial components, above all lipopolysaccharide (LPS). Gram-negative bacteria (which produce LPS), such as the genus Campylobacter, have been found to be associated with GERD. Barrett esophagus (BE) appears to be characterized by a microbiota shaped by Gram-negative and microaerophilic bacteria. Esophageal cancer (EC) development leads to an upheaval in the esophageal environment, with a shift from an oral-like microbiome to a predominantly low-abundance, low-diversity microbiome shaped by Gram-negative bacteria. Although underinvestigated, changes in the esophageal microbiome are also associated with the pathogenesis of rare chronic inflammatory or neuropathic diseases. The paucity of knowledge about the microbiota-driven mechanisms in esophageal disease pathogenesis is mainly due to the limited sensitivity of the sequencing technologies and culture methods applied so far to study commensals in the esophagus. However, recent advances in molecular techniques, especially the advent of non-culture-based genomic sequencing tools and the implementation of multi-omics approaches, have revolutionized the microbiome field, promising to extend current knowledge, uncover further underlying mechanisms, and give insights into the development of novel therapies aimed at re-establishing the microbial equilibrium to ameliorate esophageal diseases.
Affiliation(s)
- Alberto Barchi
  - Gastroenterology and Digestive Endoscopy, IRCCS Ospedale San Raffaele, Milan, Italy
- Luca Massimino
  - Gastroenterology and Digestive Endoscopy, IRCCS Ospedale San Raffaele, Milan, Italy
- Edoardo Vespa
  - Gastroenterology and Digestive Endoscopy, IRCCS Ospedale San Raffaele, Milan, Italy
- Emanuele Sinagra
  - Gastroenterology & Endoscopy Unit, Fondazione Istituto G. Giglio, Cefalù, Italy
- Omar Almolla
  - Università Vita-Salute San Raffaele, Faculty of Medicine, Milan, Italy
- Sandro Passaretti
  - Gastroenterology and Digestive Endoscopy, IRCCS Ospedale San Raffaele, Milan, Italy
- Ernesto Fasulo
  - Gastroenterology and Digestive Endoscopy, IRCCS Ospedale San Raffaele, Milan, Italy
- Tommaso Lorenzo Parigi
  - Gastroenterology and Digestive Endoscopy, IRCCS Ospedale San Raffaele, Milan, Italy
  - Università Vita-Salute San Raffaele, Faculty of Medicine, Milan, Italy
- Stefania Cagliani
  - Università Vita-Salute San Raffaele, Faculty of Medicine, Milan, Italy
- Salvatore Spanò
  - Gastroenterology and Digestive Endoscopy, IRCCS Ospedale San Raffaele, Milan, Italy
- Federica Ungaro
  - Gastroenterology and Digestive Endoscopy, IRCCS Ospedale San Raffaele, Milan, Italy
- Silvio Danese
  - Gastroenterology and Digestive Endoscopy, IRCCS Ospedale San Raffaele, Milan, Italy
  - Università Vita-Salute San Raffaele, Faculty of Medicine, Milan, Italy
9
Shah Y, Kafaie S. Evaluating Sequence Alignment Tools for Antimicrobial Resistance Gene Detection in Assembly Graphs. Microorganisms 2024; 12:2168. [PMID: 39597557 PMCID: PMC11596566 DOI: 10.3390/microorganisms12112168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2024] [Revised: 10/24/2024] [Accepted: 10/25/2024] [Indexed: 11/29/2024] Open
Abstract
Antimicrobial resistance (AMR) is an escalating global health threat, often driven by the horizontal gene transfer (HGT) of resistance genes. Detecting AMR genes and understanding their genomic context within bacterial populations is crucial for mitigating the spread of resistance. In this study, we evaluate the performance of three sequence alignment tools (Bandage, SPAligner, and GraphAligner) in identifying AMR gene sequences from assembly and de Bruijn graphs, which are commonly used in microbial genome assembly. Efficiently identifying these genes allows for the detection of neighboring genetic elements and possible HGT events, contributing to a deeper understanding of AMR dissemination. We compare the performance of the tools both qualitatively and quantitatively, analyzing the precision, computational efficiency, and accuracy in detecting AMR-related sequences. Our analysis reveals that Bandage offers the most precise and efficient identification of AMR gene sequences, followed by GraphAligner and SPAligner. The comparison includes evaluating the similarity of paths returned by each tool and measuring output accuracy using a modified edit distance metric. These results highlight Bandage's potential for contributing to the accurate identification and study of AMR genes in bacterial populations, offering important insights into resistance mechanisms and potential targets for mitigating AMR spread.
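The abstract does not spell out how its edit distance is modified, so the sketch below shows only the standard, unmodified Levenshtein distance as a point of reference. The function name and the example sequences are illustrative assumptions and are not taken from the study.

```python
def edit_distance(a: str, b: str) -> int:
    """Standard Levenshtein distance (unit-cost insert, delete, substitute)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, char_b in enumerate(b, start=1):
            substitution = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,                 # delete char_a
                curr[j - 1] + 1,             # insert char_b
                prev[j - 1] + substitution,  # match or substitute
            ))
        prev = curr
    return prev[-1]


# Hypothetical reference gene fragment and a sequence spelled by a graph path.
reference = "ATGGCGTTA"
path_sequence = "ATGGCTTA"
print(edit_distance(reference, path_sequence))
```

Under this unmodified metric, a smaller value means the sequence recovered from the graph is closer to the reference; keeping only two rows of the table holds memory proportional to the shorter dimension being iterated.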
Affiliation(s)
- Somayeh Kafaie
  - Department of Mathematics and Computing Science, Saint Mary’s University, Halifax, NS B3H 3C3, Canada
10
Abramova A, Karkman A, Bengtsson-Palme J. Metagenomic assemblies tend to break around antibiotic resistance genes. BMC Genomics 2024; 25:959. [PMID: 39402510 PMCID: PMC11479545 DOI: 10.1186/s12864-024-10876-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Accepted: 10/08/2024] [Indexed: 10/19/2024] Open
Abstract
BACKGROUND Assembly of metagenomic samples can provide essential information about the mobility potential and taxonomic origin of antibiotic resistance genes (ARGs) and inform interventions to prevent further spread of resistant bacteria. However, similar to other conserved regions, such as ribosomal RNA genes and mobile genetic elements, almost identical ARGs typically occur in multiple genomic contexts across different species, representing a considerable challenge for the assembly process. Usually, this results in many fragmented contigs of unclear origin, complicating the risk assessment of ARG detections. To systematically investigate the impact of this issue on detection, quantification and contextualization of ARGs, we evaluated the performance of different assembly approaches, including genomic-, metagenomic- and transcriptomic-specialized assemblers. We quantified recovery and accuracy rates of each tool for ARGs both from in silico spiked metagenomic samples as well as real samples sequenced using both long- and short-read sequencing technologies. RESULTS The results revealed that none of the investigated tools can accurately capture genomic contexts present in samples of high complexity. The transcriptomic assembler Trinity showed a better performance in terms of reconstructing longer and fewer contigs matching unique genomic contexts, which can be beneficial for deciphering the taxonomic origin of ARGs. The currently commonly used metagenomic assembly tools metaSPAdes and MEGAHIT were able to identify the ARG repertoire but failed to fully recover the diversity of genomic contexts present in a sample. On top of that, in a complex scenario MEGAHIT produced very short contigs, which can lead to considerable underestimation of the resistome in a given sample. 
CONCLUSIONS Our study shows that metaSPAdes and Trinity would be the preferable tools in terms of accuracy to recover correct genomic contexts around ARGs in metagenomic samples characterized by uneven coverages. Overall, the inability of assemblers to reconstruct long ARG-containing contigs has impacts on ARG quantification, suggesting that directly mapping reads to an ARG database should be performed as a complementary strategy to get accurate ARG abundance and diversity measures.
Affiliation(s)
- Anna Abramova
- Department of Infectious Diseases, Institute of Biomedicine, The Sahlgrenska Academy, University of Gothenburg, Guldhedsgatan 10A, Gothenburg, 413 46, Sweden.
- Division of Systems and Synthetic Biology, Department of Life Sciences, SciLifeLab, Chalmers University of Technology, Gothenburg, 412 96, Sweden.
- Centre for Antibiotic Resistance Research (CARe), Gothenburg, Sweden.
| | - Antti Karkman
- Department of Microbiology, University of Helsinki, Helsinki, Finland
| | - Johan Bengtsson-Palme
- Department of Infectious Diseases, Institute of Biomedicine, The Sahlgrenska Academy, University of Gothenburg, Guldhedsgatan 10A, Gothenburg, 413 46, Sweden
- Division of Systems and Synthetic Biology, Department of Life Sciences, SciLifeLab, Chalmers University of Technology, Gothenburg, 412 96, Sweden
- Centre for Antibiotic Resistance Research (CARe), Gothenburg, Sweden
|
11
|
Kang X, Zhang W, Li Y, Luo X, Schönhuth A. HyLight: Strain aware assembly of low coverage metagenomes. Nat Commun 2024; 15:8665. [PMID: 39375348 PMCID: PMC11458758 DOI: 10.1038/s41467-024-52907-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Accepted: 09/23/2024] [Indexed: 10/09/2024]
Abstract
Different strains of the same species can vary substantially in terms of their spectrum of biomedically relevant phenotypes. Reconstructing the genomes of microbial communities at the level of their strains poses significant challenges, because sequencing errors can obscure strain-specific variants. Next-generation sequencing (NGS) reads are too short to resolve complex genomic regions. Third-generation sequencing (TGS) reads, although longer, are prone to higher error rates or substantially more expensive. Limiting TGS coverage to reduce costs compromises the accuracy of the assemblies. This explains why prior approaches suffer from losses in strain awareness or accuracy, from excessive costs, or from combinations thereof. We introduce HyLight, a metagenome assembly approach that addresses these challenges by exploiting the complementary strengths of TGS and NGS data. HyLight employs strain-resolved overlap graphs (OG) to accurately reconstruct individual strains within microbial communities. Our experiments demonstrate that HyLight produces strain-aware and contiguous assemblies at minimal error content, while significantly reducing costs by utilizing low-coverage TGS data. HyLight achieves an average improvement of 19.05% in preserving strain identity and demonstrates near-complete strain awareness across diverse datasets. In summary, HyLight offers considerable advances in metagenome assembly, insofar as it delivers significantly enhanced strain awareness, contiguity, and accuracy without the typical compromises observed in existing approaches.
Affiliation(s)
- Xiongbin Kang
- College of Biology, Hunan University, Changsha, China
- Genome Data Science, Faculty of Technology, Bielefeld University, Bielefeld, Germany
| | - Wenhai Zhang
- College of Biology, Hunan University, Changsha, China
| | - Yichen Li
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
| | - Xiao Luo
- College of Biology, Hunan University, Changsha, China.
| | - Alexander Schönhuth
- Genome Data Science, Faculty of Technology, Bielefeld University, Bielefeld, Germany.
|
12
|
da Silva S, Vuong P, Amaral JRV, da Silva VAS, de Oliveira SS, Vermelho AB, Beale DJ, Bissett A, Whiteley AS, Kaur P, Macrae A. The piranha gut microbiome provides a selective lens into river water biodiversity. Sci Rep 2024; 14:21518. [PMID: 39277613 PMCID: PMC11401890 DOI: 10.1038/s41598-024-72329-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2024] [Accepted: 09/05/2024] [Indexed: 09/17/2024]
Abstract
Advances in omics technologies have enabled the in-depth study of microbial communities and their metabolic profiles from all environments. Here metagenomes were sampled from piranha (Serrasalmus rhombeus) and from river water from the Rio São Benedito (Amazon Basin). Shotgun metagenome sequencing was used to explore diversity and to test whether fish microbiomes are a good proxy for river microbiome studies. The results showed that the fish microbiomes were not significantly different from the river water microbiomes at higher taxonomic ranks. However, at the genus level, fish microbiome alpha diversity decreased, and beta diversity increased. This result was repeated for functional gene abundances associated with specific metabolic categories (SEED level 3). A clear delineation between water and fish was seen for beta diversity. The piranha microbiome provides a good and representative subset of its river water microbiome. Variations seen in beta diversity were expected and can be explained by temporal variations in the fish microbiome in response to stronger selective forces on its biodiversity. Construction of metagenome-assembled genomes was better from the fish samples. This study has revealed that the microbiome of a piranha tells us a lot about its river water microbiome and function.
Affiliation(s)
- Sheila da Silva
- Programa Pós-Graduação de Biotecnologia Vegetal e Bioprocessos, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Instituto de Microbiologia Paulo de Góes, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
| | - Paton Vuong
- UWA School of Agriculture & Environment, University of Western Australia, Perth, Australia
| | - João Ricardo Vidal Amaral
- Programa Pós-Graduação de Biotecnologia Vegetal e Bioprocessos, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Instituto de Microbiologia Paulo de Góes, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
| | | | - Selma Soares de Oliveira
- Programa Pós-Graduação de Biotecnologia Vegetal e Bioprocessos, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Instituto de Microbiologia Paulo de Góes, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
| | - Alane Beatriz Vermelho
- Programa Pós-Graduação de Biotecnologia Vegetal e Bioprocessos, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
- Instituto de Microbiologia Paulo de Góes, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
| | - David John Beale
- Commonwealth Scientific and Industrial Research Organization (CSIRO), Environment, Dutton Park, QLD, Australia
| | - Andrew Bissett
- Commonwealth Scientific and Industrial Research Organization (CSIRO), Environment, Battery Point, TAS, Australia
| | - Andrew Steven Whiteley
- Commonwealth Scientific and Industrial Research Organization (CSIRO), Environment, Waterford, WA, Australia
| | - Parwinder Kaur
- UWA School of Agriculture & Environment, University of Western Australia, Perth, Australia
| | - Andrew Macrae
- Programa Pós-Graduação de Biotecnologia Vegetal e Bioprocessos, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil.
- Instituto de Microbiologia Paulo de Góes, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil.
|
13
|
Liu C, Tang Z, Li L, Kang Y, Teng Y, Yu Y. Enhancing antimicrobial resistance detection with MetaGeneMiner: Targeted gene extraction from metagenomes. Chin Med J (Engl) 2024; 137:2092-2098. [PMID: 38934052 PMCID: PMC11374256 DOI: 10.1097/cm9.0000000000003182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2024] [Indexed: 06/28/2024]
Abstract
BACKGROUND Accurately and efficiently extracting microbial genomic sequences from complex metagenomic data is crucial for advancing our understanding in fields such as clinical diagnostics, environmental microbiology, and biodiversity. As sequencing technologies evolve, this task becomes increasingly challenging due to the intricate nature of microbial communities and the vast amount of data generated. Especially in intensive care units (ICUs), infections caused by antibiotic-resistant bacteria are increasingly prevalent among critically ill patients, significantly impacting the effectiveness of treatments and patient prognoses. Therefore, obtaining timely and accurate information about infectious pathogens is of paramount importance for the treatment of patients with severe infections, as it enables precisely targeted anti-infection therapies, and a tool that can extract microbial genomic sequences from metagenomic datasets would be helpful. METHODS We developed MetaGeneMiner to help with retrieving specific microbial genomic sequences from metagenomes using a k-mer-based approach. It facilitates the rapid and accurate identification and analysis of pathogens. The tool is designed to be user-friendly and efficient on standard personal computers, allowing its use across a wide variety of settings. We validated MetaGeneMiner using eight metagenomic samples from ICU patients, which demonstrated its efficiency and accuracy. RESULTS The software extensively retrieved coding sequences of pathogens Acinetobacter baumannii and herpes simplex virus type 1 and detected a variety of resistance genes. All documentation and source code for MetaGeneMiner are freely available at https://gitee.com/sculab/MetaGeneMiner. CONCLUSIONS It is foreseeable that MetaGeneMiner possesses the potential for applications across multiple domains, including clinical diagnostics, environmental microbiology, gut microbiome research, as well as biodiversity and conservation biology.
Particularly in ICU settings, MetaGeneMiner introduces a novel, rapid, and precise method for diagnosing and treating infections in critically ill patients. This tool is capable of efficiently identifying infectious pathogens, guiding personalized and precise treatment strategies, and monitoring the development of antibiotic resistance, significantly impacting the diagnosis and treatment of severe infections.
Affiliation(s)
- Chang Liu
- Department of Critical Care Medicine, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
| | - Zizhen Tang
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, Sichuan 610065, China
| | - Linzhu Li
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, Sichuan 610065, China
| | - Yan Kang
- Department of Critical Care Medicine, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
| | - Yue Teng
- State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing 100071, China
| | - Yan Yu
- Key Laboratory of Bio-Resources and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, Sichuan 610065, China
|
14
|
Giolai M, Verweij W, Martin S, Pearson N, Nicholson P, Leggett RM, Clark MD. Measuring air metagenomic diversity in an agricultural ecosystem. Curr Biol 2024; 34:3778-3791.e4. [PMID: 39096906 DOI: 10.1016/j.cub.2024.07.030] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 02/03/2023] [Revised: 04/26/2024] [Accepted: 07/04/2024] [Indexed: 08/05/2024]
Abstract
All species shed DNA during life or in death, providing an opportunity to monitor biodiversity via environmental DNA (eDNA). In recent years, combining eDNA, high-throughput sequencing technologies, bioinformatics, and increasingly complete sequence databases has promised a non-invasive and non-destructive environmental monitoring tool. Modern agricultural systems are often large monocultures and so are highly vulnerable to disease outbreaks. Pest and pathogen monitoring in agricultural ecosystems is key for efficient and early disease prevention, lower pesticide use, and better food security. Although the air is rich in biodiversity, it has the lowest DNA concentration of all environmental media and yet is the route for windborne spread of many damaging crop pathogens. Our work suggests that ecosystems can be monitored efficiently using airborne nucleic acid information. Here, we show that the airborne DNA of microbes can be recovered, shotgun sequenced, and taxonomically classified, including down to the species level. We show that by monitoring a field growing key crops we can identify the presence of agriculturally significant pathogens and quantify their changing abundance over a period of 1.5 months, often correlating with weather variables. We add to the evidence that aerial eDNA can be used as a source for biomonitoring in terrestrial ecosystems, specifically highlighting agriculturally relevant species and how pathogen levels correlate with weather conditions. Our ability to detect dynamically changing levels of species and strains highlights the value of airborne eDNA in agriculture, monitoring biodiversity changes, and tracking taxa of interest.
Affiliation(s)
- Michael Giolai
- Natural History Museum, London SW7 5BD, UK; Research Centre for Ecological Change, Organismal and Evolutionary Biology Research Program, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki 00014, Finland
| | - Walter Verweij
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK; Enza Zaden, Enkhuizen 1602 DB, the Netherlands
| | - Samuel Martin
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
| | - Neil Pearson
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
| | - Paul Nicholson
- Crop Genetics Department, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
|
15
|
Chen L, Chen A, Zhang XD, Saenz Robles MT, Han HS, Xiao Y, Xiao G, Pipas JM, Weitz DA. Targeted whole-genome recovery of single viral species in a complex environmental sample. Proc Natl Acad Sci U S A 2024; 121:e2404727121. [PMID: 39052829 PMCID: PMC11295033 DOI: 10.1073/pnas.2404727121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Accepted: 06/07/2024] [Indexed: 07/27/2024]
Abstract
Characterizing unknown viruses is essential for understanding viral ecology and preparing against viral outbreaks. Recovering complete genome sequences from environmental samples remains computationally challenging using metagenomics, especially for low-abundance species with uneven coverage. We present an experimental method for reliably recovering complete viral genomes from complex environmental samples. Individual genomes are encapsulated into droplets and amplified using multiple displacement amplification. A unique gene detection assay, which employs an RNA-based probe and an exonuclease, selectively identifies droplets containing the target viral genome. Labeled droplets are sorted using a microfluidic sorter, and genomes are extracted for sequencing. We demonstrate this method's efficacy by spiking two known viral genomes, Simian virus 40 (SV40, 5,243 bp) and Human Adenovirus 5 (HAd5, 35,938 bp), into a sewage sample with a final abundance in the droplets of around 0.1% and 0.015%, respectively. We achieve 100% recovery of the complete sequence of the spiked-in SV40 genome with uniform coverage distribution. For the larger HAd5 genome, we cover approximately 99.4% of its sequence. Notably, genome recovery is achieved with as few as one sorted droplet, which enables the recovery of any desired genomes in complex environmental samples, regardless of their abundance. This method enables single-genome whole-genome amplification and targeted characterization of rare viral species and will facilitate our ability to access the mutational profile in single-virus genomes and contribute to an improved understanding of viral ecology.
Affiliation(s)
- Liyin Chen
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138
| | - Anqi Chen
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138
| | - Xinge Diana Zhang
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138
| | | | - Hee-Sun Han
- Department of Chemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801
- Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801
| | - Yi Xiao
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138
| | - Gao Xiao
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138
| | - James M. Pipas
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260
| | - David A. Weitz
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138
- Department of Physics, Harvard University, Cambridge, MA 02138
|
16
|
Rajeev S, Nishan K, Dipesh T, M TC, Manu V, Vida A, Juliana G, Surendra Kumar M, Binod G, Runa J. Investigation of acute encephalitis syndrome with implementation of metagenomic next generation sequencing in Nepal. BMC Infect Dis 2024; 24:734. [PMID: 39054413 PMCID: PMC11274775 DOI: 10.1186/s12879-024-09628-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Accepted: 07/17/2024] [Indexed: 07/27/2024]
Abstract
BACKGROUND The causative agents of Acute Encephalitis Syndrome remain unknown in 68-75% of the cases. In Nepal, the cases are tested only for Japanese encephalitis, which constitutes only about 15% of the cases. However, there could be several organisms, including vaccine-preventable etiologies, that cause acute encephalitis and which, when identified, could direct public health efforts for prevention, including addressing gaps in vaccine coverage. OBJECTIVES This study employs metagenomic next-generation sequencing in the investigation of underlying causative etiologies contributing to acute encephalitis syndrome in Nepal. METHODS In this study, we investigated 90 banked, Japanese-encephalitis-negative cerebrospinal fluid samples that were collected as part of a national surveillance network in 2016 and 2017. Randomization was done to include three age groups (<5 years; 5-14 years; >15 years). Only some metadata (age and gender) were available. The investigation was performed in two batches, which included total nucleic acid extraction, followed by individual library preparation (DNA and RNA) and sequencing on Illumina iSeq100. The genomic data were interpreted using Chan Zuckerberg-ID and confirmed with polymerase chain reaction. RESULTS Human alphaherpesvirus 2 and Enterovirus B were seen in two samples. These hits were confirmed by qPCR and semi-nested PCR, respectively. Most of the other samples were marred by low abundance of pathogen, possible freeze-thaw cycles, lack of process controls and associated clinical metadata. CONCLUSION From this study, two documented causative agents were revealed through metagenomic next-generation sequencing. Insufficiency of clinical metadata, process controls, low pathogen abundance and absence of standard procedures to collect and store samples in nucleic acid protectants could have impeded the study and introduced ambiguity when correlating the identified hits to infection.
Therefore, there is a need for standardized procedures for sample collection, inclusion of process controls and clinical metadata. Despite challenging conditions, this study highlights the usefulness of mNGS to investigate diseases with unknown etiologies and guide development of adequate clinical management algorithms and outbreak investigations in Nepal.
Affiliation(s)
- Shrestha Rajeev
- Center for Infectious Disease Research and Surveillance, Dhulikhel Hospital Kathmandu University Hospital, Dhulikhel, Nepal.
- Department of Pharmacology, Kathmandu University School of Medical Sciences, Dhulikhel, Nepal.
- Molecular and Genome Sequencing Research Lab, Dhulikhel Hospital Kathmandu University Hospital, Dhulikhel, Nepal.
| | - Katuwal Nishan
- Center for Infectious Disease Research and Surveillance, Dhulikhel Hospital Kathmandu University Hospital, Dhulikhel, Nepal
- Molecular and Genome Sequencing Research Lab, Dhulikhel Hospital Kathmandu University Hospital, Dhulikhel, Nepal
| | - Tamrakar Dipesh
- Center for Infectious Disease Research and Surveillance, Dhulikhel Hospital Kathmandu University Hospital, Dhulikhel, Nepal
- Department of Community Medicine, Kathmandu University School of Medical Sciences, Dhulikhel, Nepal
| | - Tato Cristina M
- Rapid Response Team, Chan Zuckerberg Biohub, San Francisco, USA
| | | | - Ahyong Vida
- Rapid Response Team, Chan Zuckerberg Biohub, San Francisco, USA
| | - Gil Juliana
- Rapid Response Team, Chan Zuckerberg Biohub, San Francisco, USA
| | - Madhup Surendra Kumar
- Department of Microbiology, Kathmandu University School of Medical Sciences, Dhulikhel, Nepal
| | - Gupta Binod
- Emergency Preparedness and Operation, WHE Program, World Health Organization, Kathmandu, Nepal
| | - Jha Runa
- National Public Health Laboratory, Kathmandu, Nepal
|
17
|
Zavadska D, Henry N, Auladell A, Berney C, Richter DJ. Diverse patterns of correspondence between protist metabarcodes and protist metagenome-assembled genomes. PLoS One 2024; 19:e0303697. [PMID: 38843225 PMCID: PMC11156365 DOI: 10.1371/journal.pone.0303697] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 04/29/2024] [Indexed: 06/09/2024]
Abstract
Two common approaches to study the composition of environmental protist communities are metabarcoding and metagenomics. Raw metabarcoding data are usually processed into Operational Taxonomic Units (OTUs) or amplicon sequence variants (ASVs) through clustering or denoising approaches, respectively. Analogous approaches are used to assemble metagenomic reads into metagenome-assembled genomes (MAGs). Understanding the correspondence between the data produced by these two approaches can help to integrate information between the datasets and to explain how metabarcoding OTUs and MAGs are related with the underlying biological entities they are hypothesised to represent. MAGs do not contain the commonly used barcoding loci, therefore sequence homology approaches cannot be used to match OTUs and MAGs. We made an attempt to match V9 metabarcoding OTUs from the 18S rRNA gene (V9 OTUs) and MAGs from the Tara Oceans expedition based on the correspondence of their relative abundances across the same set of samples. We evaluated several metrics for detecting correspondence between features in these two datasets and developed controls to filter artefacts of data structure and processing. After selecting the best-performing metrics, ranking the V9 OTU/MAG matches by their proportionality/correlation coefficients and applying a set of selection criteria, we identified candidate matches between V9 OTUs and MAGs. In some cases, V9 OTUs and MAGs could be matched with a one-to-one correspondence, implying that they likely represent the same underlying biological entity. More generally, matches we observed could be classified into 4 scenarios: one V9 OTU matches many MAGs; many V9 OTUs match many MAGs; many V9 OTUs match one MAG; one V9 OTU matches one MAG. 
Notably, we found some instances in which different OTU-MAG matches from the same taxonomic group were not classified in the same scenario, with all four scenarios possible even within the same taxonomic group, illustrating that factors beyond taxonomic lineage influence the relationship between OTUs and MAGs. Overall, each scenario produces a different interpretation of V9 OTUs, MAGs and how they compare in terms of the genomic and ecological diversity they represent.
Affiliation(s)
- Daryna Zavadska
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Barcelona, Spain
| | - Nicolas Henry
- CNRS, FR2424, ABiMS, Station Biologique de Roscoff, Sorbonne Université, Roscoff, France
| | - Adrià Auladell
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Barcelona, Spain
| | - Cédric Berney
- CNRS, UMR7144, AD2M, Station Biologique de Roscoff, Sorbonne Université, Roscoff, France
| | - Daniel J. Richter
- Institut de Biologia Evolutiva (CSIC-Universitat Pompeu Fabra), Barcelona, Spain
|
18
|
Chen Z, Grim CJ, Ramachandran P, Meng J. Advancing metagenome-assembled genome-based pathogen identification: unraveling the power of long-read assembly algorithms in Oxford Nanopore sequencing. Microbiol Spectr 2024; 12:e0011724. [PMID: 38687063 PMCID: PMC11237517 DOI: 10.1128/spectrum.00117-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Accepted: 04/05/2024] [Indexed: 05/02/2024]
Abstract
Oxford Nanopore sequencing is one of the high-throughput sequencing technologies that facilitates the reconstruction of metagenome-assembled genomes (MAGs). This study aimed to assess the potential of long-read assembly algorithms in Oxford Nanopore sequencing to enhance the MAG-based identification of bacterial pathogens using both simulated and mock communities. Simulated communities were generated to mimic those on fresh spinach and in surface water. Long reads were produced using R9.4.1+SQK-LSK109 and R10.4 + SQK-LSK112, with 0.5, 1, and 2 million reads. The simulated bacterial communities included multidrug-resistant Salmonella enterica serotypes Heidelberg, Montevideo, and Typhimurium in the fresh spinach community individually or in combination, as well as multidrug-resistant Pseudomonas aeruginosa in the surface water community. Real data sets of the ZymoBIOMICS HMW DNA Standard were also studied. A bioinformatic pipeline (MAGenie, freely available at https://github.com/jackchen129/MAGenie) that combines metagenome assembly, taxonomic classification, and sequence extraction was developed to reconstruct draft MAGs from metagenome assemblies. Five assemblers were evaluated based on a series of genomic analyses. Overall, Flye outperformed the other assemblers, followed by Shasta, Raven, and Unicycler, while Canu performed least effectively. In some instances, the extracted sequences resulted in draft MAGs and provided the locations and structures of antimicrobial resistance genes and mobile genetic elements. Our study showcases the viability of utilizing the extracted sequences for precise phylogenetic inference, as demonstrated by the consistent alignment of phylogenetic topology between the reference genome and the extracted sequences. 
R9.4.1+SQK-LSK109 was more effective in most cases than R10.4+SQK-LSK112, and greater sequencing depths generally led to more accurate results. IMPORTANCE By examining diverse bacterial communities, particularly those housing multiple Salmonella enterica serotypes, this study holds significance in uncovering the potential of long-read assembly algorithms to improve metagenome-assembled genome (MAG)-based pathogen identification through Oxford Nanopore sequencing. Our research demonstrates that long-read assembly stands out as a promising avenue for boosting precision in MAG-based pathogen identification, thus advancing the development of more robust surveillance measures. The findings also support ongoing endeavors to fine-tune a bioinformatic pipeline for accurate pathogen identification within complex metagenomic samples.
Affiliation(s)
- Zhao Chen
- Joint Institute for Food Safety and Applied Nutrition, Center for Food Safety and Security Systems, University of Maryland, College Park, Maryland, USA
| | - Christopher J. Grim
- Center for Food Safety and Applied Nutrition, United States Food and Drug Administration, College Park, Maryland, USA
| | - Padmini Ramachandran
- Center for Food Safety and Applied Nutrition, United States Food and Drug Administration, College Park, Maryland, USA
| | - Jianghong Meng
- Joint Institute for Food Safety and Applied Nutrition, Center for Food Safety and Security Systems, University of Maryland, College Park, Maryland, USA
- Department of Nutrition and Food Science, University of Maryland, College Park, Maryland, USA
|
19
|
Xiao Y, Hao T. New insights on ecological roles of waste activated sludge in nutrient-stressed co-digestion. BIORESOURCE TECHNOLOGY 2024; 402:130836. [PMID: 38744398 DOI: 10.1016/j.biortech.2024.130836] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Revised: 05/10/2024] [Accepted: 05/11/2024] [Indexed: 05/16/2024]
Abstract
Waste activated sludge (WAS) has been applied extensively in anaerobic co-digestion (AcoD). Nonetheless, the mechanisms through which AcoD systems maintain stability, particularly under nutrient-stressed conditions, are under-appreciated. In this study, the role of WAS in a nutrient-stressed WAS-food waste AcoD system was re-evaluated. Our findings demonstrated that WAS-based co-digestion increased methane production (by 20-60%), as WAS bolsters the resilience of such systems by establishing a core niche-based microbial balance. The carbon utilization investigation suggested that a microbial niche balance is attainable if two conditions are satisfied: 1) hydrolysis efficiency is greater than 50%; and 2) both the acidogenesis-to-hydrolysis and acetogenesis-to-hydrolysis efficiencies surpass 0.5. Metagenome-assembled genome (MAG) analysis indicated that versatile metabolic characteristics strengthened the microbial niche balance, rendering the system resilient and efficient through a syntrophic mode that contributed to both acidogenesis and acetogenesis. The findings of this study provide new insights into the ecological effects of WAS on AcoD.
Affiliation(s)
- Yihang Xiao
- Department of Civil and Environmental Engineering, Faculty of Science and Technology, University of Macau, Macau
| | - Tianwei Hao
- Department of Civil and Environmental Engineering, Faculty of Science and Technology, University of Macau, Macau.
|
20
|
Zampolli J, De Giani A, Rossi M, Finazzi M, Di Gennaro P. Who inhabits the built environment? A microbiological point of view on the principal bacteria colonizing our urban areas. Front Microbiol 2024; 15:1380953. [PMID: 38863750 PMCID: PMC11165352 DOI: 10.3389/fmicb.2024.1380953] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Accepted: 05/09/2024] [Indexed: 06/13/2024] Open
Abstract
Modern lifestyle greatly influences human well-being. People are now concentrated in cities, and this trend is growing with the ever-increasing population. The main habitat for modern humans is defined as the built environment (BE). The modulation of life quality in the BE is primarily mediated by a diversity of microbes. They derive from different sources, such as soil, water, air, pets, and humans. Humans are the main source and vector of bacterial diversity in the BE, leaving a characteristic microbial fingerprint on surfaces and spaces. This review, focusing on articles published from the early 2000s, delves into bacterial populations present in indoor and outdoor urban environments, exploring the characteristics of primary bacterial niches in the BE and their native habitats. It elucidates bacterial interconnections within this context and among themselves, shedding light on pathways for adaptation and survival across diverse environmental conditions. Given the limitations of culture-based methods, emphasis is placed on culture-independent approaches, particularly high-throughput techniques to elucidate the genetic and -omic features of BE bacteria. By elucidating these microbiota profiles, the review aims to contribute to understanding the implications for human health and the assessment of urban environmental quality in modern cities.
Affiliation(s)
- Patrizia Di Gennaro
- Department of Biotechnology and Biosciences, University of Milano-Bicocca, Milan, Italy
|
21
|
Lin A, Torres CM, Hobbs EC, Bardhan J, Aley SB, Spencer CT, Taylor KL, Chiang T. Computational and Systems Biology Advances to Enable Bioagent Agnostic Signatures. Health Secur 2024; 22:130-139. [PMID: 38483337 PMCID: PMC11044874 DOI: 10.1089/hs.2023.0076] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/26/2024] Open
Affiliation(s)
- Andy Lin
- Andy Lin, PhD, is a Linus Pauling Distinguished Postdoctoral Fellow in the National Security Directorate, Pacific Northwest National Laboratory, Seattle, WA
| | - Cameron M. Torres
- Cameron M. Torres is a Graduate Research Assistant and Wieland Fellow, Department of Biological Sciences, at the University of Texas at El Paso, El Paso, TX
| | - Errett C. Hobbs
- Errett C. Hobbs, PhD, is a Data Scientist in the National Security Directorate, Pacific Northwest National Laboratory, Seattle, WA
| | - Jaydeep Bardhan
- Jaydeep Bardhan, PhD, is a Research Line Manager, Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA
| | - Stephen B. Aley
- Stephen B. Aley, PhD, is a Professor, Biological Sciences, and an Associate Vice President for Research, Sponsored Projects, at the University of Texas at El Paso, El Paso, TX
| | - Charles T. Spencer
- Charles T. Spencer, PhD, is an Associate Professor, Biological Sciences, and Edward and Barbara Brown Egbert Endowed Chair of the Department of Biological Sciences, at the University of Texas at El Paso, El Paso, TX
| | - Karen L. Taylor
- Karen L. Taylor, MS, is a Research Line Manager in the National Security Directorate, Pacific Northwest National Laboratory, Seattle, WA
| | - Tony Chiang
- Tony Chiang, PhD, is a Data Scientist in the National Security Directorate, Pacific Northwest National Laboratory, Seattle, WA
|
22
|
Lin A, Torres C, Hobbs EC, Bardhan J, Aley S, Spencer CT, Taylor KL, Chiang T. Computational and Systems Biology Advances to Enable Bioagent Agnostic Signatures. ARXIV 2024:arXiv:2310.13898v3. [PMID: 37961741 PMCID: PMC10635321] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Enumerated threat agent lists have long driven biodefense priorities. The global SARS-CoV-2 pandemic demonstrated the limitations of searching for known threat agents as compared to a more agnostic approach. Recent technological advances are enabling agent-agnostic biodefense, especially through the integration of multi-modal observations of host-pathogen interactions directed by a human immunological model. Although well-developed technical assays exist for many aspects of human-pathogen interaction, the analytic methods and pipelines to combine and holistically interpret the results of such assays are immature and require further investments to exploit new technologies. In this manuscript, we discuss potential immunologically based bioagent-agnostic approaches and the computational tool gaps the community should prioritize filling.
Affiliation(s)
- Andy Lin
- National Security Directorate, Pacific Northwest National Laboratory, Seattle, WA 98109, USA
| | - Cameron Torres
- Department of Biological Sciences, University of Texas at El Paso, El Paso, Texas 79968 USA
| | - Errett C Hobbs
- National Security Directorate, Pacific Northwest National Laboratory, Seattle, WA 98109, USA
| | - Jaydeep Bardhan
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Seattle, WA 98109, USA
| | - Stephen Aley
- Department of Biological Sciences, University of Texas at El Paso, El Paso, Texas 79968 USA
| | - Charles T Spencer
- Department of Biological Sciences, University of Texas at El Paso, El Paso, Texas 79968 USA
| | - Karen L Taylor
- National Security Directorate, Pacific Northwest National Laboratory, Seattle, WA 98109, USA
| | - Tony Chiang
- National Security Directorate, Pacific Northwest National Laboratory, Seattle, WA 98109, USA
- Department of Biological Sciences, University of Texas at El Paso, El Paso, Texas 79968 USA
- Department of Mathematics, University of Washington, Seattle 98102 USA
|
23
|
Li Y, Miyani B, Faust RA, David RE, Xagoraraki I. A broad wastewater screening and clinical data surveillance for virus-related diseases in the metropolitan Detroit area in Michigan. Hum Genomics 2024; 18:14. [PMID: 38321488 PMCID: PMC10845806 DOI: 10.1186/s40246-024-00581-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Accepted: 01/24/2024] [Indexed: 02/08/2024] Open
Abstract
BACKGROUND: Periodic bioinformatics-based screening of wastewater for assessing the diversity of potential human viral pathogens circulating in a given community may help to identify novel or potentially emerging infectious diseases. Any identified contigs related to novel or emerging viruses should be confirmed with targeted wastewater and clinical testing. RESULTS: During the COVID-19 pandemic, untreated wastewater samples were collected for a 1-year period from the Great Lakes Water Authority Wastewater Treatment Facility in Detroit, MI, USA, and viral population diversity from both centralized interceptor sites and localized neighborhood sewersheds was investigated. Clinical cases of the diseases caused by human viruses were tabulated and compared with data from viral wastewater monitoring. In addition to Betacoronavirus, comparison using assembled contigs against a custom Swiss-Prot human virus database indicated the potential prevalence of other pathogenic virus genera, including: Orthopoxvirus, Rhadinovirus, Parapoxvirus, Varicellovirus, Hepatovirus, Simplexvirus, Bocaparvovirus, Molluscipoxvirus, Parechovirus, Roseolovirus, Lymphocryptovirus, Alphavirus, Spumavirus, Lentivirus, Deltaretrovirus, Enterovirus, Kobuvirus, Gammaretrovirus, Cardiovirus, Erythroparvovirus, Salivirus, Rubivirus, Orthohepevirus, Cytomegalovirus, Norovirus, and Mamastrovirus. Four nearly complete genomes were recovered from the Astrovirus, Enterovirus, Norovirus and Betapolyomavirus genera and viral species were identified. CONCLUSIONS: The presented findings in wastewater samples are primarily at the genus level and can serve as a preliminary "screening" tool that may serve as indication to initiate further testing for the confirmation of the presence of species that may be associated with human disease. Integrating innovative environmental microbiology technologies like metagenomic sequencing with viral epidemiology offers a significant opportunity to improve the monitoring of, and predictive intelligence for, pathogenic viruses, using wastewater.
Affiliation(s)
- Yabing Li
- Department of Civil and Environmental Engineering, Michigan State University, 1449 Engineering Research Ct, East Lansing, MI, 48823, USA
| | - Brijen Miyani
- Department of Civil and Environmental Engineering, Michigan State University, 1449 Engineering Research Ct, East Lansing, MI, 48823, USA
| | - Russell A Faust
- Oakland County Health Division, 1200 Telegraph Rd, Pontiac, MI, 48341, USA
| | - Randy E David
- School of Medicine, Wayne State University, Detroit, MI, 48282, USA
| | - Irene Xagoraraki
- Department of Civil and Environmental Engineering, Michigan State University, 1449 Engineering Research Ct, East Lansing, MI, 48823, USA.
|
24
|
Kim C, Pongpanich M, Porntaveetus T. Unraveling metagenomics through long-read sequencing: a comprehensive review. J Transl Med 2024; 22:111. [PMID: 38282030 PMCID: PMC10823668 DOI: 10.1186/s12967-024-04917-1] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 10/15/2023] [Accepted: 01/21/2024] [Indexed: 01/30/2024] Open
Abstract
The study of microbial communities has undergone significant advancements, starting from the initial use of 16S rRNA sequencing to the adoption of shotgun metagenomics. However, a new era has emerged with the advent of long-read sequencing (LRS), which offers substantial improvements over its predecessor, short-read sequencing (SRS). LRS produces reads that are several kilobases long, enabling researchers to obtain more complete and contiguous genomic information, characterize structural variations, and study epigenetic modifications. The current leaders in LRS technologies are Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT), each offering a distinct set of advantages. This review covers the workflow of long-read metagenomics sequencing, including sample preparation (sample collection, sample extraction, and library preparation), sequencing, processing (quality control, assembly, and binning), and analysis (taxonomic annotation and functional annotation). Each section provides a concise outline of the key concept of the methodology, presenting the original concept as well as how it is challenged or modified in the context of LRS. Additionally, each section introduces a range of tools that are compatible with LRS and can be utilized to execute the LRS process. This review aims to present the workflow of metagenomics, highlight the transformative impact of LRS, and provide researchers with a selection of tools suitable for this task.
Affiliation(s)
- Chankyung Kim
- Center of Excellence in Genomics and Precision Dentistry, Department of Physiology, Faculty of Dentistry, Chulalongkorn University, Bangkok, Thailand
- Graduate Program in Bioinformatics and Computational Biology, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
| | - Monnat Pongpanich
- Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
- Center of Excellence for Cancer and Inflammation, Chulalongkorn University, Bangkok, Thailand
| | - Thantrira Porntaveetus
- Center of Excellence in Genomics and Precision Dentistry, Department of Physiology, Faculty of Dentistry, Chulalongkorn University, Bangkok, Thailand.
- Graduate Program in Geriatric and Special Patients Care, Faculty of Dentistry, Chulalongkorn University, Bangkok, Thailand.
|
25
|
Li Y, Miyani B, Childs KL, Shiu SH, Xagoraraki I. Effect of wastewater collection and concentration methods on assessment of viral diversity. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 908:168128. [PMID: 37918732 DOI: 10.1016/j.scitotenv.2023.168128] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/30/2023] [Revised: 10/23/2023] [Accepted: 10/24/2023] [Indexed: 11/04/2023]
Abstract
Monitoring of potentially pathogenic human viruses in wastewater is of crucial importance to understand disease trends in communities, predict potential outbreaks, and boost preparedness and response by public health departments. High-throughput metagenomic sequencing opens an opportunity to expand the capabilities of wastewater surveillance. However, there are major bottlenecks in metagenomic-enabled wastewater surveillance, including the complexities in selecting appropriate sampling and concentration/virus enrichment methods as well as in bioinformatic analysis of complex samples with low human virus concentrations. To evaluate the abilities of two commonly used sampling and concentration methods in virus identification, virus communities concentrated with Virus Adsorption-Elution (VIRADEL) and polyethylene glycol (PEG) precipitation were compared for three interceptor sites. Results indicated that more viral reads were obtained by the VIRADEL concentration method, with 2.84 ± 0.57% viral reads in the sample. For samples concentrated with PEG, the average proportion of viral reads in the sample was 0.63 ± 0.19%. In all wastewater samples, bacteriophages affiliated with the families Siphoviridae, Myoviridae and Podoviridae were found to be the abundant populations. Comparison against a custom Swiss-Prot human virus database indicated that the relatively abundant human viruses (average proportions in human virus community greater than 1.00%) in samples concentrated with the VIRADEL method were Orthopoxvirus, Rhadinovirus, Parapoxvirus, Varicellovirus, Hepatovirus, Simplexvirus, Molluscipoxvirus, Parechovirus, Lymphocryptovirus, and Spumavirus. In samples concentrated with the PEG method, fewer human viruses were found to be relatively abundant. These were Orthopoxvirus, Rhadinovirus, Varicellovirus, Simplexvirus, Molluscipoxvirus, Lymphocryptovirus, and Betacoronavirus. Contigs of Betacoronavirus, which contains severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), were identified in VIRADEL and PEG samples. Our study demonstrates the feasibility of using metagenomics in wastewater surveillance as a first screening tool and the need for selecting the appropriate virus concentration methods and optimizing bioinformatic approaches in analyzing metagenomic data of wastewater samples.
Affiliation(s)
- Yabing Li
- Department of Civil and Environmental Engineering, Michigan State University, 1449 Engineering Research Ct, East Lansing, MI, United States
| | - Brijen Miyani
- Department of Civil and Environmental Engineering, Michigan State University, 1449 Engineering Research Ct, East Lansing, MI, United States
| | - Kevin L Childs
- Department of Plant Biology, Michigan State University, East Lansing, MI, United States
| | - Shin-Han Shiu
- Department of Plant Biology, Michigan State University, East Lansing, MI, United States; Department of Energy (DOE) Great Lakes Bioenergy Research Center, Michigan State University, East Lansing, MI, United States; Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, MI, United States
| | - Irene Xagoraraki
- Department of Civil and Environmental Engineering, Michigan State University, 1449 Engineering Research Ct, East Lansing, MI, United States.
|
26
|
Baltoumas FA, Karatzas E, Liu S, Ovchinnikov S, Sofianatos Y, Chen IM, Kyrpides N, Pavlopoulos G. NMPFamsDB: a database of novel protein families from microbial metagenomes and metatranscriptomes. Nucleic Acids Res 2024; 52:D502-D512. [PMID: 37811892 PMCID: PMC10767849 DOI: 10.1093/nar/gkad800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 09/19/2023] [Indexed: 10/10/2023] Open
Abstract
The Novel Metagenome Protein Families Database (NMPFamsDB) is a database of metagenome- and metatranscriptome-derived protein families whose members have no hits to proteins of reference genomes or Pfam domains. Each protein family is accompanied by multiple sequence alignments, Hidden Markov Models, taxonomic information, ecosystem and geolocation metadata, sequence and structure predictions, as well as 3D structure models predicted with AlphaFold2. In its current version, NMPFamsDB hosts over 100 000 protein families, each with at least 100 members. The reported protein families significantly expand (more than double) the number of known protein sequence clusters from reference genomes and reveal new insights into their habitat distribution, origins, functions and taxonomy. We expect NMPFamsDB to be a valuable resource for microbial proteome-wide analyses and for further discovery and characterization of novel functions. NMPFamsDB is publicly available at http://www.nmpfamsdb.org/ or https://bib.fleming.gr/NMPFamsDB.
Affiliation(s)
- Fotis A Baltoumas
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, 16672, Greece
| | - Evangelos Karatzas
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, 16672, Greece
| | - Sirui Liu
- John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA 02138, USA
| | - Sergey Ovchinnikov
- John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA 02138, USA
| | - Yorgos Sofianatos
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, 16672, Greece
| | - I-Min Chen
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720-8150, USA
| | - Nikos C Kyrpides
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720-8150, USA
| | - Georgios A Pavlopoulos
- Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, Vari, 16672, Greece
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720-8150, USA
- Center for New Biotechnologies and Precision Medicine, School of Medicine, National and Kapodistrian University of Athens, 75 Mikras Asias Street, Athens 11527, Greece
|
27
|
Cerk K, Ugalde-Salas P, Nedjad CG, Lecomte M, Muller C, Sherman DJ, Hildebrand F, Labarthe S, Frioux C. Community-scale models of microbiomes: Articulating metabolic modelling and metagenome sequencing. Microb Biotechnol 2024; 17:e14396. [PMID: 38243750 PMCID: PMC10832553 DOI: 10.1111/1751-7915.14396] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/09/2023] [Revised: 11/27/2023] [Accepted: 12/20/2023] [Indexed: 01/21/2024] Open
Abstract
Building models is essential for understanding the functions and dynamics of microbial communities. Metabolic models built on genome-scale metabolic network reconstructions (GENREs) are especially relevant as a means to decipher the complex interactions occurring among species. Model reconstruction increasingly relies on metagenomics, which permits direct characterisation of naturally occurring communities that may contain organisms that cannot be isolated or cultured. In this review, we provide an overview of the field of metabolic modelling and its increasing reliance on and synergy with metagenomics and bioinformatics. We survey the means of assigning functions and reconstructing metabolic networks from (meta-)genomes, and present the variety and mathematical fundamentals of metabolic models that foster the understanding of microbial dynamics. We emphasise the characterisation of interactions and the scaling of model construction to large communities, two important bottlenecks in the applicability of these models. We give an overview of the current state of the art in metagenome sequencing and bioinformatics analysis, focusing on the reconstruction of genomes in microbial communities. Metagenomics benefits tremendously from third-generation sequencing, and we discuss the opportunities of long-read sequencing, strain-level characterisation and eukaryotic metagenomics. We aim at providing algorithmic and mathematical support, together with tool and application resources, that permit bridging the gap between metagenomics and metabolic modelling.
Affiliation(s)
- Klara Cerk
- Quadram Institute Bioscience, Norwich, UK
- Earlham Institute, Norwich, UK
| | - Chabname Ghassemi Nedjad
- Inria, University of Bordeaux, INRAE, Talence, France
- University of Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, Talence, France
| | - Maxime Lecomte
- Inria, University of Bordeaux, INRAE, Talence, France
- INRAE STLO, University of Rennes, Rennes, France
| | - Falk Hildebrand
- Quadram Institute Bioscience, Norwich, UK
- Earlham Institute, Norwich, UK
| | - Simon Labarthe
- Inria, University of Bordeaux, INRAE, Talence, France
- INRAE, University of Bordeaux, BIOGECO, UMR 1202, Cestas, France
|
28
|
Fu P, Wu Y, Zhang Z, Qiu Y, Wang Y, Peng Y. VIGA: a one-stop tool for eukaryotic virus identification and genome assembly from next-generation-sequencing data. Brief Bioinform 2023; 25:bbad444. [PMID: 38048079 PMCID: PMC10753531 DOI: 10.1093/bib/bbad444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 10/26/2023] [Accepted: 11/11/2023] [Indexed: 12/05/2023] Open
Abstract
Identification of viruses and subsequent assembly of viral genomes from next-generation-sequencing (NGS) data are essential steps in virome studies. This study presented a one-stop tool named VIGA (available at https://github.com/viralInformatics/VIGA) for eukaryotic virus identification and genome assembly from NGS data. It was composed of four modules, namely identification, taxonomic annotation, assembly and novel virus discovery, which integrated several third-party tools such as BLAST, Trinity, MetaCompass and RagTag. Evaluation on multiple simulated and real virome datasets showed that VIGA assembled more complete virus genomes than its competitors on both metatranscriptomic and metagenomic data and performed well in assembling virus genomes at the strain level. Finally, VIGA was used to investigate the virome in metatranscriptomic data from the Human Microbiome Project and revealed differences in virome composition and positive rate among prediabetes, Crohn's disease and ulcerative colitis. Overall, VIGA should aid the identification and characterization of viromes, especially known viruses, in future studies.
Affiliation(s)
- Ping Fu
- Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha 410082, China
| | - Yifan Wu
- Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha 410082, China
| | - Zhiyuan Zhang
- Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha 410082, China
| | - Ye Qiu
- Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha 410082, China
| | - Yirong Wang
- Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha 410082, China
| | - Yousong Peng
- Bioinformatics Center, College of Biology, Hunan Provincial Key Laboratory of Medical Virology, Hunan University, Changsha 410082, China
|
29
|
Chen L, Chen A, Zhang XD, Robles MST, Han HS, Xiao Y, Xiao G, Pipas JM, Weitz DA. High-sensitivity whole-genome recovery of single viral species in environmental samples. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.11.13.566948. [PMID: 38014300 PMCID: PMC10680796 DOI: 10.1101/2023.11.13.566948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Characterizing unknown viruses is essential for understanding viral ecology and preparing against viral outbreaks. Recovering complete genome sequences from environmental samples remains computationally challenging using metagenomics, especially for low-abundance species with uneven coverage. This work presents a method for reliably recovering complete viral genomes from complex environmental samples. Individual genomes are encapsulated into droplets and amplified using multiple displacement amplification. A novel gene detection assay, which employs an RNA-based probe and an exonuclease, selectively identifies droplets containing the target viral genome. Labeled droplets are sorted using a microfluidic sorter, and genomes are extracted for sequencing. Validation experiments using a sewage sample spiked with two known viruses demonstrate the method's efficacy. We achieve 100% recovery of the spiked-in SV40 (Simian virus 40, 5243bp) genome sequence with uniform coverage distribution, and approximately 99.4% for the larger HAd5 genome (Human Adenovirus 5, 35938bp). Notably, genome recovery is achieved with as few as one sorted droplet, which enables the recovery of any desired genomes in complex environmental samples, regardless of their abundance. This method enables targeted characterizations of rare viral species and whole-genome amplification of single genomes for accessing the mutational profile in single virus genomes, contributing to an improved understanding of viral ecology.
Affiliation(s)
- Liyin Chen
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, 02138, USA
| | - Anqi Chen
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, 02138, USA
| | - Xinge Diana Zhang
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, 02138, USA
| | - Maria Saenz T Robles
- Department of Biological Sciences, University of Pittsburgh, Pennsylvania 15260, USA
| | - Hee-Sun Han
- Department of Chemistry, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
- Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL, 61801, USA
| | - Yi Xiao
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, 02138, USA
| | - Gao Xiao
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, 02138, USA
| | - James M Pipas
- Department of Biological Sciences, University of Pittsburgh, Pennsylvania 15260, USA
| | - David A Weitz
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, 02138, USA
- Department of Physics, Harvard University, Cambridge, MA, 02138, USA
|
30
|
Walsh LH, Coakley M, Walsh AM, O'Toole PW, Cotter PD. Bioinformatic approaches for studying the microbiome of fermented food. Crit Rev Microbiol 2023; 49:693-725. [PMID: 36287644 DOI: 10.1080/1040841x.2022.2132850] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/11/2022] [Revised: 08/11/2022] [Accepted: 09/28/2022] [Indexed: 11/03/2022]
Abstract
High-throughput DNA sequencing-based approaches continue to revolutionise our understanding of microbial ecosystems, including those associated with fermented foods. Metagenomic and metatranscriptomic approaches are state-of-the-art biological profiling methods and are employed to investigate a wide variety of characteristics of microbial communities, such as taxonomic membership, gene content and the range and level at which these genes are expressed. Individual groups and consortia of researchers are utilising these approaches to produce increasingly large and complex datasets, representing vast populations of microorganisms. There is a corresponding requirement for the development and application of appropriate bioinformatic tools and pipelines to interpret this data. This review critically analyses the tools and pipelines that have been used or that could be applied to the analysis of metagenomic and metatranscriptomic data from fermented foods. In addition, we critically analyse a number of studies of fermented foods in which these tools have previously been applied, to highlight the insights that these approaches can provide.
Affiliation(s)
- Liam H Walsh
- Teagasc Food Research Centre, Moorepark, Fermoy, Cork, Ireland
- School of Microbiology, University College Cork, Ireland
| | - Mairéad Coakley
- Teagasc Food Research Centre, Moorepark, Fermoy, Cork, Ireland
| | - Aaron M Walsh
- Teagasc Food Research Centre, Moorepark, Fermoy, Cork, Ireland
| | - Paul W O'Toole
- School of Microbiology, University College Cork, Ireland
- APC Microbiome Ireland, University College Cork, Ireland
| | - Paul D Cotter
- Teagasc Food Research Centre, Moorepark, Fermoy, Cork, Ireland
- APC Microbiome Ireland, University College Cork, Ireland
- VistaMilk SFI Research Centre, Teagasc, Moorepark, Fermoy, Cork, Ireland
|
31
|
Blanco-Míguez A, Beghini F, Cumbo F, McIver LJ, Thompson KN, Zolfo M, Manghi P, Dubois L, Huang KD, Thomas AM, Nickols WA, Piccinno G, Piperni E, Punčochář M, Valles-Colomer M, Tett A, Giordano F, Davies R, Wolf J, Berry SE, Spector TD, Franzosa EA, Pasolli E, Asnicar F, Huttenhower C, Segata N. Extending and improving metagenomic taxonomic profiling with uncharacterized species using MetaPhlAn 4. Nat Biotechnol 2023; 41:1633-1644. [PMID: 36823356 PMCID: PMC10635831 DOI: 10.1038/s41587-023-01688-w] [Citation(s) in RCA: 417] [Impact Index Per Article: 208.5] [Received: 06/07/2022] [Accepted: 01/20/2023] [Indexed: 02/25/2023]
Abstract
Metagenomic assembly enables new organism discovery from microbial communities, but it can only capture few abundant organisms from most metagenomes. Here we present MetaPhlAn 4, which integrates information from metagenome assemblies and microbial isolate genomes for more comprehensive metagenomic taxonomic profiling. From a curated collection of 1.01 M prokaryotic reference and metagenome-assembled genomes, we define unique marker genes for 26,970 species-level genome bins, 4,992 of them taxonomically unidentified at the species level. MetaPhlAn 4 explains ~20% more reads in most international human gut microbiomes and >40% in less-characterized environments such as the rumen microbiome and proves more accurate than available alternatives on synthetic evaluations while also reliably quantifying organisms with no cultured isolates. Application of the method to >24,500 metagenomes highlights previously undetected species to be strong biomarkers for host conditions and lifestyles in human and mouse microbiomes and shows that even previously uncharacterized species can be genetically profiled at the resolution of single microbial strains.
Affiliation(s)
| | | | - Fabio Cumbo
- Department CIBIO, University of Trento, Trento, Italy
| | - Lauren J McIver
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Kelsey N Thompson
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Moreno Zolfo
- Department CIBIO, University of Trento, Trento, Italy
| | - Paolo Manghi
- Department CIBIO, University of Trento, Trento, Italy
| | | | - Kun D Huang
- Department CIBIO, University of Trento, Trento, Italy
| | | | - William A Nickols
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Elisa Piperni
- Department CIBIO, University of Trento, Trento, Italy
- IEO, European Institute of Oncology IRCCS, Milan, Italy
| | | | | | - Adrian Tett
- Department CIBIO, University of Trento, Trento, Italy
- Centre for Microbiology and Environmental Systems Science, University of Vienna, Vienna, Austria
| | | | | | | | - Sarah E Berry
- Department of Nutritional Sciences, King's College London, London, UK
| | - Tim D Spector
- Department of Twin Research, King's College London, London, UK
| | - Eric A Franzosa
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Edoardo Pasolli
- Department of Agricultural Sciences, University of Naples, Naples, Italy
| | | | - Curtis Huttenhower
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
- The Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Nicola Segata
- Department CIBIO, University of Trento, Trento, Italy.
- IEO, European Institute of Oncology IRCCS, Milan, Italy.
|
32
|
Pavlopoulos GA, Baltoumas FA, Liu S, Selvitopi O, Camargo AP, Nayfach S, Azad A, Roux S, Call L, Ivanova NN, Chen IM, Paez-Espino D, Karatzas E, Iliopoulos I, Konstantinidis K, Tiedje JM, Pett-Ridge J, Baker D, Visel A, Ouzounis CA, Ovchinnikov S, Buluç A, Kyrpides NC. Unraveling the functional dark matter through global metagenomics. Nature 2023; 622:594-602. [PMID: 37821698 PMCID: PMC10584684 DOI: 10.1038/s41586-023-06583-7] [Citation(s) in RCA: 61] [Impact Index Per Article: 30.5] [Received: 03/18/2022] [Accepted: 08/30/2023] [Indexed: 10/13/2023]
Abstract
Metagenomes encode an enormous diversity of proteins, reflecting a multiplicity of functions and activities [1,2]. Exploration of this vast sequence space has been limited to a comparative analysis against reference microbial genomes and protein families derived from those genomes. Here, to examine the scale of yet untapped functional diversity beyond what is currently possible through the lens of reference genomes, we develop a computational approach to generate reference-free protein families from the sequence space in metagenomes. We analyse 26,931 metagenomes and identify 1.17 billion protein sequences longer than 35 amino acids with no similarity to any sequences from 102,491 reference genomes or the Pfam database [3]. Using massively parallel graph-based clustering, we group these proteins into 106,198 novel sequence clusters with more than 100 members, doubling the number of protein families obtained from the reference genomes clustered using the same approach. We annotate these families on the basis of their taxonomic, habitat, geographical and gene neighbourhood distributions and, where sufficient sequence diversity is available, predict protein three-dimensional models, revealing novel structures. Overall, our results uncover an enormously diverse functional space, highlighting the importance of further exploring the microbial functional dark matter.
Affiliation(s)
- Georgios A Pavlopoulos
- Institute for Fundamental Biomedical Research, Biomedical Science Research Center Alexander Fleming, Vari, Greece.
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
- Center for New Biotechnologies and Precision Medicine, School of Medicine, National and Kapodistrian University of Athens, Athens, Greece.
| | - Fotis A Baltoumas
- Institute for Fundamental Biomedical Research, Biomedical Science Research Center Alexander Fleming, Vari, Greece
| | - Sirui Liu
- John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, USA
| | - Oguz Selvitopi
- Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Antonio Pedro Camargo
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Stephen Nayfach
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Ariful Azad
- Luddy School of Informatics, Computing and Engineering, Indiana University Bloomington, Bloomington, IN, USA
| | - Simon Roux
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Lee Call
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Natalia N Ivanova
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - I Min Chen
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - David Paez-Espino
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Evangelos Karatzas
- Institute for Fundamental Biomedical Research, Biomedical Science Research Center Alexander Fleming, Vari, Greece
| | - Ioannis Iliopoulos
- Department of Basic Sciences, School of Medicine, University of Crete, Heraklion, Greece
| | | | - James M Tiedje
- Center for Microbial Ecology, Michigan State University, East Lansing, MI, USA
| | - Jennifer Pett-Ridge
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, USA
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | - Axel Visel
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Christos A Ouzounis
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Biological Computation & Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thessalonica, Greece
- Biological Computation & Computational Biology Group, Artificial Intelligence & Information Analysis Lab, School of Informatics, Aristotle University of Thessalonica, Thessalonica, Greece
| | - Sergey Ovchinnikov
- John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, USA
| | - Aydin Buluç
- Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, CA, USA
| | - Nikos C Kyrpides
- DOE Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
|
33
|
Magdy Wasfy R, Mbaye B, Borentain P, Tidjani Alou M, Murillo Ruiz ML, Caputo A, Andrieu C, Armstrong N, Million M, Gerolami R. Ethanol-Producing Enterocloster bolteae Is Enriched in Chronic Hepatitis B-Associated Gut Dysbiosis: A Case-Control Culturomics Study. Microorganisms 2023; 11:2437. [PMID: 37894093 PMCID: PMC10608849 DOI: 10.3390/microorganisms11102437] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/20/2023] [Revised: 09/25/2023] [Accepted: 09/26/2023] [Indexed: 10/29/2023] Open
Abstract
BACKGROUND: Hepatitis B virus (HBV) infection is a global health epidemic that causes fatal complications, leading to liver cirrhosis and hepatocellular carcinoma. The link between HBV-related dysbiosis and specific bacterial taxa is still under investigation. Enterocloster is emerging as a new genus (formerly Clostridium), including Enterocloster bolteae, a gut pathogen previously associated with dysbiosis and human diseases such as autism, multiple sclerosis, and inflammatory bowel diseases. Its role in liver diseases, especially HBV infection, has not been reported. METHODS: The fecal samples of eight patients with chronic HBV infection and ten healthy individuals were analyzed using the high-throughput culturomics approach and compared to 16S rRNA sequencing. Quantification of ethanol, known for its damaging effect on the liver, produced from bacterial strains enriched in chronic HBV was carried out by gas chromatography-mass spectrometry. RESULTS: Using culturomics, 29,120 isolated colonies were analyzed by Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry (MALDI-TOF); 340 species were identified (240 species in chronic HBV samples, 254 species in control samples) belonging to 169 genera and 6 phyla. In the chronic HBV group, 65 species were already known in the literature; 48 were associated with humans but had not been previously found in the gut, and 17 had never been associated with humans previously. Six species were newly isolated in our study. By comparing bacterial species frequency, three bacterial genera were serendipitously found with significantly enriched bacterial diversity in patients with chronic HBV: Enterocloster, Clostridium, and Streptococcus (p = 0.0016, p = 0.041, p = 0.053, respectively). However, metagenomics could not identify this enrichment, possibly owing to its insufficient taxonomic resolution (equivocal assignment of operational taxonomic units). At the species level, the significantly enriched species in the chronic HBV group almost all belonged to class Clostridia, such as Clostridium perfringens, Clostridium sporogenes, Enterocloster aldenensis, Enterocloster bolteae, Enterocloster clostridioformis, and Clostridium innocuum. Two E. bolteae strains, isolated from two patients with chronic HBV infection, showed high ethanol production (27 and 200 mM). CONCLUSIONS: Culturomics allowed us to identify Enterocloster species, specifically E. bolteae, enriched in the gut microbiota of patients with chronic HBV. These species had never been isolated in chronic HBV infection before. Moreover, ethanol production by E. bolteae strains isolated from the chronic HBV group could contribute to liver disease progression. Additionally, culturomics might be critical for better elucidating the relationship between dysbiosis and chronic HBV infection in the future.
Affiliation(s)
- Reham Magdy Wasfy
- IHU Méditerranée Infection, 13005 Marseille, France (M.T.A.); (C.A.)
- MEPHI, IRD, Aix-Marseille Université, 13005 Marseille, France
| | - Babacar Mbaye
- IHU Méditerranée Infection, 13005 Marseille, France (M.T.A.); (C.A.)
- MEPHI, IRD, Aix-Marseille Université, 13005 Marseille, France
| | - Patrick Borentain
- Unité Hépatologie, Hôpital de la Timone, APHM, 13005 Marseille, France;
- Assistance Publique-Hôpitaux de Marseille (APHM), 13005 Marseille, France
| | - Maryam Tidjani Alou
- IHU Méditerranée Infection, 13005 Marseille, France (M.T.A.); (C.A.)
- MEPHI, IRD, Aix-Marseille Université, 13005 Marseille, France
| | - Maria Leticia Murillo Ruiz
- IHU Méditerranée Infection, 13005 Marseille, France (M.T.A.); (C.A.)
- MEPHI, IRD, Aix-Marseille Université, 13005 Marseille, France
| | - Aurelia Caputo
- IHU Méditerranée Infection, 13005 Marseille, France (M.T.A.); (C.A.)
- Assistance Publique-Hôpitaux de Marseille (APHM), 13005 Marseille, France
| | - Claudia Andrieu
- IHU Méditerranée Infection, 13005 Marseille, France (M.T.A.); (C.A.)
- Assistance Publique-Hôpitaux de Marseille (APHM), 13005 Marseille, France
| | - Nicholas Armstrong
- IHU Méditerranée Infection, 13005 Marseille, France (M.T.A.); (C.A.)
- Assistance Publique-Hôpitaux de Marseille (APHM), 13005 Marseille, France
| | - Matthieu Million
- IHU Méditerranée Infection, 13005 Marseille, France (M.T.A.); (C.A.)
- MEPHI, IRD, Aix-Marseille Université, 13005 Marseille, France
- Assistance Publique-Hôpitaux de Marseille (APHM), 13005 Marseille, France
| | - Rene Gerolami
- IHU Méditerranée Infection, 13005 Marseille, France (M.T.A.); (C.A.)
- MEPHI, IRD, Aix-Marseille Université, 13005 Marseille, France
- Unité Hépatologie, Hôpital de la Timone, APHM, 13005 Marseille, France;
- Assistance Publique-Hôpitaux de Marseille (APHM), 13005 Marseille, France
|
34
|
Ho H, Chovatia M, Egan R, He G, Yoshinaga Y, Liachko I, O’Malley R, Wang Z. Integrating chromatin conformation information in a self-supervised learning model improves metagenome binning. PeerJ 2023; 11:e16129. [PMID: 37753177 PMCID: PMC10519199 DOI: 10.7717/peerj.16129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Accepted: 08/28/2023] [Indexed: 09/28/2023] Open
Abstract
Metagenome binning is a key step, downstream of metagenome assembly, to group scaffolds by their genome of origin. Although accurate binning has been achieved on datasets containing multiple samples from the same community, the completeness of binning is often low in datasets with a small number of samples due to a lack of robust species co-abundance information. In this study, we exploited the chromatin conformation information obtained from Hi-C sequencing and developed a new reference-independent algorithm, Metagenome Binning with Abundance and Tetra-nucleotide frequencies-Long Range (metaBAT-LR), to improve the binning completeness of these datasets. This self-supervised algorithm builds a model from a set of high-quality genome bins to predict scaffold pairs that are likely to be derived from the same genome. Then, it applies these predictions to merge incomplete genome bins, as well as recruit unbinned scaffolds. We validated metaBAT-LR's ability to bin-merge and recruit scaffolds on both synthetic and real-world metagenome datasets of varying complexity. Benchmarking against similar software tools suggests that metaBAT-LR uncovers unique bins that were missed by all other methods. MetaBAT-LR is open-source and is available at https://bitbucket.org/project-metabat/metabat-lr.
Affiliation(s)
- Harrison Ho
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Lab, Berkeley, CA, United States
- School of Natural Sciences, University of California, Merced, CA, United States
| | - Mansi Chovatia
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Lab, Berkeley, CA, United States
| | - Rob Egan
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Lab, Berkeley, CA, United States
| | - Guifen He
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Lab, Berkeley, CA, United States
| | - Yuko Yoshinaga
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Lab, Berkeley, CA, United States
| | | | - Ronan O’Malley
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Lab, Berkeley, CA, United States
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Lab, Berkeley, CA, United States
| | - Zhong Wang
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Lab, Berkeley, CA, United States
- School of Natural Sciences, University of California, Merced, CA, United States
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Lab, Berkeley, CA, United States
|
35
|
Liang X, Zhang J, Kim Y, Ho J, Liu K, Keenum I, Gupta S, Davis B, Hepp SL, Zhang L, Xia K, Knowlton KF, Liao J, Vikesland PJ, Pruden A, Heath LS. ARGem: a new metagenomics pipeline for antibiotic resistance genes: metadata, analysis, and visualization. Front Genet 2023; 14:1219297. [PMID: 37811141 PMCID: PMC10558085 DOI: 10.3389/fgene.2023.1219297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Accepted: 09/01/2023] [Indexed: 10/10/2023] Open
Abstract
Antibiotic resistance is of crucial interest to both human and animal medicine. It has been recognized that increased environmental monitoring of antibiotic resistance is needed. Metagenomic DNA sequencing is becoming an attractive method to profile antibiotic resistance genes (ARGs), including a special focus on pathogens. A number of computational pipelines are available and under development to support environmental ARG monitoring; the pipeline we present here is promising for general adoption for the purpose of harmonized global monitoring. Specifically, ARGem is a user-friendly pipeline that provides full-service analysis, from the initial DNA short reads to the final visualization of results. The capture of extensive metadata is also facilitated to support comparability across projects and broader monitoring goals. The ARGem pipeline offers efficient analysis of a modest number of samples along with affordable computational components, though the throughput could be increased through cloud resources, based on the user's configuration. The pipeline components were carefully assessed and selected to satisfy tradeoffs, balancing efficiency and flexibility. It was essential to provide a step to perform short read assembly in a reasonable time frame to ensure accurate annotation of identified ARGs. Comprehensive ARG and mobile genetic element databases are included in ARGem for annotation support. ARGem further includes an expandable set of analysis tools that include statistical and network analysis and supports various useful visualization techniques, including Cytoscape visualization of co-occurrence and correlation networks. The performance and flexibility of the ARGem pipeline are demonstrated with analysis of aquatic metagenomes. The pipeline is freely available at https://github.com/xlxlxlx/ARGem.
Affiliation(s)
- Xiao Liang
- Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Jingyi Zhang
- Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Yoonjin Kim
- Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Josh Ho
- Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Kevin Liu
- Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Ishi Keenum
- Department of Civil and Environmental Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Suraj Gupta
- Interdisciplinary PhD Program in Genetics, Bioinformatics, and Computational Biology, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Benjamin Davis
- Department of Civil and Environmental Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Shannon L. Hepp
- Department of Civil and Environmental Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Liqing Zhang
- Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Kang Xia
- School of Plant and Environmental Science, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Katharine F. Knowlton
- Department of Dairy Science, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Jingqiu Liao
- Department of Civil and Environmental Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Peter J. Vikesland
- Department of Civil and Environmental Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Amy Pruden
- Department of Civil and Environmental Engineering, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
| | - Lenwood S. Heath
- Department of Computer Science, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
|
36
|
Muralitharan RR, Snelson M, Meric G, Coughlan MT, Marques FZ. Guidelines for microbiome studies in renal physiology. Am J Physiol Renal Physiol 2023; 325:F345-F362. [PMID: 37440367 DOI: 10.1152/ajprenal.00072.2023] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/27/2023] [Revised: 06/28/2023] [Accepted: 07/07/2023] [Indexed: 07/15/2023] Open
Abstract
Gut microbiome research has increased dramatically in the last decade, including in renal health and disease. The field is moving from experiments showing mere association to causation using both forward and reverse microbiome approaches, leveraging tools such as germ-free animals, treatment with antibiotics, and fecal microbiota transplantations. However, we are still seeing a gap between discovery and translation that needs to be addressed, so that patients can benefit from microbiome-based therapies. In this guideline paper, we discuss the key considerations that affect the gut microbiome of animals and clinical studies assessing renal function, many of which are often overlooked, resulting in false-positive results. For animal studies, these include suppliers, acclimatization, baseline microbiota and its normalization, littermates and cohort/cage effects, diet, sex differences, age, circadian differences, antibiotics and sweeteners, and models used. Clinical studies have some unique considerations, which include sampling, gut transit time, dietary records, medication, and renal phenotypes. We provide best-practice guidance on sampling, storage, DNA extraction, and methods for microbial DNA sequencing (both 16S rRNA and shotgun metagenome). Finally, we discuss follow-up analyses, including tools available, metrics, and their interpretation, and the key challenges ahead in the microbiome field. By standardizing study designs, methods, and reporting, we will accelerate the findings from discovery to translation and result in new microbiome-based therapies that may improve renal health.
Affiliation(s)
- Rikeish R Muralitharan
- Hypertension Research Laboratory, School of Biological Sciences, Faculty of Science, Monash University, Melbourne, Victoria, Australia
- Institute for Medical Research, Ministry of Health Malaysia, Kuala Lumpur, Malaysia
| | - Matthew Snelson
- Department of Diabetes, Central Clinical School, Monash University, Melbourne, Victoria, Australia
| | - Guillaume Meric
- Cambridge-Baker Systems Genomics Initiative, Baker Heart & Diabetes Institute, Melbourne, Victoria, Australia
- Department of Cardiometabolic Health, University of Melbourne, Melbourne, Victoria, Australia
- Department of Medical Sciences, Molecular Epidemiology and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Department of Cardiovascular Research Translation and Implementation, La Trobe University, Melbourne, Victoria, Australia
| | - Melinda T Coughlan
- Department of Diabetes, Central Clinical School, Monash University, Melbourne, Victoria, Australia
- Drug Discovery Biology, Monash Institute of Pharmaceutical Sciences, Parkville, Victoria, Australia
| | - Francine Z Marques
- Hypertension Research Laboratory, School of Biological Sciences, Faculty of Science, Monash University, Melbourne, Victoria, Australia
- Heart Failure Research Group, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Victorian Heart Institute, Monash University, Melbourne, Victoria, Australia
|
37
|
Mak L, Meleshko D, Danko DC, Barakzai WN, Maharjan S, Belchikov N, Hajirasouliha I. Ariadne: synthetic long read deconvolution using assembly graphs. Genome Biol 2023; 24:197. [PMID: 37641111 PMCID: PMC10463629 DOI: 10.1186/s13059-023-03033-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2022] [Accepted: 08/07/2023] [Indexed: 08/31/2023] Open
Abstract
Synthetic long read sequencing techniques such as UST's TELL-Seq and Loop Genomics' LoopSeq combine 3′ barcoding with standard short-read sequencing to expand the range of linkage resolution from hundreds to tens of thousands of base-pairs. However, the lack of a 1:1 correspondence between a long fragment and a 3′ unique molecular identifier confounds the assignment of linkage between short reads. We introduce Ariadne, a novel assembly graph-based synthetic long read deconvolution algorithm, that can be used to extract single-species read-clouds from synthetic long read datasets to improve the taxonomic classification and de novo assembly of complex populations, such as metagenomes.
Affiliation(s)
- Lauren Mak
- Tri-Institutional Computational Biology & Medicine Program, Weill Cornell Medicine of Cornell University, New York, USA.
- Institute for Computational Biomedicine, Department of Physiology and Biophysics, Weill Cornell Medicine of Cornell University, New York, USA.
| | - Dmitry Meleshko
- Tri-Institutional Computational Biology & Medicine Program, Weill Cornell Medicine of Cornell University, New York, USA
- Institute for Computational Biomedicine, Department of Physiology and Biophysics, Weill Cornell Medicine of Cornell University, New York, USA
| | - David C Danko
- Tri-Institutional Computational Biology & Medicine Program, Weill Cornell Medicine of Cornell University, New York, USA
- Institute for Computational Biomedicine, Department of Physiology and Biophysics, Weill Cornell Medicine of Cornell University, New York, USA
| | - Waris N Barakzai
- Department of Computer Science, New York University, New York, USA
| | - Salil Maharjan
- Institute for Computational Biomedicine, Department of Physiology and Biophysics, Weill Cornell Medicine of Cornell University, New York, USA
| | - Natan Belchikov
- Physiology, Biophysics & Systems Biology Program, Weill Cornell Medicine of Cornell University, New York, USA
| | - Iman Hajirasouliha
- Institute for Computational Biomedicine, Department of Physiology and Biophysics, Weill Cornell Medicine of Cornell University, New York, USA.
- Englander Institute for Precision Medicine, The Meyer Cancer Center, Weill Cornell Medicine of Cornell University, New York, USA.
|
38
|
Zhang ZF, Liu LR, Pan YP, Pan J, Li M. Long-read assembled metagenomic approaches improve our understanding on metabolic potentials of microbial community in mangrove sediments. Microbiome 2023; 11:188. [PMID: 37612768 PMCID: PMC10464287 DOI: 10.1186/s40168-023-01630-x] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 04/15/2023] [Accepted: 07/21/2023] [Indexed: 08/25/2023]
Abstract
BACKGROUND: Mangrove wetlands are coastal ecosystems with important ecological features and provide habitats for diverse microorganisms with key roles in nutrient and biogeochemical cycling. However, the overall metabolic potentials and ecological roles of the microbial community in mangrove sediment remain unclear. In the current study, the microbial and metabolic profiles of prokaryotic and fungal communities in mangrove sediments were investigated using metagenomic analysis based on PacBio single-molecule real time (SMRT) and Illumina sequencing techniques. RESULTS: Compared to Illumina short reads, the incorporation of PacBio long reads significantly contributed to more contiguous assemblies, yielded more than double the number of high-quality metagenome-assembled genomes (MAGs), and improved the novelty of the MAGs. Further metabolic reconstruction for recovered MAGs showed that prokaryotes potentially played an essential role in carbon cycling in mangrove sediment, displaying versatile metabolic potential for degrading organic carbons, fermentation, autotrophy, and carbon fixation. Mangrove fungi also functioned as a player in carbon cycling, potentially involved in the degradation of various carbohydrate and peptide substrates. Notably, a new candidate bacterial phylum named Candidatus Cosmopoliota with a ubiquitous distribution is proposed. Genomic analysis revealed that this new phylum is capable of utilizing various types of organic substrates, anaerobic fermentation, and carbon fixation with the Wood-Ljungdahl (WL) pathway and the reverse tricarboxylic acid (rTCA) cycle. CONCLUSIONS: The study not only highlights the advantages of HiSeq-PacBio Hybrid assembly for a more complete profiling of environmental microbiomes but also expands our understanding of the microbial diversity and potential roles of distinct microbial groups in biogeochemical cycling in mangrove sediment.
Affiliation(s)
- Zhi-Feng Zhang
- Archaeal Biology Center, Institute for Advanced Study, Shenzhen University, Shenzhen, China
- Shenzhen Key Laboratory of Marine Microbiome Engineering, Institute for Advanced Study, Shenzhen University, Shenzhen, China
- Present Address: Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou, 511458, China
- Li-Rui Liu
- Archaeal Biology Center, Institute for Advanced Study, Shenzhen University, Shenzhen, China
- Shenzhen Key Laboratory of Marine Microbiome Engineering, Institute for Advanced Study, Shenzhen University, Shenzhen, China
- Yue-Ping Pan
- Archaeal Biology Center, Institute for Advanced Study, Shenzhen University, Shenzhen, China
- Shenzhen Key Laboratory of Marine Microbiome Engineering, Institute for Advanced Study, Shenzhen University, Shenzhen, China
- Jie Pan
- Archaeal Biology Center, Institute for Advanced Study, Shenzhen University, Shenzhen, China
- Shenzhen Key Laboratory of Marine Microbiome Engineering, Institute for Advanced Study, Shenzhen University, Shenzhen, China
- Meng Li
- Archaeal Biology Center, Institute for Advanced Study, Shenzhen University, Shenzhen, China.
- Shenzhen Key Laboratory of Marine Microbiome Engineering, Institute for Advanced Study, Shenzhen University, Shenzhen, China.
|
39
|
Venbrux M, Crauwels S, Rediers H. Current and emerging trends in techniques for plant pathogen detection. FRONTIERS IN PLANT SCIENCE 2023; 14:1120968. [PMID: 37223788 PMCID: PMC10200959 DOI: 10.3389/fpls.2023.1120968] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/10/2022] [Accepted: 03/21/2023] [Indexed: 05/25/2023]
Abstract
Plant pathogenic microorganisms cause substantial yield losses in several economically important crops, resulting in economic and social adversity. The spread of such plant pathogens and the emergence of new diseases are facilitated by human practices such as monoculture farming and global trade. Therefore, the early detection and identification of pathogens is of utmost importance to reduce the associated agricultural losses. In this review, techniques that are currently available to detect plant pathogens are discussed, including culture-based, PCR-based, sequencing-based, and immunology-based techniques. Their working principles are explained, followed by an overview of the main advantages and disadvantages, and examples of their use in plant pathogen detection. In addition to the more conventional and commonly used techniques, we also point to some recent developments in the field of plant pathogen detection. Point-of-care devices, including biosensors, have gained in popularity. These devices can provide fast analysis, are easy to use, and most importantly can be used for on-site diagnosis, allowing farmers to make rapid disease management decisions.
Affiliation(s)
- Marc Venbrux
- Centre of Microbial and Plant Genetics, Laboratory for Process Microbial Ecology and Bioinspirational Management (PME&BIM), Department of Microbial and Molecular Systems (M2S), KU Leuven, Leuven, Belgium
- Sam Crauwels
- Centre of Microbial and Plant Genetics, Laboratory for Process Microbial Ecology and Bioinspirational Management (PME&BIM), Department of Microbial and Molecular Systems (M2S), KU Leuven, Leuven, Belgium
- Leuven Plant Institute (LPI), KU Leuven, Leuven, Belgium
- Hans Rediers
- Centre of Microbial and Plant Genetics, Laboratory for Process Microbial Ecology and Bioinspirational Management (PME&BIM), Department of Microbial and Molecular Systems (M2S), KU Leuven, Leuven, Belgium
- Leuven Plant Institute (LPI), KU Leuven, Leuven, Belgium
|
40
|
Mineeva O, Danciu D, Schölkopf B, Ley RE, Rätsch G, Youngblut ND. ResMiCo: Increasing the quality of metagenome-assembled genomes with deep learning. PLoS Comput Biol 2023; 19:e1011001. [PMID: 37126495 PMCID: PMC10174551 DOI: 10.1371/journal.pcbi.1011001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/15/2022] [Revised: 05/11/2023] [Accepted: 03/06/2023] [Indexed: 05/02/2023] Open
Abstract
The number of published metagenome assemblies is rapidly growing due to advances in sequencing technologies. However, sequencing errors, variable coverage, repetitive genomic regions, and other factors can produce misassemblies, which are challenging to detect for taxonomically novel genomic data. Assembly errors can affect all downstream analyses of the assemblies. Accuracy for the state of the art in reference-free misassembly prediction does not exceed an AUPRC of 0.57, and it is not clear how well these models generalize to real-world data. Here, we present the Residual neural network for Misassembled Contig identification (ResMiCo), a deep learning approach for reference-free identification of misassembled contigs. To develop ResMiCo, we first generated a training dataset of unprecedented size and complexity that can be used for further benchmarking and developments in the field. Through rigorous validation, we show that ResMiCo is substantially more accurate than the state of the art, and the model is robust to novel taxonomic diversity and varying assembly methods. ResMiCo estimated 7% misassembled contigs per metagenome across multiple real-world datasets. We demonstrate how ResMiCo can be used to optimize metagenome assembly hyperparameters to improve accuracy, instead of optimizing solely for contiguity. The accuracy, robustness, and ease-of-use of ResMiCo make the tool suitable for general quality control of metagenome assemblies and assembly methodology optimization.
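The AUPRC figure quoted in this abstract is an area under the precision-recall curve. As a rough illustration only, the sketch below computes average precision, one common estimator of that area, for a set of contigs labelled as misassembled (1) or correct (0). The function name and the toy scores are hypothetical, and the abstract does not say that ResMiCo's evaluation computes the metric in exactly this way.

```python
def average_precision(labels, scores):
    """Average precision: mean of the precision values at each true positive,
    taken in order of decreasing score (a common estimator of AUPRC)."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank items from the highest to the lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            true_positives += 1
            total += true_positives / rank  # precision at this recall step
    return total / n_pos


# Toy example: 1 = misassembled contig, 0 = correctly assembled contig.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
toy_ap = average_precision(toy_labels, toy_scores)
```

Because misassembled contigs are usually a small minority, a precision-recall summary of this kind is less flattered by the many true negatives than an accuracy figure would be.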
Affiliation(s)
- Olga Mineeva
- Department of Computer Science, ETH Zürich, Zürich, Switzerland
- Department of Empirical Inference, Max Planck Institute for Intelligent Systems, Tübingen, Germany
- Swiss Institute for Bioinformatics, Lausanne, Switzerland
- Daniel Danciu
- Department of Computer Science, ETH Zürich, Zürich, Switzerland
- Bernhard Schölkopf
- Department of Computer Science, ETH Zürich, Zürich, Switzerland
- Department of Empirical Inference, Max Planck Institute for Intelligent Systems, Tübingen, Germany
- ETH AI center, ETH Zürich, Zürich, Switzerland
- Ruth E Ley
- Department of Microbiome Science, Max Planck Institute for Biology, Tübingen, Germany
- Gunnar Rätsch
- Department of Computer Science, ETH Zürich, Zürich, Switzerland
- Swiss Institute for Bioinformatics, Lausanne, Switzerland
- ETH AI center, ETH Zürich, Zürich, Switzerland
- Department of Biology, ETH Zürich, Zürich, Switzerland
- Medical Informatics Unit, Zürich University Hospital, Zürich, Switzerland
- Nicholas D Youngblut
- Department of Microbiome Science, Max Planck Institute for Biology, Tübingen, Germany
|
41
|
Esquerra-Ruvira B, Baquedano I, Ruiz R, Fernandez A, Montoliu L, Mojica FJM. Identification of the EH CRISPR-Cas9 system on a metagenome and its application to genome engineering. Microb Biotechnol 2023. [PMID: 37097160 DOI: 10.1111/1751-7915.14266] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/06/2023] [Revised: 04/04/2023] [Accepted: 04/12/2023] [Indexed: 04/26/2023] Open
Abstract
Non-coding RNAs (crRNAs) produced from clustered regularly interspaced short palindromic repeats (CRISPR) loci and CRISPR-associated (Cas) proteins of the prokaryotic CRISPR-Cas systems form complexes that interfere with the spread of transmissible genetic elements through Cas-catalysed cleavage of foreign genetic material matching the guide crRNA sequences. The easily programmable targeting of nucleic acids enabled by these ribonucleoproteins has facilitated the implementation of CRISPR-based molecular biology tools for in vivo and in vitro modification of DNA and RNA targets. Despite the diversity of DNA-targeting Cas nucleases so far identified, native and engineered derivatives of the Streptococcus pyogenes SpCas9 are the most widely used for genome engineering, at least in part due to their catalytic robustness and the requirement of an exceptionally short motif (5'-NGG-3' PAM) flanking the target sequence. However, the large size of the SpCas9 variants impairs the delivery of the tool to eukaryotic cells and smaller alternatives are desirable. Here, we identify in a metagenome a new CRISPR-Cas9 system associated with a smaller Cas9 protein (EHCas9) that targets DNA sequences flanked by 5'-NGG-3' PAMs. We develop a simplified EHCas9 tool that specifically cleaves DNA targets and is functional for genome editing applications in prokaryotes and eukaryotic cells.
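The requirement that targets be flanked by a 5'-NGG-3' PAM can be illustrated with a short scan for candidate sites. This is a minimal sketch under stated assumptions: the function name is hypothetical, the 20-nt protospacer length is a typical SpCas9 guide length used here as a default (the abstract gives no length for EHCas9), and only the given strand is searched.

```python
import re


def find_ngg_targets(seq, protospacer_len=20):
    """Return (start, protospacer, pam) tuples for every site on the given
    strand where a protospacer is immediately followed by an NGG PAM."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Toy example with a shortened protospacer so the result is easy to read.
toy_hits = find_ngg_targets("ACGTAGGCGG", protospacer_len=4)
```

A fuller search would also scan the reverse complement, since a PAM on either strand can define a target site.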
Affiliation(s)
- Belen Esquerra-Ruvira
- Department of Physiology, Genetics and Microbiology, University of Alicante, Alicante, Spain
- Ignacio Baquedano
- Department of Physiology, Genetics and Microbiology, University of Alicante, Alicante, Spain
- Raul Ruiz
- Department of Physiology, Genetics and Microbiology, University of Alicante, Alicante, Spain
- Almudena Fernandez
- Department of Molecular and Cellular Biology, National Centre for Biotechnology (CNB-CSIC), Madrid, Spain
- Centre for Biomedical Network Research on Rare Diseases (CIBERER-ISCIII), Madrid, Spain
- Lluis Montoliu
- Department of Molecular and Cellular Biology, National Centre for Biotechnology (CNB-CSIC), Madrid, Spain
- Centre for Biomedical Network Research on Rare Diseases (CIBERER-ISCIII), Madrid, Spain
- Francisco J M Mojica
- Department of Physiology, Genetics and Microbiology, University of Alicante, Alicante, Spain
- Multidisciplinary Institute for Environmental Studies "Ramón Margalef", University of Alicante, Alicante, Spain
|
42
|
Rong Lee M, Kim JC, Eun Park S, Kim WJ, Su Kim J. Detection of Viral Genes in Metarhizium anisopliae JEF-290-infected longhorned tick, Haemaphysalis longicornis using transcriptome analysis. J Invertebr Pathol 2023; 198:107926. [PMID: 37087092 DOI: 10.1016/j.jip.2023.107926] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/15/2022] [Revised: 04/13/2023] [Accepted: 04/16/2023] [Indexed: 04/24/2023]
Abstract
Ticks are carriers of viruses that can cause disease in humans and animals. The longhorned tick (Haemaphysalis longicornis; LHT), for example, transmits severe fever with thrombocytopenia syndrome virus (SFTSV) to humans, and the population of ticks is growing due to increases in temperature caused by climate change. As ticks carry primarily RNA viruses, there is a need to study the possibility of detecting new viruses through tick virome analysis. In this study, viruses in LHTs collected in Korea were investigated and virus titers in ticks exposed to the entomopathogenic fungus Metarhizium anisopliae JEF-290 were analyzed. Total RNA was extracted from the collected ticks, and short reads were obtained from Illumina sequencing. A total of 50,024 contigs with coding capacity were obtained after de novo assembly of the reads with the metaSPAdes genome assembler. A series of BLAST-based analyses using the GenBank database was performed to screen viral contigs, and three putative virus species were identified from the tick meta-transcriptome: Alongshan virus (ALSV), Denso virus and Taggert virus. Measurements of virus expression levels in infected and non-infected LHTs failed to detect substantial differences. However, we suggest that LHT can spread not only SFTSV, but also various other disease-causing viruses over large areas of the world. The phylogenetic analysis of ALSV glycoproteins suggests that genetic differences in ALSV could be due to host differences as well as regional differences. Viral metagenome analysis can be used as a tool to manage future outbreaks of tick-borne disease by detecting unknown viruses.
Affiliation(s)
- Mi Rong Lee
- Department of Agricultural Biology, College of Agriculture & Life Sciences, Jeonbuk National University, Jeonju 54596, Korea
- Jong-Cheol Kim
- Department of Agricultural Biology, College of Agriculture & Life Sciences, Jeonbuk National University, Jeonju 54596, Korea
- So Eun Park
- Department of Agricultural Biology, College of Agriculture & Life Sciences, Jeonbuk National University, Jeonju 54596, Korea
- Jae Su Kim
- Department of Agricultural Biology, College of Agriculture & Life Sciences, Jeonbuk National University, Jeonju 54596, Korea; Department of Agricultural Convergence Technology, Jeonbuk National University, Jeonju 54596, Republic of Korea.
|
43
|
Yorki S, Shea T, Cuomo CA, Walker BJ, LaRocque RC, Manson AL, Earl AM, Worby CJ. Comparison of long- and short-read metagenomic assembly for low-abundance species and resistance genes. Brief Bioinform 2023; 24:bbad050. [PMID: 36804804 PMCID: PMC10025444 DOI: 10.1093/bib/bbad050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Revised: 01/13/2023] [Accepted: 01/26/2023] [Indexed: 02/23/2023] Open
Abstract
Recent technological and computational advances have made metagenomic assembly a viable approach to achieving high-resolution views of complex microbial communities. In previous benchmarking, short-read (SR) metagenomic assemblers had the highest accuracy, long-read (LR) assemblers generated the most contiguous sequences and hybrid (HY) assemblers balanced length and accuracy. However, no assessments have specifically compared the performance of these assemblers on low-abundance species, which include clinically relevant organisms in the gut. We generated semi-synthetic LR and SR datasets by spiking small and increasing amounts of Escherichia coli isolate reads into fecal metagenomes and, using different assemblers, examined E. coli contigs and the presence of antibiotic resistance genes (ARGs). For ARG assembly, although SR assemblers recovered more ARGs with high accuracy, even at low coverages, LR assemblies allowed for the placement of ARGs within longer, E. coli-specific contigs, thus pinpointing their taxonomic origin. HY assemblies identified resistance genes with high accuracy and had lower contiguity than LR assemblies. Each assembler type's strengths were maintained even when our isolate was spiked in with a competing strain, which fragmented and reduced the accuracy of all assemblies. For strain characterization and determining gene context, LR assembly is optimal, while for base-accurate gene identification, SR assemblers outperform other options. HY assembly offers contiguity and base accuracy, but requires generating data on multiple platforms, and may suffer high misassembly rates when strain diversity exists. Our results highlight the trade-offs associated with each approach for recovering low-abundance taxa, and that the optimal approach is goal-dependent.
Affiliation(s)
- Sosie Yorki
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Terrance Shea
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Christina A Cuomo
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Bruce J Walker
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Applied Invention, LLC, Cambridge, MA, USA
- Regina C LaRocque
- Division of Infectious Diseases, Massachusetts General Hospital, Boston, MA, USA
- Abigail L Manson
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Ashlee M Earl
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Colin J Worby
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
|
44
|
Soto MA, Desai D, Bannon C, LaRoche J, Bertrand EM. Cobalamin producers and prokaryotic consumers in the Northwest Atlantic. Environ Microbiol 2023. [PMID: 36861357 DOI: 10.1111/1462-2920.16363] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/15/2023] [Accepted: 02/26/2023] [Indexed: 03/03/2023]
Abstract
Cobalamin availability can influence primary productivity and ecological interactions in marine microbial communities. The characterization of cobalamin sources and sinks is a first step in investigating cobalamin dynamics and its impact on productivity. Here, we identify potential cobalamin sources and sinks on the Scotian Shelf and Slope in the Northwest Atlantic Ocean. Functional and taxonomic annotation of bulk metagenomic reads, combined with analysis of genome bins, was used to identify potential cobalamin sources and sinks. Cobalamin synthesis potential was mainly attributed to Rhodobacteraceae, Thaumarchaeota, and cyanobacteria (Synechococcus and Prochlorococcus). Cobalamin remodelling potential was mainly attributed to Alteromonadales, Pseudomonadales, Rhizobiales, Oceanospirillales, Rhodobacteraceae, and Verrucomicrobia, while potential cobalamin consumers include Flavobacteriaceae, Actinobacteria, Porticoccaceae, Methylophilaceae, and Thermoplasmatota. These complementary approaches identified taxa with the potential to be involved in cobalamin cycling on the Scotian Shelf and revealed genomic information required for further characterization. The Cob operon of Rhodobacterales bacterium HTCC2255, a strain with known importance in cobalamin cycling, was similar to that of a major cobalamin producer bin, suggesting that a related strain may represent a critical cobalamin source in this region. These results enable future inquiries that will enhance our understanding of how cobalamin shapes microbial interdependencies and productivity in this region.
Affiliation(s)
- Maria A Soto
- Department of Biology and Institute for Comparative Genomics, Dalhousie University, Halifax, Nova Scotia, Canada
- Dhwani Desai
- Department of Biology and Institute for Comparative Genomics, Dalhousie University, Halifax, Nova Scotia, Canada
- Catherine Bannon
- Department of Biology and Institute for Comparative Genomics, Dalhousie University, Halifax, Nova Scotia, Canada
- Julie LaRoche
- Department of Biology and Institute for Comparative Genomics, Dalhousie University, Halifax, Nova Scotia, Canada
- Erin M Bertrand
- Department of Biology and Institute for Comparative Genomics, Dalhousie University, Halifax, Nova Scotia, Canada
| |
Collapse
|
45
|
Wang T, Wang XW, Lee-Sarwar KA, Litonjua AA, Weiss ST, Sun Y, Maslov S, Liu YY. Predicting metabolomic profiles from microbial composition through neural ordinary differential equations. NAT MACH INTELL 2023; 5:284-293. [PMID: 38223254 PMCID: PMC10786629 DOI: 10.1038/s42256-023-00627-3] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 06/29/2022] [Accepted: 02/03/2023] [Indexed: 03/14/2023]
Abstract
Characterizing the metabolic profile of a microbial community is crucial for understanding its biological function and its impact on the host or environment. Metabolomics experiments directly measuring these profiles are difficult and expensive, while sequencing methods quantifying the species composition of microbial communities are well-developed and relatively cost-effective. Computational methods that are capable of predicting metabolomic profiles from microbial compositions can save the considerable effort needed for experimental metabolomic profiling. Yet, despite existing efforts, we still lack a computational method with high prediction power, general applicability, and great interpretability. Here we develop a method, mNODE (Metabolomic profile predictor using Neural Ordinary Differential Equations), based on a state-of-the-art family of deep neural network models. We show compelling evidence that mNODE outperforms existing methods in predicting the metabolomic profiles of human microbiomes and several environmental microbiomes. Moreover, in the case of human gut microbiomes, mNODE can naturally incorporate dietary information to further enhance the prediction of metabolomic profiles. In addition, susceptibility analysis of mNODE enables us to reveal microbe-metabolite interactions, which can be validated using both synthetic and real data. The presented results demonstrate that mNODE is a powerful tool to investigate the microbiome-diet-metabolome relationship, facilitating future research on precision nutrition.
Affiliation(s)
- Tong Wang
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Xu-Wen Wang
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Kathleen A. Lee-Sarwar
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Division of Allergy and Clinical Immunology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Augusto A. Litonjua
- Pediatric Pulmonology, Golisano Children’s Hospital, University of Rochester, Rochester, NY 14642, USA
- Scott T. Weiss
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Yizhou Sun
- Department of Computer Science, University of California, Los Angeles, USA
- Sergei Maslov
- Center for Artificial Intelligence and Modeling, The Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Yang-Yu Liu
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Center for Artificial Intelligence and Modeling, The Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
|
46
|
Lu Y, Ge C, Cai B, Xu Q, Kong R, Chang S. Antibody sequences assembly method based on weighted de Bruijn graph. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:6174-6190. [PMID: 37161102 DOI: 10.3934/mbe.2023266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
With the development of next-generation protein sequencing technologies, sequence assembly algorithms have become a key technology for the de novo sequencing process. At present, the existing methods can address the assembly of an unknown single protein chain. However, for monoclonal antibodies with light and heavy chains, assembly is still an unsolved problem. To address this problem, we propose a new assembly method, DBAS, which integrates the quality scores and sequence alignment scores from de novo sequencing peptides into a weighted de Bruijn graph to assemble the final protein sequences. The established method is used to assemble sequences from two datasets with mixed light and heavy chains from antibodies. The results show that DBAS can assemble long antibody sequences for both mixed light and heavy chains and single chains. In addition, DBAS is able to distinguish the light and heavy chains by using BLAST sequence alignment. The results show that the algorithm has good performance for both target sequence coverage and contig assembly accuracy.
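The idea of folding per-peptide scores into a weighted de Bruijn graph can be sketched generically. The code below is not DBAS: it is a minimal illustration that assumes each peptide carries a single confidence score, adds that score to every k-mer edge the peptide contributes, and then walks the graph greedily along the heaviest unused edge. All function names, the choice of k, and the toy peptides are hypothetical.

```python
from collections import defaultdict


def build_weighted_graph(peptides, k=3):
    """Build a de Bruijn graph over (k-1)-mers. Each k-mer in a peptide adds
    that peptide's score to the weight of the edge prefix -> suffix."""
    graph = defaultdict(lambda: defaultdict(float))
    for seq, score in peptides:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            graph[kmer[:-1]][kmer[1:]] += score
    return graph


def greedy_path(graph, start):
    """Extend a sequence from `start`, always following the heaviest edge
    that has not been used yet; stop when no such edge exists."""
    path, node, used = start, start, set()
    while node in graph:
        candidates = [(weight, nxt) for nxt, weight in graph[node].items()
                      if (node, nxt) not in used]
        if not candidates:
            break
        _, nxt = max(candidates)
        used.add((node, nxt))
        path += nxt[-1]  # each step appends one residue
        node = nxt
    return path


# Toy peptides as (sequence, confidence score); the third is a low-confidence variant.
toy_peptides = [("EVQLVE", 0.9), ("QLVESG", 0.8), ("QLVQSG", 0.2)]
toy_graph = build_weighted_graph(toy_peptides, k=3)
toy_contig = greedy_path(toy_graph, "EV")
```

In this toy case the walk follows the well-supported branch at the `LV` node and ignores the low-scoring variant, which is the behaviour that edge weighting is meant to encourage. A real assembler would need a less naive traversal than this greedy walk.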
Affiliation(s)
- Yi Lu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Cheng Ge
- Key Laboratory of Marine Drugs, Chinese Ministry of Education, School of Medicine and Pharmacy, Ocean University of China, Qingdao 266003, China
- Biao Cai
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Qing Xu
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Ren Kong
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
- Shan Chang
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
|
47
|
Morrison AG, Sarkar S, Umar S, Lee STM, Thomas SM. The Contribution of the Human Oral Microbiome to Oral Disease: A Review. Microorganisms 2023; 11:318. [PMID: 36838283 PMCID: PMC9962706 DOI: 10.3390/microorganisms11020318] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 12/19/2022] [Revised: 01/16/2023] [Accepted: 01/20/2023] [Indexed: 01/28/2023] Open
Abstract
The oral microbiome is an emerging field of study that has been a topic of discussion since the development of next-generation sequencing and the implementation of the Human Microbiome Project. This article reviews the current literature surrounding the oral microbiome, briefly highlighting the most recent methods of microbiome characterization, including cutting-edge omics, databases for the microbiome, and areas with current gaps in knowledge. This article also describes reports on microorganisms contained in the oral microbiome, which include viruses, archaea, fungi, and bacteria, and provides an in-depth analysis of their significant roles in tissue homeostasis. Finally, we detail key bacteria involved in oral disease, including oral cancer, and the current research surrounding their role in stimulation of inflammatory cytokines, the role of gingival crevicular fluid in periodontal disease, the creation of a network of interactions between microorganisms, the influence of the planktonic microbiome and cospecies biofilms, and the implications of antibiotic resistance. This paper provides a comprehensive literature analysis while also identifying gaps in knowledge to enable future studies to be conducted.
Affiliation(s)
- Austin Gregory Morrison
- Department of Cancer Biology, University of Kansas Medical Center, Kansas City, KS 66160, USA
- Soumyadev Sarkar
- Division of Biology, Kansas State University, Manhattan, KS 66506, USA
- Shahid Umar
- Department of General Surgery, University of Kansas Medical Center, Kansas City, KS 66160, USA
- Sonny T. M. Lee
- Division of Biology, Kansas State University, Manhattan, KS 66506, USA
- 1717 Claflin Road, 136 Ackert Hall, Manhattan, KS 66506, USA
- Sufi Mary Thomas
- Department of Cancer Biology, University of Kansas Medical Center, Kansas City, KS 66160, USA
- Departments of Otolaryngology, University of Kansas Medical Center, Kansas City, KS 66160, USA
- Departments of Anatomy and Cell Biology, University of Kansas Medical Center, Kansas City, KS 66160, USA
- 3901 Rainbow Blvd., 4031 Wahl Hall East, MS 3040, Kansas City, KS 66160, USA
|
48
|
Martin S, Ayling M, Patrono L, Caccamo M, Murcia P, Leggett RM. Capturing variation in metagenomic assembly graphs with MetaCortex. Bioinformatics 2023; 39:6986127. [PMID: 36722204 PMCID: PMC9889960 DOI: 10.1093/bioinformatics/btad020] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/20/2022] [Revised: 11/10/2022] [Accepted: 01/11/2023] [Indexed: 01/13/2023] Open
Abstract
MOTIVATION: The assembly of contiguous sequence from metagenomic samples presents a particular challenge, due to the presence of multiple species, often closely related, at varying levels of abundance. Capturing diversity within species, for example, viral haplotypes, or bacterial strain-level diversity, is even more challenging. RESULTS: We present MetaCortex, a metagenome assembler that captures intra-species diversity by searching for signatures of local variation along assembled sequences in the underlying assembly graph and outputting these sequences in sequence graph format. We show that MetaCortex produces accurate assemblies with higher genome coverage and contiguity than other popular metagenomic assemblers on mock viral communities with high levels of strain-level diversity and on simulated communities containing simulated strains. AVAILABILITY AND IMPLEMENTATION: Source code is freely available to download from https://github.com/SR-Martin/metacortex. MetaCortex is implemented in C and supported on macOS and Linux. The version used for the results presented in this article is available at doi.org/10.5281/zenodo.7273627. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Pablo Murcia
- MRC-University of Glasgow Centre for Virus Research, Glasgow G61 1QH, UK
|
49
|
Mangalea MR, Keift K, Duerkop BA, Anantharaman K. Assembly and Annotation of Viral Metagenomes from Short-Read Sequencing Data. Methods Mol Biol 2023; 2649:317-337. [PMID: 37258871 DOI: 10.1007/978-1-0716-3072-3_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
Viral metagenomics enables the detection, characterization, and quantification of viral sequences present in shotgun-sequenced datasets of purified virus-like particles and whole metagenomes. Short single- or paired-end read runs derived from next-generation sequencing (Illumina) are a principal platform for metagenomics, and assembly of short reads allows for the identification of distinguishing viral signatures and complex genomic features for taxonomy and functional annotation. Here we describe the identification and characterization of viral genome sequences, from both bacteriophages and eukaryotic viruses, in a cohort of human stool samples, using multiple methods. Following the purification of virus-like particles, sequencing, quality refinement, and genome assembly, we begin the protocol with raw short reads deposited in an open-source nucleotide archive. We highlight the use of VIBRANT, an automated computational tool for the characterization of microbial viruses and their viral community function. Finally, we also describe an alternative assembly-free option of mapping reads to established databases of reference genomes and previously characterized metagenome-assembled viral genomes.
Affiliation(s)
- Mihnea R Mangalea
- Department of Immunology and Microbiology, University of Colorado School of Medicine, Aurora, CO, USA
- Kristopher Keift
- Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
- Breck A Duerkop
- Department of Immunology and Microbiology, University of Colorado School of Medicine, Aurora, CO, USA
|
50
|
Santana-Pereira ALR. Identification of PKS Gene Clusters from Metagenomic Libraries Using a Next-Generation Sequencing Approach. Methods Mol Biol 2023; 2555:73-90. [PMID: 36306079 DOI: 10.1007/978-1-0716-2795-2_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Microbial secondary metabolites have been an important source of bioactive compounds with diverse applications from medicine to agriculture, notably those encoded by polyketide synthase (PKS) clusters, due to their astounding chemical diversity. While most discovered compounds originate from culturable microorganisms, yet-to-be-cultured microbes represent a reservoir of previously inaccessible compounds. The advent and development of metagenomics have allowed the characterization not only of these microorganisms but also of their metabolic potential, making viable the prospection of environmental PKSs for natural product discovery. Study of environmental PKSs often relies on the construction of metagenomic libraries and their mining, with clones containing PKS clusters identified via amplification of conserved domains and then screened for an activity of interest. Compounds produced by clones exhibiting the desired bioactivity can be isolated and characterized. However, these approaches can be less sensitive and biased against more divergent clusters, in addition to precluding the use of bioinformatics for cluster characterization prior to expression. While direct shotgun sequencing of metagenomes has identified and profiled a great number of PKSs from different environments and yet-to-be-cultured microorganisms, it does not lend itself well to heterologous expression, the crux of natural product discovery. Here, we describe a strategy for sequencing entire metagenomic libraries while maintaining correspondence between sequence and clone, allowing the full characterization and annotation of all clusters present in a library using bioinformatic tools and then seamlessly passing clones of interest for activity screening through heterologous expression. Once a library is sequenced, the methods herein can be adapted for the mining of any biosynthetic gene cluster of interest within a metagenomic library.
|