1
Wang T, Gao M. Utilizing a deep learning model based on BERT for identifying enhancers and their strength. PLoS One 2025; 20:e0320085. [PMID: 40203028 PMCID: PMC11981215 DOI: 10.1371/journal.pone.0320085] [Received: 10/27/2024] [Accepted: 02/12/2025] [Indexed: 04/11/2025]
Abstract
An enhancer is a specific DNA sequence, typically located within, upstream of, or downstream of a gene, that serves as a pivotal element in the regulation of eukaryotic gene transcription. Recognizing enhancers is therefore important for understanding gene expression regulatory systems. Although several useful predictive models have been proposed, they still have deficiencies. To address these limitations, we propose DNABERT2-Enhancer, a deep learning model based on the transformer architecture, designed to recognize enhancers (enhancer versus non-enhancer) and to identify their activity (strong versus weak enhancer). More specifically, DNABERT2-Enhancer is composed of a BERT model for feature extraction and a CNN model for enhancer classification. The parameters of the BERT model are initialized from the pre-trained DNABERT-2 language model. The BERT model is then fine-tuned on the enhancer recognition task through transfer learning to convert the original sequence into feature vectors. Subsequently, the CNN learns from the feature vectors generated by BERT and produces the prediction results. Compared with existing predictors evaluated on the identical dataset, our approach demonstrates superior performance. This suggests that the model will be a useful instrument for academic research on enhancer recognition.
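The two-stage task described in this abstract (enhancer versus non-enhancer, then strong versus weak) can be sketched as a simple cascade. This is only an illustrative sketch: `classify_sequence`, the 0.5 threshold, and the `gc_fraction` placeholder scorer are assumptions made for the example and are not part of DNABERT2-Enhancer, whose scores would come from the fine-tuned BERT and CNN components.

```python
from typing import Callable

# A scorer maps a DNA sequence to a probability-like value in [0, 1].
Scorer = Callable[[str], float]


def classify_sequence(seq: str, enhancer_score: Scorer, strength_score: Scorer,
                      threshold: float = 0.5) -> str:
    """Stage 1: enhancer vs. non-enhancer. Stage 2: strong vs. weak."""
    if enhancer_score(seq) < threshold:
        return "non-enhancer"
    if strength_score(seq) >= threshold:
        return "strong enhancer"
    return "weak enhancer"


def gc_fraction(seq: str) -> float:
    """Placeholder scorer (hypothetical, not a real predictor): fraction of G/C bases."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq.upper()) / len(seq)


if __name__ == "__main__":
    # Toy sequences only; a real pipeline would plug in trained model scores.
    for seq in ("ATATAT", "GCGCGC"):
        print(seq, classify_sequence(seq, gc_fraction, gc_fraction))
```

Keeping the two stages as separate scorer arguments mirrors the abstract's split between recognition and strength identification, and lets either stage be swapped out independently.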
Affiliation(s)
- Tong Wang
- School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai, China
- Mengqi Gao
- School of Computer and Information Engineering, Shanghai Polytechnic University, Shanghai, China
2
Vashisht S, Parisi C, Winata CL. Computational analysis of congenital heart disease associated SNPs: unveiling their impact on the gene regulatory system. BMC Genomics 2025; 26:55. [PMID: 39838281 PMCID: PMC11749323 DOI: 10.1186/s12864-025-11232-6] [Received: 03/20/2024] [Accepted: 01/09/2025] [Indexed: 01/23/2025]
Abstract
Congenital heart disease (CHD) is a prevalent condition characterized by defective heart development, causing premature death and stillbirths among infants. Genome-wide association studies (GWASs) have provided insights into the role of genetic variants in CHD pathogenesis through the identification of a comprehensive set of single-nucleotide polymorphisms (SNPs). Notably, 90-95% of these variants reside in the noncoding genome, complicating the understanding of their underlying mechanisms. Here, we developed a systematic computational pipeline for the identification and analysis of CHD-associated SNPs spanning both coding and noncoding regions of the genome. Initially, we curated a thorough dataset of SNPs from GWAS-catalog and ClinVar database and filtered them based on CHD-related traits. Subsequently, these CHD-SNPs were annotated and categorized into noncoding and coding regions based on their location. To study the functional implications of noncoding CHD-SNPs, we cross-validated them with enhancer-specific histone modification marks from developing human heart across 9 Carnegie stages and identified potential cardiac enhancers. This approach led to the identification of 2,056 CHD-associated putative enhancers (CHD-enhancers), 38.9% of them overlapping with known enhancers catalogued in human enhancer disease database. We identified heart-related transcription factor binding sites within these CHD-enhancers, offering insights into the impact of SNPs on TF binding. Conservation analysis further revealed that many of these CHD-enhancers were highly conserved across vertebrates, suggesting their evolutionary significance. Utilizing heart-specific expression quantitative trait loci data, we further identified a subset of 63 CHD-SNPs with regulatory potential distributed across various cardiac tissues. 
Concurrently, coding CHD-SNPs were represented as a protein interaction network; subsequent binding energy analysis, focused on a pair of proteins within this network, pinpointed a deleterious coding CHD-SNP, rs770030288, located in the C2 domain of the MYBPC3 protein. Overall, our findings demonstrate that SNPs have the potential to disrupt gene regulatory systems, either by affecting enhancer sequences or by modulating protein-protein interactions, which can lead to abnormal developmental processes contributing to CHD pathogenesis.
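The step of placing noncoding SNPs inside candidate enhancer regions amounts to a point-in-interval lookup. A minimal sketch follows, assuming 0-based half-open intervals; the function name `snps_in_enhancers`, the tuple layout, and all coordinates and identifiers in the example are hypothetical and are not taken from the authors' pipeline.

```python
from collections import defaultdict


def snps_in_enhancers(snps, enhancers):
    """Return IDs of SNPs whose position falls inside any enhancer interval.

    snps:      iterable of (snp_id, chrom, pos)
    enhancers: iterable of (chrom, start, end), 0-based, half-open [start, end)
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in enhancers:
        by_chrom[chrom].append((start, end))

    hits = []
    for snp_id, chrom, pos in snps:
        # Linear scan per chromosome; adequate for a sketch, not for genome scale.
        if any(start <= pos < end for start, end in by_chrom.get(chrom, ())):
            hits.append(snp_id)
    return hits


if __name__ == "__main__":
    # Made-up coordinates and IDs, for illustration only.
    enhancers = [("chr1", 100, 200), ("chr2", 50, 60)]
    snps = [("snpA", "chr1", 150), ("snpB", "chr1", 200),
            ("snpC", "chr2", 55), ("snpD", "chr3", 10)]
    print(snps_in_enhancers(snps, enhancers))
```

At genome scale an interval tree or a sorted sweep would usually replace the linear scan.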
Affiliation(s)
- Shikha Vashisht
- International Institute of Molecular and Cell Biology in Warsaw, Laboratory of Zebrafish Developmental Genomics, Księcia Trojdena 4, Warsaw, 02-109, Poland
- Costantino Parisi
- International Institute of Molecular and Cell Biology in Warsaw, Laboratory of Zebrafish Developmental Genomics, Księcia Trojdena 4, Warsaw, 02-109, Poland
- Cecilia L Winata
- International Institute of Molecular and Cell Biology in Warsaw, Laboratory of Zebrafish Developmental Genomics, Księcia Trojdena 4, Warsaw, 02-109, Poland
3
Zhang S, Wang C, Qin S, Chen C, Bao Y, Zhang Y, Xu L, Liu Q, Zhao Y, Li K, Tang Z, Liu Y. Analyzing super-enhancer temporal dynamics reveals potential critical enhancers and their gene regulatory networks underlying skeletal muscle development. Genome Res 2024; 34:2190-2202. [PMID: 39433439 PMCID: PMC11694746 DOI: 10.1101/gr.278344.123] [Received: 07/29/2023] [Accepted: 10/15/2024] [Indexed: 10/23/2024]
Abstract
Super-enhancers (SEs) govern the expression of genes defining cell identity. However, the dynamic landscape of SEs and their critical constituent enhancers involved in skeletal muscle development remains unclear. In this study, using pig as a model, we employed cleavage under targets and tagmentation (CUT&Tag) to profile the enhancer-associated histone modification marker H3K27ac in skeletal muscle across two prenatal and three postnatal stages, and investigated how SEs influence skeletal muscle development. We identify three SE families with distinct temporal dynamics: continuous (Con, 397), transient (TS, 434), and de novo (DN, 756). These SE families are associated with different temporal gene expression trajectories, biological functions, and DNA methylation levels. Notably, several lines of evidence suggest a potential prominent role of Con SEs in regulating porcine muscle development and meat traits. To pinpoint key cis-regulatory units in Con SEs, we developed an integrative approach that leverages information from eRNA annotation, genome-wide association study (GWAS) signals, and high-throughput capture self-transcribing active regulatory region sequencing (STARR-seq) experiments. Within Con SEs, we identify 20 candidate critical enhancers with meat and carcass-associated DNA variations that affect enhancer activity, and infer their upstream transcription factors and downstream target genes. As a proof of concept, we experimentally validate the role of one such enhancer and its potential target gene during myogenesis. Our findings reveal the dynamic regulatory features of SEs in skeletal muscle development and provide a general integrative framework for identifying critical enhancers underlying the formation of complex traits.
Affiliation(s)
- Song Zhang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Chao Wang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education and Key Laboratory of Swine Genetics and Breeding of Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Shenghua Qin
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Choulin Chen
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of Ministry of Education and Key Laboratory of Swine Genetics and Breeding of Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Yongzhou Bao
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Yuanyuan Zhang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Lingna Xu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Qingyou Liu
- Guangdong Provincial Key Laboratory of Animal Molecular Design and Precise Breeding, School of Life Science and Engineering, Foshan University, Foshan 528225, China
- Yunxiang Zhao
- Guangxi Key Laboratory of Animal Breeding, Disease Control and Prevention, College of Animal Science and Technology, Guangxi University, Nanning 530004, China
- Kui Li
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Kunpeng Institute of Modern Agriculture at Foshan, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Foshan 528226, China
- Zhonglin Tang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Kunpeng Institute of Modern Agriculture at Foshan, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Foshan 528226, China
- Yuwen Liu
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Key Laboratory of Livestock and Poultry Multi-Omics of MARA, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Innovation Group of Pig Genome Design and Breeding, Research Centre for Animal Genome, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Kunpeng Institute of Modern Agriculture at Foshan, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Foshan 528226, China
4
Fleck K, Luria V, Garag N, Karger A, Hunter T, Marten D, Phu W, Nam KM, Sestan N, O’Donnell-Luria AH, Erceg J. Functional associations of evolutionarily recent human genes exhibit sensitivity to the 3D genome landscape and disease. bioRxiv 2024:2024.03.17.585403. [PMID: 38559085 PMCID: PMC10980080 DOI: 10.1101/2024.03.17.585403] [Indexed: 04/04/2024]
Abstract
Genome organization is intricately tied to regulating genes and associated cell fate decisions. Here, we examine the positioning and functional significance of human genes, grouped by their lineage restriction level, within the 3D organization of the genome. We reveal that genes of different lineage restriction levels have distinct positioning relationships with both domains and loop anchors, and remarkably consistent relationships with boundaries across cell types. While the functional associations of each group of genes are primarily cell type-specific, associations of conserved genes maintain greater stability across 3D genomic features and disease than recently evolved genes. Furthermore, the expression of these genes across various tissues follows an evolutionary progression, such that RNA levels increase from young lineage restricted genes to ancient genes present in most species. Thus, the distinct relationships of gene evolutionary age, function, and positioning within 3D genomic features contribute to tissue-specific gene regulation in development and disease.
Affiliation(s)
- Katherine Fleck
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269, USA
- Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
- Victor Luria
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Nitanta Garag
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269, USA
- Amir Karger
- IT-Research Computing, Harvard Medical School, Boston, MA 02115, USA
- Trevor Hunter
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269, USA
- Daniel Marten
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- William Phu
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Kee-Myoung Nam
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06510, USA
- Nenad Sestan
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Anne H. O’Donnell-Luria
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Department of Pediatrics, Harvard Medical School, Boston, MA 02115, USA
- Jelena Erceg
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269, USA
- Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
- Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT 06030, USA
5
Oguntoyinbo IO, Goyal R. The Role of Long Intergenic Noncoding RNA in Fetal Development. Int J Mol Sci 2024; 25:11453. [PMID: 39519006 PMCID: PMC11546696 DOI: 10.3390/ijms252111453] [Received: 09/25/2024] [Accepted: 10/22/2024] [Indexed: 11/16/2024]
Abstract
The role of long intergenic noncoding RNAs (lincRNAs) in fetal development has emerged as a significant area of study, challenging the traditional protein-centric view of gene expression. While messenger RNAs (mRNAs) have long been recognized for their role in encoding proteins, recent advances have illuminated the critical functions of lincRNAs in various biological processes. Initially identified through high-throughput sequencing technologies, lincRNAs are transcribed from intergenic regions between protein-coding genes and exhibit unique regulatory functions. Unlike mRNAs, lincRNAs are involved in complex interactions with chromatin and chromatin-modifying complexes, influencing gene expression and chromatin structure. LincRNAs are pivotal in regulating tissue-specific development and embryogenesis. For example, they are crucial for proper cardiac, neural, and reproductive system development, with specific lincRNAs being associated with organogenesis and differentiation processes. Their roles in embryonic development include regulating transcription factors and modulating chromatin states, which are essential for maintaining developmental programs and cellular identity. Studies using RNA sequencing and genetic knockout models have highlighted the importance of lincRNAs in processes such as cell differentiation, tissue patterning, and organ development. Despite their functional significance, the comprehensive annotation and understanding of lincRNAs remain limited. Ongoing research aims to elucidate their mechanisms of action and potential applications in disease diagnostics and therapeutics. This review summarizes current knowledge on the functional roles of lincRNAs in fetal development, emphasizing their contributions to tissue-specific gene regulation and developmental processes.
Affiliation(s)
- Ifetoluwani Oluwadunsin Oguntoyinbo
- School of Animal and Comparative Biomedical Sciences, College of Agriculture, Life & Environmental Sciences, University of Arizona, Tucson, AZ 85721, USA
- Ravi Goyal
- Department of Obstetrics and Gynecology, College of Medicine, University of Arizona, Tucson, AZ 85724, USA
6
Yan H, Mendieta JP, Zhang X, Marand AP, Liang Y, Luo Z, Minow MAA, Jang H, Li X, Roule T, Wagner D, Tu X, Wang Y, Jiang D, Zhong S, Huang L, Wessler SR, Schmitz RJ. Evolution of plant cell-type-specific cis-regulatory elements. bioRxiv 2024:2024.01.08.574753. [PMID: 38260561 PMCID: PMC10802394 DOI: 10.1101/2024.01.08.574753] [Indexed: 01/24/2024]
Abstract
Cis-regulatory elements (CREs) are critical in regulating gene expression, and yet understanding of CRE evolution remains challenging. Here, we constructed a comprehensive single-cell atlas of chromatin accessibility in Oryza sativa, integrating data from 103,911 nuclei representing 126 discrete cell states across nine distinct organs. We used comparative genomics to compare cell-type resolved chromatin accessibility between O. sativa and 57,552 nuclei from four additional grass species (Zea mays, Sorghum bicolor, Panicum miliaceum, and Urochloa fusca). Accessible chromatin regions (ACRs) had different levels of conservation depending on the degree of cell-type specificity. We found a complex relationship between ACRs with conserved noncoding sequences, cell-type specificity, conservation, and tissue-specific switching. Additionally, we found that epidermal ACRs were less conserved compared to other cell types, potentially indicating that more rapid regulatory evolution has occurred in the L1-derived epidermal layer of these species. Finally, we identified and characterized a conserved subset of ACRs that overlapped the repressive histone modification H3K27me3, implicating them as potentially silencer-like CREs maintained by evolution. Collectively, this comparative genomics approach highlights the dynamics of plant cell-type-specific CRE evolution.
7
Uebbing S, Kocher AA, Baumgartner M, Ji Y, Bai S, Xing X, Nottoli T, Noonan JP. Evolutionary Innovations in Conserved Regulatory Elements Associate With Developmental Genes in Mammals. Mol Biol Evol 2024; 41:msae199. [PMID: 39302728 PMCID: PMC11465374 DOI: 10.1093/molbev/msae199] [Received: 04/09/2024] [Revised: 08/26/2024] [Accepted: 09/17/2024] [Indexed: 09/22/2024]
Abstract
Transcriptional enhancers orchestrate cell type- and time point-specific gene expression programs. Genetic variation within enhancer sequences is an important contributor to phenotypic variation including evolutionary adaptations and human disease. Certain genes and pathways may be more prone to regulatory evolution than others, with different patterns across diverse organisms, but whether such patterns exist has not been investigated at a sufficient scale. To address this question, we identified signatures of accelerated sequence evolution in conserved enhancer elements throughout the mammalian phylogeny at an unprecedented scale. While different genes and pathways were enriched for regulatory evolution in different parts of the tree, we found a striking overall pattern of pleiotropic genes involved in gene regulatory and developmental processes being enriched for accelerated enhancer evolution. These genes were connected to more enhancers than other genes, which was the basis for having an increased amount of sequence acceleration over all their enhancers combined. We provide evidence that sequence acceleration is associated with turnover of regulatory function. Detailed study of one acceleration event in an enhancer of HES1 revealed that sequence evolution led to a new activity domain in the developing limb that emerged concurrently with the evolution of digit reduction in hoofed mammals. Our results provide evidence that enhancer evolution has been a frequent contributor to regulatory innovation at conserved developmental signaling genes in mammals.
Affiliation(s)
- Severin Uebbing
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Department of Biology, Genome Biology and Epigenetics, Institute of Biodynamics and Biocomplexity, Utrecht University, Utrecht, The Netherlands
- Acadia A Kocher
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Yu Ji
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Suxia Bai
- Yale Genome Editing Center, Yale School of Medicine, New Haven, CT, USA
- Xiaojun Xing
- Yale Genome Editing Center, Yale School of Medicine, New Haven, CT, USA
- Timothy Nottoli
- Yale Genome Editing Center, Yale School of Medicine, New Haven, CT, USA
- James P Noonan
- Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
- Department of Neuroscience, Yale School of Medicine, New Haven, CT, USA
- Wu Tsai Institute, Yale University, New Haven, CT, USA
8
Roberts M, Josephs EB. Previously unmeasured genetic diversity explains part of Lewontin's paradox in a k-mer-based meta-analysis of 112 plant species. bioRxiv 2024:2024.05.17.594778. [PMID: 38798362 PMCID: PMC11118579 DOI: 10.1101/2024.05.17.594778] [Indexed: 05/29/2024]
Abstract
At the molecular level, most evolution is expected to be neutral. A key prediction of this expectation is that the level of genetic diversity in a population should scale with population size. However, as was noted by Richard Lewontin in 1974 and reaffirmed by later studies, the slope of the population size-diversity relationship in nature is much weaker than expected under neutral theory. We hypothesize that one contributor to this paradox is that current methods relying on single nucleotide polymorphisms (SNPs) called from aligning short reads to a reference genome underestimate levels of genetic diversity in many species. To test this idea, we calculated nucleotide diversity (π) and k-mer-based metrics of genetic diversity across 112 plant species, amounting to over 205 terabases of DNA sequencing data from 27,488 individual plants. We then compared how these different metrics correlated with proxies of population size that account for both range size and population density variation across species. We found that our population size proxies scaled anywhere from about 3 to over 20 times faster with k-mer diversity than nucleotide diversity after adjusting for evolutionary history, mating system, life cycle habit, cultivation status, and invasiveness. The relationship between k-mer diversity and population size proxies also remains significant after correcting for genome size, whereas the analogous relationship for nucleotide diversity does not. These results suggest that variation not captured by common SNP-based analyses explains part of Lewontin's paradox in plants.
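The contrast between an alignment-based and an alignment-free diversity measure can be illustrated on toy data. In the sketch below, `nucleotide_diversity` computes π as the mean number of pairwise differences per site over aligned, equal-length sequences. The k-mer metric shown (mean pairwise Jaccard dissimilarity of k-mer sets) is an assumption chosen for illustration, since the abstract does not specify which k-mer-based metrics the authors used.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Pi: mean pairwise differences per site for aligned, equal-length sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


def kmer_set(seq, k):
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_dissimilarity(seqs, k):
    """Illustrative k-mer metric (assumption): mean pairwise Jaccard dissimilarity."""
    sets = [kmer_set(s, k) for s in seqs]
    pairs = list(combinations(sets, 2))
    return sum(1 - len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy aligned sequences, made up for the example.
    toy = ["ACGT", "ACGA", "ACGT"]
    print(nucleotide_diversity(toy))
    print(kmer_dissimilarity(toy, k=2))
```

Unlike π, the k-mer measure needs no reference genome or alignment, which is why it can register variation that SNP calling misses.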
Affiliation(s)
- Miles Roberts
- Genetics and Genome Sciences Program, Michigan State University, East Lansing MI
- Emily B. Josephs
- Department of Plant Biology, Michigan State University, East Lansing, MI
- Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, MI
- Plant Resilience Institute, Michigan State University, East Lansing, MI
9
Cummins M, Watson C, Edwards RJ, Mattick JS. The Evolution of Ultraconserved Elements in Vertebrates. Mol Biol Evol 2024; 41:msae146. [PMID: 39058500 PMCID: PMC11276968 DOI: 10.1093/molbev/msae146] [Received: 03/26/2024] [Revised: 06/29/2024] [Accepted: 07/08/2024] [Indexed: 07/18/2024]
Abstract
Ultraconserved elements were discovered two decades ago, arbitrarily defined as sequences that are identical over a length ≥200 bp in the human, mouse, and rat genomes. The definition was subsequently extended to sequences ≥100 bp identical in at least three of five mammalian genomes (including dog and cow), and shown to have undergone rapid expansion from ancestors in fish and strong negative selection in birds and mammals. Since then, many more genomes have become available, allowing better definition and more thorough examination of ultraconserved element distribution and evolutionary history. We developed a fast and flexible analytical pipeline for identifying ultraconserved elements in multiple genomes, dedUCE, which allows manipulation of minimum length, sequence identity, and number of species with a detectable ultraconserved element according to specified parameters. We suggest an updated definition of ultraconserved elements as sequences ≥100 bp and ≥97% sequence identity in ≥50% of placental mammal orders (12,813 ultraconserved elements). By mapping ultraconserved elements to ∼200 species, we find that placental ultraconserved elements appeared early in vertebrate evolution, well before land colonization, suggesting that the evolutionary pressures driving ultraconserved element selection were present in aquatic environments in the Cambrian-Devonian periods. Most (>90%) ultraconserved elements likely appeared after the divergence of gnathostomes from jawless predecessors, were largely established in sequence identity by early Sarcopterygii evolution (before the divergence of lobe-finned fishes from tetrapods), and became near fixed in the amniotes. Ultraconserved elements are mainly located in the introns of protein-coding and noncoding genes involved in neurological and skeletomuscular development, enriched in regulatory elements, and dynamically expressed throughout embryonic development.
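The updated definition proposed in this abstract (≥100 bp, ≥97% identity, in ≥50% of placental mammal orders) can be written as a simple predicate. This is a sketch only and does not reproduce dedUCE: the function name `is_ultraconserved`, the per-order identity dictionary, and the example values are hypothetical, and an order with no detectable match is assumed to be recorded as identity 0.0.

```python
def is_ultraconserved(length_bp, identity_by_order,
                      min_length=100, min_identity=0.97, min_order_fraction=0.5):
    """Apply the three thresholds from the abstract to one candidate element.

    identity_by_order maps each placental order considered to the best
    sequence identity found there (0.0 if the element was not detected).
    """
    if length_bp < min_length or not identity_by_order:
        return False
    passing = sum(1 for identity in identity_by_order.values()
                  if identity >= min_identity)
    return passing / len(identity_by_order) >= min_order_fraction


if __name__ == "__main__":
    # Made-up identities for four orders; two of four pass the 97% cutoff.
    candidate = {"Primates": 1.0, "Rodentia": 0.98, "Carnivora": 0.90, "Chiroptera": 0.0}
    print(is_ultraconserved(120, candidate))
```

Exposing the three thresholds as parameters reflects the abstract's point that minimum length, identity, and species coverage are adjustable choices rather than fixed constants.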
Affiliation(s)
- Mitchell Cummins
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, NSW 2052, Australia
- Cadel Watson
- School of Engineering, UNSW Sydney, Sydney, NSW 2052, Australia
- Richard J Edwards
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, NSW 2052, Australia
- John S Mattick
- School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, NSW 2052, Australia
10
Mian Y, Wang L, Keikhosravi A, Guo K, Misteli T, Arda HE, Finn EH. Cell type- and transcription-independent spatial proximity between enhancers and promoters. Mol Biol Cell 2024; 35:ar96. [PMID: 38717453 PMCID: PMC11244156 DOI: 10.1091/mbc.e24-02-0082] [Received: 02/23/2024] [Revised: 04/12/2024] [Accepted: 04/29/2024] [Indexed: 06/07/2024]
Abstract
Cell type-specific enhancers are critically important for lineage specification. The mechanisms that determine cell-type specificity of enhancer activity, however, are not fully understood. Most current models for how enhancers function invoke physical proximity between enhancer elements and their target genes. Here, we use an imaging-based approach to examine the spatial relationship of cell type-specific enhancers and their target genes with single-cell resolution. Using high-throughput microscopy, we measure the spatial distance from target promoters to their cell type-specific active and inactive enhancers in individual pancreatic cells derived from distinct lineages. We find increased proximity of all promoter-enhancer pairs relative to non-enhancer pairs separated by similar genomic distances. Strikingly, spatial proximity between enhancers and target genes was unrelated to tissue-specific enhancer activity. Furthermore, promoter-enhancer proximity did not correlate with the expression status of target genes. Our results suggest that promoter-enhancer pairs exist in a distinctive chromatin environment but that genome folding is not a universal driver of cell-type specificity in enhancer function.
Affiliation(s)
- Yasmine Mian
- National Cancer Institute, National Institutes of Health, Bethesda, MD 20892
| | - Li Wang
- National Cancer Institute, National Institutes of Health, Bethesda, MD 20892
| | - Adib Keikhosravi
- National Cancer Institute, National Institutes of Health, Bethesda, MD 20892
| | - Konnie Guo
- National Cancer Institute, National Institutes of Health, Bethesda, MD 20892
| | - Tom Misteli
- National Cancer Institute, National Institutes of Health, Bethesda, MD 20892
| | - H. Efsun Arda
- National Cancer Institute, National Institutes of Health, Bethesda, MD 20892
| | - Elizabeth H. Finn
- National Cancer Institute, National Institutes of Health, Bethesda, MD 20892
- Cell Cycle and Cancer Biology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, OK, 73104
| |
|
11
|
Hu W, Li Y, Wu Y, Guan L, Li M. A deep learning model for DNA enhancer prediction based on nucleotide position aware feature encoding. iScience 2024; 27:110030. [PMID: 38868182 PMCID: PMC11167433 DOI: 10.1016/j.isci.2024.110030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Revised: 04/23/2024] [Accepted: 05/16/2024] [Indexed: 06/14/2024] Open
Abstract
Enhancers, genomic DNA elements, regulate neighboring gene expression crucial for biological processes like cell differentiation and stress response. However, current machine learning methods for predicting DNA enhancers often underutilize hidden features in gene sequences, limiting model accuracy. Hence, this article proposes the PDCNN model, a deep learning-based enhancer prediction method. PDCNN extracts statistical nucleotide representations from gene sequences, discerning positional distribution information of nucleotides in modifier-like DNA sequences. With a convolutional neural network structure, PDCNN employs dual convolutional and fully connected layers. The cross-entropy loss function iteratively updates using a gradient descent algorithm, enhancing prediction accuracy. Model parameters are fine-tuned to select optimal combinations for training, achieving over 95% accuracy. Comparative analysis with traditional methods and existing models demonstrates PDCNN's robust feature extraction capability. It outperforms advanced machine learning methods in identifying DNA enhancers, presenting an effective method with broad implications for genomics, biology, and medical research.
Affiliation(s)
- Wenxing Hu
- College of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, Jiangxi, China
| | - Yelin Li
- College of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, Jiangxi, China
| | - Yan Wu
- College of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, Jiangxi, China
| | - Lixin Guan
- College of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, Jiangxi, China
| | - Mengshan Li
- College of Physics and Electronic Information, Gannan Normal University, Ganzhou 341000, Jiangxi, China
| |
|
12
|
Gonzalez P, Hauck QC, Baxevanis AD. Conserved Noncoding Elements Evolve Around the Same Genes Throughout Metazoan Evolution. Genome Biol Evol 2024; 16:evae052. [PMID: 38502060 PMCID: PMC10988421 DOI: 10.1093/gbe/evae052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Revised: 03/07/2024] [Accepted: 03/13/2024] [Indexed: 03/20/2024] Open
Abstract
Conserved noncoding elements (CNEs) are DNA sequences located outside of protein-coding genes that can remain under purifying selection for up to hundreds of millions of years. Studies in vertebrate genomes have revealed that most CNEs carry out regulatory functions. Notably, many of them are enhancers that control the expression of homeodomain transcription factors and other genes that play crucial roles in embryonic development. To further our knowledge of CNEs in other parts of the animal tree, we conducted a large-scale characterization of CNEs in more than 50 genomes from three of the main branches of the metazoan tree: Cnidaria, Mollusca, and Arthropoda. We identified hundreds of thousands of CNEs and reconstructed the temporal dynamics of their appearance in each lineage, as well as determining their spatial distribution across genomes. We show that CNEs evolve repeatedly around the same genes across the Metazoa, including around homeodomain genes and other transcription factors; they also evolve repeatedly around genes involved in neural development. We also show that transposons are a major source of CNEs, confirming previous observations from vertebrates and suggesting that they have played a major role in wiring developmental gene regulatory mechanisms since the dawn of animal evolution.
Affiliation(s)
- Paul Gonzalez
- Center for Genomics and Data Science Research, Division of Intramural Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Quinn C Hauck
- Center for Genomics and Data Science Research, Division of Intramural Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Andreas D Baxevanis
- Center for Genomics and Data Science Research, Division of Intramural Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| |
|
13
|
Iliopoulou E, Papadogiannis V, Tsigenopoulos CS, Manousaki T. Extensive Loss and Gain of Conserved Noncoding Elements During Early Teleost Evolution. Genome Biol Evol 2024; 16:evae061. [PMID: 38648507 PMCID: PMC11034925 DOI: 10.1093/gbe/evae061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/19/2024] [Indexed: 04/25/2024] Open
Abstract
Conserved noncoding elements in vertebrates are enriched around transcription factor loci associated with development. However, loss and rapid divergence of conserved noncoding elements has been reported in teleost fish, albeit taking only a few genomes into consideration. Taking advantage of the recent increase in high-quality teleost genomes, we focus on studying the evolution of teleost conserved noncoding elements, carrying out targeted genomic alignments and comparisons within the teleost phylogeny to detect conserved noncoding elements and reconstruct the ancestral teleost conserved noncoding element repertoire. This teleost-centric approach confirms previous observations of extensive vertebrate conserved noncoding element loss early in teleost evolution, but also reveals massive conserved noncoding element gain in the teleost stem-group over 300 million years ago. Using synteny-based association to link conserved noncoding elements to their putatively regulated target genes, we show that most teleost-gained conserved noncoding elements are found in the vicinity of orthologous loci involved in transcriptional regulation and embryonic development that are also associated with conserved noncoding elements in other vertebrates. Moreover, teleost and vertebrate conserved noncoding elements share a highly similar motif and transcription factor binding site vocabulary. We suggest that early teleost conserved noncoding element gains reflect a restructuring of the ancestral conserved noncoding element repertoire through both extreme divergence and de novo emergence. Finally, we provide support that newly identified pan-teleost conserved noncoding elements have potential for accurate resolution of teleost phylogenetic placements on par with coding sequences, unlike ancestral-only elements shared with spotted gar. This work provides new insight into conserved noncoding element evolution with great value for follow-up work on phylogenomics, comparative genomics, and the study of gene regulation evolution in teleosts.
Affiliation(s)
- Elisavet Iliopoulou
- Hellenic Centre for Marine Research (HCMR), Institute of Marine Biology, Biotechnology & Aquaculture (IMBBC), Heraklion, Greece
- Present Address: Université Paris Cité, CNRS, Institut Jacques Monod, F-75013 Paris, France
| | - Vasileios Papadogiannis
- Hellenic Centre for Marine Research (HCMR), Institute of Marine Biology, Biotechnology & Aquaculture (IMBBC), Heraklion, Greece
- Present Address: Center for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Costas S Tsigenopoulos
- Hellenic Centre for Marine Research (HCMR), Institute of Marine Biology, Biotechnology & Aquaculture (IMBBC), Heraklion, Greece
| | - Tereza Manousaki
- Hellenic Centre for Marine Research (HCMR), Institute of Marine Biology, Biotechnology & Aquaculture (IMBBC), Heraklion, Greece
| |
|
14
|
Abrar M, Ali S, Hussain I, Khatoon H, Batool F, Ghazanfar S, Corcoran D, Kawakami Y, Abbasi AA. Cis-regulatory control of mammalian Trps1 gene expression. J Exp Zool B Mol Dev Evol 2024; 342:85-100. [PMID: 38369890 PMCID: PMC10978278 DOI: 10.1002/jez.b.23246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 12/22/2023] [Accepted: 01/31/2024] [Indexed: 02/20/2024]
Abstract
TRPS1 serves as the causative gene for tricho-rhino-phalangeal syndrome, known for its craniofacial and skeletal abnormalities. The Trps1 gene encodes a protein that represses Wnt signaling through strong interactions with Wnt signaling inhibitors. The identification of genomic cis-acting regulatory sequences governing Trps1 expression is crucial for understanding its role in embryogenesis. Nevertheless, to date, no investigations have been conducted concerning these aspects of Trps1. To identify deeply conserved noncoding elements (CNEs) within the Trps1 locus, we employed a comparative genomics approach, utilizing slowly evolving fish such as coelacanth and spotted gar. These analyses resulted in the identification of eight CNEs in the intronic region of the Trps1 gene. Functional characterization of these CNEs in zebrafish revealed their regulatory potential in various tissues, including pectoral fins, heart, and pharyngeal arches. RNA in situ hybridization experiments revealed concordance between the reporter expression pattern induced by the identified set of CNEs and the spatial expression pattern of the trps1 gene in zebrafish. Comparative in vivo data from zebrafish and mice for CNE7/hs919 revealed conserved functions of these enhancers. Each of these eight CNEs was further investigated in cell line-based reporter assays, revealing their repressive potential. Taken together, in vivo and in vitro assays suggest a context-dependent dual functionality for the identified set of Trps1-associated CNE enhancers. This functionally characterized set of CNE enhancers will contribute to a more comprehensive understanding of the developmental roles of Trps1 and can aid in the identification of noncoding DNA variants associated with human diseases.
Affiliation(s)
- Muhammad Abrar
- National Center for Bioinformatics, program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, 45320, Islamabad Pakistan
| | - Shahid Ali
- National Center for Bioinformatics, program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, 45320, Islamabad Pakistan
- Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, IL 60637, USA
| | - Irfan Hussain
- National Center for Bioinformatics, program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, 45320, Islamabad Pakistan
- Center of Regenerative Medicine and Stem Cells Research, Aga Khan University Hospital, Karachi
| | - Hizran Khatoon
- National Center for Bioinformatics, program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, 45320, Islamabad Pakistan
| | - Fatima Batool
- National Center for Bioinformatics, program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, 45320, Islamabad Pakistan
| | - Shakira Ghazanfar
- National Institute for Genomics Advanced Biotechnology, National Agriculture Research Centre (NARC), Islamabad-45500, Pakistan
| | - Dylan Corcoran
- Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, MN 55455 United States
| | - Yasuhiko Kawakami
- Department of Genetics, Cell Biology and Development, University of Minnesota, Minneapolis, MN 55455 United States
| | - Amir Ali Abbasi
- National Center for Bioinformatics, program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, 45320, Islamabad Pakistan
| |
|
15
|
Sakamoto F, Kanamori S, Díaz LM, Cádiz A, Ishii Y, Yamaguchi K, Shigenobu S, Nakayama T, Makino T, Kawata M. Detection of evolutionary conserved and accelerated genomic regions related to adaptation to thermal niches in Anolis lizards. Ecol Evol 2024; 14:e11117. [PMID: 38455144 PMCID: PMC10920033 DOI: 10.1002/ece3.11117] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/13/2023] [Revised: 02/18/2024] [Accepted: 02/22/2024] [Indexed: 03/09/2024] Open
Abstract
Understanding the genetic basis for adapting to thermal environments is important due to serious effects of global warming on ectothermic species. Various genes associated with thermal adaptation in lizards have been identified mainly focusing on changes in gene expression or the detection of positively selected genes using coding regions. Only a few comprehensive genome-wide analyses have included noncoding regions. This study aimed to identify evolutionarily conserved and accelerated genomic regions using whole genomes of eight Anolis lizard species that have repeatedly adapted to similar thermal environments in multiple lineages. Evolutionarily conserved genomic regions were extracted as regions with overall sequence conservation (regions with fewer base substitutions) across all lineages compared with the neutral model. Genomic regions that underwent accelerated evolution in the lineage of interest were identified as those with more base substitutions in the target branch than in the entire background branch. Conserved elements across all branches were relatively abundant in "intergenic" genomic regions among noncoding regions. Accelerated regions (ARs) of each lineage contained a significantly greater proportion of noncoding RNA genes than the entire multiple alignment. Common genes containing ARs within 5 kb of their vicinity in lineages with similar thermal habitats were identified. Many genes associated with circadian rhythms and behavior were found in hot-open and cool-shaded habitat lineages. These genes might play a role in contributing to thermal adaptation and assist future studies examining the function of genes involved in thermal adaptation via genome editing.
Affiliation(s)
- Fuku Sakamoto
- Graduate School of Life Sciences, Tohoku University, Sendai, Japan
| | | | - Luis M. Díaz
- National Museum of Natural History of Cuba, Havana, Cuba
| | - Antonio Cádiz
- Faculty of Biology, University of Havana, Havana, Cuba
- Present address: Department of Biology, University of Miami, Coral Gables, Florida, USA
| | - Yuu Ishii
- Graduate School of Life Sciences, Tohoku University, Sendai, Japan
| | | | - Shuji Shigenobu
- Trans‐Omics Facility, National Institute for Basic Biology, Okazaki, Japan
- Department of Basic Biology, School of Life Science, The Graduate University for Advanced Studies, SOKENDAI, Okazaki, Japan
| | - Takuro Nakayama
- Division of Life Sciences, Center for Computational Sciences, University of Tsukuba, Tsukuba, Japan
| | - Takashi Makino
- Graduate School of Life Sciences, Tohoku University, Sendai, Japan
| | - Masakado Kawata
- Graduate School of Life Sciences, Tohoku University, Sendai, Japan
| |
|
16
|
Uebbing S, Kocher AA, Baumgartner M, Ji Y, Bai S, Xing X, Nottoli T, Noonan JP. Evolutionary innovation in conserved regulatory elements across the mammalian tree of life. bioRxiv 2024:2024.01.31.578197. [PMID: 38352419 PMCID: PMC10862883 DOI: 10.1101/2024.01.31.578197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2024]
Abstract
Transcriptional enhancers orchestrate cell type- and time point-specific gene expression programs. Evolution of enhancer sequences can alter target gene expression without causing detrimental misexpression in other contexts. It has long been thought that this modularity allows evolutionary changes in enhancers to escape pleiotropic constraints, which is especially important for evolutionarily constrained developmental patterning genes. However, there is still little data supporting this hypothesis. Here we identified signatures of accelerated evolution in conserved enhancer elements across the mammalian phylogeny. We found that pleiotropic genes involved in gene regulatory and developmental processes were enriched for accelerated sequence evolution within their enhancer elements. These genes were associated with an excess number of enhancers compared to other genes, and as a result they exhibit a substantial degree of sequence acceleration over all their enhancers combined. We provide evidence that sequence acceleration is associated with turnover of regulatory function. We studied one acceleration event in depth and found that its sequence evolution led to the emergence of a new enhancer activity domain that may be involved in the evolution of digit reduction in hoofed mammals. Our results provide tangible evidence that enhancer evolution has been a frequent contributor to modifications involving constrained developmental signaling genes in mammals.
Affiliation(s)
- Severin Uebbing
- Department of Genetics, Yale School of Medicine, New Haven CT, USA
- Genome Biology and Epigenetics, Institute of Biodynamics and Biocomplexity, Department of Biology, Utrecht University, Utrecht, The Netherlands
| | - Acadia A Kocher
- Department of Genetics, Yale School of Medicine, New Haven CT, USA
- Present address: Division of Molecular Genetics, Netherlands Cancer Institute, Amsterdam, The Netherlands
| | | | - Yu Ji
- Department of Genetics, Yale School of Medicine, New Haven CT, USA
| | - Suxia Bai
- Yale Genome Editing Center, Yale School of Medicine, New Haven CT, USA
| | - Xiaojun Xing
- Yale Genome Editing Center, Yale School of Medicine, New Haven CT, USA
| | - Timothy Nottoli
- Yale Genome Editing Center, Yale School of Medicine, New Haven CT, USA
| | - James P Noonan
- Department of Genetics, Yale School of Medicine, New Haven CT, USA
- Department of Ecology and Evolutionary Biology, Yale University, New Haven CT, USA
- Department of Neuroscience, Yale School of Medicine, New Haven CT, USA
- Wu Tsai Institute, Yale University, New Haven CT, USA
| |
|
17
|
Zhang Y, Zhang P, Wu H. Enhancer-MDLF: a novel deep learning framework for identifying cell-specific enhancers. Brief Bioinform 2024; 25:bbae083. [PMID: 38485768 PMCID: PMC10938904 DOI: 10.1093/bib/bbae083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Revised: 01/27/2024] [Accepted: 02/07/2024] [Indexed: 03/18/2024] Open
Abstract
Enhancers, noncoding DNA fragments, play a pivotal role in gene regulation, facilitating gene transcription. Identifying enhancers is crucial for understanding genomic regulatory mechanisms, pinpointing key elements and investigating networks governing gene expression and disease-related mechanisms. Existing enhancer identification methods exhibit limitations, prompting the development of our novel multi-input deep learning framework, termed Enhancer-MDLF. Experimental results illustrate that Enhancer-MDLF outperforms the previous method, Enhancer-IF, across eight distinct human cell lines and exhibits superior performance on generic enhancer datasets and enhancer-promoter datasets, affirming the robustness of Enhancer-MDLF. Additionally, we introduce transfer learning to provide an effective and potential solution to address the prediction challenges posed by enhancer specificity. Furthermore, we utilize model interpretation to identify transcription factor binding site motifs that may be associated with enhancer regions, with important implications for facilitating the study of enhancer regulatory mechanisms. The source code is openly accessible at https://github.com/HaoWuLab-Bioinformatics/Enhancer-MDLF.
Affiliation(s)
- Yao Zhang
- School of Software, Shandong University, Jinan, 250100, Shandong, China
| | - Pengyu Zhang
- College of Information Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China
| | - Hao Wu
- School of Software, Shandong University, Jinan, 250100, Shandong, China
| |
|
18
|
Chow CN, Yang CW, Wu NY, Wang HT, Tseng KC, Chiu YH, Lee TY, Chang WC. PlantPAN 4.0: updated database for identifying conserved non-coding sequences and exploring dynamic transcriptional regulation in plant promoters. Nucleic Acids Res 2024; 52:D1569-D1578. [PMID: 37897338 PMCID: PMC10767843 DOI: 10.1093/nar/gkad945] [Citation(s) in RCA: 27] [Impact Index Per Article: 27.0] [Received: 09/14/2023] [Revised: 10/07/2023] [Accepted: 10/12/2023] [Indexed: 10/30/2023] Open
Abstract
PlantPAN 4.0 (http://PlantPAN.itps.ncku.edu.tw/) is an integrative resource for constructing transcriptional regulatory networks for diverse plant species. In this release, the gene annotation and promoter sequences were expanded to cover 115 species. PlantPAN 4.0 can help users characterize the evolutionary differences and similarities among cis-regulatory elements; furthermore, this system can now help in identification of conserved non-coding sequences among homologous genes. The updated transcription factor binding site repository contains 3428 nonredundant matrices for 18305 transcription factors; this expansion helps in exploration of combinational and nucleotide variants of cis-regulatory elements in conserved non-coding sequences. Additionally, the genomic landscapes of regulatory factors were manually updated, and ChIP-seq data sets derived from a single-cell green alga (Chlamydomonas reinhardtii) were added. Furthermore, the statistical review and graphical analysis components were improved to offer intelligible information through ChIP-seq data analysis. These improvements included easy-to-read experimental condition clusters, searchable gene-centered interfaces for the identification of promoter regions' binding preferences by considering experimental condition clusters and peak visualization for all regulatory factors, and the 20 most significantly enriched gene ontology functions for regulatory factors. Thus, PlantPAN 4.0 can effectively reconstruct gene regulatory networks and help compare genomic cis-regulatory elements across plant species and experiments.
Affiliation(s)
- Chi-Nga Chow
- Institute of Tropical Plant Sciences and Microbiology, College of Biosciences and Biotechnology, National Cheng Kung University, Tainan 701, Taiwan
- School of Molecular Sciences, Arizona State University, Tempe 85281, USA
| | - Chien-Wen Yang
- Institute of Tropical Plant Sciences and Microbiology, College of Biosciences and Biotechnology, National Cheng Kung University, Tainan 701, Taiwan
| | - Nai-Yun Wu
- Institute of Tropical Plant Sciences and Microbiology, College of Biosciences and Biotechnology, National Cheng Kung University, Tainan 701, Taiwan
| | - Hung-Teng Wang
- Institute of Tropical Plant Sciences and Microbiology, College of Biosciences and Biotechnology, National Cheng Kung University, Tainan 701, Taiwan
| | - Kuan-Chieh Tseng
- Department of Life Sciences, National Cheng Kung University, Tainan 701, Taiwan
| | - Yu-Hsuan Chiu
- Graduate Program in Translational Agricultural Sciences, National Cheng Kung University and Academia Sinica, Tainan 701, Taiwan
| | - Tzong-Yi Lee
- Department of Biological Science & Technology, National Yang Ming Chiao Tung University, Hsinchu 300, Taiwan
| | - Wen-Chi Chang
- Institute of Tropical Plant Sciences and Microbiology, College of Biosciences and Biotechnology, National Cheng Kung University, Tainan 701, Taiwan
- Department of Life Sciences, National Cheng Kung University, Tainan 701, Taiwan
- Graduate Program in Translational Agricultural Sciences, National Cheng Kung University and Academia Sinica, Tainan 701, Taiwan
| |
|
19
|
Omori Y, Burgess SM. The Goldfish Genome and Its Utility for Understanding Gene Regulation and Vertebrate Body Morphology. Methods Mol Biol 2024; 2707:335-355. [PMID: 37668923 DOI: 10.1007/978-1-0716-3401-1_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/06/2023]
Abstract
Goldfish, widely viewed as an ornamental fish, is a member of the Cyprinidae family and has a very long history in research for both genetics and physiology studies. Among Cyprinidae, the chromosomal locations of orthologs and the amino acid sequences are usually highly conserved. Adult goldfish are 1000 times larger than adult zebrafish (which are in the same family of fishes), which can make several types of experiments easier to perform in goldfish than in their zebrafish cousins. Comparing mutant phenotypes in orthologous genes between goldfish and zebrafish can often be very informative and provide a deeper insight into the gene function than studying the gene in either species alone. Comparative genomics and phenotypic comparisons between goldfish and zebrafish will provide new opportunities for understanding the development and evolution of body forms in the vertebrate lineage.
Affiliation(s)
- Yoshihiro Omori
- Laboratory of Functional Genomics, Graduate School of Bioscience, Nagahama Institute of Bioscience and Technology, Nagahama, Japan.
| | - Shawn M Burgess
- Translational and Functional Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA.
| |
|
20
|
Kondoh H. Molecular Basis of Cell Reprogramming into iPSCs with Exogenous Transcription Factors. Results Probl Cell Differ 2024; 72:193-218. [PMID: 38509259 DOI: 10.1007/978-3-031-39027-2_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/22/2024]
Abstract
A striking discovery in recent decades concerning the transcription factor (TF)-dependent process was the production of induced pluripotent stem cells (iPSCs) from fibroblasts by the exogenous expression of the TF cocktail containing Oct3/4 (Pou5f1), Sox2, Klf4, and Myc, collectively called OSKM. How fibroblast cells can be remodeled into embryonic stem cell (ESC)-like iPSCs despite high epigenetic barriers has opened a new essential avenue to understanding the action of TFs in developmental regulation. Two forerunning investigations preceded the iPSC phenomenon: exogenous TF-mediated cell remodeling driven by the action of MyoD, and the "pioneer TF" action to preopen chromatin, allowing multiple TFs to access enhancer sequences. The process of remodeling somatic cells into iPSCs has been broken down into multiple subprocesses: the initial attack of OSKM on closed chromatin, sequential changes in cytosine modification, enhancer usage, and gene silencing and activation. Notably, the OSKM TFs change their genomic binding sites extensively. The analyses are still at the descriptive stage, but currently available information is discussed in this chapter.
Affiliation(s)
- Hisato Kondoh
- Osaka University, Suita, Osaka, Japan
- Biohistory Research Hall, Takatsuki, Osaka, Japan
| |
|
21
|
Ali S, Abrar M, Hussain I, Batool F, Raza RZ, Khatoon H, Zoia M, Visel A, Shubin NH, Osterwalder M, Abbasi AA. Identification of ancestral gnathostome Gli3 enhancers with activity in mammals. Dev Growth Differ 2024; 66:75-88. [PMID: 37925606 PMCID: PMC10841732 DOI: 10.1111/dgd.12901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 09/01/2023] [Accepted: 10/23/2023] [Indexed: 11/06/2023]
Abstract
Abnormal expression of the transcriptional regulator and hedgehog (Hh) signaling pathway effector Gli3 is known to trigger congenital disease, most frequently affecting the central nervous system (CNS) and the limbs. Accurate delineation of the genomic cis-regulatory landscape controlling Gli3 transcription during embryonic development is critical for the interpretation of noncoding variants associated with congenital defects. Here, we employed a comparative genomic analysis on fish species with a slow rate of molecular evolution to identify seven previously unknown conserved noncoding elements (CNEs) in Gli3 intronic intervals (CNE15-21). Transgenic assays in zebrafish revealed that most of these elements drive activities in Gli3 expressing tissues, predominantly the fins, CNS, and the heart. Intersection of these CNEs with human disease associated SNPs identified CNE15 as a putative mammalian craniofacial enhancer, with conserved activity in vertebrates and potentially affected by mutation associated with human craniofacial morphology. Finally, comparative functional dissection of an appendage-specific CNE conserved in slowly evolving fish (elephant shark), but not in teleost (CNE14/hs1586) indicates co-option of limb specificity from other tissues prior to the divergence of amniotes and lobe-finned fish. These results uncover a novel subset of intronic Gli3 enhancers that arose in the common ancestor of gnathostomes and whose sequence components were likely gradually modified in other species during the process of evolutionary diversification.
Affiliation(s)
- Shahid Ali
- National Center for Bioinformatics, Program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, 45320, Islamabad Pakistan
- Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, IL 60637, USA
| | - Muhammad Abrar
- National Center for Bioinformatics, Program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, 45320, Islamabad Pakistan
| | - Irfan Hussain
- National Center for Bioinformatics, Program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, 45320, Islamabad Pakistan
| | - Fatima Batool
- National Center for Bioinformatics, Program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, 45320, Islamabad Pakistan
| | - Rabail Zehra Raza
- Department of Biological Sciences, Faculty of Multidisciplinary Studies, National University of Medical Sciences Rawalpindi, Pakistan
| | - Hizran Khatoon
- National Center for Bioinformatics, Program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, 45320, Islamabad Pakistan
| | - Matteo Zoia
- Department for Biomedical Research (DBMR), University of Bern, Bern, Switzerland
| | - Axel Visel
- Environmental Genomics and System Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA
- U.S. Department of Energy Joint Genome Institute, 1 Cyclotron Road, Berkeley, CA 94720, USA
- School of Natural Sciences, University of California, Merced, Merced, CA 95343, USA
| | - Neil H. Shubin
- Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, IL 60637, USA
| | - Marco Osterwalder
- Department for Biomedical Research (DBMR), University of Bern, Bern, Switzerland
- Department of Cardiology, Bern University Hospital, Bern, Switzerland
| | - Amir Ali Abbasi
- National Center for Bioinformatics, Program of Comparative and Evolutionary Genomics, Faculty of Biological Sciences, Quaid-i-Azam University, 45320, Islamabad Pakistan
|
22
|
Liu T, Li T, Ke S. Role of the CASZ1 transcription factor in tissue development and disease. Eur J Med Res 2023; 28:562. [PMID: 38053207 DOI: 10.1186/s40001-023-01548-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/08/2023] [Accepted: 11/22/2023] [Indexed: 12/07/2023] Open
Abstract
The zinc finger transcription factor gene, CASZ1/Castor (Castor zinc finger 1), initially identified in Drosophila, plays a critical role in neural, cardiac, and cardiovascular development, exerting a complex, multifaceted influence on cell fate and tissue morphogenesis. During neurogenesis, CASZ1 exhibits dynamic expression from early embryonic development to the perinatal period, constituting a key regulator in this process. Additionally, CASZ1 controls the transition between neurogenesis and gliomagenesis. During human cardiovascular system development, CASZ1 is essential for cardiomyocyte differentiation, cardiac morphogenesis, and vascular morphology homeostasis and formation. Deletion of CASZ1 or inactivating mutations in the gene can lead to human developmental diseases or tumors, including congenital heart disease, cardiovascular disease, and neuroblastoma. CASZ1 can be used as a biomarker for disease prevention and diagnosis as well as a prognostic indicator for cancer. This review explores the unique functions of CASZ1 in tissue morphogenesis and associated diseases, offering new insights for elucidating the molecular mechanisms underlying diseases and identifying potential therapeutic targets for disease prevention and treatment.
Affiliation(s)
- Tiantian Liu
- Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Henan University of Chinese Medicine, 156 Jinshui East Road, Zhengzhou, 450046, Henan, China.
- Academy of Chinese Medical Sciences, Henan University of Chinese Medicine, Zhengzhou, 450046, Henan, China.
| | - Tao Li
- College of Life Sciences, Henan Agricultural University, Zhengzhou, 450002, China
| | - Shaorui Ke
- Henan Key Laboratory of Chinese Medicine for Respiratory Disease, Henan University of Chinese Medicine, 156 Jinshui East Road, Zhengzhou, 450046, Henan, China
- Academy of Chinese Medical Sciences, Henan University of Chinese Medicine, Zhengzhou, 450046, Henan, China
|
23
|
Mulet-Lazaro R, Delwel R. From Genotype to Phenotype: How Enhancers Control Gene Expression and Cell Identity in Hematopoiesis. Hemasphere 2023; 7:e969. [PMID: 37953829 PMCID: PMC10635615 DOI: 10.1097/hs9.0000000000000969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Accepted: 09/11/2023] [Indexed: 11/14/2023] Open
Abstract
Blood comprises a wide array of specialized cells, all of which share the same genetic information and ultimately derive from the same precursor, the hematopoietic stem cell (HSC). This diversity of phenotypes is underpinned by unique transcriptional programs gradually acquired in the process known as hematopoiesis. Spatiotemporal regulation of gene expression depends on many factors, but critical among them are enhancers, sequences of DNA that bind transcription factors and increase transcription of genes under their control. Thus, hematopoiesis involves the activation of specific enhancer repertoires in HSCs and their progeny, driving the expression of sets of genes that collectively determine morphology and function. Disruption of this tightly regulated process can have catastrophic consequences: in hematopoietic malignancies, dysregulation of transcriptional control by enhancers leads to misexpression of oncogenes that ultimately drive transformation. This review attempts to provide a basic understanding of enhancers and their role in transcriptional regulation, with a focus on normal and malignant hematopoiesis. We present examples of enhancers controlling master regulators of hematopoiesis and discuss the main mechanisms leading to enhancer dysregulation in leukemia and lymphoma.
Affiliation(s)
- Roger Mulet-Lazaro
- Department of Hematology, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
- Oncode Institute, Utrecht, the Netherlands
| | - Ruud Delwel
- Department of Hematology, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
- Oncode Institute, Utrecht, the Netherlands
|
24
|
Abatti LE, Lado-Fernández P, Huynh L, Collado M, Hoffman M, Mitchell J. Epigenetic reprogramming of a distal developmental enhancer cluster drives SOX2 overexpression in breast and lung adenocarcinoma. Nucleic Acids Res 2023; 51:10109-10131. [PMID: 37738673 PMCID: PMC10602899 DOI: 10.1093/nar/gkad734] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/09/2023] [Revised: 08/18/2023] [Accepted: 08/24/2023] [Indexed: 09/24/2023] Open
Abstract
Enhancer reprogramming has been proposed as a key source of transcriptional dysregulation during tumorigenesis, but the molecular mechanisms underlying this process remain unclear. Here, we identify an enhancer cluster required for normal development that is aberrantly activated in breast and lung adenocarcinoma. Deletion of the SRR124-134 cluster disrupts expression of the SOX2 oncogene, dysregulates genome-wide transcription and chromatin accessibility and reduces the ability of cancer cells to form colonies in vitro. Analysis of primary tumors reveals a correlation between chromatin accessibility at this cluster and SOX2 overexpression in breast and lung cancer patients. We demonstrate that FOXA1 is an activator and NFIB is a repressor of SRR124-134 activity and SOX2 transcription in cancer cells, revealing a co-opting of the regulatory mechanisms involved in early development. Notably, we show that the conserved SRR124 and SRR134 regions are essential during mouse development, where homozygous deletion results in the lethal failure of esophageal-tracheal separation. These findings provide insights into how developmental enhancers can be reprogrammed during tumorigenesis and underscore the importance of understanding enhancer dynamics during development and disease.
Affiliation(s)
- Luis E Abatti
- Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada
| | - Patricia Lado-Fernández
- Laboratory of Cell Senescence, Cancer and Aging, Health Research Institute of Santiago de Compostela (IDIS), Xerencia de Xestión Integrada de Santiago (XXIS/SERGAS), Santiago de Compostela, Spain
- Department of Physiology and Center for Research in Molecular Medicine and Chronic Diseases (CiMUS), Universidade de Santiago de Compostela, Santiago de Compostela, Spain
| | - Linh Huynh
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - Manuel Collado
- Laboratory of Cell Senescence, Cancer and Aging, Health Research Institute of Santiago de Compostela (IDIS), Xerencia de Xestión Integrada de Santiago (XXIS/SERGAS), Santiago de Compostela, Spain
| | - Michael M Hoffman
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
| | - Jennifer A Mitchell
- Department of Cell and Systems Biology, University of Toronto, Toronto, Ontario, Canada
- Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada
|
25
|
Kleinschmidt H, Xu C, Bai L. Using Synthetic DNA Libraries to Investigate Chromatin and Gene Regulation. Chromosoma 2023; 132:167-189. [PMID: 37184694 PMCID: PMC10542970 DOI: 10.1007/s00412-023-00796-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Revised: 04/25/2023] [Accepted: 04/26/2023] [Indexed: 05/16/2023]
Abstract
Despite the recent explosion in genome-wide studies in chromatin and gene regulation, we are still far from extracting a set of genetic rules that can predict the function of the regulatory genome. One major reason for this deficiency is that gene regulation is a multi-layered process that involves an enormous variable space, which cannot be fully explored using native genomes. This problem can be partially solved by introducing synthetic DNA libraries into cells, a method that can test the regulatory roles of thousands to millions of sequences with limited variables. Here, we review recent applications of this method to study transcription factor (TF) binding, nucleosome positioning, and transcriptional activity. We discuss the design principles, experimental procedures, and major findings from these studies and compare the pros and cons of different approaches.
Affiliation(s)
- Holly Kleinschmidt
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
| | - Cheng Xu
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
| | - Lu Bai
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA.
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA.
- Department of Physics, The Pennsylvania State University, University Park, PA, 16802, USA.
|
26
|
Fan K, Pfister E, Weng Z. Toward a comprehensive catalog of regulatory elements. Hum Genet 2023; 142:1091-1111. [PMID: 36935423 DOI: 10.1007/s00439-023-02519-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2022] [Accepted: 01/03/2023] [Indexed: 03/21/2023]
Abstract
Regulatory elements are the genomic regions that interact with transcription factors to control cell-type-specific gene expression in different cellular environments. A precise and complete catalog of functional elements encoded by the human genome is key to understanding mammalian gene regulation. Here, we review the current state of regulatory element annotation. We first provide an overview of assays for characterizing functional elements, including genome, epigenome, transcriptome, three-dimensional chromatin interaction, and functional validation assays. We then discuss computational methods for defining regulatory elements, including peak-calling and other statistical modeling methods. Finally, we introduce several high-quality lists of regulatory element annotations and suggest potential future directions.
Affiliation(s)
- Kaili Fan
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, 368 Plantation Street, ASC5-1069, Worcester, MA, 01605, USA
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA, 02138, USA
| | - Edith Pfister
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, 368 Plantation Street, ASC5-1069, Worcester, MA, 01605, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, 368 Plantation Street, ASC5-1069, Worcester, MA, 01605, USA.
|
27
|
Wu J, Yue C, Xu W, Li H, Zhu J, Li L. MNX1 facilitates the malignant progress of lung adenocarcinoma through transcriptionally upregulating CCDC34. Oncol Lett 2023; 26:325. [PMID: 37415626 PMCID: PMC10320431 DOI: 10.3892/ol.2023.13911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Accepted: 03/29/2023] [Indexed: 07/08/2023] Open
Abstract
Lung adenocarcinoma (LUAD) is the most prevalent subtype of lung cancer and typically has high incidence and fatality rates. Motor neuron and pancreas homeobox 1 (MNX1) and coiled-coil domain-containing 34 (CCDC34) serve as oncogenes in multiple types of cancer. However, their role in LUAD remains to be elucidated. In the present study, bioinformatics analysis and LUAD cell lines were used to examine the expression of MNX1 and CCDC34. The proliferation, migration and invasion abilities of A549 cells were determined using Cell Counting Kit-8, colony formation, wound-healing and Transwell assays, and flow cytometry was conducted to assess cell cycle distribution and apoptosis. The interaction between MNX1 and CCDC34 was verified by luciferase reporter and chromatin immunoprecipitation assays. In addition, an in vivo animal model of LUAD was established for validation. The results demonstrated that both MNX1 and CCDC34 were upregulated in LUAD cell lines. MNX1 knockdown significantly suppressed cell proliferation, migration and invasion, hindered cell cycle progression and promoted cell apoptosis in vitro and inhibited tumor growth in vivo. However, the antitumor effect of MNX1 knockdown was weakened by simultaneous CCDC34 overexpression in vitro. In terms of mechanism, MNX1 was demonstrated to directly bind to the CCDC34 promoter and transcriptionally activate CCDC34 expression. In conclusion, the present study highlighted a critical role of the MNX1/CCDC34 axis in regulating LUAD progression, providing novel therapeutic targets for LUAD treatment.
Affiliation(s)
- Junhua Wu
- Respiratory and Critical Care Medicine, Mianyang Central Hospital, Mianyang, Sichuan 621000, P.R. China
- School of Medicine, University of Electronic Science and Technology of China, Mianyang, Sichuan 621000, P.R. China
| | - Chongmei Yue
- Respiratory and Critical Care Medicine, Mianyang Central Hospital, Mianyang, Sichuan 621000, P.R. China
- School of Medicine, University of Electronic Science and Technology of China, Mianyang, Sichuan 621000, P.R. China
| | - Weiguo Xu
- Respiratory and Critical Care Medicine, Mianyang Central Hospital, Mianyang, Sichuan 621000, P.R. China
- School of Medicine, University of Electronic Science and Technology of China, Mianyang, Sichuan 621000, P.R. China
| | - Hui Li
- Respiratory and Critical Care Medicine, Mianyang Central Hospital, Mianyang, Sichuan 621000, P.R. China
- School of Medicine, University of Electronic Science and Technology of China, Mianyang, Sichuan 621000, P.R. China
| | - Jing Zhu
- Respiratory and Critical Care Medicine, Mianyang Central Hospital, Mianyang, Sichuan 621000, P.R. China
- School of Medicine, University of Electronic Science and Technology of China, Mianyang, Sichuan 621000, P.R. China
| | - Lin Li
- Respiratory and Critical Care Medicine, Mianyang Central Hospital, Mianyang, Sichuan 621000, P.R. China
- School of Medicine, University of Electronic Science and Technology of China, Mianyang, Sichuan 621000, P.R. China
|
28
|
Pan X, Ma Z, Sun X, Li H, Zhang T, Zhao C, Wang N, Heller R, Wong WH, Wang W, Jiang Y, Wang Y. CNEReg Interprets Ruminant-specific Conserved Non-coding Elements by Developmental Gene Regulatory Network. Genomics, Proteomics & Bioinformatics 2023; 21:632-648. [PMID: 36494035 PMCID: PMC10787174 DOI: 10.1016/j.gpb.2022.11.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/16/2021] [Revised: 11/12/2022] [Accepted: 11/30/2022] [Indexed: 12/12/2022]
Abstract
The genetic information coded in DNA leads to trait innovation via a gene regulatory network (GRN) in development. Here, we developed a conserved non-coding element interpretation method that integrates multi-omics data into a gene regulatory network (CNEReg) to investigate the innovation of the ruminant multi-chambered stomach. We generated paired expression and chromatin accessibility data during rumen and esophagus development in sheep, and revealed 1601 active ruminant-specific conserved non-coding elements (active-RSCNEs). To interpret the function of these active-RSCNEs, we defined toolkit transcription factors (TTFs) and modeled their regulation of rumen-specific genes via batteries of active-RSCNEs during development. Our developmental GRN revealed 18 TTFs and 313 active-RSCNEs regulating 7 rumen functional modules. Notably, 6 TTFs (OTX1, SOX21, HOXC8, SOX2, TP63, and PPARG), as well as 16 active-RSCNEs, functionally distinguished the rumen from the esophagus. Our study provides a systematic approach to understanding how gene regulation evolves and shapes complex traits by putting evo-devo concepts into practice with developmental multi-omics data.
Affiliation(s)
- Xiangyu Pan
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China; Department of Medical Research, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou 510080, China; Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou 510080, China
| | - Zhaoxia Ma
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; School of Mathematics, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Xinqi Sun
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; School of Mathematics, University of Chinese Academy of Sciences, Beijing 100049, China
| | - Hui Li
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China; State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, College of Animal Science and Technology, Guangxi University, Nanning 530005, China
| | - Tingting Zhang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
| | - Chen Zhao
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
| | - Nini Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
| | - Rasmus Heller
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen DK-2100, Denmark
| | - Wing Hung Wong
- Department of Statistics, Department of Biomedical Data Science, Bio-X Program, Stanford University, Stanford, CA 94305, USA
| | - Wen Wang
- Center for Ecological and Environmental Sciences, Northwestern Polytechnical University, Xi'an 710072, China; State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China.
| | - Yu Jiang
- Key Laboratory of Animal Genetics, Breeding and Reproduction of Shaanxi Province, College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China.
| | - Yong Wang
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China; School of Mathematics, University of Chinese Academy of Sciences, Beijing 100049, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China; Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China.
|
29
|
Kemmler CL, Moran HR, Murray BF, Scoresby A, Klem JR, Eckert RL, Lepovsky E, Bertho S, Nieuwenhuize S, Burger S, D'Agati G, Betz C, Puller AC, Felker A, Ditrychova K, Bötschi S, Affolter M, Rohner N, Lovely CB, Kwan KM, Burger A, Mosimann C. Next-generation plasmids for transgenesis in zebrafish and beyond. Development 2023; 150:dev201531. [PMID: 36975217 PMCID: PMC10263156 DOI: 10.1242/dev.201531] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 12/14/2022] [Accepted: 03/10/2023] [Indexed: 03/29/2023]
Abstract
Transgenesis is an essential technique for any genetic model. Tol2-based transgenesis paired with Gateway-compatible vector collections has transformed zebrafish transgenesis with an accessible modular system. Here, we establish several next-generation transgenesis tools for zebrafish and other species to expand and enhance transgenic applications. To facilitate gene regulatory element testing, we generated Gateway middle entry vectors harboring the small mouse beta-globin minimal promoter coupled to several fluorophores, CreERT2 and Gal4. To extend the color spectrum for transgenic applications, we established middle entry vectors encoding the bright, blue-fluorescent protein mCerulean and mApple as an alternative red fluorophore. We present a series of p2A peptide-based 3' vectors with different fluorophores and subcellular localizations to co-label cells expressing proteins of interest. Finally, we established Tol2 destination vectors carrying the zebrafish exorh promoter driving different fluorophores as a pineal gland-specific transgenesis marker that is active before hatching and through adulthood. exorh-based reporters and transgenesis markers also drive specific pineal gland expression in the eye-less cavefish (Astyanax). Together, our vectors provide versatile reagents for transgenesis applications in zebrafish, cavefish and other models.
Affiliation(s)
- Cassie L. Kemmler
- University of Colorado, School of Medicine, Anschutz Medical Campus, Department of Pediatrics, Section of Developmental Biology, 12801 E 17th Avenue, Aurora, CO 80045, USA
| | - Hannah R. Moran
- University of Colorado, School of Medicine, Anschutz Medical Campus, Department of Pediatrics, Section of Developmental Biology, 12801 E 17th Avenue, Aurora, CO 80045, USA
| | - Brooke F. Murray
- Department of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA
| | - Aaron Scoresby
- Department of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA
| | - John R. Klem
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY 40202, USA
| | - Rachel L. Eckert
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY 40202, USA
| | - Elizabeth Lepovsky
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY 40202, USA
| | - Sylvain Bertho
- Stowers Institute for Medical Research, Kansas City, MO 64110, USA
| | - Susan Nieuwenhuize
- University of Colorado, School of Medicine, Anschutz Medical Campus, Department of Pediatrics, Section of Developmental Biology, 12801 E 17th Avenue, Aurora, CO 80045, USA
- Department of Molecular Life Sciences, University of Zurich, 8057 Zürich, Switzerland
| | - Sibylle Burger
- Department of Molecular Life Sciences, University of Zurich, 8057 Zürich, Switzerland
| | - Gianluca D'Agati
- Department of Molecular Life Sciences, University of Zurich, 8057 Zürich, Switzerland
| | - Charles Betz
- Growth & Development, Biozentrum, Spitalstrasse 41, University of Basel, 4056 Basel, Switzerland
| | - Ann-Christin Puller
- Department of Molecular Life Sciences, University of Zurich, 8057 Zürich, Switzerland
| | - Anastasia Felker
- Department of Molecular Life Sciences, University of Zurich, 8057 Zürich, Switzerland
| | - Karolina Ditrychova
- Department of Molecular Life Sciences, University of Zurich, 8057 Zürich, Switzerland
| | - Seraina Bötschi
- Department of Molecular Life Sciences, University of Zurich, 8057 Zürich, Switzerland
| | - Markus Affolter
- Growth & Development, Biozentrum, Spitalstrasse 41, University of Basel, 4056 Basel, Switzerland
| | - Nicolas Rohner
- Stowers Institute for Medical Research, Kansas City, MO 64110, USA
| | - C. Ben Lovely
- Department of Biochemistry and Molecular Genetics, University of Louisville School of Medicine, Louisville, KY 40202, USA
| | - Kristen M. Kwan
- Department of Human Genetics, University of Utah, Salt Lake City, UT 84112, USA
| | - Alexa Burger
- University of Colorado, School of Medicine, Anschutz Medical Campus, Department of Pediatrics, Section of Developmental Biology, 12801 E 17th Avenue, Aurora, CO 80045, USA
| | - Christian Mosimann
- University of Colorado, School of Medicine, Anschutz Medical Campus, Department of Pediatrics, Section of Developmental Biology, 12801 E 17th Avenue, Aurora, CO 80045, USA
|
30
|
Wu H, Liu M, Zhang P, Zhang H. iEnhancer-SKNN: a stacking ensemble learning-based method for enhancer identification and classification using sequence information. Brief Funct Genomics 2023; 22:302-311. [PMID: 36715222 DOI: 10.1093/bfgp/elac057] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/10/2022] [Revised: 12/01/2022] [Accepted: 12/13/2022] [Indexed: 01/31/2023] Open
Abstract
Enhancers, a class of distal cis-regulatory elements located in the non-coding region of DNA, play a key role in gene regulation. It is difficult to identify enhancers from DNA sequence data because they are freely distributed in the non-coding region, have no specific sequence features, and can lie far from their target promoters. Therefore, this study presents a stacking ensemble learning method to accurately identify enhancers and classify them as strong or weak. Firstly, we obtain the fusion feature matrix by fusing the four features of Kmer, PseDNC, PCPseDNC and Z-Curve9. Secondly, five K-Nearest Neighbor (KNN) models with different parameters are trained as the base models, and the Logistic Regression algorithm is utilized as the meta-model. Thirdly, the stacking ensemble learning strategy is utilized to construct a two-layer model based on the base models and meta-model to train the preprocessed feature sets. The proposed method, named iEnhancer-SKNN, is a two-layer prediction model, in which the first layer predicts whether the given DNA sequences are enhancers or non-enhancers, and the second layer distinguishes whether the predicted enhancers are strong or weak. The performance of iEnhancer-SKNN is evaluated on the independent testing dataset and the results show that the proposed method has better performance in predicting enhancers and their strength. In enhancer identification, iEnhancer-SKNN achieves an accuracy of 81.75%, an improvement of 1.35% to 8.75% compared with other predictors, and in enhancer classification, iEnhancer-SKNN achieves an accuracy of 80.50%, an improvement of 5.5% to 25.5% compared with other predictors. Moreover, we identify key transcription factor binding site motifs in the enhancer regions and further explore the biological functions of the enhancers and these key motifs.
Source code and data can be downloaded from https://github.com/HaoWuLab-Bioinformatics/iEnhancer-SKNN.
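As an illustration of the simplest of the four sequence descriptors named in this abstract (Kmer), the following is a minimal Python sketch of a normalized k-mer frequency vector. It is not taken from the iEnhancer-SKNN repository: the function name `kmer_frequencies`, the default `k = 3`, and the choice to skip windows containing non-ACGT characters are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=3):
    """Return the normalized k-mer frequency vector of a DNA sequence.

    The vector has 4**k entries, ordered lexicographically over A, C, G, T.
    Windows containing any other character (for example N) are skipped;
    this handling is an assumption of the sketch, not the published method.
    """
    alphabet = "ACGT"
    sequence = sequence.upper()
    # Fixed vocabulary so that every sequence maps to the same vector layout.
    vocabulary = ["".join(p) for p in product(alphabet, repeat=k)]
    # Count every overlapping window of length k made only of A, C, G, T.
    counts = Counter(
        sequence[i:i + k]
        for i in range(len(sequence) - k + 1)
        if set(sequence[i:i + k]) <= set(alphabet)
    )
    total = sum(counts.values())
    if total == 0:
        # Sequence shorter than k, or no valid window: return an all-zero vector.
        return [0.0] * len(vocabulary)
    return [counts[kmer] / total for kmer in vocabulary]


if __name__ == "__main__":
    vector = kmer_frequencies("ACGTACGT", k=2)
    print(len(vector))  # 16 entries for k = 2
```

A vector of this kind could then be concatenated with the other descriptors before being passed to the base classifiers; how the published method weights or scales the fused features is not stated in the abstract.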
Affiliation(s)
- Hao Wu
- College of Information Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China; School of Software, Shandong University, Jinan, 250101, Shandong, China
| | - Mengdi Liu
- College of Information Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China
| | - Pengyu Zhang
- College of Information Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China
| | - Hongming Zhang
- College of Information Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China
|
31
|
Skvortsova K, Bertrand S, Voronov D, Duckett PE, Ross SE, Magri MS, Maeso I, Weatheritt RJ, Gómez Skarmeta JL, Arnone MI, Escriva H, Bogdanovic O. Active DNA demethylation of developmental cis-regulatory regions predates vertebrate origins. Science Advances 2022; 8:eabn2258. [PMID: 36459547 PMCID: PMC10936051 DOI: 10.1126/sciadv.abn2258] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/12/2021] [Accepted: 10/19/2022] [Indexed: 06/17/2023]
Abstract
DNA methylation [5-methylcytosine (5mC)] is a repressive gene-regulatory mark required for vertebrate embryogenesis. Genomic 5mC is tightly regulated through the action of DNA methyltransferases, which deposit 5mC, and ten-eleven translocation (TET) enzymes, which participate in its active removal through the formation of 5-hydroxymethylcytosine (5hmC). TET enzymes are essential for mammalian gastrulation and activation of vertebrate developmental enhancers; however, to date, a clear picture of 5hmC function, abundance, and genomic distribution in nonvertebrate lineages is lacking. By using base-resolution 5mC and 5hmC quantification during sea urchin and lancelet embryogenesis, we shed light on the roles of nonvertebrate 5hmC and TET enzymes. We find that these invertebrate deuterostomes use TET enzymes for targeted demethylation of regulatory regions associated with developmental genes and show that the complement of identified 5hmC-regulated genes is conserved to vertebrates. This work demonstrates that active 5mC removal from regulatory regions is a common feature of deuterostome embryogenesis suggestive of an unexpected deep conservation of a major gene-regulatory module.
Affiliation(s)
- Ksenia Skvortsova
- Genomics and Epigenetics Division, Garvan Institute of Medical Research, Sydney, Australia
- St. Vincent’s Clinical School, Faculty of Medicine, University of New South Wales, Sydney, Australia
| | - Stephanie Bertrand
- Sorbonne Université, CNRS, Biologie Intégrative des Organismes Marins (BIOM), Observatoire Océanologique, Banyuls-sur-Mer, France
| | - Danila Voronov
- Biology and Evolution of Marine Organisms (BEOM), Stazione Zoologica Anton Dohrn, Naples, Italy
| | - Paul E. Duckett
- Genomics and Epigenetics Division, Garvan Institute of Medical Research, Sydney, Australia
| | - Samuel E. Ross
- Genomics and Epigenetics Division, Garvan Institute of Medical Research, Sydney, Australia
- St. Vincent’s Clinical School, Faculty of Medicine, University of New South Wales, Sydney, Australia
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney 22, Australia
| | - Marta Silvia Magri
- Centro Andaluz de Biología del Desarrollo, CSIC-Universidad Pablo de Olavide-Junta de Andalucía, Seville, Spain
| | - Ignacio Maeso
- Centro Andaluz de Biología del Desarrollo, CSIC-Universidad Pablo de Olavide-Junta de Andalucía, Seville, Spain
| | - Robert J. Weatheritt
- Genomics and Epigenetics Division, Garvan Institute of Medical Research, Sydney, Australia
- EMBL Australia, Garvan Institute of Medical Research, Sydney, Australia
| | - Jose Luis Gómez Skarmeta
- Centro Andaluz de Biología del Desarrollo, CSIC-Universidad Pablo de Olavide-Junta de Andalucía, Seville, Spain
| | - Maria Ina Arnone
- Biology and Evolution of Marine Organisms (BEOM), Stazione Zoologica Anton Dohrn, Naples, Italy
| | - Hector Escriva
- Sorbonne Université, CNRS, Biologie Intégrative des Organismes Marins (BIOM), Observatoire Océanologique, Banyuls-sur-Mer, France
| | - Ozren Bogdanovic
- Genomics and Epigenetics Division, Garvan Institute of Medical Research, Sydney, Australia
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney 22, Australia
- Centro Andaluz de Biología del Desarrollo, CSIC-Universidad Pablo de Olavide-Junta de Andalucía, Seville, Spain
| |
|
32
|
Liao M, Zhao JP, Tian J, Zheng CH. iEnhancer-DCLA: using the original sequence to identify enhancers and their strength based on a deep learning framework. BMC Bioinformatics 2022; 23:480. [PMCID: PMC9664816 DOI: 10.1186/s12859-022-05033-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/29/2022] [Accepted: 11/02/2022] [Indexed: 11/16/2022] Open
Abstract
Enhancers are small regions of DNA that bind to proteins, which enhance the transcription of genes. The enhancer may be located upstream or downstream of the gene. It is not necessarily close to the gene to be acted on, because the entanglement structure of chromatin allows the positions far apart in the sequence to have the opportunity to contact each other. Therefore, identifying enhancers and their strength is a complex and challenging task. In this article, a new prediction method based on deep learning is proposed to identify enhancers and enhancer strength, called iEnhancer-DCLA. Firstly, we use word2vec to convert k-mers into number vectors to construct an input matrix. Secondly, we use convolutional neural network and bidirectional long short-term memory network to extract sequence features, and finally use the attention mechanism to extract relatively important features. In the task of predicting enhancers and their strengths, this method has improved to a certain extent in most evaluation indexes. In summary, we believe that this method provides new ideas in the analysis of enhancers.
|
33
|
Panara V, Monteiro R, Koltowska K. Epigenetic Regulation of Endothelial Cell Lineages During Zebrafish Development-New Insights From Technical Advances. Front Cell Dev Biol 2022; 10:891538. [PMID: 35615697 PMCID: PMC9125237 DOI: 10.3389/fcell.2022.891538] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/07/2022] [Accepted: 04/10/2022] [Indexed: 01/09/2023] Open
Abstract
Epigenetic regulation is integral in orchestrating the spatiotemporal regulation of gene expression which underlies tissue development. The emergence of new tools to assess genome-wide epigenetic modifications has enabled significant advances in the field of vascular biology in zebrafish. Zebrafish represents a powerful model to investigate the activity of cis-regulatory elements in vivo by combining technologies such as ATAC-seq, ChIP-seq and CUT&Tag with the generation of transgenic lines and live imaging to validate the activity of these regulatory elements. Recently, this approach led to the identification and characterization of key enhancers of important vascular genes, such as gata2a, notch1b and dll4. In this review we will discuss how the latest technologies in epigenetics are being used in the zebrafish to determine chromatin states and assess the function of the cis-regulatory sequences that shape the zebrafish vascular network.
Affiliation(s)
- Virginia Panara
- Immunology Genetics and Pathology, Uppsala University, Uppsala, Sweden
| | - Rui Monteiro
- Institute of Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, United Kingdom
- Birmingham Centre of Genome Biology, University of Birmingham, Birmingham, United Kingdom
| | | |
|
34
|
Snetkova V, Pennacchio LA, Visel A, Dickel DE. Perfect and imperfect views of ultraconserved sequences. Nat Rev Genet 2022; 23:182-194. [PMID: 34764456 PMCID: PMC8858888 DOI: 10.1038/s41576-021-00424-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Accepted: 09/30/2021] [Indexed: 12/12/2022]
Abstract
Across the human genome, there are nearly 500 'ultraconserved' elements: regions of at least 200 contiguous nucleotides that are perfectly conserved in both the mouse and rat genomes. Remarkably, the majority of these sequences are non-coding, and many can function as enhancers that activate tissue-specific gene expression during embryonic development. From their first description more than 15 years ago, their extreme conservation has both fascinated and perplexed researchers in genomics and evolutionary biology. The intrigue around ultraconserved elements only grew with the observation that they are dispensable for viability. Here, we review recent progress towards understanding the general importance and the specific functions of ultraconserved sequences in mammalian development and human disease and discuss possible explanations for their extreme conservation.
Affiliation(s)
- Valentina Snetkova
- Environmental Genomics & Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Department of Molecular Biology, Genentech, South San Francisco, CA, USA
| | - Len A Pennacchio
- Environmental Genomics & Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
- Comparative Biochemistry Program, University of California, Berkeley, CA, USA.
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA.
| | - Axel Visel
- Environmental Genomics & Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
- US Department of Energy Joint Genome Institute, Berkeley, CA, USA.
- School of Natural Sciences, University of California, Merced, Merced, CA, USA.
| | - Diane E Dickel
- Environmental Genomics & Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA.
| |
|
35
|
Shapiro JA. What we have learned about evolutionary genome change in the past 7 decades. Biosystems 2022; 215-216:104669. [DOI: 10.1016/j.biosystems.2022.104669] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/02/2022] [Revised: 03/23/2022] [Accepted: 03/23/2022] [Indexed: 12/12/2022]
|
36
|
Maderazo D, Flegg JA, Algama M, Ramialison M, Keith J. Detection and identification of cis-regulatory elements using change-point and classification algorithms. BMC Genomics 2022; 23:78. [PMID: 35078412 PMCID: PMC8790847 DOI: 10.1186/s12864-021-08190-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2021] [Accepted: 11/19/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Transcriptional regulation is primarily mediated by the binding of factors to non-coding regions in DNA. Identification of these binding regions enhances understanding of tissue formation and potentially facilitates the development of gene therapies. However, successful identification of binding regions is made difficult by the lack of a universal biological code for their characterisation. RESULTS: We extend an alignment-based method, changept, and identify clusters of biological significance, through ontology and de novo motif analysis. Further, we apply a Bayesian method to estimate and combine binary classifiers on the clusters we identify to produce a better performing composite. CONCLUSIONS: The analysis we describe provides a computational method for identification of conserved binding sites in the human genome and facilitates an alternative interrogation of combinations of existing data sets with alignment data.
Affiliation(s)
- Dominic Maderazo
- School of Mathematics and Statistics, The University of Melbourne, Melbourne, 3010, VIC, Australia.
| | - Jennifer A Flegg
- School of Mathematics and Statistics, The University of Melbourne, Melbourne, 3010, VIC, Australia
| | - Manjula Algama
- School of Mathematics, Monash University, Melbourne, 3800, VIC, Australia
| | - Mirana Ramialison
- Australian Regenerative Medicine Institute, Monash University, Melbourne, 3800, VIC, Australia
| | - Jonathan Keith
- School of Mathematics, Monash University, Melbourne, 3800, VIC, Australia
| |
|
37
|
Ma L, Yuan T, Li W, Guo L, Zhu D, Wang Z, Liu Z, Xue K, Wang Y, Liu J, Man W, Ye Z, Liu F, Wang J. Dynamic Functional Connectivity Alterations and Their Associated Gene Expression Pattern in Autism Spectrum Disorders. Front Neurosci 2022; 15:794151. [PMID: 35082596 PMCID: PMC8784878 DOI: 10.3389/fnins.2021.794151] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 10/13/2021] [Accepted: 12/16/2021] [Indexed: 12/12/2022] Open
Abstract
Autism spectrum disorders (ASDs) are a group of heterogeneous neurodevelopmental disorders that are highly heritable and are associated with impaired dynamic functional connectivity (DFC). However, the molecular mechanisms behind DFC alterations remain largely unknown. Eighty-eight patients with ASDs and 87 demographically matched typical controls (TCs) from the Autism Brain Imaging Data Exchange II database were included in this study. A seed-based sliding window approach was then performed to investigate the DFC changes in each of the 29 seeds in 10 classic resting-state functional networks and the whole brain. Subsequently, the relationships between DFC alterations in patients with ASDs and their symptom severity were assessed. Finally, transcription-neuroimaging association analyses were conducted to explore the molecular mechanisms of DFC disruptions in patients with ASDs. Compared with TCs, patients with ASDs showed significantly increased DFC between the right dorsolateral prefrontal cortex (DLPFC) and left fusiform/lingual gyrus, between the DLPFC and the superior temporal gyrus, between the right frontal eye field (FEF) and left middle frontal gyrus, between the FEF and the right angular gyrus, and between the left intraparietal sulcus and the right middle temporal gyrus. Moreover, significant relationships between DFC alterations and symptom severity were observed. Furthermore, the genes associated with DFC changes in ASDs were identified by performing gene-wise across-sample spatial correlation analysis between gene expression extracted from six donors’ brain of the Allen Human Brain Atlas and case-control DFC difference. In enrichment analysis, these genes were enriched for processes associated with synaptic signaling and voltage-gated ion channels and calcium pathways; also, these genes were highly expressed in autistic disorder, chronic alcoholic intoxication and several disorders related to depression. 
These results not only demonstrated higher DFC in patients with ASDs but also provided novel insight into the molecular mechanisms underlying these alterations.
Affiliation(s)
- Lin Ma
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
| | - Tengfei Yuan
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
| | - Wei Li
- Department of Radiology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin’s Clinical Research Center for Cancer, Tianjin, China
| | - Lining Guo
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
| | - Dan Zhu
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
- Department of Radiology, Tianjin Medical University General Hospital Airport Hospital, Tianjin, China
| | - Zirui Wang
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
| | - Zhixuan Liu
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
| | - Kaizhong Xue
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
| | - Yaoyi Wang
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
| | - Jiawei Liu
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
| | - Weiqi Man
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
| | - Zhaoxiang Ye
- Department of Radiology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin’s Clinical Research Center for Cancer, Tianjin, China
- *Correspondence: Zhaoxiang Ye
| | - Feng Liu
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
- *Correspondence: Feng Liu
| | - Junping Wang
- Department of Radiology and Tianjin Key Laboratory of Functional Imaging, Tianjin Medical University General Hospital, Tianjin, China
- *Correspondence: Junping Wang
| |
|
38
|
Roscito JG, Sameith K, Kirilenko BM, Hecker N, Winkler S, Dahl A, Rodrigues MT, Hiller M. Convergent and lineage-specific genomic differences in limb regulatory elements in limbless reptile lineages. Cell Rep 2022; 38:110280. [DOI: 10.1016/j.celrep.2021.110280] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/01/2021] [Revised: 11/24/2021] [Accepted: 12/27/2021] [Indexed: 01/02/2023] Open
|
39
|
Pagni S, Mills JD, Frankish A, Mudge JM, Sisodiya SM. Non-coding regulatory elements: Potential roles in disease and the case of epilepsy. Neuropathol Appl Neurobiol 2021; 48:e12775. [PMID: 34820881 DOI: 10.1111/nan.12775] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 06/21/2021] [Revised: 10/04/2021] [Accepted: 11/16/2021] [Indexed: 12/27/2022]
Abstract
Non-coding DNA (ncDNA) refers to the portion of the genome that does not code for proteins and accounts for the greatest physical proportion of the human genome. ncDNA includes sequences that are transcribed into RNA molecules, such as ribosomal RNAs (rRNAs), microRNAs (miRNAs), long non-coding RNAs (lncRNAs) and un-transcribed sequences that have regulatory functions, including gene promoters and enhancers. Variation in non-coding regions of the genome have an established role in human disease, with growing evidence from many areas, including several cancers, Parkinson's disease and autism. Here, we review the features and functions of the regulatory elements that are present in the non-coding genome and the role that these regions have in human disease. We then review the existing research in epilepsy and emphasise the potential value of further exploring non-coding regulatory elements in epilepsy. In addition, we outline the most widely used techniques for recognising regulatory elements throughout the genome, current methodologies for investigating variation and the main challenges associated with research in the field of non-coding DNA.
Affiliation(s)
- Susanna Pagni
- Department of Clinical and Experimental Epilepsy, UCL Queen Square Institute of Neurology, London, UK.,Chalfont Centre for Epilepsy, Chalfont St Peter, UK
| | - James D Mills
- Department of Clinical and Experimental Epilepsy, UCL Queen Square Institute of Neurology, London, UK.,Chalfont Centre for Epilepsy, Chalfont St Peter, UK.,Amsterdam UMC, Department of (Neuro)Pathology, Amsterdam Neuroscience, University of Amsterdam, Amsterdam, Netherlands
| | - Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Jonathan M Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Sanjay M Sisodiya
- Department of Clinical and Experimental Epilepsy, UCL Queen Square Institute of Neurology, London, UK.,Chalfont Centre for Epilepsy, Chalfont St Peter, UK
| |
|
40
|
Spead O, Weaver CJ, Moreland T, Poulain FE. Live imaging of retinotectal mapping reveals topographic map dynamics and a previously undescribed role for Contactin 2 in map sharpening. Development 2021; 148:272618. [PMID: 34698769 DOI: 10.1242/dev.199584] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/03/2021] [Accepted: 10/07/2021] [Indexed: 11/20/2022]
Abstract
Organization of neuronal connections into topographic maps is essential for processing information. Yet, our understanding of topographic mapping has remained limited by our inability to observe maps forming and refining directly in vivo. Here, we used Cre-mediated recombination of a new colorswitch reporter in zebrafish to generate the first transgenic model allowing the dynamic analysis of retinotectal mapping in vivo. We found that the antero-posterior retinotopic map forms early but remains dynamic, with nasal and temporal retinal axons expanding their projection domains over time. Nasal projections initially arborize in the anterior tectum but progressively refine their projection domain to the posterior tectum, leading to the sharpening of the retinotopic map along the antero-posterior axis. Finally, using a CRISPR-mediated mutagenesis approach, we demonstrate that the refinement of nasal retinal projections requires the adhesion molecule Contactin 2. Altogether, our study provides the first analysis of a topographic map maturing in real time in a live animal and opens new strategies for dissecting the molecular mechanisms underlying precise topographic mapping in vertebrates.
Affiliation(s)
- Olivia Spead
- Department of Biological Sciences, University of South Carolina, Columbia, SC 29208, USA
| | - Cory J Weaver
- Department of Biological Sciences, University of South Carolina, Columbia, SC 29208, USA
| | - Trevor Moreland
- Department of Biological Sciences, University of South Carolina, Columbia, SC 29208, USA
| | - Fabienne E Poulain
- Department of Biological Sciences, University of South Carolina, Columbia, SC 29208, USA
| |
|
41
|
Parker HJ, De Kumar B, Pushel I, Bronner ME, Krumlauf R. Analysis of lamprey meis genes reveals that conserved inputs from Hox, Meis and Pbx proteins control their expression in the hindbrain and neural tube. Dev Biol 2021; 479:61-76. [PMID: 34310923 DOI: 10.1016/j.ydbio.2021.07.014] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/15/2021] [Revised: 06/10/2021] [Accepted: 07/22/2021] [Indexed: 11/23/2022]
Abstract
Meis genes are known to play important roles in the hindbrain and neural crest cells of jawed vertebrates. To explore the roles of Meis genes in head development during evolution of vertebrates, we have identified four meis genes in the sea lamprey genome and characterized their patterns of expression and regulation, with a focus on the hindbrain and pharynx. Each of the lamprey meis genes displays temporally and spatially dynamic patterns of expression, some of which are coupled to rhombomeric domains in the developing hindbrain and select pharyngeal arches. Studies of Meis loci in mouse and zebrafish have identified enhancers that are bound by Hox and TALE (Meis and Pbx) proteins, implicating these factors in the direct regulation of Meis expression. We examined the lamprey meis loci and identified a series of cis-elements conserved between lamprey and jawed vertebrate meis genes. In transgenic reporter assays we demonstrated that these elements act as neural enhancers in lamprey embryos, directing reporter expression in appropriate domains when compared to expression of their associated endogenous meis gene. Sequence alignments reveal that these conserved elements are in similar relative positions of the meis loci and contain a series of consensus binding motifs for Hox and TALE proteins. This suggests that ancient Hox and TALE-responsive enhancers regulated expression of ancestral vertebrate meis genes in segmental domains in the hindbrain and have been retained in the meis loci during vertebrate evolution. The presence of conserved Meis, Pbx and Hox binding sites in these lamprey enhancers links Hox and TALE factors to regulation of lamprey meis genes in the developing hindbrain, indicating a deep ancestry for these regulatory interactions prior to the divergence of jawed and jawless vertebrates.
Affiliation(s)
- Hugo J Parker
- Stowers Institute for Medical Research, Kansas City, MO, 64110, USA
| | - Bony De Kumar
- Stowers Institute for Medical Research, Kansas City, MO, 64110, USA
| | - Irina Pushel
- Stowers Institute for Medical Research, Kansas City, MO, 64110, USA
| | - Marianne E Bronner
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125, USA
| | - Robb Krumlauf
- Stowers Institute for Medical Research, Kansas City, MO, 64110, USA; Department of Anatomy and Cell Biology, Kansas University Medical Center, Kansas City, KS, 66160, USA.
| |
|
42
|
Tayara H, Chong KT. Improved Predicting of The Sequence Specificities of RNA Binding Proteins by Deep Learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:2526-2534. [PMID: 32191896 DOI: 10.1109/tcbb.2020.2981335] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 06/10/2023]
Abstract
RNA-binding proteins (RBPs) have a significant role in various regulatory tasks. However, the mechanism by which RBPs identify the subsequence target RNAs is still not clear. In recent years, several machine and deep learning-based computational models have been proposed for understanding the binding preferences of RBPs. These methods required integrating multiple features with raw RNA sequences such as secondary structure and their performances can be further improved. In this paper, we propose an efficient and simple convolution neural network, RBPCNN, that relies on the combination of the raw RNA sequence and evolutionary information. We show that conservation scores (evolutionary information) for the RNA sequences can significantly improve the overall performance of the proposed predictor. In addition, the automatic extraction of the binding sequence motifs can enhance our understanding of the binding specificities of RBPs. The experimental results show that RBPCNN outperforms significantly the current state-of-the-art methods. More specifically, the average area under the receiver operator curve was improved by 2.67 percent and the mean average precision was improved by 8.03 percent. The datasets and results can be downloaded from https://home.jbnu.ac.kr/NSCL/RBPCNN.htm.
|
43
|
German OL, Vallese-Maurizi H, Soto TB, Rotstein NP, Politi LE. Retina stem cells, hopes and obstacles. World J Stem Cells 2021; 13:1446-1479. [PMID: 34786153 PMCID: PMC8567457 DOI: 10.4252/wjsc.v13.i10.1446] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/28/2021] [Revised: 07/14/2021] [Accepted: 09/17/2021] [Indexed: 02/07/2023] Open
Abstract
Retinal degeneration is a major contributor to visual dysfunction worldwide. Although it comprises several eye diseases, loss of retinal pigment epithelial (RPE) and photoreceptor cells are the major contributors to their pathogenesis. Early therapies included diverse treatments, such as provision of anti-vascular endothelial growth factor and many survival and trophic factors that, in some cases, slow down the progression of the degeneration, but do not effectively prevent it. The finding of stem cells (SC) in the eye has led to the proposal of cell replacement strategies for retina degeneration. Therapies using different types of SC, such as retinal progenitor cells (RPCs), embryonic SC, pluripotent SCs (PSCs), induced PSCs (iPSCs), and mesenchymal stromal cells, capable of self-renewal and of differentiating into multiple cell types, have gained ample support. Numerous preclinical studies have assessed transplantation of SC in animal models, with encouraging results. The aim of this work is to revise the different preclinical and clinical approaches, analyzing the SC type used, their efficacy, safety, cell attachment and integration, absence of tumor formation and immunorejection, in order to establish which were the most relevant and successful. In addition, we examine the questions and concerns still open in the field. The data demonstrate the existence of two main approaches, aimed at replacing either RPE cells or photoreceptors. Emerging evidence suggests that RPCs and iPSC are the best candidates, presenting no ethical concerns and a low risk of immunorejection. Clinical trials have already supported the safety and efficacy of SC treatments. Serious concerns are pending, such as the risk of tumor formation, lack of attachment or integration of transplanted cells into host retinas, immunorejection, cell death, and also ethical. 
However, the amazing progress in the field in the last few years makes it possible to envisage safe and effective treatments to restore vision loss in a near future.
Affiliation(s)
- Olga L German
- Department of Biology, Biochemistry and Pharmacy, Universidad Nacional del Sur, Bahia blanca 8000, Buenos Aires, Argentina
- Department of Biology, Biochemistry and Pharmacy, Universidad Nacional del Sur, and Neurobiology Department, Instituto de Investigaciones Bioquímicas de Bahía Blanca (INIBIBB) Conicet, Bahía Blanca 8000, Buenos Aires, Argentina
| | - Harmonie Vallese-Maurizi
- Department of Biology, Biochemistry and Pharmacy, Universidad Nacional del Sur, Bahia blanca 8000, Buenos Aires, Argentina
- Department of Biology, Biochemistry and Pharmacy, Universidad Nacional del Sur, and Neurobiology Department, Instituto de Investigaciones Bioquímicas de Bahía Blanca (INIBIBB) Conicet, Bahía Blanca 8000, Buenos Aires, Argentina
| | - Tamara B Soto
- Department of Biology, Biochemistry and Pharmacy, Universidad Nacional del Sur, and Neurobiology Department, Instituto de Investigaciones Bioquímicas de Bahía Blanca (INIBIBB) Conicet, Bahía Blanca 8000, Buenos Aires, Argentina
| | - Nora P Rotstein
- Department of Biology, Biochemistry and Pharmacy, Universidad Nacional del Sur, Bahia blanca 8000, Buenos Aires, Argentina
- Department of Biology, Biochemistry and Pharmacy, Universidad Nacional del Sur, and Neurobiology Department, Instituto de Investigaciones Bioquímicas de Bahía Blanca (INIBIBB) Conicet, Bahía Blanca 8000, Buenos Aires, Argentina
| | - Luis Enrique Politi
- Department of Biology, Biochemistry and Pharmacy, Universidad Nacional del Sur, and Neurobiology Department, Instituto de Investigaciones Bioquímicas de Bahía Blanca (INIBIBB) Conicet, Bahía Blanca 8000, Buenos Aires, Argentina
| |
|
44
|
Maurya SS. Role of Enhancers in Development and Diseases. Epigenomes 2021; 5:epigenomes5040021. [PMID: 34968246 PMCID: PMC8715447 DOI: 10.3390/epigenomes5040021] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/28/2021] [Revised: 09/21/2021] [Accepted: 09/28/2021] [Indexed: 12/26/2022] Open
Abstract
Enhancers are cis-regulatory elements containing short DNA sequences that serve as binding sites for pioneer/regulatory transcription factors, thus orchestrating the regulation of genes critical for lineage determination. The activity of enhancer elements is believed to be determined by transcription factor binding, thus determining the cell state identity during development. Precise spatio-temporal control of the transcriptome during lineage specification requires the coordinated binding of lineage-specific transcription factors to enhancers. Thus, enhancers are the primary determinants of cell identity. Numerous studies have explored the role and mechanism of enhancers during development and disease, and various basic questions related to the functions and mechanisms of enhancers have not yet been fully answered. In this review, we discuss the recently published literature regarding the roles of enhancers, which are critical for various biological processes governing development. Furthermore, we also highlight that altered enhancer landscapes provide an essential context to understand the etiologies and mechanisms behind numerous complex human diseases, providing new avenues for effective enhancer-based therapeutic interventions.
Affiliation(s)
- Shailendra S Maurya
- Department of Pediatrics, Division of Pediatric Hematology and Oncology, Department of Developmental Biology, School of Medicine, Washington University in St. Louis, 660 South Euclid Avenue, St. Louis, MO 63110, USA
| |
|
45
|
The evolutionary acquisition and mode of functions of promoter-associated non-coding RNAs (pancRNAs) for mammalian development. Essays Biochem 2021; 65:697-708. [PMID: 34328174 DOI: 10.1042/ebc20200143] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/04/2021] [Revised: 05/13/2021] [Accepted: 07/16/2021] [Indexed: 12/22/2022]
Abstract
Increasing evidence has shown that many long non-coding RNAs (lncRNAs) are involved in gene regulation in a variety of ways such as transcriptional, post-transcriptional and epigenetic regulation. Promoter-associated non-coding RNAs (pancRNAs), which are categorized into the most abundant single-copy lncRNA biotype, play vital regulatory roles in finely tuning cellular specification at the epigenomic level. In short, pancRNAs can directly or indirectly regulate downstream genes to participate in the development of organisms in a cell-specific manner. In this review, we will introduce the evolutionarily acquired characteristics of pancRNAs as determined by comparative epigenomics and elaborate on the research progress on pancRNA-involving processes in mammalian embryonic development, including neural differentiation.
|
46
|
Beaulieu JM, O'Meara BC, Gilchrist MA. A Spatially Explicit Model of Stabilizing Selection for Improving Phylogenetic Inference. Mol Biol Evol 2021; 38:1641-1652. [PMID: 33306127 PMCID: PMC8042768 DOI: 10.1093/molbev/msaa318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022] Open
Abstract
Ultraconserved elements (UCEs) are stretches of hundreds of nucleotides with highly conserved cores flanked by variable regions. Although the selective forces responsible for the preservation of UCEs are unknown, they are nonetheless believed to contain phylogenetically meaningful information from deep to shallow divergence events. Phylogenetic applications of UCEs assume the same degree of rate heterogeneity applies across the entire locus, including variable flanking regions. We present a Wright–Fisher model of selection on nucleotides (SelON) which includes the effects of mutation, drift, and spatially varying, stabilizing selection for an optimal nucleotide sequence. The SelON model assumes the strength of stabilizing selection follows a position-dependent Gaussian function whose exact shape can vary between UCEs. We evaluate SelON by comparing its performance to a simpler and spatially invariant GTR+Γ model using an empirical data set of 400 vertebrate UCEs used to determine the phylogenetic position of turtles. We observe much improvement in model fit of SelON over the GTR+Γ model, and support for turtles as sister to lepidosaurs. Overall, the UCE-specific parameters SelON estimates provide a compact way of quantifying the strength and variation in selection within and across UCEs. SelON can also be extended to include more realistic mapping functions between sequence and stabilizing selection as well as allow for greater levels of rate heterogeneity. By more explicitly modeling the nature of selection on UCEs, SelON and similar approaches can be used to better understand the biological mechanisms responsible for their preservation across highly divergent taxa and long evolutionary time scales.
Affiliation(s)
- Jeremy M Beaulieu
- Department of Biological Sciences, University of Arkansas, Fayetteville, AR, USA
- Brian C O'Meara
- Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, TN, USA
- Michael A Gilchrist
- Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, TN, USA
|
47
|
Abstract
We developed dbCNS (http://yamasati.nig.ac.jp/dbcns), a new database for conserved noncoding sequences (CNSs). CNSs exist in many eukaryotes and are assumed to be involved in protein expression control. Version 1 of dbCNS, introduced here, includes a powerful and precise CNS identification pipeline for multiple vertebrate genomes. Mutations in CNSs may induce morphological changes and cause genetic diseases. For this reason, many vertebrate CNSs have been identified, with special reference to primate genomes. We integrated ∼6.9 million CNSs from many vertebrate genomes into dbCNS, which allows users to extract CNSs near genes of interest using keyword searches. In addition to CNSs, dbCNS contains published genome sequences of 161 species. With purposeful taxonomic sampling of genomes, users can employ CNSs as queries to reconstruct CNS alignments and phylogenetic trees, to evaluate CNS modifications, acquisitions, and losses, and to roughly identify species with CNSs having accelerated substitution rates. dbCNS also produces links to dbSNP for searching pathogenic single-nucleotide polymorphisms in human CNSs. Thus, dbCNS connects morphological changes with genetic diseases. A test analysis using 38 gnathostome genomes was accomplished within 30 s. dbCNS results can evaluate CNSs identified by other stand-alone programs using genome-scale data.
Affiliation(s)
- Jun Inoue
- Population Genetics Laboratory, Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Japan; Center for Earth Surface System Dynamics, Atmosphere and Ocean Research Institute, University of Tokyo, Kashiwa, Japan
- Naruya Saitou
- Population Genetics Laboratory, Department of Genomics and Evolutionary Biology, National Institute of Genetics, Mishima, Japan; Department of Okinawa Bioinformation Bank, Faculty of Medicine, University of the Ryukyus, Okinawa, Japan
|
48
|
Ni P, Su Z. Accurate prediction of cis-regulatory modules reveals a prevalent regulatory genome of humans. NAR Genom Bioinform 2021; 3:lqab052. [PMID: 34159315 PMCID: PMC8210889 DOI: 10.1093/nargab/lqab052] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 02/26/2021] [Revised: 05/01/2021] [Accepted: 06/14/2021] [Indexed: 02/07/2023] Open
Abstract
cis-regulatory modules (CRMs) formed by clusters of transcription factor (TF) binding sites (TFBSs) are as important as coding sequences in specifying phenotypes of humans. It is essential to categorize all CRMs and constituent TFBSs in the genome. In contrast to most existing methods that predict CRMs in specific cell types using epigenetic marks, we predict a largely cell type agnostic but more comprehensive map of CRMs and constituent TFBSs in the genome by integrating all available TF ChIP-seq datasets. Our method is able to partition the 77.47% of genome regions covered by the 6092 available datasets into a CRM candidate (CRMC) set (56.84%) and a non-CRMC set (43.16%). Intriguingly, the predicted CRMCs are under strong evolutionary constraints, while the non-CRMCs are largely selectively neutral, strongly suggesting that the CRMCs are likely cis-regulatory, while the non-CRMCs are not. Our predicted CRMs are under stronger evolutionary constraints than three state-of-the-art predictions (GeneHancer, EnhancerAtlas and ENCODE phase 3) and substantially outperform them for recalling VISTA enhancers and non-coding ClinVar variants. We estimated that the human genome might encode about 1.47M CRMs and 68M TFBSs, comprising about 55% and 22% of the genome, respectively, and we predicted 80% of each. Therefore, the cis-regulatory genome appears to be more prevalent than originally thought.
Affiliation(s)
- Pengyu Ni
- Department of Bioinformatics and Genomics, the University of North Carolina at Charlotte, 9201 University City Boulevard, Charlotte, NC 28223, USA
- Zhengchang Su
- Department of Bioinformatics and Genomics, the University of North Carolina at Charlotte, 9201 University City Boulevard, Charlotte, NC 28223, USA
|
49
|
Suzuki A, Guerrini MM, Yamamoto K. Functional genomics of autoimmune diseases. Ann Rheum Dis 2021; 80:689-697. [PMID: 33408079 DOI: 10.1136/annrheumdis-2019-216794] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/30/2020] [Accepted: 12/06/2020] [Indexed: 12/22/2022]
Abstract
For more than a decade, genome-wide association studies have been applied to autoimmune diseases and have expanded our understanding of their pathogenesis. Genetic risk factors associated with diseases and traits are essentially causative. However, elucidating the biological mechanism of disease from genetic factors is challenging. In fact, it is difficult to identify the causal variant among multiple variants located on the same haplotype or linkage disequilibrium block, and thus the responsible genes remain elusive. Recently, multiple studies have revealed that the majority of risk variants are located in non-coding regions of the genome and are most likely to regulate gene expression, for example as quantitative trait loci. Enhancers, promoters and long non-coding RNAs appear to be the main targets of the risk variants. In this review, we discuss functional genetics approaches to address these puzzles.
Affiliation(s)
- Akari Suzuki
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
- Matteo Maurizio Guerrini
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
- Kazuhiko Yamamoto
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
|
50
|
Zhu I, Song W, Ovcharenko I, Landsman D. A model of active transcription hubs that unifies the roles of active promoters and enhancers. Nucleic Acids Res 2021; 49:4493-4505. [PMID: 33872375 PMCID: PMC8096258 DOI: 10.1093/nar/gkab235] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 11/22/2020] [Revised: 01/27/2021] [Accepted: 03/22/2021] [Indexed: 12/31/2022] Open
Abstract
An essential question of gene regulation is how large numbers of enhancers and promoters organize into gene regulatory loops. Using transcription-factor binding enrichment as an indicator of enhancer strength, we identified a portion of H3K27ac peaks as potentially strong enhancers and found a universal pattern of promoter and enhancer distribution: at actively transcribed regions ∼200-300 kb in length, the numbers of active promoters and enhancers are inversely related. Enhancer clusters are associated with isolated active promoters, regardless of the gene's cell-type specificity. As the number of nearby active promoters increases, the number of enhancers decreases. At regions where multiple active genes are closely located, there are few distant enhancers. With Hi-C analysis, we demonstrate that the interactions among the regulatory elements (active promoters and enhancers) occur predominantly in clusters and in a multiway fashion among linearly close elements, and that the distance between adjacent elements shows a preference for ∼30 kb. We propose a simple rule of spatial organization of active promoters and enhancers: gene transcription and regulation mainly occur at local active transcription hubs, to which multiple linearly close enhancers and/or active promoters contribute dynamically. The hub model can be represented with a flower-shaped structure and implies an enhancer-like role of active promoters.
Affiliation(s)
- Iris Zhu
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, USA
- Wei Song
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, USA
- Ivan Ovcharenko
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, USA
- David Landsman
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, USA
|