1
|
Asim MN, Ibrahim MA, Zaib A, Dengel A. DNA sequence analysis landscape: a comprehensive review of DNA sequence analysis task types, databases, datasets, word embedding methods, and language models. Front Med (Lausanne) 2025; 12:1503229. [PMID: 40265190 PMCID: PMC12011883 DOI: 10.3389/fmed.2025.1503229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2024] [Accepted: 03/10/2025] [Indexed: 04/24/2025] Open
Abstract
Deoxyribonucleic acid (DNA) serves as the fundamental genetic blueprint that governs the development, functioning, growth, and reproduction of all living organisms. DNA can be altered through germline and somatic mutations. Germline mutations underlie hereditary conditions, while somatic mutations can be induced by various factors, including environmental influences, chemicals, lifestyle choices, and errors in DNA replication and repair mechanisms, which can lead to cancer. DNA sequence analysis plays a pivotal role in uncovering the intricate information embedded within an organism's genetic blueprint and in understanding the factors that can modify it. This analysis helps in the early detection of genetic diseases and the design of targeted therapies. DNA sequence analysis through traditional wet-lab experimental methods is costly, time-consuming, and prone to errors. To accelerate large-scale DNA sequence analysis, researchers are developing AI applications that complement wet-lab experimental methods. These AI approaches can help generate hypotheses, prioritize experiments, and interpret results by identifying patterns in large genomic datasets. Effective integration of AI methods with experimental validation requires scientists to understand both fields. Considering the need for a comprehensive review that bridges the gap between both fields, the contributions of this paper are manifold. It presents a diverse range of DNA sequence analysis tasks and AI methodologies. It equips AI researchers with essential biological knowledge of 44 distinct DNA sequence analysis tasks and aligns these tasks with 3 distinct AI paradigms, namely classification, regression, and clustering. It streamlines the integration of AI into DNA sequence analysis tasks by consolidating information on 36 diverse biological databases that can be used to develop benchmark datasets for 44 different DNA sequence analysis tasks. To support performance comparisons between new and existing AI predictors, it provides insights into 140 benchmark datasets related to 44 distinct DNA sequence analysis tasks. It presents applications of word embeddings and language models across 44 distinct DNA sequence analysis tasks. It streamlines the development of new predictors by providing a comprehensive survey of the performance values of predictive pipelines based on 39 word embeddings and 67 language models, as well as the top-performing traditional sequence encoding-based predictors and their performance across 44 DNA sequence analysis tasks.
Affiliation(s)
- Muhammad Nabeel Asim
- German Research Center for Artificial Intelligence GmbH, Kaiserslautern, Germany
- Intelligentx GmbH (intelligentx.com), Kaiserslautern, Germany
| | - Muhammad Ali Ibrahim
- German Research Center for Artificial Intelligence GmbH, Kaiserslautern, Germany
- Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern, Germany
| | - Arooj Zaib
- Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern, Germany
| | - Andreas Dengel
- German Research Center for Artificial Intelligence GmbH, Kaiserslautern, Germany
- Intelligentx GmbH (intelligentx.com), Kaiserslautern, Germany
- Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern, Germany
|
2
|
Nesta A, Veiga DFT, Banchereau J, Anczukow O, Beck CR. Alternative splicing of transposable elements in human breast cancer. Mob DNA 2025; 16:6. [PMID: 39987084 PMCID: PMC11846448 DOI: 10.1186/s13100-025-00341-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2024] [Accepted: 01/09/2025] [Indexed: 02/24/2025] Open
Abstract
Transposable elements (TEs) drive genome evolution and can affect gene expression through diverse mechanisms. In breast cancer, disrupted regulation of TE sequences may facilitate tumor-specific transcriptomic alterations. We examine 142,514 full-length isoforms derived from long-read RNA sequencing (LR-seq) of 30 breast samples to investigate the effects of TEs on the breast cancer transcriptome. Approximately half of these isoforms contain TE sequences, and these contribute to half of the novel annotated splice junctions. We quantify splicing of these LR-seq derived isoforms in 1,135 breast tumors from The Cancer Genome Atlas (TCGA) and 1,329 healthy tissue samples from the Genotype-Tissue Expression (GTEx) project, and find 300 TE-overlapping tumor-specific splicing events. Some splicing events are enriched in specific breast cancer subtypes (for example, a TE-driven transcription start site upstream of ERBB2 in HER2+ tumors), and several TE-mediated splicing events are associated with patient survival and poor prognosis. The full-length sequences we capture with LR-seq reveal thousands of isoforms with signatures of RNA editing, including a novel isoform belonging to RHOA, a gene previously implicated in tumor progression. We utilize our full-length isoforms to discover polymorphic TE insertions that alter splicing and validate one of these events in breast cancer cell lines. Together, our results demonstrate the widespread effects of dysregulated TEs on breast cancer transcriptomes and highlight the advantages of long-read isoform sequencing for understanding TE biology. TE-derived isoforms may alter the expression of genes important in cancer and can potentially be used as novel, disease-specific therapeutic targets or biomarkers. One sentence summary: Transposable elements generate alternative isoforms and alter post-transcriptional regulation in human breast cancer.
Affiliation(s)
- Alex Nesta
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA.
- Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, 06030, USA.
| | - Diogo F T Veiga
- Department of Translational Medicine, School of Medical Sciences, University of Campinas (UNICAMP), Campinas, SP, 13083, Brazil
| | - Jacques Banchereau
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
- Immunoledge LLC, Montclair, NJ, 07042, USA
| | - Olga Anczukow
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
- Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, 06030, USA
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, 06269, USA
| | - Christine R Beck
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA.
- Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, 06030, USA.
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, 06269, USA.
|
3
|
Xu J, Wan J, Huang HY, Chen Y, Huang Y, Huang J, Zhang Z, Su C, Zhou Y, Lin X, Lin YCD, Huang HD. miRStart 2.0: enhancing miRNA regulatory insights through deep learning-based TSS identification. Nucleic Acids Res 2025; 53:D138-D146. [PMID: 39578697 PMCID: PMC11701676 DOI: 10.1093/nar/gkae1086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2024] [Revised: 10/17/2024] [Accepted: 11/14/2024] [Indexed: 11/24/2024] Open
Abstract
MicroRNAs (miRNAs) are small non-coding RNAs that regulate gene expression by binding to the 3'-untranslated regions of target mRNAs, influencing various biological processes at the post-transcriptional level. Identifying miRNA transcription start sites (TSSs) and transcription factors' (TFs) regulatory roles is crucial for elucidating miRNA function and transcriptional regulation. miRStart 2.0 integrates over 4500 high-throughput datasets across five data types, utilizing a multi-modal approach to annotate 28 828 putative TSSs for 1745 human and 1181 mouse miRNAs, supported by sequencing-based signals. Over 6 million tissue-specific TF-miRNA interactions, integrated from ChIP-seq data, are supplemented by DNase hypersensitivity and UCSC conservation data, with network visualizations. Our deep learning-based model outperforms existing tools in miRNA TSS prediction, achieving the most overlaps with both cell-specific and non-cell-specific validated TSSs. The user-friendly web interface and visualization tools make miRStart 2.0 easily accessible to researchers, enabling efficient identification of miRNA upstream regulatory elements in relation to their TSSs. This updated database provides systems-level insights into gene regulation and disease mechanisms, offering a valuable resource for translational research, facilitating the discovery of novel therapeutic targets and precision medicine strategies. miRStart 2.0 is now accessible at https://awi.cuhk.edu.cn/~miRStart2.
Affiliation(s)
- Jiatong Xu
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
| | - Jingting Wan
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
| | - Hsi-Yuan Huang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
- Guangdong Provincial Key Laboratory of Digital Biology and Drug Development, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
| | - Yigang Chen
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
| | - Yixian Huang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
| | - Junyang Huang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
| | - Ziyue Zhang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
| | - Chang Su
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
| | - Yuming Zhou
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
| | - Xingqiao Lin
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
| | - Yang-Chi-Dung Lin
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
- Guangdong Provincial Key Laboratory of Digital Biology and Drug Development, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
| | - Hsien-Da Huang
- School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
- Guangdong Provincial Key Laboratory of Digital Biology and Drug Development, The Chinese University of Hong Kong, Shenzhen, 2001 Longxiang Boulevard, Longgang District, Shenzhen, Guangdong 518172, P.R. China
- Department of Endocrinology, Key Laboratory of Endocrinology of National Ministry of Health, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences & Peking Union Medical College, No.9 Dongdansantiao Street, Dongcheng District, Beijing 100730, P.R. China
|
4
|
Nesta A, Veiga DFT, Banchereau J, Anczukow O, Beck CR. Alternative splicing of transposable elements in human breast cancer. bioRxiv 2024:2024.09.26.615242. [PMID: 39386569 PMCID: PMC11463404 DOI: 10.1101/2024.09.26.615242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/12/2024]
Abstract
Transposable elements (TEs) drive genome evolution and can affect gene expression through diverse mechanisms. In breast cancer, disrupted regulation of TE sequences may facilitate tumor-specific transcriptomic alterations. We examine 142,514 full-length isoforms derived from long-read RNA sequencing (LR-seq) of 30 breast samples to investigate the effects of TEs on the breast cancer transcriptome. Approximately half of these isoforms contain TE sequences, and these contribute to half of the novel annotated splice junctions. We quantify splicing of these LR-seq derived isoforms in 1,135 breast tumors from The Cancer Genome Atlas (TCGA) and 1,329 healthy tissue samples from the Genotype-Tissue Expression (GTEx) project, and find 300 TE-overlapping tumor-specific splicing events. Some splicing events are enriched in specific breast cancer subtypes (for example, a TE-driven transcription start site upstream of ERBB2 in HER2+ tumors), and several TE-mediated splicing events are associated with patient survival and poor prognosis. The full-length sequences we capture with LR-seq reveal thousands of isoforms with signatures of RNA editing, including a novel isoform belonging to RHOA, a gene previously implicated in tumor progression. We utilize our full-length isoforms to discover polymorphic TE insertions that alter splicing and validate one of these events in breast cancer cell lines. Together, our results demonstrate the widespread effects of dysregulated TEs on breast cancer transcriptomes and highlight the advantages of long-read isoform sequencing for understanding TE biology. TE-derived isoforms may alter the expression of genes important in cancer and can potentially be used as novel, disease-specific therapeutic targets or biomarkers.
Affiliation(s)
- Alex Nesta
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032 USA
- Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT 06030, USA
| | - Diogo F. T. Veiga
- Department of Translational Medicine, School of Medical Sciences, University of Campinas (UNICAMP), Campinas, SP 13083, Brazil
| | - Jacques Banchereau
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032 USA
- Immunoledge LLC, Montclair, NJ, 07042, USA
| | - Olga Anczukow
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032 USA
- Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT 06030, USA
- Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
| | - Christine R. Beck
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032 USA
- Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT 06030, USA
- Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
|
5
|
Mazumdar V, Joshi K, Nandi BR, Namani S, Gupta VK, Radhakrishnan G. Host F-Box Protein 22 Enhances the Uptake of Brucella by Macrophages and Drives a Sustained Release of Proinflammatory Cytokines through Degradation of the Anti-Inflammatory Effector Proteins of Brucella. Infect Immun 2022; 90:e0006022. [PMID: 35420446 PMCID: PMC9119127 DOI: 10.1128/iai.00060-22] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/07/2022] [Accepted: 03/14/2022] [Indexed: 11/20/2022] Open
Abstract
Brucella species are intracellular bacterial pathogens, causing the worldwide zoonotic disease brucellosis. Brucella invades professional and nonprofessional phagocytic cells, followed by resisting intracellular killing and establishing a replication permissive niche. Brucella also modulates the innate and adaptive immune responses of the host for its chronic persistence. The complex intracellular cycle of Brucella depends in a major way on multiple host factors, but limited information is available on host and bacterial proteins that play an essential role in the invasion, intracellular replication, and modulation of host immune responses. By employing a small interfering RNA (siRNA) screening, we identified a role for the host protein FBXO22 in the Brucella-macrophage interaction. FBXO22 is the key element in the SCF E3 ubiquitination complex, where it determines the substrate specificity for ubiquitination and degradation of various host proteins. Downregulation of FBXO22 by siRNA or the CRISPR-Cas9 system resulted in diminished uptake of Brucella into macrophages, which was dependent on NF-κB-mediated regulation of phagocytic receptors. FBXO22 expression was upregulated in Brucella-infected macrophages, which resulted in induction of phagocytic receptors and enhanced production of proinflammatory cytokines through NF-κB. Furthermore, we found that FBXO22 recruits the effector proteins of Brucella, including the anti-inflammatory proteins TcpB and OMP25, for degradation through the SCF complex. We did not observe any role for another F-box-containing protein of the SCF complex, β-TrCP, in the Brucella-macrophage interaction. Our findings unravel novel functions of FBXO22 in host-pathogen interaction and its contribution to pathogenesis of infectious diseases.
Affiliation(s)
- Varadendra Mazumdar
- Laboratory of Immunology and Microbial Pathogenesis, National Institute of Animal Biotechnology (NIAB), Hyderabad, Telangana, India
- Regional Centre for Biotechnology (RCB), Faridabad, Haryana, India
| | - Kiranmai Joshi
- Laboratory of Immunology and Microbial Pathogenesis, National Institute of Animal Biotechnology (NIAB), Hyderabad, Telangana, India
- Regional Centre for Biotechnology (RCB), Faridabad, Haryana, India
| | - Binita Roy Nandi
- Laboratory of Immunology and Microbial Pathogenesis, National Institute of Animal Biotechnology (NIAB), Hyderabad, Telangana, India
- Regional Centre for Biotechnology (RCB), Faridabad, Haryana, India
| | - Swapna Namani
- Laboratory of Immunology and Microbial Pathogenesis, National Institute of Animal Biotechnology (NIAB), Hyderabad, Telangana, India
| | - Vivek Kumar Gupta
- ICAR-Indian Veterinary Research Institute (ICAR-IVRI), Izatnagar, Bareilly, India
| | - Girish Radhakrishnan
- Laboratory of Immunology and Microbial Pathogenesis, National Institute of Animal Biotechnology (NIAB), Hyderabad, Telangana, India
|
6
|
Liang Y, Zhang S, Qiao H, Yao Y. iPromoter-ET: Identifying promoters and their strength by extremely randomized trees-based feature selection. Anal Biochem 2021; 630:114335. [PMID: 34389299 DOI: 10.1016/j.ab.2021.114335] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/17/2021] [Revised: 07/24/2021] [Accepted: 08/09/2021] [Indexed: 10/20/2022]
Abstract
A promoter is a region of DNA that determines the transcription of a particular gene. RNA polymerase contains several σ factors, which identify the promoter and facilitate the binding of RNA polymerase to it. Owing to the importance of promoters in genome research, and given the avalanche of DNA sequences discovered in the post-genomic age, there is an urgent need for computational tools that effectively identify promoters and their strength. In this paper, we develop a model named iPromoter-ET that uses k-mer nucleotide composition, binary encoding, and dinucleotide property matrix-based distance transformation for feature extraction, and extremely randomized trees (extra trees) for feature selection. Its 1st layer identifies whether a DNA sequence is a promoter or not, while its 2nd layer classifies promoter samples as strong or weak promoters. A support vector machine and five-fold cross-validation are used to perform identification and assess performance, respectively. The results indicate that our model remarkably outperforms the existing models in both the 1st and 2nd layers in terms of accuracy and stability. We anticipate that our proposed model will become a very effective intelligent tool or, at the least, a complementary tool to the existing methods of identifying promoters and their strength. Moreover, the datasets and code for iPromoter-ET are freely available at https://github.com/shengli0201/iPromoter-ET.
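As a rough illustration of two of the feature types named in the abstract above (k-mer nucleotide composition and binary encoding), a minimal Python sketch follows. The function names, the one-hot layout, and the example sequence are assumptions made for illustration; they are not taken from the iPromoter-ET repository, and the dinucleotide-property distance transformation, extra-trees selection, and SVM steps are omitted.

```python
from itertools import product


def kmer_composition(seq, k=2):
    """Normalized k-mer frequencies in fixed lexicographic order over A, C, G, T."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # windows with ambiguous bases (e.g. N) are skipped
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def binary_encoding(seq):
    """One-hot encode each base as 4 bits; unknown bases become all zeros (assumed convention)."""
    table = {"A": (1, 0, 0, 0), "C": (0, 1, 0, 0), "G": (0, 0, 1, 0), "T": (0, 0, 0, 1)}
    encoded = []
    for base in seq.upper():
        encoded.extend(table.get(base, (0, 0, 0, 0)))
    return encoded


# Hypothetical example: concatenate both feature groups for one short sequence.
example = "TATAAT"
features = kmer_composition(example, k=2) + binary_encoding(example)
```

For a sequence of length n, this yields 4^k composition values plus 4n binary values; a feature-selection step such as extra trees would then rank these columns before classification.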
Affiliation(s)
- Yunyun Liang
- School of Science, Xi'an Polytechnic University, Xi'an, 710048, PR China.
| | - Shengli Zhang
- School of Mathematics and Statistics, Xidian University, Xi'an, 710071, PR China
| | - Huijuan Qiao
- School of Mathematics and Statistics, Xidian University, Xi'an, 710071, PR China
| | - Yingying Yao
- School of Mathematics and Statistics, Xidian University, Xi'an, 710071, PR China
|
7
|
Mohamed Sa’dom SAF, Raikundalia S, Shamsuddin S, See Too WC, Few LL. DNA Methylation of Human Choline Kinase Alpha Promoter-Associated CpG Islands in MCF-7 Cells. Genes (Basel) 2021; 12:genes12060853. [PMID: 34205960 PMCID: PMC8229565 DOI: 10.3390/genes12060853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2021] [Revised: 05/28/2021] [Accepted: 05/30/2021] [Indexed: 11/16/2022] Open
Abstract
Choline kinase (CK) is the enzyme catalyzing the first reaction in the CDP-choline pathway for the biosynthesis of phosphatidylcholine. Higher expression of the α isozyme of CK has been implicated in carcinogenesis, and inhibition or downregulation of CKα (CHKA) is a promising anticancer approach. This study aimed to investigate the regulation of CKα expression by DNA methylation of the CpG islands found on the promoter of this gene in MCF-7 cells. Four CpG islands have been predicted in the 2000 bp promoter region of the ckα (chka) gene. Six CpG island deletion mutants were constructed using the PCR site-directed mutagenesis method and cloned into pGL4.10 vectors for promoter activity assays. Deletion of the CpG4C region located between -225 and -56 significantly increased the promoter activity by 4-fold, indicating the presence of an important repressive transcription factor binding site. The promoter activity of the methylated full-length promoter was significantly lower, by 16-fold, than that of the methylated CpG4C deletion mutant. The results show that DNA methylation of CpG4C promotes the binding of the transcription factor that suppresses the promoter activity. Electrophoretic mobility shift assay analysis showed that cytosine methylation at the MZF1 binding site in CpG4C increased the binding of putative MZF1 in nuclear extract. In conclusion, the results suggest that DNA methylation decreased the promoter activity by promoting the binding of the putative MZF1 transcription factor at the CpG4C region of the ckα gene promoter.
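The abstract above does not say which tool predicted the four CpG islands. As a hedged sketch, the two statistics commonly used for such predictions (GC content and the observed/expected CpG ratio, following the widely cited Gardiner-Garden and Frommer criteria of GC content above 50% and a ratio above 0.6 over at least 200 bp) can be computed as below. The function names and default thresholds are illustrative assumptions, not the study's method.

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for one DNA window."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c + g) / n if n else 0.0
    expected = c * g / n if n else 0.0
    ratio = cpg / expected if expected else 0.0
    return gc_content, ratio


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Apply the assumed thresholds to a single window."""
    gc_content, ratio = cpg_stats(seq)
    return len(seq) >= min_len and gc_content > min_gc and ratio > min_ratio
```

In practice a predictor slides such a window along the promoter region and merges adjacent windows that pass; that scanning step is left out here.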
|
8
|
Chen YL, Guo DH, Li QZ. An energy model for recognizing the prokaryotic promoters based on molecular structure. Genomics 2019; 112:2072-2079. [PMID: 31809797 DOI: 10.1016/j.ygeno.2019.12.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/27/2019] [Revised: 11/06/2019] [Accepted: 12/01/2019] [Indexed: 11/19/2022]
Abstract
A promoter is an important functional element of DNA sequences that is responsible for initiating gene transcription. Recognizing promoters helps in understanding the related biological phenomena. Based on the concept that a promoter is mainly determined by its sequence and structure, a novel statistical physics model for predicting promoters in Escherichia coli K-12 is proposed. The total energy of the DNA local structure of sequence segments, which is the sole prediction parameter, is calculated for three benchmark promoter sequence datasets using principles from statistical physics and information theory. Improved results are obtained. A web server, PhysMPrePro, for predicting promoters is available at http://202.207.14.87:8032/bioinformation/PhysMPrePro/index.asp, so that other scientists can easily obtain their desired results.
Affiliation(s)
- Ying-Li Chen
- Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China; The State key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Inner Mongolia University, Hohhot 010070, China.
| | - Dong-Hua Guo
- Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China
| | - Qian-Zhong Li
- Laboratory of Theoretical Biophysics, School of Physical Science and Technology, Inner Mongolia University, Hohhot 010021, China; The State key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Inner Mongolia University, Hohhot 010070, China.
|
9
|
Abstract
Designing expression cassettes with desired properties remains the most important consideration of gene engineering technology. One of the challenges for predictive gene expression is the modeling of synthetic gene switches that regulate one or more target genes and respond directly to specific chemical, environmental, and physiological stimuli. Assessment of natural promoters, high-throughput sequencing, and the modern biotech inventory have aided in deciphering the structure of cis elements and molding the native cis elements into desired synthetic promoters. Synthetic promoters that are molded by rearrangement of cis motifs can greatly benefit plant biotechnology applications. This review gives a glimpse of manual in vivo gene regulation through synthetic promoters. It summarizes the integrative design strategy of synthetic promoters and enumerates five approaches for constructing synthetic promoters. Insights to date into the pattern of cis regulatory elements in the pursuit of desirable "gene switches" have also been reevaluated. Joint strategies of bioinformatics modeling and randomized biochemical synthesis are addressed in an effort to construct synthetic promoters for intricate gene regulation.
|
10
|
Yoshino A, Polouliakh N, Meguro A, Takeuchi M, Kawagoe T, Mizuki N. Chum salmon egg extracts induce upregulation of collagen type I and exert antioxidative effects on human dermal fibroblast cultures. Clin Interv Aging 2016; 11:1159-68. [PMID: 27621603 PMCID: PMC5010078 DOI: 10.2147/cia.s102092] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 01/08/2023] Open
Abstract
Components of fish roe possess antioxidant and antiaging activities, making them potentially very beneficial natural resources. Here, we investigated chum salmon eggs (CSEs) as a source of active ingredients, including vitamins, unsaturated fatty acids, and proteins. We incubated human dermal fibroblast cultures for 48 hours with high and low concentrations of CSE extracts and analyzed changes in gene expression. Cells treated with CSE extract showed concentration-dependent upregulation of collagen type I genes and of multiple antioxidative genes, including OXR1, TXNRD1, and PRDX family genes. We further conducted in silico phylogenetic footprinting analysis of promoter regions. These results suggested that transcription factors such as acute myeloid leukemia-1a and cyclic adenosine monophosphate response element-binding protein may be involved in the observed upregulation of antioxidative genes. Our results support the idea that CSEs are strong candidate sources of antioxidant materials and cosmeceutically effective ingredients.
Affiliation(s)
- Atsushi Yoshino
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Yokohama, Kanagawa
| | - Natalia Polouliakh
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Yokohama, Kanagawa; Sony Computer Science Laboratories Inc., Fundamental Research Laboratories; Systems Biology Institute, Tokyo, Japan
| | - Akira Meguro
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Yokohama, Kanagawa
| | - Masaki Takeuchi
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Yokohama, Kanagawa; Inflammatory Disease Section, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Tatsukata Kawagoe
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Yokohama, Kanagawa
| | - Nobuhisa Mizuki
- Department of Ophthalmology and Visual Science, Yokohama City University Graduate School of Medicine, Yokohama, Kanagawa
| |
Collapse
|
11
|
Identification of a Novel C-Terminal Truncated WT1 Isoform with Antagonistic Effects against Major WT1 Isoforms. PLoS One 2015; 10:e0130578. [PMID: 26090994 PMCID: PMC4474557 DOI: 10.1371/journal.pone.0130578] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 08/16/2014] [Accepted: 05/22/2015] [Indexed: 01/10/2023] Open
Abstract
The Wilms’ tumor gene WT1 consists of 10 exons and encodes a zinc finger transcription factor. There are four major WT1 isoforms resulting from alternative splicing at two sites, exon 5 (17AA) and exon 9 (KTS). All major WT1 isoforms are overexpressed in leukemia and solid tumors and play oncogenic roles such as inhibition of apoptosis, and promotion of cell proliferation, migration and invasion. In the present study, a novel alternatively spliced WT1 isoform that had an extended exon 4 (designated as exon 4a) with an additional 153 bp (designated as 4a sequence) at the 3’ end was identified and designated as an Ex4a(+)WT1 isoform. The insertion of exon 4a resulted in the introduction of premature translational stop codons in the reading frame in exon 4a and production of C-terminal truncated WT1 proteins lacking zinc finger DNA-binding domain. Overexpression of the truncated Ex4a(+)WT1 isoform inhibited the major WT1-mediated transcriptional activation of anti-apoptotic Bcl-xL gene promoter and induced mitochondrial damage and apoptosis. Conversely, suppression of the Ex4a(+)WT1 isoform by Ex4a-specific siRNA attenuated apoptosis. These results indicated that the Ex4a(+)WT1 isoform exerted dominant negative effects on anti-apoptotic function of major WT1 isoforms. Ex4a(+)WT1 isoform was endogenously expressed as a minor isoform in myeloid leukemia and solid tumor cells and increased regardless of decrease in major WT1 isoforms during apoptosis, suggesting the dominant negative effects on anti-apoptotic function of major WT1 isoforms. These results indicated that Ex4a(+)WT1 isoform had an important physiological function that regulated oncogenic function of major WT1 isoforms.
|
12
|
Yella VR, Bansal M. In silico Identification of Eukaryotic Promoters. SYSTEMS AND SYNTHETIC BIOLOGY 2015. [DOI: 10.1007/978-94-017-9514-2_4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
|
13
|
Lawhorn IEB, Ferreira JP, Wang CL. Evaluation of sgRNA target sites for CRISPR-mediated repression of TP53. PLoS One 2014; 9:e113232. [PMID: 25398078 PMCID: PMC4232525 DOI: 10.1371/journal.pone.0113232] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 09/08/2014] [Accepted: 10/21/2014] [Indexed: 12/21/2022] Open
Abstract
The CRISPR (clustered regularly interspaced short palindromic repeats) platform has been developed as a general method to direct proteins of interest to gene targets. While the native CRISPR system delivers a nuclease that cleaves and potentially mutates target genes, researchers have recently employed catalytically inactive CRISPR-associated 9 nuclease (dCas9) in order to target and repress genes without DNA cleavage or mutagenesis. With the intent of improving repression efficiency in mammalian cells, researchers have also fused dCas9 with a KRAB repressor domain. Here, we evaluated different genomic sgRNA targeting sites for repression of TP53. The sites spanned a 200-kb distance, which included the promoter, transcript sequence, and regions flanking the endogenous human TP53 gene. We showed that repression up to 86% can be achieved with dCas9 alone (i.e., without use of the KRAB domain) by targeting the complex to sites near the TP53 transcriptional start site. This work demonstrates that efficient transcriptional repression of endogenous human genes can be achieved by the targeted delivery of dCas9. Yet, the efficiency of repression strongly depends on the choice of the sgRNA target site.
Affiliation(s)
- Ingrid E. B. Lawhorn
- Department of Chemical Engineering, Stanford University, Stanford, California, United States of America
- Joshua P. Ferreira
- Department of Chemical Engineering, Stanford University, Stanford, California, United States of America
- Clifford L. Wang
- Department of Chemical Engineering, Stanford University, Stanford, California, United States of America
|
14
|
Dreos R, Ambrosini G, Périer RC, Bucher P. The Eukaryotic Promoter Database: expansion of EPDnew and new promoter analysis tools. Nucleic Acids Res 2014; 43:D92-6. [PMID: 25378343 PMCID: PMC4383928 DOI: 10.1093/nar/gku1111] [Citation(s) in RCA: 229] [Impact Index Per Article: 20.8] [Indexed: 11/17/2022] Open
Abstract
We present an update of EPDNew (http://epd.vital-it.ch), a recently introduced new part of the Eukaryotic Promoter Database (EPD) which has been described in more detail in a previous NAR Database Issue. EPD is an old database of experimentally characterized eukaryotic POL II promoters, which are conceptually defined as transcription initiation sites or regions. EPDnew is a collection of automatically compiled, organism-specific promoter lists complementing the old corpus of manually compiled promoter entries of EPD. This new part is exclusively derived from next generation sequencing data from high-throughput promoter mapping experiments. We report on the recent growth of EPDnew, its extension to additional model organisms and its improved integration with other bioinformatics resources developed by our group, in particular the Signal Search Analysis and ChIP-Seq web servers.
Affiliation(s)
- René Dreos
- Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne, Switzerland
- Giovanna Ambrosini
- Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne, Switzerland; Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences, Swiss Federal Institute of Technology (EPFL), CH-1015 Lausanne, Switzerland
- Rouayda Cavin Périer
- Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences, Swiss Federal Institute of Technology (EPFL), CH-1015 Lausanne, Switzerland
- Philipp Bucher
- Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne, Switzerland; Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences, Swiss Federal Institute of Technology (EPFL), CH-1015 Lausanne, Switzerland
|
15
|
Forsdahl S, Kiselev Y, Hogseth R, Mjelle JE, Mikkola I. Pax6 regulates the expression of Dkk3 in murine and human cell lines, and altered responses to Wnt signaling are shown in FlpIn-3T3 cells stably expressing either the Pax6 or the Pax6(5a) isoform. PLoS One 2014; 9:e102559. [PMID: 25029272 PMCID: PMC4100929 DOI: 10.1371/journal.pone.0102559] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 02/17/2014] [Accepted: 06/19/2014] [Indexed: 02/07/2023] Open
Abstract
Pax6 is a transcription factor important for early embryo development. It is expressed in several cancer cell lines and tumors. In glioblastoma, PAX6 has been shown to function as a tumor suppressor. Dickkopf 3 (Dkk3) is well established as a tumor suppressor in several tumor types, but not much is known about the regulation of its expression. We have previously found that Pax6 and Pax6(5a) increase the expression of the Dkk3 gene in two stably transfected mouse fibroblast cell lines. In this study, we examined the molecular mechanism behind this regulation. Western blot and reverse transcriptase quantitative PCR (RT-qPCR) confirmed a higher level of Dkk3 expression in both Pax6 and Pax6(5a) expressing cell lines compared to the control cell line. Using bioinformatics and electrophoretic mobility shift assay (EMSA), we mapped a functional Pax6 binding site in the mouse Dkk3 promoter. The minimal Dkk3 promoter fragment required for transcriptional activation by Pax6 and Pax6(5a) was a 200 bp region just upstream of the transcriptional start site. Mutation of the evolutionarily conserved binding site in this region abrogated transcriptional activation and binding of Pax6/Pax6(5a) to the mouse Dkk3 promoter. Since the identified Pax6 binding site in this promoter is conserved, RT-qPCR and Western blot were used to look for regulation of Dkk3/REIC expression in human cell lines. Six of eight cell lines tested showed changes in Dkk3/REIC expression after PAX6 siRNA knockdown. Interestingly, we observed that the Pax6/Pax6(5a) expressing mouse fibroblast cell lines were less responsive to canonical Wnt pathway stimulation than the control cell line when TOP/FOP activity and the levels of active β-catenin and GSK3-β Ser9 phosphorylation were measured after LiCl stimulation.
Affiliation(s)
- Siri Forsdahl
- Research Group of Pharmacology, Department of Pharmacy, UiT – The Arctic University of Norway, Tromsoe, Norway
- Yury Kiselev
- Research Group of Pharmacology, Department of Pharmacy, UiT – The Arctic University of Norway, Tromsoe, Norway
- Norwegian Translational Cancer Research Center, Department of Medical Biology, UiT – The Arctic University of Norway, Tromsoe, Norway
- Rune Hogseth
- Research Group of Pharmacology, Department of Pharmacy, UiT – The Arctic University of Norway, Tromsoe, Norway
- Janne E. Mjelle
- Research Group of Pharmacology, Department of Pharmacy, UiT – The Arctic University of Norway, Tromsoe, Norway
- Ingvild Mikkola
- Research Group of Pharmacology, Department of Pharmacy, UiT – The Arctic University of Norway, Tromsoe, Norway
|
16
|
Bae BI, Tietjen I, Atabay KD, Evrony GD, Johnson MB, Asare E, Wang PP, Murayama AY, Im K, Lisgo SN, Overman L, Šestan N, Chang BS, Barkovich AJ, Grant PE, Topçu M, Politsky J, Okano H, Piao X, Walsh CA. Evolutionarily dynamic alternative splicing of GPR56 regulates regional cerebral cortical patterning. Science 2014; 343:764-8. [PMID: 24531968 PMCID: PMC4480613 DOI: 10.1126/science.1244392] [Citation(s) in RCA: 159] [Impact Index Per Article: 14.5] [Indexed: 11/02/2022]
Abstract
The human neocortex has numerous specialized functional areas whose formation is poorly understood. Here, we describe a 15-base pair deletion mutation in a regulatory element of GPR56 that selectively disrupts human cortex surrounding the Sylvian fissure bilaterally including "Broca's area," the primary language area, by disrupting regional GPR56 expression and blocking RFX transcription factor binding. GPR56 encodes a heterotrimeric guanine nucleotide-binding protein (G protein)-coupled receptor required for normal cortical development and is expressed in cortical progenitor cells. GPR56 expression levels regulate progenitor proliferation. GPR56 splice forms are highly variable between mice and humans, and the regulatory element of gyrencephalic mammals directs restricted lateral cortical expression. Our data reveal a mechanism by which control of GPR56 expression pattern by multiple alternative promoters can influence stem cell proliferation, gyral patterning, and, potentially, neocortex evolution.
Affiliation(s)
- Byoung-Il Bae
- Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children’s Hospital, Broad Institute of MIT and Harvard, and Departments of Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Ian Tietjen
- Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children’s Hospital, Broad Institute of MIT and Harvard, and Departments of Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Kutay D. Atabay
- Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children’s Hospital, Broad Institute of MIT and Harvard, and Departments of Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Gilad D. Evrony
- Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children’s Hospital, Broad Institute of MIT and Harvard, and Departments of Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Matthew B. Johnson
- Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children’s Hospital, Broad Institute of MIT and Harvard, and Departments of Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Ebenezer Asare
- Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children’s Hospital, Broad Institute of MIT and Harvard, and Departments of Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Peter P. Wang
- Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children’s Hospital, Broad Institute of MIT and Harvard, and Departments of Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
- Ayako Y. Murayama
- Department of Physiology, Keio University School of Medicine, Tokyo 160-8582, Japan
- Kiho Im
- Division of Newborn Medicine, Center for Fetal Neonatal Neuroimaging and Developmental Science, Department of Radiology, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Steven N. Lisgo
- The MRC-Wellcome Trust Human Developmental Biology Resource (HDBR), Newcastle, Institute of Genetic Medicine, International Centre for Life, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK
- Lynne Overman
- The MRC-Wellcome Trust Human Developmental Biology Resource (HDBR), Newcastle, Institute of Genetic Medicine, International Centre for Life, Central Parkway, Newcastle upon Tyne NE1 3BZ, UK
- Nenad Šestan
- Department of Neurobiology and Kavli Institute of Neuroscience, Yale University School of Medicine, New Haven, CT 06520, USA
- Bernard S. Chang
- Beth Israel Deaconess Medical Center, Comprehensive Epilepsy Center, Boston, MA 02215, USA
- A. James Barkovich
- Departments of Radiology, Pediatrics, Neurology, and Neurological Surgery, University of California San Francisco, San Francisco, CA 94143, USA
- P. Ellen Grant
- Division of Newborn Medicine, Center for Fetal Neonatal Neuroimaging and Developmental Science, Department of Radiology, Boston Children’s Hospital, Harvard Medical School, Boston, MA 02115, USA
- Meral Topçu
- Department of Pediatrics, Hacettepe University Faculty of Medicine, Ankara, Turkey
- Jeffrey Politsky
- Department of Neurology, Medical College of Georgia, Augusta, GA 30912, USA
- Hideyuki Okano
- Department of Physiology, Keio University School of Medicine, Tokyo 160-8582, Japan
- Xianhua Piao
- Division of Newborn Medicine, Boston Children’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Christopher A. Walsh
- Division of Genetics and Genomics, Manton Center for Orphan Disease, and Howard Hughes Medical Institute, Boston Children’s Hospital, Broad Institute of MIT and Harvard, and Departments of Pediatrics and Neurology, Harvard Medical School, Boston, MA 02115, USA
|
17
|
Imai S, Kikuchi R, Tsuruya Y, Naoi S, Nishida S, Kusuhara H, Sugiyama Y. Epigenetic regulation of organic anion transporting polypeptide 1B3 in cancer cell lines. Pharm Res 2013; 30:2880-2890. [PMID: 23812637 DOI: 10.1007/s11095-013-1117-1] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 02/05/2013] [Accepted: 06/11/2013] [Indexed: 11/30/2022]
Abstract
PURPOSE The expression of a multispecific organic anion transporter, OATP1B3/SLCO1B3, is associated with clinical prognosis and survival of cancer cells. The aim of the present study was to investigate the involvement of epigenetic regulation in mRNA expression of a cancer-type variant of OATP1B3 (Ct-OATP1B3) in cancer cell lines. METHODS The membrane localization and transport functions of Ct-OATP1B3 were investigated in HEK293 cells transiently expressing Ct-OATP1B3. DNA methylation profiles around the transcriptional start site (TSS) of Ct-OATP1B3 in cancer cell lines were determined. The effects of a DNA methyltransferase inhibitor and siRNA knockdown of methyl-DNA binding proteins (MBDs) on the expression of Ct-OATP1B3 mRNA were investigated. RESULTS 5'-RACE identified the TSS of Ct-OATP1B3 in PK-8 cells. Ct-OATP1B3 was localized on the plasma membrane and showed transport activity for E217βG, fluvastatin, rifampicin, and Gd-EOB-DTPA. The CpG dinucleotides were hypomethylated in Ct-OATP1B3-positive cell lines (DLD-1, TFK-1, PK-8, and PK-45P) but were hypermethylated in Ct-OATP1B3-negative cell lines (HepG2 and Caco-2). Treatment with a DNA methyltransferase inhibitor and siRNA knockdown of MBD2 significantly increased the expression of Ct-OATP1B3 mRNA in HepG2 and Caco-2. CONCLUSIONS Ct-OATP1B3 is capable of transporting its substrates into cancer cells. Its mRNA expression is regulated by DNA methylation-dependent gene silencing involving MBD2.
Affiliation(s)
- Satoki Imai
- Laboratory of Molecular Pharmacokinetics Graduate School of Pharmaceutical Sciences, The University of Tokyo, Tokyo, Japan
|
18
|
Nakayama M, Takeda M, Asaumi Y, Shimokawa H. Identification and visualization of stimulus-specific transcriptional activity in cardiac hypertrophy in mice. Int J Cardiovasc Imaging 2013; 30:211-9. [PMID: 24162179 DOI: 10.1007/s10554-013-0314-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2013] [Accepted: 10/14/2013] [Indexed: 10/26/2022]
Abstract
Identification of specific signaling pathways for cardiac hypertrophy in living animals is challenging because no methods have been established to directly observe sequential molecular signaling events at the transcriptional level during pathogenesis. Here, our aim was to develop a useful method for monitoring the specific signaling pathways involved in the development of cardiac hypertrophy in vivo. Expression profiling of the left ventricle by microarray was performed in 2 different mouse models of cardiac hypertrophy: mechanical pressure overload by transverse aortic constriction (TAC) and neurohumoral activation by angiotensin II (Ang II) infusion. To annotate the information on transcription factor-binding sites, we collected promoter sequences and identified significantly frequent transcription factor-binding sites in the promoter regions of coregulated genes from both models (P < 0.05, binomial probability). Finally, we injected a firefly luciferase vector plasmid containing each transcription factor-binding site into the left ventricle in both models. In the TAC and Ang II models, we selected 379 and 12 upregulated genes, respectively. Twenty binding sites for transcription factors, including activator protein 4, were identified in the TAC model, and 4 sites for transcription factors, including ecotropic viral integration 1, were identified in the Ang II model. GATA-binding sites were noted in both models of cardiac hypertrophy. Using the firefly luciferase reporter, we demonstrated the enhancement of transcriptional activity during the progression of cardiac hypertrophy using in vivo imaging in live mice. These results suggested that our approach was useful for the identification of unique transcription factors that characterize different models of cardiac hypertrophy in vivo.
Affiliation(s)
- Masaharu Nakayama
- Department of Cardiovascular Medicine, Tohoku University Graduate School of Medicine, 1-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi, 980-8574, Japan,
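The enrichment criterion mentioned in the abstract above (P < 0.05, binomial probability) can be illustrated with a minimal sketch. The abstract does not specify the exact form of the test, so the upper-tail calculation, the function name `binomial_tail`, and the example counts and background rate are assumptions made here for illustration only.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Probability of observing at least k successes in n trials.

    Here a "trial" is one promoter and a "success" is a promoter that
    contains the binding site; p is the assumed background frequency.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical numbers: 5 of 12 co-regulated promoters carry a site
# that occurs in 10% of background promoters.
p_value = binomial_tail(k=5, n=12, p=0.10)
is_enriched = p_value < 0.05
```

A site would be called over-represented under this sketch when `is_enriched` is true; a real analysis would also need a correction for testing many sites at once.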
|
19
|
Identification and properties of a novel variant of NBC4 (Na(+)/HCO(3)- co-transporter 4) that is predominantly expressed in the choroid plexus. Biochem J 2013. [PMID: 23205667 DOI: 10.1042/bj20121515] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Indexed: 11/17/2022]
Abstract
Secretion of HCO(3)- at the apical side of the epithelial cells of the choroid plexus is an essential step in the formation of cerebrospinal fluid. Anion conductance with a high degree of HCO(3)- permeability has been observed and suggested to be the major pathway for HCO(3)- transport across the apical membrane. Recently, it was found that NBC (Na(+)/HCO(3)- co-transporter) 4, an electrogenic member of the NBC family, was expressed in the choroid plexus. We found that a novel variant of the NBC4 [NBC4g/Slc4a5 (solute carrier family 4, sodium bicarbonate co-transporter, member 5)] is almost exclusively expressed in the apical membrane of rat choroid plexus epithelium at exceptionally high levels. RNA interference-mediated knockdown allowed the functional demonstration that NBC4g is the major player in the HCO(3)- transport across the apical membrane of the choroid plexus epithelium. When combined with a recent observation that in choroid plexus epithelial cells electrogenic NBC operates with a stoichiometry of 3:1, the results of the present study suggest that NBC4g mediates the efflux of HCO(3)- and contributes to cerebrospinal fluid production.
|
20
|
Tong CW, Wang JL, Jiang MS, Hsu CH, Chang WT, Huang AM. Novel genes that mediate nuclear respiratory factor 1-regualted neurite outgrowth in neuroblastoma IMR-32 cells. Gene 2012; 515:62-70. [PMID: 23219993 DOI: 10.1016/j.gene.2012.11.026] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 08/24/2012] [Accepted: 11/27/2012] [Indexed: 11/17/2022]
Abstract
Nuclear respiratory factor-1 (NRF-1) is a transcription factor that functions in neurite outgrowth; however, the genes downstream from NRF-1 that mediate this function remain largely unknown. This study employs a genome-wide analysis approach to identify NRF-1-targeted genes in human neuroblastoma IMR-32 cells. A total of 916 human genes containing the putative NRF-1 response element (NRE) in their promoter regions were identified using a cutoff score determined by results from electrophoretic mobility shift assays (EMSA). Seventy-four NRF-1 target genes were listed according to the typical locations and high conservation of NREs. Fifteen genes, MAPRE3, NPDC1, RAB3IP, TRAPPC3, SMAD5, PIP5K1A, USP10, SPRY4, GTF2F2, NR1D1, SUV39H2, SKA3, RHOA, RAPGEF6, and SMAP1 were selected for biological confirmation. EMSA and chromatin immunoprecipitation confirmed that all NREs of these fifteen genes are critical for NRF-1 binding. Quantitative RT-PCR demonstrated that mRNA levels of 12 of these genes are regulated by NRF-1. Overexpression or knockdown of candidate genes demonstrated that MAPRE3, NPDC1, SMAD5, USP10, SPRY4, GTF2F2, SKA3, SMAP1 positively regulated, and RHOA and RAPGEF6 negatively regulated neurite outgrowth. Overall, our data showed that the combination of genome-wide bioinformatic analysis and biological experiments helps to identify the novel NRF-1-regulated genes, which play roles in differentiation of neuroblastoma cells.
Affiliation(s)
- Chih-Wei Tong
- Department of Physiology, National Cheng Kung University, College of Medicine, Tainan, Taiwan
|
21
|
Dreos R, Ambrosini G, Cavin Périer R, Bucher P. EPD and EPDnew, high-quality promoter resources in the next-generation sequencing era. Nucleic Acids Res 2012. [PMID: 23193273 PMCID: PMC3531148 DOI: 10.1093/nar/gks1233] [Citation(s) in RCA: 113] [Impact Index Per Article: 8.7] [Indexed: 12/16/2022] Open
Abstract
The Eukaryotic Promoter Database (EPD), available online at http://epd.vital-it.ch, is a collection of experimentally defined eukaryotic POL II promoters which has been maintained for more than 25 years. A promoter is represented by a single position in the genome, typically the major transcription start site (TSS). EPD primarily serves biologists interested in analysing the motif content, chromatin structure or DNA methylation status of co-regulated promoter subsets. Initially, promoter evidence came from TSS mapping experiments targeted at single genes and published in journal articles. Today, the TSS positions provided by EPD are inferred from next-generation sequencing data distributed in electronic form. Traditionally, EPD has been a high-quality database with low coverage. The focus of recent efforts has been to reach complete gene coverage for important model organisms. To this end, we introduced a new section called EPDnew, which is automatically assembled from multiple, carefully selected input datasets. As another novelty, we started to use chromatin signatures in addition to mRNA 5′ tags to locate promoters of weakly expressed genes. Regarding user interfaces, we introduced a new promoter viewer which enables users to explore promoter-defining experimental evidence in a UCSC genome browser window.
Affiliation(s)
- René Dreos
- Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne, Switzerland
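Because the abstract above describes a promoter as a single genomic position (typically the major TSS), downstream analyses usually start by cutting a strand-aware window around that position. The following is a minimal sketch of that step, not EPD's own tooling: the function name `promoter_window`, the 0-based coordinate convention, and the default window sizes are all assumptions chosen for illustration.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def promoter_window(chrom_seq: str, tss: int, strand: str,
                    upstream: int = 5, downstream: int = 2) -> str:
    """Return the sequence around a TSS in transcript orientation.

    tss is a 0-based index into chrom_seq (the forward strand).
    The result holds `upstream` bases, the TSS base, then `downstream`
    bases, truncated if the window runs past either end of the sequence.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream + 1
    else:
        # On the minus strand, "upstream" lies at higher coordinates.
        start, end = tss - downstream, tss + upstream + 1
    window = chrom_seq[max(start, 0):end]
    if strand == "-":
        window = window.translate(COMPLEMENT)[::-1]  # reverse complement
    return window


toy = "ACGTTGCAAT"  # hypothetical 10-base "chromosome"
plus_window = promoter_window(toy, tss=4, strand="+", upstream=2, downstream=1)
minus_window = promoter_window(toy, tss=4, strand="-", upstream=2, downstream=1)
```

Returning the window in transcript orientation keeps motif positions comparable between genes on opposite strands.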
|
22
|
Li H, Chen D, Zhang J. Analysis of intron sequence features associated with transcriptional regulation in human genes. PLoS One 2012; 7:e46784. [PMID: 23082130 PMCID: PMC3474797 DOI: 10.1371/journal.pone.0046784] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Received: 03/13/2012] [Accepted: 09/06/2012] [Indexed: 11/18/2022] Open
Abstract
Although some preliminary work has revealed the potential transcriptional regulatory function of introns in eukaryotes, additional evidence is needed to support this conjecture. In this study, we perform systematic analyses of the sequence characteristics of human introns. The results show that first introns are generally longer and that C, G and their dinucleotide compositions are over-represented relative to other introns, which is consistent with previous findings. In addition, some new phenomena related to transcriptional regulation are found: i) first introns are enriched in CpG islands; and ii) the percentages of first introns containing TATA, CAAT and GC boxes are higher than those of introns at other positions. Similar intron features are observed in tissue-specific genes. The results further support that the first introns of human genes are likely to be involved in transcriptional regulation, and give an insight into the transcriptional regulatory regions of genes.
Affiliation(s)
- Huimin Li
- Laboratory for Conservation and Utilization of Bio-resources, Yunnan University, Kunming, China
- School of Mathematics and Computer Science, Yunnan University of Nationalities, Kunming, China
- Dan Chen
- Laboratory for Conservation and Utilization of Bio-resources, Yunnan University, Kunming, China
- School of Mathematics and Statistics, Yunnan University, Kunming, China
- Jing Zhang
- School of Mathematics and Statistics, Yunnan University, Kunming, China
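The sequence features discussed in the abstract above (G and C content, CpG enrichment) can be made concrete with a small sketch. The abstract does not state which CpG island definition the authors applied; the observed/expected ratio below follows the commonly used Gardiner-Garden and Frommer formulation, and the function name `cpg_stats` is an assumption for illustration.

```python
def cpg_stats(seq: str) -> tuple[float, float]:
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence.

    Expected CpG count is (#C * #G) / length. Under the commonly cited
    criteria, a CpG island is a stretch of at least 200 bp with
    GC fraction above 0.5 and observed/expected ratio above 0.6.
    """
    s = seq.upper()
    n = len(s)
    if n == 0:
        return 0.0, 0.0
    c, g = s.count("C"), s.count("G")
    observed = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    expected = c * g / n
    ratio = observed / expected if expected else 0.0
    return (c + g) / n, ratio


gc_fraction, obs_exp = cpg_stats("ACGT")  # hypothetical toy sequence
```

In practice the function would be applied to sliding windows along each intron rather than to a whole sequence at once.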
|
23
|
Katsushima K, Shinjo K, Natsume A, Ohka F, Fujii M, Osada H, Sekido Y, Kondo Y. Contribution of microRNA-1275 to Claudin11 protein suppression via a polycomb-mediated silencing mechanism in human glioma stem-like cells. J Biol Chem 2012; 287:27396-406. [PMID: 22736761 DOI: 10.1074/jbc.m112.359109] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.6] [Indexed: 12/31/2022] Open
Abstract
Glioblastomas show heterogeneous histological features, and tumor cells show distinct phenotypic states that confer different functional attributes and an aggressive character. However, the molecular mechanisms underlying the heterogeneity in this disease are poorly understood. Glioma stem-like cells (GSCs) are considered able to aberrantly differentiate into diverse cell types and may contribute to the establishment of tumor heterogeneity. Using a GSC model, we investigated differentially expressed microRNAs (miRNAs) and associated epigenetic mechanisms that regulate the differentiation of GSCs. miRNA profiling using microarray technology showed that 13 and 34 miRNAs were commonly up-regulated and down-regulated in two independent GSC lines during differentiation, respectively. Among this set of miRNAs, quantitative PCR analysis showed that miRNA-1275 (miR-1275) was consistently down-regulated during GSC differentiation, along with the up-regulation of its target, CLDN11, an important protein during oligodendroglial lineage differentiation. Inhibition of miR-1275 with a specific antisense oligonucleotide (anti-miR-1275) in GSCs increased the expression of CLDN11, together with significant growth suppression. Epigenetic analysis revealed that gain of histone H3 lysine 27 trimethylation (H3K27me3) in the primary microRNA-1275 promoter was closely associated with miR-1275 expression. Treatment with 3-deazaneplanocin A, an inhibitor of H3K27 methyltransferase, attenuated CLDN11 induction by serum stimulation in parallel with sustained miR-1275 expression. Our results have illuminated the epigenetic regulatory pathways of miR-1275 that are closely associated with oligodendroglial differentiation, which may contribute to the tissue heterogeneity seen in the formation of glioblastomas. 
Given that inhibition of miR-1275 induces expression of oligodendroglial lineage proteins and suppresses tumor cell proliferation, this may be a potential therapeutic target for glioblastomas.
Affiliation(s)
- Keisuke Katsushima
- Division of Molecular Oncology, Aichi Cancer Center Research Institute, Nagoya 464-8681, Japan
|
24
|
Morita M, Nakamura M, Hamada M, Takahashi S. Combinatorial motif analysis of regulatory gene expression in Mafb deficient macrophages. BMC SYSTEMS BIOLOGY 2011; 5 Suppl 2:S7. [PMID: 22784578 PMCID: PMC3287487 DOI: 10.1186/1752-0509-5-s2-s7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 01/08/2023]
Abstract
Background Deficiency of the transcription factor MafB, which is normally expressed in macrophages, can underlie cellular dysfunction associated with a range of autoimmune diseases and arteriosclerosis. MafB has important roles in cell differentiation and regulation of target gene expression; however, the mechanisms of this regulation and the identities of other transcription factors with which MafB interacts remain uncertain. Bioinformatics methods provide a valuable approach for elucidating the nature of these interactions with transcriptional regulatory elements from a large number of DNA sequences. In particular, identification of patterns of co-occurrence of regulatory cis-elements (motifs) offers a robust approach. Results Here, the directional relationships among several functional motifs were evaluated using the Log-linear Graphical Model (LGM) after extraction and search for evolutionarily conserved motifs. This analysis highlighted GATA-1 motifs and 5’AT-rich half Maf recognition elements (MAREs) in promoter regions of 18 genes that were down-regulated in Mafb deficient macrophages. GATA-1 motifs and MafB motifs could regulate expression of these genes in both a negative and positive manner, respectively. The validity of this conclusion was tested with data from a luciferase assay that used a C1qa promoter construct carrying both the GATA-1 motifs and MAREs. GATA-1 was found to inhibit the activity of the C1qa promoter with the GATA-1 motifs and MafB motifs. Conclusions These observations suggest that both the GATA-1 motifs and MafB motifs are important for lineage specific expression of C1qa. In addition, these findings show that analysis of combinations of evolutionarily conserved motifs can be successfully used to identify patterns of gene regulation.
Affiliation(s)
- Mariko Morita
- Department of Anatomy and Embryology, Institute of Basic Medical Sciences, Graduate School of Comprehensive Human Sciences, University of Tsukuba, 1-1-1, Tennodai, Tsukuba, 305-8575, Ibaraki, Japan.
|
25
|
Yamashita R, Sugano S, Suzuki Y, Nakai K. DBTSS: DataBase of Transcriptional Start Sites progress report in 2012. Nucleic Acids Res 2011; 40:D150-4. [PMID: 22086958 PMCID: PMC3245115 DOI: 10.1093/nar/gkr1005] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.2] [Indexed: 11/13/2022]
Abstract
To support transcriptional regulation studies, we have constructed DBTSS (DataBase of Transcriptional Start Sites), which contains exact positions of transcriptional start sites (TSSs), determined with our own technique named TSS-seq, in the genomes of various species. In its latest version, DBTSS covers the data of the majority of human adult and embryonic tissues: it now contains 418 million TSS tag sequences from 28 tissues/cell cultures. Moreover, we integrated a series of our own transcriptomic data, such as the RNA-seq data of subcellular-fractionated RNAs as well as the ChIP-seq data of histone modifications and the binding of RNA polymerase II/several transcription factors in cultured cell lines into our original TSS information. We also included several external epigenomic data, such as the chromatin map of the ENCODE project. We further associated our TSS information with public or original single-nucleotide variation (SNV) data, in order to identify SNVs in the regulatory regions. These data can be browsed in our new viewer, which supports versatile search conditions of users. We believe that our new DBTSS will be an invaluable resource for interpreting the differential uses of TSSs and for identifying human genetic variations that are associated with disordered transcriptional regulation. DBTSS can be accessed at http://dbtss.hgc.jp.
Affiliation(s)
- Riu Yamashita
- Frontier Research Initiative, Institute of Medical Science, Human Genome Center, Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Japan
|
26
|
Functional characterization of human Kindlin-2 core promoter identifies a key role of SP1 in Kindlin-2 transcriptional regulation. Cell Mol Biol Lett 2011; 16:638-51. [PMID: 21922223 PMCID: PMC6275727 DOI: 10.2478/s11658-011-0028-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/05/2011] [Accepted: 09/08/2011] [Indexed: 11/20/2022]
Abstract
Kindlin-2 is a recently identified FERM and PH domain-containing, integrin-interacting protein. Kindlin-2 is ubiquitously expressed in normal tissues. So far, much effort has been spent exploring the functional aspects of Kindlin-2. However, the transcriptional regulation of Kindlin-2 has not yet been investigated. In this study we identified and functionally characterized the promoter of the human Kindlin-2 gene. We show that the core promoter of Kindlin-2 is a 39 base pair long GC-rich fragment located at −122/−83 upstream of the Kindlin-2 transcription start site. Functional characterization of this core promoter region by both in silico and in vitro/in vivo analysis shows that the transcription factor SP1 plays an important role in regulation of Kindlin-2 expression.
|
27
|
Ohshima D, Qin J, Konno H, Hirosawa A, Shiraishi T, Yanai H, Shimo Y, Shinzawa M, Akiyama N, Yamashita R, Nakai K, Akiyama T, Inoue JI. RANK signaling induces interferon-stimulated genes in the fetal thymic stroma. Biochem Biophys Res Commun 2011; 408:530-6. [PMID: 21527253 DOI: 10.1016/j.bbrc.2011.04.049] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 03/31/2011] [Accepted: 04/09/2011] [Indexed: 11/22/2022]
Abstract
Medullary thymic epithelial cells (mTECs) are essential for thymic negative selection to prevent autoimmunity. Previous studies show that mTEC development is dependent on the signal transducers TRAF6 and NIK. However, the downstream target genes of signals controlled by these molecules remain unknown. We performed a microarray analysis on mRNAs down-regulated by deficiencies in TRAF6 or functional NIK in an in vitro organ culture of fetal thymic stromata (2DG-FTOC). An in silico analysis of transcription factor binding sites in plausible promoter regions of differentially expressed genes suggests that STAT1 is involved in TRAF6- and NIK-dependent gene expression. Indeed, the signal of RANK, a TNF receptor family member that activates TRAF6 and NIK, induces the activation of STAT1 in 2DG-FTOC. Moreover, RANK signaling induces the up-regulation of interferon (IFN)-stimulated gene (ISG) expression, suggesting that the RANKL-dependent activation of STAT1 up-regulates ISG expression. The RANKL-dependent expression levels of ISGs were reduced but not completely abolished in interferon α receptor 1-deficient (Ifnar1(-/-)) 2DG-FTOC. Our data suggest that RANK signaling induces ISG expression in both type I interferon-independent and interferon-dependent mechanisms.
Affiliation(s)
- Daisuke Ohshima
- Division of Cellular and Molecular Biology, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokane-dai, Minato-ku, Tokyo, Japan
|
28
|
Davies JS, Klein DC, Carter DA. Selective genomic targeting by FRA-2/FOSL2 transcription factor: regulation of the Rgs4 gene is mediated by a variant activator protein 1 (AP-1) promoter sequence/CREB-binding protein (CBP) mechanism. J Biol Chem 2011; 286:15227-39. [PMID: 21367864 PMCID: PMC3083148 DOI: 10.1074/jbc.m110.201996] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Received: 11/10/2010] [Revised: 01/12/2011] [Indexed: 01/21/2023]
Abstract
FRA-2/FOSL2 is a basic region-leucine zipper motif transcription factor that is widely expressed in mammalian tissues. The functional repertoire of this factor is unclear, partly due to a lack of knowledge of genomic sequences that are targeted. Here, we identified novel, functional FRA-2 targets across the genome through expression profile analysis in a knockdown transgenic rat. In this model, a nocturnal rhythm of pineal gland FRA-2 is suppressed by a genetically encoded, dominant negative mutant protein. Bioinformatic analysis of validated sets of FRA-2-regulated and -nonregulated genes revealed that the FRA-2 regulon is limited by genomic target selection rules that, in general, transcend core cis-sequence identity. However, one variant AP-1-related (AP-1R) sequence was common to a subset of regulated genes. The functional activity and protein binding partners of a candidate AP-1R sequence were determined for a novel FRA-2-repressed gene, Rgs4. FRA-2 protein preferentially associated with a proximal Rgs4 AP-1R sequence as demonstrated by ex vivo ChIP and in vitro EMSA analysis; moreover, transcriptional repression was blocked by mutation of the AP-1R sequence, whereas mutation of an upstream consensus AP-1 family sequence did not affect Rgs4 expression. Nocturnal changes in protein complexes at the Rgs4 AP-1R sequence are associated with FRA-2-dependent dismissal of the co-activator, CBP; this provides a mechanistic basis for Rgs4 gene repression. These studies have also provided functional insight into selective genomic targeting by FRA-2, highlighting discordance between predicted and actual targets. Future studies should address FRA-2-Rgs4 interactions in other systems, including the brain, where FRA-2 function is poorly understood.
Affiliation(s)
- Jeff S. Davies
- School of Biosciences, Cardiff University, Cardiff CF10 3AX, Wales, United Kingdom
- David C. Klein
- Section on Neuroendocrinology, Program on Developmental Endocrinology and Genetics, NICHD, National Institutes of Health, Bethesda, Maryland 20892
- David A. Carter
- School of Biosciences, Cardiff University, Cardiff CF10 3AX, Wales, United Kingdom
|
29
|
Kim HJ, Ko MS, Kim HK, Cho WJ, Lee SH, Lee BJ, Park JW. Transcription factor Sp1 regulates basal transcription of the human DRG2 gene. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms 2011; 1809:184-90. [PMID: 21296692 DOI: 10.1016/j.bbagrm.2011.01.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/04/2010] [Revised: 01/18/2011] [Accepted: 01/20/2011] [Indexed: 11/30/2022]
Abstract
Developmentally regulated GTP-binding protein 2 (DRG2) is an evolutionarily conserved GTP-binding protein. DRG2 mRNA expression has been confirmed in many animal and human tissues. DRG2 is thought to play an essential role in the control of cell growth and differentiation. However, transcriptional regulation of DRG2 is largely unknown. To investigate the mechanisms controlling DRG2 expression, we cloned 1509 bp of the 5'-flanking sequence of this gene. Deletion analysis showed that the region between -113 and -70 is essential for the basal level expression of the DRG2 gene in K562 human erythroleukemic cells. Mutation of a putative stimulating protein 1 (Sp1) regulatory site located at position -108 resulted in a significant decline in DRG2 promoter activity. Electrophoretic mobility shift assay and chromatin immunoprecipitation analysis revealed that Sp1 binds to this site. Knockdown of Sp1 expression using siRNA inhibited promoter activation as well as the endogenous DRG2 transcript level. Taken together, these results demonstrate that the basal expression level of DRG2 is regulated by the Sp1 transcription factor.
Affiliation(s)
- Hyo Jeong Kim
- Department of Biological Sciences, University of Ulsan, Ulsan 680-749, Korea
|
30
|
Motti D, Le Duigou C, Eugène E, Chemaly N, Wittner L, Lazarevic D, Krmac H, Marstrand T, Valen E, Sanges R, Stupka E, Sandelin A, Cherubini E, Gustincich S, Miles R. Gene expression analysis of the emergence of epileptiform activity after focal injection of kainic acid into mouse hippocampus. Eur J Neurosci 2011; 32:1364-79. [PMID: 20950280 DOI: 10.1111/j.1460-9568.2010.07403.x] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 01/07/2023]
Abstract
We report gene profiling data on genomic processes underlying the progression towards recurrent seizures after injection of kainic acid (KA) into the mouse hippocampus. Focal injection enabled us to separate the effects of proepileptic stimuli initiated by KA injection. Both the injected and contralateral hippocampus participated in the status epilepticus. However, neuronal death induced by KA treatment was restricted to the injected hippocampus, although there was some contralateral axonal degeneration. We profiled gene expression changes in dorsal and ventral regions of both the injected and contralateral hippocampus. Changes were detected in the expression of 1526 transcripts in samples from three time-points: (i) during the KA-induced status epilepticus, (ii) at 2 weeks, before recurrent seizures emerged, and (iii) at 6 months after seizures emerged. Grouping genes with similar spatio-temporal changes revealed an early transcriptional response, strong immune, cell death and growth responses at 2 weeks and an activation of immune and extracellular matrix genes persisting at 6 months. Immunostaining for proteins coded by genes identified from array studies provided evidence for gliogenesis and suggested that the proteoglycan biglycan is synthesized by astrocytes and contributes to a glial scar. Gene changes at 6 months after KA injection were largely restricted to tissue from the injection site. This suggests that either recurrent seizures might depend on maintained processes including immune responses and changes in extracellular matrix proteins near the injection site or alternatively might result from processes, such as growth, distant from the injection site and terminated while seizures are maintained.
Affiliation(s)
- Dario Motti
- SISSA/ISAS International School for Advanced Studies, Neurobiology Sector, Trieste, Italy
|
31
|
When needles look like hay: how to find tissue-specific enhancers in model organism genomes. Dev Biol 2010; 350:239-54. [PMID: 21130761 DOI: 10.1016/j.ydbio.2010.11.026] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Received: 04/14/2010] [Revised: 11/11/2010] [Accepted: 11/22/2010] [Indexed: 01/22/2023]
Abstract
A major prerequisite for the investigation of tissue-specific processes is the identification of cis-regulatory elements. No generally applicable technique is available to distinguish them from any other type of genomic non-coding sequence. Therefore, researchers often have to identify these elements by elaborate in vivo screens, testing individual regions until the right one is found. Here, based on many examples from the literature, we summarize how functional enhancers have been isolated from other elements in the genome and how they have been characterized in transgenic animals. Covering computational and experimental studies, we provide an overview of the global properties of cis-regulatory elements, like their specific interactions with promoters and target gene distances. We describe conserved non-coding elements (CNEs) and their internal structure, nucleotide composition, binding site clustering and overlap, with a special focus on developmental enhancers. Conflicting data and unresolved questions on the nature of these elements are highlighted. Our comprehensive overview of the experimental shortcuts that have been found in the different model organism communities and the new field of high-throughput assays should help during the preparation phase of a screen for enhancers. The review is accompanied by a list of general guidelines for such a project.
|
32
|
Dineen DG, Schröder M, Higgins DG, Cunningham P. Ensemble approach combining multiple methods improves human transcription start site prediction. BMC Genomics 2010; 11:677. [PMID: 21118509 PMCID: PMC3053590 DOI: 10.1186/1471-2164-11-677] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 04/21/2010] [Accepted: 11/30/2010] [Indexed: 11/20/2022]
Abstract
Background The computational prediction of transcription start sites is an important unsolved problem. Some recent progress has been made, but many promoters, particularly those not associated with CpG islands, are still difficult to locate using current methods. These methods use different features and training sets, along with a variety of machine learning techniques and result in different prediction sets. Results We demonstrate the heterogeneity of current prediction sets, and take advantage of this heterogeneity to construct a two-level classifier ('Profisi Ensemble') using predictions from 7 programs, along with 2 other data sources. Support vector machines using 'full' and 'reduced' data sets are combined in an either/or approach. We achieve a 14% increase in performance over the current state-of-the-art, as benchmarked by a third-party tool. Conclusions Supervised learning methods are a useful way to combine predictions from diverse sources.
Affiliation(s)
- David G Dineen
- Complex and Adaptive Systems Laboratory (CASL), University College Dublin, Belfield, Dublin 4, Ireland.
|
33
|
Yang MQ, Laflamme K, Gotea V, Joiner CH, Seidel NE, Wong C, Petrykowska HM, Lichtenberg J, Lee S, Welch L, Gallagher PG, Bodine DM, Elnitski L. Genome-wide detection of a TFIID localization element from an initial human disease mutation. Nucleic Acids Res 2010; 39:2175-87. [PMID: 21071415 PMCID: PMC3064768 DOI: 10.1093/nar/gkq1035] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 12/15/2022]
Abstract
Eukaryotic core promoters are often characterized by the presence of consensus motifs such as the TATA box or initiator elements, which attract and direct the transcriptional machinery to the transcription start site. However, many human promoters have none of the known core promoter motifs, suggesting that undiscovered promoter motifs exist in the genome. We previously identified a mutation in the human Ankyrin-1 (ANK-1) promoter that causes the disease ankyrin-deficient Hereditary Spherocytosis (HS). Although the ANK-1 promoter is CpG rich, no discernible basal promoter elements had been identified. We showed that the HS mutation disrupted the binding of the transcription factor TFIID, the major component of the pre-initiation complex. We hypothesized that the mutation identified a candidate promoter element with a more widespread role in gene regulation. We examined 17,181 human promoters for the experimentally validated binding site, called the TFIID localization sequence (DLS), and found three times as many promoters containing DLS as TATA motifs. Mutational analyses of DLS sequences confirmed their functional significance, as did the addition of a DLS site to a minimal Sp1 promoter. Our results demonstrate that novel promoter elements can be identified on a genome-wide scale through observations of regulatory disruptions that cause human disease.
Affiliation(s)
- Mary Q Yang
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Rockville, MD 20852, USA
|
34
|
Kogai T, Liu YY, Richter LL, Mody K, Kagechika H, Brent GA. Retinoic acid induces expression of the thyroid hormone transporter, monocarboxylate transporter 8 (Mct8). J Biol Chem 2010; 285:27279-27288. [PMID: 20573951 DOI: 10.1074/jbc.m110.123158] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Indexed: 11/06/2022]
Abstract
Retinoic acid (RA) and thyroid hormone are critical for differentiation and organogenesis in the embryo. Mct8 (monocarboxylate transporter 8), expressed predominantly in the brain and placenta, mediates thyroid hormone uptake from the circulation and is required for normal neural development. RA induces differentiation of F9 mouse teratocarcinoma cells toward neurons as well as extraembryonal endoderm. We hypothesized that Mct8 is functionally expressed in F9 cells and induced by RA. All-trans-RA (tRA) and other RA receptor (RAR) agonists dramatically (>300-fold) induced Mct8. tRA treatment significantly increased uptake of triiodothyronine and thyroxine (4.1- and 4.3-fold, respectively), which was abolished by a selective Mct8 inhibitor, bromosulfophthalein. Sequence inspection of the Mct8 promoter region and 5'-rapid amplification of cDNA ends PCR analysis in F9 cells identified 11 transcription start sites and a proximal Sp1 site but no TATA box. tRA significantly enhanced Mct8 promoter activity through a consensus RA-responsive element located 6.6 kilobases upstream of the coding region. A chromatin immunoprecipitation assay demonstrated binding of RAR and retinoid X receptor to the RA response element. The promotion of thyroid hormone uptake through the transcriptional up-regulation of Mct8 by RAR is likely to be important for extraembryonic endoderm development and neural differentiation. This finding demonstrates cross-talk between RA signaling and thyroid hormone signaling in early development at the level of the thyroid hormone transporter.
Affiliation(s)
- Takahiko Kogai
- Molecular Endocrinology Laboratory, Veterans Affairs Greater Los Angeles Healthcare System, and the Departments of Medicine and Physiology, David Geffen School of Medicine at UCLA, Los Angeles, California 90073
- Yan-Yun Liu
- Molecular Endocrinology Laboratory, Veterans Affairs Greater Los Angeles Healthcare System, and the Departments of Medicine and Physiology, David Geffen School of Medicine at UCLA, Los Angeles, California 90073
- Laura L Richter
- Molecular Endocrinology Laboratory, Veterans Affairs Greater Los Angeles Healthcare System, and the Departments of Medicine and Physiology, David Geffen School of Medicine at UCLA, Los Angeles, California 90073
- Kaizeen Mody
- Molecular Endocrinology Laboratory, Veterans Affairs Greater Los Angeles Healthcare System, and the Departments of Medicine and Physiology, David Geffen School of Medicine at UCLA, Los Angeles, California 90073
- Hiroyuki Kagechika
- Institute of Biomaterials and Bioengineering, Tokyo Medical and Dental University, Tokyo 101-0062, Japan
- Gregory A Brent
- Molecular Endocrinology Laboratory, Veterans Affairs Greater Los Angeles Healthcare System, and the Departments of Medicine and Physiology, David Geffen School of Medicine at UCLA, Los Angeles, California 90073
|
35
|
Mätlik K, Redik K, Speek M. L1 antisense promoter drives tissue-specific transcription of human genes. J Biomed Biotechnol 2010; 2006:71753. [PMID: 16877819 PMCID: PMC1559930 DOI: 10.1155/jbb/2006/71753] [Citation(s) in RCA: 99] [Impact Index Per Article: 6.6] [Indexed: 11/18/2022]
Abstract
Transcription of transposable elements interspersed in the genome is controlled by complex interactions between their regulatory elements and host factors. However, the same regulatory elements may be occasionally used for the transcription of host genes. One such example is the human L1 retrotransposon, which contains an antisense promoter (ASP) driving transcription into adjacent genes yielding chimeric transcripts. We have characterized 49 chimeric mRNAs corresponding to sense and antisense strands of human genes. Here we show that L1 ASP is capable of functioning as an alternative promoter, giving rise to a chimeric transcript whose coding region is identical to the ORF of mRNA of the following genes: KIAA1797, CLCN5, and SLCO1A2. Furthermore, in these cases the activity of L1 ASP is tissue-specific and may expand the expression pattern of the respective gene. The activity of L1 ASP is tissue-specific also in cases where L1 ASP produces antisense RNAs complementary to COL11A1 and BOLL mRNAs. Simultaneous assessment of the activity of L1 ASPs in multiple loci revealed the presence of L1 ASP-derived transcripts in all human tissues examined. We also demonstrate that L1 ASP can act as a promoter in vivo and predict that it has a heterogeneous transcription initiation site. Our data suggest that L1 ASP-driven transcription may increase the transcriptional flexibility of several human genes.
Affiliation(s)
- Kert Mätlik
- Department of Gene Technology, Tallinn University of Technology, Akadeemia tee 15, Tallinn 19086, Estonia
- Kaja Redik
- Department of Gene Technology, Tallinn University of Technology, Akadeemia tee 15, Tallinn 19086, Estonia
- Mart Speek
- Department of Gene Technology, Tallinn University of Technology, Akadeemia tee 15, Tallinn 19086, Estonia
|
36
|
Brown HJ, Peng L, Harada JN, Walker JR, Cole S, Lin SF, Zack JA, Chanda SK, Sun R. Gene expression and transcription factor profiling reveal inhibition of transcription factor cAMP-response element-binding protein by gamma-herpesvirus replication and transcription activator. J Biol Chem 2010; 285:25139-53. [PMID: 20516076 DOI: 10.1074/jbc.m110.137737] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 12/30/2022]
Abstract
Herpesvirus replication involves the expression of over 80 viral genes in a well ordered sequence, leading to the production of new virions. Viral genes expressed during the earliest phases of replication often regulate both viral and cellular genes. Therefore, they have the potential to bring about dramatic functional changes within the cell. Replication and transcription activator (RTA) is a potent immediate early transcription activator of the gamma-herpesvirus family. This family includes Epstein-Barr virus and Kaposi sarcoma-associated herpesvirus, human pathogens associated with malignancy. Here we combine gene array technology with transcription factor profiling to identify the earliest DNA promoter and cellular transcription factor targets of RTA in the cellular genome. We find that expression of RTA leads to both activation and inhibition of distinct groups of cellular genes. The identity of the target genes suggests that RTA rapidly changes the cellular environment to counteract cell death pathways, support growth factor signaling, and also promote immune evasion of the infected cell. Transcription factor profiling of the target gene promoters highlighted distinct pathways involved in gene activation at specific time points. Most notable throughout was the high level of cAMP-response element-binding protein (CREB)-response elements in RTA target genes. We find that RTA can function as either an activator or an inhibitor of CREB-response genes, depending on the promoter context. The association with CREB also highlights a novel connection and coordination between viral and cellular "immediate early" responses.
Affiliation(s)
- Helen J Brown
- Department of Microbiology, Division of Hematology-Oncology, David Geffen School of Medicine, UCLA, Los Angeles, California 90095, USA.
|
37
|
Functional analysis of a novel cis-acting regulatory region within the human ankyrin gene (ANK-1) promoter. Mol Cell Biol 2010; 30:3493-502. [PMID: 20479128 DOI: 10.1128/mcb.00119-10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/10/2023]
Abstract
The characterization of atypical mutations in loci associated with diseases is a powerful tool to discover novel regulatory elements. We previously identified a dinucleotide deletion in the human ankyrin-1 gene (ANK-1) promoter that underlies ankyrin-deficient hereditary spherocytosis. The presence of the deletion was associated with a decrease in promoter function both in vitro and in vivo establishing it as a causative hereditary spherocytosis mutation. The dinucleotide deletion is located in the 5' untranslated region of the ANK-1 gene and disrupts the binding of TATA binding protein and TFIID, components of the preinitiation complex. We hypothesized that the nucleotides surrounding the mutation define an uncharacterized regulatory sequence. To test this hypothesis, we generated a library of more than 16,000 ANK-1 promoters with degenerate sequence around the mutation and cloned the functional promoter sequences after cell-free transcription. We identified the wild type and three additional sequences, from which we derived a consensus. The sequences were shown to be functional in cell-free transcription, transient-transfection, and transgenic mouse assays. One sequence increased ANK-1 promoter function 5-fold, while randomly chosen sequences decreased ANK-1 promoter function. Our results demonstrate a novel functional motif in the ANK-1 promoter.
|
38
|
Huda A, Mariño-Ramírez L, Jordan IK. Epigenetic histone modifications of human transposable elements: genome defense versus exaptation. Mob DNA 2010; 1:2. [PMID: 20226072 PMCID: PMC2836006 DOI: 10.1186/1759-8753-1-2] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.5] [Received: 06/19/2009] [Accepted: 01/25/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND Transposition is disruptive in nature and, thus, it is imperative for host genomes to evolve mechanisms that suppress the activity of transposable elements (TEs). At the same time, transposition also provides diverse sequences that can be exapted by host genomes as functional elements. These notions form the basis of two competing hypotheses pertaining to the role of epigenetic modifications of TEs in eukaryotic genomes: the genome defense hypothesis and the exaptation hypothesis. To date, all available evidence points to the genome defense hypothesis as the best explanation for the biological role of TE epigenetic modifications. RESULTS We evaluated several predictions generated by the genome defense hypothesis versus the exaptation hypothesis using recently characterized epigenetic histone modification data for the human genome. To this end, we mapped chromatin immunoprecipitation sequence tags from 38 histone modifications, characterized in CD4+ T cells, to the human genome and calculated their enrichment and depletion in all families of human TEs. We found that several of these families are significantly enriched or depleted for various histone modifications, both active and repressive. The enrichment of human TE families with active histone modifications is consistent with the exaptation hypothesis and stands in contrast to previous analyses that have found mammalian TEs to be exclusively repressively modified. Comparisons between TE families revealed that older families carry more histone modifications than younger ones, another observation consistent with the exaptation hypothesis. However, data from within family analyses on the relative ages of epigenetically modified elements are consistent with both the genome defense and exaptation hypotheses. 
Finally, TEs located proximal to genes carry more histone modifications than the ones that are distal to genes, as may be expected if epigenetically modified TEs help to regulate the expression of nearby host genes. CONCLUSIONS With a few exceptions, most of our findings support the exaptation hypothesis for the role of TE epigenetic modifications when vetted against the genome defense hypothesis. The recruitment of epigenetic modifications may represent an additional mechanism by which TEs can contribute to the regulatory functions of their host genomes.
Affiliation(s)
- Ahsan Huda
- School of Biology, Georgia Institute of Technology, 310 Ferst Drive, Atlanta, GA 30332, USA
- Leonardo Mariño-Ramírez
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Computational Biology and Bioinformatics Unit, Biotechnology and Bioindustry Center, Corporacion Colombiana de Investigacion Agropecuaria - CORPOICA, Km 14 Via a Mosquera, Bogota, Colombia
- I King Jordan
- School of Biology, Georgia Institute of Technology, 310 Ferst Drive, Atlanta, GA 30332, USA
|
39
|
Heteronemin, a spongean sesterterpene, inhibits TNF alpha-induced NF-kappa B activation through proteasome inhibition and induces apoptotic cell death. Biochem Pharmacol 2010; 79:610-22. [PMID: 19814997 DOI: 10.1016/j.bcp.2009.09.027] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.1] [Received: 07/24/2009] [Revised: 09/29/2009] [Accepted: 09/30/2009] [Indexed: 01/08/2023]
Abstract
In this study, we investigated the biological effects of heteronemin, a marine sesterterpene isolated from the sponge Hyrtios sp., on chronic myelogenous leukemia cells. To gain further insight into the molecular mechanisms triggered by this compound, we initially performed DNA microarray profiling and determined which genes respond to heteronemin stimulation in TNFα-treated cells and which genes display an interaction effect between heteronemin and TNFα. Within the differentially regulated genes, we found that heteronemin was affecting cellular processes including cell cycle, apoptosis, the mitogen-activated protein kinase (MAPK) pathway and the nuclear factor κB (NF-κB) signaling cascade. We confirmed in silico experiments regarding NF-κB inhibition by reporter gene analysis, electrophoretic mobility shift analysis and IκB degradation. In order to assess the underlying molecular mechanisms, we determined that heteronemin inhibits both trypsin-like and chymotrypsin-like proteasome activity at an IC50 of 0.4 µM. Concomitant to the inhibition of the NF-κB pathway, we also observed a reduction in cellular viability. Heteronemin induces apoptosis as shown by annexin V-FITC/propidium iodide staining, nuclear morphology analysis, pro-caspase-3, -8 and -9 and poly(ADP-ribose) polymerase (PARP) cleavage as well as truncation of Bid. Altogether, results show that this compound has potential as an anti-inflammatory and anti-cancer agent.
|
40
|
Murakami N, Hashidate T, Harayama T, Yokomizo T, Shimizu T, Nakamura M. Transcriptional regulation of human G2A in monocytes/macrophages: involvement of c/EBPs, Runx and Pu.1. Genes Cells 2009; 14:1441-55. [DOI: 10.1111/j.1365-2443.2009.01360.x] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]
|
41
|
Huda A, Mariño-Ramírez L, Landsman D, Jordan IK. Repetitive DNA elements, nucleosome binding and human gene expression. Gene 2009; 436:12-22. [PMID: 19393174 PMCID: PMC2921533 DOI: 10.1016/j.gene.2009.01.013] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Received: 01/22/2009] [Accepted: 01/23/2009] [Indexed: 11/26/2022]
Abstract
We evaluated the epigenetic contributions of repetitive DNA elements to human gene regulation. Human proximal promoter sequences show distinct distributions of transposable elements (TEs) and simple sequence repeats (SSRs). TEs are enriched distal from transcriptional start sites (TSSs) and their frequency decreases closer to TSSs, being largely absent from the core promoter region. SSRs, on the other hand, are found at low frequency distal to the TSS and then increase in frequency starting approximately 150 bp upstream of the TSS. The peak of SSR density is centered around the -35 bp position where the basal transcriptional machinery assembles. These trends in repetitive sequence distribution are strongly correlated, positively for TEs and negatively for SSRs, with relative nucleosome binding affinities along the promoters. Nucleosomes bind with highest probability distal from the TSS and the nucleosome binding affinity steadily decreases reaching its nadir just upstream of the TSS at the same point where SSR frequency is at its highest. Promoters that are enriched for TEs are more highly and broadly expressed, on average, than promoters that are devoid of TEs. In addition, promoters that have similar repetitive DNA profiles regulate genes that have more similar expression patterns and encode proteins with more similar functions than promoters that differ with respect to their repetitive DNA. Furthermore, distinct repetitive DNA promoter profiles are correlated with tissue-specific patterns of expression. These observations indicate that repetitive DNA elements mediate chromatin accessibility in proximal promoter regions and the repeat content of promoters is relevant to both gene expression and function.
Affiliation(s)
- Ahsan Huda
- School of Biology, Georgia Institute of Technology, Atlanta, GA 30332
- Leonardo Mariño-Ramírez
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894
- David Landsman
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894
- I. King Jordan
- School of Biology, Georgia Institute of Technology, Atlanta, GA 30332
|
42
|
RBF-TSS: identification of transcription start site in human using radial basis functions network and oligonucleotide positional frequencies. PLoS One 2009; 4:e4878. [PMID: 19287502 PMCID: PMC2654504 DOI: 10.1371/journal.pone.0004878] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 10/23/2008] [Accepted: 02/20/2009] [Indexed: 11/19/2022] Open
Abstract
Accurate identification of promoter regions and transcription start sites (TSS) in genomic DNA allows for a more complete understanding of the structure of genes and gene regulation within a given genome. Many recently published methods have achieved high accuracy in TSS identification. However, more accurate models of promoters and TSS are still needed. We propose a novel method for identifying transcription start sites that improves on the TSS recognition accuracy of recently published methods. This method incorporates a metric feature based on oligonucleotide positional frequencies, taking into account the nature of promoters. A radial basis function neural network for identifying transcription start sites (RBF-TSS) is proposed and employed as the classification algorithm. Using non-overlapping chunks (windows) of size 50 and 500 on the human genome, the proposed method achieves an area under the receiver operating characteristic curve (auROC) of 94.75% and 95.08%, respectively, providing increased performance over existing TSS prediction methods.
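To make the idea of oligonucleotide positional frequencies concrete, the sketch below tabulates, for a set of aligned windows, how often each k-mer occurs at each position. It is not the RBF-TSS feature itself, whose exact definition is not given in the abstract; the function name, the choice of k = 2, and the toy windows are assumptions.

```python
from collections import Counter

def positional_kmer_frequencies(sequences, k=2):
    """Return one dict per position mapping each k-mer to its relative frequency.

    All sequences must be aligned windows of the same length.
    """
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    table = []
    for pos in range(length - k + 1):
        # Count the k-mer that starts at this position in every window.
        counts = Counter(s[pos:pos + k].upper() for s in sequences)
        total = sum(counts.values())
        table.append({kmer: n / total for kmer, n in counts.items()})
    return table

# Toy aligned windows (hypothetical, not real promoter data).
windows = ["TATAAG", "TATATG", "GATAAG", "TATAAC"]
freqs = positional_kmer_frequencies(windows, k=2)
print(freqs[0])
```

A table like this, built separately from promoter and non-promoter windows, is one plausible input to a downstream classifier.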
|
43
|
Megraw M, Pereira F, Jensen ST, Ohler U, Hatzigeorgiou AG. A transcription factor affinity-based code for mammalian transcription initiation. Genome Res 2009; 19:644-56. [PMID: 19141595 DOI: 10.1101/gr.085449.108] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Indexed: 01/19/2023]
Abstract
The recent arrival of large-scale cap analysis of gene expression (CAGE) data sets in mammals provides a wealth of quantitative information on coding and noncoding RNA polymerase II transcription start sites (TSS). Genome-wide CAGE studies reveal that a large fraction of TSS exhibit peaks where the vast majority of associated tags map to a particular location (approximately 45%), whereas other active regions contain a broader distribution of initiation events. The presence of a strong single peak suggests that transcription at these locations may be mediated by position-specific sequence features. We therefore propose a new model for single-peaked TSS based solely on known transcription factors (TFs) and their respective regions of positional enrichment. This probabilistic model leads to near-perfect classification results in cross-validation (auROC = 0.98), and performance in genomic scans demonstrates that TSS prediction with both high accuracy and spatial resolution is achievable for a specific but large subgroup of mammalian promoters. The interpretable model structure suggests a DNA code in which canonical sequence features such as TATA-box, Initiator, and GC content do play a significant role, but many additional TFs show distinct spatial biases with respect to TSS location and are important contributors to the accurate prediction of single-peak transcription initiation sites. The model structure also reveals that CAGE tag clusters distal from annotated gene starts have distinct characteristics compared to those close to gene 5'-ends. Using this high-resolution single-peak model, we predict TSS for approximately 70% of mammalian microRNAs based on currently available data.
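The auROC value quoted above can be read as the probability that a randomly chosen true TSS receives a higher score than a randomly chosen non-TSS. A minimal sketch of that pairwise definition follows; the function name and the toy scores and labels are assumptions and have nothing to do with the paper's data.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores: 1 = true TSS, 0 = background.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
result = auroc(scores, labels)
print(f"auROC = {result:.3f}")
```

The quadratic loop is fine for small examples; a rank-based formulation would be preferable for genome-scale scans.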
Affiliation(s)
- Molly Megraw
- Institute for Genome Sciences and Policy, Duke University, Durham, North Carolina 27708, USA
|
44
|
Askary A, Masoudi-Nejad A, Sharafi R, Mizbani A, Parizi SN, Purmasjedi M. N4: A precise and highly sensitive promoter predictor using neural network fed by nearest neighbors. Genes Genet Syst 2009; 84:425-30. [DOI: 10.1266/ggs.84.425] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/23/2022] Open
Affiliation(s)
- Amjad Askary
- Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics and COE in Biomathematics, University of Tehran
- Department of Biotechnology, College of Science, University of Tehran
- Ali Masoudi-Nejad
- Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics and COE in Biomathematics, University of Tehran
- Roozbeh Sharafi
- Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics and COE in Biomathematics, University of Tehran
- Amir Mizbani
- Department of Biotechnology, College of Science, University of Tehran
- Malihe Purmasjedi
- Department of Biotechnology, College of Science, University of Tehran
|
45
|
Ertel A, Tozeren A. Human and mouse switch-like genes share common transcriptional regulatory mechanisms for bimodality. BMC Genomics 2008; 9:628. [PMID: 19105848 PMCID: PMC2631022 DOI: 10.1186/1471-2164-9-628] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 05/15/2008] [Accepted: 12/23/2008] [Indexed: 12/13/2022] Open
Abstract
Background: Gene expression is controlled over a wide range at the transcript level through complex interplay between DNA and regulatory proteins, resulting in profiles of gene expression that can be represented as normal, graded, and bimodal (switch-like) distributions. We have previously performed genome-scale identification and annotation of genes with switch-like expression at the transcript level in mouse, using large microarray datasets for healthy tissue, in order to study the cellular pathways and regulatory mechanisms involving this class of genes. We showed that a large population of bimodal mouse genes encoding cell membrane and extracellular matrix proteins is involved in communication pathways. This study expands on previous results by annotating human bimodal genes, investigating their correspondence to bimodality in mouse orthologs and exploring possible regulatory mechanisms that contribute to bimodality in gene expression in human and mouse.
Results: Fourteen percent of the human genes on the HGU133A array (1847 out of 13076) were identified as bimodal or switch-like. More than 40% were found to have bimodal mouse orthologs. KEGG pathways enriched for bimodal genes included ECM-receptor interaction, focal adhesion, and tight junction, showing strong similarity to the results obtained in mouse. Tissue-specific modes of expression of bimodal genes among brain, heart, and skeletal muscle were common between human and mouse. Promoter analysis revealed a higher than average number of transcription start sites per gene within the set of bimodal genes. Moreover, the bimodal gene set had differentially methylated histones compared to the set of the remaining genes in the genome.
Conclusion: The fact that bimodal genes were enriched within the cell membrane and extracellular environment makes these genes candidates for biomarkers of tissue specificity. The commonality of the important roles bimodal genes play in tissue differentiation in both the human and mouse indicates the potential value of mouse data in providing context for human tissue studies. The regulation motifs enriched in the bimodal gene set (TATA boxes, alternative promoters, methylation) have known associations with complex diseases, such as cancer, providing further potential for the use of bimodal genes in studying the molecular basis of disease.
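As a rough illustration of how switch-like expression might be flagged, the sketch below computes a moment-based bimodality coefficient, (skewness² + 1) / kurtosis, for a vector of expression values. This is a swapped-in heuristic, not the procedure used in the study, which the abstract does not describe; values above 5/9 (the value for a uniform distribution) are commonly taken as suggestive of bimodality. The function name and the toy profiles are assumptions.

```python
def bimodality_coefficient(values):
    """Moment-based bimodality coefficient: (skewness**2 + 1) / kurtosis.

    Uses population moments and non-excess kurtosis; no small-sample correction.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("values have zero variance")
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return (skewness ** 2 + 1) / kurtosis

# Hypothetical expression profiles across tissues (arbitrary units).
switch_like = [1.0] * 10 + [9.0] * 10           # two well-separated modes
graded = [3, 4, 4, 5, 5, 5, 5, 6, 6, 7]         # single central mode
bc_switch = bimodality_coefficient(switch_like)
bc_graded = bimodality_coefficient(graded)
print(bc_switch > 5 / 9, bc_graded > 5 / 9)
```

The coefficient is sensitive to heavy tails and skew, so a mixture-model comparison would be a more careful choice for real microarray data.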
Affiliation(s)
- Adam Ertel
- Center for Integrated Bioinformatics, School of Biomedical Engineering, Science and Health Systems, Drexel University, 3141 Chestnut Street, Philadelphia, PA 19104, USA
|
46
|
Ali AM, Bajaj V, Gopinath KS, Kumar A. Characterization of the human SLC22A18 gene promoter and its regulation by the transcription factor Sp1. Gene 2008; 429:37-43. [PMID: 18996451 DOI: 10.1016/j.gene.2008.10.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/06/2008] [Revised: 10/10/2008] [Accepted: 10/11/2008] [Indexed: 11/19/2022]
Abstract
SLC22A18, a poly-specific organic cation transporter, is paternally imprinted in humans and mice. It shows loss-of-heterozygosity in childhood and adult tumors, and gain-of-imprinting in hepatocarcinomas and breast cancers. Despite the importance of this gene, its transcriptional regulation has not been studied, and the promoter has not yet been characterized. We therefore set out to identify the potential cis-regulatory elements including the promoter of this gene. The luciferase reporter assay in human cells indicated that a region from -120 bp to +78 bp is required for the core promoter activity. No consensus TATA or CAAT boxes were found in this region, but two Sp1 binding sites were conserved in human, chimpanzee, mouse and rat. Mutational analysis of the two Sp1 sites suggested their requirement for the promoter activity. Chromatin-immunoprecipitation showed binding of Sp1 to the promoter region in vivo. Overexpression of Sp1 in Drosophila Sp1-null SL2 cells suggested that Sp1 is the transactivator of the promoter. The human core promoter was functional in mouse 3T3 and monkey COS7 cells. We found a CpG island which spanned the core promoter and exon 1. COBRA technique did not reveal promoter methylation in 10 normal oral tissues, 14 oral tumors, and two human cell lines HuH7 and A549. This study provides the first insight into the mechanism that controls expression of this imprinted tumor suppressor gene. A COBRA-based assay has been developed to look for promoter methylation in different cancers. The present data will help to understand the regulation of this gene and its role in tumorigenesis.
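The abstract reports a CpG island spanning the core promoter and exon 1 but does not say how it was called. One widely used rule of thumb (the Gardiner-Garden and Frommer criteria) requires a length of at least 200 bp, GC content above 50%, and an observed/expected CpG ratio above 0.6. The sketch below applies those thresholds to a single sequence; the function names and test sequences are assumptions.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides on the given strand
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count under independence is (C count * G count) / length.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp

def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the length, GC-content and obs/exp thresholds to one sequence."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp

# Artificial sequences, for illustration only.
cpg_rich = "CG" * 100
at_rich = "AT" * 100
print(looks_like_cpg_island(cpg_rich), looks_like_cpg_island(at_rich))
```

In practice the test is run over sliding windows along a genomic region rather than on one fixed sequence.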
Affiliation(s)
- Abdullah Mahmood Ali
- Department of Molecular Reproduction, Development and Genetics, Indian Institute of Science, Bangalore 560012, India
|
47
|
Analysis of gene regulatory networks in the mammalian circadian rhythm. PLoS Comput Biol 2008; 4:e1000193. [PMID: 18846204 PMCID: PMC2543109 DOI: 10.1371/journal.pcbi.1000193] [Citation(s) in RCA: 215] [Impact Index Per Article: 12.6] [Received: 05/16/2008] [Accepted: 08/27/2008] [Indexed: 11/19/2022] Open
Abstract
Circadian rhythm is fundamental in regulating a wide range of cellular, metabolic, physiological, and behavioral activities in mammals. Although a small number of key circadian genes have been identified through extensive molecular and genetic studies in the past, the existence of other key circadian genes and how they drive the genomewide circadian oscillation of gene expression in different tissues still remains unknown. Here we try to address these questions by integrating all available circadian microarray data in mammals. We identified 41 common circadian genes that showed circadian oscillation in a wide range of mouse tissues with a remarkable consistency of circadian phases across tissues. Comparisons across mouse, rat, rhesus macaque, and human showed that the circadian phases of known key circadian genes were delayed for 4–5 hours in rat compared to mouse and 8–12 hours in macaque and human compared to mouse. A systematic gene regulatory network for the mouse circadian rhythm was constructed after incorporating promoter analysis and transcription factor knockout or mutant microarray data. We observed the significant association of cis-regulatory elements: EBOX, DBOX, RRE, and HSE with the different phases of circadian oscillating genes. The analysis of the network structure revealed the paths through which light, food, and heat can entrain the circadian clock and identified that NR3C1 and FKBP/HSP90 complexes are central to the control of circadian genes through diverse environmental signals. Our study improves our understanding of the structure, design principle, and evolution of gene regulatory networks involved in the mammalian circadian rhythm.
Circadian rhythm is universally present from unicellular organisms to complex organisms and plays an important role in physiological processes such as the sleep–wake cycle in mammals. The mammalian circadian rhythm presents an excellent system for studying gene regulatory networks as a large number of genes are undergoing circadian oscillation in their expression levels. By integrating all available microarray experiments on circadian rhythm in different tissues and species in mammals, we identified a set of common circadian genes lying in the center of the circadian clock. Significant differences in the circadian oscillation of gene expression among mouse, rat, macaque, and human have been observed that underlie their physiological and behavioral differences. We constructed a gene regulatory network for the mouse circadian rhythm using knockout or mutant microarray data that have previously received little attention. Further analysis revealed not only additional feedback loops in the network contributing to the robustness of the circadian clock but also how environmental factors such as light, food, and heat can entrain the circadian rhythm. Our study provides the first gene regulatory network of the mammalian circadian rhythm at the system level. It is also the first attempt to compare gene regulatory networks of circadian rhythm in different mammalian species.
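Phase delays of the kind reported above can be estimated by projecting each expression time series onto a 24-hour cosine and sine and comparing the resulting peak times. The sketch below is a simplified single-harmonic estimate, not the authors' pipeline; it assumes evenly spaced samples covering whole periods, and the function names and synthetic time courses are hypothetical.

```python
import math

def peak_time(times_h, values, period_h=24.0):
    """Estimate the time of peak expression (in hours) for one gene.

    Projects the series onto cos and sin at the given period; valid when
    samples are evenly spaced and cover an integer number of periods.
    """
    w = 2 * math.pi / period_h
    a = sum(v * math.cos(w * t) for t, v in zip(times_h, values))
    b = sum(v * math.sin(w * t) for t, v in zip(times_h, values))
    return (math.atan2(b, a) / w) % period_h

def phase_delay(peak_ref, peak_other, period_h=24.0):
    """Hours by which the second peak lags the reference peak, modulo the period."""
    return (peak_other - peak_ref) % period_h

# Synthetic time courses sampled every 4 h over 48 h (not real data).
times = list(range(0, 48, 4))
w = 2 * math.pi / 24.0
species_a = [5 + 2 * math.cos(w * (t - 6)) for t in times]   # peaks at hour 6
species_b = [5 + 2 * math.cos(w * (t - 10)) for t in times]  # peaks at hour 10
delay = phase_delay(peak_time(times, species_a), peak_time(times, species_b))
print(f"estimated delay: {delay:.1f} h")
```

Noisy or unevenly sampled data would call for a least-squares cosinor fit instead of this direct projection.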
|
48
|
Petit MM, Lindskog H, Larsson E, Wasteson P, Athley E, Breuer S, Angstenberger M, Hertfelder D, Mattsson E, Nordheim A, Nelander S, Lindahl P. Smooth Muscle Expression of Lipoma Preferred Partner Is Mediated by an Alternative Intronic Promoter That Is Regulated by Serum Response Factor/Myocardin. Circ Res 2008; 103:61-9. [DOI: 10.1161/circresaha.108.177436] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 11/16/2022]
Abstract
Lipoma preferred partner (LPP) was recently recognized as a smooth muscle marker that plays a role in smooth muscle cell migration. In this report, we focus on the transcriptional regulation of the LPP gene. In particular, we investigate whether LPP is directly regulated by serum response factor (SRF). We show that the LPP gene contains 3 evolutionarily conserved CArG boxes and that 1 of these is part of an alternative promoter in intron 2. Quantitative RT-PCR shows that this alternative promoter directs transcription specifically to smooth muscle containing tissues in vivo. By using chromatin immunoprecipitation, we demonstrate that 2 of the CArG boxes, including the promoter-associated CArG box, bind to endogenous SRF in cultured aortic smooth muscle cells. Electrophoretic mobility-shift assays show that the conserved CArG boxes bind SRF in vitro. In reporter experiments, we show that the alternative promoter has transcriptional capacity that is dependent on SRF/myocardin and that the promoter associated CArG box is required for that activity. Finally, we show by quantitative RT-PCR that the alternative promoter is strongly downregulated in SRF-deficient embryonic stem cells and in smooth muscle tissues derived from conditional SRF knockout mice. Collectively, our data demonstrate that expression of LPP in smooth muscle is mediated by an alternative promoter that is regulated by SRF/myocardin.
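Candidate CArG boxes such as those discussed above are often located by scanning for the canonical SRF-binding consensus CC(A/T)6GG. The sketch below does this with a regular expression; the abstract does not state how the authors defined or scored their sites, so the strict consensus, the function name, and the test sequence are assumptions.

```python
import re

# Canonical CArG consensus: CC, six A/T bases, GG. The lookahead lets
# overlapping matches be reported. The pattern is its own reverse
# complement, so scanning one strand is sufficient.
CARG_PATTERN = re.compile(r"(?=(CC[AT]{6}GG))")

def find_carg_boxes(seq):
    """Return (0-based start, matched sequence) for every consensus CArG box."""
    return [(m.start(), m.group(1)) for m in CARG_PATTERN.finditer(seq.upper())]

# Hypothetical sequence fragment, for illustration only.
hits = find_carg_boxes("ggCCTTATATGGaa")
print(hits)
```

Functional CArG boxes can deviate from the strict consensus, so a real scan would usually allow a mismatch or use a position weight matrix.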
Affiliation(s)
- Marleen M.R. Petit, Henrik Lindskog, Erik Larsson, Per Wasteson, Elisabeth Athley, Silke Breuer, Meike Angstenberger, David Hertfelder, Erney Mattsson, Alfred Nordheim, Sven Nelander, Per Lindahl
- From the Wallenberg Laboratory (M.M.R.P., H.L., E.L., P.W., E.A., S.B., E.M., S.N., P.L.), Sahlgrenska University Hospital, Göteborg, Sweden; Institute of Biomedicine (E.L., P.W., P.L.), Department of Medical Biochemistry and Cell Biology, Sahlgrenska Academy, University of Göteborg, Göteborg, Sweden; and Interfaculty Institute for Cell Biology (M.A., D.H., A.N.), Tuebingen University, Germany. Present address for M.M.R.P.: Department of Human Genetics, University of Leuven, Belgium. Present
|
49
|
Naismith L, Lalancette C, Platts AE, Krawetz SA. The KLAB Toolbox: a suite of in-house software applications for epigenetic analysis. Syst Biol Reprod Med 2008; 54:97-108. [PMID: 18446650 DOI: 10.1080/19396360801935644] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 10/20/2022]
Abstract
Systems biology presents a new paradigm for elucidating the processes required to organize and sustain life. We now have access to whole genome sequences, gene expression data for multiple cell types, and databases for regulatory elements governing these genes. These resources make it feasible to identify conserved genomic sequences across multiple species, transcription factors regulating the expression of genes with similar expression patterns within a given cell type and to compare expression levels of specific genes between normal and diseased cellular states. In order to utilize this wealth of information, new computational tools that integrate these datasets in a genome-wide context are required. Using the protamine cluster as an example, we present a series of in-house applications that we have developed to integrate, contextualize and visualize datasets across multiple hierarchies.
Affiliation(s)
- Laura Naismith
- Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, Michigan, USA
|
50
|
Li X, Zeng J, Yan H. PCA-HPR: a principle component analysis model for human promoter recognition. Bioinformation 2008; 2:373-8. [PMID: 18795109 PMCID: PMC2533055 DOI: 10.6026/97320630002373] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 01/29/2008] [Revised: 05/06/2008] [Accepted: 05/09/2008] [Indexed: 11/23/2022] Open
Abstract
We describe a promoter recognition method named PCA-HPR to locate eukaryotic promoter regions and predict transcription start sites (TSSs). We computed codon (3-mer) and pentamer (5-mer) frequencies and created codon and pentamer frequency feature matrices to extract informative and discriminative features for effective classification. Principal component analysis (PCA) is applied to the feature matrices and a subset of principal components (PCs) is selected for classification. Our system uses three neural network classifiers to distinguish promoters from exons, promoters from introns, and promoters from 3' untranslated regions (3'UTRs). We compared PCA-HPR with three well-known existing promoter prediction systems: DragonGSF, Eponine and FirstEF. Validation on three test sets shows that PCA-HPR achieves the best performance of the four predictive systems.
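The feature pipeline described above (k-mer frequency matrices followed by PCA) could look roughly like the sketch below. It assumes overlapping sliding-window k-mers and an SVD-based PCA via NumPy; the abstract does not specify either choice, and the function names, the number of components, and the toy sequences are hypothetical.

```python
from itertools import product

import numpy as np

def kmer_frequencies(seq, k):
    """Relative frequency of every possible k-mer over A, C, G, T (sliding window)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])  # k-mers containing other letters are skipped
        if j is not None:
            counts[j] += 1
    total = counts.sum()
    return counts / total if total else counts

def pca_project(features, n_components):
    """Project rows of a feature matrix onto the top principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Toy sequences (hypothetical, not real promoters).
seqs = ["ATGCGCGCTATAAAGGCC", "TTTTAAATATATTTAAAT",
        "GCGCGGCCGCGGGCGCGC", "ATATGCGCATATGCGCAT"]
X = np.array([kmer_frequencies(s, 3) for s in seqs])
scores = pca_project(X, n_components=2)
print(X.shape, scores.shape)
```

The projected scores would then serve as inputs to the neural network classifiers; the same steps apply to 5-mers with k=5.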
Affiliation(s)
- Xiaomeng Li
- Department of Electronic Engineering, City University of Hong Kong, Kowloon, Hong Kong.
|