1
Yu J, Yu J, Kang Z, Peng Y. Integration of single-cell sequencing and mendelian randomization reveals novel causal pathways between monocytes and hepatocellular carcinoma. Discov Oncol 2025; 16:604. [PMID: 40272662 PMCID: PMC12021761 DOI: 10.1007/s12672-025-02357-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2024] [Accepted: 04/09/2025] [Indexed: 04/27/2025]
Abstract
BACKGROUND: Hepatocellular carcinoma (HCC) represents one of the most prevalent malignant neoplasms worldwide, characterized by poor prognosis and low 5-year survival rates. Despite extensive research, its pathogenesis remains largely unclear. Within the tumor microenvironment (TME), monocytes play a dual role: they participate in tumor cell recognition and elimination while regulating immune responses through cytokine secretion. This study aims to investigate the association between differentially expressed genes in monocytes and HCC development.
METHODS: This investigation employed single-cell transcriptomic analysis of human hepatic innate lymphoid cells (ILCs) to identify monocyte subpopulations and their cellular markers. Subsequently, two-sample Mendelian randomization (MR) analysis was conducted to examine the causal relationships between these cells, their associated genes, and HCC development.
RESULTS: Through comprehensive analysis of the monocyte cluster, we identified 2338 differentially expressed genes (DEGs). MR analysis revealed 13 genes significantly associated with HCC risk.
CONCLUSION: This study represents the first integration of single-cell sequencing technology with MR analysis to investigate the relationship between monocytes and HCC. Through this innovative methodological approach, we have revealed potential associations between monocyte gene expression and HCC development, providing new directions for further research on HCC prevention and treatment, as well as identifying potential therapeutic targets.
Affiliation(s)
- Jiang Yu
  - North Sichuan Medical College, No. 234 Fujiang Road, Shunqing District, Nanchong City, Postal Code: 637000, Sichuan Province, China
- Jing Yu
  - North Sichuan Medical College, No. 234 Fujiang Road, Shunqing District, Nanchong City, Postal Code: 637000, Sichuan Province, China
- Zhou Kang
  - North Sichuan Medical College, No. 234 Fujiang Road, Shunqing District, Nanchong City, Postal Code: 637000, Sichuan Province, China
- Yong Peng
  - Department of General Surgery, The Second Clinical Medical College, North Sichuan Medical College, Nanchong Central Hospital, No. 97, Renmin South Road, Shunqing District, Nanchong City, Postal Code: 637000, Sichuan Province, China

2
Zhao B, Song K, Wei DQ, Xiong Y, Ding J. scCobra allows contrastive cell embedding learning with domain adaptation for single cell data integration and harmonization. Commun Biol 2025; 8:233. [PMID: 39948393 PMCID: PMC11825689 DOI: 10.1038/s42003-025-07692-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2024] [Accepted: 02/06/2025] [Indexed: 02/16/2025]
Abstract
The rapid advancement of single-cell technologies has created an urgent need for effective methods to integrate and harmonize single-cell data. Technical and biological variations across studies complicate data integration, while conventional tools often struggle with reliance on gene expression distribution assumptions and over-correction. Here, we present scCobra, a deep generative neural network designed to overcome these challenges through contrastive learning with domain adaptation. scCobra effectively mitigates batch effects, minimizes over-correction, and ensures biologically meaningful data integration without assuming specific gene expression distributions. It enables online label transfer across datasets with batch effects, allowing continuous integration of new data without retraining. Additionally, scCobra supports batch effect simulation, advanced multi-omic integration, and scalable processing of large datasets. By integrating and harmonizing datasets from similar studies, scCobra expands the available data for investigating specific biological problems, improving cross-study comparability, and revealing insights that may be obscured in isolated datasets.
Affiliation(s)
- Bowen Zhao
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - Meakins-Christie Laboratories, Department of Medicine, McGill University Health Centre, Montreal, QC, Canada
  - Division of Experimental Medicine, Department of Medicine, McGill University, Montreal, QC, Canada
- Kailu Song
  - Meakins-Christie Laboratories, Department of Medicine, McGill University Health Centre, Montreal, QC, Canada
  - Quantitative Life Sciences, McGill University, Montreal, QC, Canada
- Dong-Qing Wei
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Yi Xiong
  - State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Jun Ding
  - Meakins-Christie Laboratories, Department of Medicine, McGill University Health Centre, Montreal, QC, Canada
  - Division of Experimental Medicine, Department of Medicine, McGill University, Montreal, QC, Canada
  - Quantitative Life Sciences, McGill University, Montreal, QC, Canada
  - School of Computer Science, McGill University, Montreal, QC, Canada
  - Mila-Quebec AI Institute, Montreal, QC, Canada

3
Bramon Mora B, Lindsay H, Thiébaut A, Stuart KD, Gottardo R. tagtango: an application to compare single-cell annotations. Bioinformatics 2025; 41:btaf012. [PMID: 39798134 PMCID: PMC11814489 DOI: 10.1093/bioinformatics/btaf012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2024] [Revised: 12/11/2024] [Accepted: 01/08/2025] [Indexed: 01/15/2025]
Abstract
SUMMARY: In this article, we present tagtango, an innovative R package and web application designed for robust and intuitive comparison of single-cell clusters and annotations. It offers an interactive platform that simplifies the exploration of differences and similarities among different clustering and annotation methods. Leveraging single-cell data analysis and different visualizations, it allows researchers to dissect the underlying biological differences across groups. tagtango is a user-friendly application that is portable and works seamlessly across multiple operating systems.
AVAILABILITY AND IMPLEMENTATION: tagtango is freely available at https://github.com/bernibra/tagtango as an R package as well as an online web service at https://tagtango.unil.ch.
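Comparing two annotations of the same cells usually starts from a cross-tabulation of the label pairs. The sketch below illustrates that idea only; it is written in Python with made-up labels and is not tagtango's API (tagtango itself is an R package and web application). The names `annotation_a`, `annotation_b`, and `cross_tabulate` are hypothetical.

```python
from collections import Counter
from typing import Counter as CounterType, List, Tuple

# Hypothetical toy data: two annotations of the same seven cells.
annotation_a: List[str] = ["T", "T", "B", "B", "Mono", "Mono", "Mono"]
annotation_b: List[str] = ["c1", "c1", "c2", "c1", "c3", "c3", "c3"]


def cross_tabulate(a: List[str], b: List[str]) -> CounterType[Tuple[str, str]]:
    """Count how many cells carry each (label in a, label in b) pair."""
    if len(a) != len(b):
        raise ValueError("both annotations must cover the same cells")
    return Counter(zip(a, b))


table = cross_tabulate(annotation_a, annotation_b)

# Print the non-empty cells of the contingency table.
for (label_a, label_b), n_cells in sorted(table.items()):
    print(f"{label_a:>5} vs {label_b}: {n_cells}")
```

Pairs with large counts mark agreement between the two annotations; a label that spreads over several partners (here "B" over "c1" and "c2") marks a disagreement worth inspecting.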
Affiliation(s)
- Bernat Bramon Mora
  - Biomedical Data Science Center, Lausanne University Hospital, Vaud 1005, Switzerland
  - Biomedical Data Science Center, University of Lausanne, Vaud 1015, Switzerland
  - Swiss Institute of Bioinformatics, Vaud 1015, Switzerland
- Helen Lindsay
  - Biomedical Data Science Center, Lausanne University Hospital, Vaud 1005, Switzerland
  - Biomedical Data Science Center, University of Lausanne, Vaud 1015, Switzerland
  - Swiss Institute of Bioinformatics, Vaud 1015, Switzerland
- Antonin Thiébaut
  - Biomedical Data Science Center, Lausanne University Hospital, Vaud 1005, Switzerland
  - Biomedical Data Science Center, University of Lausanne, Vaud 1015, Switzerland
  - Swiss Institute of Bioinformatics, Vaud 1015, Switzerland
- Kenneth D Stuart
  - Center for Global Infectious Disease Research, Seattle Children’s Hospital, WA 98105, United States
- Raphael Gottardo
  - Biomedical Data Science Center, Lausanne University Hospital, Vaud 1005, Switzerland
  - Biomedical Data Science Center, University of Lausanne, Vaud 1015, Switzerland
  - Swiss Institute of Bioinformatics, Vaud 1015, Switzerland
  - School of Life Sciences, EPFL - Swiss Federal Technology Institute of Lausanne, Lausanne, Vaud 1015, Switzerland

4
Yang T, Zhang N, Yang N. Single-cell sequencing in diabetic retinopathy: progress and prospects. J Transl Med 2025; 23:49. [PMID: 39806376 PMCID: PMC11727737 DOI: 10.1186/s12967-024-06066-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2024] [Accepted: 12/30/2024] [Indexed: 01/16/2025]
Abstract
Diabetic retinopathy is a major ocular complication of diabetes, characterized by progressive retinal microvascular damage and significant visual impairment in working-age adults. Traditional bulk RNA sequencing offers overall gene expression profiles but does not account for cellular heterogeneity. Single-cell RNA sequencing overcomes this limitation by providing transcriptomic data at the individual cell level and distinguishing novel cell subtypes, developmental trajectories, and intercellular communications. Researchers can use single-cell sequencing to draw retinal cell atlases and identify the transcriptomic features of retinal cells, enhancing our understanding of the pathogenesis and pathological changes in diabetic retinopathy. Additionally, single-cell sequencing is widely employed to analyze retinal organoids and single extracellular vesicles. Single-cell multi-omics sequencing integrates omics information, whereas stereo-sequencing analyzes gene expression and spatiotemporal data simultaneously. This review discusses single-cell sequencing protocols for obtaining single cells from the retina and generating accurate sequencing data. It highlights the applications and advancements of single-cell sequencing in the study of normal retinas and the pathological changes associated with diabetic retinopathy. This underscores the potential of these technologies to deepen our understanding of the pathogenesis of diabetic retinopathy, which may lead to the introduction of new therapeutic strategies.
Affiliation(s)
- Tianshu Yang
  - Department of Ophthalmology, Renmin Hospital of Wuhan University, Jiefang Road, Wuhan, Hubei, 430060, China
- Ningzhi Zhang
  - Department of Ophthalmology, Renmin Hospital of Wuhan University, Jiefang Road, Wuhan, Hubei, 430060, China
- Ning Yang
  - Department of Ophthalmology, Renmin Hospital of Wuhan University, Jiefang Road, Wuhan, Hubei, 430060, China

5
Nasrollahi FSF, Silva FN, Liu S, Chaudhuri S, Yu M, Wang J, Nho K, Saykin AJ, Bennett DA, Sporns O, Fortunato S. Cell Type Differentiation Using Network Clustering Algorithms. bioRxiv 2024:2024.12.04.626793. [PMID: 39677670 PMCID: PMC11643020 DOI: 10.1101/2024.12.04.626793] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/17/2024]
Abstract
Single-cell RNA-seq (scRNA-seq) technologies provide unprecedented resolution, representing transcriptomics at the level of single cells. One of the biggest challenges in scRNA-seq data analysis is cell type annotation, which is usually inferred by cell separation approaches. In-silico algorithms that accurately identify individual cell types in ongoing single-cell sequencing studies are crucial for unlocking cellular heterogeneity and understanding the biological basis of diseases. In this study, we focus on robustly identifying cell types in single-cell RNA sequencing data; we conduct a comparative analysis using methods established in biology, like Seurat, Leiden, and WGCNA, as well as Infomap, statistical inference via Stochastic Block Models (SBM), and single-cell Graph Neural Networks (scGNN). We also analyze preprocessing pipelines to identify and optimize key components in the process. Leveraging two independent datasets, PBMC and ROSMAP, we employ clustering algorithms on cell-cell networks derived from gene expression data. Our findings reveal that while clusters detected by WGCNA exhibit limited correspondence with cell types, those identified by multiresolution Infomap, Leiden, and SBM show a closer alignment, with Infomap standing out as a particularly effective approach. Infomap notably offers valuable insights for the precise characterization of cellular landscapes related to neurodegeneration and immunology in scRNA-seq.
Affiliation(s)
- Filipi Nascimento Silva
  - Observatory of Social Media, Luddy School of Informatics, Computing, and Engineering, Indiana University, Indiana, USA
- Shiwei Liu
  - Center for Neuroimaging and the Indiana Alzheimer’s Disease Research Center, Indiana University, Indiana, USA
- Soumilee Chaudhuri
  - Center for Neuroimaging and the Indiana Alzheimer’s Disease Research Center, Indiana University, Indiana, USA
- Meichen Yu
  - Center for Neuroimaging and the Indiana Alzheimer’s Disease Research Center, Indiana University, Indiana, USA
- Juexin Wang
  - Department of Biomedical Engineering and Informatics, Luddy School of Informatics, Computing, and Engineering, Indiana University, Indiana, USA
- Kwangsik Nho
  - Center for Neuroimaging and the Indiana Alzheimer’s Disease Research Center, Indiana University, Indiana, USA
- Andrew J. Saykin
  - Center for Neuroimaging and the Indiana Alzheimer’s Disease Research Center, Indiana University, Indiana, USA
- David A. Bennett
  - Rush Alzheimer’s Disease Center (Drs. Bennett, Schneider, and Wilson) and Rush Institute for Healthy Aging (Drs. Bienias and Evans), Rush University Medical Center, Illinois, USA
- Olaf Sporns
  - Department of Psychology, Indiana University, Indiana, USA
- Santo Fortunato
  - Observatory of Social Media, Luddy School of Informatics, Computing, and Engineering, Indiana University, Indiana, USA

6
Goggin SM, Zunder ER. ESCHR: a hyperparameter-randomized ensemble approach for robust clustering across diverse datasets. Genome Biol 2024; 25:242. [PMID: 39285487 PMCID: PMC11406744 DOI: 10.1186/s13059-024-03386-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 08/30/2024] [Indexed: 09/19/2024]
Abstract
Clustering is widely used for single-cell analysis, but current methods are limited in accuracy, robustness, ease of use, and interpretability. To address these limitations, we developed an ensemble clustering method that outperforms other methods at hard clustering without the need for hyperparameter tuning. It also performs soft clustering to characterize continuum-like regions and quantify clustering uncertainty, demonstrated here by mapping the connectivity and intermediate transitions between MNIST handwritten digits and between hypothalamic tanycyte subpopulations. This hyperparameter-randomized ensemble approach improves the accuracy, robustness, ease of use, and interpretability of single-cell clustering, and may prove useful in other fields as well.
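The general idea of a hyperparameter-randomized ensemble can be sketched in a few lines: run a base clusterer many times with a randomly drawn hyperparameter, then record how often each pair of points lands in the same cluster. The Python sketch below is a toy illustration under stated assumptions, not the ESCHR algorithm: the base clusterer (`gap_cluster`, splitting sorted 1-D points at large gaps), the threshold range, and the data are all hypothetical choices made for this example.

```python
import random
from typing import List


def gap_cluster(points: List[float], threshold: float) -> List[int]:
    """Toy base clusterer: sort 1-D points and start a new cluster
    wherever the gap between neighbours exceeds `threshold`."""
    order = sorted(range(len(points)), key=lambda i: points[i])
    labels = [0] * len(points)
    current = 0
    for prev, nxt in zip(order, order[1:]):
        if points[nxt] - points[prev] > threshold:
            current += 1
        labels[nxt] = current
    return labels


def consensus_matrix(points: List[float], n_runs: int = 200,
                     seed: int = 0) -> List[List[float]]:
    """Fraction of runs in which each pair of points shares a cluster,
    with the threshold hyperparameter redrawn at random for every run."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    for _ in range(n_runs):
        labels = gap_cluster(points, rng.uniform(0.5, 2.5))
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    together[i][j] += 1
    return [[count / n_runs for count in row] for row in together]


# Hypothetical data: two tight groups and one intermediate point at 2.0.
points = [0.0, 0.3, 0.6, 2.0, 5.0, 5.3]
consensus = consensus_matrix(points)

for row in consensus:
    print(" ".join(f"{value:.2f}" for value in row))
```

Entries of exactly 1 or 0 correspond to stable hard assignments, while intermediate values (here, between the point at 2.0 and the first group) give a simple measure of clustering uncertainty in the spirit of the soft clustering described above.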
Affiliation(s)
- Sarah M Goggin
  - Neuroscience Graduate Program, School of Medicine, University of Virginia, Charlottesville, VA, 22902, USA
- Eli R Zunder
  - Neuroscience Graduate Program, School of Medicine, University of Virginia, Charlottesville, VA, 22902, USA
  - Department of Biomedical Engineering, School of Engineering, University of Virginia, Charlottesville, VA, 22902, USA

7
Zhang K, Kan H, Mao A, Yu F, Geng L, Zhou T, Feng L, Ma X. Integrated Single-Cell Transcriptomic Atlas of Human Kidney Endothelial Cells. J Am Soc Nephrol 2024; 35:578-593. [PMID: 38351505 PMCID: PMC11149048 DOI: 10.1681/asn.0000000000000320] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/20/2023] [Accepted: 02/09/2024] [Indexed: 03/23/2024]
Abstract
Key Points: We created a comprehensive reference atlas of normal human kidney endothelial cells. We confirmed that endothelial cell types in the human kidney were also highly conserved in the mouse kidney.
Background: Kidney endothelial cells are exposed to different microenvironmental conditions that support specific physiologic processes. However, the heterogeneity of human kidney endothelial cells has not yet been systematically described.
Methods: We reprocessed and integrated seven human kidney control single-cell/single-nucleus RNA sequencing datasets of >200,000 kidney cells in the same process.
Results: We identified five major cell types, 29,992 of which were endothelial cells. Endothelial cell reclustering identified seven subgroups that differed in molecular characteristics and physiologic functions. Mapping new data to a normal kidney endothelial cell atlas allows rapid data annotation and analysis. We confirmed that endothelial cell types in the human kidney were also highly conserved in the mouse kidney and identified endothelial marker genes that were conserved in humans and mice, as well as differentially expressed genes between corresponding subpopulations. Furthermore, combined analysis of single-cell transcriptome data with public genome-wide association study data showed a significant enrichment of endothelial cells, especially arterial endothelial cells, in BP heritability. Finally, we identified M1 and M12 from coexpression networks in endothelial cells that may be deeply involved in BP regulation.
Conclusions: We created a comprehensive reference atlas of normal human kidney endothelial cells that provides the molecular foundation for understanding how the identity and function of kidney endothelial cells are altered in disease, aging, and between species. Finally, we provide a publicly accessible online tool to explore the datasets described in this work (https://vascularmap.jiangnan.edu.cn).
Affiliation(s)
- Ka Zhang
  - Wuxi School of Medicine, Jiangnan University, Wuxi, China
  - School of Food Science and Technology, Jiangnan University, Wuxi, China
- Hao Kan
  - Wuxi School of Medicine, Jiangnan University, Wuxi, China
- Aiqin Mao
  - Wuxi School of Medicine, Jiangnan University, Wuxi, China
- Fan Yu
  - Wuxi School of Medicine, Jiangnan University, Wuxi, China
- Li Geng
  - Wuxi School of Medicine, Jiangnan University, Wuxi, China
- Tingting Zhou
  - Wuxi School of Medicine, Jiangnan University, Wuxi, China
- Lei Feng
  - Wuxi School of Medicine, Jiangnan University, Wuxi, China
- Xin Ma
  - Wuxi School of Medicine, Jiangnan University, Wuxi, China

8
Feng M, Fei S, Zou J, Xia J, Lai W, Huang Y, Swevers L, Sun J. Single-Nucleus Sequencing of Silkworm Larval Brain Reveals the Key Role of Lysozyme in the Antiviral Immune Response in Brain Hemocytes. J Innate Immun 2024; 16:173-187. [PMID: 38387449 PMCID: PMC10965234 DOI: 10.1159/000537815] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/29/2023] [Accepted: 02/01/2024] [Indexed: 02/24/2024]
Abstract
INTRODUCTION: The brain is considered an immune-privileged organ, yet innate immune reactions can occur in the central nervous system of vertebrates and invertebrates. Silkworm (Bombyx mori) is an economically important insect and a lepidopteran model species. The diversity of cell types in the silkworm brain, and how these cell subsets produce an immune response to virus infection, remain largely unknown.
METHODS: Single-nucleus RNA sequencing (snRNA-seq), bioinformatics analysis, RNAi, and other methods were used to analyze the cell types and gene functions of the silkworm brain.
RESULTS: We used snRNA-seq to identify 19 distinct clusters representing Kenyon cell, glial cell, olfactory projection neuron, optic lobe neuron, hemocyte-like cell, and muscle cell types in the brains of B. mori nucleopolyhedrovirus (BmNPV)-infected and BmNPV-uninfected silkworm larvae at the late stage of infection. Further, we found that the cell subset that exerts an antiviral function in the silkworm larval brain corresponds to hemocytes. Specifically, antimicrobial peptides, especially lysozyme, were significantly induced by BmNPV infection in the hemocytes and exerted antiviral effects.
CONCLUSION: Our single-cell dataset reveals the diversity of silkworm larval brain cells, and the transcriptome analysis provides insights into the immune response following virus infection at the single-cell level.
Affiliation(s)
- Min Feng
  - Guangdong Provincial Key Laboratory of Agro-animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China
- Shigang Fei
  - Guangdong Provincial Key Laboratory of Agro-animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China
- Jinglei Zou
  - Guangdong Provincial Key Laboratory of Agro-animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China
- Junming Xia
  - Guangdong Provincial Key Laboratory of Agro-animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China
- Wenxuan Lai
  - Guangdong Provincial Key Laboratory of Agro-animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China
- Yigui Huang
  - Guangdong Provincial Key Laboratory of Agro-animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China
- Luc Swevers
  - Insect Molecular Genetics and Biotechnology, National Centre for Scientific Research Demokritos, Institute of Biosciences and Applications, Athens, Greece
- Jingchen Sun
  - Guangdong Provincial Key Laboratory of Agro-animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, China

9
Church SH, Mah JL, Wagner G, Dunn CW. Normalizing need not be the norm: count-based math for analyzing single-cell data. Theory Biosci 2024; 143:45-62. [PMID: 37947999 DOI: 10.1007/s12064-023-00408-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Accepted: 10/13/2023] [Indexed: 11/12/2023]
Abstract
Counting mRNA transcripts is a key method of observation in modern biology. With advances in counting transcripts in single cells (single-cell RNA sequencing or scRNA-seq), these data are routinely used to identify cells by their transcriptional profile, and to identify genes with differential cellular expression. Because the total number of transcripts counted per cell can vary for technical reasons, the first step of many commonly used scRNA-seq workflows is to normalize by sequencing depth, transforming counts into proportional abundances. The primary objective of this step is to reshape the data such that cells with similar biological proportions of transcripts end up with similar transformed measurements. But there is growing concern that normalization and other transformations result in unintended distortions that hinder both analyses and the interpretation of results. This has led to an intense focus on optimizing methods for normalization and transformation of scRNA-seq data. Here, we take an alternative approach, by avoiding normalization and transformation altogether. We abandon the use of distances to compare cells, and instead use a restricted algebra, motivated by measurement theory and abstract algebra, that preserves the count nature of the data. We demonstrate that this restricted algebra is sufficient to draw meaningful and practical comparisons of gene expression through the use of the dot product and other elementary operations. This approach sidesteps many of the problems with common transformations, and has the added benefit of being simpler and more intuitive. We implement our approach in the package countland, available in python and R.
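To make the dot-product comparison concrete, the sketch below computes pairwise dot products directly on raw integer counts, with no normalization or transformation. It is a minimal Python illustration with hypothetical toy counts and does not use or reproduce the countland API; the names `counts`, `dot`, and `pairwise_dots` are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: raw transcript counts for four genes in three cells.
counts: Dict[str, List[int]] = {
    "cell_a": [3, 0, 5, 1],
    "cell_b": [2, 0, 4, 0],
    "cell_c": [0, 6, 0, 2],
}


def dot(u: List[int], v: List[int]) -> int:
    """Dot product of two count vectors; the result stays an integer."""
    return sum(x * y for x, y in zip(u, v))


def pairwise_dots(cells: Dict[str, List[int]]) -> Dict[Tuple[str, str], int]:
    """Dot product for every unordered pair of cells."""
    names = sorted(cells)
    return {
        (a, b): dot(cells[a], cells[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


pairs = pairwise_dots(counts)
for (a, b), value in pairs.items():
    print(f"{a} . {b} = {value}")
```

Cells that express the same genes at high counts produce a large dot product, while cells with no shared non-zero genes produce zero, so the comparison works on the counts as measured.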
Affiliation(s)
- Samuel H Church
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
- Jasmine L Mah
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
- Günter Wagner
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
  - Yale Systems Biology Institute, Yale University, New Haven, CT, USA
  - Department of Obstetrics, Gynecology and Reproductive Sciences, Yale Medical School, New Haven, CT, USA
  - Department of Obstetrics and Gynecology, Wayne State University, Detroit, MI, USA
- Casey W Dunn
  - Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA

10
Carbonetto P, Luo K, Sarkar A, Hung A, Tayeb K, Pott S, Stephens M. GoM DE: interpreting structure in sequence count data with differential expression analysis allowing for grades of membership. Genome Biol 2023; 24:236. [PMID: 37858253 PMCID: PMC10588049 DOI: 10.1186/s13059-023-03067-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 03/03/2023] [Accepted: 09/20/2023] [Indexed: 10/21/2023]
Abstract
Parts-based representations, such as non-negative matrix factorization and topic modeling, have been used to identify structure from single-cell sequencing data sets, in particular structure that is not as well captured by clustering or other dimensionality reduction methods. However, interpreting the individual parts remains a challenge. To address this challenge, we extend methods for differential expression analysis by allowing cells to have partial membership to multiple groups. We call this grade of membership differential expression (GoM DE). We illustrate the benefits of GoM DE for annotating topics identified in several single-cell RNA-seq and ATAC-seq data sets.
Affiliation(s)
- Peter Carbonetto
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
  - Research Computing Center, University of Chicago, Chicago, IL, USA
- Kaixuan Luo
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Abhishek Sarkar
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
  - Vesalius Therapeutics, Cambridge, MA, USA
- Anthony Hung
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
  - Section of Genetic Medicine, University of Chicago, Chicago, IL, USA
- Karl Tayeb
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
  - Committee on Genetics, Genomics and Systems Biology, University of Chicago, Chicago, IL, USA
- Sebastian Pott
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
  - Section of Genetic Medicine, University of Chicago, Chicago, IL, USA
- Matthew Stephens
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
  - Department of Statistics, University of Chicago, Chicago, IL, USA

11
Mangiola S, Roth-Schulze AJ, Trussart M, Zozaya-Valdés E, Ma M, Gao Z, Rubin AF, Speed TP, Shim H, Papenfuss AT. sccomp: Robust differential composition and variability analysis for single-cell data. Proc Natl Acad Sci U S A 2023; 120:e2203828120. [PMID: 37549298 PMCID: PMC10438834 DOI: 10.1073/pnas.2203828120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2022] [Accepted: 05/18/2023] [Indexed: 08/09/2023]
Abstract
Cellular omics such as single-cell genomics, proteomics, and microbiomics allow the characterization of tissue and microbial community composition, which can be compared between conditions to identify biological drivers. This strategy has been critical to revealing markers of disease progression, such as cancer and pathogen infection. A dedicated statistical method for differential variability analysis is lacking for cellular omics data, and existing methods for differential composition analysis do not model some compositional data properties, suggesting there is room to improve model performance. Here, we introduce sccomp, a method for differential composition and variability analyses that jointly models data count distribution, compositionality, group-specific variability, and proportion mean-variability association, being aware of outliers. sccomp provides a comprehensive analysis framework that offers realistic data simulation and cross-study knowledge transfer. Here, we demonstrate that mean-variability association is ubiquitous across technologies, highlighting the inadequacy of the very popular Dirichlet-multinomial distribution. We show that sccomp accurately fits experimental data, significantly improving performance over state-of-the-art algorithms. Using sccomp, we identified differential constraints and composition in the microenvironment of primary breast cancer.
Affiliation(s)
- Stefano Mangiola
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC 3052, Australia
- Alexandra J. Roth-Schulze
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC 3052, Australia
- Marie Trussart
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia
- Enrique Zozaya-Valdés
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC 3052, Australia
- Mengyao Ma
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia
- Zijie Gao
  - Melbourne Integrative Genomics, University of Melbourne, Parkville, VIC 3052, Australia
  - School of Mathematics and Statistics, University of Melbourne, Parkville, VIC 3052, Australia
- Alan F. Rubin
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC 3052, Australia
- Terence P. Speed
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia
- Heejung Shim
  - Melbourne Integrative Genomics, University of Melbourne, Parkville, VIC 3052, Australia
  - School of Mathematics and Statistics, University of Melbourne, Parkville, VIC 3052, Australia
- Anthony T. Papenfuss
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC 3052, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC 3052, Australia

12
Xu Y, Kramann R, McCord RP, Hayat S. Fast model-free standardization and integration of single-cell transcriptomics data. Research Square 2023:rs.3.rs-2485985. [PMID: 36747625 PMCID: PMC9901035 DOI: 10.21203/rs.3.rs-2485985/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2023]
Abstract
Single-cell transcriptomics datasets from the same anatomical sites generated by different research labs are becoming increasingly common. However, fast and computationally inexpensive tools for standardization of cell-type annotation and data integration are still needed in order to increase research inclusivity. To standardize cell-type annotation and integrate single-cell transcriptomics datasets, we have built a fast model-free integration method, named MASI (Marker-Assisted Standardization and Integration). MASI first identifies putative cell-type markers from reference data through an ensemble approach. Then, it converts the gene expression matrix to a cell-type score matrix with the identified putative cell-type markers for the purpose of cell-type annotation and data integration. Because it integrates through cell-type markers instead of model inference, MASI can annotate approximately one million cells on a personal laptop, which provides a cheap computational alternative for the single-cell community. We benchmark MASI against other well-established methods and demonstrate that MASI outperforms them on speed. Its performance on both data integration and cell-type annotation is comparable or even superior to that of these existing methods. To harness knowledge from single-cell atlases, we demonstrate three case studies that cover integration across biological conditions, surveyed participants, and research groups, respectively.
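The conversion from a gene expression matrix to a cell-type score matrix described in this abstract can be illustrated with a minimal sketch. This is not the MASI implementation: the scoring rule (mean expression of each type's markers), the function names, and the toy marker lists below are all assumptions made purely for illustration.

```python
import numpy as np

def marker_score_matrix(expr, gene_names, markers):
    """Turn a cells x genes matrix into a cells x cell-types score matrix.

    Hypothetical scoring rule: the score of a cell for a cell type is the
    mean expression of that type's marker genes (an assumption, not MASI's
    actual formula).
    """
    gene_index = {g: i for i, g in enumerate(gene_names)}
    cell_types = sorted(markers)
    scores = np.zeros((expr.shape[0], len(cell_types)))
    for j, cell_type in enumerate(cell_types):
        # Ignore markers that are absent from the expression matrix.
        cols = [gene_index[g] for g in markers[cell_type] if g in gene_index]
        if cols:
            scores[:, j] = expr[:, cols].mean(axis=1)
    return cell_types, scores

def annotate(expr, gene_names, markers):
    """Label each cell with the cell type that has the highest marker score."""
    cell_types, scores = marker_score_matrix(expr, gene_names, markers)
    return [cell_types[k] for k in scores.argmax(axis=1)]

# Toy data: 3 cells x 4 genes, with illustrative marker lists.
gene_names = ["CD3E", "CD14", "MS4A1", "LYZ"]
markers = {"T cell": ["CD3E"], "Monocyte": ["CD14", "LYZ"], "B cell": ["MS4A1"]}
expr = np.array([
    [5.0, 0.0, 0.0, 0.5],
    [0.0, 4.0, 0.0, 6.0],
    [0.2, 0.0, 3.0, 0.0],
])
labels = annotate(expr, gene_names, markers)
print(labels)
```

Because scoring reduces to column selection and averaging, with no model fitting, the cost grows linearly with the number of cells, which is consistent with the abstract's emphasis on speed.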
Affiliation(s)
- Yang Xu
- UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37996, USA
| | - Rafael Kramann
- Institute of Experimental Medicine and Systems Biology, RWTH Aachen University, Aachen, Germany
| | - Rachel Patton McCord
- Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
| | - Sikander Hayat
- Institute of Experimental Medicine and Systems Biology, RWTH Aachen University, Aachen, Germany
|
13
|
Li Z, Zhou X. BASS: multi-scale and multi-sample analysis enables accurate cell type clustering and spatial domain detection in spatial transcriptomic studies. Genome Biol 2022; 23:168. [PMID: 35927760 PMCID: PMC9351148 DOI: 10.1186/s13059-022-02734-7] [Citation(s) in RCA: 55] [Impact Index Per Article: 18.3] [Received: 01/31/2022] [Accepted: 07/21/2022] [Indexed: 02/08/2023] Open
Abstract
Spatial transcriptomic studies are reaching single-cell spatial resolution, with data often collected from multiple tissue sections. Here, we present a computational method, BASS, that enables multi-scale and multi-sample analysis for single-cell resolution spatial transcriptomics. BASS performs cell type clustering at the single-cell scale and spatial domain detection at the tissue regional scale, with the two tasks carried out simultaneously within a Bayesian hierarchical modeling framework. We illustrate the benefits of BASS through comprehensive simulations and applications to three datasets. The substantial power gain brought by BASS allows us to reveal accurate transcriptomic and cellular landscape in both cortex and hypothalamus.
Affiliation(s)
- Zheng Li
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Xiang Zhou
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA.
- Center for Statistical Genetics, University of Michigan, Ann Arbor, MI, 48109, USA.
|
14
|
Dave A, Nekritz E, Charytonowicz D, Beaumont M, Smith M, Beaumont K, Silva J, Sebra R. Integration of Single-Cell Transcriptomics With a High Throughput Functional Screening Assay to Resolve Cell Type, Growth Kinetics, and Stemness Heterogeneity Within the Comma-1D Cell Line. Front Genet 2022; 13:894597. [PMID: 36630696 PMCID: PMC9237515 DOI: 10.3389/fgene.2022.894597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2022] [Accepted: 05/20/2022] [Indexed: 01/14/2023] Open
Abstract
Cell lines are one of the most frequently implemented model systems in life sciences research as they provide reproducible high throughput testing. Differentiation of cell cultures varies by line and, in some cases, can result in functional modifications within a population. Although research is increasingly dependent on these in vitro model systems, the heterogeneity within cell lines has not been thoroughly investigated. Here, we have leveraged high throughput single-cell assays to investigate the Comma-1D mouse cell line that is known to differentiate in culture. Using scRNASeq and custom single-cell phenotype assays, we resolve the clonal heterogeneity within the referenced cell line on the genomic and functional level. We performed a cohesive analysis of the transcriptome of 5,195 sequenced cells, of which 85.3% of the total reads successfully mapped to the mm10-3.0.0 reference genome. Across multiple gene expression analysis pipelines, both luminal and myoepithelial lineages were observed. Deep differential gene expression analysis revealed eight subclusters identified as luminal progenitor, luminal differentiated, myoepithelial differentiated, and fibroblast subpopulations, suggesting functional clustering within each lineage. Gene expression of published mammary stem cell (MaSC) markers Epcam, Cd49f, and Sca-1 was detected across the population, with 116 (2.23%) sequenced cells expressing all three markers. To gain insight into functional heterogeneity, cells with patterned MaSC marker expression were isolated and phenotypically investigated through a custom single-cell high throughput assay. The comparison of growth kinetics demonstrates functional heterogeneity within each cell cluster while also illustrating significant limitations in current cell isolation methods. We outlined the upstream use of our novel automated cell identification platform (to be used prior to single-cell culture) for reduced cell stress and improved rare cell identification and capture. Through compounding single-cell pipelines, we better reveal the heterogeneity within Comma-1D to identify subpopulations with specific functional characteristics.
Affiliation(s)
- Arpit Dave
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Erin Nekritz
- Department of Pathology, Icahn School of Medicine at Mount Sinai Hospital, New York, NY, United States
| | - Daniel Charytonowicz
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Michael Beaumont
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Icahn Genomics Institute, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Melissa Smith
- Department of Biochemistry and Molecular Genetics, University of Louisville, Louisville, KY, United States
| | - Kristin Beaumont
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Icahn Genomics Institute, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Jose Silva
- Department of Pathology, Icahn School of Medicine at Mount Sinai Hospital, New York, NY, United States
| | - Robert Sebra
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Icahn Genomics Institute, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Center for Advanced Genomics Technology, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Sema4, A Mount Sinai Venture, Stamford, CT, United States
|
15
|
Cytotoxic innate lymphoid cells sense cancer cell-expressed interleukin-15 to suppress human and murine malignancies. Nat Immunol 2022; 23:904-915. [PMID: 35618834 DOI: 10.1038/s41590-022-01213-2] [Citation(s) in RCA: 53] [Impact Index Per Article: 17.7] [Received: 06/14/2021] [Accepted: 04/14/2022] [Indexed: 12/15/2022]
Abstract
Malignancy can be suppressed by the immune system. However, the classes of immunosurveillance responses and their mode of tumor sensing remain incompletely understood. Here, we show that although clear cell renal cell carcinoma (ccRCC) was infiltrated by exhaustion-phenotype CD8+ T cells that negatively correlated with patient prognosis, chromophobe RCC (chRCC) had abundant infiltration of granzyme A-expressing intraepithelial type 1 innate lymphoid cells (ILC1s) that positively associated with patient survival. Interleukin-15 (IL-15) promoted ILC1 granzyme A expression and cytotoxicity, and IL-15 expression in chRCC tumor tissue positively tracked with the ILC1 response. An ILC1 gene signature also predicted survival of a subset of breast cancer patients in association with IL-15 expression. Notably, ILC1s directly interacted with cancer cells, and IL-15 produced by cancer cells supported the expansion and anti-tumor function of ILC1s in a murine breast cancer model. Thus, ILC1 sensing of cancer cell IL-15 defines an immunosurveillance mechanism of epithelial malignancies.
|
16
|
Li L, Zhang Y, Ren Y, Cheng Z, Zhang Y, Wang X, Zhao H, Lu H. Pan-Cancer Single-Cell Analysis Reveals the Core Factors and Pathway in Specific Cancer Stem Cells of Upper Gastrointestinal Cancer. Front Bioeng Biotechnol 2022; 10:849798. [PMID: 35646860 PMCID: PMC9136039 DOI: 10.3389/fbioe.2022.849798] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/06/2022] [Accepted: 04/07/2022] [Indexed: 12/24/2022] Open
Abstract
Upper gastrointestinal cancer (UGIC) is an aggressive carcinoma with increasing incidence and poor outcomes worldwide. Here, we collected 39,057 cells, and they were annotated into nine cell types. By clustering cancer stem cells (CSCs), we discovered the ubiquitous existence of sub-cluster CSCs in all UGICs, which is named upper gastrointestinal cancer stem cells (UGCSCs). The identification of UGCSC function is coincident with the carcinogen of UGICs. We compared the UGCSC expression profile with 215,291 single cells from six other cancers and discovered that UGCSCs are specific tumor stem cells in UGIC. Exploration of the expression network indicated that inflammatory genes (CXCL8, CXCL3, PIGR, and RNASE1) and Wnt pathway genes (GAST, REG1A, TFF3, and ZG16B) are upregulated in tumor stem cells of UGICs. These results suggest a new mechanism for carcinogenesis in UGIC: mucosa damage and repair caused by poor eating habits lead to chronic inflammation, and the persistent chronic inflammation triggers the Wnt pathway; ultimately, this process induces UGICs. These findings establish the core signal pathway that connects poor eating habits and UGIC. Our system provides deeper insights into UGIC carcinogens and a platform to promote gastrointestinal cancer diagnosis and therapy.
Affiliation(s)
- Leijie Li
- SJTU-Yale Joint Center for Biostatistics and Data Science, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Yujia Zhang
- SJTU-Yale Joint Center for Biostatistics and Data Science, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Yongyong Ren
- SJTU-Yale Joint Center for Biostatistics and Data Science, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Zhiwei Cheng
- SJTU-Yale Joint Center for Biostatistics and Data Science, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Yuening Zhang
- SJTU-Yale Joint Center for Biostatistics and Data Science, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Xinbo Wang
- SJTU-Yale Joint Center for Biostatistics and Data Science, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Hongyu Zhao
- Department of Biostatistics, Yale University, New Haven, CT, United States
| | - Hui Lu
- SJTU-Yale Joint Center for Biostatistics and Data Science, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- *Correspondence: Hui Lu,
|
17
|
Yu L, Cao Y, Yang JYH, Yang P. Benchmarking clustering algorithms on estimating the number of cell types from single-cell RNA-sequencing data. Genome Biol 2022; 23:49. [PMID: 35135612 PMCID: PMC8822786 DOI: 10.1186/s13059-022-02622-0] [Citation(s) in RCA: 78] [Impact Index Per Article: 26.0] [Received: 08/17/2021] [Accepted: 01/27/2022] [Indexed: 01/24/2023] Open
Abstract
BACKGROUND A key task in single-cell RNA-seq (scRNA-seq) data analysis is to accurately detect the number of cell types in the sample, which can be critical for downstream analyses such as cell type identification. Various scRNA-seq data clustering algorithms have been specifically designed to automatically estimate the number of cell types through optimising the number of clusters in a dataset. The lack of benchmark studies, however, complicates the choice of the methods. RESULTS We systematically benchmark a range of popular clustering algorithms on estimating the number of cell types in a variety of settings by sampling from the Tabula Muris data to create scRNA-seq datasets with a varying number of cell types, varying number of cells in each cell type, and different cell type proportions. The large number of datasets enables us to assess the performance of the algorithms, covering four broad categories of approaches, from various aspects using a panel of criteria. We further cross-compared the performance on datasets with high cell numbers using Tabula Muris and Tabula Sapiens data. CONCLUSIONS We identify the strengths and weaknesses of each method on multiple criteria including the deviation of estimation from the true number of cell types, variability of estimation, clustering concordance of cells to their predefined cell types, and running time and peak memory usage. We then summarise these results into a multi-aspect recommendation to the users. The proposed stability-based approach for estimating the number of cell types is implemented in an R package and is freely available from ( https://github.com/PYangLab/scCCESS ).
Affiliation(s)
- Lijia Yu
- School of Mathematics and Statistics, University of Sydney, Sydney, NSW, 2006, Australia
- Computational Systems Biology Group, Children's Medical Research Institute, University of Sydney, Westmead, NSW, 2145, Australia
- Charles Perkins Centre, University of Sydney, Sydney, NSW, 2006, Australia
| | - Yue Cao
- School of Mathematics and Statistics, University of Sydney, Sydney, NSW, 2006, Australia
- Charles Perkins Centre, University of Sydney, Sydney, NSW, 2006, Australia
| | - Jean Y H Yang
- School of Mathematics and Statistics, University of Sydney, Sydney, NSW, 2006, Australia
- Charles Perkins Centre, University of Sydney, Sydney, NSW, 2006, Australia
| | - Pengyi Yang
- School of Mathematics and Statistics, University of Sydney, Sydney, NSW, 2006, Australia.
- Computational Systems Biology Group, Children's Medical Research Institute, University of Sydney, Westmead, NSW, 2145, Australia.
- Charles Perkins Centre, University of Sydney, Sydney, NSW, 2006, Australia.
|
18
|
Lotfollahi M, Naghipourfar M, Luecken MD, Khajavi M, Büttner M, Wagenstetter M, Avsec Ž, Gayoso A, Yosef N, Interlandi M, Rybakov S, Misharin AV, Theis FJ. Mapping single-cell data to reference atlases by transfer learning. Nat Biotechnol 2022; 40:121-130. [PMID: 34462589 PMCID: PMC8763644 DOI: 10.1038/s41587-021-01001-7] [Citation(s) in RCA: 233] [Impact Index Per Article: 77.7] [Received: 07/30/2020] [Accepted: 06/28/2021] [Indexed: 02/07/2023]
Abstract
Large single-cell atlases are now routinely generated to serve as references for analysis of smaller-scale studies. Yet learning from reference data is complicated by batch effects between datasets, limited availability of computational resources and sharing restrictions on raw data. Here we introduce a deep learning strategy for mapping query datasets on top of a reference called single-cell architectural surgery (scArches). scArches uses transfer learning and parameter optimization to enable efficient, decentralized, iterative reference building and contextualization of new datasets with existing references without sharing raw data. Using examples from mouse brain, pancreas, immune and whole-organism atlases, we show that scArches preserves biological state information while removing batch effects, despite using four orders of magnitude fewer parameters than de novo integration. scArches generalizes to multimodal reference mapping, allowing imputation of missing modalities. Finally, scArches retains coronavirus disease 2019 (COVID-19) disease variation when mapping to a healthy reference, enabling the discovery of disease-specific cell states. scArches will facilitate collaborative projects by enabling iterative construction, updating, sharing and efficient use of reference atlases.
Affiliation(s)
- Mohammad Lotfollahi
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
- School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany
| | - Mohsen Naghipourfar
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
| | - Malte D Luecken
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
| | - Matin Khajavi
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
| | - Maren Büttner
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
| | - Marco Wagenstetter
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
| | - Žiga Avsec
- Department of Computer Science, Technical University of Munich, Munich, Germany
| | - Adam Gayoso
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
| | - Nir Yosef
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
- Ragon Institute of MGH, MIT and Harvard, Cambridge, MA, USA
| | - Marta Interlandi
- Institute of Medical Informatics, University of Münster, Münster, Germany
| | - Sergei Rybakov
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
- Department of Mathematics, Technical University of Munich, Munich, Germany
| | - Alexander V Misharin
- Division of Pulmonary and Critical Care Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Fabian J Theis
- Helmholtz Center Munich-German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany.
- School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany.
- Department of Mathematics, Technical University of Munich, Munich, Germany.
|
19
|
Abstract
Epigenome regulation has emerged as an important mechanism for the maintenance of organ function in health and disease. Dissecting epigenomic alterations and resultant gene expression changes in single cells provides unprecedented resolution and insight into cellular diversity, modes of gene regulation, transcription factor dynamics and 3D genome organization. In this chapter, we summarize the transformative single-cell epigenomic technologies that have deepened our understanding of the fundamental principles of gene regulation. We provide a historical perspective of these methods, brief procedural outline with emphasis on the computational tools used to meaningfully dissect information. Our overall goal is to aid scientists using these technologies in their favorite system of interest.
Affiliation(s)
- Krystyna Mazan-Mamczarz
- Laboratory of Genetics and Genomics, National Institute on Aging (NIA), Intramural Research Program (IRP), National Institutes of Health (NIH), Baltimore, MD, USA
| | - Jisu Ha
- Laboratory of Genetics and Genomics, National Institute on Aging (NIA), Intramural Research Program (IRP), National Institutes of Health (NIH), Baltimore, MD, USA
| | - Supriyo De
- Laboratory of Genetics and Genomics, National Institute on Aging (NIA), Intramural Research Program (IRP), National Institutes of Health (NIH), Baltimore, MD, USA
- Laboratory of Genetics and Genomics, and Computational Biology and Genomics Core, National Institute on Aging-Intramural Research Program, National Institute of Health, Baltimore, MD, USA
| | - Payel Sen
- Laboratory of Genetics and Genomics, National Institute on Aging (NIA), Intramural Research Program (IRP), National Institutes of Health (NIH), Baltimore, MD, USA.
|
20
|
You Y, Tian L, Su S, Dong X, Jabbari JS, Hickey PF, Ritchie ME. Benchmarking UMI-based single-cell RNA-seq preprocessing workflows. Genome Biol 2021; 22:339. [PMID: 34906205 PMCID: PMC8672463 DOI: 10.1186/s13059-021-02552-3] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 06/17/2021] [Accepted: 11/22/2021] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Single-cell RNA-sequencing (scRNA-seq) technologies and associated analysis methods have rapidly developed in recent years. This includes preprocessing methods, which assign sequencing reads to genes to create count matrices for downstream analysis. While several packaged preprocessing workflows have been developed to provide users with convenient tools for handling this process, how they compare to one another and how they influence downstream analysis have not been well studied. RESULTS Here, we systematically benchmark the performance of 10 end-to-end preprocessing workflows (Cell Ranger, Optimus, salmon alevin, alevin-fry, kallisto bustools, dropSeqPipe, scPipe, zUMIs, celseq2, and scruff) using datasets yielding different biological complexity levels generated by CEL-Seq2 and 10x Chromium platforms. We compare these workflows in terms of their quantification properties directly and their impact on normalization and clustering by evaluating the performance of different method combinations. While the scRNA-seq preprocessing workflows compared vary in their detection and quantification of genes across datasets, after downstream analysis with performant normalization and clustering methods, almost all combinations produce clustering results that agree well with the known cell type labels that provided the ground truth in our analysis. CONCLUSIONS In summary, the choice of preprocessing method was found to be less important than other steps in the scRNA-seq analysis process. Our study comprehensively compares common scRNA-seq preprocessing workflows and summarizes their characteristics to guide workflow users.
Affiliation(s)
- Yue You
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Australia
| | - Luyi Tian
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Australia
| | - Shian Su
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Australia
| | - Xueyi Dong
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Australia
| | - Jafar S. Jabbari
- Australian Genome Research Facility, Victorian Comprehensive Cancer Centre, Melbourne, Australia
- Microbiological Diagnostic Unit Public Health Laboratory, Department of Microbiology and Immunology, The University of Melbourne at The Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Peter F. Hickey
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Australia
- Single-Cell Open Research Endeavour (SCORE), The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Australia
| | - Matthew E. Ritchie
- Epigenetics and Development Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Australia
- Department of Medical Biology, The University of Melbourne, Parkville, Australia
- School of Mathematics and Statistics, The University of Melbourne, Parkville, Australia
|
21
|
Bej S, Galow AM, David R, Wolfien M, Wolkenhauer O. Automated annotation of rare-cell types from single-cell RNA-sequencing data through synthetic oversampling. BMC Bioinformatics 2021; 22:557. [PMID: 34798805 PMCID: PMC8603509 DOI: 10.1186/s12859-021-04469-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 07/09/2021] [Accepted: 11/03/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The research landscape of single-cell and single-nuclei RNA-sequencing is evolving rapidly. In particular, the area for the detection of rare cells was highly facilitated by this technology. However, an automated, unbiased, and accurate annotation of rare subpopulations is challenging. Once rare cells are identified in one dataset, it is usually necessary to generate further specific datasets to enrich the analysis (e.g., with samples from other tissues). From a machine learning perspective, the challenge arises from the fact that rare-cell subpopulations constitute an imbalanced classification problem. We here introduce a Machine Learning (ML)-based oversampling method that uses gene expression counts of already identified rare cells as an input to generate synthetic cells to then identify similar (rare) cells in other publicly available experiments. We utilize single-cell synthetic oversampling (sc-SynO), which is based on the Localized Random Affine Shadowsampling (LoRAS) algorithm. The algorithm corrects for the overall imbalance ratio of the minority and majority class. RESULTS We demonstrate the effectiveness of our method for three independent use cases, each consisting of already published datasets. The first use case identifies cardiac glial cells in snRNA-Seq data (17 nuclei out of 8635). This use case was designed to take a larger imbalance ratio (~1 to 500) into account and only uses single-nuclei data. The second use case was designed to jointly use snRNA-Seq data and scRNA-Seq on a lower imbalance ratio (~1 to 26) for the training step to likewise investigate the potential of the algorithm to consider both single-cell capture procedures and the impact of "less" rare-cell types. The third dataset refers to the murine data of the Allen Brain Atlas, including more than 1 million cells. For validation purposes only, all datasets have also been analyzed traditionally using common data analysis approaches, such as the Seurat workflow. 
CONCLUSIONS In comparison to baseline testing without oversampling, our approach identifies rare-cells with a robust precision-recall balance, including a high accuracy and low false positive detection rate. A practical benefit of our algorithm is that it can be readily implemented in other and existing workflows. The code basis in R and Python is publicly available at FairdomHub, as well as GitHub, and can easily be transferred to identify other rare-cell types.
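The idea of generating synthetic minority-class cells to correct class imbalance can be sketched as follows. This is a simplified illustration in the spirit of shadow-sample oversampling, not the sc-SynO or LoRAS code: it draws source cells at random rather than from nearest-neighbour neighbourhoods, and the function name, noise level, number of combined cells, and toy matrix are assumptions.

```python
import numpy as np

def synthetic_oversample(minority, n_new, noise_sd=0.01, n_combine=3, seed=0):
    """Generate n_new synthetic rows from a (cells x genes) minority matrix.

    Each synthetic cell is a random convex combination of a few noisy
    copies ("shadow samples") of real minority cells. Simplification:
    source cells are picked at random, not from a k-nearest-neighbour set.
    """
    rng = np.random.default_rng(seed)
    n_cells, n_genes = minority.shape
    k = min(n_combine, n_cells)
    synthetic = np.empty((n_new, n_genes))
    for i in range(n_new):
        idx = rng.choice(n_cells, size=k, replace=False)
        # Shadow samples: real cells perturbed by small Gaussian noise.
        shadows = minority[idx] + rng.normal(0.0, noise_sd, size=(k, n_genes))
        # Non-negative weights that sum to 1 (a convex combination).
        weights = rng.dirichlet(np.ones(k))
        synthetic[i] = weights @ shadows
    # Keep expression values non-negative.
    return np.clip(synthetic, 0.0, None)

# Sizes taken from the first use case in the abstract: 17 rare nuclei of 8635.
n_minority, n_total = 17, 8635
n_majority = n_total - n_minority
imbalance = n_majority / n_minority  # majority cells per minority cell

# Toy minority matrix with 5 hypothetical genes.
toy_rng = np.random.default_rng(1)
minority = toy_rng.poisson(2.0, size=(n_minority, 5)).astype(float)

# Create enough synthetic cells to balance the two classes.
synthetic = synthetic_oversample(minority, n_new=n_majority - n_minority)
print(synthetic.shape, round(imbalance))
```

With these counts the ratio is roughly 507 majority nuclei per rare nucleus, in line with the "~1 to 500" figure quoted for the first use case.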
Affiliation(s)
- Saptarshi Bej
- Department of Systems Biology and Bioinformatics, University of Rostock, 18057, Rostock, Germany
- Leibniz-Institute for Food Systems Biology, Technical University of Munich, 85354, Freising, Germany
| | - Anne-Marie Galow
- Institute of Genome Biology, Research Institute for Farm Animal Biology, 18196, Dummerstorf, Germany
| | - Robert David
- Department of Cardiac Surgery, Rostock University Medical Centre, 18057, Rostock, Germany
- Department of Life, Light and Matter, University of Rostock, 18059, Rostock, Germany
| | - Markus Wolfien
- Department of Systems Biology and Bioinformatics, University of Rostock, 18057, Rostock, Germany
| | - Olaf Wolkenhauer
- Department of Systems Biology and Bioinformatics, University of Rostock, 18057, Rostock, Germany.
- Leibniz-Institute for Food Systems Biology, Technical University of Munich, 85354, Freising, Germany.
- Stellenbosch Institute of Advanced Study, Stellenbosch University, Stellenbosch, 7602, South Africa.
|
22
|
Zappia L, Theis FJ. Over 1000 tools reveal trends in the single-cell RNA-seq analysis landscape. Genome Biol 2021; 22:301. [PMID: 34715899 PMCID: PMC8555270 DOI: 10.1186/s13059-021-02519-4] [Citation(s) in RCA: 84] [Impact Index Per Article: 21.0] [Received: 08/24/2021] [Accepted: 10/14/2021] [Indexed: 11/16/2022] Open
Abstract
Recent years have seen a revolution in single-cell RNA-sequencing (scRNA-seq) technologies, datasets, and analysis methods. Since 2016, the scRNA-tools database has cataloged software tools for analyzing scRNA-seq data. With the number of tools in the database passing 1000, we provide an update on the state of the project and the field. This data shows the evolution of the field and a change of focus from ordering cells on continuous trajectories to integrating multiple samples and making use of reference datasets. We also find that open science practices reward developers with increased recognition and help accelerate the field.
Affiliation(s)
- Luke Zappia
  - Institute of Computational Biology, Helmholtz Zentrum München, 85764, Neuherberg, Germany
  - Department of Mathematics, Technical University of Munich, 85748, Garching bei München, Germany
- Fabian J Theis
  - Institute of Computational Biology, Helmholtz Zentrum München, 85764, Neuherberg, Germany
  - Department of Mathematics, Technical University of Munich, 85748, Garching bei München, Germany
  - TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85354, Freising, Germany
|
23
|
Transcriptional Differences in Lipid-Metabolizing Enzymes in Murine Sebocytes Derived from Sebaceous Glands of the Skin and Preputial Glands. Int J Mol Sci 2021; 22:ijms222111631. [PMID: 34769061 PMCID: PMC8584257 DOI: 10.3390/ijms222111631] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/05/2021] [Revised: 10/22/2021] [Accepted: 10/25/2021] [Indexed: 12/18/2022] Open
Abstract
Sebaceous glands are adnexal structures, which critically contribute to skin homeostasis and the establishment of a functional epidermal barrier. Sebocytes, the main cell population found within the sebaceous glands, are highly specialized lipid-producing cells. Sebaceous gland-resembling tissue structures are also found in male rodents in the form of preputial glands. Similar to sebaceous glands, they are composed of lipid-specialized sebocytes. Due to a lack of adequate organ culture models for skin sebaceous glands and the fact that preputial glands are much larger and easier to handle, previous studies used preputial glands as a model for skin sebaceous glands. Here, we compared both types of sebocytes, using a single-cell RNA sequencing approach, to unravel potential similarities and differences between the two sebocyte populations. In spite of common gene expression patterns due to general lipid-producing properties, we found significant differences in the expression levels of genes encoding enzymes involved in the biogenesis of specialized lipid classes. Specifically, genes critically involved in the mevalonate pathway, including squalene synthase, as well as the sphingolipid salvage pathway, such as ceramide synthase, (acid) sphingomyelinase or acid and alkaline ceramidases, were significantly less expressed by preputial gland sebocytes. Together, our data revealed tissue-specific sebocyte populations, indicating major developmental, functional as well as biosynthetic differences between both glands. The use of preputial glands as a surrogate model to study skin sebaceous glands is therefore limited, and major differences between both glands need to be carefully considered before planning an experiment.
|
24
|
Shiga M, Seno S, Onizuka M, Matsuda H. SC-JNMF: single-cell clustering integrating multiple quantification methods based on joint non-negative matrix factorization. PeerJ 2021; 9:e12087. [PMID: 34532161 PMCID: PMC8404576 DOI: 10.7717/peerj.12087] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/10/2021] [Accepted: 08/07/2021] [Indexed: 11/20/2022] Open
Abstract
Single-cell RNA-sequencing is a rapidly evolving technology that enables us to understand biological processes at unprecedented resolution. Single-cell expression analysis requires a complex data processing pipeline, and the pipeline is divided into two main parts: The quantification part, which converts the sequence information into gene-cell matrix data; the analysis part, which analyzes the matrix data using statistics and/or machine learning techniques. In the analysis part, unsupervised cell clustering plays an important role in identifying cell types and discovering cell diversity and subpopulations. Identified cell clusters are also used for subsequent analysis, such as finding differentially expressed genes and inferring cell trajectories. However, single-cell clustering using gene expression profiles shows different results depending on the quantification methods. Clustering results are greatly affected by the quantification method used in the upstream process. In other words, even if the original RNA-sequence data is the same, gene expression profiles processed by different quantification methods will produce different clusters. In this article, we propose a robust and highly accurate clustering method based on joint non-negative matrix factorization (joint-NMF) by utilizing the information from multiple gene expression profiles quantified using different methods from the same RNA-sequence data. Our joint-NMF can extract common factors among multiple gene expression profiles by applying each NMF under the constraint that one of the factorized matrices is shared among multiple NMFs. The joint-NMF determines more robust and accurate cell clustering results by leveraging multiple quantification methods compared to conventional clustering methods, which use only a single gene expression profile. Additionally, we showed the usefulness of discovering marker genes with the extracted features using our method.
Affiliation(s)
- Mikio Shiga
  - Graduate School of Information Science and Technology, Osaka University, Osaka, Japan
- Shigeto Seno
  - Graduate School of Information Science and Technology, Osaka University, Osaka, Japan
- Makoto Onizuka
  - Graduate School of Information Science and Technology, Osaka University, Osaka, Japan
- Hideo Matsuda
  - Graduate School of Information Science and Technology, Osaka University, Osaka, Japan
|
25
|
Schmidt F, Ranjan B, Lin Q, Krishnan V, Joanito I, Honardoost M, Nawaz Z, Venkatesh P, Tan J, Rayan N, Ong S, Prabhakar S. RCA2: a scalable supervised clustering algorithm that reduces batch effects in scRNA-seq data. Nucleic Acids Res 2021; 49:8505-8519. [PMID: 34320202 PMCID: PMC8344557 DOI: 10.1093/nar/gkab632] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/18/2021] [Revised: 07/06/2021] [Accepted: 07/13/2021] [Indexed: 11/13/2022] Open
Abstract
The transcriptomic diversity of cell types in the human body can be analysed in unprecedented detail using single cell (SC) technologies. Unsupervised clustering of SC transcriptomes, which is the default technique for defining cell types, is prone to group cells by technical, rather than biological, variation. Compared to de-novo (unsupervised) clustering, we demonstrate using multiple benchmarks that supervised clustering, which uses reference transcriptomes as a guide, is robust to batch effects and data quality artifacts. Here, we present RCA2, the first algorithm to combine reference projection (batch effect robustness) with graph-based clustering (scalability). In addition, RCA2 provides a user-friendly framework incorporating multiple commonly used downstream analysis modules. RCA2 also provides new reference panels for human and mouse and supports generation of custom panels. Furthermore, RCA2 facilitates cell type-specific QC, which is essential for accurate clustering of data from heterogeneous tissues. We demonstrate the advantages of RCA2 on SC data from human bone marrow, healthy PBMCs and PBMCs from COVID-19 patients. Scalable supervised clustering methods such as RCA2 will facilitate unified analysis of cohort-scale SC datasets.
Affiliation(s)
- Florian Schmidt
  - Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
- Bobby Ranjan
  - Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
- Quy Xiao Xuan Lin
  - Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
- Ignasius Joanito
  - Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
- Mohammad Amin Honardoost
  - Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
  - Department of Medicine, School of Medicine, National University of Singapore, 1 Kent Ridge Road, level 10, NUHS Tower Block, 119228, Singapore
- Zahid Nawaz
  - Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
- Prasanna Nori Venkatesh
  - Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
- Joanna Tan
  - Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
- Nirmala Arul Rayan
  - Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
- Sin Tiong Ong
  - DUKE-NUS Medical School, 8 College Rd, 169857, Singapore
  - Department of Medicine, Duke University Medical Center, Durham, NC 27710, USA
- Shyam Prabhakar
  - Laboratory of Systems Biology and Data Analytics, Genome Institute of Singapore, A*STAR, 60 Biopolis St, 138672, Singapore
|
26
|
Talukdar S, Chang Z, Winterhoff B, Starr TK. Single-Cell RNA Sequencing of Ovarian Cancer: Promises and Challenges. Advances in Experimental Medicine and Biology 2021; 1330:113-123. [PMID: 34339033 DOI: 10.1007/978-3-030-73359-9_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2023]
Abstract
Ovarian cancer remains the leading cause of death from gynecologic malignancy in the Western world. Tumors are comprised of heterogeneous populations of various cancer, immune, and stromal cells; it is hypothesized that rare cancer stem cells within these subpopulations lead to disease recurrence and treatment resistance. Technological advances now allow for the analysis of tumor genomes and transcriptomes at the single-cell level, which provides the resolution to potentially identify these rare cancer stem cells within the larger tumor. In this chapter, we review the evolution of next-generation RNA sequencing techniques, the methodology of single-cell isolation and sequencing, sequencing data analysis, and the potential applications in ovarian cancer. We also summarize the current published work using single-cell sequencing in ovarian cancer. By utilizing this novel technique to characterize the gene expression of rare subpopulations, new targets and treatment pathways may be identified in ovarian cancer to change treatment paradigms.
Affiliation(s)
- Shobhana Talukdar
  - Division of Gynecologic Oncology, Department of Obstetrics, Gynecology and Women's Health, University of Minnesota School of Medicine, Minneapolis, MN, USA
- Zenas Chang
  - Division of Gynecologic Oncology, Department of Obstetrics, Gynecology and Women's Health, University of Minnesota School of Medicine, Minneapolis, MN, USA
- Boris Winterhoff
  - Division of Gynecologic Oncology, Department of Obstetrics, Gynecology and Women's Health, University of Minnesota School of Medicine, Minneapolis, MN, USA
  - Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA
- Timothy K Starr
  - Division of Gynecologic Oncology, Department of Obstetrics, Gynecology and Women's Health, University of Minnesota School of Medicine, Minneapolis, MN, USA
  - Masonic Cancer Center, University of Minnesota, Minneapolis, MN, USA
  - Institute of Health Informatics, University of Minnesota, Minneapolis, MN, USA
|
27
|
Wilk AJ, Lee MJ, Wei B, Parks B, Pi R, Martínez-Colón GJ, Ranganath T, Zhao NQ, Taylor S, Becker W, Stanford COVID-19 Biobank, Jimenez-Morales D, Blomkalns AL, O’Hara R, Ashley EA, Nadeau KC, Yang S, Holmes S, Rabinovitch M, Rogers AJ, Greenleaf WJ, Blish CA. Multi-omic profiling reveals widespread dysregulation of innate immunity and hematopoiesis in COVID-19. J Exp Med 2021; 218:e20210582. [PMID: 34128959 PMCID: PMC8210586 DOI: 10.1084/jem.20210582] [Citation(s) in RCA: 138] [Impact Index Per Article: 34.5] [Received: 03/12/2021] [Revised: 05/13/2021] [Accepted: 05/13/2021] [Indexed: 12/20/2022] Open
Abstract
Our understanding of protective versus pathological immune responses to SARS-CoV-2, the virus that causes coronavirus disease 2019 (COVID-19), is limited by inadequate profiling of patients at the extremes of the disease severity spectrum. Here, we performed multi-omic single-cell immune profiling of 64 COVID-19 patients across the full range of disease severity, from outpatients with mild disease to fatal cases. Our transcriptomic, epigenomic, and proteomic analyses revealed widespread dysfunction of peripheral innate immunity in severe and fatal COVID-19, including prominent hyperactivation signatures in neutrophils and NK cells. We also identified chromatin accessibility changes at NF-κB binding sites within cytokine gene loci as a potential mechanism for the striking lack of pro-inflammatory cytokine production observed in monocytes in severe and fatal COVID-19. We further demonstrated that emergency myelopoiesis is a prominent feature of fatal COVID-19. Collectively, our results reveal disease severity-associated immune phenotypes in COVID-19 and identify pathogenesis-associated pathways that are potential targets for therapeutic intervention.
Affiliation(s)
- Aaron J. Wilk
  - Stanford Medical Scientist Training Program, Stanford University School of Medicine, Stanford, CA
  - Stanford Immunology Program, Stanford University School of Medicine, Stanford, CA
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA
- Madeline J. Lee
  - Stanford Immunology Program, Stanford University School of Medicine, Stanford, CA
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA
- Bei Wei
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA
- Benjamin Parks
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA
  - Graduate Program in Computer Science, Stanford University School of Medicine, Stanford, CA
- Ruoxi Pi
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA
- Thanmayi Ranganath
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA
- Nancy Q. Zhao
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA
- Shalina Taylor
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA
  - Cardiovascular Institute, Stanford University School of Medicine, Stanford, CA
  - Vera Moulton Wall Center for Pulmonary Vascular Disease, Stanford University School of Medicine, Stanford, CA
- Winston Becker
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA
- Andra L. Blomkalns
  - Department of Emergency Medicine, Stanford University School of Medicine, Stanford, CA
- Ruth O’Hara
  - Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA
- Euan A. Ashley
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA
- Kari C. Nadeau
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA
  - Sean N. Parker Center for Allergy and Asthma Research, Stanford University School of Medicine, Stanford, CA
- Samuel Yang
  - Department of Emergency Medicine, Stanford University School of Medicine, Stanford, CA
- Susan Holmes
  - Department of Statistics, Stanford University, Stanford, CA
- Marlene Rabinovitch
  - Department of Pediatrics, Stanford University School of Medicine, Stanford, CA
  - Cardiovascular Institute, Stanford University School of Medicine, Stanford, CA
  - Vera Moulton Wall Center for Pulmonary Vascular Disease, Stanford University School of Medicine, Stanford, CA
- Angela J. Rogers
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA
- William J. Greenleaf
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA
  - Department of Applied Physics, Stanford University, Stanford, CA
- Catherine A. Blish
  - Stanford Medical Scientist Training Program, Stanford University School of Medicine, Stanford, CA
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA
  - Chan Zuckerberg Biohub, San Francisco, CA
|
28
|
Song D, Li K, Hemminger Z, Wollman R, Li JJ. scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling. Bioinformatics 2021; 37:i358-i366. [PMID: 34252925 PMCID: PMC8275345 DOI: 10.1093/bioinformatics/btab273] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/15/2022] Open
Abstract
Motivation: Single-cell RNA sequencing (scRNA-seq) captures whole transcriptome information of individual cells. While scRNA-seq measures thousands of genes, researchers are often interested in only dozens to hundreds of genes for a closer study. Then, a question is how to select those informative genes from scRNA-seq data. Moreover, single-cell targeted gene profiling technologies are gaining popularity for their low costs, high sensitivity and extra (e.g. spatial) information; however, they typically can only measure up to a few hundred genes. Then another challenging question is how to select genes for targeted gene profiling based on existing scRNA-seq data. Results: Here, we develop the single-cell Projective Non-negative Matrix Factorization (scPNMF) method to select informative genes from scRNA-seq data in an unsupervised way. Compared with existing gene selection methods, scPNMF has two advantages. First, its selected informative genes can better distinguish cell types. Second, it enables the alignment of new targeted gene profiling data with reference data in a low-dimensional space to facilitate the prediction of cell types in the new data. Technically, scPNMF modifies the PNMF algorithm for gene selection by changing the initialization and adding a basis selection step, which selects informative bases to distinguish cell types. We demonstrate that scPNMF outperforms the state-of-the-art gene selection methods on diverse scRNA-seq datasets. Moreover, we show that scPNMF can guide the design of targeted gene profiling experiments and the cell-type annotation on targeted gene profiling data. Availability and implementation: The R package is open-access and available at https://github.com/JSB-UCLA/scPNMF. The data used in this work are available at Zenodo: https://doi.org/10.5281/zenodo.4797997. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Dongyuan Song
  - Bioinformatics Interdepartmental Ph.D. Program, University of California, Los Angeles, CA 90095-7246, USA
- Kexin Li
  - Department of Statistics, University of California, Los Angeles, CA 90095-1554, USA
- Zachary Hemminger
  - Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, CA 90095, USA
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, CA 90095-7239, USA
- Roy Wollman
  - Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, CA 90095, USA
  - Department of Integrative Biology and Physiology, University of California, Los Angeles, CA 90095-7239, USA
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095-1569, USA
- Jingyi Jessica Li
  - Department of Statistics, University of California, Los Angeles, CA 90095-1554, USA
  - Department of Human Genetics, University of California, Los Angeles, CA 90095-7088, USA
  - Department of Computational Medicine, University of California, Los Angeles, CA 90095-1766, USA
  - Department of Biostatistics, University of California Los Angeles, CA 90095-1772, USA
|
29
|
Wang Q, Peng C, Yang M, Huang F, Duan X, Wang S, Cheng H, Yang H, Zhao H, Qin Q. Single-cell RNA-seq landscape midbrain cell responses to red spotted grouper nervous necrosis virus infection. PLoS Pathog 2021; 17:e1009665. [PMID: 34185811 PMCID: PMC8241073 DOI: 10.1371/journal.ppat.1009665] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 03/12/2021] [Accepted: 05/24/2021] [Indexed: 12/25/2022] Open
Abstract
Viral nervous necrosis (VNN) is an acute and serious fish disease caused by nervous necrosis virus (NNV), which has been reported to cause massive mortality in more than fifty teleost species worldwide. VNN causes necrosis and vacuolation in central nervous system (CNS) cells in fish. It is difficult to identify the specific type of cell targeted by NNV, and to decipher the host immune response, because of the functional diversity and highly complex anatomical and cellular composition of the CNS. In this study, we found that the red spotted grouper NNV (RGNNV) mainly attacked the midbrain of orange-spotted grouper (Epinephelus coioides). We conducted single-cell RNA-seq analysis of the midbrain of healthy and RGNNV-infected fish and identified 35 transcriptionally distinct cell subtypes, including 28 neuronal and 7 non-neuronal cell types. An evaluation of the subpopulations of immune cells revealed that macrophages were enriched in RGNNV-infected fish, and the transcriptional profiles of macrophages indicated an acute cytokine and inflammatory response. Unsupervised pseudotime analysis of immune cells showed that microglia transformed into M1-type activated macrophages to produce cytokines to reduce the damage to nerve tissue caused by the virus. We also found that the neuronal cell types targeted by RGNNV were GLU1 and GLU3, and that key genes and pathways associated with cytoplasmic vacuolation and autophagy were significantly enriched; this may be the major route by which the virus causes cell death. These data provide a comprehensive transcriptional perspective of the grouper midbrain and the basis for further research on how viruses infect the teleost CNS.
Affiliation(s)
- Qing Wang
  - Joint Laboratory of Guangdong Province and Hong Kong Region on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou, China
  - Guangdong Laboratory for Lingnan Modern Agriculture, Guangzhou, China
- Cheng Peng
  - Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Institute of Zoology, Academy of Sciences, Guangzhou, China
- Min Yang
  - Joint Laboratory of Guangdong Province and Hong Kong Region on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou, China
  - Guangdong Laboratory for Lingnan Modern Agriculture, Guangzhou, China
- Fengqi Huang
  - Joint Laboratory of Guangdong Province and Hong Kong Region on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou, China
- Xuzhuo Duan
  - Joint Laboratory of Guangdong Province and Hong Kong Region on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou, China
- Shaowen Wang
  - Joint Laboratory of Guangdong Province and Hong Kong Region on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou, China
  - Guangdong Laboratory for Lingnan Modern Agriculture, Guangzhou, China
- Huitao Cheng
  - Joint Laboratory of Guangdong Province and Hong Kong Region on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou, China
- Huirong Yang
  - Joint Laboratory of Guangdong Province and Hong Kong Region on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou, China
  - Guangdong Laboratory for Lingnan Modern Agriculture, Guangzhou, China
- Huihong Zhao
  - Joint Laboratory of Guangdong Province and Hong Kong Region on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou, China
  - Guangdong Laboratory for Lingnan Modern Agriculture, Guangzhou, China
  - * E-mail: (HZ); (QQ)
- Qiwei Qin
  - Joint Laboratory of Guangdong Province and Hong Kong Region on Marine Bioresource Conservation and Exploitation, College of Marine Sciences, South China Agricultural University, Guangzhou, China
  - Guangdong Laboratory for Lingnan Modern Agriculture, Guangzhou, China
  - Laboratory for Marine Biology and Biotechnology, Qingdao National Laboratory for Marine Science and Technology, Qingdao, China
  - * E-mail: (HZ); (QQ)
|
30
|
Abstract
Studies of the spatiotemporal, transcriptomic, and morphological diversity of radial glia (RG) have spurred our current models of human corticogenesis. In the developing cortex, neural intermediate progenitor cells (nIPCs) are a neuron-producing transit-amplifying cell type born in the germinal zones of the cortex from RG. The potential diversity of the nIPC population, that produces a significant portion of excitatory cortical neurons, is understudied, particularly in the developing human brain. Here we explore the spatiotemporal, transcriptomic, and morphological variation that exists within the human nIPC population and provide a resource for future studies. We observe that the spatial distribution of nIPCs in the cortex changes abruptly around gestational week (GW) 19/20, marking a distinct shift in cellular distribution and organization during late neurogenesis. We also identify five transcriptomic subtypes, one of which appears at this spatiotemporal transition. Finally, we observe a diversity of nIPC morphologies that do not correlate with specific transcriptomic subtypes. These results provide an analysis of the spatiotemporal, transcriptional, and morphological diversity of nIPCs in developing brain tissue and provide an atlas of nIPC subtypes in the developing human cortex that can benchmark in vitro models of human development such as cerebral organoids and help inform future studies of how nIPCs contribute to cortical neurogenesis.
|
31
|
Li L, Xiong F, Wang Y, Zhang S, Gong Z, Li X, He Y, Shi L, Wang F, Liao Q, Xiang B, Zhou M, Li X, Li Y, Li G, Zeng Z, Xiong W, Guo C. What are the applications of single-cell RNA sequencing in cancer research: a systematic review. Journal of Experimental & Clinical Cancer Research: CR 2021; 40:163. [PMID: 33975628 PMCID: PMC8111731 DOI: 10.1186/s13046-021-01955-1] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 01/26/2021] [Accepted: 04/20/2021] [Indexed: 12/18/2022]
Abstract
Single-cell RNA sequencing (scRNA-seq) is a tool for studying gene expression at the single-cell level that has been widely used due to its unprecedented high resolution. In the present review, we outline the preparation process and sequencing platforms for the scRNA-seq analysis of solid tumor specimens and discuss the main steps and methods used during data analysis, including quality control, batch-effect correction, normalization, cell cycle phase assignment, clustering, cell trajectory and pseudo-time reconstruction, differential expression analysis and gene set enrichment analysis, as well as gene regulatory network inference. Traditional bulk RNA sequencing does not address the heterogeneity within and between tumors, and since the development of the first scRNA-seq technique, this approach has been widely used in cancer research to better understand cancer cell biology and pathogenetic mechanisms. ScRNA-seq has been of great significance for the development of targeted therapy and immunotherapy. In the second part of this review, we focus on the application of scRNA-seq in solid tumors, and summarize the findings and achievements in tumor research afforded by its use. ScRNA-seq holds promise for improving our understanding of the molecular characteristics of cancer, and potentially contributing to improved diagnosis, prognosis, and therapeutics.
Affiliation(s)
- Lvyuan Li
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
| | - Fang Xiong
- Department of Stomatology, Xiangya Hospital, Central South University, Changsha, China
| | - Yumin Wang
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China.,Department of Stomatology, Xiangya Hospital, Central South University, Changsha, China
| | - Shanshan Zhang
- Department of Stomatology, Xiangya Hospital, Central South University, Changsha, China
| | - Zhaojian Gong
- Department of Oral and Maxillofacial Surgery, The Second Xiangya Hospital, Central South University, Changsha, China
| | - Xiayu Li
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, The Third Xiangya Hospital, Central South University, Changsha, China
| | - Yi He
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China
| | - Lei Shi
- Department of Oral and Maxillofacial Surgery, The Second Xiangya Hospital, Central South University, Changsha, China
| | - Fuyan Wang
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
| | - Qianjin Liao
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China
| | - Bo Xiang
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
| | - Ming Zhou
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
| | - Xiaoling Li
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
| | - Yong Li
- Department of Medicine, Dan L Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, USA
| | - Guiyuan Li
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
| | - Zhaoyang Zeng
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
| | - Wei Xiong
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China.
| | - Can Guo
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China.,Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China.
| |
|
32
|
Krishna C, DiNatale RG, Kuo F, Srivastava RM, Vuong L, Chowell D, Gupta S, Vanderbilt C, Purohit TA, Liu M, Kansler E, Nixon BG, Chen YB, Makarov V, Blum KA, Attalla K, Weng S, Salmans ML, Golkaram M, Liu L, Zhang S, Vijayaraghavan R, Pawlowski T, Reuter V, Carlo MI, Voss MH, Coleman J, Russo P, Motzer RJ, Li MO, Leslie CS, Chan TA, Hakimi AA. Single-cell sequencing links multiregional immune landscapes and tissue-resident T cells in ccRCC to tumor topology and therapy efficacy. Cancer Cell 2021; 39:662-677.e6. [PMID: 33861994 PMCID: PMC8268947 DOI: 10.1016/j.ccell.2021.03.007] [Citation(s) in RCA: 229] [Impact Index Per Article: 57.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/30/2020] [Revised: 01/18/2021] [Accepted: 03/22/2021] [Indexed: 02/08/2023]
Abstract
Clear cell renal cell carcinomas (ccRCCs) are highly immune infiltrated, but the effect of immune heterogeneity on clinical outcome in ccRCC has not been fully characterized. Here we perform paired single-cell RNA (scRNA) and T cell receptor (TCR) sequencing of 167,283 cells from multiple tumor regions, lymph node, normal kidney, and peripheral blood of two immune checkpoint blockade (ICB)-naïve and four ICB-treated patients to map the ccRCC immune landscape. We detect extensive heterogeneity within and between patients, with enrichment of CD8A+ tissue-resident T cells in a patient responsive to ICB and tumor-associated macrophages (TAMs) in a resistant patient. A TCR trajectory framework suggests distinct T cell differentiation pathways between patients responding and resistant to ICB. Finally, scRNA-derived signatures of tissue-resident T cells and TAMs are associated with response to ICB and targeted therapies across multiple independent cohorts. Our study establishes a multimodal interrogation of the cellular programs underlying therapeutic efficacy in ccRCC.
Affiliation(s)
- Chirag Krishna
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Renzo G DiNatale
- Immunogenomics and Precision Oncology Platform, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Urology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Fengshen Kuo
- Immunogenomics and Precision Oncology Platform, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Raghvendra M Srivastava
- Immunogenomics and Precision Oncology Platform, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Lynda Vuong
- Immunogenomics and Precision Oncology Platform, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Diego Chowell
- Immunogenomics and Precision Oncology Platform, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Sounak Gupta
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Chad Vanderbilt
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Tanaya A Purohit
- Immunogenomics and Precision Oncology Platform, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Ming Liu
- Immunology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Emily Kansler
- Immunology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Louis V. Gerstner Jr. Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Briana G Nixon
- Immunology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Immunology and Microbial Pathogenesis Program, Weill Cornell Graduate School of Medical Sciences, Cornell University, New York, NY 10065, USA
| | - Ying-Bei Chen
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Vladimir Makarov
- Immunogenomics and Precision Oncology Platform, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Kyle A Blum
- Immunogenomics and Precision Oncology Platform, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Urology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Human Oncology and Pathogenesis Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Kyrollis Attalla
- Urology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Stanley Weng
- Urology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | | | - Mahdi Golkaram
- Illumina, Inc., 5200 Illumina Way, San Diego, CA 92122, USA
| | - Li Liu
- Illumina, Inc., 5200 Illumina Way, San Diego, CA 92122, USA
| | - Shile Zhang
- Illumina, Inc., 5200 Illumina Way, San Diego, CA 92122, USA
| | | | | | - Victor Reuter
- Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Maria I Carlo
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, New York, NY 10065, USA
| | - Martin H Voss
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, New York, NY 10065, USA
| | - Jonathan Coleman
- Urology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Paul Russo
- Urology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Robert J Motzer
- Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, New York, NY 10065, USA
| | - Ming O Li
- Immunology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
| | - Christina S Leslie
- Computational and Systems Biology Program, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA.
| | - Timothy A Chan
- Immunogenomics and Precision Oncology Platform, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Center for Immunotherapy and Precision Immuno-Oncology, Cleveland Clinic, Cleveland, OH 44195, USA; Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA; National Center for Regenerative Medicine, Cleveland Clinic, Cleveland, OH 44195, USA.
| | - A Ari Hakimi
- Immunogenomics and Precision Oncology Platform, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA; Urology Service, Department of Surgery, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA.
| |
|
33
|
Marques L, Lontra P, Wanke P, Antunes JJM. Governance modes in supply chains and financial performance at buyer, supplier and dyadic levels: the positive impact of power balance. BENCHMARKING-AN INTERNATIONAL JOURNAL 2021. [DOI: 10.1108/bij-03-2020-0114] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
Abstract
Purpose: This study analyzes whether power in the supply chain, based on governance modes and network centrality, explains financial performance at different levels of analysis: buyers, suppliers and dyads. Design/methodology/approach: The study employs a dual macro-micro lens based on global value chain (i.e. market, modular, relational and captive governance modes) and social network analysis (network centrality) to assess the impact of power (im)balance on financial performance. Different from previous research, this study adopts information reliability techniques, such as information entropy, to differentiate the weights of distinct financial performance metrics in terms of the maximal entropy principle. This principle states that the probability distribution that best represents the current state of knowledge given prior data is the one with the largest entropy. These weights are used in TOPSIS analysis. Findings: Results offer insightful reflections to SCM research. We show that buyers outperform suppliers due to power asymmetry. We ground our findings both by analyzing across governance modes and by comparing network centrality. We show that market and modular governances (where power balance prevails) outperform relational and captive modes at the dyadic level, thus inferring that in the long run these governance modes may lead to financially healthier supply chains. Originality/value: This study advances SCM research by exploring the impact of governance modes and network centrality on performance at both firm and dyadic levels while employing an innovative combination of secondary data and a robust set of techniques including TOPSIS, WASPAS and information entropy.
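The abstract above pairs entropy-derived weights with a TOPSIS ranking. A minimal sketch of that pairing follows, assuming the widely used entropy weight method and vector-normalised TOPSIS. The abstract does not give the study's exact normalisation or metrics, so the function names `entropy_weights` and `topsis`, the three example metrics, and the toy numbers are all illustrative assumptions.

```python
import numpy as np

def entropy_weights(X):
    """Entropy-based weights for a (firms x metrics) matrix of positive values."""
    P = X / X.sum(axis=0)                      # column-wise proportions
    n = X.shape[0]
    logP = np.log(np.where(P > 0, P, 1.0))     # log(1) = 0 stands in for 0 * log(0)
    E = -(P * logP).sum(axis=0) / np.log(n)    # Shannon entropy per metric, scaled to [0, 1]
    d = 1.0 - E                                # lower entropy means a more discriminating metric
    return d / d.sum()

def topsis(X, w, benefit):
    """Relative closeness to the ideal solution; benefit[j] is True if larger is better."""
    R = X / np.sqrt((X ** 2).sum(axis=0))      # vector normalisation
    V = R * w                                  # weighted normalised matrix
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))
    d_pos = np.sqrt(((V - ideal) ** 2).sum(axis=1))
    d_neg = np.sqrt(((V - anti) ** 2).sum(axis=1))
    return d_neg / (d_pos + d_neg)

# Hypothetical data: rows are firms, columns are return on assets, margin, leverage.
X = np.array([[0.10, 0.20, 0.5],
              [0.05, 0.10, 0.9],
              [0.20, 0.30, 0.3]])
benefit = np.array([True, True, False])        # leverage is treated as a cost metric here
w = entropy_weights(X)
scores = topsis(X, w, benefit)
```

In this toy matrix the third firm is best on every metric, so it coincides with the ideal solution and receives the top closeness score, while the second firm is worst on every metric and receives the lowest.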
|
34
|
Al Mahi N, Zhang EY, Sherman S, Yu JJ, Medvedovic M. Connectivity Map Analysis of a Single-Cell RNA-Sequencing -Derived Transcriptional Signature of mTOR Signaling. Int J Mol Sci 2021; 22:ijms22094371. [PMID: 33922083 PMCID: PMC8122562 DOI: 10.3390/ijms22094371] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2021] [Revised: 04/14/2021] [Accepted: 04/14/2021] [Indexed: 12/12/2022] Open
Abstract
In the connectivity map (CMap) approach to drug repositioning and development, a transcriptional signature of disease is constructed by differential gene expression analysis between the diseased tissue or cells and the control. The negative correlation between the transcriptional disease signature and the transcriptional signature of the drug, or a bioactive compound, is assumed to indicate its ability to “reverse” the disease process. A major limitation of traditional CMap analysis is the use of signatures derived from bulk disease tissues. Since the key driver pathways are most likely dysregulated in only a subset of cells, the “averaged” transcriptional signatures resulting from bulk analysis lack the resolution to identify effective therapeutic agents. The use of single-cell RNA-seq (scRNA-seq) transcriptomic assays facilitates construction of disease signatures that are specific to individual cell types, but methods for using scRNA-seq data in the context of CMap analysis are lacking. In lymphangioleiomyomatosis (LAM), mutations in TSC1 or TSC2 genes result in the activation of the mTOR complex 1 (mTORC1). The mTORC1 inhibitor sirolimus is the only FDA-approved drug to treat LAM. Novel therapies for LAM are urgently needed as the disease recurs with discontinuation of the treatment and some patients are insensitive to the drug. We developed methods for constructing disease transcriptional signatures and CMap analysis using scRNA-seq profiling and applied them in the analysis of scRNA-seq data of lung tissue from naïve and sirolimus-treated LAM patients. The new methods successfully implicated mTORC1 inhibitors, including sirolimus, as capable of reverting the LAM transcriptional signatures. The CMap analysis mimicking the standard bulk-tissue approach failed to detect any connection between the LAM signature and mTORC1 signaling. This indicates that the precise signature derived from scRNA-seq data using our methods is the crucial difference between the success and the failure to identify effective therapeutic treatments in CMap analysis.
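The core CMap idea stated in this abstract, that a negative correlation between a disease signature and a drug signature suggests reversal, can be sketched as below. This is a simplified illustration that uses a plain Pearson correlation over shared genes. It is not the scoring used by the authors or by the CMap resource itself, and the function name `connectivity`, the gene labels and the values are hypothetical.

```python
import numpy as np

def connectivity(disease_sig, drug_sig):
    """Pearson correlation of two gene -> log-fold-change signatures over their shared genes."""
    genes = sorted(set(disease_sig) & set(drug_sig))
    d = np.array([disease_sig[g] for g in genes])
    t = np.array([drug_sig[g] for g in genes])
    return float(np.corrcoef(d, t)[0, 1])

# Hypothetical signatures (log fold changes); gene names are placeholders.
disease = {"G1": 2.0, "G2": 1.5, "G3": -1.0, "G4": -2.0}
reversing_drug = {"G1": -1.8, "G2": -1.0, "G3": 0.9, "G4": 2.1, "G5": 0.3}
mimicking_drug = {"G1": 1.0, "G2": 0.5, "G3": -0.7, "G4": -1.2}

score_reverse = connectivity(disease, reversing_drug)   # negative: candidate "reverser"
score_mimic = connectivity(disease, mimicking_drug)     # positive: mimics the disease state
```

With a cell-type-specific disease signature in place of a bulk one, the same comparison would be run once per cell type, which is the resolution gain the abstract argues for.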
Affiliation(s)
- Naim Al Mahi
- Division of Biostatistics and Bioinformatics, Department of Environmental and Public Health Sciences, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA;
- AbbVie Inc., North Chicago, IL 60064, USA
| | - Erik Y. Zhang
- Division of Pulmonary, Critical Care and Sleep Medicine, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA; (E.Y.Z.); (J.J.Y.)
| | | | - Jane J. Yu
- Division of Pulmonary, Critical Care and Sleep Medicine, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA; (E.Y.Z.); (J.J.Y.)
| | - Mario Medvedovic
- Division of Biostatistics and Bioinformatics, Department of Environmental and Public Health Sciences, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA;
- Department of Biomedical Informatics, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
| |
|
35
|
Benchmarking mass spectrometry based proteomics algorithms using a simulated database. ACTA ACUST UNITED AC 2021; 10. [PMID: 34012763 DOI: 10.1007/s13721-021-00298-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
Abstract
Protein sequencing algorithms process data from a variety of instruments that have been generated under diverse experimental conditions. Currently there is no way to predict the accuracy of an algorithm for a given data set. Most of the published algorithms and associated software have been evaluated on a limited number of experimental data sets. However, these performance evaluations do not cover the complete search space the algorithm and the software might encounter in the real world. To this end, we present a database of simulated spectra that can be used to benchmark any spectra-to-peptide search engine. We demonstrate the usability of this database by benchmarking two popular peptide sequencing engines. We show wide variation in the accuracy of peptide deductions and that a complete quality profile of a given algorithm can be useful for practitioners and algorithm developers. All benchmarking data is available at https://users.cs.fiu.edu/~fsaeed/Benchmark.html.
|
36
|
Do VH, Rojas Ringeling F, Canzar S. Linear-time cluster ensembles of large-scale single-cell RNA-seq and multimodal data. Genome Res 2021; 31:677-688. [PMID: 33627473 PMCID: PMC8015854 DOI: 10.1101/gr.267906.120] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2020] [Accepted: 02/19/2021] [Indexed: 12/25/2022]
Abstract
A fundamental task in single-cell RNA-seq (scRNA-seq) analysis is the identification of transcriptionally distinct groups of cells. Numerous methods have been proposed for this problem, with a recent focus on methods for the cluster analysis of ultralarge scRNA-seq data sets produced by droplet-based sequencing technologies. Most existing methods rely on a sampling step to bridge the gap between algorithm scalability and volume of the data. Ignoring large parts of the data, however, often yields inaccurate groupings of cells and risks overlooking rare cell types. We propose a method, Specter, that adopts and extends recent algorithmic advances in (fast) spectral clustering. In contrast to methods that cluster a (random) subsample of the data, we adopt the idea of landmarks that are used to create a sparse representation of the full data from which a spectral embedding can then be computed in linear time. We exploit Specter's speed in a cluster ensemble scheme that achieves a substantial improvement in accuracy over existing methods and identifies rare cell types with high sensitivity. Its linear-time complexity allows Specter to scale to millions of cells and leads to fast computation times in practice. Furthermore, on CITE-seq data that simultaneously measures gene and protein marker expression, we show that Specter is able to use multimodal omics measurements to resolve subtle transcriptomic differences between subpopulations of cells.
Affiliation(s)
- Van Hoan Do
- Gene Center, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
| | | | - Stefan Canzar
- Gene Center, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
| |
|
37
|
Liu HL, Wang YN, Feng SY. Brain tumors: Cancer stem-like cells interact with tumor microenvironment. World J Stem Cells 2020; 12:1439-1454. [PMID: 33505594 PMCID: PMC7789119 DOI: 10.4252/wjsc.v12.i12.1439] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/29/2020] [Revised: 10/07/2020] [Accepted: 10/27/2020] [Indexed: 02/06/2023] Open
Abstract
Cancer stem-like cells (CSCs) with potential of self-renewal drive tumorigenesis. Brain tumor microenvironment (TME) has been identified as a critical regulator of malignancy progression. Many researchers are searching for new ways to characterize tumors with the goal of predicting how they respond to treatment. Here, we describe the striking parallels between normal stem cells and CSCs. We review the microenvironmental aspects of brain tumors, in particular composition and vital roles of immune cells infiltrating glioma and medulloblastoma. By highlighting that CSCs cooperate with TME via various cellular communication approaches, we discuss the recent advances in therapeutic strategies targeting the components of TME. Identification of the complex and interconnected factors can facilitate the development of promising treatments for these deadly malignancies.
Affiliation(s)
- Hai-Long Liu
- Department of Neurosurgery, Chinese PLA General Hospital, Beijing 100853, China
| | - Ya-Nan Wang
- Department of Pathology, Affiliated Hospital of Hebei University, Baoding 071000, Hebei Province, China
| | - Shi-Yu Feng
- Department of Neurosurgery, Chinese PLA General Hospital, Beijing 100853, China
| |
|
38
|
Dimitrov D, Gu Q. BingleSeq: a user-friendly R package for bulk and single-cell RNA-Seq data analysis. PeerJ 2020; 8:e10469. [PMID: 33391870 PMCID: PMC7761193 DOI: 10.7717/peerj.10469] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2020] [Accepted: 11/11/2020] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND RNA sequencing is an indispensable research tool used in a broad range of transcriptome analysis studies. The most common application of RNA sequencing is differential expression analysis, which is used to determine genetic loci with distinct expression across different conditions. An emerging field called single-cell RNA sequencing is used for transcriptome profiling at the individual cell level. The standard protocols for both of these approaches include the processing of sequencing libraries and result in the generation of count matrices. An obstacle to these analyses and the acquisition of meaningful results is that they require programming expertise. Although some effort has been directed toward the development of user-friendly RNA-Seq analysis tools, few have the flexibility to explore both bulk and single-cell RNA sequencing. IMPLEMENTATION BingleSeq was developed as an intuitive application that provides a user-friendly solution for the analysis of count matrices produced by both bulk and single-cell RNA-Seq experiments. This was achieved by building an interactive dashboard-like user interface which incorporates three state-of-the-art software packages for each type of the aforementioned analyses. Furthermore, BingleSeq includes additional features such as visualization techniques, extensive functional annotation analysis and rank-based consensus for differential gene analysis results. As a result, BingleSeq puts some of the best reviewed and most widely used packages and tools for RNA-Seq analyses at the fingertips of biologists with no programming experience. AVAILABILITY BingleSeq is an easy-to-install R package available on GitHub at https://github.com/dbdimitrov/BingleSeq/.
Affiliation(s)
- Daniel Dimitrov
- MRC-University of Glasgow Centre for Virus Research, University of Glasgow, Glasgow, UK
| | - Quan Gu
- MRC-University of Glasgow Centre for Virus Research, University of Glasgow, Glasgow, UK
| |
|
39
|
Andrews TS, Kiselev VY, McCarthy D, Hemberg M. Tutorial: guidelines for the computational analysis of single-cell RNA sequencing data. Nat Protoc 2020; 16:1-9. [PMID: 33288955 DOI: 10.1038/s41596-020-00409-w] [Citation(s) in RCA: 174] [Impact Index Per Article: 34.8] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2020] [Accepted: 09/08/2020] [Indexed: 01/01/2023]
Abstract
Single-cell RNA sequencing (scRNA-seq) is a popular and powerful technology that allows you to profile the whole transcriptome of a large number of individual cells. However, the analysis of the large volumes of data generated from these experiments requires specialized statistical and computational methods. Here we present an overview of the computational workflow involved in processing scRNA-seq data. We discuss some of the most common tasks and the tools available for addressing central biological questions. In this article and our companion website ( https://scrnaseq-course.cog.sanger.ac.uk/website/index.html ), we provide guidelines regarding best practices for performing computational analyses. This tutorial provides a hands-on guide for experimentalists interested in analyzing their data as well as an overview for bioinformaticians seeking to develop new computational methods.
Affiliation(s)
| | | | - Davis McCarthy
- Bioinformatics and Cellular Genomics, St Vincent's Institute of Medical Research, Fitzroy, Victoria, Australia.,Melbourne Integrative Genomics, Faculty of Science, University of Melbourne, Melbourne, Victoria, Australia
| | | |
|
40
|
Augsornworawat P, Millman JR. Single-cell RNA sequencing for engineering and studying human islets. CURRENT OPINION IN BIOMEDICAL ENGINEERING 2020; 16:27-33. [PMID: 33738370 PMCID: PMC7963276 DOI: 10.1016/j.cobme.2020.06.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
Abstract
The islets of Langerhans are complex tissues composed of several cell types that secrete hormones. Loss or dysfunction of the insulin-producing β cells leads to dysregulation of blood glucose levels, resulting in diabetes. A major goal in cellular engineering has been to generate β cells from stem cells for use in cell-based therapies. However, the presence of other cell types within these islets can mask important details about β cells when using population-level assays. Single-cell RNA sequencing technologies have enabled transcriptional assessment of individual cells within mixed populations. These technologies allow for accurate assessment of specific cell types and subtypes of β cells. Studies investigating different stages of β cell maturity have led to several insights into understanding islet development and diabetes pathology. Here, we highlight the key findings from the use of single-cell RNA sequencing on stem cell-derived and primary human islet cells found in different maturation and diabetic states.
Affiliation(s)
- Punn Augsornworawat
- Division of Endocrinology, Metabolism and Lipid Research, Washington University School of Medicine, Campus Box 8127, 660 South Euclid Avenue, St. Louis, MO 63110, USA
- Department of Biomedical Engineering, Washington University in St. Louis, 1 Brookings Drive, St. Louis, MO 63130, USA
| | - Jeffrey R. Millman
- Division of Endocrinology, Metabolism and Lipid Research, Washington University School of Medicine, Campus Box 8127, 660 South Euclid Avenue, St. Louis, MO 63110, USA
- Department of Biomedical Engineering, Washington University in St. Louis, 1 Brookings Drive, St. Louis, MO 63130, USA
| |
|
41
|
Crowell HL, Soneson C, Germain PL, Calini D, Collin L, Raposo C, Malhotra D, Robinson MD. muscat detects subpopulation-specific state transitions from multi-sample multi-condition single-cell transcriptomics data. Nat Commun 2020; 11:6077. [PMID: 33257685 PMCID: PMC7705760 DOI: 10.1038/s41467-020-19894-4] [Citation(s) in RCA: 232] [Impact Index Per Article: 46.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2020] [Accepted: 11/05/2020] [Indexed: 02/06/2023] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) has become an empowering technology to profile the transcriptomes of individual cells on a large scale. Early analyses of differential expression have aimed at identifying differences between subpopulations to identify subpopulation markers. More generally, such methods compare expression levels across sets of cells, thus leading to cross-condition analyses. Given the emergence of replicated multi-condition scRNA-seq datasets, an area of increasing focus is making sample-level inferences, termed here as differential state analysis; however, it is not clear which statistical framework best handles this situation. Here, we surveyed methods to perform cross-condition differential state analyses, including cell-level mixed models and methods based on aggregated pseudobulk data. To evaluate method performance, we developed a flexible simulation that mimics multi-sample scRNA-seq data. We analyzed scRNA-seq data from mouse cortex cells to uncover subpopulation-specific responses to lipopolysaccharide treatment, and provide robust tools for multi-condition analysis within the muscat R package.
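The aggregation step behind the pseudobulk methods mentioned in this abstract can be sketched as follows: raw counts are summed over all cells that share a sample and a subpopulation, so that downstream tests treat samples rather than cells as replicates. This is a generic Python illustration and not the muscat R API. The function name `pseudobulk`, the labels and the toy counts are assumptions.

```python
import numpy as np

def pseudobulk(counts, sample_ids, cluster_ids):
    """Sum a (cells x genes) count matrix within each (sample, cluster) pair."""
    out = {}
    for row, sample, cluster in zip(counts, sample_ids, cluster_ids):
        key = (sample, cluster)
        out[key] = out.get(key, 0) + row       # adds element-wise across genes
    return out

# Hypothetical toy data: 4 cells x 2 genes.
counts = np.array([[1, 0],
                   [2, 1],
                   [0, 3],
                   [4, 4]])
sample_ids = ["s1", "s1", "s1", "s2"]
cluster_ids = ["A", "A", "B", "A"]

pb = pseudobulk(counts, sample_ids, cluster_ids)
```

Each resulting vector is one pseudobulk profile, and profiles for the same cluster across samples and conditions can then be passed to a bulk differential expression method.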
Affiliation(s)
- Helena L Crowell
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- SIB Swiss Institute of Bioinformatics, Zurich, Switzerland
| | - Charlotte Soneson
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- SIB Swiss Institute of Bioinformatics, Zurich, Switzerland
- Friedrich Miescher Institute for Biomedical Research and SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Pierre-Luc Germain
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
- D-HEST Institute for Neuroscience, Swiss Federal Institute of Technology, Zurich, Switzerland
| | - Daniela Calini
- F. Hoffmann-La Roche Ltd., Pharma Research and Early Development, Neuroscience, Ophthalmology and Rare Diseases, Roche Innovation Center Basel, Basel, Switzerland
| | - Ludovic Collin
- F. Hoffmann-La Roche Ltd., Pharma Research and Early Development, Neuroscience, Ophthalmology and Rare Diseases, Roche Innovation Center Basel, Basel, Switzerland
| | - Catarina Raposo
- F. Hoffmann-La Roche Ltd., Pharma Research and Early Development, Neuroscience, Ophthalmology and Rare Diseases, Roche Innovation Center Basel, Basel, Switzerland
| | - Dheeraj Malhotra
- F. Hoffmann-La Roche Ltd., Pharma Research and Early Development, Neuroscience, Ophthalmology and Rare Diseases, Roche Innovation Center Basel, Basel, Switzerland
| | - Mark D Robinson
- Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland.
- SIB Swiss Institute of Bioinformatics, Zurich, Switzerland.
| |
|
42
|
Mohanraj S, Díaz-Mejía JJ, Pham MD, Elrick H, Husić M, Rashid S, Luo P, Bal P, Lu K, Patel S, Mahalanabis A, Naidas A, Christensen E, Croucher D, Richards LM, Shooshtari P, Brudno M, Ramani AK, Pugh TJ. CReSCENT: CanceR Single Cell ExpressioN Toolkit. Nucleic Acids Res 2020; 48:W372-W379. [PMID: 32479601 PMCID: PMC7319570 DOI: 10.1093/nar/gkaa437] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2020] [Revised: 04/28/2020] [Accepted: 05/12/2020] [Indexed: 01/10/2023] Open
Abstract
CReSCENT: CanceR Single Cell ExpressioN Toolkit (https://crescent.cloud), is an intuitive and scalable web portal incorporating a containerized pipeline execution engine for standardized analysis of single-cell RNA sequencing (scRNA-seq) data. While scRNA-seq data for tumour specimens are readily generated, subsequent analysis requires high-performance computing infrastructure and user expertise to build analysis pipelines and tailor interpretation for cancer biology. CReSCENT uses public data sets and preconfigured pipelines that are accessible to computational biology non-experts and are user-editable to allow optimization, comparison, and reanalysis for specific experiments. Users can also upload their own scRNA-seq data for analysis and results can be kept private or shared with other users.
Affiliation(s)
- Suluxan Mohanraj
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 0A3, Canada
| | - J Javier Díaz-Mejía
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 0A3, Canada
| | - Martin D Pham
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Hillary Elrick
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Mia Husić
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Shaikh Rashid
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Ping Luo
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 0A3, Canada
| | - Prabnur Bal
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Kevin Lu
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Samarth Patel
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Alaina Mahalanabis
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Alaine Naidas
- University of Western Ontario, London, ON N6A 3K7, Canada
| | | | - Danielle Croucher
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 0A3, Canada
| | - Laura M Richards
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 0A3, Canada
| | - Parisa Shooshtari
- University of Western Ontario, London, ON N6A 3K7, Canada.,Children's Health Research Institute, London, ON N6C 2R5, Canada.,Ontario Institute for Cancer Research, Toronto, ON M5G 0A3, Canada
| | - Michael Brudno
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada.,Techna Institute, University Health Network, Toronto, ON M5G 0A3, Canada.,Department of Computer Science, University of Toronto, Toronto, ON M5S 3K1, Canada
| | - Arun K Ramani
- Centre for Computational Medicine, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| | - Trevor J Pugh
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON M5G 0A3, Canada.,Ontario Institute for Cancer Research, Toronto, ON M5G 0A3, Canada.,Department of Medical Biophysics, University of Toronto, Toronto, ON M5S 3K1, Canada
| |
Collapse
|
43
|
Zhao X, Wu S, Fang N, Sun X, Fan J. Evaluation of single-cell classifiers for single-cell RNA sequencing data sets. Brief Bioinform 2020; 21:1581-1595. [PMID: 31675098 PMCID: PMC7947964 DOI: 10.1093/bib/bbz096] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 04/02/2019] [Revised: 07/06/2019] [Accepted: 07/08/2019] [Indexed: 12/19/2022] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) has been rapidly developing and widely applied in biological and medical research. Identification of cell types in scRNA-seq data sets is an essential step before in-depth investigations of their functional and pathological roles. However, the conventional workflow based on clustering and marker genes is not scalable for an increasingly large number of scRNA-seq data sets due to complicated procedures and manual annotation. Therefore, a number of tools have been developed recently to predict cell types in new data sets using reference data sets. These methods have not been generally adopted due to a lack of tool benchmarking and user guidance. In this article, we performed a comprehensive and impartial evaluation of nine classification software tools specifically designed for scRNA-seq data sets. Results showed that Seurat based on random forest, SingleR based on correlation analysis and CaSTLe based on XGBoost performed better than others. A simple ensemble voting of all tools can improve the predictive accuracy. Under nonideal situations, such as small-sized and class-imbalanced reference data sets, tools based on cluster-level similarities have superior performance. However, even with the function of assigning 'unassigned' labels, it is still challenging to catch novel cell types by solely using any of the single-cell classifiers. This article provides a guideline for researchers to select and apply suitable classification tools in their analysis workflows and sheds some light on potential directions for future improvement of classification tools.
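The "simple ensemble voting" mentioned in this abstract can be pictured as a per-cell majority vote over the labels returned by several classifiers. The sketch below is only an illustration under stated assumptions: the function name `majority_vote`, the toy labels, and the rule of returning 'unassigned' on a tie are hypothetical choices, not the procedure used in the paper.

```python
from collections import Counter


def majority_vote(predictions, unassigned="unassigned"):
    """Combine per-tool label lists into one consensus label per cell.

    predictions: list of lists, one inner list per tool, all the same length.
    Ties for the top label are reported as `unassigned` (an assumption made
    here for illustration).
    """
    n_cells = len(predictions[0])
    if any(len(p) != n_cells for p in predictions):
        raise ValueError("every tool must label the same number of cells")

    consensus = []
    for labels in zip(*predictions):  # labels for one cell across all tools
        counts = Counter(labels).most_common()
        top_label, top_count = counts[0]
        tied = len(counts) > 1 and counts[1][1] == top_count
        consensus.append(unassigned if tied else top_label)
    return consensus


# Hypothetical output of three tools for three cells.
tool_a = ["T cell", "B cell", "NK cell"]
tool_b = ["T cell", "B cell", "T cell"]
tool_c = ["T cell", "monocyte", "B cell"]
consensus = majority_vote([tool_a, tool_b, tool_c])
```

Treating ties as 'unassigned' keeps the vote conservative; a weighted vote based on each tool's benchmark accuracy would be an alternative design.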
Affiliation(s)
- Xinlei Zhao
  - State Key Laboratory of Bioelectronics, Biomedical Engineering School, Southeast University, Nanjing 210096, China
  - Singleron Biotechnologies, Nanjing 211800, China
- Shuang Wu
  - Singleron Biotechnologies, Nanjing 211800, China
- Nan Fang
  - State Key Laboratory of Bioelectronics, Biomedical Engineering School, Southeast University, Nanjing 210096, China
- Xiao Sun
  - State Key Laboratory of Bioelectronics, Biomedical Engineering School, Southeast University, Nanjing 210096, China
- Jue Fan
  - Singleron Biotechnologies, Nanjing 211800, China
|
44
|
Wang X, Sun Z, Zhang Y, Xu Z, Xin H, Huang H, Duerr RH, Chen K, Ding Y, Chen W. BREM-SC: a bayesian random effects mixture model for joint clustering single cell multi-omics data. Nucleic Acids Res 2020; 48:5814-5824. [PMID: 32379315 PMCID: PMC7293045 DOI: 10.1093/nar/gkaa314] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 01/07/2020] [Revised: 03/17/2020] [Accepted: 04/20/2020] [Indexed: 12/25/2022] Open
Abstract
Droplet-based single cell transcriptome sequencing (scRNA-seq) technology, largely represented by the 10× Genomics Chromium system, is able to measure the gene expression from tens of thousands of single cells simultaneously. More recently, coupled with the cutting-edge Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-seq), the droplet-based system has allowed for immunophenotyping of single cells based on cell surface expression of specific proteins together with simultaneous transcriptome profiling in the same cell. Despite the rapid advances in technologies, novel statistical methods and computational tools for analyzing multi-modal CITE-Seq data are lacking. In this study, we developed BREM-SC, a novel Bayesian Random Effects Mixture model that jointly clusters paired single cell transcriptomic and proteomic data. Through simulation studies and analysis of public and in-house real data sets, we successfully demonstrated the validity and advantages of this method in fully utilizing both types of data to accurately identify cell clusters. In addition, as a probabilistic model-based approach, BREM-SC is able to quantify the clustering uncertainty for each single cell. This new method will greatly facilitate researchers to jointly study transcriptome and surface proteins at the single cell level to make new biological discoveries, particularly in the area of immunology.
Affiliation(s)
- Xinjun Wang
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
- Zhe Sun
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
- Yanfu Zhang
  - Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA, USA
- Zhongli Xu
  - Department of Pediatrics, University of Pittsburgh, Pittsburgh, PA, USA
  - School of Medicine, Tsinghua University, Beijing, China
- Hongyi Xin
  - Department of Pediatrics, University of Pittsburgh, Pittsburgh, PA, USA
- Heng Huang
  - Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA, USA
- Richard H Duerr
  - Department of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Kong Chen
  - Department of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Ying Ding
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
- Wei Chen
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Pediatrics, University of Pittsburgh, Pittsburgh, PA, USA
|
45
|
Germain PL, Sonrel A, Robinson MD. pipeComp, a general framework for the evaluation of computational pipelines, reveals performant single cell RNA-seq preprocessing tools. Genome Biol 2020; 21:227. [PMID: 32873325 PMCID: PMC7465801 DOI: 10.1186/s13059-020-02136-7] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Received: 03/02/2020] [Accepted: 08/06/2020] [Indexed: 11/13/2022] Open
Abstract
We present pipeComp ( https://github.com/plger/pipeComp ), a flexible R framework for pipeline comparison handling interactions between analysis steps and relying on multi-level evaluation metrics. We apply it to the benchmark of single-cell RNA-sequencing analysis pipelines using simulated and real datasets with known cell identities, covering common methods of filtering, doublet detection, normalization, feature selection, denoising, dimensionality reduction, and clustering. pipeComp can easily integrate any other step, tool, or evaluation metric, allowing extensible benchmarks and easy applications to other fields, as we demonstrate through a study of the impact of removal of unwanted variation on differential expression analysis.
Affiliation(s)
- Pierre-Luc Germain
  - Department of Molecular Life Sciences, University of Zürich, Winterthurerstrasse 190, Zürich, 8057 Switzerland
  - SIB Swiss Institute of Bioinformatics, Zürich, Switzerland
  - D-HEST Institute for Neurosciences, ETH Zürich, Winterthurerstrasse 190, Zürich, 8057 Switzerland
- Anthony Sonrel
  - Department of Molecular Life Sciences, University of Zürich, Winterthurerstrasse 190, Zürich, 8057 Switzerland
  - SIB Swiss Institute of Bioinformatics, Zürich, Switzerland
- Mark D. Robinson
  - Department of Molecular Life Sciences, University of Zürich, Winterthurerstrasse 190, Zürich, 8057 Switzerland
  - SIB Swiss Institute of Bioinformatics, Zürich, Switzerland
|
46
|
Kim TH, Zhou X, Chen M. Demystifying "drop-outs" in single-cell UMI data. Genome Biol 2020; 21:196. [PMID: 32762710 PMCID: PMC7412673 DOI: 10.1186/s13059-020-02096-y] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Received: 04/21/2020] [Accepted: 07/08/2020] [Indexed: 01/10/2023] Open
Abstract
Many existing pipelines for scRNA-seq data apply pre-processing steps such as normalization or imputation to account for excessive zeros or "drop-outs." Here, we extensively analyze diverse UMI data sets to show that clustering should be the foremost step of the workflow. We observe that most drop-outs disappear once cell-type heterogeneity is resolved, while imputing or normalizing heterogeneous data can introduce unwanted noise. We propose a novel framework HIPPO (Heterogeneity-Inspired Pre-Processing tOol) that leverages zero proportions to explain cellular heterogeneity and integrates feature selection with iterative clustering. HIPPO leads to downstream analysis with greater flexibility and interpretability compared to alternatives.
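One way to read "leverages zero proportions to explain cellular heterogeneity" is to compare each gene's observed fraction of zero counts with the fraction expected if all cells shared one expression level. The sketch below assumes a Poisson model, under which the expected zero proportion is exp(-mean); the function name `zero_excess` and the toy counts are hypothetical, and this is not the HIPPO implementation.

```python
import math


def zero_excess(counts_per_gene):
    """Return observed minus expected zero proportion for each gene.

    counts_per_gene: dict mapping gene name -> list of UMI counts per cell.
    The expected proportion assumes a single Poisson rate per gene, so a
    large positive excess hints that the cells are a mixture of types.
    """
    excess = {}
    for gene, counts in counts_per_gene.items():
        if not counts:
            raise ValueError(f"no cells for gene {gene!r}")
        n_cells = len(counts)
        mean = sum(counts) / n_cells
        observed = sum(1 for c in counts if c == 0) / n_cells
        expected = math.exp(-mean)  # Poisson probability of a zero count
        excess[gene] = observed - expected
    return excess


# Toy data: one gene expressed evenly, one expressed in only half the cells.
toy_counts = {
    "even_gene": [1, 1, 1, 1, 1, 1, 1, 1],
    "mixed_gene": [0, 0, 0, 0, 8, 8, 8, 8],
}
excess = zero_excess(toy_counts)
```

In this toy case the mixed gene shows far more zeros than a single Poisson rate predicts, which is the kind of signal that could guide feature selection before clustering.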
Affiliation(s)
- Tae Hyun Kim
  - Department of Statistics, University of Chicago, Chicago, USA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, USA
- Mengjie Chen
  - Department of Human Genetics and Department of Medicine, University of Chicago, Chicago, USA
|
47
|
Chen L, Zhai Y, He Q, Wang W, Deng M. Integrating Deep Supervised, Self-Supervised and Unsupervised Learning for Single-Cell RNA-seq Clustering and Annotation. Genes (Basel) 2020; 11:E792. [PMID: 32674393 PMCID: PMC7397036 DOI: 10.3390/genes11070792] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 05/30/2020] [Revised: 06/29/2020] [Accepted: 07/08/2020] [Indexed: 12/31/2022] Open
Abstract
As single-cell RNA sequencing technologies mature, massive gene expression profiles can be obtained. Consequently, cell clustering and annotation become two crucial and fundamental procedures affecting other specific downstream analyses. Most existing single-cell RNA-seq (scRNA-seq) data clustering algorithms do not take into account the available cell annotation results on the same tissues or organisms from other laboratories. Nonetheless, such data could assist and guide the clustering process on the target dataset. Identifying marker genes through differential expression analysis to manually annotate large amounts of cells also costs labor and resources. Therefore, in this paper, we propose a novel end-to-end cell supervised clustering and annotation framework called scAnCluster, which fully utilizes the cell type labels available from reference data to facilitate the cell clustering and annotation on the unlabeled target data. Our algorithm integrates deep supervised learning, self-supervised learning and unsupervised learning techniques together, and it outperforms other customized scRNA-seq supervised clustering methods in both simulation and real data. It is particularly worth noting that our method performs well on the challenging task of discovering novel cell types that are absent in the reference data.
Affiliation(s)
- Liang Chen
  - School of Mathematical Sciences, Peking University, Beijing 100871, China
- Yuyao Zhai
  - Mathematical and Statistical Institute, Northeast Normal University, Changchun 130024, China
- Qiuyan He
  - School of Mathematical Sciences, Peking University, Beijing 100871, China
- Weinan Wang
  - School of Mathematical Sciences, Peking University, Beijing 100871, China
- Minghua Deng
  - School of Mathematical Sciences, Peking University, Beijing 100871, China
  - Center for Quantitative Biology, Peking University, Beijing 100871, China
  - Center for Statistical Science, Peking University, Beijing 100871, China
|
48
|
Wilk AJ, Rustagi A, Zhao NQ, Roque J, Martínez-Colón GJ, McKechnie JL, Ivison GT, Ranganath T, Vergara R, Hollis T, Simpson LJ, Grant P, Subramanian A, Rogers AJ, Blish CA. A single-cell atlas of the peripheral immune response in patients with severe COVID-19. Nat Med 2020; 26:1070-1076. [PMID: 32514174 PMCID: PMC7382903 DOI: 10.1038/s41591-020-0944-y] [Citation(s) in RCA: 1134] [Impact Index Per Article: 226.8] [Received: 04/17/2020] [Accepted: 05/19/2020] [Indexed: 02/08/2023]
Abstract
There is an urgent need to better understand the pathophysiology of Coronavirus disease 2019 (COVID-19), the global pandemic caused by SARS-CoV-2, which has infected more than three million people worldwide [1]. Approximately 20% of patients with COVID-19 develop severe disease and 5% of patients require intensive care [2]. Severe disease has been associated with changes in peripheral immune activity, including increased levels of pro-inflammatory cytokines [3,4] that may be produced by a subset of inflammatory monocytes [5,6], lymphopenia [7,8] and T cell exhaustion [9,10]. To elucidate pathways in peripheral immune cells that might lead to immunopathology or protective immunity in severe COVID-19, we applied single-cell RNA sequencing (scRNA-seq) to profile peripheral blood mononuclear cells (PBMCs) from seven patients hospitalized for COVID-19, four of whom had acute respiratory distress syndrome, and six healthy controls. We identify reconfiguration of peripheral immune cell phenotype in COVID-19, including a heterogeneous interferon-stimulated gene signature, HLA class II downregulation and a developing neutrophil population that appears closely related to plasmablasts appearing in patients with acute respiratory failure requiring mechanical ventilation. Importantly, we found that peripheral monocytes and lymphocytes do not express substantial amounts of pro-inflammatory cytokines. Collectively, we provide a cell atlas of the peripheral immune response to severe COVID-19.
MESH Headings
- Adult
- Aged
- Aged, 80 and over
- Betacoronavirus/immunology
- COVID-19
- Case-Control Studies
- Coronavirus Infections/genetics
- Coronavirus Infections/immunology
- Coronavirus Infections/pathology
- Cytokines/genetics
- Cytokines/metabolism
- Female
- Gene Expression Profiling/methods
- Humans
- Immunity, Cellular
- Killer Cells, Natural/immunology
- Killer Cells, Natural/metabolism
- Leukocytes, Mononuclear/immunology
- Leukocytes, Mononuclear/metabolism
- Leukocytes, Mononuclear/virology
- Male
- Middle Aged
- Pandemics
- Pneumonia, Viral/genetics
- Pneumonia, Viral/immunology
- Pneumonia, Viral/pathology
- RNA-Seq/methods
- SARS-CoV-2
- Sequence Analysis, RNA/methods
- Severity of Illness Index
- Single-Cell Analysis/methods
- T-Lymphocytes/immunology
- T-Lymphocytes/metabolism
- Young Adult
Affiliation(s)
- Aaron J Wilk
  - Stanford Medical Scientist Training Program, Stanford University School of Medicine, Stanford, CA, USA
  - Stanford Immunology Program, Stanford University School of Medicine, Stanford, CA, USA
- Arjun Rustagi
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Nancy Q Zhao
  - Stanford Immunology Program, Stanford University School of Medicine, Stanford, CA, USA
- Jonasel Roque
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Julia L McKechnie
  - Stanford Immunology Program, Stanford University School of Medicine, Stanford, CA, USA
- Geoffrey T Ivison
  - Stanford Immunology Program, Stanford University School of Medicine, Stanford, CA, USA
- Thanmayi Ranganath
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Rosemary Vergara
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Taylor Hollis
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Laura J Simpson
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Philip Grant
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Aruna Subramanian
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Angela J Rogers
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Catherine A Blish
  - Stanford Medical Scientist Training Program, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
|
49
|
Hou R, Denisenko E, Forrest ARR. scMatch: a single-cell gene expression profile annotation tool using reference datasets. Bioinformatics 2020; 35:4688-4695. [PMID: 31028376 PMCID: PMC6853649 DOI: 10.1093/bioinformatics/btz292] [Citation(s) in RCA: 72] [Impact Index Per Article: 14.4] [Received: 12/06/2018] [Revised: 03/28/2019] [Accepted: 04/21/2019] [Indexed: 12/17/2022] Open
Abstract
MOTIVATION Single-cell RNA sequencing (scRNA-seq) measures gene expression at the resolution of individual cells. Massively multiplexed single-cell profiling has enabled large-scale transcriptional analyses of thousands of cells in complex tissues. In most cases, the true identity of individual cells is unknown and needs to be inferred from the transcriptomic data. Existing methods typically cluster (group) cells based on similarities of their gene expression profiles and assign the same identity to all cells within each cluster using the averaged expression levels. However, scRNA-seq experiments typically produce low-coverage sequencing data for each cell, which hinders the clustering process. RESULTS We introduce scMatch, which directly annotates single cells by identifying their closest match in large reference datasets. We used this strategy to annotate various single-cell datasets and evaluated the impacts of sequencing depth, similarity metric and reference datasets. We found that scMatch can rapidly and robustly annotate single cells with comparable accuracy to another recent cell annotation tool (SingleR), but that it is quicker and can handle larger reference datasets. We demonstrate how scMatch can handle large customized reference gene expression profiles that combine data from multiple sources, thus empowering researchers to identify cell populations in any complex tissue with the desired precision. AVAILABILITY AND IMPLEMENTATION scMatch (Python code) and the FANTOM5 reference dataset are freely available to the research community here https://github.com/forrest-lab/scMatch. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
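The idea of annotating a cell "by identifying their closest match in large reference datasets" can be sketched as a nearest-profile lookup. The example below is a minimal illustration, assuming Pearson correlation as the similarity metric and tiny made-up reference profiles; the names `pearson`, `annotate_cell`, and `reference_profiles` are hypothetical and do not reflect the scMatch code base.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant vector carries no correlation signal
    return cov / (sd_x * sd_y)


def annotate_cell(cell, reference):
    """Return (label, score) of the reference profile most similar to `cell`."""
    best_label, best_score = None, -math.inf
    for label, profile in reference.items():
        score = pearson(cell, profile)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Made-up reference profiles over the same four genes, in the same order.
reference_profiles = {
    "T cell": [9, 1, 0, 2],
    "B cell": [0, 8, 7, 1],
}
label, score = annotate_cell([5, 0, 0, 1], reference_profiles)
```

A rank-based metric such as Spearman correlation would be a natural alternative for sparse, low-coverage cells, since the abstract notes that the choice of similarity metric was evaluated.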
Affiliation(s)
- Rui Hou
  - Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Nedlands, Perth, WA 6009, Australia
- Elena Denisenko
  - Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Nedlands, Perth, WA 6009, Australia
- Alistair R R Forrest
  - Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Nedlands, Perth, WA 6009, Australia
|
50
|
Lin Y, Cao Y, Kim HJ, Salim A, Speed TP, Lin DM, Yang P, Yang JYH. scClassify: sample size estimation and multiscale classification of cells using single and multiple reference. Mol Syst Biol 2020; 16:e9389. [PMID: 32567229 PMCID: PMC7306901 DOI: 10.15252/msb.20199389] [Citation(s) in RCA: 76] [Impact Index Per Article: 15.2] [Received: 12/03/2019] [Revised: 05/22/2020] [Accepted: 05/26/2020] [Indexed: 12/26/2022] Open
Abstract
Automated cell type identification is a key computational challenge in single-cell RNA-sequencing (scRNA-seq) data. To capitalise on the large collection of well-annotated scRNA-seq datasets, we developed scClassify, a multiscale classification framework based on ensemble learning and cell type hierarchies constructed from single or multiple annotated datasets as references. scClassify enables the estimation of sample size required for accurate classification of cell types in a cell type hierarchy and allows joint classification of cells when multiple references are available. We show that scClassify consistently performs better than other supervised cell type classification methods across 114 pairs of reference and testing data, representing a diverse combination of sizes, technologies and levels of complexity, and further demonstrate the unique components of scClassify through simulations and compendia of experimental datasets. Finally, we demonstrate the scalability of scClassify on large single-cell atlases and highlight a novel application of identifying subpopulations of cells from the Tabula Muris data that were unidentified in the original publication. Together, scClassify represents state-of-the-art methodology in automated cell type identification from scRNA-seq data.
Affiliation(s)
- Yingxin Lin
  - School of Mathematics and Statistics, University of Sydney, Sydney, NSW, Australia
  - Charles Perkins Centre, University of Sydney, Sydney, NSW, Australia
- Yue Cao
  - School of Mathematics and Statistics, University of Sydney, Sydney, NSW, Australia
  - Charles Perkins Centre, University of Sydney, Sydney, NSW, Australia
- Hani Jieun Kim
  - School of Mathematics and Statistics, University of Sydney, Sydney, NSW, Australia
  - Charles Perkins Centre, University of Sydney, Sydney, NSW, Australia
  - Computational Systems Biology Group, Children's Medical Research Institute, University of Sydney, Westmead, NSW, Australia
- Agus Salim
  - Department of Mathematics and Statistics, La Trobe University, Bundoora, VIC, Australia
  - Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
  - Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Terence P Speed
  - Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- David M Lin
  - Department of Biomedical Sciences, Cornell University, Ithaca, NY, USA
- Pengyi Yang
  - School of Mathematics and Statistics, University of Sydney, Sydney, NSW, Australia
  - Charles Perkins Centre, University of Sydney, Sydney, NSW, Australia
  - Computational Systems Biology Group, Children's Medical Research Institute, University of Sydney, Westmead, NSW, Australia
- Jean Yee Hwa Yang
  - School of Mathematics and Statistics, University of Sydney, Sydney, NSW, Australia
  - Charles Perkins Centre, University of Sydney, Sydney, NSW, Australia
|