Ulcerative colitis (UC) is a chronic nonspecific intestinal inflammatory disease with persistent or recurrent episodes of diarrhea, mucopurulent bloody stool, abdominal pain, and tenesmus. The course of UC often alternates between active and remission phases and requires constant maintenance therapy, which can place substantial psychological, physical and financial burdens on patients and families. In China, the incidence of UC has been on the rise in recent years, and the recurrent nature of the disease reduces the quality of life of UC patients. The current goals for UC control include a clinical response, clinical remission, normalization of the C-reactive protein (CRP) and fecal calprotectin levels, endoscopic healing, and normalization of the quality of life without disability, and future goals should include histological healing. 5-Aminosalicylic acid, glucocorticoids, and immunosuppressants are the main therapeutic agents for UC. Anti-tumor necrosis factor-α (TNF-α) drugs, anti-integrin monoclonal antibodies (mAbs), anti-interleukin-12/23 (IL-12/23) mAbs, Janus kinase (JAK) inhibitors, and sphingosine-1-phosphate receptor (S1PR) modulators have also been approved for UC treatment, but some patients stop responding to these treatments. Delayed diagnosis of UC is one of the risk factors for adverse outcomes, and early diagnosis is critical for obtaining the best treatment response rates. Selection of the drug for initial treatment is vital, and timely use of immunosuppressive or biological agents is associated with better outcomes. Therefore, identification of clinical phenotypes and biomarkers to predict the response of patients to specific therapeutic drugs has important implications for the diagnosis and treatment of UC.
Single-cell RNA sequencing (scRNA-seq) can increase understanding of the development of intestinal diseases at the individual cell level. ScRNA-seq can be used to rapidly obtain the precise gene expression patterns of thousands of cells in the intestine, analyze the characteristics of cells with the same phenotype, and provide new insights into the growth and development of intestinal organs, the clonal evolution of cells, and immune cell changes. These findings can provide new ideas for the diagnosis and treatment of intestinal diseases. Li et al found that UC risk-related genes are enriched in progenitor cells, glial cells and immune cells and showed consistently altered expression in immune cells of both inflammatory and noninflammatory tissues, whereas drug target genes were differentially expressed in antigen-presenting cells. T helper 17 (Th17)-cell activation was observed in both the epithelial cell lineage and immune cell lineage of UC patients, indicating systematic changes in Th17-driven immune activity. Devlin et al found that nonresponsiveness to anti-integrin biologic therapies in patients with pouchitis and UC after ileal pouch anal anastomosis was associated with the IL-1β+LYZ+ myeloid cell signature in a subset of patients, and these results may provide biomarkers for personalized therapy for patients with UC. The detection of peripheral blood-related indicators in patients with UC also provides an objective basis for the diagnosis and treatment of disease. Compared with colonoscopy, these detection methods taking advantage of such indicators are more convenient, and patients are more likely to agree to undergoing these analyses. Gryglewski et al found that patients with progressive UC have reduced γδ T cells in the peripheral blood, increased αβ/γδ T-cell ratios, and significantly increased percentages of γδTCD25, γδTCD54, and γδTCD62L lymphocytes compared with patients with stable UC. 
These observations might provide markers for predicting exacerbations in UC patients. Furukawa et al suggested that the peripheral blood monocyte count could be used as a supplemental serum marker of mucosal healing in UC patients with low CRP levels.
Based on the above research, we analyzed peripheral blood cell subtypes of UC patients by scRNA-seq combined with bulk RNA sequencing (RNA-seq) using the Gene Expression Omnibus (GEO) database, revealed markers of UC cell subtypes, and performed weighted gene correlation network analysis (WGCNA) and least absolute shrinkage and selection operator (LASSO) analysis to reveal diagnostic markers of UC with the aim of providing experimental research ideas and a theoretical basis for the discovery of new UC molecular mechanisms and therapeutic drugs.
MATERIALS AND METHODS
Data collection and preprocessing
The GEO database (https://www.ncbi.nlm.nih.gov/gds/) is a public repository for microarray, next-generation sequencing and other high-throughput data. The GSE125527 dataset, annotated with the GPL24676 platform, was downloaded from the NCBI GEO public database, and the data from 15 peripheral blood mononuclear cell (PBMC) samples with complete expression profiles (7 samples from UC patients and 8 normal samples) were used for analysis. The Series Matrix File of GSE3365 (annotation platform GPL96) was downloaded, and the data from 68 UC patients with complete expression profiles were used for this analysis. The Series Matrix File of GSE126124 (annotation platform GPL96) was downloaded, and the data from 57 UC patients with complete expression profiles were used for this analysis.
ScRNA-seq data analysis
First, the expression profiles were processed using the Seurat package, and low-quality cells were filtered out (nFeature_RNA > 50 and percent.mt < 5). The data were successively processed by normalization, homogenization, principal component analysis (PCA), and uniform manifold approximation and projection (UMAP) analysis. The optimal number of principal components (PCs) was determined with ElbowPlot, and the positional relationship between the clusters was obtained by t-distributed stochastic neighbor embedding (t-SNE) analysis. Each cluster was annotated with the MonacoImmuneData reference from the celldex package, which provides annotation for some cells that are important for disease occurrence. Finally, we extracted the marker genes for each cell subtype from the single-cell expression profile by setting the logfc.threshold parameter of FindAllMarkers to 1. Genes with |avg_log2FC| > 1 and adjusted P < 0.05 were identified as specific marker genes for each cell subtype.
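The two threshold filters described above can be illustrated with a minimal Python sketch. This is not the Seurat workflow used in the study; the cell and gene records are hypothetical, and the field names (nFeature_RNA, percent_mt, avg_log2FC, p_val_adj) are chosen only to mirror the quantities named in the text.

```python
# Hypothetical per-cell quality-control summaries (illustrative values only).
cells = [
    {"barcode": "cell_1", "nFeature_RNA": 1200, "percent_mt": 2.1},
    {"barcode": "cell_2", "nFeature_RNA": 40, "percent_mt": 1.0},   # too few detected genes
    {"barcode": "cell_3", "nFeature_RNA": 900, "percent_mt": 7.5},  # high mitochondrial share
]


def passes_qc(cell, min_features=50, max_percent_mt=5.0):
    """Keep a cell only if it exceeds the gene-count floor and stays under the mitochondrial cap."""
    return cell["nFeature_RNA"] > min_features and cell["percent_mt"] < max_percent_mt


kept_cells = [c["barcode"] for c in cells if passes_qc(c)]

# Hypothetical differential-expression rows for candidate marker genes.
candidates = [
    {"gene": "GENE_A", "avg_log2FC": 1.8, "p_val_adj": 1e-6},
    {"gene": "GENE_B", "avg_log2FC": 0.4, "p_val_adj": 1e-9},   # fold change too small
    {"gene": "GENE_C", "avg_log2FC": -1.3, "p_val_adj": 0.20},  # not significant
]


def is_marker(row, min_abs_log2fc=1.0, max_p_adj=0.05):
    """Apply the |avg log2FC| > 1 and adjusted P < 0.05 criteria."""
    return abs(row["avg_log2FC"]) > min_abs_log2fc and row["p_val_adj"] < max_p_adj


marker_genes = [r["gene"] for r in candidates if is_marker(r)]

print(kept_cells)
print(marker_genes)
```

Both filters are strict inequalities, matching the way the thresholds are written in the text.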
Gene ontology and Kyoto Encyclopedia of Genes and Genomes analyses
Functional annotation of target genes was performed using clusterProfiler to fully explore their functional relevance. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were conducted to assess relevant functional categories. The terms with P and Q values less than 0.05 identified from the GO and KEGG pathway enrichment analyses were considered significant.
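Over-representation analyses of this kind are commonly based on a hypergeometric test. The plain-Python sketch below shows that calculation for a single term. It is an illustration under that assumption, not a description of clusterProfiler internals; the function name and the gene counts are hypothetical.

```python
from math import comb


def enrichment_p_value(background, in_term, in_list, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    background: number of genes in the background universe
    in_term:    number of background genes annotated to the term
    in_list:    number of genes in the candidate list
    overlap:    number of candidate genes annotated to the term
    """
    total = comb(background, in_list)
    upper = min(in_term, in_list)
    tail = sum(
        comb(in_term, i) * comb(background - in_term, in_list - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Toy example: 10 background genes, 3 in the term, 4 in the list, 2 shared.
# Tail = C(3,2)*C(7,2) + C(3,3)*C(7,1) = 63 + 7 = 70, out of C(10,4) = 210.
p = enrichment_p_value(background=10, in_term=3, in_list=4, overlap=2)
print(p)
```

In practice the resulting P values would still need multiple-testing correction across all tested terms before a cutoff such as 0.05 is applied.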
Analysis of ligand–receptor interactions (CellChat)
CellChat is a tool that enables quantitative inference and analysis of intercellular communication networks from single-cell data. The main cell input and output signals are assessed, and the coordinate functions of these cells and signals are identified and analyzed using the CellChat network. In this analysis, we used the standardized single-cell expression profile as the input data, obtained the cell subtypes by single-cell analysis, used these subtypes as the cell information, analyzed the interactions related to signaling, and quantified the closeness of the interaction relationship based on the interaction weights and counts between cells to observe the activity and influence of each type of cell in the disease.
Construction of the diagnostic model
The target genes were selected, and LASSO regression was used to further construct the diagnostic model. After inclusion of the expression values for each specific gene, the risk score formula was constructed based on the data from each patient and weighted with the regression coefficients estimated from the LASSO regression analysis. According to the risk score formula, the patients were divided into high- and low-score groups using the median risk score as the cutoff point, and a receiver operating characteristic (ROC) curve was used to study the predictive accuracy of the model. The model formula was as follows: TNFAIP2 * (-0.141) + CXCR4 * (-0.073) + TCIRG1 * (-0.061) + CTSS * (-0.040) + LGALS1 * (-0.036) + AES * (-0.030) + CD247 * (-0.029) + SDCBP * (-0.028) + PLXDC2 * (-0.024) + ZFP36L2 * (-0.023) + EIF4EBP1 * (-0.020) + RAB11FIP1 * (-0.016) + LRP1 * (-0.016) + DMXL2 * (-0.012) + CCDC88A * (-0.007) + MS4A6A * (-0.004) + CALM1 * (-0.003) + STX11 * (-0.01) + OAZ1 * (-1.793) + RHOB * 0.0008 + PLAUR * 0.001 + STXBP2 * 0.002 + FPR1 * 0.004 + RAC1 * 0.005 + CCL5 * 0.007 + GSTP1 * 0.008 + CSF3R * 0.008 + FTH1 * 0.011 + GABARAP * 0.013 + SYNE2 * 0.026 + S100A11 * 0.028 + IFITM1 * 0.028 + KLF4 * 0.031 + CTSD * 0.033 + RNF130 * 0.041 + LTB * 0.043 + SPI1 * 0.055 + NR4A1 * 0.055 + SARAF * 0.065 + ZYX * 0.075 + SAT1 * 0.081 + TNFRSF1B * 0.088 + GPX1 * 0.109.
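A minimal Python sketch of how such a risk score and median split could be computed is given below. Only four of the 43 coefficients are copied from the formula, to keep the example short. The patient identifiers and expression values are hypothetical, and assigning scores strictly above the median to the high-score group is an assumption, since the text does not state how ties at the cutoff are handled.

```python
from statistics import median

# Four coefficients taken from the model formula; the full model has 43 terms.
coefficients = {"TNFAIP2": -0.141, "CXCR4": -0.073, "ZYX": 0.075, "GPX1": 0.109}

# Hypothetical expression values for four hypothetical patients.
patients = {
    "P1": {"TNFAIP2": 8.0, "CXCR4": 9.0, "ZYX": 7.0, "GPX1": 10.0},
    "P2": {"TNFAIP2": 10.0, "CXCR4": 9.0, "ZYX": 6.0, "GPX1": 8.0},
    "P3": {"TNFAIP2": 7.0, "CXCR4": 8.0, "ZYX": 8.0, "GPX1": 11.0},
    "P4": {"TNFAIP2": 9.0, "CXCR4": 10.0, "ZYX": 7.0, "GPX1": 9.0},
}


def risk_score(expression, coefs):
    """Weighted sum of expression values, one term per gene in the model."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
cutoff = median(scores.values())

# Assumption: scores strictly above the median form the high-score group.
groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

for pid in sorted(scores):
    print(pid, round(scores[pid], 3), groups[pid])
```

Extending the sketch to the full model only requires adding the remaining gene-coefficient pairs to the dictionary.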
Construction of the WGCNA coexpression network
We constructed weighted gene coexpression networks to search for coexpressed gene modules and explored the association between gene networks and phenotypes and the core genes in the networks. The WGCNA R package was used to construct a coexpression network of the 5000 genes with the highest variance in the GSE3365 dataset based on a soft-thresholding power of 3. The weighted adjacency matrix was transformed into a topological overlap matrix (TOM) to estimate the network connectivity, and the agglomerative hierarchical clustering method was applied to construct the clustering tree structure of the TOM. Different branches of the clustering tree represent different gene modules, and different colors indicate different modules. Based on their weighted correlation coefficients, genes were classified according to their expression patterns; genes with similar patterns were grouped into one module, and all the genes were divided into multiple modules based on gene expression patterns.
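The two matrix steps described above can be sketched in plain Python as follows. The sketch assumes an unsigned network, in which the adjacency is the absolute Pearson correlation raised to the soft-thresholding power, and the standard topological overlap definition. The study itself used the WGCNA R package, and the gene names and expression values here are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def adjacency(expr, beta=3):
    """Unsigned weighted adjacency: |cor(i, j)| ** beta, with 1 on the diagonal."""
    genes = list(expr)
    return {
        g: {h: 1.0 if g == h else abs(pearson(expr[g], expr[h])) ** beta for h in genes}
        for g in genes
    }


def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij), with u != i, j."""
    genes = list(adj)
    k = {g: sum(adj[g][u] for u in genes if u != g) for g in genes}
    tom = {}
    for i in genes:
        tom[i] = {}
        for j in genes:
            if i == j:
                tom[i][j] = 1.0
                continue
            shared = sum(adj[i][u] * adj[u][j] for u in genes if u not in (i, j))
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1 - adj[i][j])
    return tom


# Hypothetical expression values for four genes across four samples.
expr = {
    "G1": [1.0, 2.0, 3.0, 4.0],
    "G2": [2.0, 4.0, 6.0, 8.0],
    "G3": [4.0, 3.0, 2.0, 1.0],
    "G4": [1.0, 3.0, 2.0, 4.0],
}
adj = adjacency(expr, beta=3)
tom = topological_overlap(adj)
print(round(adj["G1"]["G4"], 3))
```

Hierarchical clustering would then be run on the dissimilarity 1 - TOM to obtain the module dendrogram.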
Estimation of immune cell infiltration
The CIBERSORT method is widely used for the evaluation of immune cell types in a microenvironment. Based on the principle of support vector regression, this method deconvolves the expression matrix into immune cell subtypes using 547 biomarkers that distinguish 22 human immune cell phenotypes, including T cell, B cell, plasma cell, and myeloid cell subsets. RNA-seq data from patients with UC were analyzed using the CIBERSORT algorithm to infer the relative proportions of 22 infiltrating immune cell types, and Pearson correlation analysis of gene expression and immune cell content was performed. P < 0.05 was considered to indicate a significant difference.
Gene set variation analysis
Gene set variation analysis (GSVA) is a nonparametric unsupervised method for assessing the enrichment of transcriptome genes. GSVA translates gene-level changes into pathway-level changes by combining gene sets of interest and thus determines the biological functions of genes enriched in samples. In this study, we downloaded gene sets from the Molecular Signatures Database (version v7.0) and used GSVA to comprehensively score each gene set and thus evaluate the potential changes in biological functions of different samples.
Statistical analyses were performed using R language (R version 4.1.2). All statistical tests were two-sided, and P < 0.05 was considered to indicate statistical significance.
RESULTS
Single-cell analysis of the scRNA-seq data
GSE125527 contains 103 samples. In this analysis, PBMC sample data from 7 patients with UC were extracted and preliminarily screened using nFeature_RNA and percent.mt (nFeature_RNA > 50 and percent.mt < 5) (Figure 1A and B). Ultimately, data from 24340 cells were included, and the 10 genes with the highest standard deviation are displayed in Figure 1C. PCA showed that the batch effect between samples was not obvious (Figure 2A), and the optimal number of PCs was determined to be 15 (Figure 2B). A total of 17 clusters were ultimately obtained through t-SNE analysis (Figure 2C). The marker genes with the most obviously different expression levels among subtypes are shown (Figure 2D).
Figure 1 Characterization of single-cell RNA sequencing data from samples of peripheral blood mononuclear cells in ulcerative colitis.
A: The left graph shows the relationship between the cell sequencing depth and the mitochondrial content, and the right graph shows the relationship between the sequencing depth and the gene number, which were positively correlated. B: Quality control analysis of the PBMC samples determined the number of genes and the sequencing depth of each cell. C: Genes with significantly different expression between cells were identified, and a characteristic variance map was drawn.
Figure 2 Clustering of peripheral blood mononuclear cell samples from ulcerative colitis patients.
A: Distribution of peripheral blood mononuclear cells. The dots represent cells, and the colors represent the samples; B: The P value corresponds to each principal component (PC), and the PC was determined based on the P value; C: According to the important components identified via PCA, the cells were divided into 17 clusters using the t-SNE algorithm; D: Heatmap of the top 10 characteristic genes of each cluster.
Annotation of cell subpopulations in the single-cell data
Overall, 17 clusters were annotated as five cell categories [T cells, B cells, monocytes, natural killer cells, and granulocyte-monocyte progenitors (GMPs)] using MonacoImmuneData as the annotation data file and the R package SingleR for each subtype (Figure 3). Ultimately, we extracted the marker genes unique to each cell subtype from the single-cell data using the FindAllMarkers function (cellMarkers.txt).
Figure 3 Cell annotation of peripheral blood mononuclear cells in ulcerative colitis.
The 17 annotated clusters were grouped into 5 cell types, namely, T cells, B cells, monocytes, NK cells, and granulocyte-monocyte progenitors.
Analysis of intercellular communication
Using the software package CellChat, we analyzed the ligand-receptor relationships in the single-cell expression profiles and found complex interactions between the cell subtypes (Figure 4A). We found that monocyte → B cell, GMP → T cell, GMP → B cell and MIF-CD74+CXCR4+ cell → MIF-CD74+CD44+ cell interactions had high scores (Figure 4B). Furthermore, we found that cells such as monocytes, B cells, and GMPs exhibited close potential interactions with other cells (Figure 4C). Therefore, we selected monocyte marker genes as the candidate gene set for the prediction model.
Figure 4 CellChat identification of communication between cells.
A: Network of cell interactions between the five cell types. Monocytes and B cells and B cells and granulocyte-monocyte progenitors showed the closest interactions. The dot size is proportional to the number of cells in each group, and the edge width indicates the communication probability between cells; B: Bubble diagram of receptor–ligand interactions between cells; C: Comparison of the total number of interactions in the communication network between the five cell types.
Functional enrichment analysis of the candidate gene sets
The candidate gene set of 259 monocyte marker genes was analyzed. The GO results showed that the main enriched terms were cell chemotaxis, leukocyte migration, immune response-regulated signaling pathway, regulation of peptidase activity, immune receptor activity, MHC class I receptor activity, proteoglycan binding, immunoglobulin binding, inhibitory MHC class I receptor activity, and peptidase regulator activity (Figure 5A). The KEGG results showed that the main enriched pathways were Th17-cell differentiation, Th1- and Th2-cell differentiation, chemokine signaling pathway, NF-kappa B signaling pathway, apoptosis, primary immunodeficiency, HIF-1 signaling pathway, and IL-17 signaling pathway (Figure 5B).
Figure 5 Functional enrichment of monocyte markers.
A: Gene Ontology enrichment analysis, including molecular function, cellular component and biological process analyses, of marker genes. The depth of the color indicates the strength of the adjusted P value; B: Kyoto Encyclopedia of Genes and Genomes enrichment analysis of marker genes. The depth of the color indicates the strength of the adjusted P value. MF: Molecular function; CC: Cellular component; BP: Biological process.
Construction of a predictive model for UC
We used GSE3365 as the training set and GSE126124 as the validation set and selected the 259 monocyte marker genes for feature screening by LASSO regression. LASSO regression identified 43 genes as characteristic genes of UC, and a prediction model was constructed (Figure 6A-C). The results indicated that the prediction model with 43 genes had the best diagnostic efficacy, with an area under the curve (AUC) of 1 in the training set (Figure 6D-E). We further validated the diagnostic model with the GSE126124 dataset. The results showed that the model had strong stability, with an AUC of 0.8305 in the validation set.
Figure 6 Identification of core genes in ulcerative colitis.
A: Tenfold cross validation of tuning parameter selection in the least absolute shrinkage and selection operator (LASSO) model; B: LASSO coefficient distribution of marker genes; C: Gene coefficients were screened by LASSO analysis; D-E: Receiver operating characteristic curves of LASSO-identified genes in both the training and validation sets. The area under the curve values were greater than 0.8, indicating that the model had good predictive efficacy.
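As an illustration of what the AUC measures, the following plain-Python sketch computes it as the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. The function name, scores and labels are hypothetical, and the sketch is not the routine used to produce Figure 6.

```python
def roc_auc(scores, labels):
    """Rank-based AUC: share of (case, control) pairs in which the case scores higher.

    labels: 1 for cases, 0 for controls. Tied pairs contribute 0.5.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if c > d else 0.5 if c == d else 0.0
        for c in cases
        for d in controls
    )
    return wins / (len(cases) * len(controls))


# Perfect separation: every case outranks every control.
perfect = roc_auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])

# One of the four case-control pairs is misordered (0.4 < 0.6).
partial = roc_auc([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0])

print(perfect, partial)
```

This pairwise definition gives the same value as the area under the empirical ROC curve, which is why an AUC of 1 corresponds to complete separation of the two groups.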
Construction of the WGCNA coexpression network and identification of core genes
To identify the core genes affecting the progression of UC, we constructed a WGCNA network based on the 5000 genes with the highest variance in the GSE3365 dataset to explore relevant coexpression networks in UC. The soft threshold was set to 3 (Figure 7A, B). The gene modules were detected based on the TOM, and a total of 12 gene modules were detected (Figure 7C): black (120), blue (942), brown (797), cyan (134), green (251), magenta (82), pink (115), red (135), salmon (57), tan (59), turquoise (1832), and yellow (476). The black module had the strongest correlation with UC (cor = -0.7, P = 3e-11) (Figure 7D). We intersected the 120 genes in the black module with the 43 LASSO-selected genes and obtained three common genes: RhoB, cathepsin D (CTSD), and zyxin (ZYX) (Figure 7E).
Figure 7 Construction of a weighted gene correlation network analysis network for ulcerative colitis.
A: Cluster dendrogram of the samples; B: Scale-free index and average connectivity of each soft threshold; C: Dendrogram of gene cluster. The different colors represent different modules; D: Heatmap of the correlations between module feature genes and ulcerative colitis. Blue indicates a negative correlation, and red indicates a positive correlation; E: Venn diagram of the weighted gene correlation network analysis black module with 43 least absolute shrinkage and selection operator genes.
Analysis of immune infiltration in UC
By analyzing the relationship between the core genes and immune infiltration in the GSE3365 dataset, we further explored the potential molecular mechanisms through which the core genes affect disease progression. The content of immune cells and the interactions between immune cells are shown (Figure 8A and B). The results suggested that the levels of monocytes and neutrophils were significantly higher in the UC group than in healthy controls, whereas the levels of memory B cells and resting NK cells were lower in the UC group than in healthy controls (Figure 8C). The three core genes were strongly correlated with immune cell levels (Figure 8D-F).
Figure 8 Immune infiltration in patients with ulcerative colitis.
A: Relative percentages of 22 immune cell subsets in ulcerative colitis (UC) samples; B: Pearson correlation among the 22 immune cell types. Blue indicates a positive correlation, and red indicates a negative correlation; C: Differences in the immune cell content between controls and patients with UC. Blue indicates controls, and pink indicates patients with UC. aP < 0.05, bP < 0.01, cP < 0.001; D-F: Pearson correlation analysis of the three core genes with immune cells. Orange indicates a positive correlation, and purple indicates a negative correlation.
GSVA of the core genes
We subsequently studied the specific signaling pathways related to the three core genes to explore the potential molecular mechanisms through which the core genes affect the progression of UC. The GSVA results indicated that samples with high expression of the three core genes were enriched in the terms KRAS signaling down, epithelial-mesenchymal transition (EMT), angiogenesis, and KRAS signaling up, whereas samples with low expression of the three core genes were enriched in the terms transforming growth factor (TGF) beta signaling, unfolded protein response (UPR), androgen response, MYC target V2, p53 pathway, mTOR signaling, UV response down, and E2F targets, which suggests that the core genes may affect the progression of UC through these pathways (Figure 9A-C).
Figure 9 Gene set variation analysis of the core genes.
A-C: Gene set variation analysis of the three core genes. Blue indicates high expression, green indicates low expression, and the background gene set was the hallmark gene set.
Correlation of disease-regulating genes with the core genes
We obtained UC regulatory genes through the GeneCards database and analyzed the differences in the expression of the disease regulatory genes. The results showed that ABCB1, CXCL8, IL10, IL1RN and TLR4 exhibited significant differences between the two groups (Figure 10A). To explore the relationship between the core genes and the disease regulatory genes, we analyzed the correlations between the two gene types. The results showed that CTSD was significantly positively correlated with NOD2 (Pearson r = 0.47) and that RhoB was significantly negatively correlated with ABCB1 (Pearson r = -0.37) (Figure 10B).
Figure 10 Correlation between regulatory genes and core genes in ulcerative colitis.
A: Differential expression of regulatory genes in ulcerative colitis (UC). aP < 0.05, bP < 0.01, cP < 0.001; B: Pearson correlation analysis of core genes and UC regulatory genes. Blue indicates a negative correlation, and red indicates a positive correlation.
Analysis of the expression levels of core genes in single-cell subtypes
The expression of the three core genes, RhoB, CTSD, and ZYX, in the five immune cell types is shown in the figures. All three core genes were highly expressed in monocytes (Figure 11A and B). In addition, the results of the trajectory analyses of the five cell types are shown in the figures. Monocytes and GMPs were mainly concentrated on the left side, B cells were mainly concentrated on the lower half, and T cells and NK cells were mainly concentrated on the upper half (Figure 11C).
Figure 11 Expression patterns of key genes in single cells.
A: Expression of the core genes in five cell subtypes. The three core genes were highly expressed in monocytes; B: Bubble diagram of the expression of the three core genes in the five cell subtypes; C: Cell trajectory analysis revealed the maturation of peripheral blood mononuclear cells in ulcerative colitis according to cell subtype staining.
DISCUSSION
UC has a prolonged course, is difficult to cure, and is recognized as a refractory disease by the World Health Organization. UC treatment faces challenges such as interindividual heterogeneity and complex complications. It is necessary to constantly improve biological treatment strategies to extend the focus from the short-term treatment response to the ultimate goal of disease clearance[12,13]. Many prognostic markers of UC have been or are being verified. Potential prognostic markers include clinical, genetic, transcriptomic (in the intestine or blood), proteomic, and microbiota-related prognostic markers, among others. A variety of markers of treatment efficacy can be used to predict the response of patients to different treatments. Potential efficacy markers include single genes/proteins or combinations, specific cell subsets, and in vitro imaging features. Therefore, improved disease classification, a deeper understanding of the natural history of disease, optimization of the design of cohort studies and clinical trials, integration of multiomics data, clinical data, environmental factors and other information, and verification of biological mechanisms can improve the UC precision medicine strategies used in clinical practice.
ScRNA-seq is a core technology of the Human Cell Atlas project. ScRNA-seq overcomes the limitations of traditional sequencing technology in studying cell-to-cell heterogeneity within tissues and thus enables more in-depth studies of the nature of gene expression. By scRNA-seq, Smillie et al identified 51 cell subsets from the intestinal mucosa of UC patients, mainly epithelial, stromal and immune cells. Among them, M-like cells, inflammatory monocytes, inflammation-associated fibroblasts (IAFs), and CD8+IL-17+ T cells were increased in UC, indicating their key role in inflammation. Inflammatory monocytes and IAFs might mediate resistance to anti-TNF therapy by expressing oncostatin-M and its receptor, respectively. The presence of intercellular interactions explains the observed changes in the proportions of cell subsets. M-like cells, IAFs and inflammation-associated monocytes are key factors in the cellular interaction networks that function during disease. Using scRNA-seq data, Chen et al determined that adrenomedullin is expressed in activated fibroblasts and epithelial cells of UC patients. Interferon-γ is a key upstream regulator of mast cell gene expression. The regions of UC inflammation were found to feature MRGPRX2-mediated mast cell activation, and reduced activation was observed in samples with gene variants that protect against UC. These results led to the identification of a UC activation cell module and new therapeutic targets. We analyzed scRNA-seq data from PBMC samples of UC patients and extracted marker genes specific for each cell subtype. We found that monocytes, B cells, and GMPs exhibited closer potential interactions with other cells. Ultimately, we selected monocyte marker genes as the candidate gene set for the prediction model. Monocytosis and a low lymphocyte/monocyte ratio may be effective, noninvasive and low-cost biomarkers for identifying disease activity in patients with UC[19,20].
Previous studies have shown that reduced expression of CD162 is associated with endotoxemia and decreased binding of PLTs to monocytes via membrane CD162-CD62P, which are conducive to the inflammatory response of UC patients. Not surprisingly, selective granulocyte and monocyte apheresis (GMA) combined with conventional therapy appears to be more effective than conventional therapy alone in inducing and maintaining remission in patients with UC[22,23]. By constructing a WGCNA network, we identified the three core genes of the coexpression network: RhoB, CTSD and ZYX. RhoB plays an essential role in regulation of the cell cycle and apoptosis, and highly homologous Rho GTPases differentially regulate the dynamics of intestinal endothelial barrier function. Yang et al found that miR-21 induces degradation of RhoB mRNA in UC patients, resulting in depletion of RhoB and damage to tight junctions in intestinal epithelial cells (IECs). Macrophages produce CTSD. Hausmann et al found that CTSD expression is induced in inflammation-associated intestinal macrophages and that the presence of CTSD might contribute to mucosal damage in UC. Fischbeck et al showed that sphingomyelin-induced apoptosis of IECs is mediated by ceramide and CTSD activation. This activation shortens the physiological life cycle of IECs and impairs key functions of the intestinal mucosal barrier: defense and nutrient absorption. The LIM domain protein ZYX was originally identified as a tiny actin cytoskeleton protein that regulates the assembly and repair of actin filaments. Other functions of ZYX discovered in recent decades suggest that it also plays an active role in the regulation of gene expression and cell differentiation. By interacting with transcription factors such as nuclear matrix protein 4, ZYX can move from focal adhesions to the nucleus, respond to stretching, and regulate gene transcription. Dysfunction of ZYX in the nucleus appears to be associated with pathogenic effects and disease. 
In intestinal diseases, ZYX promotes colon cancer through mitosis-related phosphorylation and cyclin-dependent kinase 8-mediated Yes-associated protein activation.
We studied the specific signaling pathways related to the three core genes (RhoB, CTSD, and ZYX). By analyzing the enriched pathways found in samples with high and low expression of these core genes, we identified pathways that are known to be involved in many challenges in the diagnosis and treatment of UC.
(1) The KRAS signaling pathway and p53 signaling pathway are involved in UC-associated colorectal cancer. Patients with UC have an increased risk of dysplasia and cancer, and this risk is associated with the duration, extent, and severity and/or persistence of inflammatory activity in the disease. A meta-analysis showed that the cumulative incidence of UC-associated colorectal cancer is 0.1%, 2.9% and 6.7% after 10, 20 and 30 years of UC, respectively. Multiple genetic changes in intestinal mucosal cells of UC patients have been associated with inflammatory-cancer transformation. Mutations in tumor-associated genes in IECs and organ systems can be attributed to the effects of inflammatory cytokines and reactive oxygen species. Furthermore, epigenetic changes and altered levels of miRNAs increase inflammation, IEC regeneration and the rate of UC-associated colorectal cancer. Therefore, there is a need for better medications or maintenance treatment strategies to control inflammation more effectively and for improved colonoscopy screening and gene monitoring strategies to manage dysplasia.
(2) EMT-related changes occur in UC, and the TGF-β signaling pathway causes intestinal fibrosis. Intestinal fibrosis is an inevitable process in the development of UC. The pathological changes include excessive synthesis and abnormal deposition of collagen-based extracellular matrix in the intestinal tissue, which leads to intestinal stenosis and even intestinal obstruction, perforation and fistulae and may eventually require surgical treatment. Five percent of UC patients require surgery for fibrosis. The TGF-β/Smad signaling pathway is the classical pathway that influences EMT-related fibrosis and is associated with other pathways that cause EMT. Blocking EMT by inhibiting the activation of the TGF-β/Smad pathway is a key strategy for the prevention and treatment of UC fibrosis.
(3) The mechanism through which endoplasmic reticulum stress (ERS)-induced autophagy interferes with damage to IECs in the pathogenesis of UC involves the UPR and the mTOR signaling pathway. The efficacy of the classical immunosuppressive agent azathioprine in treating UC might be related to its ability to modulate mTORC1 signaling and induce autophagy via the UPR sensor protein kinase R-like ER kinase (PERK). Therefore, focusing on these gene targets, effectively regulating ERS autophagy, alleviating IEC damage, reducing the impairment of intestinal mucosal barrier function, and maintaining intestinal homeostasis will provide potential new targets and more efficient therapeutic options for the treatment of UC.
Immune infiltration in the microenvironment mainly involves immune cells, extracellular matrix, various growth factors, inflammatory factors and specific physicochemical characteristics that significantly affect the diagnosis and clinical treatment sensitivity of the disease. Arriola et al evaluated the infiltration of CD8+ and FoxP3+ immune cells and granzyme B (GzmB) expression in colon biopsies of 20 patients with ipilimumab-related colitis. The researchers found that the counts of CD8+, FoxP3+ and GzmB+ T cells were significantly higher in patients with ipilimumab-related colitis than in healthy controls. Patients requiring infliximab for colitis had significantly higher CD8+/FoxP3+ ratios than those treated with steroids only, and the severity was related to clinical symptoms. Remission of colitis was associated with decreased CD8+ and FoxP3+ cells in patients treated with steroids and infliximab. A previous study showed that the counts of cytotoxic T cells and regulatory T (Treg) cells in the colonic mucosa of patients treated with ipilimumab were associated with clinical characteristics and could thus be used to predict disease severity and guide treatment. Lei et al found that the abundances of cytotoxic T cells, exhausted T cells, type 1 regulatory T cells (Tr1 cells), induced regulatory T (iTreg) cells, Th1 cells, central memory T cells, DCs, B cells, monocytes and macrophages were generally higher in UC samples than in healthy control samples, whereas the abundances of naïve CD8+ T cells, Th2 cells, effector memory T cells, NKT cells, mucosal-associated invariant T cells (MAIT cells), NK cells, neutrophils and CD8+ T cells were lower in UC samples than in healthy control samples. Immune infiltration analysis has also been used to study the interaction between sensory nerves and the immune response in colitis. These findings highlight the complexity of visceral sensory nerve-immune interactions in pain relief and recurrent diseases.
Our study found that UC patients exhibit significantly higher numbers of monocytes and neutrophils and lower numbers of memory B cells and resting NK cells than healthy controls. These results suggest that many infiltrating immune cells, including monocytes, neutrophils and inflammatory cells, are activated in the peripheral blood of UC patients, and that these cells are recruited and activated by the many chemokines and cytokines upregulated in the mucosa of UC patients, which further promotes inflammation and injury in active disease. The dysregulation of B-cell differentiation and the insufficient immune response mediated by adaptive T cells in NK cells result in nonselective phagocytosis of leukocytes, infiltration of the mucosal epithelium by a large number of antigens and disruption of immune homeostasis, which are also related to the mechanisms of UC recurrence.
The expression of RhoB, CTSD and ZYX is also associated with the immune response. Knockdown of RhoB expression markedly decreases the Toll-like receptor (TLR) ligand-induced activation of mitogen-activated protein kinases and nuclear factor-κB (NF-κB) and the production of TNF-α, IL-6 and IL-1β in macrophages. Yadati et al found that extracellular CTSD inhibition switches the systemic immune status of mice to an anti-inflammatory profile. Podosomes consist of a protrusive actin-rich core and an adhesive integrin-rich ring that contains adaptor proteins such as vinculin and zyxin, and they could potentially be further exploited to study processes at the ventral plasma membrane of immune cells. RhoB is related to monocytes/macrophages in human traumatic brain injury and myelodysplastic syndrome, and CTSD is related to monocytes/macrophages in Alzheimer’s disease and myocardial infarction[42-45]. These genes are therefore potential diagnostic markers and therapeutic targets for intervention.
Single-cell RNA sequencing (scRNA-seq) can be used to rapidly obtain the precise gene expression patterns of thousands of cells in the intestine, analyze the characteristics of cells with the same phenotype, and provide new insights into the growth and development of intestinal organs, the clonal evolution of cells, and immune cell changes. These findings can provide new ideas for the diagnosis and treatment of intestinal diseases.
To reveal diagnostic markers of ulcerative colitis (UC) with the aim of providing experimental research ideas and a theoretical basis for the discovery of new UC molecular mechanisms and therapeutic drugs.
To identify clinical phenotypes and biomarkers that could predict the response of UC patients to specific therapeutic drugs and thus aid the diagnosis and treatment of UC.
Using data from the Gene Expression Omnibus database, we analyzed peripheral blood cell subtypes of patients with UC by scRNA-seq combined with bulk RNA sequencing (RNA-seq), and we applied least absolute shrinkage and selection operator (LASSO) diagnostic model building and weighted gene correlation network analysis (WGCNA) to reveal the core genes of UC.
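The LASSO step named above can be illustrated with a minimal sketch of L1-penalized logistic regression, in which the penalty shrinks uninformative coefficients to exactly zero and thereby selects a small gene set. This is not the study's pipeline: the data are synthetic, the gene indices are hypothetical, and the penalty `lam`, step size `lr` and the proximal gradient solver are assumptions chosen only for illustration.

```python
import numpy as np

def soft_threshold(w, t):
    """Proximal operator of the L1 norm: shrink each coefficient toward zero by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def lasso_logistic(X, y, lam=0.05, lr=0.1, n_iter=2000):
    """Fit L1-penalized logistic regression by proximal gradient descent.

    X: (n_samples, n_genes) standardized expression matrix; y: 0/1 labels.
    Returns (weights, intercept). The intercept is not penalized.
    """
    n, p = X.shape
    w, b = np.zeros(p), 0.0
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted P(disease)
        grad_w = X.T @ (prob - y) / n               # gradient of the mean log-loss
        grad_b = np.mean(prob - y)
        w = soft_threshold(w - lr * grad_w, lr * lam)  # gradient step, then L1 shrinkage
        b -= lr * grad_b
    return w, b

# Synthetic example (assumption): 200 samples, 20 hypothetical genes,
# of which only the first three carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
true_w = np.zeros(20)
true_w[:3] = [2.0, -2.0, 1.5]
y = (X @ true_w + rng.normal(scale=0.5, size=200) > 0).astype(float)

w, b = lasso_logistic(X, y)
selected = np.flatnonzero(w)   # indices of genes retained by the penalty
print("selected gene indices:", selected)
```

In practice the penalty strength would be chosen by cross-validation rather than fixed in advance, and an established implementation would normally be preferred over a hand-written solver.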
After processing the scRNA-seq data, we obtained data from approximately 24340 cells and identified 17 cell types. Through intercellular communication analysis, we selected monocyte marker genes as the candidate gene set for the prediction model. Construction of a WGCNA coexpression network identified RhoB, cathepsin D (CTSD) and zyxin (ZYX) as core genes. Immune infiltration analysis showed that these three core genes were strongly correlated with immune cells. Functional enrichment analysis showed that the differentially expressed genes were closely related to immune and inflammatory responses, which are associated with many of the challenges in the diagnosis and treatment of UC.
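The idea behind a weighted coexpression network can be sketched as follows: pairwise gene correlations are raised to a soft-thresholding power so that strong correlations dominate, and genes with high summed connectivity are candidate hub (core) genes. This is a simplified illustration on synthetic data, not the analysis performed in the study; the power `beta=6`, the unsigned adjacency and the four-gene module are assumptions, and module detection and topological overlap are omitted.

```python
import numpy as np

def soft_power_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(gene_i, gene_j)| ** beta.

    expr: (n_samples, n_genes) expression matrix.
    """
    cor = np.corrcoef(expr, rowvar=False)   # gene-by-gene Pearson correlation
    adj = np.abs(cor) ** beta               # soft thresholding suppresses weak links
    np.fill_diagonal(adj, 0.0)              # ignore self-connections
    return adj

# Synthetic example (assumption): genes 0-3 share one latent driver and form
# a coexpression module; genes 4-9 are independent background genes.
rng = np.random.default_rng(1)
n_samples = 100
latent = rng.normal(size=n_samples)
module = latent[:, None] + 0.5 * rng.normal(size=(n_samples, 4))
background = rng.normal(size=(n_samples, 6))
expr = np.hstack([module, background])

adj = soft_power_adjacency(expr)
connectivity = adj.sum(axis=1)                 # summed connection strength per gene
hubs = np.argsort(connectivity)[::-1][:4]      # four most connected genes
print("most connected genes:", sorted(hubs.tolist()))
```

In a real analysis the soft-thresholding power is usually selected so that the network approximates a scale-free topology, and candidate hub genes are then related to clinical traits.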
Through scRNA-seq, LASSO diagnostic model building and WGCNA, we identified RhoB, CTSD and ZYX as core genes of UC that are closely related to monocyte infiltration and could be used as diagnostic markers and potential molecular targets for UC therapeutic intervention.
Single-cell RNA-seq combined with bulk RNA-seq analysis of peripheral blood reveals the characteristics and key immune cell genes of UC.