Gao M, Zhong A, Patel N, Alur C, Vyas D. High throughput RNA sequencing utility for diagnosis and prognosis in colon diseases. World J Gastroenterol 2017; 23(16): 2819-2825 [PMID: 28522900 DOI: 10.3748/wjg.v23.i16.2819]
Corresponding Author of This Article
Dinesh Vyas, MD, FACS, Associate Dean of Surgery Research, Department of Surgery, Texas Tech University, 701 West 5th Street, Suite 2263 Odessa, TX 79763, United States. email@example.com
This article is an open-access article which was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
High throughput RNA sequencing utility for diagnosis and prognosis in colon diseases
Mamie Gao, Allen Zhong, Neil Patel, Chiraag Alur, Dinesh Vyas
Mamie Gao, Allen Zhong, Neil Patel, Chiraag Alur, Dinesh Vyas, Department of Surgery, Texas Tech University, Odessa, TX 79763, United States
Author contributions: Gao M and Zhong A have contributed equally as first co-authors; Gao M, Zhong A and Vyas D were involved with the conception, development of the study, data collection and writing the article; Patel N, Alur C and Vyas D were involved with data analysis and interpretation; Vyas D approved the final version.
Conflict-of-interest statement: The authors have no conflicts of interest or financial disclosures.
Correspondence to: Dinesh Vyas, MD, FACS, Associate Dean of Surgery Research, Department of Surgery, Texas Tech University, 701 West 5th Street, Suite 2263 Odessa, TX 79763, United States. firstname.lastname@example.org
Telephone: +1-432-7035290 Fax: +1-432-3351693
Received: September 15, 2016 Peer-review started: September 19, 2016 First decision: October 28, 2016 Revised: November 16, 2016 Accepted: March 15, 2017 Article in press: March 15, 2017 Published online: April 28, 2017
RNA sequencing is the use of high throughput next generation sequencing technology to survey, characterize, and quantify the transcriptome of a genome. RNA sequencing has been used to analyze the pathogenesis of several malignancies such as melanoma, lung cancer, and colorectal cancer. RNA sequencing can identify differentially expressed genes (DEGs), mutated genes, fusion genes, and gene isoforms in disease states. RNA sequencing has been used in the investigation of several colorectal diseases such as colorectal cancer, inflammatory bowel disease (ulcerative colitis and Crohn’s disease), and irritable bowel syndrome.
Core tip: RNA sequencing is the use of high throughput next generation sequencing technology to survey, characterize, and quantify the transcriptome of a genome. RNA sequencing has been used in the investigation of several colorectal diseases such as colorectal cancer, inflammatory bowel disease (ulcerative colitis and Crohn’s disease), and irritable bowel syndrome.
RNA Sequencing Basics: RNA sequencing is the use of high throughput next generation sequencing technology to survey, characterize, and quantify the transcriptome of a genome. In contrast to previous methods, RNA sequencing utilizes sequencing by synthesis technology to define the nucleotide sequences and quantify RNA molecules in a sample. Next generation sequencing (NGS) can process these data in hours to days with high fidelity, making it the preferred technique for RNA analysis among many researchers. The use of this technology in research has grown rapidly. RNA sequencing has many promising potential clinical applications, with recent discoveries made in many disease states[4,5].
Several commercial RNA sequencing kits are available for any sample. Most follow similar processing steps, but the choice ultimately depends on experimental considerations. Total RNA, mRNA, and small RNA analysis can be done with most kits. For mRNA isolation, poly(T) primers attached to beads or magnets are used to bind mRNA and isolate these strands. For small RNA molecules or non-coding RNA, gel electrophoresis is used to isolate these molecules. Total RNA isolation utilizes a combination of these two techniques. Adaptors are then ligated to the 5’ end, 3’ end, or both. Once RNA is isolated, cDNA is generated, amplified, and then fragmented. Some kits provide direct RNA sequencing without the need to create cDNA. rRNA can be removed since it makes up a significant proportion of the total RNA but is of little research interest. These samples are then sequenced through massively parallel next generation sequencing technologies that utilize sequencing by synthesis of short DNA strands complementary to the cDNA. Once the reads are produced, software is available to analyze the sequence reads and map them to portions of the genome. Assembling gene fragments with sequencing analysis software can also produce de novo transcriptome maps (Figure 1). Using the total number of reads for each gene product, proportional gene expression can be quantified.
Figure 1 RNA sequencing steps.
A: RNA isolation techniques include the usage of beads with poly(T) primer tails to isolate mRNA or gel electrophoresis to isolate smaller RNA molecules; B: RNA libraries are prepared by attaching adaptors to isolated RNA strands, creating cDNA strands corresponding to the isolated RNA strands, amplifying the cDNA, and then fragmenting the cDNA strands; C: High throughput cDNA sequencing involves sequencing cDNA fragments in parallel by nucleotide addition; D: cDNA sequences are then analyzed by matching fragments onto known genomes or de novo mapping to produce a transcriptome.
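The final quantification step described above, in which the number of reads assigned to each gene product is converted into proportional expression, can be illustrated with a minimal Python sketch. The gene names, read counts, and transcript lengths are hypothetical values chosen for illustration only, and counts per million (CPM) and transcripts per million (TPM) are two commonly used normalizations rather than the method of any particular kit or software package discussed in this review.

```python
from typing import Dict


def counts_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw read counts so that they sum to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads were counted")
    return {gene: n / total * 1_000_000 for gene, n in counts.items()}


def transcripts_per_million(counts: Dict[str, int],
                            lengths_kb: Dict[str, float]) -> Dict[str, float]:
    """Correct for transcript length first, then scale to one million."""
    # Longer transcripts yield more fragments, so divide by length (in kb).
    rates = {gene: counts[gene] / lengths_kb[gene] for gene in counts}
    total_rate = sum(rates.values())
    if total_rate == 0:
        raise ValueError("no reads were counted")
    return {gene: r / total_rate * 1_000_000 for gene, r in rates.items()}


# Hypothetical mapped read counts and transcript lengths (assumptions).
counts = {"GENE_A": 500, "GENE_B": 1500, "GENE_C": 8000}
lengths_kb = {"GENE_A": 1.0, "GENE_B": 3.0, "GENE_C": 2.0}

cpm = counts_per_million(counts)
tpm = transcripts_per_million(counts, lengths_kb)

for gene in counts:
    print(f"{gene}: CPM={cpm[gene]:.0f} TPM={tpm[gene]:.0f}")
```

In this toy example GENE_B has three times as many reads as GENE_A but is also three times as long, so the two receive the same TPM value even though their CPM values differ. This is why length correction matters when comparing expression between different genes.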
Advantages over previous attempts at transcriptome investigation have prompted the recent increased utilization of RNA sequencing. Two prominent techniques were available before NGS RNA sequencing. Hybridization of cDNA probes attached to microarrays allowed for transcriptome analysis but was limited by the requirement for extensive knowledge of the genome, transcription products, alternative splicing, and exons. Resolution was also limited during attempts to quantify gene expression because of background noise produced by cross-hybridization. The other technology was Sanger sequencing, which utilized chain termination methods to determine nucleotide sequences. In contrast to NGS, Sanger methods are more expensive, more time consuming, and only limited portions of transcripts could be analyzed[2,3,8].
Both the discovery of non-coding RNA, such as microRNA (miRNA), and the discovery of post-transcriptional regulation of mRNA expression have necessitated the creation of an assay that surveys these small non-coding RNAs along with variant mRNAs with high throughput and resolution. RNA sequencing technology allows researchers to perform both of those tasks, as well as to quantify RNA expression and thus gene expression, with a single assay. Because of the high throughput nature of RNA sequencing, transcriptomes can be analyzed and compared efficiently across time points, different tissue samples, and different environmental factors such as disease states and pharmacologic interventions. Because de novo transcriptome assembly is possible, prior genomic and transcriptional knowledge of the sample is not needed, allowing analysis and discovery of novel products. The resolution of RNA sequencing also allows for the identification of single nucleotide variants, novel post-transcriptional modifications, novel alternative splicing patterns, and non-coding RNA molecules that have not been previously identified. RNA sequencing provides a quantification of mRNA expression that is accurate when compared with real-time PCR experiments[10-13].
Using RNA sequencing, we can look at the molecular basis for disease susceptibility, cancer pathogenesis/progression, and response to therapy. RNA sequencing has been used to analyze the pathogenesis of several malignancies such as melanoma, lung cancer, and colorectal cancer. RNA sequencing can identify differentially expressed genes (DEGs), mutated genes, fusion genes, and gene isoforms in disease states. RNA sequencing has the potential for diagnostic and therapeutic applications as well. Current research in colorectal disease using RNA sequencing is unlocking new discoveries that may help clinicians treating patients with colorectal disease in the future.
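To make the idea of differential expression concrete, the following minimal Python sketch computes a log2 fold change between a disease sample and a control sample and labels each transcript as upregulated, downregulated, or unchanged. The transcript names, expression values, pseudocount, and two-fold threshold are all hypothetical assumptions for illustration. Published analyses generally rely on dedicated statistical packages (for example DESeq2 or edgeR) that model replicate variability, which this sketch does not attempt.

```python
import math


def log2_fold_change(disease: float, control: float,
                     pseudocount: float = 1.0) -> float:
    """Log2 ratio of disease to control expression.

    The pseudocount (an assumption) avoids division by zero when a
    transcript is not detected in one of the samples.
    """
    return math.log2((disease + pseudocount) / (control + pseudocount))


def classify(lfc: float, threshold: float = 1.0) -> str:
    """Label a transcript using a two-fold (log2 = 1) cut-off (an assumption)."""
    if lfc >= threshold:
        return "upregulated"
    if lfc <= -threshold:
        return "downregulated"
    return "unchanged"


# Hypothetical normalized expression values: (disease, control).
expression = {
    "TRANSCRIPT_X": (799.0, 99.0),
    "TRANSCRIPT_Y": (24.0, 399.0),
    "TRANSCRIPT_Z": (120.0, 110.0),
}

calls = {
    name: classify(log2_fold_change(disease, control))
    for name, (disease, control) in expression.items()
}

for name, call in calls.items():
    print(name, call)
```

A fold change alone says nothing about statistical significance. With the small sample sizes common in this field, the same apparent fold change can easily arise by chance, so replicate samples and a formal test are needed before a transcript is proposed as a marker.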
COLORECTAL DISEASE AND RNA SEQUENCING
RNA sequencing has been used in the investigation of several colorectal diseases such as colorectal cancer, inflammatory bowel disease (ulcerative colitis and Crohn’s disease), and irritable bowel syndrome (IBS). RNA sequencing has been used to identify genomic mutations such as fusion transcripts in colon cancer, as well as to study the pathogenesis of colorectal cancer[15,16]. Researchers have also sought unique transcript markers for colorectal cancer[17,18] and inflammatory bowel disease that would allow quicker diagnosis than current screening methods[19,20]. RNA sequencing has also been used to investigate treatment response for rectal cancer. Alterations in transcriptional patterns have also been observed in patients with irritable bowel syndrome through RNA sequencing techniques.
Colorectal cancer (CRC) is the third most common cancer among men and women, as well as the third leading cause of death from cancer. It is estimated that more than 50000 people died from colorectal cancer in 2014. While screening methods have dramatically reduced the mortality from CRC, outcomes can be improved further by diagnosing patients at an earlier stage of disease. The genomic mutation progression of CRC is well documented, but clinicians are still left without a clear molecular disease marker. CRC still poses a significant disease burden on public health.
Inflammatory bowel disease (IBD) causes significant morbidity, and even potentially fatal complications, in patients who are afflicted. Surgical intervention is oftentimes needed to control disease or prevent carcinoma from developing. An estimated 1.5 million people in North America are afflicted with IBD. While the incidence has recently been stabilizing in North America and Europe, incidence has been increasing in the Middle East and Asia. New molecular insights are needed to achieve more effective diagnosis, prognosis, and treatment.
While irritable bowel syndrome poses less of a risk to public health than colorectal cancer or inflammatory bowel disease, it is one of the most common colorectal diseases. It is estimated that as many as 20% of the adult population may be afflicted. Despite the high prevalence of IBS, diagnosis and treatment of this disease still elude researchers.
RNA SEQUENCING FOR DIAGNOSIS AND PROGNOSIS OF COLORECTAL DISEASES
Research shows that certain RNA sequences are upregulated or downregulated in colorectal diseases, opening the possibility of using RNA sequencing to screen for, diagnose, and assess the prognosis of colorectal cancers. Given the increase in resistance to standard chemotherapy regimens[30,31], RNA sequencing also allows for the detection of cancers that are treatment resistant. Table 1 provides a summary of the most recently studied markers in colorectal cancer and ulcerative colitis.
Table 1 RNA markers in colorectal cancer and ulcerative colitis.
Active UC and CAC
The expression levels of several microRNAs are altered in colorectal disease, specifically colorectal cancer and inflammatory bowel disease compared to non-disease states. These microRNAs have been shown to be biomarkers for several disease characteristics. CRC: Colorectal cancer; UC: Ulcerative colitis.
Wang et al provide a review of the various diagnostic biomarkers that are altered, including stool miRNA, serum miRNA, piwi-interacting RNA, and long non-coding RNA (lncRNA), while Hollis et al provide a more thorough review of the miRNA biomarkers used for early detection, prognosis, and chemosensitivity of CRC. More recently, Yang et al used 16 cancer tissues to find that miR-143 acts as a tumor suppressor, is downregulated in CRC tissues, and can be used to diagnose CRC. Using 40 CRC tumor tissues and 595 fecal samples (198 CRC, 199 adenoma, 198 healthy subjects), Yau et al found that miR-20a is upregulated in tumors and fecal samples and can also be used to diagnose CRC. Sun et al validated that miR-21 is upregulated and miR-143 is downregulated in CRC and that these are the most important miRNAs in CRC.
Qin et al found, using 6 human CRC cell lines, that miR-132 acts as a tumor suppressor and that its hypermethylation causes its downregulation and is thus a marker for poor prognosis. Xu et al used 30 samples of CRC tumors to show that increased miR-20a is also associated with tumor invasion and lymph node metastasis. Using 104 CRC specimens, Liu et al found that lncRNA DANCR is upregulated in CRC tissues. Its expression is correlated with TNM stage, histologic grade, lymph node metastasis, and shorter overall survival and disease-free survival.
Hu et al, through a retrospective analysis of 126 patients with colon adenocarcinoma, found that miR-4299 and miR-196b are potential novel biomarkers for XELOX chemoresistance. Downregulation of miR-4299 and upregulation of miR-196b are correlated with better survival.
With regard to ulcerative colitis (UC) and colitis-associated cancer (CAC), Polytarchou et al used 401 colon specimens from patients with UC, Crohn’s disease, IBS, sporadic CRC, and CAC to show that miR-214 is upregulated in active UC and CAC. Its expression is correlated with UC activity and disease duration and could serve as a biomarker for identifying patients at risk for malignant transformation.
RNA SEQUENCING FOR TREATMENT OF COLORECTAL DISEASES
Through specific targeting, RNA sequencing allows for the development of new therapeutic approaches to colorectal diseases. RNA sequencing-based approaches to therapy include gene therapy, natural antisense transcripts (NATs), antisense oligonucleotides (ASOs), and plasmid-based therapy (Figure 2).
Figure 2 Treatment options using RNA sequencing.
Many possible therapeutic applications of RNA sequencing in colorectal disease include gene therapy, antisense oligonucleotides, plasmid-based therapy, and natural antisense transcripts.
In terms of gene therapy techniques, Wang et al found that the tumor suppressor long intergenic non-coding RNA p21 (lincR-p21), which is downregulated in CRC, can suppress the stem-like traits of colorectal cancer stem cells when administered exogenously. An adenoviral vector with the miRNA responsive element of miR-451 delivers the lincR-p21 into cells that have low miR-451 levels. This inhibits β-catenin signaling and attenuates the viability, self-renewal, and glycolysis of CRC in vitro. It also suppresses the self-renewal potential and tumorigenicity of CRC in nude mice.
Davis et al in 2010 entered phase 1 clinical trial testing to systemically administer small interfering RNA (siRNA) to patients with solid cancers using a targeted nanoparticle delivery system. The siRNA is designed to reduce the expression of the M2 subunit of ribonucleotide reductase (RRM2).
Another treatment being studied is resveratrol. It has been found that resveratrol, extracted from the Chinese herbal medicine Polygonum cuspidatum, downregulates lncRNA MALAT1, which decreases nuclear localization of β-catenin and thus diminishes Wnt/β-catenin signaling, ultimately inhibiting CRC invasion and metastasis.
Phase 1 clinical trials have shown that CEQ508 is a possible medical treatment for familial adenomatous polyposis. Through transkingdom RNA interference, nonpathogenic E. coli produce and deliver short hairpin RNA (shRNA) against β-catenin to target cells, again inhibiting intestinal cell growth and polyp growth. It has been found to be safe and well tolerated in nonhumans.
Therapies using ASO, NAT, and plasmid-based therapy have not yet been studied in colorectal diseases[46-49]. However, these techniques are available and should be studied in colorectal diseases in the future.
LIMITATIONS OF RNA SEQUENCING IN COLORECTAL DISEASE
Despite the advances that are being made in using RNA sequencing for diagnosis and treatment of colorectal diseases, there are still limitations including targeting and delivery of treatment. Figure 3 summarizes future directions to address these limitations.
Figure 3 Roadmap of future studies.
It is proposed that the future of high throughput RNA sequencing applications in colorectal disease must address current limitations in the literature by creating studies with significant data by increasing sample size, focusing research on therapeutic applications of RNA, and delineating drug targets and delivery mechanisms.
RNA sequencing itself has many limitations and problems. NGS requires sequences shorter than most mRNA sequences in order to process them in a parallel manner, requiring fragmentation of either the RNA strand or the cDNA strand. This requirement can introduce bias, as RNA fragmentation leads to decreased amplification of the 5’ and 3’ ends of the strand and cDNA fragmentation leads to preferential amplification of the 3’ end of a strand. cDNA synthesis of small RNA molecules can also be biased based on the adaptors used, specific G/C-content, and complex tertiary and quaternary structures of these molecules[50,51]. NGS also produces large amounts of data, which present a problem for storage and retrieval. Alternative splicing, trans-splicing, and fragments that correspond to multiple genomic locations also present a problem for the analysis of the transcriptome.
Other challenges include library construction, bioinformatics challenges, and coverage vs cost. In constructing a cDNA library, there are many manipulation stages, which can complicate the profiling of all types of transcripts. Bioinformatic challenges include storing, retrieving, and processing the large amounts of data that are generated through RNA sequencing. In terms of coverage vs cost, detecting rare transcripts or variants requires greater coverage, which means more sequencing depth and thus greater cost.
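The trade-off between coverage and cost can be illustrated with a rough calculation of how many reads are needed to detect a rare transcript. The Python sketch below assumes, purely for illustration, that the number of reads from a transcript follows a Poisson distribution whose mean is the sequencing depth multiplied by the fraction of the library that the transcript represents. Real libraries show additional biases, so the figures are order-of-magnitude guides only. The function names, abundances, and the 95% target are hypothetical.

```python
import math


def detection_probability(total_reads: int, transcript_fraction: float,
                          min_reads: int = 1) -> float:
    """Probability of observing at least `min_reads` reads from a transcript.

    Assumes Poisson-distributed read counts (a simplifying assumption).
    """
    mean = total_reads * transcript_fraction
    p_fewer = sum(
        math.exp(-mean) * mean ** k / math.factorial(k)
        for k in range(min_reads)
    )
    return 1.0 - p_fewer


def reads_needed(transcript_fraction: float, min_reads: int = 1,
                 target: float = 0.95) -> int:
    """Smallest sequencing depth that reaches the target detection probability."""
    # Double the depth until the target is met, then binary search downwards.
    high = 1
    while detection_probability(high, transcript_fraction, min_reads) < target:
        high *= 2
    low = high // 2
    while high - low > 1:
        mid = (low + high) // 2
        if detection_probability(mid, transcript_fraction, min_reads) >= target:
            high = mid
        else:
            low = mid
    return high


# Hypothetical transcript abundances, as fractions of all sequenced fragments.
for fraction in (1e-5, 1e-6):
    depth = reads_needed(fraction)
    print(f"abundance {fraction:g}: about {depth:,} reads for 95% detection")
```

Under these assumptions a ten-fold rarer transcript requires roughly ten times as many reads. Since sequencing cost rises with depth, this is the sense in which detecting rare transcripts or variants becomes expensive.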
As seen in the studies described above, small sample sizes were used for most of the research conducted. While some studies used 400 samples, many had sample sizes that ranged from 6 to 30. Larger sample sizes can help to yield more significant results. Another study whose promising results did not reach significance because of sample size is Cohen et al’s study on the predictive value of Target Now. Target Now uses immunostaining and RNA expression on tumor samples to identify potentially beneficial or ineffective drugs. The results of this study were not statistically significant due to its small sample size of 19 patients. Despite the promising results of the Target Now study, the small sample size exemplifies the limitations of much of the RNA sequencing literature and its application in colorectal diseases.
Complications and side effects have been seen when siRNA has been used as a therapeutic agent, further limiting the usage of RNA sequencing in colorectal disease. A phase 1 drug candidate that targeted apoB was withdrawn because of the immune response elicited by its cationic lipid-based formulation that delivers siRNA into endosomes where immune receptors are most dense. This caused one patient to have severe flu-like symptoms typical of an immune response.
In response, dual targeting siRNA is being studied to reduce the potential for off-target gene silencing. Theoretically, fewer strands compete for RISC entry which helps avoid the innate immune response. However, more research needs to be conducted in this area.
The biodistribution of siRNA in vivo has also been a limitation of siRNA application. van de Water et al found that intravenous siRNA accumulates in the kidney of rats rather than being absorbed in the GI tract. There, it acts to suppress the gene function in proximal tubules, limiting the application of siRNA in colorectal disease. Further research is needed to manipulate the localization of siRNA for therapeutic colorectal applications.
To conclude, despite the large amount of research dedicated to using RNA sequencing to diagnose and screen for colorectal diseases, further studies need to be conducted on using these techniques for treatment of these colorectal diseases. With more research, RNA sequencing could be the next novel treatment for colorectal diseases.
The authors are grateful and indebted to Dr. Vyas and the surgical faculty and staff at Texas Tech University Health Science Center at Permian Basin for all of their support and career guidance.
Manuscript source: Invited manuscript
Specialty type: Gastroenterology and hepatology
Country of origin: United States
Peer-review report classification
Grade A (Excellent): 0
Grade B (Very good): B
Grade C (Good): 0
Grade D (Fair): 0
Grade E (Poor): 0
P- Reviewer: Crea F S- Editor: Gong ZM L- Editor: A E- Editor: Wang CH
Yang F, Xie YQ, Tang SQ, Wu XB, Zhu HY. miR-143 regulates proliferation and apoptosis of colorectal cancer cells and exhibits altered expression in colorectal cancer tissue. Int J Clin Exp Med. 2015;8:15308-15312.
Liu Y, Zhang M, Liang L, Li J, Chen YX. Over-expression of lncRNA DANCR is associated with advanced tumor progression and poor prognosis in patients with colorectal cancer. Int J Clin Exp Pathol. 2015;8:11480-11484.
Cohen JE, Cohen Y, Peretz T, Hubert A. Retrospective Study of the Predictive Value of Target Now in Systemic Therapy for Metastatic Colorectal and Gastric Carcinomas. Isr Med Assoc J. 2015;17:612-615.