Esteban-Jurado C, Garre P, Vila M, Lozano JJ, Pristoupilova A, Beltrán S, Abulí A, Muñoz J, Balaguer F, Ocaña T, Castells A, Piqué JM, Carracedo A, Ruiz-Ponte C, Bessa X, Andreu M, Bujanda L, Caldés T, Castellví-Bel S. New genes emerging for colorectal cancer predisposition. World J Gastroenterol 2014; 20(8): 1961-1971
Corresponding Author of This Article
Sergi Castellví-Bel, PhD, Department of Gastroenterology, Hospital Clínic, Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), University of Barcelona, Rosselló 153 planta 4, 08036 Barcelona, Catalonia, Spain. email@example.com
This article is an open-access article which was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
New genes emerging for colorectal cancer predisposition
Clara Esteban-Jurado, Pilar Garre, Maria Vila, Juan José Lozano, Anna Pristoupilova, Sergi Beltrán, Anna Abulí, Jenifer Muñoz, Francesc Balaguer, Teresa Ocaña, Antoni Castells, Josep M Piqué, Angel Carracedo, Clara Ruiz-Ponte, Xavier Bessa, Montserrat Andreu, Luis Bujanda, Trinidad Caldés, Sergi Castellví-Bel
Clara Esteban-Jurado, Anna Abulí, Jenifer Muñoz, Francesc Balaguer, Teresa Ocaña, Antoni Castells, Josep M Piqué, Sergi Castellví-Bel, Department of Gastroenterology, Hospital Clínic, Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), University of Barcelona, 08036 Barcelona, Catalonia, Spain
Anna Abulí, Xavier Bessa, Montserrat Andreu, Department of Gastroenterology, Hospital del Mar-IMIM (Hospital del Mar Medical Research Centre), Pompeu Fabra University, 08003 Barcelona, Spain
Maria Vila, Juan José Lozano, Bioinformatics platform, Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), 08036 Barcelona, Spain
Anna Pristoupilova, Sergi Beltrán, Centre Nacional d’Anàlisi Genòmica (CNAG), Parc Científic de Barcelona, 08028 Barcelona, Spain
Angel Carracedo, Clara Ruiz-Ponte, Galician Public Foundation of Genomic Medicine (FPGMX), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), Genomics Medicine Group, Hospital Clínico, Santiago de Compostela, University of Santiago de Compostela, 15706 Galicia, Spain
Angel Carracedo, Center of Excellence in Genomic Medicine Research, King Abdulaziz University, 21589 Jeddah, Kingdom of Saudi Arabia
Luis Bujanda, Gastroenterology Department, Hospital Donostia, Networked Biomedical Research Centre for Hepatic and Digestive Diseases (CIBEREHD), Basque Country University, 20080 San Sebastián, Spain
Pilar Garre, Trinidad Caldés, Molecular Oncology Laboratory, Hospital Clínico San Carlos, Instituto de Investigación Sanitaria del Hospital Clínico San Carlos (IdISSC), 28040 Madrid, Spain
Author contributions: Esteban-Jurado C, Garre P, Vila M, Lozano JJ, Pristoupilova A, Beltrán S, Abulí A, Muñoz J, Balaguer F, Ocaña T, Castells A, Piqué JM, Carracedo A, Ruiz-Ponte C, Bessa X, Andreu M, Bujanda L, Caldés T and Castellví-Bel S contributed to this paper.
Supported by: SCB is supported by a contract from the Fondo de Investigación Sanitaria, No. CP 03-0070; CEJ and JM are supported by a contract from CIBERehd; CIBERehd and CIBERER are funded by the Instituto de Salud Carlos III; Fondo de Investigación Sanitaria/FEDER, No. 11/00219 and No. 11/00681; Instituto de Salud Carlos III (Acción Transversal de Cáncer); Xunta de Galicia, No. 07PXIB9101209PR; Ministerio de Ciencia e Innovación, No. SAF2010-19273; Asociación Española contra el Cáncer (Fundación Científica GCB13131592CAST and Junta de Barcelona); Fundació Olga Torres (SCB and CRP); FP7 CHIBCHA Consortium (SCB and ACar); and COST Action BM1206 (SCB and CRP)
Correspondence to: Sergi Castellví-Bel, PhD, Department of Gastroenterology, Hospital Clínic, Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBEREHD), Institut d’Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), University of Barcelona, Rosselló 153 planta 4, 08036 Barcelona, Catalonia, Spain. firstname.lastname@example.org
Telephone: +34-93-2275418 Fax: +34-93-3129405
Received: October 1, 2013 Revised: November 7, 2013 Accepted: January 14, 2014 Published online: February 28, 2014
Colorectal cancer (CRC) is one of the most frequent neoplasms and an important cause of mortality in the developed world. It is caused by both genetic and environmental factors, and inherited genetic differences account for about 35% of the variation in CRC susceptibility. Mendelian syndromes account for about 5% of the total burden of CRC, with Lynch syndrome and familial adenomatous polyposis being the most common forms. Beyond these hereditary forms, an important fraction of CRC cases shows familial aggregation of the disease with an unknown germline genetic cause. CRC can also be considered a complex disease under the common disease-common variant hypothesis, a polygenic model of inheritance in which the genetic components of common complex diseases correspond mostly to variants of low or moderate effect. So far, 30 common, low-penetrance susceptibility variants have been identified for CRC. Recently, new sequencing technologies, including exome and whole-genome sequencing, have provided a new approach for identifying genes responsible for human disease predisposition. Using whole-genome sequencing, germline mutations in the POLE and POLD1 genes have been found to be responsible for a new form of CRC genetic predisposition called polymerase proofreading-associated polyposis.
Core tip: Colorectal cancer (CRC) is caused by both genetic and environmental factors, and inherited genetic differences account for about 35% of the variation in CRC susceptibility. Mendelian syndromes account for about 5% of the total burden of CRC. Beyond these hereditary forms, an important fraction of CRC cases shows familial aggregation of the disease with an unknown germline genetic cause. Recently, new sequencing technologies have provided a new approach for identifying genes responsible for human disease predisposition. With this approach, germline mutations in the POLE and POLD1 genes have been found to be responsible for a new form of CRC genetic predisposition.
Colorectal cancer (CRC) is one of the most frequent neoplasms and an important cause of mortality in the developed world. Approximately 5% of the population develops CRC and this figure is expected to rise as life expectancy increases. For 2015, approximately 473200 new cases and 233900 deaths from this disease are predicted in Europe. Considering both genders together, it is the most frequent neoplasm in Spain. Although recent progress in CRC clinical management and treatment has made it possible to reduce the number of cases in developed countries, its incidence is expected to increase worldwide, with developing nations bearing the brunt of the rise. The incidence of CRC varies widely between countries, depending on their degree of development and also on the quality of their cancer registries. Around 60% of cases are diagnosed in the developed world. The highest incidence rates are found in Australia and New Zealand, North America and Europe, whereas the lowest rates are registered in Africa and South-Central Asia (Figure 1).
Figure 1 Colorectal cancer in the world.
A: Estimated age-standardized incidence rate per 100000 individuals (both genders and all ages); B: Estimated age-standardized incidence and mortality rates per 100000 individuals by gender (data adapted from Ferlay et al).
CRC survival depends on the stage of disease at diagnosis and typically ranges from a 90% 5-year survival rate for cancers detected at the localized stage to 10% for people diagnosed with distant metastatic cancer. The lifetime risk of CRC in the general population is about 5% in Western countries, but the likelihood of CRC diagnosis increases progressively with age: more than 90% of cases are diagnosed in individuals over age 50, and 70% of these in individuals over 65.
CRC is believed to develop from polyps, which have traditionally been classified as either hyperplastic or adenomatous. Until recently, according to the adenoma-carcinoma sequence proposed by Vogelstein et al, the adenoma was considered the exclusive precursor lesion, while hyperplastic polyps were deemed to have no malignant potential. However, it is now recognized that lesions formerly classified as hyperplastic represent a heterogeneous group of polyps with a characteristic serrated morphology, some of which carry a significant risk of malignant transformation through the serrated neoplasia pathway.
GENETIC AND ENVIRONMENTAL RISK FACTORS
Like other complex diseases, CRC is caused by both genetic and environmental factors. The role of environmental factors in colorectal carcinogenesis is indicated by the increase in CRC incidence in parallel with economic development and the adoption of Western diets and lifestyles, which are responsible for the high incidence of CRC in industrialized countries. Although most CRC cases occur in industrialized countries, incidence rates are rising rapidly in economically transitioning countries. These observations highlight the importance of environmental influences on CRC development and suggest that Western lifestyle risk factors play an important role in the etiology of the disease. However, although environmental causes such as smoking and diet are undoubtedly risk factors for CRC, twin studies have shown that 35% of the variation in CRC susceptibility involves inherited genetic differences[9,10]. A minority of CRC cases (about 5%) show strong familial aggregation and belong to the well-known hereditary CRC forms, mainly caused by germline mutations in APC, MUTYH and the DNA mismatch repair genes. Approximately 30% of CRC cases show some family history of the disease but do not fit in the previous category and are regarded as familial CRC, whereas the majority of cases do not show any familial aggregation and correspond to sporadic CRC. For instance, familial CRC accounted for about 30% of all CRC cases in an epidemiological study in the Spanish population.
Mendelian cancer syndromes account for about 5% of the total burden of CRC. The genetic components involved in these less frequent hereditary forms were successfully identified by linkage analysis over the past two decades and correspond to rare, highly penetrant alleles that predispose to CRC. Two major subgroups can be distinguished clinically by the presence or absence of colorectal polyposis. An overview of all CRC syndromes is provided in Table 1. The most frequent forms are hereditary nonpolyposis colorectal cancer and familial adenomatous polyposis, which are further described below.
Hereditary Nonpolyposis Colorectal Cancer (HNPCC; MIM No. 120435), also known as Lynch syndrome, is the most common form of hereditary CRC, accounting for at least 3% of all CRC. HNPCC is an autosomal dominant syndrome defined clinically by the Amsterdam criteria (Table 2), which are based on strong familial aggregation and early onset and are used in clinical practice to identify at-risk individuals who require further evaluation. It is characterized by early-onset CRC (mean age at diagnosis, approximately 45 years), an excess of synchronous and metachronous colorectal neoplasms, and right-sided predominance compared to sporadic neoplasms. In addition, there is an increased incidence of extracolonic neoplasms (endometrial, small bowel, gastric, upper urinary tract, ovarian, brain and pancreatic tumors), with endometrial cancer being the most common extracolonic malignancy associated with Lynch syndrome. Indeed, Lynch syndrome is responsible for approximately 2% of all endometrial cancers. The lifetime risk of developing CRC in individuals affected with Lynch syndrome has been estimated at approximately 66% for men and about 43% for women. The cumulative risk of endometrial cancer is approximately 40%, and the lifetime risk of endometrial cancer or CRC in women is approximately 73%. Lynch syndrome tumors develop as a consequence of defective DNA mismatch repair (MMR) associated with germline mutations in the MMR genes, including MSH2 on chromosome 2p16, MLH1 on chromosome 3p21, MSH6 on chromosome 2p16, and PMS2 on chromosome 7p22. In addition, germline epigenetic inactivation of MLH1 by hypermethylation of its promoter can also lead to Lynch syndrome. Recently, germline deletions of the 3’ region of the EPCAM gene were found in a subset of families with Lynch syndrome. These deletions lead to promoter hypermethylation of MSH2, which is located immediately downstream of EPCAM.
The MMR system is necessary for maintaining genomic fidelity by correcting single-base mismatches and insertion-deletion loops during DNA replication. When MMR is defective, Lynch syndrome tumors accumulate errors in short repetitive sequences, a phenomenon called microsatellite instability (MSI), which is considered a hallmark of this disease. It is worth noting that in sporadic MSI CRC, loss of MLH1 expression due to hypermethylation of its promoter is a frequent event, and it is linked with the somatic V600E mutation in the BRAF gene.
Table 2 Amsterdam criteria I and II
Amsterdam criteria I: At least three relatives with CRC; all of the following must be met: (1) one affected individual is a first degree relative of the other two; (2) at least two successive generations affected; (3) at least one CRC diagnosed before the age of 50 years; (4) familial adenomatous polyposis has been excluded.
Amsterdam criteria II: At least three relatives with colorectal, endometrial, small bowel, ureter, or renal pelvis cancer; all of the following must be met: (1) one affected individual is a first degree relative of the other two; (2) at least two successive generations affected; (3) at least one tumor diagnosed before the age of 50 years; (4) familial adenomatous polyposis has been excluded.
CRC: Colorectal cancer.
Familial Adenomatous Polyposis (FAP; MIM No. 175100) is the most common polyposis syndrome, classically characterized by the development of hundreds to thousands of adenomatous polyps in the rectum and colon. FAP is an autosomal dominant disease and accounts for approximately 1% of all CRC cases. In the majority of patients, polyps begin to develop during the second decade of life, and nearly 100% of untreated patients will have malignancy by ages 40-50 years. Individuals with FAP can also develop a variety of extracolonic manifestations, including cutaneous lesions such as fibromas, lipomas, sebaceous and epidermoid cysts, facial osteomas, congenital hypertrophy of the retinal pigment epithelium, desmoid tumors and extracolonic cancers (thyroid, liver, biliary tract and central nervous system). Duodenal cancer is the second most common malignancy in FAP, with a lifetime risk of approximately 4%-12%. Adenomatous polyps are also found in the stomach and duodenum, especially in the periampullary area, and can develop into adenocarcinomas. After colectomy, periampullary carcinoma is the most common malignancy, occurring in approximately 5%-6% of patients. Some lesions, such as skull and mandible osteomas, dental abnormalities and fibromas, are indicative of Gardner syndrome, a clinical variant of FAP in which the extracolonic features are prominent. FAP is caused by germline mutations in the APC gene on chromosome 5q22, which encodes a tumor suppressor protein that plays an important role in the Wnt signaling pathway. Most patients have a family history of colorectal polyps and cancer, but de novo APC mutations are responsible for approximately 25% of cases.
APPROACHES TO IDENTIFY GENETIC VARIANTS FOR CRC RISK
Among CRC cases of unknown inherited cause, there are large families with a clear positive family history of CRC, which are likely caused by highly penetrant risk loci. In the last few years, it has been reported that approximately 40%-50% of CRC families that fulfill the Amsterdam criteria for Lynch syndrome do not show evidence of MMR deficiency. Studies of relatives in such families showed that CRC risk is lower than in families with Lynch syndrome, that CRC is diagnosed on average 10 years later, and that there is no increased incidence of extracolonic malignancies[20,21]. The designation Familial CRC type X was proposed to describe this type of CRC clustering. The genes responsible for this new entity remain unknown, and most patients are included in the heterogeneous group of non-syndromic familial CRC.
Recently, there have been several efforts, with uneven success, to identify additional genetic factors that predispose to CRC. Linkage analyses in affected families were able to pinpoint chromosomal regions of interest such as 9q22 and 3q22, but no clear CRC predisposition genes were identified after screening candidate genes within these regions[22,23].
Since the known high-risk syndromes account for only a small minority of CRC cases, there has been an intensified search for low-penetrance genetic variants that probably underlie part of the hereditary predisposition and, together with environmental interactions, are responsible for CRC as a complex disease. The common disease-common variant hypothesis has therefore also been considered. It is a polygenic model of inheritance in which the genetic components of common complex diseases correspond mostly to variants of low or moderate effect (typically < 1.5-fold increased risk) that appear at an elevated frequency in the population (> 5%), each exerting a small influence on disease risk. In this regard, case-control genome-wide association studies (GWAS) have been more successful, having so far discovered 31 common, low-penetrance genetic variants involved in CRC susceptibility[24-32] (Table 3).
Table 3 Genetic variants associated with colorectal cancer susceptibility identified by genome-wide association studies (as of September 2013). Columns include sample size (cases/controls) and effect size, OR (95%CI).
OVERVIEW OF NEW SEQUENCING TECHNOLOGIES
Until recently, the Sanger method was the dominant approach and gold standard for DNA sequencing. Next generation sequencing (NGS), also called massively parallel sequencing, is based on sequencing millions of DNA fragments at the same time. It combines DNA shearing, PCR amplification and sequencing with modified nucleotides attached to a reversible terminator and a fluorophore, which permits fluorescent detection with an imaging system. Once the fragments are sequenced, they are assembled de novo or aligned to a reference genome with bioinformatics tools, and positions that differ from the reference are designated as variants. Variants are then annotated by assigning their position in a gene, retrieving frequency information from genetic variation databases and categorizing them by functional class (nonsense, missense, synonymous, frameshift, splicing, intronic, untranslated regions, regulatory).
The advantage of NGS compared with conventional Sanger sequencing is that millions of DNA fragments are sequenced at the same time, which allows an entire human genome to be sequenced in a few days at a greatly reduced cost. However, data analysis, which includes filtering of false positives and prioritization of candidate variants for the phenotype under study, is the main bottleneck of NGS: it is time consuming and requires different strategies that will be discussed later. Another disadvantage of NGS is that the PCR amplification and sequencing reaction steps systematically introduce mistakes, producing base-calling errors and shorter sequenced fragments that are more difficult to map to the reference sequence. Owing to recent improvements in technology and variant calling algorithms, NGS is probably now more accurate than conventional Sanger sequencing. However, although the error rate associated with NGS is very small, a huge number of false positives are still detected since millions of variants are sequenced per genome. Thus, after data analysis and selection of the candidate variants, it is necessary to validate them using a technology with a different associated systematic error, such as conventional Sanger sequencing, which increases the cost and time of the analysis.
In order to detect genomic sequence variation by NGS, it is possible to sequence the entire genome (whole-genome sequencing, WGS) or to capture and sequence only specific regions of interest (targeted enrichment). The most commonly used application of NGS target enrichment in the human genome is whole-exome sequencing (WES), which captures and amplifies the entire protein-coding sequence (about 1% of the genome), flanking intronic regions and some noncoding RNAs. It is a cost-effective approach for detecting rare high-penetrance variants, based on the fact that over 85% of causative mutations for Mendelian disorders are in coding regions. One advantage of WES is that it is much cheaper than WGS, which allows a larger number of samples to be sequenced with better accuracy or coverage. The term coverage, also called read depth or depth, is the average number of times that a nucleotide has been sequenced in different sequencing reads. Also, data analysis pipelines are simpler for WES than for WGS. However, the need for larger amounts of DNA sample and the restriction to coding variants are among the shortcomings of this technique. It is worth mentioning that NGS target enrichment can also be used to sequence a panel of known genes for clinical diagnosis or regions of linkage disequilibrium for a disease.
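The definition of coverage given above can be illustrated with a short calculation. This is a minimal sketch under simplifying assumptions: every read is assumed to map uniquely within the target, and the read count, read length and target sizes are hypothetical round numbers (a 30 Mb exome is taken as roughly 1% of a 3 Gb genome), not figures from any study cited here.

```python
def mean_coverage(n_reads, read_length, target_size):
    """Average depth: total sequenced bases divided by the target size (in bases)."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return n_reads * read_length / target_size


# Hypothetical run: 30 million reads of 100 bases each.
exome_depth = mean_coverage(n_reads=30_000_000, read_length=100,
                            target_size=30_000_000)       # assumed ~30 Mb exome
genome_depth = mean_coverage(n_reads=30_000_000, read_length=100,
                             target_size=3_000_000_000)   # assumed ~3 Gb genome

print(exome_depth, genome_depth)
```

Under these assumptions, the same sequencing output yields a far greater average depth when restricted to the exome, which is one way to see why WES allows more samples to be sequenced at better coverage.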
The selection of individuals to sequence is a critical step for further analysis and will depend on the disease phenotype and the pattern of genetic inheritance. It should also be noted that good results can be obtained with NGS when using carefully selected patients, in contrast to GWAS, where the number of cases and controls compared needs to be much higher in order to obtain statistically significant findings. For diseases with genetic heterogeneity, such as human cancers, different strategies can be used, including the selection of families with strong disease aggregation or the sequencing of sporadic cases with early disease onset. Both situations are suggestive of the involvement of a germline predisposition. When focusing on families with several affected members, sequencing can be performed in several cases in each family and only the variants they share will be taken into account. On the other hand, if sporadic early-onset cases are chosen, genes with variants in different individuals can be selected. Sequencing non-affected individuals of the same family can be useful to discard the variants they share with patients, as long as the disease has complete penetrance or it is quite likely that the non-affected individuals will not develop the disease in their lifetime.
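The family-based strategy described above amounts to simple set operations on variant calls. The following is a minimal sketch, assuming each individual's calls are available as a set of variant identifiers; the function name `family_candidates` and the identifiers are hypothetical. Subtracting the calls of unaffected relatives is only appropriate under the condition stated in the text (complete penetrance, or relatives unlikely to develop the disease).

```python
def family_candidates(affected_calls, unaffected_calls=()):
    """Variants shared by all affected members and absent from unaffected ones."""
    if not affected_calls:
        raise ValueError("at least one affected individual is required")
    shared = set(affected_calls[0])
    for calls in affected_calls[1:]:
        shared &= set(calls)      # keep only variants present in every patient
    for calls in unaffected_calls:
        shared -= set(calls)      # discard variants seen in unaffected relatives
    return shared


# Hypothetical variant identifiers for two affected and one unaffected relative.
affected = [{"varA", "varB", "varC"}, {"varA", "varB", "varD"}]
unaffected = [{"varB", "varE"}]
candidates = family_candidates(affected, unaffected)
print(sorted(candidates))
```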
Data filtering and prioritization in NGS
Based on several recently sequenced individual genomes, approximately 3-4 million variants are expected to be found in a human genome by WGS and about 20000 single nucleotide variants in a human exome by WES, so a filtering strategy is necessary in order to eliminate as many false positives as possible. The first filter removes variants that do not pass a coverage threshold (typically 5-10x).
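The coverage filter can be sketched in a few lines. This is only an illustration: the record layout (a dictionary with a `depth` field), the variant identifiers and the choice of 10x within the 5-10x range mentioned above are assumptions.

```python
MIN_DEPTH = 10  # assumed threshold, upper end of the 5-10x range


def filter_by_depth(variants, min_depth=MIN_DEPTH):
    """Keep variants whose read depth reaches the coverage threshold."""
    return [v for v in variants if v["depth"] >= min_depth]


# Hypothetical variant calls with their read depth.
variants = [
    {"id": "var1", "depth": 42},
    {"id": "var2", "depth": 4},
    {"id": "var3", "depth": 10},
]
kept = filter_by_depth(variants)
print([v["id"] for v in kept])
```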
The second filtering process is based on the type of inheritance, the penetrance and the frequency of the disease. Regarding inheritance, for monogenic diseases where unrelated affected individuals have been sequenced, it is necessary to select only the genes that have variants in all of them. If a disease with genetic heterogeneity is studied, variants shared between the affected members of the same family and not shared by the unaffected ones will be chosen. Also, if inheritance is dominant, heterozygous mutations will be expected, whereas homozygous or compound heterozygous mutations will be selected in the case of recessive inheritance. However, variants in the non-pseudoautosomal regions of the X chromosome also have to be taken into account for dominant inheritance: in men they will be annotated as homozygous, and it is necessary to select these variants too rather than filter them out. Regarding the effect of the variant on the protein, it is assumed that the high-penetrance mutations causing Mendelian disorders have a large effect on protein function. Therefore, a positive selection for variants with a strong effect on the protein is advised, including those affecting canonical splice sites as well as frameshift, nonsense and missense mutations.
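The genotype and functional-class rules above can be sketched as a single selection function. This is an illustrative sketch, not a validated pipeline: the field names, gene labels and effect categories are hypothetical, and compound heterozygosity is approximated as two or more heterozygous hits in the same gene without checking phase.

```python
from collections import Counter

HIGH_IMPACT = {"splice_site", "frameshift", "nonsense", "missense"}


def select_candidates(variants, model, sex):
    """Keep high-impact variants whose genotype fits the inheritance model."""
    impactful = [v for v in variants if v["effect"] in HIGH_IMPACT]
    het_per_gene = Counter(v["gene"] for v in impactful if v["genotype"] == "het")
    selected = []
    for v in impactful:
        het = v["genotype"] == "het"
        # Men carry one X copy, so non-pseudoautosomal X variants are called homozygous.
        hemizygous = (v["genotype"] == "hom" and v["chrom"] == "X"
                      and sex == "male" and not v["pseudoautosomal"])
        if model == "dominant":
            keep = het or hemizygous
        elif model == "recessive":
            # Homozygous, or >= 2 heterozygous hits in one gene (phase not checked).
            keep = v["genotype"] == "hom" or het_per_gene[v["gene"]] >= 2
        else:
            raise ValueError("model must be 'dominant' or 'recessive'")
        if keep:
            selected.append(v)
    return selected


def _v(vid, gene, chrom, genotype, effect):
    """Build a hypothetical variant record outside pseudoautosomal regions."""
    return {"id": vid, "gene": gene, "chrom": chrom, "genotype": genotype,
            "effect": effect, "pseudoautosomal": False}


variants = [
    _v("v1", "GENE_A", "1", "het", "nonsense"),
    _v("v2", "GENE_B", "X", "hom", "missense"),
    _v("v3", "GENE_C", "2", "hom", "frameshift"),
    _v("v4", "GENE_D", "3", "het", "synonymous"),
    _v("v5", "GENE_E", "4", "het", "missense"),
    _v("v6", "GENE_E", "4", "het", "splice_site"),
]
dominant_male = [v["id"] for v in select_candidates(variants, "dominant", "male")]
recessive_male = [v["id"] for v in select_candidates(variants, "recessive", "male")]
print(dominant_male, recessive_male)
```

In this toy data set, the X-linked variant `v2` survives the dominant filter only because the individual is male, which mirrors the caveat in the text.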
Deleterious variants are proportionally more likely to be rare than polymorphic variants, so a causative mutation is not expected to be present at a high frequency in the general population. Thus, variants present at high frequency in reference genetic variation databases can be removed from the list of potential causative mutations.
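A frequency filter of this kind can be sketched as follows. The 1% cut-off is an assumption chosen for illustration (the text does not give a threshold), and the `population_af` field stands in for whatever allele frequency a reference database reports; variants absent from the database are kept, since novel variants remain candidates.

```python
MAX_POPULATION_AF = 0.01  # assumed cut-off; not specified in the text


def is_rare(variant, max_af=MAX_POPULATION_AF):
    """True if the variant is absent from the database or below the cut-off."""
    af = variant.get("population_af")
    return af is None or af <= max_af


# Hypothetical annotated variants.
variants = [
    {"id": "var1", "population_af": 0.25},    # common polymorphism
    {"id": "var2", "population_af": 0.0004},  # rare variant
    {"id": "var3", "population_af": None},    # not reported in the database
]
rare = [v["id"] for v in variants if is_rare(v)]
print(rare)
```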
However, many variants can still remain for each individual as putative causative mutations after filtering. A logical approach to reduce the number of candidate variants is to prioritize mutations in genes previously implicated in the disease under study. Also, since the protein products of genes responsible for the same disorder tend to physically interact with each other in order to carry out certain biological functions, another prioritization approach is to include genes that interact with those previously implicated in the disease. Finally, knowledge of the pathways implicated in a disease can also help to prioritize genes related to those pathways. After filtering and prioritization, a list of candidate variants will be available.
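One simple way to express this prioritization is to assign each gene to a tier and sort the candidates accordingly. The sketch below is an assumption about how such a ranking might be coded, not a published scoring scheme; the known genes are hereditary CRC genes named in this review, while `GENE_I`, `GENE_P` and `GENE_Z` are placeholders for an interactor, a pathway member and an unrelated gene.

```python
def priority_tier(gene, known, interactors, pathway):
    """Lower tier means higher priority."""
    if gene in known:
        return 0   # gene already implicated in the disease
    if gene in interactors:
        return 1   # product interacts with a known disease protein
    if gene in pathway:
        return 2   # gene belongs to an implicated pathway
    return 3       # no prior evidence


def prioritize(variants, known, interactors, pathway):
    """Sort variants by gene tier; ties keep their original order."""
    return sorted(variants,
                  key=lambda v: priority_tier(v["gene"], known, interactors, pathway))


known_genes = {"APC", "MUTYH", "MLH1", "MSH2"}
interacting_genes = {"GENE_I"}   # hypothetical interactor
pathway_genes = {"GENE_P"}       # hypothetical pathway member

candidates = [{"gene": g} for g in ["GENE_Z", "GENE_P", "APC", "GENE_I"]]
ranked = [v["gene"] for v in prioritize(candidates, known_genes,
                                        interacting_genes, pathway_genes)]
print(ranked)
```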
Validation by Sanger sequencing, or by any other PCR-based technology designed to detect a specific nucleotide change, is necessary after NGS to confirm the prioritized variants and exclude sequencing artifacts. Segregation analysis in families also makes it possible to check whether a candidate variant segregates correctly with the disease: affected members need to be carriers, and non-affected individuals old enough to have developed the disease should be non-carriers. Additionally, in the case of hereditary cancer, when heterozygous candidate variants with correct segregation are identified, it is necessary to confirm whether the second allele is lost in tumor DNA in order to establish the candidate gene as a tumor suppressor gene. Case-control screening studies can also be performed to identify additional carriers of the candidate variants in large disease cohorts and to further demonstrate their absence in controls. Finally, functional assessment of the candidate variant and the affected gene, by in vitro studies and animal models, will also be necessary to confirm the deleterious effect of the variant on the protein and prove its involvement in disease development.
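The segregation rule stated above can be written as a small check over family members. This is a sketch under stated assumptions: the pedigree records are hypothetical, and the age of 60 years used to decide whether an unaffected relative is "old enough" is an arbitrary illustrative value that would need to be set per disease.

```python
def segregates(members, min_informative_age=60):
    """True if carrier status is compatible with the disease in every member."""
    for m in members:
        if m["affected"] and not m["carrier"]:
            return False   # an affected relative lacks the variant
        old_enough = m["age"] >= min_informative_age
        if not m["affected"] and m["carrier"] and old_enough:
            return False   # an unaffected, old-enough relative carries it
    return True


# Hypothetical family: the young unaffected carrier (III-1) is uninformative.
family = [
    {"name": "I-1", "affected": True, "carrier": True, "age": 70},
    {"name": "II-1", "affected": True, "carrier": True, "age": 48},
    {"name": "II-2", "affected": False, "carrier": False, "age": 66},
    {"name": "III-1", "affected": False, "carrier": True, "age": 25},
]
print(segregates(family))
```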
NEW GENES IDENTIFIED FOR CRC GENETIC PREDISPOSITION
Recently available sequencing technologies, including exome and whole-genome sequencing, have provided a new approach for identifying genes responsible for human disease predisposition. Indeed, some seminal efforts have already been completed very recently for CRC. However, before these high-throughput technologies yielded results in CRC families, some low-throughput sequencing studies had reported directed screening of plausible candidate genes selected for various reasons. Most of these studies have not been replicated in additional cohorts and, therefore, there is a strong need to validate them further before considering these genes as hereditary CRC genes per se.
A truncating mutation was found in the CDH1 gene in a family with predisposition to CRC and gastric cancer, suggesting that germline mutations in this gene could contribute to early-onset CRC and gastric cancer. Later on, the AXIN2 gene, a component of the Wnt signaling pathway, was found to be mutated in a Finnish family with severe permanent tooth agenesis and CRC. In a subsequent study of patients with unexplained hamartomatous or hyperplastic/mixed polyposis, two patients with early-onset disease were found to have germline mutations in ENG, encoding endoglin, previously associated only with hereditary hemorrhagic telangiectasia. This study suggested ENG as a new predisposition gene for juvenile polyposis; however, in an additional study this gene was found to be mutated only in patients with ≥ 5 cumulative lifetime gastrointestinal polyps and not in juvenile polyposis. EPHB2 was also evaluated as a candidate tumor suppressor gene for CRC and was found to be mutated in 3 out of 116 population-based familial CRC cases, suggesting that this gene may contribute to a small fraction of hereditary CRC. In 2009, the GALNT12 gene was also found to be mutated in the germline of 6 CRC patients. This gene encodes one of the proteins involved in mucin type O-linked glycosylation and is located in chromosomal region 9q22, previously linked to familial CRC. A more recent study detected additional deleterious variants in this gene, reinforcing its role as a new candidate gene for hereditary CRC. Also, an inherited duplication affecting the protein tyrosine phosphatase PTPRJ and causing epigenetic silencing of this gene was detected in a CRC family without polyposis or MMR alteration, indicating that it may contribute to a fraction of hereditary CRC of unknown basis. Afterwards, BMP4, a gene close to 2 of the CRC genetic susceptibility variants identified by GWAS, was screened in 504 genetically enriched CRC cases and 3 pathogenic mutations were identified.
It is therefore plausible that some genes identified by CRC GWAS could also be involved in hereditary CRC. In 2011, the BMPR1A gene, previously involved in germline predisposition to juvenile polyposis and mixed polyposis, was also found to be mutated in familial CRC type X cases, expanding its phenotype to this hereditary CRC form. Finally, individuals with Cowden syndrome without germline PTEN mutations were found to carry germline mutations in PIK3CA and AKT1, expanding the genetic spectrum of this hereditary CRC condition.
Regarding NGS studies to identify new CRC predisposition genes, Palles et al very recently reported the identification, by whole-genome sequencing, of germline mutations in the POLE (polymerase (DNA directed), epsilon, catalytic subunit) and POLD1 (polymerase (DNA directed), delta 1, catalytic subunit) genes in individuals with multiple colorectal adenomas, carcinoma or both. POLE and POLD1 encode the catalytic and proofreading activities of the leading-strand DNA polymerase ε and the lagging-strand DNA polymerase δ, respectively. The proofreading capacity of the exonuclease domain is essential for the maintenance of replication fidelity and may act not only on newly misincorporated bases but also on mismatches produced by non-proofreading polymerases. They identified a heterozygous p.Leu424Val missense variant in POLE in a family affected with adenomas and CRC, and a p.Ser478Asn missense variant in POLD1 in a second family with CRC. The same POLD1 p.Ser478Asn variant was also identified in the affected members of an independent family. These findings were further validated in a screen of 3,085 individuals with CRC, enriched for a family history of colorectal tumors, in which they detected 12 individuals with the p.Leu424Val variant in POLE and one additional individual with the p.Ser478Asn variant in POLD1. Functional assessment supported the importance of these mutations in POLE and POLD1. Mutagenesis studies of Polδ and Pol3 in yeast showed that mutation of the equivalent residue produces a mutator phenotype and loss of the proofreading activity of the protein[53,55,56]. Also, mice expressing proofreading-impaired Pole and Pold1 in a homozygous state developed spontaneous intestinal adenocarcinomas or a spectrum of cancers. Thus, germline variants in POLE and POLD1 predispose individuals either to a multiple colorectal adenoma phenotype similar to that observed in MUTYH-associated polyposis or to an HNPCC phenotype, in which carriers develop early-onset CRC.
Although additional studies will be needed to evaluate these rare germline variants in POLD1 and POLE and their associated phenotypes, the authors suggest that screening for these variants should be considered in patients with an unexplained personal or family history of multiple adenomas, early-onset CRC or both. In addition, carriers are potential candidates for regular and frequent colonoscopic surveillance starting at an early age.
Two additional reports using exome sequencing have also been published very recently, but their results are not as solid as those for the polymerase genes mentioned above. In the first study, a cohort of 50 sporadic CRC patients, including 18 early-onset cases, was sequenced at relatively low coverage. Variant selection was biased toward those found in a list of 1,138 genes likely to play a role in CRC. Further selection to include only those genes undergoing biallelic inactivation yielded FANCM, LAMB4, PTCHD3, LAMC3 and TREX2 as potential tumor suppressor candidates. In the second study, exome sequencing was completed for 40 familial cases from 16 families, selecting distant relatives to decrease the number of shared, non-predisposition variants. Data were analyzed first by an agnostic search for CRC predisposition genes that did not rely on a biased list of candidates, and second by selecting genes previously involved in CRC predisposition or located within CRC linkage regions. Two missense variants in the CENPE and KIF23 genes, which complied with family segregation and lie in regions on chromosomes 1 and 15 formerly linked to CRC, were considered the most plausible candidates for CRC predisposition, but additional studies are needed to further elucidate their role.
CRC is one of the most frequent neoplasms and an important cause of mortality in the developed world. CRC is caused by both genetic and environmental factors, with about 35% of the variation in CRC susceptibility involving inherited genetic differences. Mendelian cancer syndromes account for about 5% of the total burden of CRC, with Lynch syndrome and familial adenomatous polyposis being the most common forms. Familial CRC type X is an example of CRC with an unknown inherited cause: a clear positive family history of CRC is present (the Amsterdam criteria for Lynch syndrome are fulfilled), yet MMR is proficient. When considering CRC as a complex disease, low-penetrance genetic variants probably underlie part of the hereditary predisposition, together with environmental interactions. So far, 30 susceptibility variants have been identified for CRC. Recently available sequencing technologies, including exome and whole-genome sequencing, have provided a new approach for identifying genes responsible for human disease predisposition. Germline mutations in the POLE and POLD1 genes are responsible for a new form of CRC genetic predisposition called polymerase proofreading-associated polyposis.
We are sincerely grateful to the Centre Nacional d’Anàlisi Genòmica and the Biobank of Hospital Clínic, Barcelona-IDIBAPS for technical help. The work was carried out (in part) at the Esther Koplowitz Centre, Barcelona.
P- Reviewers: Bustamante-Balen M, Hamfjord J, Ventham NT S- Editor: Qi Y L- Editor: A E- Editor: Liu XM
Ferlay J, Shin HR, Bray F, Forman D, Mathers C, Parkin DM. Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008. Int J Cancer. 2010;127:2893-2917.
Jemal A, Bray F, Center MM, Ferlay J, Ward E, Forman D. Global cancer statistics. CA Cancer J Clin. 2011;61:69-90.
Parker SL, Tong T, Bolden S, Wingo PA. Cancer statistics, 1996. CA Cancer J Clin. 1996;46:5-27.
Vogelstein B, Fearon ER, Hamilton SR, Kern SE, Preisinger AC, Leppert M, Nakamura Y, White R, Smits AM, Bos JL. Genetic alterations during colorectal-tumor development. N Engl J Med. 1988;319:525-532.
Leggett B, Whitehall V. Role of the serrated pathway in colorectal cancer pathogenesis. Gastroenterology. 2010;138:2088-2100.
Gingras D, Béliveau R. Colorectal cancer prevention through dietary and lifestyle modifications. Cancer Microenviron. 2011;4:133-139.
O'Callaghan T. Introduction: The prevention agenda. Nature. 2011;471:S2-S4.
Lichtenstein P, Holm NV, Verkasalo PK, Iliadou A, Kaprio J, Koskenvuo M, Pukkala E, Skytthe A, Hemminki K. Environmental and heritable factors in the causation of cancer--analyses of cohorts of twins from Sweden, Denmark, and Finland. N Engl J Med. 2000;343:78-85.
Hemminki K, Chen B. Familial risk for colorectal cancers are mainly due to heritable causes. Cancer Epidemiol Biomarkers Prev. 2004;13:1253-1256.
Piñol V, Castells A, Andreu M, Castellví-Bel S, Alenda C, Llor X, Xicola RM, Rodríguez-Moranta F, Payá A, Jover R. Accuracy of revised Bethesda guidelines, microsatellite instability, and immunohistochemistry for the identification of patients with hereditary nonpolyposis colorectal cancer. JAMA. 2005;293:1986-1994.
Hampel H, Frankel W, Panescu J, Lockman J, Sotamaa K, Fix D, Comeras I, La Jeunesse J, Nakagawa H, Westman JA. Screening for Lynch syndrome (hereditary nonpolyposis colorectal cancer) among endometrial cancer patients. Cancer Res. 2006;66:7810-7817.
Stoffel E, Mukherjee B, Raymond VM, Tayob N, Kastrinos F, Sparr J, Wang F, Bandipalliam P, Syngal S, Gruber SB. Calculation of risk of colorectal and endometrial cancer among patients with Lynch syndrome. Gastroenterology. 2009;137:1621-1627.
Suter CM, Martin DI, Ward RL. Germline epimutation of MLH1 in individuals with multiple cancers. Nat Genet. 2004;36:497-501.
Ligtenberg MJ, Kuiper RP, Chan TL, Goossens M, Hebeda KM, Voorendt M, Lee TY, Bodmer D, Hoenselaar E, Hendriks-Cornelissen SJ. Heritable somatic methylation and inactivation of MSH2 in families with Lynch syndrome due to deletion of the 3’ exons of TACSTD1. Nat Genet. 2009;41:112-117.
McGivern A, Wynter CV, Whitehall VL, Kambara T, Spring KJ, Walsh MD, Barker MA, Arnold S, Simms LA, Leggett BA. Promoter hypermethylation frequency and BRAF mutations distinguish hereditary non-polyposis colon cancer from sporadic MSI-H colon cancer. Fam Cancer. 2004;3:101-107.
Half E, Bercovich D, Rozen P. Familial adenomatous polyposis. Orphanet J Rare Dis. 2009;4:22.
Bülow S, Björk J, Christensen IJ, Fausa O, Järvinen H, Moesgaard F, Vasen HF. Duodenal adenomatosis in familial adenomatous polyposis. Gut. 2004;53:381-386.
Lindor NM, Rabe K, Petersen GM, Haile R, Casey G, Baron J, Gallinger S, Bapat B, Aronson M, Hopper J. Lower cancer incidence in Amsterdam-I criteria families without mismatch repair deficiency: familial colorectal cancer type X. JAMA. 2005;293:1979-1985.
Mueller-Koch Y, Vogelsang H, Kopp R, Lohse P, Keller G, Aust D, Muders M, Gross M, Daum J, Schiemann U. Hereditary non-polyposis colorectal cancer: clinical and molecular evidence for a new entity of hereditary colorectal cancer. Gut. 2005;54:1733-1740.
Wiesner GL, Daley D, Lewis S, Ticknor C, Platzer P, Lutterbaugh J, MacMillen M, Baliner B, Willis J, Elston RC. A subset of familial colorectal neoplasia kindreds linked to chromosome 9q22.2-31.2. Proc Natl Acad Sci USA. 2003;100:12961-12965.
Kemp Z, Carvajal-Carmona L, Spain S, Barclay E, Gorman M, Martin L, Jaeger E, Brooks N, Bishop DT, Thomas H. Evidence for a colorectal cancer susceptibility locus on chromosome 3q21-q24 from a high-density SNP genome-wide linkage scan. Hum Mol Genet. 2006;15:2903-2910.
Tenesa A, Dunlop MG. New insights into the aetiology of colorectal cancer from genome-wide association studies. Nat Rev Genet. 2009;10:353-358.
Fernandez-Rozadilla C, Cazier JB, Tomlinson IP, Carvajal-Carmona LG, Palles C, Lamas MJ, Baiget M, López-Fernández LA, Brea-Fernández A, Abulí A. A colorectal cancer genome-wide association study in a Spanish cohort identifies two variants associated with colorectal cancer risk at 1p33 and 8p12. BMC Genomics. 2013;14:55.
Jia WH, Zhang B, Matsuo K, Shin A, Xiang YB, Jee SH, Kim DH, Ren Z, Cai Q, Long J. Genome-wide association analyses in East Asians identify new susceptibility loci for colorectal cancer. Nat Genet. 2013;45:191-196.
Peters U, Jiao S, Schumacher FR, Hutter CM, Aragaki AK, Baron JA, Berndt SI, Bézieau S, Brenner H, Butterbach K. Identification of Genetic Susceptibility Loci for Colorectal Tumors in a Genome-Wide Meta-analysis. Gastroenterology. 2013;144:799-807.e24.
Houlston RS, Webb E, Broderick P, Pittman AM, Di Bernardo MC, Lubbe S, Chandler I, Vijayakrishnan J, Sullivan K, Penegar S. Meta-analysis of genome-wide association data identifies four new susceptibility loci for colorectal cancer. Nat Genet. 2008;40:1426-1435.
Houlston RS, Cheadle J, Dobbins SE, Tenesa A, Jones AM, Howarth K, Spain SL, Broderick P, Domingo E, Farrington S. Meta-analysis of three genome-wide association studies identifies susceptibility loci for colorectal cancer at 1q41, 3q26.2, 12q13.13 and 20q13.33. Nat Genet. 2010;42:973-977.
Tomlinson IP, Carvajal-Carmona LG, Dobbins SE, Tenesa A, Jones AM, Howarth K, Palles C, Broderick P, Jaeger EE, Farrington S. Multiple common susceptibility variants near BMP pathway loci GREM1, BMP4, and BMP2 explain part of the missing heritability of colorectal cancer. PLoS Genet. 2011;7:e1002105.
Dunlop MG, Dobbins SE, Farrington SM, Jones AM, Palles C, Whiffin N, Tenesa A, Spain S, Broderick P, Ooi LY. Common variation near CDKN1A, POLD3 and SHROOM2 influences colorectal cancer risk. Nat Genet. 2012;44:770-776.
Kinnersley B, Migliorini G, Broderick P, Whiffin N, Dobbins SE, Casey G, Hopper J, Sieber O, Lipton L, Kerr DJ. The TERT variant rs2736100 is associated with colorectal cancer risk. Br J Cancer. 2012;107:1001-1008.
Sanger F, Nicklen S, Coulson AR. DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA. 1977;74:5463-5467.
Wheeler DA, Srinivasan M, Egholm M, Shen Y, Chen L, McGuire A, He W, Chen YJ, Makhijani V, Roth GT. The complete genome of an individual by massively parallel DNA sequencing. Nature. 2008;452:872-876.
Ulahannan D, Kovac MB, Mulholland PJ, Cazier JB, Tomlinson I. Technical and implementation issues in using next-generation sequencing of cancers in clinical practice. Br J Cancer. 2013;109:827-835.
Koboldt DC, Ding L, Mardis ER, Wilson RK. Challenges of sequencing human genomes. Brief Bioinform. 2010;11:484-498.
Ng SB, Buckingham KJ, Lee C, Bigham AW, Tabor HK, Dent KM, Huff CD, Shannon PT, Jabs EW, Nickerson DA. Exome sequencing identifies the cause of a mendelian disorder. Nat Genet. 2010;42:30-35.
Desai AN, Jere A. Next-generation sequencing: ready for the clinics? Clin Genet. 2012;81:503-510.
DePristo MA, Banks E, Poplin R, Garimella KV, Maguire JR, Hartl C, Philippakis AA, del Angel G, Rivas MA, Hanna M. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat Genet. 2011;43:491-498.
Kimura M. The Neutral Theory of Molecular Evolution. New York: Cambridge Press; 1983.
Lim J, Hao T, Shaw C, Patel AJ, Szabó G, Rual JF, Fisk CJ, Li N, Smolyar A, Hill DE. A protein-protein interaction network for human inherited ataxias and disorders of Purkinje cell degeneration. Cell. 2006;125:801-814.
Richards FM, McKee SA, Rajpar MH, Cole TR, Evans DG, Jankowski JA, McKeown C, Sanders DS, Maher ER. Germline E-cadherin gene (CDH1) mutations predispose to familial gastric cancer and colorectal cancer. Hum Mol Genet. 1999;8:607-610.
Lammi L, Arte S, Somer M, Jarvinen H, Lahermo P, Thesleff I, Pirinen S, Nieminen P. Mutations in AXIN2 cause familial tooth agenesis and predispose to colorectal cancer. Am J Hum Genet. 2004;74:1043-1050.
Sweet K, Willis J, Zhou XP, Gallione C, Sawada T, Alhopuro P, Khoo SK, Patocs A, Martin C, Bridgeman S. Molecular classification of patients with unexplained hamartomatous and hyperplastic polyposis. JAMA. 2005;294:2465-2473.
Ngeow J, Heald B, Rybicki LA, Orloff MS, Chen JL, Liu X, Yerian L, Willis J, Lehtonen HJ, Lehtonen R. Prevalence of germline PTEN, BMPR1A, SMAD4, STK11, and ENG mutations in patients with moderate-load colorectal polyps. Gastroenterology. 2013;144:1402-1409, 1409.e1-5.
Zogopoulos G, Jorgensen C, Bacani J, Montpetit A, Lepage P, Ferretti V, Chad L, Selvarajah S, Zanke B, Hudson TJ. Germline EPHB2 receptor variants in familial colorectal cancer. PLoS One. 2008;3:e2885.
Guda K, Moinova H, He J, Jamison O, Ravi L, Natale L, Lutterbaugh J, Lawrence E, Lewis S, Willson JK. Inactivating germ-line and somatic mutations in polypeptide N-acetylgalactosaminyltransferase 12 in human colon cancers. Proc Natl Acad Sci USA. 2009;106:12921-12925.
Clarke E, Green RC, Green JS, Mahoney K, Parfrey PS, Younghusband HB, Woods MO. Inherited deleterious variants in GALNT12 are associated with CRC susceptibility. Hum Mutat. 2012;33:1056-1058.
Venkatachalam R, Ligtenberg MJ, Hoogerbrugge N, Schackert HK, Görgens H, Hahn MM, Kamping EJ, Vreede L, Hoenselaar E, van der Looij E. Germline epigenetic silencing of the tumor suppressor gene PTPRJ in early-onset familial colorectal cancer. Gastroenterology. 2010;139:2221-2224.
Lubbe SJ, Pittman AM, Matijssen C, Twiss P, Olver B, Lloyd A, Qureshi M, Brown N, Nye E, Stamp G. Evaluation of germline BMP4 mutation as a cause of colorectal cancer. Hum Mutat. 2011;32:E1928-E1938.
Nieminen TT, Abdel-Rahman WM, Ristimäki A, Lappalainen M, Lahermo P, Mecklin JP, Järvinen HJ, Peltomäki P. BMPR1A mutations in hereditary nonpolyposis colorectal cancer without mismatch repair deficiency. Gastroenterology. 2011;141:e23-e26.
Orloff MS, He X, Peterson C, Chen F, Chen JL, Mester JL, Eng C. Germline PIK3CA and AKT1 mutations in Cowden and Cowden-like syndromes. Am J Hum Genet. 2013;92:76-80.
Palles C, Cazier JB, Howarth KM, Domingo E, Jones AM, Broderick P, Kemp Z, Spain SL, Guarino E, Salguero I. Germline mutations affecting the proofreading domains of POLE and POLD1 predispose to colorectal adenomas and carcinomas. Nat Genet. 2013;45:136-144.
Seshagiri S. The burden of faulty proofreading in colon cancer. Nat Genet. 2013;45:121-122.
Jin YH, Ayyagari R, Resnick MA, Gordenin DA, Burgers PM. Okazaki fragment maturation in yeast. II. Cooperation between the polymerase and 3’-5’-exonuclease activities of Pol delta in the creation of a ligatable nick. J Biol Chem. 2003;278:1626-1633.
Murphy K, Darmawan H, Schultz A, Fidalgo da Silva E, Reha-Krantz LJ. A method to select for mutator DNA polymerase deltas in Saccharomyces cerevisiae. Genome. 2006;49:403-410.
Albertson TM, Ogawa M, Bugni JM, Hays LE, Chen Y, Wang Y, Treuting PM, Heddle JA, Goldsby RE, Preston BD. DNA polymerase epsilon and delta proofreading suppress discrete mutator and cancer phenotypes in mice. Proc Natl Acad Sci USA. 2009;106:17101-17104.
Smith CG, Naven M, Harris R, Colley J, West H, Li N, Liu Y, Adams R, Maughan TS, Nichols L. Exome resequencing identifies potential tumor-suppressor genes that predispose to colorectal cancer. Hum Mutat. 2013;34:1026-1034.
DeRycke MS, Gunawardena SR, Middha S, Asmann YW, Schaid DJ, McDonnell SK, Riska SM, Eckloff BW, Cunningham JM, Fridley BL. Identification of novel variants in colorectal cancer families by high-throughput exome sequencing. Cancer Epidemiol Biomarkers Prev. 2013;22:1239-1251.