Basic Study Open Access
Copyright ©The Author(s) 2015. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Gastroenterol. Apr 14, 2015; 21(14): 4136-4149
Published online Apr 14, 2015. doi: 10.3748/wjg.v21.i14.4136
Candidate colorectal cancer predisposing gene variants in Chinese early-onset and familial cases
Jun-Xiao Zhang, Richarda M de Voer, Marc-Manuel Hahn, Eugène TP Verwiel, Marjolijn JL Ligtenberg, Nicoline Hoogerbrugge, Roland P Kuiper, Ad Geurts van Kessel, Department of Human Genetics, Radboud University Medical Center, Radboud Institute for Molecular Life Sciences, 6500 HB Nijmegen, The Netherlands
Lei Fu, Peng Jin, Chen-Xi Lv, Jian-Qiu Sheng, Department of Gastroenterology, General Hospital of Beijing Military Region, Beijing 100700, China
Lei Fu, Third Military Medical University, Chongqing 400038, China
Marjolijn JL Ligtenberg, Department of Human Genetics and Department of Pathology, Radboud University Medical Center, 6500 HB Nijmegen, The Netherlands
Author contributions: Zhang JX, Fu L contributed equally to this paper; Zhang JX analyzed the data and performed the experiments; Fu L prepared the samples for exome sequencing, provided clinical information and performed validation experiments; de Voer RM, Hahn MM and Verwiel ET participated in the data analysis; Jin P participated in the sample collection; Lv CX performed the experiment for screening the control cohort; Ligtenberg MJ and Hoogerbrugge N participated in the design of the study; Kuiper RP, Sheng JQ and Geurts van Kessel A conceived and coordinated the study; Sheng JQ, de Voer RM, Kuiper RP and Geurts van Kessel A wrote the manuscript, which was approved by all co-authors.
Supported by research grants from the Dutch Cancer Society (KWF, KUN-4335), the Netherlands Organization for Scientific Research (NWO, 91710358), the Royal Dutch Academy of Sciences (KNAW), National Natural Science Foundation of China (NSFC, 81272194 and 81072041), and a scholarship from the China Scholarship Council (CSC) to Zhang JX.
Open-Access: This article is an open-access article which was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Correspondence to: Jian-Qiu Sheng, Professor, Department of Gastroenterology, General Hospital of Beijing Military Region, 5 Nanmenchang, Dongcheng, Beijing 100700, China. jianqiu@263.net
Telephone: +86-10-66721299 Fax: +86-10-66721299
Received: June 12, 2014
Peer-review started: June 13, 2014
First decision: July 21, 2014
Revised: October 17, 2014
Accepted: December 1, 2014
Article in press: December 1, 2014
Published online: April 14, 2015

Abstract

AIM: To investigate whether whole-exome sequencing may serve as an efficient method to identify known or novel colorectal cancer (CRC) predisposing genes in early-onset or familial CRC cases.

METHODS: We performed whole-exome sequencing in 23 Chinese patients from 21 families with non-polyposis CRC diagnosed at ≤ 40 years of age, or from multiple affected CRC families with at least 1 first-degree relative diagnosed with CRC at ≤ 55 years of age. Genomic DNA from blood was enriched for exome sequences using the SureSelect Human All Exon Kit, version 2 (Agilent Technologies) and sequencing was performed on an Illumina HiSeq 2000 platform. Data were processed through an analytical pipeline to search for rare germline variants in known or novel CRC predisposing genes.

RESULTS: In total, 32 germline variants in 23 genes were identified and confirmed by Sanger sequencing. In 6 of the 21 families (29%), we identified 7 mutations in 3 known CRC predisposing genes including MLH1 (5 patients), MSH2 (1 patient), and MUTYH (biallelic, 1 patient), five of which were reported as pathogenic. In the remaining 15 families, we identified 20 rare and novel potentially deleterious variants in 19 genes, six of which were truncating mutations. One previously unreported variant identified in a conserved region of EIF2AK4 (p.Glu738_Asp739insArgArg) was found to represent a local Chinese variant, which was significantly enriched in our early-onset CRC patient cohort compared to a control cohort of 100 healthy Chinese individuals scored negative by colonoscopy (33.3% vs 7%, P < 0.001).

CONCLUSION: Whole-exome sequencing of early-onset or familial CRC cases serves as an efficient method to identify known and potential pathogenic variants in established and novel candidate CRC predisposing genes.

Key Words: Colorectal cancer, Cancer predisposition, Early-onset, Germline variants, Exome sequencing

Core tip: Mendelian colorectal cancer (CRC) predisposition syndromes underlie about 5% of all CRCs, and are caused by germline mutations in a limited set of genes. The overall heritability of CRC, however, is estimated to be approximately 30%, and as yet many families at risk remain unexplained. This study identified seven mutations in known CRC predisposing genes (MLH1, MSH2 and MUTYH) in 6 of the 21 families (29%), five of which were previously reported as pathogenic. One previously unreported variant in EIF2AK4 (p.Glu738_Asp739insArgArg), located in a conserved region, was found to represent a local Chinese variant and was significantly enriched in our early-onset CRC patient cohort.



INTRODUCTION

Colorectal cancer (CRC; MIM 114500) is the third most common cancer worldwide and the fourth leading cause of cancer-related death, with over one million new cases diagnosed and approximately 600000 deaths each year[1]. In China, it is the third most common cancer and the fifth leading cause of death from cancer. Moreover, the incidence of CRC in China has been increasing in recent years[2]. Genetic factors are estimated to account for the development of approximately 30% of all CRC cases[3]. However, Mendelian colorectal cancer predisposition syndromes, such as Lynch syndrome (LS), familial adenomatous polyposis (FAP), MUTYH-associated polyposis (MAP), juvenile polyposis syndrome (JPS) and polymerase proofreading-associated polyposis (PPAP), account for only approximately 5%-10% of all CRC cases and are associated with high-penetrance germline mutations in various mismatch repair (MMR) genes or the APC, MUTYH, SMAD4, BMPR1A, POLE and POLD1 genes, respectively[4,5]. The remaining approximately 20%-25% of the cases are thought to be due to moderate- to low-penetrance variants, most of which remain to be identified.

A family history of CRC or an early age at diagnosis is especially suggestive of a hereditary contribution, and patients with these characteristics may be included in genetic association studies to increase the likelihood of identifying susceptibility variants[6-10]. Whereas CRC families with multiple affected individuals may be employed to search for high-penetrance genetic susceptibility variants using linkage-based approaches, moderate- to low-penetrance variants cannot be identified through linkage-based studies in large families. In more recent years, multiple low-penetrance genetic loci associated with CRC susceptibility have been identified by genome-wide association studies (GWAS)[11,12]. However, not all results from linkage studies turned out to be consistent, and GWAS are not ideal for the identification of rare variants. Recent advances in next-generation sequencing (NGS) technologies, in particular whole-exome sequencing, have provided efficient means to identify germline variants in individuals with familial or inherited cancer syndromes[5,13-15]. We hypothesized that the majority of the as yet unidentified CRC predisposing variants can be identified using whole-exome sequencing when applied to a strictly selected cohort of CRC patients and families. Several cellular signaling pathways appear to be involved in the development of CRC, including the WNT, DNA repair, BMP/TGF-β, apoptosis, MMIF/GIF, and PI3K/AKT pathways[16]. In addition, “sleeping beauty” transposon tagging has recently been employed as an effective forward genetic screening tool for the discovery of novel cancer initiating genes in the mouse intestinal tract, resulting in the identification of hundreds of novel candidate cancer driver genes[17-19].

In this study, we aimed to identify rare and novel germline variants in known and novel candidate CRC predisposing genes by performing whole-exome sequencing of germline DNA of 23 Chinese patients from 21 families diagnosed with non-polyposis CRC at a young age. We initially focused on genes that, based on genetic and functional data, are likely to play a role in CRC development, and on candidate genes that have been identified through GWAS.

MATERIALS AND METHODS
Recruitment of patient and control cohorts

Twenty-three patients from 21 families included in this study were recruited through the Department of Gastroenterology of the General Hospital of Beijing Military Region, Beijing, China. All patients were either diagnosed with CRC without polyposis at ≤ 40 years of age[20] or came from multiple affected CRC families with at least one first-degree relative diagnosed with CRC at ≤ 55 years of age. Additionally, 100 colonoscopy test-negative, unrelated controls with Chinese Han ancestry, without inflammatory bowel disease or any family history of CRC, were collected from a subject pool who participated in health check-up programs, including colonoscopy, at the Department of Gastroenterology of the General Hospital of Beijing Military Region, Beijing, China. This study was approved by the Institutional Review Board of the General Hospital of Beijing Military Region (No. 2014-035), and all patients provided written informed consent.

Whole-exome sequencing

Genomic DNA was extracted from peripheral blood cells using a QIAamp DNA Kit (QIAGEN, Hilden, Germany) according to the protocol provided by the manufacturer, and whole-exome sequencing was performed at the Beijing Genomics Institute (BGI, Shenzhen, China) according to the manufacturer’s guidelines. Briefly, genomic DNA was fragmented and enriched for exome sequences using the SureSelect Human All Exon Kit, version 2 (Agilent Technologies, Santa Clara, CA, United States) and sequencing was performed at a minimal average coverage of 50× on an Illumina HiSeq 2000 platform (Illumina, Inc., San Diego, CA).

Bioinformatics analyses

After removing sequence adaptors and low-quality reads, Burrows-Wheeler Aligner (BWA)[21] was used to align the reads to the NCBI human reference genome (hg19). Single nucleotide variants (SNVs) were called using SOAPsnp[22] and small insertions/deletions (indels) were detected using the SAMtools software package[23]. All variants were annotated using an in-house annotation pipeline, as described previously[24]. High-confidence variants (total ≥ 10 reads, ≥ 5 variant reads and ≥ 20% variant reads) were subsequently prioritized for variants that were non-synonymous and not found in our in-house database (1302 in-house analyzed exomes, mostly of European ancestry). In addition, dbSNPv138, the National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project database (ESP, 6503 exomes, http://evs.gs.washington.edu/EVS/), and 700 control exome data sets from Chinese subjects with Han ancestry (Juan Tian and Zhimin Feng, BGI, personal communication) were used to exclude recurrent variants with a minor allele frequency (MAF) > 0.001.
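
The read-depth and allele-frequency thresholds described above can be illustrated with a minimal Python sketch. The record layout (`VariantCall`) and the function names are hypothetical and are not part of the in-house pipeline used in this study; only the numeric thresholds are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical per-variant record; field names are illustrative only."""
    total_reads: int           # reads covering the position
    variant_reads: int         # reads supporting the variant allele
    max_population_maf: float  # highest MAF seen in any reference database (0.0 if absent)


def is_high_confidence(call, min_total=10, min_variant=5, min_fraction=0.20):
    """Apply the three quality thresholds stated in the text."""
    if call.total_reads < min_total or call.variant_reads < min_variant:
        return False
    return call.variant_reads / call.total_reads >= min_fraction


def is_rare(call, maf_cutoff=0.001):
    """Keep variants whose MAF does not exceed the cutoff (MAF > 0.001 is excluded)."""
    return call.max_population_maf <= maf_cutoff


def prioritize(calls):
    """Return the calls that pass both the quality and the rarity filter."""
    return [c for c in calls if is_high_confidence(c) and is_rare(c)]
```

In this sketch a call with 60 total reads but only 6 variant reads is rejected, because the variant fraction (10%) is below the 20% threshold even though both absolute read counts pass.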

Functional impact of variant analyses

Non-synonymous variants that result in alterations in protein function, including protein truncation, splice site defects and missense mutations at highly conserved (phyloP ≥ 3.0) nucleotide positions, were included in our analyses. Alamut v.2.0 software (Interactive Biosoftware) and integrated mutation prediction software (Align GVGD, SIFT and PolyPhen-2)[25-27] packages were used for analyses of the identified variants. The prediction of splicing effects was evaluated based on five different algorithms (SpliceSiteFinder, MaxEntScan, NNSPLICE, GeneSplicer, Human Splicing Finder) through the bioinformatics tools of the Alamut v.2.0 software. The online tool “Project HOPE”[28] (http://www.cmbi.ru.nl/hope/) was used for revealing the structural consequences of missense mutations.
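
The inclusion rule above can be expressed as a small predicate. The following Python sketch is an illustration only: the consequence labels and the function name are assumptions rather than the annotation vocabulary of the pipeline used here, and in-frame indels are included because they are counted among the protein-altering variants in the Results.

```python
from typing import Optional

# Hypothetical consequence labels; a real annotation pipeline may use other terms.
ALWAYS_INCLUDED = {
    "nonsense",
    "frameshift",
    "canonical_splice_site",
    "inframe_insertion",
    "inframe_deletion",
}


def alters_protein_function(consequence: str,
                            phylop: Optional[float] = None,
                            min_phylop: float = 3.0) -> bool:
    """Return True if a variant would be retained by the inclusion rule.

    Missense variants are retained only at highly conserved positions
    (phyloP >= min_phylop); a missing phyloP score is treated as not conserved.
    """
    if consequence in ALWAYS_INCLUDED:
        return True
    if consequence == "missense":
        return phylop is not None and phylop >= min_phylop
    return False
```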

Candidate gene selection

We initially selected germline variants in CRC predisposing genes known to be associated with hereditary CRC syndromes and searched for evidence of pathogenicity in relevant databases, i.e., InSiGHT (http://www.insight-group.org/), LOVD (https://atlas.cmm.ki.se/LOVDv.2.0/) and the Mismatch Repair Genes Variant Database (http://www.med.mun.ca/mmrvariants/).

In addition to identifying variants in known CRC predisposing genes, we searched for potentially pathogenic variants in novel candidate genes using the remaining exome data of our CRC patient cohort. For the selection of these variants, we focused on genes that met the following criteria: (1) genes exhibiting recurrent variants; (2) 582 known cancer genes, including somatically mutated cancer genes (Cancer Gene Census, http://www.sanger.ac.uk/genetics/CGP/Census/)[29,30], cancer predisposing genes of which rare germline variants are known to confer a highly or moderately increased risk of cancer and for which at least 5% of individuals with the relevant variants develop cancer[31], and genes that are included in the Radboud university medical center hereditary cancer gene list[32]; (3) 286 genes that have been identified as candidate CRC driver genes by the “sleeping beauty” transposon tagging system in mice[18,19]; (4) 588 genes included in the following KEGG pathways: WNT signaling pathway (hsa04310), TGF-β signaling pathway (hsa04350), base excision repair (BER, hsa03410), nucleotide excision repair (NER, hsa03420), mismatch repair (MMR, hsa03430), non-homologous end-joining (NHEJ, hsa03450), Fanconi anemia pathway (hsa03460) and pathways involved in cancer (hsa05200); and (5) 268 genes likely to play a role in CRC susceptibility identified by GWAS[11,12,33,34] and included in the NHGRI GWAS Catalog (http://www.genome.gov/gwastudies/)[35].
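
The five criteria amount to retaining a variant when its gene either belongs to one of the predefined gene lists or is affected in more than one family. A minimal Python sketch of this logic is given below. The tuple layout, the function names and the reading of "recurrent" as "seen in at least two families" are assumptions made for illustration, and the toy input uses placeholder gene names rather than study data.

```python
from collections import defaultdict


def recurrently_affected_genes(variants, min_families=2):
    """Genes with variants in at least `min_families` distinct families.

    `variants` is an iterable of (family_id, gene, variant_label) tuples
    (a hypothetical layout chosen for this sketch).
    """
    families_by_gene = defaultdict(set)
    for family_id, gene, _label in variants:
        families_by_gene[gene].add(family_id)
    return {g for g, fams in families_by_gene.items() if len(fams) >= min_families}


def select_candidates(variants, gene_panels, min_families=2):
    """Keep variants in any gene panel or in a recurrently affected gene."""
    panel_genes = set().union(*gene_panels.values())
    recurrent = recurrently_affected_genes(variants, min_families)
    return [v for v in variants if v[1] in panel_genes or v[1] in recurrent]


# Toy input with placeholder names (not data from this study).
toy_panels = {"pathway_panel": {"GENE_A"}, "gwas_panel": {"GENE_B"}}
toy_variants = [
    ("fam1", "GENE_A", "v1"),  # retained: gene is in a panel
    ("fam2", "GENE_C", "v2"),  # retained: GENE_C is hit in two families
    ("fam3", "GENE_C", "v3"),
    ("fam4", "GENE_D", "v4"),  # dropped: not in a panel, not recurrent
]
selected = select_candidates(toy_variants, toy_panels)
```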

Variant validation by Sanger sequencing

Identified germline variants were validated by Sanger sequencing after PCR amplification. The PCR primers were designed in silico using the Primer3 software package[36]. PCR reactions were performed on a Dual 96-Well GeneAmp PCR System 9700 (Applied Biosystems) using standard protocols (primer sequences available upon request). Mutation analyses were performed using the Vector NTI software package (Invitrogen, Paisley, United Kingdom).

RESULTS
Patient cohort characteristics

In order to identify known and potential pathogenic variants in established and novel candidate CRC predisposing genes, we performed whole-exome sequencing on germline DNA of 23 CRC patients from 21 families with non-polyposis CRC diagnosed at ≤ 40 years of age (n = 16), or from multiple affected CRC families with at least one first-degree relative diagnosed with CRC at ≤ 55 years of age (n = 7). The mean age at diagnosis was 38.6 years, and 43% (n = 10) of the patients were female (Table 1).

Table 1 Clinical characteristics and family histories of 23 early-onset and familial colorectal cancer patients.

| Patient ID | Gender | Patient's history | Family history |
| --- | --- | --- | --- |
| 43-1A | Female | RC at 37 yr | Brother RC at 53 yr |
| 43-2A | Male | RC at 53 yr | Sister RC at 37 yr |
| 49-4A | Male | CC at 30 yr | Brother CC at 43 yr; sister CC at 23 yr |
| 49-5A | Female | CC at 23 yr | Brother CC at 43 yr; brother CC at 30 yr |
| 50-11A | Male | CC at 34 yr and relapse at 36 | Father CRC at 35 yr and death at 52 yr; brother CC at 34 yr and death at 36 yr |
| 54-2A | Female | RC at 44 yr | Sister CRC; brother CC at 76 yr and death |
| 66-1-1A | Female | CC at 47 yr | Sister CC at 51 yr |
| 71A | Female | RC at 57 yr | Sister RC at 53 yr |
| 77-1A | Female | CRC at 38 yr | Father EC at 64 yr and death; uncle CRC at 68 yr and death |
| 102-1A | Male | RC at 25 yr | |
| 103-1A | Male | CC at 53 yr | Brother CC at 36 yr and death at 48 yr; mother IO at 63 yr and death |
| 106-2A | Male | JC at 34 yr, CC at 39 yr, KC at 44 yr and PC at 45 yr | Father EC and death; mother RC at 42 yr and death; sister CP |
| 108-1A | Male | RC at 33 yr | |
| 110-1A | Male | CC at 36 yr | |
| 116-1A | Female | CC at 31 yr and HC at 57 yr | Brother intussusception and death at 40 yr; brother CC at 50 yr, RC and SMT at 58 yr; brother IC at 50 yr, CC at 53 yr and RC at 61 yr; sister GC at 56 yr |
| 120-1A | Female | RC at 36 yr | |
| 142-1A | Male | RC at 34 yr | |
| 149-1A | Male | CRC at 31 yr | Father EC and death; mother GC at 56 yr |
| 154-1A | Female | CRC at 40 yr | Father HC, RC and death at 57 yr |
| 156-1A | Female | CRC at 54 yr | Sister CP at 54 yr; sister CP; mother CC at 48 yr; grandfather EC and death |
| 164-1A | Male | CC at 30 yr | Uncle colonitis at 42 yr |
| 165-1A | Male | CRC at 43 yr | Sister RC at 31 yr and death; grandmother RC at 65 yr and death |
| 180-1 | Male | CRC at 40 yr | Sister CP at 46 yr |

Exome sequencing performance

Overall, we generated a mean of 68 M raw reads per sample, of which 77.6% to 89.5% were aligned to the human reference genome (hg19; Table 2). The mean coverage of the exome for the 23 samples was 58.5× (range: 53.0-64.7×). On average, 87.03% of the target region was covered at least 10 times and 76.35% was covered at least 20 times.
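
As a worked check of the alignment figures, the mapped fraction of a sample can be computed as total mapped reads divided by total reads. The Python sketch below uses the read counts of two samples from Table 2 whose rates match the ends of the reported range; the function name is hypothetical, and the assumption is that the reported percentages refer to the "Total mapped" column.

```python
def mapped_fraction(total_reads: int, mapped_reads: int) -> float:
    """Fraction of raw reads that were aligned to the reference genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mapped_reads / total_reads


# (total reads, total mapped) taken from Table 2.
samples = {
    "49-4A": (67025248, 51978393),
    "142-1A": (72999930, 65321752),
}

for sample_id, (total, mapped) in samples.items():
    print(f"{sample_id}: {mapped_fraction(total, mapped):.1%}")
# 49-4A: 77.6%
# 142-1A: 89.5%
```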

Table 2 Alignment and coverage statistics for 23 early-onset and familial colorectal cancer patients.

| Sample ID | Total reads | Total mapped | Reads mapped to genome | Covered | Covered 10× | Covered 20× | Average target coverage |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 43-1A | 62997602 | 52130593 | 45528212 | 93.30% | 85.80% | 74.30% | 55.88× |
| 43-2A | 57099664 | 50367772 | 43924906 | 93.60% | 86.40% | 75.00% | 54.94× |
| 49-4A | 67025248 | 51978393 | 45418701 | 93.30% | 85.70% | 74.10% | 55.30× |
| 49-5A | 60632336 | 51598017 | 45450431 | 92.60% | 85.00% | 73.40% | 55.49× |
| 50-11A | 68991044 | 58507454 | 51033221 | 94.00% | 87.20% | 76.80% | 60.71× |
| 54-2A | 68459832 | 57860336 | 50820626 | 93.40% | 86.50% | 76.10% | 61.71× |
| 66-1-1A | 69759994 | 58838035 | 51472112 | 94.10% | 87.60% | 77.50% | 61.82× |
| 71A | 68055130 | 58181783 | 51012277 | 94.00% | 87.60% | 77.70% | 61.50× |
| 77-1A | 65956248 | 56894265 | 49817369 | 93.80% | 87.30% | 77.10% | 61.06× |
| 102-1A | 64702600 | 57086284 | 49672873 | 94.40% | 87.90% | 77.70% | 59.92× |
| 103-1A | 66004146 | 55109962 | 48218769 | 93.80% | 86.90% | 76.40% | 59.28× |
| 106-2A | 61956558 | 54367359 | 47567033 | 93.80% | 86.80% | 76.00% | 57.97× |
| 108-1A | 64764180 | 56469665 | 49473520 | 94.00% | 87.20% | 76.50% | 57.52× |
| 110-1A | 68883264 | 56962439 | 49975545 | 94.10% | 87.30% | 76.90% | 59.31× |
| 116-1A | 68975484 | 60681318 | 53305706 | 93.70% | 86.70% | 75.60% | 55.86× |
| 120-1A | 64307066 | 56593051 | 49900259 | 94.20% | 87.30% | 75.90% | 53.02× |
| 142-1A | 72999930 | 65321752 | 57822754 | 94.70% | 87.60% | 76.20% | 53.19× |
| 149-1A | 69636008 | 59789305 | 52641740 | 93.90% | 87.20% | 76.90% | 61.19× |
| 154-1A | 80632788 | 63934448 | 56196297 | 94.40% | 88.30% | 78.90% | 64.69× |
| 156-1A | 94340904 | 82125696 | 73199086 | 94.20% | 87.10% | 76.10% | 56.48× |
| 164-1A | 67813680 | 58837471 | 51779826 | 93.60% | 86.50% | 75.80% | 58.83× |
| 165-1A | 68657326 | 59845292 | 52561646 | 94.30% | 87.80% | 77.70% | 60.99× |
| 180-1 | 65727112 | 57908057 | 50742938 | 94.40% | 87.90% | 77.50% | 59.01× |
| Average | 68190354 | 58321250 | 51197211 | 93.90% | 87.03% | 76.35% | 58.51× |

We identified on average 46437 SNVs (range: 44353-48114) and 1678 indels (range: 1630-1719) per exome. Over 95.3% of these substitutions and 73.1% of indels represented known variants listed in private and public databases (Figure 1). A prioritization scheme was applied to identify candidate variants (Table 3). Initial quality filtering (total ≥ 10 reads, ≥ 5 variant reads and ≥ 20% variant reads) resulted in the identification of 13819 genetic variants in coding regions or canonical splice sites, including 9833 non-synonymous changes. A total of 4432 variants that result in alterations in protein function, including 172 nonsense variants, 188 frame shift variants, 943 canonical splice site variants, 237 in-frame deletions, 191 in-frame insertions and 2701 missense variants with high conservation scores (phyloP ≥ 3.0), were identified. Subsequently, we excluded known variants present in our in-house database and variants with MAF > 0.001 in dbSNPv138, reducing the number of variants to 2883. We then prioritized variants in known CRC predisposing genes and in genes likely to play a role in CRC development, and excluded variants with MAF > 0.001 in the ESP database or in the 700 control exomes from Chinese subjects with Han ancestry, thereby reducing the number of candidate variants to 61. Of these 61, 39 (32 different variants in 23 genes) were validated by Sanger sequencing (Figure 2).

Figure 1 Variant statistics (marked in colors) for 23 early-onset colorectal cancer patient samples. The numbers of detected variants are listed on the left hand side and the patient sample IDs on the bottom. The different colors represent different types of variants, i.e., purple represents “private indels”, green represents “private SNVs”, and red and blue represent known indels and known SNVs, respectively, listed in the 1000 Genomes, dbSNPv138 and in-house databases. SNV: Single nucleotide variant.
Figure 2 Germline variants identified in known colorectal cancer predisposing genes and genes likely to play a role in colorectal cancer development. The genes are listed on the left hand side and the patient samples on top. Patient samples from the same families are marked (bars). Known colorectal cancer (CRC) predisposing genes are marked by shading (left). The shades at the right hand side of the figure indicate functional (groups of) genes considered to play a role in CRC development. The different variant types are indicated in colors (right). The red-triangle/green-triangle square in sample 180-1 indicates the presence of one MUTYH nonsense and one MUTYH missense mutation.
Table 3 Prioritization scheme for exome data analysis of all 23 patients.

| Type of prioritization filter | Remaining variants (n) |
| --- | --- |
| All variants | 1106642 |
| Coding region and canonical splice site variants after quality filtering (total ≥ 10 reads, ≥ 5 variant reads and ≥ 20% variant reads) | 13819 |
| Non-synonymous variants, canonical splice site variants | 9833 |
| Variants that result in alterations in protein function (protein truncation, splice site defects and missense mutations at highly conserved (phyloP ≥ 3.0) nucleotide positions) | 4432 |
| Not in in-house database and MAF ≤ 0.001 in dbSNPv138 | 2883 |
| Variants in known CRC predisposing genes and genes likely to play a role in CRC development (MAF ≤ 0.001 in ESP and 700 control Chinese exome data sets) | 61 |
| Variants/genes validated by Sanger sequencing | 39 (32 different variants in 23 genes) |

Identification of germline variants in known CRC predisposing genes

A total of seven CRC patients (30%) from six families (29%) were identified with germline variants in known CRC predisposing genes. Of these, five variants (in four patients) were reported as being pathogenic in public databases, three of which were located in MLH1[37] (Table 4), including a canonical splice site mutation (c.453+1G>T) in patient 106-2A (colon cancer at age of 39), a canonical splice site mutation (c.208-1G>A) in patient 116-1A (colon cancer at age of 31), and a missense mutation (c.677G>A, p.Arg226Gln) in patient 43-1A (rectal cancer at age of 37). This latter mutation has been reported to result in a complete skipping of exon 8 at the mRNA level[38]. The brother of patient 43-1A was also subjected to exome sequencing (patient 43-2A, rectal cancer at age of 53), but the MLH1 mutation c.677G>A was not encountered in this patient, and subsequent Sanger sequencing confirmed this finding. Compound heterozygous MUTYH mutations (p.Gln267* and p.Gly286Glu) were found in patient 180-1 (CRC at age of 40). The sister of patient 180-1 (colonic polyps at age of 46) also carried both MUTYH mutations (p.Gln267* and p.Gly286Glu). Both mutations have been reported to be causative for MUTYH-associated polyposis (MAP)[39,40].

Table 4 Identification of germline mutations in known colorectal cancer predisposing genes.

| Sample ID | Gene name | Gene ID | Genomic change | cDNA change | Protein change | Pathogenicity |
| --- | --- | --- | --- | --- | --- | --- |
| 43-1A | MLH1 | NM_000249 | g.chr3:37053590G>A | c.677G>A | p.Arg226Gln | Yes[38,44] |
| 106-2A | MLH1 | NM_000249 | g.chr3:37048555G>T | c.453+1G>T | SSM | Yes[42] |
| 116-1A | MLH1 | NM_000249 | g.chr3:37042445G>A | c.208-1G>A | SSM | Yes[43] |
| 180-1 | MUTYH | NM_001128425 | g.chr1:45797972G>A | c.799C>T | p.Gln267* | Yes[39] |
| 180-1 | MUTYH | NM_001128425 | g.chr1:45797914C>T | c.857G>A | p.Gly286Glu | Yes[40] |
| 49-4A | MLH1 | NM_000249 | g.chr3:37067252_37067253insT | c.1163_1164insT | p.Arg389Profs*6 | NR |
| 49-5A | MLH1 | NM_000249 | g.chr3:37067252_37067253insT | c.1163_1164insT | p.Arg389Profs*6 | NR |
| 49-4A | MSH6 | NM_000179 | g.chr2:48027422C>G | c.2300C>G | p.Thr767Ser | NR |
| 50-11A | MSH2 | NM_000251 | g.chr2:47641406A>T | c.793-2A>T | SSM | NR |

Three mismatch repair gene variants, observed in patients from two unrelated families, were not previously reported in public databases. A novel splice site mutation in MSH2 (c.793-2A>T) was identified in patient 50-11A (colon cancer at age of 34). According to the Alamut prediction, this canonical splice site is inactivated and a splice site seven nucleotides downstream is used instead. Both a frame shift mutation in MLH1 (p.Arg389Profs*6) and a missense variant in MSH6 (p.Thr767Ser) were found in patient 49-4A (colon cancer at age of 30). The MLH1 mutation p.Arg389Profs*6 was also found in his sister, patient 49-5A (colon cancer at age of 23), whereas this sister was found to be negative for the MSH6 variant p.Thr767Ser. Segregation analysis of four siblings and the mother in this family (Figure 3) showed that two brothers of index patient 49-4A, i.e., family members II:1 (colon cancer at age of 43 years) and II:3 (no cancer), carried both mutations. Neither the MLH1 p.Arg389Profs*6 mutation-positive, MSH6 wild-type mother (I:2) nor the MLH1 wild-type, MSH6 p.Thr767Ser variant-positive brother (II:4) developed cancer. We, therefore, conclude that the MLH1 frame shift mutation (p.Arg389Profs*6) acts as the main contributor to the development of CRC in this family.
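
The segregation argument can be restated as a simple check: a variant that drives disease in a family is expected to be present in every affected member. The Python sketch below encodes the genotypes and phenotypes reported for this family (Figure 3) and tests that condition for both variants. The data structure and function name are illustrative assumptions, and the check ignores penetrance and age at observation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    member_id: str
    affected: bool  # diagnosed with colon cancer
    mlh1: bool      # carries MLH1 p.Arg389Profs*6
    msh6: bool      # carries MSH6 p.Thr767Ser


# Genotypes and phenotypes as reported for family 49.
FAMILY_49 = [
    Member("I:2", False, True, False),    # mother
    Member("II:1", True, True, True),     # brother, colon cancer at 43
    Member("II:2", False, False, False),  # sister
    Member("II:3", False, True, True),    # brother
    Member("II:4", False, False, True),   # brother
    Member("II:5", True, True, True),     # index patient 49-4A
    Member("II:6", True, True, False),    # index patient 49-5A
]


def carried_by_all_affected(members, variant: str) -> bool:
    """True if every affected member carries the named variant."""
    affected = [m for m in members if m.affected]
    return bool(affected) and all(getattr(m, variant) for m in affected)


mlh1_segregates = carried_by_all_affected(FAMILY_49, "mlh1")  # True
msh6_segregates = carried_by_all_affected(FAMILY_49, "msh6")  # False: II:6 is affected without it
```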

Figure 3 Pedigree and segregation analysis of MLH1 and MSH6 mutations in family members of the index patients. Index patients are indicated by arrows. Both index patients, II:5 (sample 49-4A) and II:6 (sample 49-5A), carried the MLH1 frame shift mutation (c.1163_1164insT, p.Arg389Profs*6), and II:5 also carried the MSH6 missense mutation (c.2300C>G, p.Thr767Ser). Two brothers, II:1 (colon cancer at age of 43) and II:3 (no cancer), carried both mutations. A sister (II:2, no cancer) carried neither the MLH1 nor the MSH6 mutation. A third brother (II:4) carried the MSH6 mutation, but not the MLH1 mutation, and the mother of the index patients carried the MLH1 mutation, but not the MSH6 mutation; neither developed cancer.
Rare germline variants of novel candidate CRC predisposing genes

After exclusion of variants in known CRC predisposing genes, a set of 24 rare candidate germline variants remained (Table 5). Of these, seven represent truncating mutations (five frame-shift indels, one nonsense and one canonical splice site). In addition, one in-frame insertion and 16 highly conserved non-synonymous missense variants are present in this set. For these latter variants, the SIFT and PolyPhen-2 algorithms were used to estimate their functional effects on the respective encoded proteins. In all cases, both SIFT and PolyPhen-2 predicted the variants to be damaging or possibly/probably damaging (Table 6). Four rare or novel variants were found in cancer predisposing genes that are not directly linked to an increased CRC risk, including ATM p.Lys468Glufs*18 in patient 102-1A (rectal cancer at age of 25 years), MAX p.Leu61Serfs*15 in patient 66-1-1A (colon cancer at age of 47 years), TSC2 p.Asp1734Asn in patient 164-1A (colon cancer at age of 30 years) and ETV4 p.Glu331Lys in patient 71A (rectal cancer at age of 57 years). ATM and MAX are involved in DNA repair pathways, and TSC2 plays a role in the PI3K/AKT pathway. These pathways are also active in CRC. Interestingly, in patient 66-1-1A we also observed a potentially deleterious variant in PARP1 (p.Lys254Glufs*6), another gene involved in DNA repair.

Table 5 Characteristics of 24 variants identified in 19 novel genes likely to play a role in colorectal cancer development.

| Sample ID | Gene name | Gene/pathway involved | cDNA change | Protein change | rs ID in dbSNP138 | MAF (700 Chinese exomes) | MAF (NHLBI ESP) | MAF (1000 Genomes) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 102-1A | ATM | Cancer gene, DNArep | c.1402_1403del | p.Lys468Glufs*18 | NR | NR | NR | NR |
| 66-1-1A | PARP1 | DNArep | c.758dup | p.Lys254Glufs*6 | NR | NR | 0.000077 | NR |
| 66-1-1A | MAX | Cancer gene | c.181del | p.Leu61Serfs*15 | NR | NR | NR | NR |
| 106-2A | BUB1 | Cancer gene | c.46C>T | p.Gln16* | NR | NR | NR | NR |
| 149-1A | BUB1 | Cancer gene | c.2844del | p.Gln949Argfs*3 | NR | NR | NR | NR |
| 165-1A | LIG3 | DNArep | c.218del | p.Phe73Serfs*41 | NR | NR | NR | NR |
| 54-2A | MCC | Transposon studies | c.1355+1_1355+2ins14 | SSM | NR | NR | NR | NR |
| 49-4A | EIF2AK4 | GWAS related | c.2214_2215insCGACGA | p.Glu738_Asp739insArgArg | NR | NR | NR | NR |
| 71A | EIF2AK4 | GWAS related | c.2214_2215insCGACGA | p.Glu738_Asp739insArgArg | NR | NR | NR | NR |
| 103-1A | EIF2AK4 | GWAS related | c.2214_2215insCGACGA | p.Glu738_Asp739insArgArg | NR | NR | NR | NR |
| 108-1A | EIF2AK4 | GWAS related | c.2214_2215insCGACGA | p.Glu738_Asp739insArgArg | NR | NR | NR | NR |
| 120-1A | EIF2AK4 | GWAS related | c.2214_2215insCGACGA | p.Glu738_Asp739insArgArg | NR | NR | NR | NR |
| 154-1A | EIF2AK4 | GWAS related | c.2214_2215insCGACGA | p.Glu738_Asp739insArgArg | NR | NR | NR | NR |
| 164-1A | EIF2AK4 | GWAS related | c.2214_2215insCGACGA | p.Glu738_Asp739insArgArg | NR | NR | NR | NR |
| 77-1A | LRP5 | WNT | c.2156A>G | p.Tyr719Cys | NR | NR | NR | NR |
| 43-1A | LRP5 | WNT | c.3536G>A | p.Arg1179His | NR | NR | 0.000077 | NR |
| 54-2A | LRP5 | WNT | c.3919C>T | p.Arg1307Trp | NR | NR | 0.000077 | NR |
| 110-1A | RPS6KB2 | PI3K/AKT | c.331A>G | p.Lys111Glu | NR | 0.00075 | NR | NR |
| 43-1A | RPS6KB2 | PI3K/AKT | c.683C>A | p.Thr228Asn | rs183360785 | NR | NR | 0.001 |
| 43-1A | RYR2 | Somatic mutation gene | c.2701G>A | p.Gly901Ser | NR | NR | NR | NR |
| 103-1A | RYR2 | Somatic mutation gene | c.6457A>G | p.Lys2153Glu | NR | NR | NR | NR |
| 102-1A | RYR3 | Somatic mutation gene | c.13507G>A | p.Val4503Met | NR | NR | NR | NR |
| 71A | ETV4 | Cancer gene | c.991G>A | p.Glu331Lys | NR | NR | NR | NR |
| 103-1A | PRDM1 | Cancer gene | c.1499A>G | p.Gln500Arg | rs201512476 | NR | NR | 0.001 |
| 164-1A | TSC2 | Cancer gene, PI3K/AKT | c.5200G>A | p.Asp1734Asn | NR | NR | NR | NR |
| 71A | MTOR | PI3K/AKT | c.5857G>T | p.Val1953Leu | NR | 0.000714 | NR | NR |
| 154-1A | DAAM1 | WNT | c.667G>A | p.Val223Met | NR | NR | NR | NR |
| 71A | FZD10 | WNT | c.1341C>G | p.Phe447Leu | NR | NR | NR | NR |
| 164-1A | TCF7 | WNT | c.572G>T | p.Arg191Met | NR | NR | NR | NR |
| 71A | MAST2 | Transposon studies | c.3482A>G | p.Asn1161Ser | NR | NR | 0.000077 | NR |

Table 6 In silico functional prediction of 16 missense variants.

| Sample ID | Gene name | Gene/pathway involved | cDNA change | Protein change | Domain | PhyloP score | Grantham score | Align GVGD | SIFT score | SIFT prediction | Polyphen2 score | Polyphen2 prediction |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 77-1A | LRP5 | WNT | c.2156A>G | p.Tyr719Cys | LDLR class B repeat | 4.751 | 194 | C65 | 0.000 | D | 0.999 | PrD |
| 43-1A | LRP5 | WNT | c.3536G>A | p.Arg1179His | LDLR class B repeat | 3.712 | 29 | C25 | 0.000 | D | 0.953 | PrD |
| 54-2A | LRP5 | WNT | c.3919C>T | p.Arg1307Trp | LDLR class A repeat | 3.172 | 101 | C35 | 0.000 | D | 0.948 | PrD |
| 110-1A | RPS6KB2 | PI3K/AKT | c.331A>G | p.Lys111Glu | Protein kinase, catalytic domain | 4.639 | 56 | C55 | 0.000 | D | 0.535 | PoD |
| 43-1A | RPS6KB2 | PI3K/AKT | c.683C>A | p.Thr228Asn | Protein kinase, catalytic domain | 5.062 | 65 | C55 | 0.001 | D | 0.994 | PrD |
| 43-1A | RYR2 | Somatic mutation gene | c.2701G>A | p.Gly901Ser | Ryanodine receptor | 6.081 | 56 | C55 | 0.010 | D | 1.000 | PrD |
| 103-1A | RYR2 | Somatic mutation gene | c.6457A>G | p.Lys2153Glu | Intracellular calcium-release channel | 5.067 | 56 | C0 | 0.020 | D | 0.615 | PoD |
| 102-1A | RYR3 | Somatic mutation gene | c.13507G>A | p.Val4503Met | Ryanodine Receptor TM 4-6 | 6.012 | 21 | C15 | 0.000 | D | 1.000 | PrD |
| 71A | ETV4 | Cancer gene | c.991G>A | p.Glu331Lys | PEA3-type ETS-domain transcription factor, N-terminal | 6.424 | 56 | C55 | 0.001 | D | 0.862 | PoD |
| 103-1A | PRDM1 | Cancer gene | c.1499A>G | p.Gln500Arg | Zinc finger, C2H2 | 4.875 | 43 | C0 | 0.050 | D | 0.570 | PoD |
| 164-1A | TSC2 | Cancer gene, PI3K/AKT | c.5200G>A | p.Asp1734Asn | Rap/ran-GAP | 5.538 | 23 | C0 | 0.000 | D | 0.998 | PrD |
| 71A | MTOR | PI3K/AKT | c.5857G>T | p.Val1953Leu | PIK-related kinase | 5.634 | 32 | C0 | 0.001 | D | 0.827 | PoD |
| 154-1A | DAAM1 | WNT | c.667G>A | p.Val223Met | Diaphanous GTPase-binding | 6.347 | 21 | C0 | 0.000 | D | 0.998 | PrD |
| 71A | FZD10 | WNT | c.1341C>G | p.Phe447Leu | Frizzled protein | 4.229 | 22 | C15 | 0.000 | D | 0.984 | PrD |
| 164-1A | TCF7 | WNT | c.572G>T | p.Arg191Met | High mobility group, HMG1/HMG2 | 4.202 | 91 | C65 | 0.000 | D | 0.999 | PrD |
| 71A | MAST2 | Transposon studies | c.3482A>G | p.Asn1161Ser | PDZ/DHR/GLGF | 4.854 | 46 | C0 | 0.000 | D | 0.999 | PrD |

Genes recurrently affected by potentially deleterious variants

Despite the limited size of our cohort, the recurrent detection of rare potentially deleterious variants is another way to select candidates from the list of rare variants. Four genes were found to be recurrently affected by different rare variants, and two of them (BUB1 and LRP5) were encountered in patients who also carried pathogenic MLH1 mutations (patients 106-2A and 43-1A, respectively; Figure 2). In total, two truncating BUB1 variants were found (p.Gln16* and p.Gln949Argfs*3). As reported previously, these BUB1 variants may be associated with an increased risk for aneuploidy and, in patient 106-2A, this may have contributed to somatic loss of the wild-type MLH1 allele in the tumor[15]. The other recurrently affected genes were LRP5, RPS6KB2 and RYR2. LRP5 may be of particular interest since it is a component of the WNT-FZD-LRP5-LRP6 complex that triggers β-catenin signaling through the induction of aggregation of receptor-ligand complexes into ribosome-sized signalosomes. We identified three highly conserved LRP5 missense variants in three unrelated patients (Figure 2). Two of these, p.Tyr719Cys and p.Arg1179His, were found to be located in the conserved low-density lipoprotein receptor (LDLR) class B repeat region. To investigate the functional consequences of these three mutations on the LRP5 protein structure, the online tool “Project HOPE” was used. This analysis indicated that variant p.Tyr719Cys gives rise to a mutant residue that is smaller and more hydrophobic than the wild-type residue, which may lead to loss of protein-protein interactions and hydrogen bonds and/or disturb correct protein folding. In variant p.Arg1179His, a positively charged residue is replaced by a neutral and smaller residue, which again may lead to loss of interactions with other molecules or residues. In variant p.Arg1307Trp, a positively charged residue is replaced by a neutral, larger and more hydrophobic residue, which may lead to loss of interactions with other molecules or residues, loss of hydrogen bonds and/or disturbance of correct protein folding, giving rise to collisions with other molecules or residues.
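The residue-level reasoning for the three LRP5 variants can be illustrated with a minimal sketch. This is not the Project HOPE algorithm; it only compares three textbook properties of the wild-type and mutant residues, assuming Kyte-Doolittle hydropathy indices, Zamyatnin residue volumes (in Å³) and nominal side-chain charges at physiological pH (histidine is treated as neutral). The function name `describe_substitution` and the table layout are hypothetical.

```python
# Illustrative only: NOT the Project HOPE method.
# Assumed property values: Kyte-Doolittle hydropathy index, Zamyatnin residue
# volume (cubic angstroms), nominal side-chain charge at physiological pH
# (His is assumed to be neutral).
PROPERTIES = {
    "Tyr": {"hydropathy": -1.3, "volume": 193.6, "charge": 0},
    "Cys": {"hydropathy": 2.5, "volume": 108.5, "charge": 0},
    "Arg": {"hydropathy": -4.5, "volume": 173.4, "charge": 1},
    "His": {"hydropathy": -3.2, "volume": 153.2, "charge": 0},
    "Trp": {"hydropathy": -0.9, "volume": 227.8, "charge": 0},
}

# The three LRP5 missense variants described in the text.
LRP5_VARIANTS = {
    "p.Tyr719Cys": ("Tyr", "Cys"),
    "p.Arg1179His": ("Arg", "His"),
    "p.Arg1307Trp": ("Arg", "Trp"),
}


def describe_substitution(wild_type, mutant):
    """Compare size, hydrophobicity and charge of two residues (hypothetical helper)."""
    wt, mt = PROPERTIES[wild_type], PROPERTIES[mutant]
    return {
        "size": "smaller" if mt["volume"] < wt["volume"] else "larger",
        "more_hydrophobic": mt["hydropathy"] > wt["hydropathy"],
        "charge_lost": wt["charge"] != 0 and mt["charge"] == 0,
    }


for name, (wild_type, mutant) in LRP5_VARIANTS.items():
    print(name, describe_substitution(wild_type, mutant))
```

Under these assumed scales, the comparison agrees with the description above: p.Tyr719Cys yields a smaller, more hydrophobic residue, whereas p.Arg1179His and p.Arg1307Trp both lose the positive charge and yield a smaller and a larger residue, respectively.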

We also identified a recurrent insertion in EIF2AK4 (p.Glu738_Asp739insArgArg) in seven (33.3%) unrelated patients, which was absent from in-house and public databases. EIF2AK4 is located in a region previously found to be associated with CRC susceptibility in GWAS[11,35]. Since this variant could be common in the Han Chinese population, we screened a cohort of 100 colonoscopy-negative, unrelated local Han Chinese individuals using Sanger sequencing. We found that 7 (7%) of them carried this variant, revealing a significant enrichment in the early-onset/familial CRC cohort as compared to the ethnicity-matched control cohort (χ² test, P = 0.000604).
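The enrichment test can be sketched as follows, under two assumptions that the text does not state explicitly: that the seven carriers are counted among 21 unrelated index patients (the denominator implied by 33.3%), and that the χ² test is the uncorrected Pearson statistic with one degree of freedom. The function name `pearson_chi2_2x2` is hypothetical.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2x2 table [[a, b], [c, d]] (1 df)."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom:
    # P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumed counts: cases 7 carriers / 14 non-carriers (7 of 21),
# controls 7 carriers / 93 non-carriers (7 of 100).
chi2, p_value = pearson_chi2_2x2(7, 14, 7, 93)
print(f"chi2 = {chi2:.2f}, P = {p_value:.6f}")
```

With these counts the statistic is about 11.76 and the P value is close to 0.0006, which is consistent with the reported value.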

DISCUSSION

To identify rare and novel germline variants that may predispose to CRC, we applied whole-exome sequencing to 23 Chinese patients from 21 families with non-polyposis CRC diagnosed at ≤ 40 years of age or from multiple affected CRC families with at least one first-degree relative diagnosed with CRC at ≤ 55 years of age. Initially we selected variants in genes that are known to be associated with hereditary CRC syndromes, and we assessed their pathogenicity as reported in public databases such as InSiGHT, LOVD and the Mismatch Repair Genes Variant database. Among the 23 patients included, we identified seven patients (from six families; approximately 30%) with variants in known CRC predisposing genes. This percentage is lower than that previously reported by Tanskanen et al[41] (42%, 16/38) in a cohort of early-onset CRC patients (< 40 years) using exome sequencing. In that study, four of the 38 patients were clinically diagnosed with gastrointestinal polyposis (three FAP and one JPS), and 12 carried germline MMR mutations, which were enriched in patients with MSI tumors (86%, 12/14). This discrepancy may be due to the fact that our cohort is a non-polyposis cohort and also includes patients from multiple affected CRC families with at least one first-degree relative diagnosed with CRC at ≤ 55 years of age. In our cohort, six patients were identified with variants in the high-penetrance genes MLH1, MSH2 and MSH6 underlying Lynch syndrome. In addition, we identified biallelic MUTYH mutations, underlying MAP, in one index patient (patient 180-1, CRC at age of 40) and the patient's sister (colonic polyps at age of 46). Of the eight variants that we identified in known high-penetrance CRC predisposing genes, MLH1 c.453+1G>T, MLH1 c.208-1G>A, MLH1 c.677G>A, MUTYH p.Gln267* and MUTYH p.Gly286Glu were reported as being pathogenic in public databases[39,40,42-44]. In addition, we identified novel rare variants of which two, MLH1 p.Arg389Profs*6 and MSH2 c.793-2A>T, are most likely pathogenic based on both familial segregation and in silico prediction analyses.

In our search for novel germline predisposing variants, we focused on known cancer-associated genes, CRC pathway-associated genes, mouse CRC susceptibility genes identified by transposon (‘sleeping beauty’) tagging, GWAS-associated genes and genes with reported somatic mutations that are considered likely to be involved in CRC predisposition and/or development. Using these criteria, we identified a total of 19 novel candidate CRC susceptibility genes carrying rare, likely deleterious, variants.

One ATM truncating variant (p.Lys468Glufs*18) identified in patient 102-1A (rectal cancer at age of 25) may be particularly relevant. ATM is a gene encoding a protein that belongs to the PI3/PI4-kinase family[45]. The ATM protein represents an important cell cycle checkpoint kinase that is required for a cell’s response to DNA damage and for ensuring genomic integrity[46]. Diseases associated with ATM mutations include ataxia telangiectasia (AT), an autosomal recessive disorder[47]. Because of its role in maintaining genomic integrity, ATM may, when mutated, increase the risk for tumor development[48]. Indeed, germline mutations in ATM have been shown to increase the risk of breast cancer development through the (de)regulation of BRCA1[49]. In addition, loss of heterozygosity at the ATM locus has been found in CRC[50]. Taken together, it appears plausible that germline ATM mutations may increase the risk for CRC development. However, considering the high frequency of truncating mutations in the ESP database and the in-house database, targeted screening of ATM in a large early-onset and/or familial CRC cohort is crucial. Another interesting candidate is the truncating MAX variant (p.Leu61Serfs*15) identified in patient 66-1-1A. The protein encoded by the MAX gene represents the most conserved dimerization component of the MYC-MAX-MXD1 network of basic helix-loop-helix leucine zipper (bHLHZ) transcription factors that regulate cellular proliferation, differentiation and apoptosis[51,52]. It has been shown that the MAX protein interacts with MSH2[53], and that mutant MAX is able to alter the growth and morphology of CRC cells through inactivation of c-MYC[32]. Mutations in the MAX gene have been reported to be associated with the occurrence of hereditary pheochromocytomas and paragangliomas[54]. Interestingly, an additional truncating variant in PARP1 (p.Lys254Glufs*6) was identified in this patient (66-1-1A). PARP1 is activated in response to DNA damage and plays an important role in DNA repair processes, apoptosis and cell cycle control[55]. Since MAX and PARP1 are both involved in DNA repair, and since it has been shown that PARP1 is essential for c-MYC-induced transactivation and retardation of the G2-M transition in cancer cells[56], the combination of these two variants may have a synergistic effect. Therefore, we anticipate that both truncating variants most likely play a role in CRC development in this family.

Other interesting candidate genes recurrently affected by potentially deleterious variants include BUB1, LRP5 and EIF2AK4. Two truncating variants in BUB1 (p.Gln16* and p.Gln949Argfs*3) were found to be present in patient 106-2A and patient 149-1A, respectively. The BUB1 protein is an integral component of the spindle assembly checkpoint (SAC), and we have previously shown that germline variants in the corresponding gene may serve as risk factors for CRC[15]. Patient 106-2A was found to carry both the BUB1 p.Gln16* variant and the MLH1 c.453+1G>T variant. We suggest that the BUB1 variant may have contributed to loss of the wild-type MLH1 allele in this patient[15]. This scenario requires validation in larger CRC cohorts.

Three missense LRP5 variants (p.Tyr719Cys, p.Arg1179His and p.Arg1307Trp), found in three CRC cases, were predicted to be deleterious. LRP5 p.Tyr719Cys and LRP5 p.Arg1307Trp were observed in patient 54-2A and patient 77-1A, respectively. In both cases no other putative pathogenic germline variants were detected. Variant LRP5 p.Arg1179His was found in patient 43-1A, who also carried a pathogenic MLH1 c.677G>A splice site mutation. The LRP5 protein is a component of the WNT-FZD-LRP5-LRP6 complex and, as such, represents an important partner in the WNT signal transduction pathway[57]. Variants LRP5 p.Tyr719Cys and p.Arg1179His are both located in the conserved low-density lipoprotein receptor (LDLR) class B repeat region of LRP5, which is the binding region of Dickkopf-1, a developmental protein antagonist of the canonical WNT-β-catenin pathway[58]. Further assessment of both LRP5 variants using the “Project HOPE” tool indicated that these variants may also result in loss of interactions with other proteins or residues. It has previously been shown that truncated LRP5 proteins are frequently expressed in breast tumors of different developmental stages[59] and that these proteins are strongly implicated in the deregulation of the WNT-β-catenin signaling pathway in hyperparathyroid tumors[60].

One EIF2AK4 variant (p.Glu738_Asp739insArgArg) was recurrently found in seven (33.3%) unrelated patients within our cohort. After comparison of our cohort to an ethnicity-matched control cohort, this variant was found to be significantly enriched (P = 0.000604). We therefore conclude that this gene may also be considered a candidate CRC predisposing gene.

A major challenge of using whole-exome sequencing is the identification of predisposing pathogenic variants within the vast background of non-pathogenic variants. Targeted screening of those genes and variants in replicate large early-onset and/or familial CRC cohorts will be instrumental in gaining more robust evidence for pathogenicity. Our current results, however, already vividly illustrate that whole-exome sequencing in carefully selected cases at risk for hereditary cancer may serve as an attractive approach to identify rare and novel variants in known and novel candidate CRC predisposing genes.

ACKNOWLEDGMENTS

We thank Dr. Ying Han, Dr. Hai-Hong Wang, Dr. Xin Wang, Dr. Ai-Qin Li, Dr. Xiao-Wei Wang and Dr. Hui Su from the Department of Gastroenterology, General Hospital of Beijing Military Region, Beijing, China for their kind help in sample collection, and we thank the patients and families for their participation and cooperation in this study.

COMMENTS
Background

Mendelian colorectal cancer (CRC) predisposition syndromes underlie about 5% of all CRC cases, and are caused by germline mutations in a limited set of genes. The current selection of causative genes to be screened in high-risk families is based on several phenotypic characteristics, including polyposis (e.g., APC and MUTYH) and microsatellite instability (MLH1, MSH2, MSH6 and PMS2). The overall heritability of CRC, however, is estimated to be approximately 30%. Excluding hereditary forms, there is an important fraction of CRC cases that present familial aggregation for the disease with an unknown germline genetic cause.

Research frontiers

CRC patients with a family history of CRC or an early age at diagnosis are especially suggestive of a hereditary contribution and may be used in genetic association studies to increase the likelihood of identifying susceptibility variants. Whereas CRC families with multiple affected individuals may be employed to search for high penetrance genetic susceptibility variants using linkage-based approaches, moderate- to low-penetrance variants cannot be identified through linkage-based studies in large families. In more recent years, multiple low-penetrance genetic loci associated with CRC susceptibility have been identified by genome-wide association studies (GWAS). However, not all results from linkage studies turned out to be consistent, and GWAS are not ideal for the identification of rare variants. Recently, advances in next-generation sequencing technologies, in particular whole-exome sequencing, have provided efficient means to identify germline variants in individuals with familial or inherited cancer syndromes.

Innovations and breakthroughs

A major challenge of using whole-exome sequencing is the identification of predisposing pathogenic variants within the vast background of non-pathogenic variants. In this study, we performed whole-exome sequencing in a strictly selected cohort of very young CRC patients (diagnosed at ≤ 40 years of age) and familial CRC cases. The data were processed through a tailored analytical pipeline to search for rare germline variants in known or novel CRC predisposing genes.

Applications

The study shows that whole-exome sequencing of early-onset or familial CRC cases serves as an efficient method to identify known and potential pathogenic variants in established and novel candidate CRC predisposing genes. The findings also provide insight into the role of these variants in CRC development. Targeted screening of those genes and variants in replicate large early-onset and/or familial CRC cohorts will be instrumental in gaining more robust evidence for pathogenicity.

Terminology

“Early-onset” CRC: CRC is traditionally thought to be a disease of older patients, with most being diagnosed after the age of 50 years; however, a significant proportion of young patients present with this disease. Early age of onset is a central characteristic of hereditary predisposition to cancer. Familial aggregation of tumors and hereditary cases are consistently more frequent under the age of 40 years.

Peer-review

This study investigated the efficiency of whole-exome sequencing in identifying known or novel CRC predisposing genes in early-onset or familial CRC cases. This is a well-written paper describing a stringently performed study. Although the number of included patients is very low, the authors present very interesting results with a straightforward conclusion.

Footnotes

P- Reviewer: Krieg A S- Editor: Gou SX L- Editor: Wang TQ E- Editor: Liu XM

References
1. Jemal A, Bray F, Center MM, Ferlay J, Ward E, Forman D. Global cancer statistics. CA Cancer J Clin. 2011;61:69-90.
2. Chen W, Zheng R, Zhang S, Zhao P, Li G, Wu L, He J. Report of incidence and mortality in China cancer registries, 2009. Chin J Cancer Res. 2013;25:10-21.
3. Lichtenstein P, Holm NV, Verkasalo PK, Iliadou A, Kaprio J, Koskenvuo M, Pukkala E, Skytthe A, Hemminki K. Environmental and heritable factors in the causation of cancer--analyses of cohorts of twins from Sweden, Denmark, and Finland. N Engl J Med. 2000;343:78-85.
4. de la Chapelle A. Genetic predisposition to colorectal cancer. Nat Rev Cancer. 2004;4:769-780.
5. Palles C, Cazier JB, Howarth KM, Domingo E, Jones AM, Broderick P, Kemp Z, Spain SL, Guarino E, Salguero I. Germline mutations affecting the proofreading domains of POLE and POLD1 predispose to colorectal adenomas and carcinomas. Nat Genet. 2013;45:136-144.
6. Lynch HT, Smyrk T. Hereditary nonpolyposis colorectal cancer (Lynch syndrome). An updated review. Cancer. 1996;78:1149-1167.
7. Schoen RE. Families at risk for colorectal cancer: risk assessment and genetic testing. J Clin Gastroenterol. 2000;31:114-120.
8. Gryfe R, Kim H, Hsieh ET, Aronson MD, Holowaty EJ, Bull SB, Redston M, Gallinger S. Tumor microsatellite instability and clinical outcome in young patients with colorectal cancer. N Engl J Med. 2000;342:69-77.
9. Giráldez MD, Balaguer F, Bujanda L, Cuatrecasas M, Muñoz J, Alonso-Espinaco V, Larzabal M, Petit A, Gonzalo V, Ocaña T. MSH6 and MUTYH deficiency is a frequent event in early-onset colorectal cancer. Clin Cancer Res. 2010;16:5402-5413.
10. Chang DT, Pai RK, Rybicki LA, Dimaio MA, Limaye M, Jayachandran P, Koong AC, Kunz PA, Fisher GA, Ford JM. Clinicopathologic and molecular features of sporadic early-onset colorectal adenocarcinoma: an adenocarcinoma with frequent signet ring cell differentiation, rectal and sigmoid involvement, and adverse morphologic features. Mod Pathol. 2012;25:1128-1139.
11. Tenesa A, Dunlop MG. New insights into the aetiology of colorectal cancer from genome-wide association studies. Nat Rev Genet. 2009;10:353-358.
12. Houlston RS, Cheadle J, Dobbins SE, Tenesa A, Jones AM, Howarth K, Spain SL, Broderick P, Domingo E, Farrington S. Meta-analysis of three genome-wide association studies identifies susceptibility loci for colorectal cancer at 1q41, 3q26.2, 12q13.13 and 20q13.33. Nat Genet. 2010;42:973-977.
13. Jones S, Hruban RH, Kamiyama M, Borges M, Zhang X, Parsons DW, Lin JC, Palmisano E, Brune K, Jaffee EM. Exomic sequencing identifies PALB2 as a pancreatic cancer susceptibility gene. Science. 2009;324:217.
14. Comino-Méndez I, Gracia-Aznárez FJ, Schiavi F, Landa I, Leandro-García LJ, Letón R, Honrado E, Ramos-Medina R, Caronia D, Pita G. Exome sequencing identifies MAX mutations as a cause of hereditary pheochromocytoma. Nat Genet. 2011;43:663-667.
15. de Voer RM, Geurts van Kessel A, Weren RD, Ligtenberg MJ, Smeets D, Fu L, Vreede L, Kamping EJ, Verwiel ET, Hahn MM. Germline mutations in the spindle assembly checkpoint genes BUB1 and BUB3 are risk factors for colorectal cancer. Gastroenterology. 2013;145:544-547.
16. Wood LD, Parsons DW, Jones S, Lin J, Sjöblom T, Leary RJ, Shen D, Boca SM, Barber T, Ptak J. The genomic landscapes of human breast and colorectal cancers. Science. 2007;318:1108-1113.
17. Starr TK, Largaespada DA. Cancer gene discovery using the Sleeping Beauty transposon. Cell Cycle. 2005;4:1744-1748.
18. Starr TK, Allaei R, Silverstein KA, Staggs RA, Sarver AL, Bergemann TL, Gupta M, O’Sullivan MG, Matise I, Dupuy AJ. A transposon-based genetic screen in mice identifies genes altered in colorectal cancer. Science. 2009;323:1747-1750.
19. March HN, Rust AG, Wright NA, ten Hoeve J, de Ridder J, Eldridge M, van der Weyden L, Berns A, Gadiot J, Uren A. Insertional mutagenesis identifies multiple networks of cooperating genes driving intestinal tumorigenesis. Nat Genet. 2011;43:1202-1209.
20. Domati F, Maffei S, Kaleci S, Di Gregorio C, Pedroni M, Roncucci L, Benatti P, Magnani G, Marcheselli L, Bonetti LR. Incidence, clinical features and possible etiology of early onset (≤40 years) colorectal neoplasms. Intern Emerg Med. 2014;9:623-631.
21. Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754-1760.
22. Li R, Li Y, Kristiansen K, Wang J. SOAP: short oligonucleotide alignment program. Bioinformatics. 2008;24:713-714.
23. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078-2079.
24. Vissers LE, de Ligt J, Gilissen C, Janssen I, Steehouwer M, de Vries P, van Lier B, Arts P, Wieskamp N, del Rosario M. A de novo paradigm for mental retardation. Nat Genet. 2010;42:1109-1112.
25. Tavtigian SV, Deffenbaugh AM, Yin L, Judkins T, Scholl T, Samollow PB, de Silva D, Zharkikh A, Thomas A. Comprehensive statistical study of 452 BRCA1 missense substitutions with classification of eight recurrent substitutions as neutral. J Med Genet. 2006;43:295-305.
26. Kumar P, Henikoff S, Ng PC. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc. 2009;4:1073-1081.
27. Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, Kondrashov AS, Sunyaev SR. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7:248-249.
28. Venselaar H, Te Beek TA, Kuipers RK, Hekkelman ML, Vriend G. Protein structure analysis of mutations causing inheritable diseases. An e-Science approach with life scientist friendly interfaces. BMC Bioinformatics. 2010;11:548.
29. Cancer Genome Atlas Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature. 2012;487:330-337.
30. Futreal PA, Coin L, Marshall M, Down T, Hubbard T, Wooster R, Rahman N, Stratton MR. A census of human cancer genes. Nat Rev Cancer. 2004;4:177-183.
31. Rahman N. Realizing the promise of cancer predisposition genes. Nature. 2014;505:302-308.
32. Neveling K, Feenstra I, Gilissen C, Hoefsloot LH, Kamsteeg EJ, Mensenkamp AR, Rodenburg RJ, Yntema HG, Spruijt L, Vermeer S. A post-hoc comparison of the utility of sanger sequencing and exome sequencing for the diagnosis of heterogeneous diseases. Hum Mutat. 2013;34:1721-1726.
33. Tomlinson IP, Carvajal-Carmona LG, Dobbins SE, Tenesa A, Jones AM, Howarth K, Palles C, Broderick P, Jaeger EE, Farrington S. Multiple common susceptibility variants near BMP pathway loci GREM1, BMP4, and BMP2 explain part of the missing heritability of colorectal cancer. PLoS Genet. 2011;7:e1002105.
34. Smith CG, Naven M, Harris R, Colley J, West H, Li N, Liu Y, Adams R, Maughan TS, Nichols L. Exome resequencing identifies potential tumor-suppressor genes that predispose to colorectal cancer. Hum Mutat. 2013;34:1026-1034.
35. Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, Collins FS, Manolio TA. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci USA. 2009;106:9362-9367.
36. Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M, Rozen SG. Primer3--new capabilities and interfaces. Nucleic Acids Res. 2012;40:e115.
37. Thompson BA, Spurdle AB, Plazzer JP, Greenblatt MS, Akagi K, Al-Mulla F, Bapat B, Bernstein I, Capellá G, den Dunnen JT. Application of a 5-tiered scheme for standardized classification of 2,360 unique mismatch repair gene variants in the InSiGHT locus-specific database. Nat Genet. 2014;46:107-115.
38. Pagenstecher C, Wehner M, Friedl W, Rahner N, Aretz S, Friedrichs N, Sengteller M, Henn W, Buettner R, Propping P. Aberrant splicing in MLH1 and MSH2 due to exonic and intronic variants. Hum Genet. 2006;119:9-22.
39. Kim DW, Kim IJ, Kang HC, Jang SG, Kim K, Yoon HJ, Ahn SA, Han SY, Hong SH, Hwang JA. Germline mutations of the MYH gene in Korean patients with multiple colorectal adenomas. Int J Colorectal Dis. 2007;22:1173-1178.
40. Yanaru-Fujisawa R, Matsumoto T, Ushijima Y, Esaki M, Hirahashi M, Gushima M, Yao T, Nakabeppu Y, Iida M. Genomic and functional analyses of MUTYH in Japanese patients with adenomatous polyposis. Clin Genet. 2008;73:545-553.
41. Tanskanen T, Gylfe AE, Katainen R, Taipale M, Renkonen-Sinisalo L, Mecklin JP, Järvinen H, Tuupanen S, Kilpivaara O, Vahteristo P. Exome sequencing in diagnostic evaluation of colorectal cancer predisposition in young patients. Scand J Gastroenterol. 2013;48:672-678.
42. Sheng JQ, Fu L, Sun ZQ, Huang JS, Han M, Mu H, Zhang H, Zhang YZ, Zhang MZ, Li AQ. Mismatch repair gene mutations in Chinese HNPCC patients. Cytogenet Genome Res. 2008;122:22-27.
43. Goldberg Y, Porat RM, Kedar I, Shochat C, Sagi M, Eilat A, Mendelson S, Hamburger T, Nissan A, Hubert A. Mutation spectrum in HNPCC in the Israeli population. Fam Cancer. 2008;7:309-317.
44. Arnold S, Buchanan DD, Barker M, Jaskowski L, Walsh MD, Birney G, Woods MO, Hopper JL, Jenkins MA, Brown MA. Classifying MLH1 and MSH2 variants using bioinformatic prediction, splicing assays, segregation, and tumor characteristics. Hum Mutat. 2009;30:757-770.
45. Savitsky K, Sfez S, Tagle DA, Ziv Y, Sartiel A, Collins FS, Shiloh Y, Rotman G. The complete sequence of the coding region of the ATM gene reveals similarity to cell cycle regulators in different species. Hum Mol Genet. 1995;4:2025-2032.
46. Abraham RT. Cell cycle checkpoint signaling through the ATM and ATR kinases. Genes Dev. 2001;15:2177-2196.
47. McKinnon PJ. ATM and the molecular pathogenesis of ataxia telangiectasia. Annu Rev Pathol. 2012;7:303-321.
48. Pusapati RV, Rounbehler RJ, Hong S, Powers JT, Yan M, Kiguchi K, McArthur MJ, Wong PK, Johnson DG. ATM promotes apoptosis and suppresses tumorigenesis in response to Myc. Proc Natl Acad Sci USA. 2006;103:1446-1451.
49. Broeks A, Urbanus JH, Floore AN, Dahler EC, Klijn JG, Rutgers EJ, Devilee P, Russell NS, van Leeuwen FE, van ‘t Veer LJ. ATM-heterozygous germline mutations contribute to breast cancer-susceptibility. Am J Hum Genet. 2000;66:494-500.
50. Uhrhammer N, Bay J, Pernin D, Rio P, Grancho M, Kwiatkowski F, Gosse-Brun S, Daver A, Bignon Y. Loss of heterozygosity at the ATM locus in colorectal carcinoma. Oncol Rep. 1999;6:655-658.
51. Blackwood EM, Eisenman RN. Max: a helix-loop-helix zipper protein that forms a sequence-specific DNA-binding complex with Myc. Science. 1991;251:1211-1217.
52. Blackwood EM, Lüscher B, Eisenman RN. Myc and Max associate in vivo. Genes Dev. 1992;6:71-80.
53. Mac Partlin M, Homer E, Robinson H, McCormick CJ, Crouch DH, Durant ST, Matheson EC, Hall AG, Gillespie DA, Brown R. Interactions of the DNA mismatch repair proteins MLH1 and MSH2 with c-MYC and MAX. Oncogene. 2003;22:819-825.
54. Burnichon N, Cascón A, Schiavi F, Morales NP, Comino-Méndez I, Abermil N, Inglada-Pérez L, de Cubas AA, Amar L, Barontini M. MAX mutations cause hereditary and sporadic pheochromocytoma and paraganglioma. Clin Cancer Res. 2012;18:2828-2837.
55. Schreiber V, Dantzer F, Ame JC, de Murcia G. Poly(ADP-ribose): novel functions for an old molecule. Nat Rev Mol Cell Biol. 2006;7:517-528.
56. Pyndiah S, Tanida S, Ahmed KM, Cassimere EK, Choe C, Sakamuro D. c-MYC suppresses BIN1 to release poly(ADP-ribose) polymerase 1: a mechanism by which cancer cells acquire cisplatin resistance. Sci Signal. 2011;4:ra19.
57. MacDonald BT, He X. Frizzled and LRP5/6 receptors for Wnt/β-catenin signaling. Cold Spring Harb Perspect Biol. 2012;4.
58. Zorn AM. Wnt signalling: antagonistic Dickkopfs. Curr Biol. 2001;11:R592-R595.
59. Björklund P, Svedlund J, Olsson AK, Akerström G, Westin G. The internally truncated LRP5 receptor presents a therapeutic target in breast cancer. PLoS One. 2009;4:e4243.
60. Björklund P, Akerström G, Westin G. An LRP5 receptor with internal deletion in hyperparathyroid tumors with implications for deregulated WNT/beta-catenin signaling. PLoS Med. 2007;4:e328.