The HCV genome is an RNA molecule of approximately 9600 nucleotides structured in a coding region that contains one large open reading frame and flanked by non-translated regions at the 5’ and 3’ ends. The polyprotein is cleaved into structural (core, envelope 1 and 2) and non-structural proteins (NS2, NS3, NS4A, NS4B, NS5A and NS5B) with one additional small protein at the junction between the structural and non-structural elements (p7 protein). Until the recent development of an in vitro replication system that produces infectious viral particles[5-7] many of the protein functions have been studied in sub-genomic replication systems or with purified protein after expression.
The 5’ NTR forms a highly structured RNA element that contains an internal ribosomal entry site (IRES) that allows interaction with the 40S ribosomal subunit and initiation of cap-independent translation of the viral RNA. The polyprotein is translated as one large reading frame and subsequently cleaved by host-cellular proteases and virally encoded proteases into the individual proteins. The structural proteins core, envelope 1 and envelope 2 are cleaved by host-cellular proteases. Processing of these proteins is believed to take place in a membrane associated complex at the endoplasmic reticulum by signal peptidases. The core protein forms the viral capsid, binds to the viral RNA and interacts with envelope proteins to form viral particles. Different receptors have been suggested for the interaction of viral particles with the hepatocyte that mediates HCV entry including CD81, scavenger receptor class-B type-I (SR-BI), low-density lipoprotein receptor (LDL), mannose binding lectins (L-SIGN and DC-SIGN) and glycosaminoglycans. E2 binds with high affinity to the extracellular loop of CD81, a tetraspanin that is expressed on different cell types including hepatocytes. Cell entry of the virus is CD81 and SR-BI dependent, suggesting that these molecules serve as receptors or co-receptors for infection[13-16]. However, this interaction does not explain the hepatotropism as CD81 and SR-BI are not exclusively expressed on hepatocytes. Moreover, it has been shown that CD81 and SR-BI are necessary for cell entry of viral particles, but are not sufficient[16,17]. Some cell lines that express both proteins do not support viral entry. Recently a new candidate has been suggested that may close this gap. Claudin-1, a tight junction component that is highly expressed in the liver, was recently identified as a key factor in the late entry process.
These structural components of HCV are flanked by the non-structural proteins NS2 to NS5B. The function of one additional protein (p7) between these elements remains to be elucidated. It has been suggested that p7 forms an ion channel in planar lipid bilayers[19,20]. However, it is still unclear whether it is a virion component. NS2 contains an autoprotease, which cleaves the junction between NS2 and NS3. NS3 is a multifunctional protein with an N-terminal protease domain and a C-terminal RNA helicase/NTPase domain. The NS3 protease cleaves the remaining non-structural proteins with NS4A as a cofactor for this activity. The NS3 RNA helicase/NTPase unwinds RNA and DNA; however, its role during viral replication is unclear. The integral membrane protein NS4B is sufficient to induce membranous web formation and has been proposed to serve as a scaffold for replication complex assembly. The role of NS5A is also unclear. Numerous protein-protein interactions have been suggested, including a role in silencing the host’s innate immune response and determining responsiveness to interferon alpha. Some of these interactions are discussed later. NS5B encodes the viral RNA-dependent RNA polymerase. In the viral replication cycle the positive-strand RNA genome serves as a template to make a negative-strand intermediate, which in turn serves as a template to produce multiple nascent genomes. Recently, one additional protein resulting from frameshifted translation of the core protein has been identified (alternative reading frame protein, ARFP). However, the function of this protein is unknown.
Phylogenetic analysis of HCV genomes revealed that sequences fall into different clusters. This observation led to a classification of HCV into different genotypes and a standardized nomenclature was proposed in a consensus paper in 1994. The global distribution of HCV genotypes is regionally specific. The predominant genotype in most areas is genotype 1. However, some areas are almost exclusively infected with other genotypes. For example, the predominant genotype in Egypt is genotype 4 and the HCV epidemic in this country could be linked to parenteral treatment of schistosomiasis in the 1950s[25,26]. In some regions in Africa genotype 2 is more frequent. In some areas of Asia, however, genotypes 3 and 6 are predominant[28-30]. Despite substantial sequence variation all genotypes share the same structure of linear genes of nearly identical size. The genotype-specific variation of the different genes is remarkably consistent and has enabled many of the currently recognized variants of HCV to be provisionally classified based on partial sequences from subgenomic regions such as core/E1 and NS5B. The original nomenclature was recently updated, further standardizing the nomenclature of existing variants. Based on phylogenetic analysis a classification into 6 major genotypes was proposed and criteria for the designation of new HCV variants were formulated. These proposals provide an HCV nomenclature scheme for the three major public HCV sequence databases (Europe, USA and Japan: http://s2as02.genes.nig.ac.jp/) and eliminate inconsistencies of the current classification procedures. HCV genotypes differ from each other by 31%-33% at the nucleotide level. The genotypes are further divided into multiple epidemiologically distinct subtypes differing by 20%-25% from one another. A phylogenetic tree depicting all published complete HCV genomes is presented in Figure 1.
For many of the HCV subtypes, particularly for the less frequent ones, complete genome sequences are not available. The lack of sequence data for rare genotypes is profound prompting a sequencing initiative supported by the NIH to improve the HCV sequence databases.
Figure 1 Published HCV Full Genomes.
169 full length HCV sequences available from the Los Alamos National Laboratory (LANL) HCV Sequence Database are illustrated in a phylogenetic tree. HCV sequences fall into six different clusters (genotype 1-6) and are further classified into subtypes. The sequence from one recombinant virus (genotype 1b/2k) is included.
HCV is a member of the Flaviviridae, and its virally encoded RNA polymerase lacks a proofreading function. Replication of this positive-stranded RNA genome is, therefore, characterized by error rates between 1 in 10000 and 1 in 100000 nucleotides copied, which are typical for RNA polymerases. Together with a high turnover rate of an estimated 10^12 virions per day, theoretically every possible mutation at every single position of the genome will be generated in one infected host every day. This high error rate is reflected in the generation of a heterogeneous, but closely related swarm of viruses within the same host referred to as a quasispecies. The quasispecies nature of HCV can be best illustrated by sequence analysis of a short, but highly polymorphic region in envelope 2 designated as the hypervariable region 1 (HVR 1). Analysis of clonal sequences reveals that sequences of the viral population from the same subject are highly variable, but still phylogenetically closely related. Public databases normally present the consensus sequence, which consists of the most predominant residue at each position within the quasispecies population. The quasispecies nature of HCV may have important consequences during a transmission event. Depending on the transmission route the number of transmitted viral RNA copies can be limited and may not represent the true complexity of the sequence diversity of the donor. This bottleneck phenomenon has been described for sexual transmission of HCV and in the chimpanzee model. However, the bottleneck could also be interpreted as selection of the optimal strain in the new host during the earliest infection events. Different aspects of the nature of the observed HCV sequence evolution have been described. The next section gives an overview of the different mechanisms.
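The arithmetic behind the statement that every possible mutation is generated daily can be sketched as follows. The genome length, error rates and virion production are the figures quoted above; the assumptions that each virion corresponds to one copying event and that the three possible substitutions at a site are equally likely are simplifications introduced here purely for illustration.

```python
# Back-of-the-envelope estimate; a sketch under stated assumptions,
# not a quantitative model of HCV replication.
GENOME_LENGTH = 9600      # nucleotides (approximate, from the text)
VIRIONS_PER_DAY = 1e12    # estimated daily virion production (from the text)


def expected_errors_per_genome(error_rate, genome_length=GENOME_LENGTH):
    """Expected number of misincorporations per genome copy."""
    return error_rate * genome_length


def expected_copies_per_substitution(error_rate, virions_per_day=VIRIONS_PER_DAY):
    """Expected number of genomes per day carrying one specific base change
    at one specific site.

    Illustrative assumptions: one copying event per virion, and the three
    alternative bases at a site are equally likely outcomes of an error.
    """
    return virions_per_day * error_rate / 3


if __name__ == "__main__":
    for rate in (1e-4, 1e-5):  # the two bounds quoted in the text
        print(
            f"error rate {rate:g}: "
            f"{expected_errors_per_genome(rate):.2f} errors per genome copy, "
            f"{expected_copies_per_substitution(rate):.2e} copies per day "
            f"of each specific substitution"
        )
```

Under these assumptions even the lower error rate yields each individual substitution on the order of millions of times per day, which is consistent with the claim that the full single-mutation spectrum is sampled daily within one host.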
DRIVING FORCES OF EVOLUTION
Longitudinal analysis of isolates from subjects with chronic HCV infection yielded a mutation rate on the order of 1.5-2.0 × 10^-3 nucleotide substitutions per site per genome per year[38,39]. In the model of neutral evolution mutations are selectively neutral. The spread of these neutral mutations is mainly influenced by stochastic factors and is called genetic drift. As a consequence of this stochastic process even disadvantageous mutations can reach fixation when the virus circulates through a sufficiently small population. In turn, advantageous mutations are also affected by genetic drift when they are rare and are occasionally lost from the population. Several studies have described the rapid sequence drift of HCV over time. The model of neutral evolution and the presumption that such diversification should occur at a constant rate over time provide a framework to estimate times of spread of HCV in specific transmission networks[41,42]. For example, a recent analysis of viral sequences obtained from an HCV and HIV outbreak in children at the Al-Fateh hospital in Libya, utilizing this molecular clock, demonstrated that the origin of the outbreak predated 1998. Notably, this analysis excluded the possibility that the source virus was transmitted by foreign medical staff as suggested by local authorities.
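The molecular clock reasoning can be illustrated with a minimal sketch. It assumes a strictly constant rate and ignores multiple substitutions at the same site; the observed distance used below is a hypothetical value, and published analyses such as those cited above rely on full phylogenetic methods rather than this simple division.

```python
def years_since_common_ancestor(distance_per_site, rate_per_site_per_year):
    """Estimate the time since two sequences shared a common ancestor.

    Both lineages accumulate substitutions independently, so the pairwise
    distance grows at twice the per-lineage rate (strict-clock assumption).
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    if distance_per_site < 0:
        raise ValueError("distance must be non-negative")
    return distance_per_site / (2.0 * rate_per_site_per_year)


if __name__ == "__main__":
    observed_distance = 0.03  # hypothetical: 3% of sites differ between two isolates
    for rate in (1.5e-3, 2.0e-3):  # range of rates quoted in the text
        years = years_since_common_ancestor(observed_distance, rate)
        print(f"rate {rate:g} per site per year: about {years:.1f} years")
```

The spread between the two estimates shows how strongly the inferred date depends on the assumed rate, which is one reason such dating is usually reported with wide confidence intervals.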
Sequence drift has been suggested in a few studies as the major driving force of HCV evolution. An analysis by Allain et al suggested that it plays a dominant role in the evolution of envelope 2. They analyzed clonal sequences of six different transmission pairs years after the transmission event. In this study the ratio of non-synonymous to synonymous mutations in the analyzed region did not support the hypothesis of positive or negative selection. The authors, therefore, concluded that neutral evolution is a major component of the observed sequence diversity. Moreover, the authors did not find a correlation between the strength of the antibody response and the rate of evolution in these patients. However, in this study the strength of the humoral immune response was determined in serum collected years after the transmission event utilizing different HVR 1 peptides corresponding to autologous and heterologous sequences. Therefore, the true antibody response against the virus present during the acute phase of infection might have been underestimated. Another shortcoming of this study is that the inoculum sequence at the time of transmission was unknown and only a single time point years later was available, making conclusions about the true evolutionary rate and kinetics difficult.
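The distinction between synonymous and non-synonymous mutations that underlies such analyses can be made concrete with a short sketch. It only counts the two classes of codon differences between two aligned, in-frame sequences; it is not the method used in the study above, and a proper dN/dS estimate additionally normalizes each count by the number of synonymous and non-synonymous sites. The function name and the example sequences are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def count_codon_changes(seq_a, seq_b):
    """Return (synonymous, non_synonymous) codon differences.

    Both sequences must be aligned, in frame and of equal length.
    Codons containing gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and of equal length")
    synonymous = non_synonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper().replace("U", "T")
        codon_b = seq_b[i:i + 3].upper().replace("U", "T")
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguity code
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            non_synonymous += 1
    return synonymous, non_synonymous


if __name__ == "__main__":
    # Hypothetical 3-codon example: GCT -> GCC keeps alanine, AAA -> AGA changes Lys to Arg.
    print(count_codon_changes("ATGGCTAAA", "ATGGCCAGA"))
```

An excess of non-synonymous over synonymous changes, after normalization, is taken as a sign of positive selection, whereas a deficit points to purifying selection.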
Positive and negative selection
Rather than being neutral, mutations may also be subject to selection. Many mutations are probably disadvantageous or even deleterious for the virus and these variants are eliminated in a negative selection process. However, some mutations may not have an impact on replication capacity and a few may even be beneficial and confer a replication advantage. Variants harbouring these beneficial mutations will outcompete others in a dynamic process of continuous positive selection. Similar to other highly variable pathogens, a complex process of continuous selection has been proposed for HCV. Theoretically, persistent viruses such as HIV and HCV have time to evolve within the same host before transmission to the next host and may adapt to the specific environment in an individual. The evolution of HCV may, therefore, be substantially influenced by host factors mediating selection pressure on the virus. Even though the consensus sequence may be close to the maximum of viral replication capacity at any one time, the existence of a large and diverse viral population allows rapid, adaptive changes in response to changes in the replication environment. Many variants that are beneficial in a new environment may already be present at a low frequency in the quasispecies population and can subsequently outcompete the existing dominant sequence. The impact of the quasispecies complexity on the clinical outcome can be profound. Farci et al analyzed sequences covering HVR1 obtained during the acute phase of infection from subjects who spontaneously resolved viremia and subjects who progressed to chronic infection. Spontaneous resolution of viremia was predicted by a decrease in quasispecies complexity during the first weeks of infection. In turn, patients with viral persistence had increasing viral diversity, suggesting a fast adaptation process to the new environment. A similar effect of quasispecies diversity on the outcome of treatment has been discussed[47,48].
In recent years many studies have been published that aim to characterize the driving forces of this selection process.
The most variable region in the HCV genome is a short fragment spanning 27 amino acids of envelope 2 and is, therefore, designated the hypervariable region 1 (HVR 1). There is strong evidence that the profound sequence diversity in this region is the result of immune pressure by virus-specific antibodies. Importantly, there is a close association between the observed sequence diversity in this region and the appearance of HCV-specific antibodies in the sera of subjects with acute infection[49,50]. Patients suffering from common variable immunodeficiency (CVID) who present with hypogammaglobulinemia are not able to produce high titres of HCV-specific antibodies and, therefore, are not able to mount humoral immune selection pressure. Analysis of sequences of HVR1 revealed that patients with hypogammaglobulinemia had significantly fewer amino acid substitutions in this region over time as compared to controls. In a similar analysis, the rate of non-synonymous and synonymous mutations was compared between core and envelope in patients with and without CVID. The rate of synonymous or silent mutations was similar in the core and envelope protein. In patients without CVID, as expected, the rate of non-synonymous mutations was much higher in envelope as compared to core, a protein that is known to be highly conserved. However, this high rate of non-synonymous mutations in envelope was not observed in patients with CVID, suggesting that evolution is triggered by the presence of anti-HCV antibodies. In the chimpanzee model, it was demonstrated that a high turnover rate is not sufficient to explain HVR1 sequence diversity. Only minor sequence variation was observed in this region upon serial infection with passage of an infectious HCV clone between 8 different animals. Notably, samples for the subsequent infection of the next animal were taken during the acute phase before antibodies became detectable.
Again, this study indicates that this region in the envelope remains stable in the absence of antibodies despite high level viremia that was present in all animals during the acute phase of infection. Taken together, all these studies suggest that without immune selection pressure only minor sequence changes occur in HVR1.
The lack of an in vitro culture system has hampered direct evaluation of these putative escape mechanisms in the envelope protein. Recently, more elegant tools for this type of analysis became available by pseudotyping retroviral particles with HCV glycoproteins (HCVpp)[54-56]. Utilizing this technique, the impact of neutralizing antibodies on the evolution of HVR1 was demonstrated in a study by von Hahn and co-workers. Here longitudinal samples were obtained over a time period of 26 years from patient H who was infected in 1977 with genotype 1a. Sera were analyzed for the presence of neutralizing antibodies against the autologous isolate present at the time of sampling. A neutralizing antibody response could be detected as early as 8 wk after infection against the inoculum strain. Interestingly, the antibodies present in a given sample continuously failed to neutralize HCV pseudoparticles bearing the autologous sequence from the same time point. Longitudinal analyses demonstrate continuous escape from emerging antibodies over the time of infection demonstrating humoral immune pressure as the major driving force for the observed sequence diversity in HVR1.
CD8 T cells
Mutational escape from CD8 T cells targeting viral proteins has been well documented for highly variable pathogens such as HIV and SIV. Similarly, in the chimpanzee model of HCV infection selection of mutations in CD8 epitopes that inhibit recognition by specific T cells has been described by Weiner and co-workers. In a follow-up analysis the majority of targeted CD8 epitopes in chimpanzees infected with HCV evolved over time and an important role for mutational escape as a contributor for viral persistence has been suggested. However, acute infection is rarely detected in humans due to lack of specific symptoms making the design of similar longitudinal studies difficult. First evidence for selection pressure by CD8 T cells was obtained from sequence analyses of patients with chronic infection. Here the T cell response against previously defined CD8 epitopes was determined. In some cases the autologous viral sequence present in the patient differed from the described prototype sequence of the epitope and was not targeted by specific T cell lines derived from that patient. The study included a case where sequence evolution was observed in a follow-up sample suggesting that CD8 escape also plays a role during chronic HCV infection. More recently, several longitudinal studies on patients with acute HCV infection have been published providing compelling evidence for CD8 escape in humans[61-64]. Probably the most comprehensive analysis was done by Cox et al. They prospectively followed subjects with ongoing intravenous drug use and high risk behaviour for evidence of acute HCV infection. Using this approach they were able to identify eight patients with acute HCV infection. Samples from these patients were obtained during acute infection (at the time of diagnosis) and after 6 mo. Utilizing comprehensive techniques with overlapping peptides spanning the entire HCV polyprotein the breadth of the immune response was determined. 
At the same time, sequence evolution between the first and second sample obtained 6 mo later was analyzed. Seventeen of 25 targeted epitopes evolved over time consistent with selection of escape mutations. Of note, the single subject without selection of escape mutations cleared viremia spontaneously. In turn, 50% of the observed sequence changes outside the envelope were associated with a detectable CD8 response. In line with these findings Ray et al analyzed sequences from a single source outbreak infected with HCV genotype 1b and observed reproducible selection of mutations in previously described CD8 epitopes in subjects expressing the restricting HLA-allele. The observation of widespread escape in individuals during acute HCV infection prompted efforts to determine whether adaptation to HLA class I-restricted selection pressure also occurs at the population level. Moore et al analyzed sequences spanning the reverse transcriptase protein of HIV-1 in a large HLA-diverse cohort. This study revealed accumulation of viral sequence polymorphisms at different sites of the protein in patients sharing the same HLA-allele. Many of these sequence polymorphisms were located inside previously described CD8 epitopes that are restricted by the associated HLA-class I allele. This study demonstrated that MHC class I-associated selection pressure has a major impact on the evolution of HIV-1. A similar analysis was done by the same group in a cohort chronically infected with HCV. Here, sequences spanning parts of the NS3 protein were analyzed and again a number of associations between sequence polymorphisms and particular HLA-alleles were identified. This analysis was extended to all non-structural proteins in a cohort of 70 subjects with chronic HCV genotype 1a infection with similar results. An example of an HLA class I-associated sequence polymorphism is illustrated in Figure 2. 
Viral sequences from all 70 subjects are aligned to a majority consensus sequence and sorted into sequences derived from HLA B8-positive and HLA B8-negative subjects. Boxed is the region of a described HLA B8-restricted CD8 epitope. Differences from the consensus sequence are significantly more frequent in HLA-B8 positive subjects compared to HLA B8-negative subjects indicating that there is reproducible selection pressure on this region in HLA B8-positive subjects. This study included a phylogenetic analysis approach for the detection of HLA-associated sequence polymorphisms highlighting the potential for false positive detection of such associations by pure statistical approaches. However, these studies suggest that the same evolutionary forces act on HCV and HIV-1 and that selection pressure by virus specific CD8 T cells is an important driver of viral evolution.
Figure 2 HLA class I-associated sequence polymorphisms in a CD8 epitope.
Viral sequences are aligned to a consensus sequence and sorted into sequences derived from HLA B8-positive and HLA B8-negative subjects. Boxed is the region of a described HLA B8-restricted CD8 epitope. Differences from the consensus sequence are significantly more frequent in HLA B8-positive subjects (P < 0.001). This association was reported in.
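How such an association between an HLA allele and a sequence polymorphism can be tested is sketched below with a one-sided Fisher's exact test on a 2 × 2 table. The counts are hypothetical and are not the data behind Figure 2; as noted in the text, a purely statistical test of this kind does not account for phylogenetic relatedness of the sequences and can therefore produce false positive associations.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test for enrichment in the first row.

    Table layout (counts of subjects):
                       variant   consensus
        allele-positive   a          b
        allele-negative   c          d

    Returns P(X >= a) under the hypergeometric null hypothesis.
    """
    row1 = a + b            # allele-positive subjects
    col1 = a + c            # subjects whose sequence differs from consensus
    total = a + b + c + d
    denominator = comb(total, row1)
    upper = min(row1, col1)
    numerator = sum(
        comb(col1, k) * comb(total - col1, row1 - k)
        for k in range(a, upper + 1)
    )
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical counts for 70 subjects: 8 of 10 allele-positive subjects
    # carry a variant in the epitope versus 5 of 60 allele-negative subjects.
    p_value = fisher_one_sided(8, 2, 5, 55)
    print(f"one-sided P = {p_value:.2e}")
```

When many alignment positions and many HLA alleles are screened in this way, correction for multiple comparisons is also needed before an individual association is considered significant.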
CD4 T cells
Spontaneous resolution of viremia after acute infection with HCV has been associated with the emergence of a broad and functionally intact T cell response. There is convincing evidence that HCV specific CD4 T cells are important for viral control in the early phase. However, little is known about mutational escape in targeted CD4 T cell epitopes during HCV infection. In theory, a similar selection process as observed for CD8 T cells could be present. A quasispecies that is mutated inside an immuno-dominant CD4 epitope may have a selection advantage in an individual with acute HCV infection and could out compete others. Eckels et al found evidence for selection of mutations in CD4 epitopes in NS3[69,70]. Analysis of a large number of clonal sequences revealed a high degree of polymorphisms in regions targeted by CD4 T cells and the ratio of synonymous versus non-synonymous mutations was consistent with positive selection. However, none of the observed variants became the dominant sequence at a second time point 16 mo later. More recently, two studies have described mutational escape from CD4 responses. The first report is part of a vaccination study in the chimpanzee model. One animal was vaccinated with DNA followed by recombinant vaccinia virus in a prime/boost strategy with HCV NS3 and NS5A/B and subsequently infected with a genotype 1a isolate. After primary infection and transient control of viremia the animal developed chronic infection. Longitudinal sequence analysis of the NS3 and NS5A/B region revealed two non-synonymous mutations. Both of them were located inside regions targeted by CD4 T cells. Additional experiments with synthetic peptides showed that the mutated sequence was not recognized by specific T cells from this animal. A second study focusing on the evolution of the envelope protein in one subject similarly identified amino acid substitutions inside targeted CD4 epitopes consistent with escape. 
These findings suggest that CD4 T cells may select mutations during HCV infection; however, the overall extent of CD4 escape as a contributor to the evolution of HCV remains to be clarified.
Infection with HCV initiates a cascade of events within the infected cell aimed at generating an antiviral state. The innate immune system forms the first line of defence and is triggered by binding of pathogen-associated molecular patterns (PAMPs) to specific PAMP receptors. In the case of HCV, Toll-like receptor 3 (TLR3) and the retinoic-acid-inducible gene I (RIG-I) receptor recognize dsRNA, resulting in activation of multiple cellular factors that mediate transcription and secretion of interferon alpha and beta. Engagement of these type I interferons with their cellular receptor activates a series of interferon-stimulated genes (ISGs) that establish an antiviral state within the infected cell. The hepatitis C virus has developed strategies to evade this first line of host immune defence. Recently it was demonstrated that the NS3/4A protease is able to specifically cleave Cardif (CARD adaptor inducing interferon beta) and TRIF (Toll-interleukin-1 receptor domain-containing adaptor-inducing beta interferon)[73-75]. Both proteins are involved in the activation of interferon regulatory factors (IRFs) and their inactivation, therefore, interferes with the production of interferon alpha and beta. Gale et al demonstrated in vitro that NS5A is able to inhibit protein kinase R (PKR). PKR is activated by interferon alpha and involved in the inhibition of viral RNA translation. Its inhibition, therefore, represents a functional antagonism to the interferon alpha response. Interactions between virus and cellular host factors are sequence specific, and negative selection of mutations that undermine these immune evasion strategies seems plausible. However, the underlying mechanisms of the innate immune response are highly conserved in all humans and less dependent on the genetic background of the individual. Therefore, the direction of selection pressure does not change upon transmission to the next host, which makes the observation of positive selection unlikely.
Moreover, the interactions between virus and the host’s innate immune response take place during the earliest phase of infection making it even more difficult to directly show that viral escape variants are selected.
The current standard treatment regimen for patients with chronic hepatitis C is a combination of pegylated interferon alpha with ribavirin[2,3]. Interestingly, the response rate to this treatment regimen is dependent on the infecting genotype suggesting that sequence differences between genotypes influence the susceptibility to these drugs. Patients infected with genotype 2 and 3 usually show a much faster decline in viral load after initiation of therapy associated with higher sustained response rates. The determinants of this differential responsiveness of different genotypes are poorly understood. Interferon alpha predominantly modulates the immune system. Engagement of its specific receptor turns on a cascade of IFN-stimulated genes (ISGs) resulting in a non-pathogen specific antiviral state. Different HCV sequences seem to have different capabilities to interfere with this anti-viral strategy. The response rates to treatment dramatically differ not only between different genotypes, but also between isolates of the same subtype. Comparison of sequence isolates that have been successfully treated with isolates that did not respond has put a 40 amino acid stretch of the HCV NS5A protein into the spotlight. The degree of sequence variation in this region has been associated with treatment outcome and has therefore been designated as the interferon-sensitivity determining region (ISDR). Subsequently, conflicting results have been published in similar studies; however, a meta-analysis supported the impact of this region on treatment outcome. A correlate of this observation may be the reported inhibitory action of HCV NS5A on PKR. For this interaction the ISDR and an additional 26 C-terminal amino acid stretch of NS5A are crucial. Therefore, selection of viral variants during treatment that successfully enhance this interaction seems reasonable. 
However, neither was selection of mutations observed in the presence of this antiviral drug in vitro nor has selection of variants during treatment in longitudinal studies formally been shown. It remains, therefore, unclear how interferon alpha contributes to evolution.
The exact antiviral mechanisms of ribavirin are even less well established. Several mechanisms have been suggested including inhibition of the HCV polymerase and early chain termination during the replication process. Higher mutation rates in the presence of ribavirin have been reported, potentially resulting in an ‘error catastrophe’. Two recent studies analyzed the mutation rate in the presence and absence of ribavirin in patients receiving treatment. Hofmann et al analyzed the NS3 and NS5B genes in 14 subjects receiving ribavirin either as monotherapy or in combination with interferon alpha. Based on a comparison of clonal sequences in the quasispecies population before and after initiation of therapy they concluded that the mutation rate of HCV is higher in the presence of ribavirin. These results were reproducible in cell culture with HCV replicon-bearing hepatoma cell lines. Even though the overall effect was weak, a dose dependency could be demonstrated and the inactive L-enantiomer did not show the same effect. In a similar analysis by Lutchman et al the mutation rate for NS5B was calculated based on analysis of bulk and clonal sequences of the NS5B gene in 18 subjects receiving ribavirin and 13 subjects receiving placebo. A significant increase of the mutation rate in the presence of ribavirin was observed after 4 wk of treatment; however, there was no significant difference between the mutation rates after 24 wk compared to the placebo group. The authors of this latter study conclude that ribavirin is unlikely to act through an increase of the mutational error rate resulting in an error catastrophe. One study demonstrated selection of a Phe to Tyr mutation in position 415 of the HCV NS5B protein in the presence of ribavirin. This mutation was associated with a less susceptible phenotype for this drug when tested in the replicon model in vitro. This mutation was also observed in 5 out of 16 subjects infected with genotype 1a in the study by Lutchman et al.
However, it was not reproducible in HCV genotype 1b. It is, therefore, still unclear if specific mutations are selected in the presence of ribavirin.
Future treatment strategies will include small molecules as inhibitors of virus specific protein functions. Drugs like protease and polymerase inhibitors are very successful for the treatment of HIV. However, selection of variants that are resistant to these antiviral compounds is a major challenge in the management of patients infected with HIV. In recent years many compounds have been tested as inhibitors of the HCV protease and polymerase. Some are now available in early clinical trials. The first compound that was tested in humans was the protease inhibitor BILN 2061. In patients infected with genotype 1 the viral load was dramatically decreased after only 2 d of treatment. However, the drug was designed to inhibit the HCV protease from a genotype 1 isolate with high affinity. As expected, the efficacy was, therefore, much lower in subjects infected with HCV genotype 2 or 3. Due to observed toxic effects of this drug in high dosage in animal models further clinical trials were stopped. Other small compounds have meanwhile reached early phases of clinical testing (reviewed in) including protease inhibitors (VX-950 and SCH 503034) and polymerase inhibitors (NM283 and HCV-796). Along with their proof of excellent efficacy in vitro and in vivo several reports have been published describing resistance mutations[87,88]. A recent study by Sarrazin et al analyzed clonal sequences from subjects treated with the protease inhibitor VX-950 in a clinical trial. In this analysis mutations associated with phenotypic resistance were rapidly selected during treatment and the number of resistant clones in each patient correlated well with the virologic response to the drug. The mutations were reproducibly located in only a few positions in the HCV protease gene. Interestingly, the number of resistant clones decreased after cessation of therapy indicating that some mutations are associated with fitness costs and revert back to wild type in the absence of the drug. 
Future studies will show whether combinations of different drugs, such as polymerase and protease inhibitors, reduce the risk of resistance mutations, as has been observed for HIV.
Constraints on sequence diversity: purifying selection
Analysis of available HCV full genome sequences from public databases shows that the degree of sequence variation differs both between proteins and between regions of the same protein (Figure 3). Some regions are highly conserved even across different HCV genotypes. Many of these highly conserved regions represent functionally important motifs in the viral protein in which substantial sequence variation is not tolerated. Viral evolution is clearly limited by structural constraints that confine the virus to sequences in which it remains functional. Many mutations that occur during the replication process are deleterious or disadvantageous to the fitness of the virus and are, therefore, negatively selected. In contrast, as highlighted in this review, multiple forces of the immune system or, in some cases, drugs exert positive selection pressure away from the consensus sequence in the individual. Selection of variants is, therefore, a trade-off between host pressure and functional needs. Purifying selection describes the driving force towards a sequence with optimal replication capacity in the absence of outside pressure on the virus. Reversion of resistance mutations that had been selected in the presence of antiviral drugs back to the consensus sequence was first described in the influenza model. For HIV, reversion of drug resistance mutations is well documented and the concept of a salvage therapy was based on this observation. In this concept treatment is re-initiated after interruption and genotypic reversion of drug resistance. However, the clinical benefit is controversial and, with a wide range of antiviral drugs now available for the treatment of HIV, this strategy has received less attention.
Figure 3 Entropy across the HCV polyprotein.
100 HCV genotype 1b sequences were retrieved from the Los Alamos National Laboratory (LANL) HCV Sequence Database. The entropy score was calculated for 300 windows of 20 residues overlapping by 10 residues utilizing the algorithm implemented in the database.
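The windowed entropy score described in this legend can be illustrated with a minimal sketch. It assumes that the score is the Shannon entropy of each alignment column averaged over each window; the exact algorithm implemented in the LANL database is not given here and may differ. The function names `column_entropy` and `windowed_entropy` are hypothetical and chosen only for this illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gap characters."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    # Sum -p * log2(p) over the observed residues in this column.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def windowed_entropy(sequences, window=20, step=10):
    """Mean per-column entropy for overlapping windows of aligned sequences.

    Assumption: sequences are pre-aligned and of equal length. The defaults
    (window=20, step=10) mirror the legend: 20 residues overlapping by 10.
    Returns a list of (1-based window start, mean entropy) tuples.
    """
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to equal length")
    per_column = [
        column_entropy([s[i] for s in sequences]) for i in range(length)
    ]
    scores = []
    for start in range(0, length - window + 1, step):
        chunk = per_column[start:start + window]
        scores.append((start + 1, sum(chunk) / window))
    return scores
```

A fully conserved window scores 0, and higher values indicate more variable regions. With these defaults, an alignment of 3010 columns yields 300 windows.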
In HCV, reversion was first described for an escape mutation that had been selected by virus-specific CTLs. In this study a virus harbouring an escape mutation in an HLA-B8 restricted epitope in NS3 was transmitted to a host who was HLA-B8 negative and who, therefore, was not able to mount the same T cell response. In the new host the virus evolved back to the prototype sequence and the variant disappeared. Similarly, Ray et al analyzed viral sequences from a single source outbreak years after the transmission event. Interestingly, the source of the virus again had an escape mutation in the same HLA-B8 restricted epitope in NS3. Years later the mutation had reverted to the prototype sequence in most recipients. Of note, the mutation was stable in subjects who were HLA-B8 positive and who were theoretically able to target this region. Similar to the reversion of CTL escape mutations, Sarrazin et al describe in their study on drug resistance to the protease inhibitor VX-950 reversion to the wild type sequence after treatment was discontinued. These studies illustrate the main selecting forces of HCV evolution. On one side, there is positive selection pressure, mainly by the immune system but also in the presence of antiviral drugs. These selection forces are not constant and vary between hosts and environments, largely depending on the host’s genetic background. On the other side, there is negative selection pressure, which pushes the virus towards a state of optimal replication capacity. This force is more or less constant, but largely depends on the pre-existing sequence configuration such as the genotype or the presence of compensatory mutations.