©The Author(s) 2015. Published by Baishideng Publishing Group Inc. All rights reserved.
Auditory hallucinations: A review of the ERC “VOICE” project
Kenneth Hugdahl, Department of Biological and Medical Psychology, University of Bergen, 5009 Bergen, Norway
Kenneth Hugdahl, Division of Psychiatry, and Department of Radiology, Haukeland University Hospital, 5021 Bergen, Norway
Kenneth Hugdahl, NORMENT Center of Excellence, University of Bergen, 5009 Bergen, Norway
Kenneth Hugdahl, K G Jebsen Center for Neuropsychiatric Disorders, University of Bergen, 5009 Bergen, Norway
Author contributions: Hugdahl K solely contributed to this paper.
Supported by European Research Council Advanced Grant, No. #249516; Research Council of Norway FRIBIOMED Grant, No. 807696; and SFF Grant, No. 222373.
Conflict-of-interest: The author declares no conflict-of-interest.
Open-Access: This article is an open-access article which was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Correspondence to: Kenneth Hugdahl, Professor, Department of Biological and Medical Psychology, University of Bergen, Jonas Lies vei 91, 5009 Bergen, Norway. firstname.lastname@example.org
Received: September 19, 2014
Peer-review started: September 20, 2014
First decision: October 14, 2014
Revised: March 25, 2015
Accepted: April 10, 2015
Article in press: April 14, 2015
Published online: June 22, 2015
The research presented in this article was funded by an ERC Advanced Grant “VOICE” to Kenneth Hugdahl, and is thus a selective review. In this sense, this is not a formal review of the literature, nor a formal comparison of findings in the literature at large. The ERC “VOICE” project has also been reviewed in a conference proceedings volume from the 16th Conference for Philosophy, Psychiatry and Psychology, Golden Sands, Bulgaria, Edited by Stoyvanov D and to be published by Cambridge Scholars Publishing, Ltd. Schizophrenia is one of the most severe mental disorders, causing lifelong distress and social handicaps. The disorder is recognized as a leading cause of morbidity both globally and in particular in the Western world, and ranks as one of the most costly disorders that affect humans.
A key symptom in schizophrenia is auditory verbal hallucinations (AVHs), i.e., the experience of “hearing voices” in the absence of an external auditory source. AVHs occur in 70%-80% of patients with schizophrenia and often produce distress, functional disability, and behavioral dyscontrol[2,3]. In some cases, hallucinations may also pose a threat to the patient or his/her family members, and even to society, if they take the form of “hearing voices” commanding the patient to commit unwanted behaviors or acts.
SCHIZOPHRENIA AND AUDITORY HALLUCINATIONS
Because of the heterogeneity of the diagnostic phenotype of schizophrenia, which so far has escaped specification of underlying neuronal and molecular explanations[5,6], the ERC “VOICE” project has suggested an alternative approach by focusing on a single symptom, rather than on the diagnostic category itself. There are several advantages with such an approach. First of all, by definition, it will reduce heterogeneity and thereby increase the signal-to-noise ratio, since a symptom is less heterogeneous than the diagnostic category as such. Second, it will better allow for the pursuit of explanations “upwards” to the clinical level of explanation and “downwards” to the neurobiological level of explanation, since the symptom is closer to both the upward diagnosis and the downward neurobiological markers. Third, it will better allow for characterizations at the individual level, since symptoms can be (and typically are) quantified through the use of various symptom scales, like the Positive and Negative Syndrome Scale (PANSS) and Scale for the Assessment of Negative Symptoms/Scale for the Assessment of Positive Symptoms (SANS/SAPS), while a diagnosis cannot be and typically is not quantified. This means that parametric relationships can be obtained between, e.g., degrees of cognitive impairment and scores on a specific symptom scale item, which is not possible for a diagnostic category. A fourth advantage is that a single symptom could also serve as an endophenotype, which can add to the genetic understanding of the broader phenotype of schizophrenia. At the brain level of explanation, a corresponding parametric relationship could be predicted for, e.g., intensity of neuronal activation in a defined cluster of voxels and a symptom score, which could also be conducted in a parametric way for single symptoms, but not for a single diagnosis. 
I here disregard the identification of sub-diagnoses within a diagnostic category as seen in, e.g., the DSM classification system, since such a procedure is still categorical and would not allow for the identification of parametric relationships to other domains, like cognition or pathophysiology. Fifth, unraveling the mechanisms involved in a single (or a few) key symptom(s) may provide a novel way of developing new symptom-specific treatment procedures, both pharmacological and non-pharmacological. Sixth, since some of the symptoms seen in schizophrenia are also shared with non-psychotic individuals in the general population[11-13], studying auditory hallucinations in patients and “hearing voices” in non-psychotic individuals opens the possibility of treating specific symptoms as continuous dimensions rather than discrete categories[6,14].
AVHs are the most characteristic symptom of schizophrenia and psychosis, and they “define” the disorder from a clinical and phenomenological point of view. Clinically, what drains the patient cognitively, emotionally, and physically is the ongoing “dialogue” and the typically negative comments and commands from the “voice”, which recruit almost all available cognitive resources[7,15,16], with the resulting typical signs of reality disorientation and inward attentional focus. From a phenomenological point of view, AVHs are characterized by a limited set of core features: the experience is auditory in nature, with a distinct perceptual quality of “hearing a voice”. It is true that some schizophrenia patients may experience visual, olfactory or kinesthetic hallucinations, but such instances are rare and may actually be part of other symptoms, not sharing explanations with the experience of hearing voices. The “voice” is typically localized outside of the head[18,19], although recent research has shown that hearing the “voice” as coming from the inside is more common than previously believed. The “voices” also typically have negative emotional valence[20,21] and are in addition experienced as controlling the patient[7,22], revealing a lack of executive cognitive power on the part of the patient.
AVHs and a dimensional approach to mental disorders
From a cognitive and neurobiological point of view, there is an emerging literature pointing to symptom specificity when it comes to AVHs[19,23,24]. Experiences of AVHs should therefore be an ideal target for research from a dimensional point of view, since recent studies have found that about 4%-5% of the general population experience “hearing voices”[11-13], also when excluding previous mental health problems, medication, drug use, and other potentially confounding factors in this group of individuals. Thus, by studying the phenomenology, cognition and neurobiology of healthy “voice hearers” in the same experimental context as patients with auditory hallucinations, and comparing with non-hearing, non-psychotic control subjects, it should be possible to map commonalities and differences across several dimensions[6,14]. In this sense, non-clinical “voice-hearers” may constitute a theoretically important third group, the “missing link” between clinical and non-clinical individuals who experience AVHs. Observing similarities and differences at cognitive and neuroimaging levels of explanation between such a third group and clinical AVHs may provide invaluable data for unraveling the underlying cognitive and neuronal mechanisms for the understanding of AVHs. Cuthbert and Insel have elegantly reviewed the National Institutes of Health (NIH), United States, initiative to find new ways of conceptualizing and classifying mental disorders as dimensional, rather than categorical illnesses. The aim of the initiative is to find new ways of understanding mental disorders, and to provide new avenues for treatment, that are based on dimensions of empirically observable behavior and recordings of neuronal activity. 
Cuthbert and Insel provide a set of criteria that could aid researchers to identify dimensions of symptom-like behaviors including cognitive, emotional, and social dimensions that can be analyzed from the molecular to the clinical level of explanation (see also). Badcock and Hugdahl have applied these criteria to the understanding of AVHs from a dimensional point of view, focusing on the cognitive construct of inhibition, and the lack of inhibitory power that characterizes patients suffering from frequent and severe AVHs. The theoretical approach used by Badcock and Hugdahl is a first attempt to apply the new NIH criteria to the understanding of a specific symptom.
THE PHENOMENOLOGY OF AVHs
AVHs are phenomenologically speaking a conviction that other individuals are talking to the patient despite the absence of an external auditory signal[8,15,16,18]. AVHs therefore have a different sensory quality than illusions, which are perceptual misinterpretations of an actual perceptual experience, although in some instances AVHs can take on an illusory quality, as when patients report that the “voices” they experience started as “hearing sounds” that over time develop into “voice” hallucinations. AVHs are often described as “misperception of inner experiences and thoughts” or as “misattribution of speech”, or as “internally generated events that are interpreted as being externally generated”, further attesting to a speech perceptual nature of how AVHs are experienced phenomenologically by the patient (see also). Other characteristics of AVHs are that the quality of the voice is often negative and condemnatory, and that it is out of the patients’ control; in fact, many patients subjectively report that they “feel controlled by the voice”. To this list should also be added an attentional dimension, and that AVHs have a profound influence on the patients’ attentional capacity, with attention being drawn towards the “voices”, with a corresponding loss of attentional awareness of the outer world. This inward attentional focus has been the target for different cognitive therapy approaches (see[28,29]), trying to train the patient to re-allocate attentional focus from the inner imaginary “voices” to the outer real voices. Thus, the subjective and phenomenological experiences of AVHs are that they: (1) are perceptual phenomena experienced as not belonging to one’s self, irrespective of whether they are perceived as coming from the outside or the inside of the head; (2) typically have a negative emotional valence consisting of sinister comments and commands; and (3) that they are out of volitional control[30,31]. 
In addition, some patients also struggle with not complying with the commands given by the “voice”, and this is what they sometimes experience as the most frightening and anxiety provoking aspect of having AVH experiences - that they would have to commit acts that they do not want to commit. It may also be of interest to note that Bleuler, in his 1911 book, recognized these three dimensions where he writes about “voices” speaking to the patient, taking the patient’s thoughts away, and that threats or curses are common contents.
This also points to an emerging new research area; the cognitive and neurobiological aspects of patients failing to withstand the commands of the “voice” and who commit tragic acts with traumatic consequences. This is an area unfortunately much neglected in hallucination research, and may shed important light on other aspects of AVHs and schizophrenia in general. McCarthy-Jones et al used a structured interview scale with questions about a variety of content dimensions of AVHs in order to describe in detail the phenomenology of AVHs in a relatively large sample of almost 200 patients. Items entered into the analysis were questions on duration, location and frequency, as well as questions related to past memories, gender, family relationship with the “voice”, first, second or third person, and emotional content. The results showed a more complex relationship between the various domains or dimensions than previously acknowledged, which raises the question of whether AVHs should phenomenologically speaking be divided into sub-types with their own core characteristics. For example, the results showed that the localization of the “voices” as coming either from inside or outside of the head was about 50%, while command “voices” like “Did the voices ever tell you what to do” accounted for 67% of positive answers from patients. This finding attests to the saliency of commands and threats as part of the phenomenology of AVHs.
THEORETICAL MODEL OF AVHs
Probably the most influential model for the understanding of AVHs in schizophrenia is the suggestion by Frith et al[34-36] that AVHs are caused by a failure to adequately monitor and label verbal thoughts as coming from the inside rather than from the outside of the patient’s head, often called an “inner speech” model. In addition to an inner speech model, it has been suggested[14,21] on the basis of the negative emotional content of AVHs that they may represent misinterpreted recall of strong emotional and traumatic memories that would act like intrusions, and that are mislabeled as coming from the outer world (see for detailed overview over and discussion of theoretical models of AVH). However, as pointed out by Jones (see also), neither model fulfills the criteria of encompassing the full phenomenology of AVHs, common to all patients. Since both patients with auditory hallucinations and healthy individuals “hearing voices” subjectively report experiencing someone “speaking to them”, not that they “speak to the voice” (although they may be engaged in a later dialogue with the “voice”), it seems that a perceptual model would better fit the actual phenomenology of “hearing voices” than a speech production model.
This is supported in a review by David et al on AVHs which concluded that more than two thirds of patients with AVHs subjectively report that the “voice” is speaking in a different accent than their own, which is difficult to conceive of in an inner speech model. Electrophysiological and brain imaging data also show a pattern of responding more in congruence with a perceptual view, with recorded activity in temporal lobe speech perception areas[38-41]. In a recent review and meta-analysis, Jardri et al moreover concluded that most studies support a view of aberrant speech perception as the core phenomenology of an AVH, with a neuronal focus in the peri-Sylvian region in the auditory cortex (see also[7,42-45]). A speech perception view does however not exclude a speech production model. An inner speech model states that AVHs are related to a deficit in monitoring of one’s own inner dialogue and thoughts, which then also includes a perceptual component in the monitoring aspect. What is different between the models is that a perceptual model gives rise to specific hypotheses about a neuronal origin in the speech perception areas in the upper posterior part of the temporal lobe, and primarily on the left side, while an inner speech model is less specific about a neuronal focus, (see however who found activation in the right inferior frontal gyrus).
Similarly, since only about 10%-20% of the “voices” that patients experience “hearing” are about actual memories and previous experiences, it is difficult to reconcile this with a memory model for auditory hallucinations. It is however true that in some cases when the voices are very intrusive and threatening to the patient, there seems to be a correlation between previous trauma and sexual abuse and positive symptoms, like AVHs[47,48]. Although there is no unique link between traumatic experiences and psychotic disorders, several studies have found a relationship between psychotic symptoms, including AVHs, and childhood trauma from sexual abuse as well as from having experienced increasing violence in war zones[48,49]. Such experiences are however not unique to causing AVH, since they more often lead to anxiety disorders, like PTSD. Furthermore, traumatic memories as causes of auditory hallucinations cannot explain the experience of benevolent voices, e.g., when patients experience “listening to angels”. If anything, the content of such hallucinations should be related to the trauma or abuse event if based on memories, which does not fit their non-traumatic content. Also, as for inner speech models, memories need to be interpreted and “translated” into a perceptual experience in order to be congruent with what patients actually report, and this again points towards a perceptual basis for auditory hallucinations. To this should be added the characteristic perceptual quality of the “voice”, which is typically experienced as a real person speaking with distinct perceptual qualities like accent, emotional valence, and timbre, as if in direct interpersonal communication.
EMPIRICAL EVIDENCE FOR A PERCEPTUAL DIMENSION
Hugdahl et al used a dichotic listening task with simultaneous presentations of brief speech sounds consisting of a single vowel and a consonant, so called consonant-vowel syllables. One syllable is presented to the right ear and another simultaneously to the left ear. This is a common procedure used in research on laterality and hemispheric asymmetry to investigate left hemisphere preference for speech sound perception. The result is typically a preference for reporting the right ear over the left ear syllable, reflecting a left hemisphere perceptual preference and processing of speech stimuli[51,52]. Hugdahl et al predicted that AVHs would interfere with the processing of an external speech sound if a perceptual model is correct, and that the interference would increase with more frequent and severe AVHs. The authors therefore correlated the score on the hallucination item in the PANSS symptom scale with the right ear advantage in the dichotic listening task, predicting a negative correlation. The results confirmed the prediction with a significant negative correlation with the hallucination item, not found when a corresponding correlation was calculated for a negative PANSS symptom (see also[53,54]).
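The correlational logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis. The laterality index used here (right minus left correct reports, divided by their sum, times 100) is a common convention in dichotic listening research and is an assumption, as is the choice of Pearson's r. The scores are made-up toy values, not data from the study.

```python
from math import sqrt

def laterality_index(right_correct, left_correct):
    """Percentage laterality index; positive values mean a right-ear advantage."""
    total = right_correct + left_correct
    if total == 0:
        raise ValueError("no correct reports from either ear")
    return 100.0 * (right_correct - left_correct) / total

def pearson_r(xs, ys):
    """Plain Pearson product-moment correlation between two sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical toy data (NOT from the study): PANSS hallucination item score
# and (right-ear, left-ear) correct reports for six imaginary patients.
panss_p3 = [1, 2, 3, 4, 5, 6]
ear_scores = [(18, 8), (17, 9), (15, 10), (14, 12), (13, 12), (12, 13)]

rea = [laterality_index(r, l) for r, l in ear_scores]
r_value = pearson_r(panss_p3, rea)
print(f"r = {r_value:.2f}")  # negative for these toy values
```

In this toy example, a negative r corresponds to the predicted pattern: the higher the hallucination score, the smaller the right ear advantage.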
The notion that the magnitude of the right ear advantage in hallucinations would reflect the degree to which AVHs interfere with the perception and subsequent processing of an external auditory stimulus was further pursued by Ocklenburg et al who conducted a meta-analysis of dichotic listening studies involving schizophrenia patients and in particular hallucinating patients. The total sample in the meta-analysis consisted of 700 patients and 700 healthy controls. The results showed that the patients had significantly reduced right ear advantage compared to controls (effect-size approximately 0.25) and that this was exaggerated when comparing patients with AVHs and healthy controls (effect-size 0.45). This is shown in Figure 1.
Figure 1 Graphic illustration of the meta-analysis done by Ocklenburg et al on the relationship between the right-ear advantage in dichotic listening and schizophrenia (A), and between right-ear advantage and hallucinations (B).
Reprinted with permission from the authors and the publisher.
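The text above reports effect sizes without naming the metric. As a hedged illustration of how a group difference in right ear advantage can be expressed as a standardized mean difference, the sketch below computes Cohen's d with a pooled standard deviation. The choice of Cohen's d is an assumption, and the values are invented toy numbers, not data from the meta-analysis.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Standardized mean difference (group_a minus group_b) with pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least 2 observations")
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

# Hypothetical laterality-index values (NOT real data).
controls = [20, 25, 30, 15, 10]
patients = [15, 20, 25, 10, 5]

d = cohens_d(controls, patients)
print(f"d = {d:.2f}")  # positive d: larger right ear advantage in controls
```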
Ever since the pioneering work by Johnstone et al which showed that schizophrenia patients had reduced grey matter volume compared to healthy controls, the use of magnetic resonance (MR) for the study of brain correlates of schizophrenia as a mental disorder (see meta-analysis by), and AVHs in particular (see meta-analysis by[41,44]), has been a major research topic, using both structural and functional MR measures. The fact that temporal and frontal lobe brain areas are implicated in auditory hallucinations has been repeatedly shown in functional and structural imaging studies. Using a voxel-based-morphometry (VBM) analysis method, Gaser et al reported already in 2004 a reduction of grey matter in the left superior temporal gyrus and auditory cortex in hallucinating patients. This finding was followed by Neckelmann et al who found a similar pattern, and in addition reported reductions in the frontal lobe, thalamus, and basal ganglia (Figure 2).
Figure 2 Structural magnetic resonance imaging, using Voxel-Based-Morphometry analysis of grey matter density in the brains of patients with frequent (PANSS P3 > 3), and infrequent (PANSS P3 < 4) hallucinations, shown in sagittal, coronal and axial slices.
Data from Neckelmann et al redrawn with permission of the authors and the publisher.
These results have later been strengthened by the results of a meta-analysis by Modinos et al who concluded that: “Severity of AVHs was significantly associated with GMV reductions in the left and marginally with the right STG, including Heschl’s gyrus” (p. 1046).
Frontal and temporal lobe grey matter reductions in schizophrenia patients have often been related to positive symptoms in general, and not only to AVHs[60-62] (see also for review). It should be noted however, that in a very recent study by van Tol et al, grey matter volume in the superior temporal gyrus was significantly reduced in both frequent and infrequent AVH patients compared to healthy controls, while reduction in the parahippocampal region was unique to AVH patients only. This could lend support to an intrusive memory account of AVHs[21,64], while at the same time weakening a speech perceptual model since temporal lobe area reductions did not differ between AVH and non-AVH patients. However, such a conclusion may be premature, since the results do not disprove that AVHs are related to temporal lobe pathology, but only say that this may be a more general characteristic of schizophrenia.
Less is known about white matter changes and abnormalities in the brains of AVH patients than about grey matter abnormalities. Allen and Modinos reviewed the white matter literature, based on diffusion tensor imaging (DTI) and fractional anisotropy (FA), which is an index of the difference in flow of water molecules along vs across axonal fibers. In general, these studies have shown both increased and decreased FA values in patients with schizophrenia, in particular for connections between anterior and posterior brain regions. Interestingly, schizophrenia patients experiencing AVHs show increased connectivity between frontal and temporal/parietal areas, which could be a white matter structural correlate to the verbal nature of AVHs.
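For readers unfamiliar with FA, the index can be sketched from the three eigenvalues of the diffusion tensor using the standard textbook definition. The function name and the example eigenvalues below are illustrative assumptions and are not taken from the reviewed studies.

```python
from math import sqrt

def fractional_anisotropy(l1, l2, l3):
    """FA from the three diffusion tensor eigenvalues.

    Returns 0 for fully isotropic diffusion (equal eigenvalues) and
    approaches 1 when diffusion occurs along a single axis only.
    """
    denom = l1 * l1 + l2 * l2 + l3 * l3
    if denom == 0:
        return 0.0  # no diffusion at all; define FA as 0
    md = (l1 + l2 + l3) / 3.0  # mean diffusivity
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    return sqrt(1.5 * num / denom)

# Hypothetical eigenvalues (arbitrary units): strong diffusion along the
# fiber direction, weak diffusion across it.
print(round(fractional_anisotropy(1.7, 0.3, 0.2), 3))
```

Higher values thus indicate that water diffuses preferentially along one direction, as expected in coherently organized fiber bundles.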
Functional imaging has likewise revealed temporal and frontal lobe abnormalities in patients with frequent AVHs[12,38-40,66,67]. See also[19,23] for meta-analyses, and for an overview of both functional and structural imaging studies on AVHs. Thus, an understanding of the neuronal circuitry underlying AVHs is emerging, with new evidence coming from cognitive, behavioral, and brain imaging studies (see[26,68,69] for reviews). When reviewing the literature on functional imaging and AVHs, a distinction has to be made between “state” and “trait” studies. With “state” is meant studies that have compared patients during hallucination episodes with non-hallucination episodes while in the scanner, which may or may not be compared with healthy controls, and may or may not include the presentation of external stimuli. Patients typically indicate with a button-press or similar when they experience AVHs in the course of the scanning. With “trait” is meant that patients have been screened for frequency and severity of AVHs with typical symptom interview scales, like the PANSS, before the scanning, such that it is known in advance which patients could be labelled AVH-patients, and which could be labelled non-AVH-patients. Patients are then compared with healthy controls, typically in a paradigm with external stimuli, although this is not a requirement.
Resting-state brain activation
Northoff et al suggested, after reviewing the literature, that AVHs may represent resting-state neuronal hyperactivity in the default-mode cortical network, a kind of spontaneous neuronal firing especially in the auditory areas, during AVHs in the absence of an external auditory stimulus. Northoff et al summarized their review by stating that: “the findings suggest that the resting state activity in especially the anterior medial cortical regions is abnormally increased as indicated by the observations of ... increased connectivity, and increased low-frequency fluctuations in schizophrenic patients.” (p. 206).
Increased spontaneous activation in regions associated with the default-mode network may then interfere with processing of external speech, or an auditory stimulus, causing a kind of neuronal interference, i.e., the internally generated neuronal activity during episodes of AVH interferes with the processing of an external stimulus. This is illustrated in Figure 3, which also shows in the lower panel a comparison between patients with frequent and severe AVHs and healthy controls in the presence of an auditory stimulus.
Figure 3 Functional imaging results for hallucinating patients based on the meta-analysis done by Kompus et al which show activation in the absence of an external auditory stimulus (upper left panel) compared with healthy controls in the presence of an external auditory stimulus (upper right panel, data from van den Noort et al).
The lower left panel shows the same activation as in the upper left panel, spontaneous activation in hallucinating patients in the absence of an external auditory stimulus, but now compared with absence of activation in hallucinating patients in the presence of an auditory stimulus. Color-coded areas indicate significantly activated brain regions during active hallucinations and task-processing. Reprinted and redrawn with permission from the authors and the publishers.
Neuronal interference was also observed by Hubl et al and Ford et al, where the amplitude of the N1-component of the electrophysiological event-related potential to an external auditory stimulus was reduced in AVH patients. Based on these and other studies (e.g.,[70,73-75]), it is suggested that the evidence shows reduced rest-external stimulus interaction in AVHs, caused by failure of modulation of both resting state and stimulus-related activity. This is similar to the suggestion by Hugdahl et al that AVHs may involve: “failure of down-regulation of a resting-state network and corresponding up-regulation of an effort network, thus upsetting the normal functioning of cognitive control mechanisms” (p. 41).
A paradoxical finding
Empirical evidence from functional neuroimaging studies, both electrophysiological and hemodynamic, points in the direction of AVHs being related to abnormal neuronal network architecture and interactions, with increased activation in auditory and speech perception areas in the absence of external auditory stimuli, and reduced activation in the same areas in the presence of external auditory stimuli. Kompus et al labelled this the “paradoxical effect” since it is paradoxical that increased spontaneous activation in AVH patients in the absence of an auditory stimulus is not further increased when an auditory stimulus is actually presented, as if the perceptual system is “shut down”. A major question is of course why this paradoxical effect occurs in the first place.
Northoff et al suggested that the reduced resting-stimulus network interaction is the result of a mislabeling process where the abnormal processes that have occurred in the auditory cortex in AVH patients cause the patient to register an internally generated event as if it was an external stimulus, similar to faulty monitoring and labelling of an inner speech event as coming from the outside[35,36]. Similarly, Kompus et al suggested that the paradoxical effect of reduced activation to an external auditory stimulus in the auditory cortex in AVH patients would fit a model of abnormal interaction between cortical networks, with too high activation in the default-mode network during episodes of external stimulus processing, causing neuronal interference, or resource competition.
Kompus et al further discussed if this could be an attentional effect, in addition to a sensory processing effect, such that reduced activation to an external stimulus is caused by a failure to allocate attentional resources to an external source in the course of a hallucinatory episode. That schizophrenia patients are impaired on neuropsychological tests for attention is well-known[77-79], and it has recently also been shown that AVH schizophrenia patients fail to allocate attentional resources to the location of an auditory stimulus. Thus, it is possible that the paradoxical effect seen in the meta-analysis by Kompus et al is an attentional effect, with the cognitive system being “shut down”. A third explanation is that it is a signal-gating effect, and that an auditory signal is not properly gated from the ear to the temporal lobe, possibly because of an abnormality at the thalamic or hippocampal levels, in which case it would be the thalamic system being “shut down”. Several studies have shown that schizophrenia patients show aberrant sensory gating[38,80,81] such that the filtering function of sensory gating, i.e., to filter out irrelevant stimulus noise at the sub-cortical level, with the aim of facilitating attention shifts at the cortical level to relevant aspects of an external stimulus, is impaired. If this mechanism is aberrant, it is not unreasonable to assume that the processing of an external stimulus is hindered by the noise created by the internally generated AVH, which is not inhibited by the incoming stimulus.
FAILURE OF TOP-DOWN INHIBITION
Irrespective of the theoretical model and neuroimaging data supporting either a perceptual model or other models, what may trouble patients the most is the failure to ignore the “voices” when they occur, and to focus on other things happening in their surrounding environment. It is as if the “voices” drain the patient of their cognitive capacity, leaving them at the mercy of whatever malevolent comments or commands the “voice” may deliver. From a clinical point of view, this is probably what troubles, and not least scares, the patient the most: being unable to control the “voice” and to withstand and ignore what it “says”. This not only confuses the patient but also creates anxiety and fear when struggling not to comply with what the “voice” commands. From a theoretical point of view this can be seen as a failure of top-down inhibitory control, or a failure of executive functions as it is typically called in classic neuropsychology[82,83]. Using a variant of the dichotic listening experimental paradigm, the forced-attention paradigm, as described in the perception section above, Hugdahl et al found a negative correlation between scores on the PANSS hallucination item and the ability to shift attention to either side in auditory space (Figure 4).
Figure 4 Correlations between the right and left ear scores on the forced-attention dichotic listening task and scores on the PANSS P3 Hallucination item.
Note the negative correlations for the FR and FL instruction conditions (upper left and lower right), respectively. See text for further explanations. From Hugdahl et al, reprinted with permission from the authors and the publisher. ns: Not significant.
The forced-attention dichotic listening task[84,85] is an experimental task that taps non-executive and executive attention in the same paradigm. As discussed above, due to the lateralization of speech sound perception, the right ear stimulus of the dichotic pair will be preferred for processing over the left ear stimulus since it has direct access to the left temporal lobe[51-52,86,87]. As also discussed above, this is called a bottom-up right-ear advantage (REA). The novel aspect of the forced-attention extension of the paradigm is that when the subject is instructed to explicitly attend to and report only from the right ear, this will create a non-executive attentional focus situation, since the bottom-up tendency to process the right ear stimulus acts synergistically with the top-down instruction to attend to the same ear, which results in a larger REA. When the subject is instructed to attend to and report only from the left ear, the brain faces a cognitive conflict where the bottom-up and top-down processes are opposed to each other, and act non-synergistically. The conflict requires executive control resources, and the cognitively strong right ear stimulus has to be inhibited and the cognitively weak left ear stimulus has to be facilitated. Most healthy adults can overcome the bottom-up tendency and report the left ear stimulus of the dichotic pair, showing a left-ear advantage (LEA). The finding of a negative correlation between PANSS score for the hallucination item and the magnitude of the REA in the situation with attention focused on the left ear that was reported by Hugdahl et al would then be evidence for a parametric inverse relationship between frequency and severity of AVHs and the ability to execute cognitive control.
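As an illustration of how ear preference in the instruction conditions might be quantified, the sketch below computes a laterality index from the number of correct right-ear and left-ear reports. The formula (RE − LE)/(RE + LE) × 100 is a common convention in dichotic listening research and is assumed here; it is not taken from the studies reviewed above. The function name, the "NF" (non-forced) label, and all report counts are hypothetical.

```python
def laterality_index(right_correct: int, left_correct: int) -> float:
    """Return (RE - LE) / (RE + LE) * 100.

    Positive values indicate a right-ear advantage (REA),
    negative values a left-ear advantage (LEA).
    """
    total = right_correct + left_correct
    if total == 0:
        raise ValueError("no correct reports from either ear")
    return 100.0 * (right_correct - left_correct) / total


# Hypothetical counts of correct reports (right ear, left ear) per condition.
conditions = {
    "NF": (16, 11),  # non-forced: bottom-up REA only
    "FR": (21, 6),   # forced-right: instruction and bottom-up bias agree
    "FL": (9, 18),   # forced-left: executive control reverses the bias
}

indices = {
    name: laterality_index(right, left)
    for name, (right, left) in conditions.items()
}

for name, value in indices.items():
    print(f"{name}: {value:+.1f}")
```

With these made-up counts the index is largest in the FR condition and turns negative in the FL condition, which is the pattern the text describes for most healthy adults. A reduced ability to exert cognitive control would show up as an FL index that stays positive.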
Although the inability to suppress and inhibit the “voice” is a prominent feature of AVHs clinically, there are surprisingly few studies that have explicitly looked at the relationship between executive functions in general, and inhibition in particular, and AVHs. Executive functions in AVHs were reviewed by Waters et al, in particular in relation to what these authors called failure of “intentional cognitive inhibition”, which is the inability to voluntarily monitor and inhibit intrusive thoughts. Such thoughts may be intrusive fragments from memory that are not inhibited and are then mislabeled as coming from the outer world in the form of someone speaking to the patient. Cognitive inhibition has been decomposed into different sub-components and sub-processes, voluntary vs involuntary inhibition, where the latter is the process of ignoring something without being aware of the process of doing it. Miyake et al have proposed that inhibition in executive functions is related to the suppression of pre-potent responses and response-tendencies, which in addition requires the ability to shift attention set after inhibition. Applied to AVHs, it can now be suggested that the “voices” are intrusive and pre-potent thoughts that are not voluntarily inhibited and that take on an autonomous role once they are initiated and enter into the patients’ awareness.
The failure of cognitive inhibition in AVH patients may have a neuronal localization in the frontal lobes, and a corresponding prefrontal inhibitory abnormality. Executive function and inhibition have been linked to the anterior cingulate cortex (ACC) in several studies, ranging from the early positron emission tomography (PET) studies by Pardo et al with the Stroop task to later studies with fMRI using various auditory and visual tasks (see[24,91]). Thus, the prefrontal cortex and the ACC are areas critically involved in cognitive control and executive functions, and these regions have repeatedly been shown to be affected in patients with schizophrenia (see[26,92] for reviews).
LEARNING TO IGNORE THE “VOICES”
Failure of attention and executive control in AVH patients may also be the starting point for novel cognitive training attempts. Cognitive behavior therapy for schizophrenia and auditory hallucinations has long been directed towards giving the patient the skills necessary to voluntarily inhibit and shift attention away from the “voice” (see[28,93]). Recent approaches to cognitive therapy have been more focused on specific training procedures, rather than inducing a therapeutic change of strategy. Thus, a distinction can be made between treatments, which in the case of schizophrenia will mean abolishing symptoms; therapies, which induce new strategies for coping with stressful situations; and training, which is specifically aimed at handling a single event or symptom. We have developed an iPod/iPhone app based on the dichotic listening paradigm, described above, as a tool for learning how to inhibit and ignore the “voice”. The app can be used “there-and-then” in everyday social situations, such as riding on a bus, whenever the patient feels the need for help in withstanding the “voice”. Preliminary results from 15 patients show some promising effects that warrant further research on the use of app-technology for training and learning new mental skills in patients with schizophrenia.
THE ERC “VOICE” MODEL
The results from the ERC project have provided the empirical input to a model that is graphically shown in Figure 5 (the model is also presented in).
Figure 5 The ERC “VOICE” model showing impaired processing in a suggested pre-frontal and temporal lobe neuronal network, due to hyper-activation of temporal lobe regions, including the auditory cortex, which is not inhibited due to impairment of pre-frontal executive inhibitory functions.
See text for further explanations. Redrawn from Hugdahl et al, with permission from the authors and the Publisher.
The purpose of proposing the model was to advance our understanding of the neuronal underpinnings of AVHs from the perceptual, attentional and cognitive control domains that have been reviewed and discussed above. The model derives from an assumption of the existence of two cognitive networks, or systems; one that acts as a bottom-up system that primarily responds to stimulus features, and consists of perceptual and sensory processes, and another top-down system that responds primarily to the cognitive demands of the situation. The model further assumes that the bottom-up system is responsible for the actual initiation of AVHs, driven by neuronal hyper-excitation in the temporal lobes, while the top-down system is responsible for the maintenance of AVHs, and in particular failure of inhibition and attentional focus, localized to frontal and parietal lobe areas. As discussed above, AVHs are seen as the product of an abnormality in both the bottom-up and top-down systems characterized by hyper-activation of the bottom-up system, and hypo-activation of the top-down system, as illustrated in Figure 5. AVHs are therefore not inhibited once they occur because of impaired functioning of the prefrontal cortex. Finally, the model assumes that parietal lobe areas are activated for the direction of attentional focus towards the voices, and are also not inhibited through impaired frontal lobe executive and cognitive control functions. As also discussed above, the “VOICE” model will generate new hypotheses not only for the understanding of the underlying neuronal mechanisms for AVHs but can also contribute to novel hypotheses regarding cognitive training and learning of how to inhibit and ignore the “voice”, by teaching the patient how to re-allocate attention away from the inner “voice” and towards real outer voices. 
From the networks outlined in the “VOICE” model, it is possible to derive the underlying cortical network representations by applying advanced connectivity analysis approaches to functional neuroimaging data, thus advancing our current understanding of the underlying mechanisms in a hypothesis-driven manner. The model moreover suggests new avenues that go beyond existing paradigms and methods, and move from the cognitive and imaging levels to lower levels of explanation by asking which neurotransmitters and receptors may be involved in the neuropathology. This is a new avenue in AVH research, and knowledge about transmitters and receptors may provide the inspiration for new pharmacological treatments. The “VOICE” model will also cover non-clinical individuals with AVHs, i.e., individuals in the general population who share the experience of “hearing voices” but who are not clinically handicapped by their experiences. A major difference between clinical and non-clinical AVH individuals is intact attentional top-down cognitive control of the “voices” in non-clinical individuals, coupled with intact frontal lobe functioning and increased activation in these regions[95,96]. Another difference between clinical and non-clinical individuals who experience AVHs is that the former lack a meta-cognitive understanding of the subjectivity of their experience and to a higher degree ascribe the experience to external factors.
THE NEUROCHEMISTRY OF AVHs
MR spectroscopy and glutamate/gamma-amino-butyric-acid interactions
A question that is seldom asked in the AVH literature is what transmitters and receptors may be causing or contributing to the neuronal activation abnormalities seen in AVH patients and reviewed above (see however). We do not know what triggers an AVH at the cellular level, causing the subjective experience of perceiving a “voice” in the absence of an external stimulus. To answer this it is necessary to move down to the receptor and transmitter levels of explanation. The meta-analysis by Kompus et al (see also) revealed that areas in the posterior left temporal and right frontal lobes were hyper-excited during spontaneous AVHs, and at the same time hypo-excited to an external speech sound, when compared with healthy control subjects. Thus, AVHs seem to produce a neuronal paradox in the sense that the same brain areas that fire uncontrollably at the initiation of AVHs are at the same time refractory to the presentation of an external speech sound. Kompus et al called this a “paradox” since one would expect that activation already present in the auditory and speech perception regions in the absence of an auditory stimulus would increase further when an external stimulus is presented in addition. The spontaneous increase in neuronal activity in the auditory cortex in AVH patients was also recently shown by Homan et al, who found that cerebral blood flow was higher in this area in AVH patients, and remained so after treatment with transcranial magnetic stimulation. Homan et al suggested that increased blood flow may be a trait-marker of AVHs, which was also suggested by Kühn et al. It is now suggested that the paradox could be explained by the differential actions of gamma-amino-butyric-acid (GABA) interneurons that produce hyper-excitation in the first case and hypo-excitation in the second case.
This can be empirically addressed through MR spectroscopy (MRS) to measure and quantify regional concentrations of brain metabolites, such as glutamate, which is excitatory and GABA, which is inhibitory. MRS thus allows for in vivo measurements of transmitter concentrations in patients which solves the problem of exclusive reliance on animal models. MRS allows for the near-simultaneous recording and quantification of both glutamate and GABA levels, and also other relevant metabolites, like N-acetyl-aspartate (NAA), choline, and creatine[99,100]. By adding an MRS sequence of 5-10 min to the fMRI sequence, it will therefore be possible not only to measure and quantify concentrations of glutamate and GABA in the hallucinating brain but also to correlate MRS metabolite concentrations in selected voxels with other clinical, cognitive and imaging data. A typical MR spectrum is shown in Figure 6.
Figure 6 Example of a printout of the typical peak spectra from an MR spectroscopy sequence applied to a single voxel.
Selected metabolites are indicated with respective acronym. Glu: Glutamate; Gln: Glutamine; Glx: Glutamate + Glutamine; GABA: Gamma-amino-butyric-acid; Cre: Creatine; Chln: Choline; NAA: N-acetyl-aspartate. From Hugdahl et al, reprinted with permission from the authors and the publisher.
Glutamate and schizophrenia symptoms
Glutamate is suggested to have an effect on positive symptoms associated with schizophrenia through balancing sub-cortical dopamine release (see[10,101] for overviews). The classic pathway for the involvement of glutamate in schizophrenia and in the regulation of positive symptoms is that reduced cortical glutamate levels, and/or dysfunctional N-methyl-D-aspartate (NMDA) receptors, hypo-activate GABA interneurons, which leaves striatal dopamine release uninhibited, resulting in dopamine excess in the schizophrenia brain. The finding of reduced glutamate levels in schizophrenia patients would fit with a number of other studies which show that when healthy individuals are given ketamine or phencyclidine (PCP), which are drugs that act as NMDA receptor antagonists[103,104], they show signs and symptoms of a psychosis. The possible relationship between glutamate reduction and dopamine was described by Carlsson et al such that prefrontal glutamate release will result in activation of GABA interneurons to balance too high levels of glutamate, which will also have an inhibitory effect on striatal dopamine release (see also). When glutamate levels fall below critical levels or when GABA receptors are dysfunctional, GABA interneurons will consequently be hypo-activated and dopamine release will be correspondingly uninhibited. This will result in excess dopamine, and in particular dopamine D2-receptor activity, producing positive psychotic symptoms. It should be pointed out, however, that a more recent article concluded, after reviewing the schizophrenia literature, that the findings are mixed, with both decreases and increases of dopamine, particularly in frontal regions.
Interestingly there were no studies comparing glutamate levels in patients and controls in temporal regions in the Poels et al review, which leaves the question of abnormal glutamate regulation in the auditory and speech perception areas currently unanswered. Falkenberg et al moreover found a positive correlation between fMRI activation and glutamate in the anterior cingulate region in schizophrenia patients, while there was a corresponding negative correlation in control subjects. Falkenberg et al also found that decreased glutamate levels were associated with impaired executive control functioning when the patients were tested on a cognitive control task. It should be noted that the healthy control subjects did not show an association between task performance and glutamate levels. Thus, glutamate seems to be specifically involved in mediating frontal lobe executive functioning in schizophrenia patients, who also perform below the controls on this task. The study by Falkenberg et al again raises the question as to whether glutamate levels are increased or decreased in temporal lobe areas, and whether patients with frequent AVHs differ from patients with less frequent AVHs.
Temporal lobe glutamate levels and AVHs - preliminary results
The only data, to my knowledge, on the role of transmitters in AVHs are from the study by Homan et al, who found that levels of NAA differed between patients and healthy controls, and that NAA levels correlated positively with the frequency of positive symptoms (see however). Again the MRS voxel placements were in the frontal lobes, and therefore it is not known what role glutamate may play for the frequency and severity of AVHs when measured in the temporal lobes. As argued above, an increase in temporal lobe glutamate levels may be predicted to go along with increased AVH frequency, considering the neuroimaging data which have shown increased activity in both hemodynamic and electrophysiology studies[18,24,49,110], although as for other brain metabolites that have been studied in schizophrenia patients, both increases and decreases could be expected. Hugdahl et al recorded glutamate levels from four voxels: in the left and right auditory cortex, overlapping with Heschl’s gyrus, and in the left and right prefrontal cortex, in the superior frontal gyrus. Symptom frequency and severity were quantified from the PANSS interview scale. The results showed first of all that schizophrenia patients had reduced glutamate levels compared to the healthy controls, which would be predicted from previous studies (e.g.,[63,112,113]). Second, patients with a high symptom load for the AVH item in the PANSS had significantly higher glutamate levels in both temporal and frontal lobe areas, thus for the first time demonstrating increased glutamate levels in these regions. The results were moreover specific for the AVH symptom when compared with the emotional withdrawal negative symptom. Figure 7 shows the relationship between AVH and glutamate levels in the temporal lobe. The glutamate data in Figure 7 are pooled for the right and left temporal lobes, with a voxel placement in the upper posterior portion, overlapping with Heschl’s gyrus.
The data are presented as correlations between mean glutamate levels in the voxel areas and PANSS AVH item scores, as well as for the sum total of all PANSS positive item scores, also including the AVH item (data re-plotted from Hugdahl et al).
Figure 7 Scatter-plots of the correlations between scores on the PANSS P3 Hallucination item and Glutamate levels from a temporal lobe voxel, and between the sum total of positive symptoms and Glutamate levels from the same voxel.
P3_Hallu: Score for the PANSS hallucination item; POSITTOT: Sum total of positive symptoms; Glu: Glutamate.
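The kind of relationship plotted in Figure 7 can be sketched as a simple correlation between symptom scores and metabolite levels. The example below is only an illustration: the values are invented, the variable names are hypothetical, and the use of Pearson's coefficient is an assumption, since the text above does not state which coefficient was used. For an ordinal item such as PANSS P3, a rank-based coefficient could be equally appropriate.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Invented illustration data, NOT values from any study.
p3_hallucination = [1, 2, 3, 4, 5, 6]           # PANSS P3 item scores
temporal_glu = [8.1, 8.4, 8.3, 9.0, 9.2, 9.6]   # glutamate, arbitrary units

r = pearson_r(p3_hallucination, temporal_glu)
print(f"r = {r:.2f}")
```

A positive coefficient here would correspond to higher glutamate levels going along with higher hallucination scores, as described for the temporal lobe voxel.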
The results reported above thus extend the findings reported in Homan et al who found altered levels of NAA in AVH patients, by showing that altered levels of brain metabolites in AVH patients are also related to glutamate. This may be a significant novel finding since NAA is involved in the metabolism of NAA-glutamate. Thus, the two metabolites are functionally overlapping, and both could therefore be expected to be elevated in a state of neuronal hyper-excitation, as in auditory hallucinations.
The relevance of the findings by Hugdahl et al is substantiated by the fact that while most studies have found reduced glutamate levels in schizophrenia patients in general, levels were increased in hallucinating patients. The same holds for NAA, which is also typically reduced in schizophrenia in general, but increased in hallucinating patients. The demonstration that auditory hallucinations may be mediated by excessive glutamate levels in the same brain regions that have been previously implicated from fMRI studies is a finding that may have implications for new treatment targets and the development of new drugs that are tailored to counteract the experience of hearing voices rather than acting on the schizophrenia disorder as such. The MRS findings reported in Hugdahl et al would also support a dimensional view that specifically targets alterations in brain metabolites related to specific symptoms rather than to the schizophrenia disorder as such (see also).
NEW WAYS OF SAMPLING SYMPTOM DATA IN REAL-TIME
The introduction of smartphone app technology
Data on frequency and intensity of auditory hallucinations are typically obtained from scores on a single item in structured interview scales, where the PANSS is a commonly used scale. As previously mentioned, data on hallucinations are entered as a number between 1 and 7, with 1 = low frequency or non-existent, while 7 = high frequency. There are, however, a number of non-optimal and confounding factors with such interview scale data that may obscure the true frequency and variability over time on an individual basis. Data are obtained in a single session, or a few sessions, typically in an artificial office environment; they are aggregated across several sub-symptom dimensions, processes, and verbal responses; and they are interpreted by the clinician before a score is entered on the questionnaire. Questionnaire data are also by necessity retrospective in the sense that the patient is asked to recall past events, e.g., whether they have had unusual or uncommon experiences lately, or whether they sometimes hear voices inside their head that others cannot hear, what these voices tell the patient, how often they hear them etc. Thus, there is a fairly long “distance” between the phenomenological experience in the patient’s mind and the data that the researchers are provided with and use for statistical analyses; the data have been filtered through several intermediate steps from the patient to the researchers. There is thus a pertinent problem of how to bridge, or shorten, the distance from the phenomenology of a personal experience to a score on the analysis computer. Our research group at the University of Bergen, Norway, has recently suggested the use of smartphone technology for acquisition of real-time data on cognitive and clinical parameters as well as for cognitive training to inhibit the “voice”, as previously discussed (e.g.,).
The idea is that the patient is equipped with an iPhone or iPod and, after opening the app, is presented with a series of questions shown on the display that tap the three major dimensions in auditory hallucinations: a perceptual dimension (do the voices come from outside or inside the head), a cognitive dimension (does the patient have full or no control over the voices), and an emotional dimension (are the voices negative or positive). In addition there are questions about frequency and intensity of the “voice”. The patient responds to each question by moving the finger-slider from the left (inside the head, no control, emotionally negative) to the right (outside the head, full control, positive) of the display, and the distance is then quantified in centiles.
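The slider quantification described above can be sketched as a mapping from a slider position to a value between 0 and 100. This is a minimal illustration under stated assumptions, not the app's actual implementation: the function name, the pixel-based coordinates, the clamping of out-of-range positions, and the example track width are all hypothetical.

```python
def slider_to_centile(position_px: float, track_width_px: float) -> int:
    """Map a slider position to 0-100, where 0 is far left and 100 is far right."""
    if track_width_px <= 0:
        raise ValueError("track width must be positive")
    # Clamp positions that fall outside the slider track (assumed behavior).
    clamped = min(max(position_px, 0.0), track_width_px)
    return round(100 * clamped / track_width_px)


# Hypothetical responses on a 300-pixel-wide slider track.
TRACK_WIDTH = 300
responses = {
    "perceptual": slider_to_centile(240, TRACK_WIDTH),  # towards "outside the head"
    "cognitive": slider_to_centile(60, TRACK_WIDTH),    # towards "no control"
    "emotional": slider_to_centile(150, TRACK_WIDTH),   # midway negative/positive
}

print(responses)
```

Storing each response together with a timestamp would allow the repeated, real-time sampling that the app approach is intended to support.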
There are several advantages with an app-approach to symptom data sampling: (1) data will be patient-driven, rather than therapist-driven; (2) data will be acquired in real-time, and data acquisition can be repeated over and over again during the day, either at fixed or pre-determined intervals, or whenever the patient feels a need to enter data; (3) data will better reflect the “ebb and flow” of symptoms; and (4) this will allow examination of temporal relationships between variables with a short “life-cycle”, which is not possible with standard interview questionnaires. A final advantage is that an iPhone app will have high ecological validity in the sense that a patient going through these questions when, e.g., sitting on a bus, will not stand out in any way. He or she will instead look like most other young people who listen to music or play games on their iPhones while on a bus, or elsewhere in today's digitized society. Figure 8 shows an example of a prototype display with the three dimensions.
Figure 8 iPhone app for the sampling of real-time data on the three dimensions of auditory verbal hallucinations, cognitive control (upper slider), emotional content (middle slider), and perceptual locus (lower slider), developed by the ERC “VOICE” Group at the University of Bergen, Norway.
See text for further explanations.
UNSOLVED ISSUES AND FUTURE DIRECTIONS
I have reviewed the current literature on the phenomenology, neuroimaging, and neurochemistry of AVHs. In particular the neuroimaging data, both structural and functional, support a view of AVHs as perceptual misrepresentations, caused by neuronal hyper-activation that results in a phenomenological experience of “someone speaking to the patient” despite the absence of an external auditory source for such an experience. In that respect AVHs are like perceiving sounds that do not exist. It is however quite conceivable that the phenomenological experience of a sensory event is caused by misinterpretations of inner speech and covert monologues, or is triggered by intrusive traumatic memories, which are two alternative models for the explanation of AVHs. Localization of the neuronal correlates of AVHs in the temporal lobes cannot however explain other cognitive aspects of AVHs, like the failure to cognitively inhibit and ignore the “voices”, and to focus attention on the outer world rather than turning inwards and engaging in a running commentary and dialogue with the “voice”. The cognitive aspects of AVHs have been shown to implicate frontal and parietal brain regions, and a model is presented which sees AVHs as initiated by temporal lobe hyper-activation that is not controlled due to frontal lobe hypo-activation in a cortical network. Despite the huge literature on AVHs, there are still unresolved issues and questions. One pertinent question is what cognitive and neuronal processes give the hallucination a negative emotional tone, which should be the focus for future research. One hypothesis is that the amygdala may play a critical role in the emotional flavoring of AVHs. Another unsolved issue is the possibility of a genetic predisposition for AVHs, orthogonal to genetic predisposition for schizophrenia in general (see for an update on genetic factors in schizophrenia).
The fact that AVHs also occur in the general population in individuals not in need of clinical care[11-12,96] provides a background against which it could be predicted that AVHs could be related to a genetic pathway orthogonal to genetic susceptibility for schizophrenia. These issues should be explored in future research, hopefully also in larger samples than has been the case so far.
The author wishes to express his thanks to all colleagues and students that have contributed to the research, too many to mention all.
P- Reviewer: Chakrabarti S, Gazdag G, Maniglio R, Serafini G, Schweiger U S- Editor: Ji FF L- Editor: A E- Editor: Yan JL