Schizophrenia is a severe neuropsychiatric disorder associated with core neurocognitive dysfunction. Traditional models of schizophrenia focus on high-level disturbances such as impairments in attention or executive processing. However, more recent findings suggest widespread disturbances across multiple neurocognitive domains (
1,
2). For example, patients show significant deficits in the ability to match tones across a delay, irrespective of disturbances in attention or working memory (
3,
4). Similar deficits are observed in monkeys and humans following auditory cortex lesions (
5–
7). To date, however, the contribution of lower-level deficits to higher cortical dysfunction in schizophrenia remains relatively unexplored. This study focuses on two well-established event-related brain potentials—mismatch negativity (MMN) and P300—to evaluate the relationship between higher- and lower-level stages of auditory information processing impairments in schizophrenia.
P300 is a late cognitive brain response with a latency of approximately 300 msec following stimulation (
8). P300 is elicited most commonly by using an auditory "oddball" paradigm in which a sequence of repetitive standard tones is interrupted infrequently and unexpectedly by physically deviant tones. Generators of P300 are localized to multiple higher-order brain regions, including the temporoparietal and medial frontal cortex, consistent with a role of P300 in working memory and context updating (
9). Deficits in P300 generation in schizophrenia were first reported in the 1970s (
10) and have been extensively replicated since then, with an observed effect size deficit of approximately 0.89 across studies (
11,
12). Although it is generated within an oddball task, P300 does not appear to reflect either specific detection of the oddball tone or the response decision itself, as these can be dissociated behaviorally. Instead, P300 appears to represent activation of higher-level circuits that evaluate the significance of the unexpected event that has just occurred.
MMN, like P300, is generated following deviant auditory stimuli in an oddball task. However, unlike P300, MMN is generated equivalently whether or not individuals are paying attention to the stimuli while they are being presented and whether or not the stimuli are relevant to a particular behavioral task (
13). MMN generation, moreover, is driven primarily by feed-forward low-level auditory projections with limited recurrent modulation from higher-order brain regions such as prefrontal or parietal cortex. By contrast, P300 is heavily dependent upon recurrent processing within high-order cortical feedback loops (
14).
Behaviorally, MMN indexes function of the auditory "echoic" memory system, which stores preattentive representations of simple auditory features (
13). These representations, in turn, govern attention-dependent deviant detection (
15). MMN is typically recorded to relatively large differences in pitch between standard and deviant stimuli. However, MMN can be obtained reliably to pitch differences as small as 2%–3%, which are at or near the threshold of behavioral detection (
15–
18). Nevertheless, MMN increases progressively in amplitude and decreases progressively in latency with increasing levels of pitch deviance, suggesting that only partial MMN activation is required in order for individuals to consciously detect between-tone pitch differences (
15–
18).
Deficits in MMN generation to pitch-deviant stimuli were first documented over 15 years ago (
19,
20) and have been shown since that time to correlate significantly with deficits in tone matching (
21) and global level of function (
22). The mean effect size of the MMN deficit across studies (1.0–1.2 standard deviation units depending upon deviant type [
23]) is as large as or larger than the deficit in P300, although few studies have measured both potentials within the same groups (
24–
26).
In both schizophrenia and comparison subjects, MMN increases in amplitude until tones are separated by approximately an octave in pitch (100% deviance), whereafter plateau or even decline is observed (
18). When analyzed as a function of pitch deviance, patients show reduced plateau of MMN amplitude with no evidence of rightward shift in half-maximal response, suggesting that the deficit reflects primarily a decrease in maximal current flow that can be sustained through MMN generators (
18,
27).
MMN generators have been localized consistently to the primary auditory cortex based upon dipole mapping (
28,
29), fMRI (
30,
31), and direct intracranial recording (
27). Deficits in MMN generation similar to those observed in schizophrenia are induced by antagonists of N-methyl-d-aspartate (NMDA)-type glutamate receptors in both human (
29,
32,
33) and animal (
27,
34) models. In contrast, MMN is not significantly affected by psychotomimetics of psychostimulant (
35) or hallucinogenic (
33,
36) type. MMN and P300 are hierarchically organized and interrelate with behavioral tone-matching threshold (
37). Nevertheless, the degree to which MMN and tone-matching deficits account for P300 impairments in schizophrenia has been evaluated to only a limited degree, with most studies not finding interrelationships among potentials (
25,
26), although correlations have been found with the preceding N2 potential (
25).
In the present study, MMN and P300 were obtained using not only a traditional paradigm, employing fixed differences in pitch and duration across both groups, but also a novel paradigm in which pitch differences between standard and deviant tones were adjusted dynamically to correspond to each individual's tone discrimination threshold. In keeping with distributed cognitive models, we hypothesized independent but interrelated deficits in MMN and P300 generation in schizophrenia, with impairments in both lower-level processing, as indexed by MMN, and intrinsic dysfunction within P3 generator regions contributing significantly to the overall impairment in P300 generation.
Method
Subjects
Informed consent was obtained from 50 patients with DSM-IV schizophrenia or schizoaffective disorder (39 men and 11 women; mean age=37.8 years [SD=9.3]) and 21 normal comparison subjects (10 men and 11 women; mean age=36.2 years [SD=9.6]) following full explanation of procedures. All patients were receiving medication, with five patients receiving only a typical antipsychotic (haloperidol decanoate) and the remainder receiving atypical antipsychotics (olanzapine, risperidone, quetiapine, ziprasidone, clozapine) either alone or in combination. Positive symptoms were assessed using the Brief Psychiatric Rating Scale (BPRS) and negative symptoms were assessed using the Scale for the Assessment of Negative Symptoms (SANS) in a subgroup of 40 patients, along with the digit symbol test (
38) to assess cognition. Mean antipsychotic doses were 1104 mg/day (SD=428) in chlorpromazine equivalents based upon published conversion factors (
39). Socioeconomic status was determined using the Hollingshead scale (
40).
Stimulation Procedure
All subjects participated in the following procedures: 1) a "standard" MMN paradigm using a fixed level of pitch and duration deviance; 2) a "standard" P300 paradigm using a fixed level of pitch and duration deviance; 3) assessment of deviance-detection threshold; 4) an individualized (adaptive) P300 paradigm; and 5) an individualized MMN paradigm.
Standard paradigms
In the standard MMN condition, stimuli (5 msec rise/fall) consisted of 1000-Hz, 50-msec tones, with deviations in pitch (1500 Hz, 50 msec) and duration (1000 Hz, 75 msec) presented at 12.5% sequential probability. Stimuli were presented every 300 msec via headphones at nominal intensity of 75 dB. A total of 2600 stimuli (200 deviants) were presented. During the MMN runs, subjects were instructed to ignore the presented stimuli while watching a silent video. In the standard P300 paradigm, subjects were instructed to ignore the 1000-Hz tones and press a button in response to the 1500-Hz deviants. Stimuli were presented every 1500 msec. Deviant probability was 20%. A total of 750 stimuli (150 deviants) were presented.
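The P300 run parameters stated above (750 stimuli, 150 deviants, one stimulus every 1500 msec) can be sketched as a simple sequence generator. This is only an illustration: the function names, the random seed, and the rule that at least one standard separates successive deviants are assumptions, since the text does not describe how the stimulus order was randomized.

```python
import random


def make_oddball_sequence(n_stimuli, n_deviants, min_gap=1, seed=0):
    """Return a list of 'S' (standard) and 'D' (deviant) labels.

    Assumption (not stated in the source): at least `min_gap` standards
    separate any two deviants. The seed is likewise arbitrary.
    """
    rng = random.Random(seed)
    free_slots = n_stimuli - (n_deviants - 1) * min_gap
    if free_slots < n_deviants:
        raise ValueError("sequence too short for the requested spacing")
    # Pick distinct slots, then spread them out so the gap rule holds.
    picks = sorted(rng.sample(range(free_slots), n_deviants))
    positions = {q + i * min_gap for i, q in enumerate(picks)}
    return ["D" if i in positions else "S" for i in range(n_stimuli)]


def run_minutes(n_stimuli, soa_ms):
    """Duration of a run given the stimulus onset asynchrony in msec."""
    return n_stimuli * soa_ms / 1000.0 / 60.0


sequence = make_oddball_sequence(750, 150)
print(sequence.count("D") / len(sequence))  # deviant probability: 0.2
print(run_minutes(750, 1500))               # run length in minutes: 18.75
```

With these parameters the P300 run lasts just under 19 minutes, and the 150 deviants correspond to the stated 20% deviant probability.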
Tone-matching threshold assessment
Following completion of the standard paradigm, tone-matching thresholds were obtained using an up-down transform procedure in which standard and deviant tones were presented in intermixed fashion. Standard tones were always 1000 Hz in pitch, 50 msec in duration. Deviant tones were dynamically adjusted from 1020 Hz (2% change) to 1400 Hz (40% change) in logarithmic steps based upon a 3-down/1-up transform rule, in which difficulty was increased following three successive correct target detections, and decreased following one failed detection or false alarm. This rule converges on a mean correct performance of 79.4%.
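The 79.4% figure follows from the transform rule itself. A 3-down/1-up track settles where a step toward harder trials (three correct responses in a row) is as likely as a step toward easier ones, so p³ = 0.5 and p = 0.5^(1/3) ≈ 0.794. The sketch below shows that arithmetic and a minimal version of the tracking rule. The number of deviance levels, the starting level, and all names are assumptions made for illustration; the text gives only the 1020 to 1400 Hz range and the logarithmic spacing.

```python
def convergence_point(n_down):
    """Proportion correct targeted by an n-down/1-up rule."""
    return 0.5 ** (1.0 / n_down)


def log_spaced_levels(lo_hz, hi_hz, n_levels):
    """Deviant frequencies spaced in equal ratio steps."""
    ratio = (hi_hz / lo_hz) ** (1.0 / (n_levels - 1))
    return [lo_hz * ratio ** i for i in range(n_levels)]


class Staircase:
    """Minimal n-down/1-up tracker (hypothetical helper)."""

    def __init__(self, levels, n_down=3):
        self.levels = levels
        self.index = len(levels) - 1  # assumption: start at the easiest level
        self.n_down = n_down
        self.run = 0                  # consecutive correct responses

    def update(self, correct):
        if correct:
            self.run += 1
            if self.run == self.n_down:
                # Three correct in a row: smaller deviance (harder).
                self.index = max(self.index - 1, 0)
                self.run = 0
        else:
            # A miss or false alarm: larger deviance (easier).
            self.index = min(self.index + 1, len(self.levels) - 1)
            self.run = 0
        return self.levels[self.index]


print(round(convergence_point(3), 3))  # 0.794

levels = log_spaced_levels(1020.0, 1400.0, 10)  # 10 levels is an assumption
track = Staircase(levels)
for response in (True, True, True):
    track.update(response)
print(track.index)  # 8: one step harder after three correct responses
```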
Adaptive paradigms
In the adaptive P300 and MMN conditions, procedures were as for the standard paradigm except that pitch deviance levels were adjusted to correspond to each subject's tone-matching threshold. Adaptive P300 was obtained while subjects performed a task in which tone deviance was dynamically adjusted to maintain performance at 79.4% correct. Individualized MMN was obtained using a pitch deviant corresponding to the tone-matching threshold determined in the behavioral paradigm. The same duration deviant was included in this condition as well to allow comparison across paradigms.
Event-Related Potential Recordings
Electrical recordings were obtained from 64 scalp locations, consisting of expanded 10/20 placements, along with monopolar vertical and horizontal EOG electrodes using an Active II recording system (Biosemi, Amsterdam), relative to nose reference. Activity was amplified with a bandpass of 0.001 to 100 Hz and digitized continuously at a minimal sampling rate of 500 Hz.
Epochs (–100 to 450 msec) were constructed offline. Artifact-rejection threshold was ±100 μV. Accepted trials were averaged for each subject relative to prestimulus baseline. Peak values were determined within prespecified intervals at Fz (MMN) and Pz (P300). For MMN, a 100–200 msec latency range was used for all deviant types. For P300, waveforms were rereferenced to average mastoid derivation prior to peak detection, and a 290–450 msec range was used. P300 was log transformed prior to analysis to improve normalization.
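The epoch-level steps described above (prestimulus baseline correction, rejection at ±100 μV, averaging, and peak picking within a fixed latency window) can be sketched as follows. This is a simplified single-channel illustration, not the recording software's actual pipeline: the function names, the order of baseline correction and rejection, and the synthetic data are assumptions.

```python
def baseline_correct(epoch, times_ms):
    """Subtract the mean of the prestimulus samples (t < 0) from the epoch."""
    pre = [v for v, t in zip(epoch, times_ms) if t < 0]
    baseline = sum(pre) / len(pre)
    return [v - baseline for v in epoch]


def is_clean(epoch, threshold_uv=100.0):
    """Accept an epoch only if every sample lies within +/- threshold."""
    return all(abs(v) <= threshold_uv for v in epoch)


def average(epochs):
    """Sample-by-sample mean across accepted epochs."""
    n = len(epochs)
    return [sum(column) / n for column in zip(*epochs)]


def peak_in_window(waveform, times_ms, start_ms, end_ms, polarity):
    """Return (amplitude, latency) of the most negative (polarity < 0)
    or most positive (polarity > 0) sample inside the window."""
    candidates = [(v, t) for v, t in zip(waveform, times_ms)
                  if start_ms <= t <= end_ms]
    return min(candidates) if polarity < 0 else max(candidates)


# Synthetic example: 2-msec sampling (500 Hz) from -100 to 450 msec.
times = list(range(-100, 451, 2))
good = [1.0 + (-4.0 if t == 150 else 0.0) for t in times]   # small negativity
noisy = [150.0 if t == 200 else 0.0 for t in times]         # exceeds 100 uV

corrected = [baseline_correct(e, times) for e in (good, good, noisy)]
accepted = [e for e in corrected if is_clean(e)]
erp = average(accepted)

# MMN-style peak: most negative point between 100 and 200 msec.
print(peak_in_window(erp, times, 100, 200, polarity=-1))  # (-4.0, 150)
```

For P300, the same peak function would be called with a 290 to 450 msec window and positive polarity.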
Dipole Mapping
Source-localization of MMN was performed using Brain Electric Source Analysis (BESA 5.0, MEGIS Software GmbH, Munich). Average files were filtered using a 0.5–45 Hz zero-phase-shift band-pass digital filter with roll-off of 24 dB/octave. Paired source dipoles were initially seeded into the primary auditory cortex (Heschl's gyrus), consistent with prior studies (
31) but then allowed to float. The presence of potential additional sources was evaluated by seeding additional dipoles into brain regions (e.g., frontal cortex) potentially implicated in MMN generation. Optimal fits were selected based upon level of residual variance. A complementary approach, Local Auto Regressive Average (LAURA) (
41), was also used to verify source localization.
Statistical Analyses
Between-group analyses were performed using separate repeated measures multivariate analysis of variance (MANOVA) for P3 and MMN using all available data per analysis. Interrelationship among measures was determined by multiple regression using SPSS 15.0 (SPSS Corporation, Chicago) with simultaneous entry of all specified variables. Structural equation modeling was implemented using AMOS 7.0 (
42). Selection among alternative models was determined by minimizing $\chi^2$ variance, with paths entered according to the following criterion: $\chi^2_{\text{to include}} = \chi^2_{\text{without}} - \chi^2_{\text{with}}$, evaluated on $df_{\text{without}} - df_{\text{with}}$ degrees of freedom. Residual error and goodness of fit measures were used to assess model integrity.
For correlational analyses, MMN values were inverted in polarity so that larger responses yield positive coefficients. Bivariate and partial correlation coefficients (r) were determined by linear regression. All significance levels are two-tailed, with a preset alpha level of p<0.05. The comparison group included a significantly higher proportion of female subjects than the patient group (p=0.01, Fisher's exact test). Gender was therefore included as a factor in between-group ANOVAs. However, no significant main effects of gender or gender-by-group interactions were observed.
Results
Standard Paradigm
In the standard condition, large fixed differences in pitch (50%) were used for both groups. Highly significant differences with large effect sizes (
Table 1) were observed for MMN (
Figure 1). A repeated measures MANOVA across pitch and duration deviants in the standard MMN run revealed a highly significant main effect of group (F=29.7, df=1, 69, p<0.00001) with no main effect of deviant type (F=0.3, df=1, 69, p=0.61) or group-by-deviance type interaction (F=0.7, df=1, 69, p=0.4). The group effect remained strongly significant even when gender was included as a factor (F=24.8, df=1, 67, p<0.0001). As expected, MMN latency was longer by about 50 msec to tones of duration deviance, relative to those of pitch deviance, in both groups (comparison subjects: 57.3 msec [SD=24.4]; patients: 47.5 msec [SD=30.7]). Latencies did not differ between groups for either pitch- or duration-deviant stimuli.
Patients also showed a highly significant difference (F=16.0, df=1, 66, p=0.0002) with large effect size (Table 1) in P300 amplitude to standard deviant stimuli (
Figure 2) that also remained strongly significant even when gender was included as a factor (F=8.96, df=1, 64, p=0.004). However, the P300 deficit was no longer significant following covariation for MMN amplitude (F=2.77, df=1, 63, p=0.10). In contrast, robust deficits in MMN amplitude for deviants in pitch (F=10.2, df=1, 63, p=0.001) and duration (F=15.3, df=1, 63, p=0.0002) remained following covariation for P300.
Deviance Detection Thresholds
Deviance detection thresholds were determined using a behavioral up-down transform procedure. As expected, patients showed significantly elevated tone-matching thresholds relative to comparison subjects (Table 1), consistent with prior literature (
3,
4,
43).
Adaptive Paradigm
Adaptive P300 waveforms were obtained while subjects were performing a task in which tone differences were adjusted dynamically to maintain correct performance for all subjects at an identical level. Patients showed significant deficits in P300 generation even in the adaptive P300 paradigm (F=6.48, df=1, 66, p=0.013) that were reduced to a marginal level of significance when gender was included as a factor (F=3.68, df=1, 64, p=0.059). The degree of difference between P300 amplitude in the standard and adaptive P3 conditions was also significant between groups when gender was not included (F=6.70, df=1, 66, p=0.012) but became marginally significant when gender was included as a factor (F=3.29, df=1, 64, p=0.07). For both groups, latency was significantly longer in the adaptive than in the standard condition (F=14.9, df=1, 66, p<0.0001). However, neither the absolute latencies (F=0.9, df=1, 66, p=0.4) nor the prolongation in latency between paradigms (F=1.3, df=1, 66, p=0.3) differed significantly across groups.
When MMN was obtained at individually adjusted deviance detection threshold, MMN values were significantly different from 0 for both patients (t=9.24, p<10⁻¹¹) and comparison subjects (t=6.99, p<10⁻⁶). Nevertheless, no significant difference in MMN amplitude was observed across groups (F=0.1, df=1, 67, p=0.73). Furthermore, patients showed greater difference between MMN recorded in the standard versus individualized condition than did comparison subjects (F=25.3, df=1, 69, p<0.0001), which remained significant when gender was included as a factor (F=23.9, df=1, 67, p<0.0001). MMN amplitude in patients was approximately 40% lower than that of comparison subjects in the standard MMN condition, but the difference was only 7% in the individualized MMN condition. In the individualized MMN condition, patients showed significantly shorter MMN latency relative to comparison subjects (t=2.47, p=0.016) in keeping with the larger levels of pitch differences needed to equate performance.
Duration-deviant stimuli included during the individualized run were physically identical to those in the standard run. As expected, MMN following duration deviants was highly similar in both conditions, and significantly lower in patients relative to comparison subjects (F=27.1, df=1, 69, p<0.0001) even when gender was included as a factor (F=21.8, df=1, 67, p<0.0001).
Relationship Among Event-Related Potential Measures
The relationship among measures was addressed using hierarchical multiple regression and confirmatory structural equation modeling (
Figure 3). In the multiple regression, two levels were tested: an initial model incorporating only the standard measures, and a second model incorporating individualized measures.
In the initial model, group significantly predicted MMN amplitude in the standard paradigm (β=0.50, p<0.0001). Both group (β=0.32, p=0.027) and MMN (β=0.35, p=0.042) significantly predicted P300 amplitude. Strong intercorrelation was observed between MMN to pitch deviants and MMN to duration deviants.
A second regression investigated interrelationships among group and the individualized MMN and P300 measures. As expected, in this condition group did not significantly predict amplitude of MMN (β=0.05, p=0.7) reflecting the lower levels of pitch differences used in comparison subjects versus patients. Furthermore, MMN and P300 were not significantly interrelated (β=1.4, p=0.25), suggesting that threshold adjustment removed the shared variance between components. A group effect was nevertheless still observed for P300 (β=0.30, p=0.014), suggesting dysfunction independent of bottom-up influence.
In confirmatory structural equation modeling (model C in Figure 3), a highly robust, nonsaturated model was obtained showing relationships equivalent to those from the linear regression modeling. Independent group effects were observed on both MMN measures, as well as P300, suggesting direct contribution of the pathophysiological process to these measures.
Interrelationship between MMN and P300 was confirmed by a highly significant (β=0.30, p<0.001) path between MMN to pitch deviants and P300. Deletion of the path led to a highly significant worsening of the model fit (χ²=11.2, df=1, p=0.0008), as did its reversal to project from P300 to MMN (χ²=6.3, df=1, p=0.012).
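The two model comparisons above are chi-square difference tests on one degree of freedom, so their p values can be checked with the standard library alone. For df=1, the upper-tail probability of a chi-square variate equals erfc(√(x/2)). The function name below is hypothetical, and the sketch covers only the df=1 case reported here.

```python
import math


def chi2_difference_p_df1(delta_chi2):
    """Upper-tail p value for a chi-square difference with 1 degree of freedom.

    Uses the identity P(X > x) = erfc(sqrt(x / 2)) for X ~ chi-square(1).
    """
    return math.erfc(math.sqrt(delta_chi2 / 2.0))


# Values reported in the text for deleting and for reversing the MMN-to-P300 path.
print(round(chi2_difference_p_df1(11.2), 4))  # 0.0008
print(round(chi2_difference_p_df1(6.3), 3))   # 0.012
```

Both results agree with the reported significance levels. For comparisons involving more than one degree of freedom, a general chi-square survival function (for example from a statistics library) would be needed instead.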
Source Analysis
For both groups, symmetric dipoles fixed in anatomically defined auditory cortex accounted for over 90% of variance in both the standard and individualized MMN conditions (
Figure 4). Similar reductions were observed in left and right auditory cortices. When a minimum norm approach (LAURA) was used for source localization rather than a dipole model approach, localization to auditory cortex was again observed, consistent with the a priori hypothesis.
Relationship With Clinical Measures
Amplitudes to deviants for both MMN (pitch: r=0.44, p=0.001; duration: r=0.33, p=0.014) and P300 (r=0.38, p=0.004) significantly predicted current psychosocial function. Furthermore, amplitudes to deviants for both MMN (pitch: r=0.40, p=0.002; duration: r=0.26, p=0.05) and P300 (r=0.42, p=0.001) were significantly related to participant education level, although no correlations were observed with parental education (all p>0.2). No correlations were observed with SANS or chlorpromazine equivalents for any event-related brain potential measure. P300 in the individualized (r=0.32, p=0.05), but not standard (r=0.26, p=0.11), condition correlated significantly with digit symbol scaled score.
Discussion
The present study investigated the interrelationship between cognitive and sensory level function in schizophrenia using two separate brain response measures: P300, which reflects processing within fronto-parietal association regions, and MMN, which reflects early processing at the level of auditory sensory cortex. Findings are twofold. First, deficits in early sensory processing in schizophrenia, as reflected by MMN, are as large as or are larger than deficits in higher-order processing as reflected by P300. Second, the disturbances in sensory-level processing, as indexed by MMN, contribute significantly to impaired P300 generation. Deficits are thus found at multiple information processing stages in schizophrenia, consistent with distributed neurochemical models of the disorder.
In the present study, three separate parameters were studied: 1) tone-matching ability, as reflected by the degree of pitch difference needed for subjects to differentiate a deviant from a standard stimulus; 2) MMN, which reflects preattentive deviance detection at the level of auditory sensory cortex (
13,
27); and 3) P300 generation, which reflects activation of neural networks encompassing temporoparietal and frontal association regions (
8,
9). Furthermore, MMN and P300 were obtained in both a standard paradigm, as in prior studies, and in an adaptive paradigm in which subjects were tested at their individualized tone-matching thresholds. This manipulation permitted assessment of both MMN and P300 under conditions where behavioral response was equivalent between groups, allowing isolation of lower- and higher-level processing.
We tested three separate hypotheses relating to the underlying concept that deficits in schizophrenia are both 1) hierarchical, with deficits in low-level processing contributing to impairments in subsequent information processing stages, and 2) distributed, with dysfunction at each level of processing contributing to further functional impairment. First, we hypothesized that patients would show large deficits at both the preattentive and attentive processing levels when tested in the standard paradigm. Second, we hypothesized that at tone-matching threshold, both groups would show small but equivalent MMN amplitudes, suggesting that in patients, as in comparison subjects, behavioral deviance detection thresholds would be driven primarily by preattentive processing. Third, we hypothesized that even under conditions where preattentive processing was controlled, patients would still show deficits in higher-order processing, but that the degree of deficit would be smaller than under the standardized condition, suggesting both bottom-up and intrinsic contributions to cortical dysfunction.
Results of these tests are as follows. First, in the standard paradigm, patients showed large effect size differences in generation of both MMN (d=1.26–1.35) and P300 (d=1.08), with deficits in MMN generation being statistically as large as or larger than those in P300. Second, in the individualized condition, patients showed MMN amplitudes that were statistically indistinguishable from those of comparison subjects (p=0.7), with at most a small effect size difference between groups (d=0.11). Third, in the adaptive condition, patients nevertheless continued to show reduced P300 amplitudes relative to comparison subjects (d=0.69). The degree of reduction, however, was significantly less than in the standard condition, as shown by the fact that the difference in P3 amplitude between the standard and adaptive paradigms was significantly smaller in patients versus comparison subjects (p=0.012, d=0.70). Thus, at least part of the deficit in P300 generation in schizophrenia can be viewed as reflecting a bottom-up contribution of impaired sensory processing.
Finally, we conducted both linear regression and path analyses to test explicitly the combined hypothesis that 1) deficits in MMN contribute significantly to deficits in P300 generation, but that 2) higher-level cortical regions are also independently impaired. These relationships are shown most clearly in the path analysis (Figure 3), where a significant effect of the disease process ("group") is observed on MMN generation to pitch deviants (β=0.49, p<0.001), which in turn leads to a significant reduction in standard P300 generation (β=0.30, p<0.001). In addition, the disease process directly affects the relative amplitude of P300 in the standard versus individualized condition (β=0.31, p<0.008). The absence of this enhancement, in turn, contributes to P300 amplitude reduction in the standard condition (β=0.53, p<0.001). Thus, dual paths to reduced P300 generation in schizophrenia are apparent.
On a functional level, the present findings can be most easily conceptualized from the viewpoint of distributed cortical dysfunction, with deficits in both sensory and higher-order processing contributing to overall brain dysfunction. MMN normally increases in amplitude until tones are approximately an octave apart (100% deviance), whereafter it plateaus or may even decrease (
18). Thus patients, who required a pitch deviance of 14% to reliably detect deviant tones, were much closer to their MMN ceiling level response at behavioral threshold than comparison subjects, who required only 3% difference. Therefore, even when performing the task apparently as well as comparison subjects in the standard condition, patients nevertheless generate significantly less additional MMN activation (d=1.31) contributing to their significant impairment in P300 generation.
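To make the ceiling argument concrete, the percent-deviance values quoted above can be placed on a common log-frequency scale, where 100% deviance is one octave. The conversion below is ordinary arithmetic; expressing the thresholds in octaves is an illustrative choice made here, not an analysis reported in the study, and the function name is hypothetical.

```python
import math


def deviance_in_octaves(percent):
    """Convert a percent increase in frequency to an interval in octaves."""
    return math.log2(1.0 + percent / 100.0)


print(deviance_in_octaves(100))           # 1.0  (plateau region for MMN)
print(round(deviance_in_octaves(50), 3))  # 0.585 (standard paradigm, 1000 vs 1500 Hz)
print(round(deviance_in_octaves(14), 3))  # 0.189 (patient threshold)
print(round(deviance_in_octaves(3), 3))   # 0.043 (comparison-subject threshold)
```

On this scale the patients' behavioral threshold sits roughly four times further along the path to the one-octave plateau than that of comparison subjects, which leaves less room for MMN to grow between threshold and the 50% deviance used in the standard paradigm.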
On a molecular level, the finding of interrelated lower- and higher-level deficits supports recent neurochemical theories of schizophrenia, which focus on dysfunction of widespread neurotransmitter systems such as glutamate and GABA. Deficits in MMN generation are associated with impaired NMDA receptor function at the level of auditory sensory cortex (
27,
29,
32,
33), consistent with a primary role for dysfunction of NMDA receptor-related mechanisms in schizophrenia.
In addition to assessing the relationship between MMN and P300 generation, the present study also confirms the relationship between MMN and psychosocial function on the one hand (
44) and premorbid educational achievement (
24,
25) on the other. MMN in general, and pitch MMN in particular, may thus serve as an index of a form of schizophrenia associated with poor premorbid function and impaired psychosocial outcome.
A limitation of this study is that all patients were receiving antipsychotic medication at the time of testing. However, antipsychotic medications—including risperidone (
45), olanzapine (
46), and clozapine (
47)—have been found to be without significant effect on MMN generation. To the extent that it has an effect, clozapine may enhance P300 generation (
47) and so could not account for between-group differences in P300 amplitude observed in this study. No correlations with effective medication dose and no relationships with medication type were observed for any of the dependent measures in this study.
In summary, although sensory processing was once considered normal in schizophrenia, severe deficits are observed in information processing even at the level of sensory cortex across sensory systems (
48). Such deficits are large in magnitude and contribute to subsequent impairments in brain activation. Furthermore, even under conditions where patients are apparently performing equivalently to comparison subjects, such as their performance in the standardized P300 condition in the present study, they nevertheless are functioning much closer to their maximal functional capacity. As a result, brain activation levels at both sensory and cognitive levels are lower even in the face of apparently normal performance. Although models of cognitive dysfunction in schizophrenia are frequently discussed as "top-down" versus "bottom-up", the present findings suggest that both types of impairments are simultaneously present in schizophrenia, consistent with distributed hierarchical models and distributed neurochemical theories of schizophrenia.