Schizophrenia has been characterized as a disorder with cognitive dysfunction (1). Among the many aspects of cognitive functioning, language-related functions, such as verbal memory (2), thought (3), and context utilization (4), have been shown to be distinctively impaired in schizophrenia. While these are integrated aspects of language function, at the most basic level language comprehension requires processing of phonemes, that is, categorization of the simplest units of speech sounds according to their acoustic properties. Cienfuegos et al. (5) indicated dysfunction of categorical perception of speech sounds in schizophrenic patients. In a behavioral task that required subjects to identify intermediate phonemes along the continuum from /ba/ to /da/, schizophrenic patients showed disturbance in their perception of intermediate forms near the center of the continuum and in boundary definition. However, little is known about the physiological substrates for impairment of basic perception of speech sounds and the affected cortical regions subserving the process in schizophrenia, which were the issues addressed by this study.
One useful index for such an investigation is mismatch negativity, a component of event-related potentials elicited approximately 150–200 msec after the onset of physically deviant auditory stimuli in identical and repeated sequences (6). Näätänen (7) stated that mismatch negativity reflects the detection of mismatches between deviant stimuli and the neural trace encoding the physical features of the standard stimuli and that mismatch negativity can be elicited even under a passive condition, when subjects entirely ignore the stimuli. Thus, mismatch negativity can be an index of preattentive detection of acoustic changes in humans. Encouraging findings related to the issues that we attempt to address here have been demonstrated by studies of magnetoencephalographic (MEG) recording of mismatch negativity (8, 9), which demonstrated physiological correlates of preattentive perception of speech sounds predominantly in the left auditory cortex in healthy adults.
Many investigators have reported a lower than normal amplitude of mismatch negativity, or lower power of MEG recording of mismatch negativity, in response to pure-tone stimuli in schizophrenic patients (10–16), but other investigators did not observe this (17). However, to our knowledge, phonetic mismatch negativity in schizophrenic patients has not been assessed to date.
Accordingly, the aim of the present study was to investigate abnormalities in preattentive perception of speech sounds, as indexed by mismatch negativity in schizophrenic patients, by using a system with high spatiotemporal resolution for recording event-related potentials. We hypothesized that schizophrenic patients would show lower mismatch negativity amplitude in response to speech sounds than in response to pure tones, in accordance with known language-related dysfunction in schizophrenia. Moreover, we predicted that schizophrenic patients would show lower amplitude of mismatch negativity in detecting change between phonemes of different boundaries than in detecting the physical features of a phoneme (e.g., duration). Finally, scalp current-density mappings with high spatial resolution would allow more detailed assessment of cortical networks subserving the processes, and the dysfunction of these networks in schizophrenic patients, than occurred in most of the previous investigations, which used a montage with fewer electrodes.
Method
Subjects
We recruited 23 outpatients with schizophrenia from the Day Treatment Unit, Department of Neuropsychiatry, University of Tokyo; all were receiving medication. There were 16 men and seven women, their mean age was 28.1 years (SD=4.0), and all were right-handed according to the Edinburgh Inventory (18). Diagnosis of schizophrenia was made through the Structured Clinical Interview for DSM-IV Axis I Disorders, Clinical Version, by one trained psychiatrist (K.K.). The mean duration of illness was 7.8 years (SD=4.0), and the mean IQ (WAIS-R) was 87.0 (SD=10.3). All of the patients received typical neuroleptics alone, and the mean dose was 438.3 mg/day of chlorpromazine equivalents (SD=299.5). All patients were clinically stable, and their psychiatric symptoms were evaluated by a psychiatrist (K.K.) using the Positive and Negative Syndrome Scale (19) within the 3 days before EEG recording. The mean scores for positive symptoms, negative symptoms, and general psychopathology were 12.0 (SD=3.7), 16.8 (SD=4.8), and 31.6 (SD=6.8), respectively.
The comparison group comprised 28 right-handed healthy subjects similar to the schizophrenia group in terms of sex (18 men, 10 women) and age (mean=26.2 years, SD=5.1). The numbers of years of education for the patients and the comparison subjects were 13.3 (SD=2.0) and 15.9 (SD=1.7), respectively, with the patients showing a significantly lower level (t=4.78, df=49, p<0.001). As assessed with the Hollingshead scale, the socioeconomic status of the schizophrenic patients (mean=4.5, SD=0.5) was significantly lower than that of the comparison subjects (mean=1.9, SD=0.6) (t=16.14, df=49, p<0.001), while parental socioeconomic status did not differ significantly between groups (schizophrenia: mean=1.8, SD=0.9; comparison: mean=1.9, SD=0.8).
Japanese was the first language for all subjects in both groups. The exclusion criteria for both groups were a history of electroconvulsive therapy, neurological illness, traumatic brain injury with any known cognitive consequences or loss of consciousness for more than 5 minutes, substance abuse or dependence, and hearing or vision impairment. An additional exclusion criterion for the comparison group was a history of psychiatric disease in themselves or their first-degree relatives. After complete description of the study to the subjects, written informed consent was obtained.
Task
The subjects were presented with auditory stimulus sequences consisting of 1,080 standard stimuli and 120 deviant stimuli delivered in random order; the probabilities of hearing standard and deviant stimuli were 90% and 10%, respectively. The interstimulus interval was 490–530 msec. The stimuli were delivered binaurally through headphones. The subjects were instructed to watch a silent film and to ignore the stimuli.
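The structure of such a stimulus sequence can be illustrated with a minimal sketch. It assumes an unconstrained shuffle and an interstimulus interval drawn uniformly between 490 and 530 msec; the text specifies only "random order" and the interval range, so both choices, as well as the function name `make_oddball_sequence`, are assumptions made for illustration.

```python
import random

def make_oddball_sequence(n_standard=1080, n_deviant=120,
                          isi_range_ms=(490, 530), seed=0):
    """Return a list of (stimulus_label, isi_ms) pairs in random order."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    labels = ["standard"] * n_standard + ["deviant"] * n_deviant
    rng.shuffle(labels)  # assumption: no constraint on consecutive deviants
    lo, hi = isi_range_ms
    # assumption: interstimulus interval jittered uniformly within the range
    return [(label, rng.uniform(lo, hi)) for label in labels]

sequence = make_oddball_sequence()
n_deviant = sum(1 for label, _ in sequence if label == "deviant")
deviant_probability = n_deviant / len(sequence)
```

With the trial counts given above, the deviant probability works out to 120/1,200, or 10%.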
The experiment consisted of three conditions. The first condition was designed to elicit mismatch negativity in response to change in duration of pure tones (standard: 100 msec, deviant: 50 msec). The second and third conditions consisted of vowel stimuli, the former to elicit mismatch negativity in response to change in duration of a vowel (standard: Japanese vowel /a/ with a 150-msec duration, deviant: /a/ with a 100-msec duration) and the latter to elicit mismatch negativity in response to across-phoneme change (standard: Japanese vowel /a/ with a 150-msec duration, deviant: /o/ with a 150-msec duration). These vowel stimuli were spoken by an actor who was a native Japanese speaker, were digitized, and were edited to have a duration of 100 or 150 msec, loudness of 70 dB sound pressure level, and rise/fall time of 10 msec. The frequencies for the formants (F) of the vowels were as follows: /a/: F0=140 Hz, F1=760, F2=1250, F3=2750, F4=3600; /o/: F0=140 Hz, F1=480, F2=770, F3=2820, and F4=3600. The frequency of the pure-tone stimuli was 1000 Hz, which was nearly equal to the center frequency of the formants of the vowel stimuli. The order of the three conditions was counterbalanced across the subjects.
Measurement of Event-Related Potentials
The EEG recording procedure has been described in detail elsewhere (20). Briefly, EEGs were recorded by means of a 128-electrode cap. The tip of the nose was used as the reference for all the electrodes. Averaging and artifact rejection (EEG amplitude greater than 50 μV, electro-oculogram amplitude greater than 150 μV) were conducted off-line. The mean number of accepted responses for deviant stimuli across channels was more than 80 for all of the subjects and did not differ significantly between the groups.
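The rejection-then-averaging step can be sketched as follows. The sketch assumes that the criteria apply to the absolute amplitude of every sample in an epoch and treats a single EEG channel together with one electro-oculogram trace; the actual procedure is described in reference 20, and the function names and toy values here are hypothetical.

```python
def accept_epoch(eeg_uv, eog_uv, eeg_limit_uv=50.0, eog_limit_uv=150.0):
    """True if no sample exceeds the limits (assumption: absolute amplitude)."""
    eeg_ok = all(abs(v) <= eeg_limit_uv for v in eeg_uv)
    eog_ok = all(abs(v) <= eog_limit_uv for v in eog_uv)
    return eeg_ok and eog_ok

def average_epochs(epochs):
    """Point-by-point average of equally long epochs."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]

# Toy data: (EEG samples, electro-oculogram samples) for four epochs, in microvolts
epochs = [
    ([1.0, -2.0, 3.0], [10.0, 20.0, 5.0]),   # kept
    ([4.0, 60.0, 2.0], [10.0, 10.0, 10.0]),  # rejected: EEG above 50
    ([3.0, 0.0, 1.0], [0.0, -160.0, 0.0]),   # rejected: electro-oculogram above 150
    ([3.0, 2.0, 1.0], [0.0, 0.0, 0.0]),      # kept
]
accepted = [eeg for eeg, eog in epochs if accept_epoch(eeg, eog)]
erp = average_epochs(accepted)
```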
The mismatch negativities were measured as difference waveforms obtained by subtracting the event-related potentials for standard stimuli from those for deviant stimuli. In order to determine the latency of mismatch negativity, the computation of potential field strength (global field power) (21) was applied to the subtracted waveforms for each subject and condition. Moreover, after the computation of global field power for the grand-averaged waveforms for each group and condition, individual mismatch negativity amplitude was determined as the averaged potential within a 100-msec time window around the peak latency of the grand-averaged global field power.
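A minimal sketch of this measurement chain is given below. It assumes the common definition of global field power as the spatial standard deviation of the potentials across electrodes at each time point, and it expresses the averaging window in samples because the sampling rate is not stated here; the function names and the three-channel toy data are hypothetical.

```python
import math

def difference_wave(deviant, standard):
    """Deviant minus standard; inputs are [channel][time] averaged potentials."""
    return [[d - s for d, s in zip(dev_ch, std_ch)]
            for dev_ch, std_ch in zip(deviant, standard)]

def global_field_power(erp):
    """Spatial standard deviation across channels at each time point."""
    n_ch = len(erp)
    gfp = []
    for samples in zip(*erp):  # all channels at one time point
        mean = sum(samples) / n_ch
        gfp.append(math.sqrt(sum((v - mean) ** 2 for v in samples) / n_ch))
    return gfp

def peak_index(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)

def window_mean(channel, center, half_width):
    """Mean potential in a window of samples centered on the peak."""
    lo = max(0, center - half_width)
    hi = min(len(channel), center + half_width + 1)
    return sum(channel[lo:hi]) / (hi - lo)

# Toy data: 3 channels x 5 time points
standard = [[0.0] * 5 for _ in range(3)]
deviant = [[0.0, -1.0, -4.0, -1.0, 0.0],
           [0.0, 0.0, -2.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 1.0, 0.0]]
mmn = difference_wave(deviant, standard)
gfp = global_field_power(mmn)
peak = peak_index(gfp)
amplitude_ch0 = window_mean(mmn[0], peak, half_width=1)
```

In the study the window was centered on the peak of the grand-averaged global field power for each group and condition, so in practice `peak` would come from the group average and `window_mean` would then be applied to each individual waveform.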
We also evaluated the N1 amplitude elicited by standard stimuli. After computing the global field power for the grand-averaged waveforms for each group and condition, each individual N1 amplitude was determined to be the averaged potential within a 40-msec time window around the peak latency of the global field power.
Topographic EEG and Mapping of Scalp Current Density
The scalp current density strengthens near electric fields and filters out far electric fields; thus, it is suitable for differentiating multiple generators of the activities near the cortical surface. The methods used for topographic EEG and mapping of scalp current density have been described in detail elsewhere (20). Briefly, peripheral electrodes beyond the 10–20 system areas were excluded, and for the remaining 104 channels the scalp potential was reconstructed by spherical spline interpolation. The distributions of scalp current densities were then obtained by computing the spatial derivatives of the spherical spline functions used in the potential map interpolation: $J = -\sigma\left[\frac{d^2 V}{dx^2} + \frac{d^2 V}{dy^2}\right]$.
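The relation between the potential map and the current density, in which J is minus σ times the two-dimensional Laplacian of V, can be illustrated numerically. The sketch below is not the spherical spline method used in the study: it substitutes a finite-difference Laplacian on a flat, regularly spaced grid, and the unit grid spacing, σ=1, and the function name are assumptions made for illustration.

```python
def scalp_current_density(v, spacing=1.0, sigma=1.0):
    """Finite-difference estimate of J = -sigma * (d2V/dx2 + d2V/dy2).

    v is a 2D grid [row][col] of potentials; border points are left as None
    because the central difference needs a neighbor on each side.
    """
    rows, cols = len(v), len(v[0])
    j = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            d2x = (v[r][c - 1] - 2 * v[r][c] + v[r][c + 1]) / spacing ** 2
            d2y = (v[r - 1][c] - 2 * v[r][c] + v[r + 1][c]) / spacing ** 2
            j[r][c] = -sigma * (d2x + d2y)
    return j

# Toy potential V = x^2 + y^2, whose Laplacian is 4 everywhere
grid = [[x * x + y * y for x in range(5)] for y in range(5)]
j_map = scalp_current_density(grid)
```

For this toy potential every interior point yields J=-4, which matches the analytic Laplacian and shows how the second spatial derivative emphasizes local curvature of the potential map rather than its absolute level.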
Statistical Analyses
Since a preliminary analysis of variance (ANOVA) of mismatch negativity amplitudes that included sex as the between-subject factor revealed no main effect of or interaction with sex, the two sexes were combined in further analyses. We planned a hierarchical testing strategy. First, we performed a repeated measures ANOVA for the between-group comparison of mismatch negativity amplitude, adopting group as the between-subject factor and considering condition, hemisphere, and electrode site (22 frontotemporal electrodes in the left hemisphere and 22 homologous sites in the right hemisphere) as within-subject factors. After finding a significant group-by-condition-by-hemisphere or group-by-condition interaction, we conducted separate follow-up ANOVAs for each condition. In addition, a repeated measures ANOVA of the peak latency of mismatch negativity was performed with group as the between-subject factor and condition as the within-subject factor.
We conducted quantitative group comparisons of scalp current density by using a method similar to that proposed by Giard et al. (22). We calculated individual values for the right temporal component during the condition involving tone duration and for the left temporal and right frontal/temporal component complex for the phoneme-duration and across-phoneme conditions, in which current sink (perpendicularly sinking into the scalp) or combinations of current sink and current source (arising out of the scalp) were evident in the maps of scalp current density for the comparison group (see Results section). The scalp current-density sink values for each component under each condition were calculated by averaging values over 10–12 channels around each region.
After confirming that sex had no significant effect, we performed ANOVAs to compare N1 amplitudes, adopting group as the between-subjects factor and condition, hemisphere, and electrode site (the same clusters of electrodes as those for mismatch negativity analyses) as within-subject factors.
Pearson’s product-moment correlation was used in the exploratory analyses of the relationships between clinical measures and mismatch negativity amplitudes (averaged amplitudes over 22 channels for each hemisphere under each condition). In addition, we also conducted exploratory analyses of the relationship of amplitude for across-phoneme mismatch negativity and of scalp current density to language-related clinical measures. These measures were scores for conceptual disorganization (P2) and difficulty in abstract thinking (N5) from the Positive and Negative Syndrome Scale and WAIS-R verbal IQ and scores on the information, vocabulary, and comprehension subscales. Because of the multiple comparisons, we considered p<0.005 to be statistically significant for the correlational analyses.
Results
Group Comparison of Mismatch Negativity Amplitude
The repeated measures ANOVA of the amplitude data showed no main effect of group (F=0.94, df=1, 49, p=0.34) but revealed a significant group-by-condition-by-hemisphere interaction (F=3.45, df=2, 98, p=0.04) (Figure 1, Figure 2, Figure 3). Thus, we compared the amplitudes of mismatch negativity separately for each condition. Under the tone-duration condition, there was neither a significant main effect of group (F=0.47, df=1, 49, p=0.50) nor a significant group-by-hemisphere interaction (F=1.33, df=1, 49, p=0.25). Under the phoneme-duration condition, there was no significant main effect of group (F=0.08, df=1, 49, p=0.78), but there was a significant group-by-hemisphere interaction (F=5.36, df=1, 49, p=0.03) that led us to perform a separate analysis for each hemisphere. The follow-up analysis did not show a significant group difference for either the left (F=0.67, df=1, 49, p=0.42) or right (F=0.06, df=1, 49, p=0.81) hemisphere.
Finally, there was a significant main effect of group under the across-phoneme condition, while a significant group-by-hemisphere interaction was not evident (Figure 3). These results suggest that the group difference in mismatch negativity was confined to the across-phoneme condition, with the schizophrenic patients showing lower amplitudes in both hemispheres. In addition, the schizophrenic patients had abnormal lateralization under the phoneme-duration condition.
Group Comparison of Mismatch Negativity Latency
The repeated measures ANOVA of mismatch negativity latencies did not show either a significant main effect of group (F=1.61, df=1, 49, p=0.21) or a significant group-by-condition interaction (F=0.61, df=2, 98, p=0.55) but showed a significant main effect of condition (F=27.13, df=2, 98, p<0.001), indicating that mismatch negativity latencies significantly differed between conditions for both groups. The post hoc analysis indicated that the across-phoneme mismatch negativities were generated significantly earlier than the tone-duration mismatch negativities (Tukey’s honestly significant difference test, p<0.001), and the tone-duration mismatch negativities were generated significantly earlier than the phoneme-duration mismatch negativities (Tukey’s honestly significant difference test, p=0.007).
N1 Amplitude Elicited by Standard Stimuli
The latencies of the global field power peaks for the comparison and schizophrenia groups were 98 and 98 msec, respectively, for the N1 under the pure-tone condition, 104 and 104 msec for the N1 under the phoneme-duration condition, and 110 and 104 msec for the N1 during the across-phoneme condition. The repeated measures ANOVA of the amplitude data showed no significant main effect of or interaction with group.
Group Comparison of Scalp Current Density
For the comparison group, there was a clear combination of current sink and current source in the right temporal region under the tone-duration condition, while the sink/source combination was more evident in the left hemisphere under the phoneme-duration and across-phoneme conditions (Figure 4). However, these sink/source combinations were less evident in the schizophrenia group. Moreover, there was a clear sink in the right frontal region in the phoneme-duration and across-phoneme conditions in the comparison group; however, once again, this was less evident in the schizophrenia group. It has been demonstrated that auditory mismatch negativity has two generator systems: one is in the bilateral temporal cortex (electric current tangential to the scalp), and the other is in the right frontal cortex (electric current radial to the scalp) (22). Since a sink/source combination represents current tangential to the scalp and a current sink represents current radial to the scalp, the former corresponds to the temporal component and the latter corresponds to a mixture of frontal and temporal components (frontal component is predominant) of mismatch negativity.
There was no group difference in the right temporal component under the tone-duration condition (t=–1.20, df=49, p=0.24). On the other hand, the schizophrenic patients showed significantly weaker left temporal values under the phoneme-duration condition (main effect of group: F=1.36, df=1, 49, p=0.25; group-by-component interaction: F=5.34, df=1, 49, p=0.03; post hoc t test for left temporal component: t=–2.11, df=49, p=0.04; t test for right frontal/temporal component: t=0.59, df=49, p=0.56) and significantly weaker left temporal and right frontal/temporal values under the across-phoneme condition (main effect of group: F=4.50, df=1, 49, p=0.04; group-by-component interaction: F=0.00, df=1, 49, p=0.95).
Correlations With Clinical Data
There were no significant correlations between age and amplitude of mismatch negativity for any of the conditions or hemispheres in either group. In the schizophrenia group, there were also no significant correlations between mismatch negativity and duration of illness or dose of medication. In the schizophrenia group, lower mismatch negativity in the right hemisphere under the phoneme-duration condition was significantly associated with more severe negative symptoms (r=0.61, N=23, p=0.002). We did not find significant correlations between across-phoneme mismatch negativity or scalp current density and any of the language-related clinical measures (Spearman’s r=–0.24 to 0.28, N=23, p=0.19 to 0.99) in the schizophrenia group.
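The significance of the reported correlation with negative symptoms (r=0.61, N=23) can be checked by hand. The sketch below assumes the usual t transformation of a Pearson coefficient with N-2 degrees of freedom and obtains the two-sided p value by numerically integrating the Student t density with Simpson's rule; the function names are hypothetical, and the calculation is an illustrative check rather than the authors' software procedure.

```python
import math

def t_from_r(r, n):
    """t statistic for a Pearson correlation r with n observations."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p value: 1 minus twice the area under the density on [0, |t|]."""
    t = abs(t)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)

t_value = t_from_r(0.61, 23)       # about 3.53 with df = 21
p_value = two_sided_p(t_value, 21)
```

Under these assumptions the p value comes out at about 0.002, in line with the reported value and below the p<0.005 criterion adopted for the correlational analyses.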
Discussion
To our knowledge, this is the first study that provides electrophysiological evidence for the dysfunction of preattentive perception of change in speech sounds among schizophrenic patients, shown by mismatch negativity measured by a high-resolution EEG montage. The present results suggest that language-related dysfunction in schizophrenic patients may be present at the early stage of auditory processing of relatively simple stimuli, such as phonemes, and not just at stages involving higher-order semantic processes. In addition, we demonstrated that schizophrenic patients showed more deficits in the across-phoneme condition than during detection of change in physical features (duration) of a pure tone or a vowel. Generation of mismatch negativity is thought to comprise two neural processes: sensory memory formation and maintenance of standard stimuli and the process of comparing deviant stimuli with maintained representations of the standard stimuli
(11). Since the standard stimuli for the phoneme-duration and across-phoneme conditions were the same (vowel /a/ with 150-msec duration), and the group difference was found only for the latter condition, this study suggests that schizophrenia is associated more with dysfunction of the comparative process for perception of across-phoneme change, while sensory memory formation and maintenance, used for repetitive phoneme stimuli, may be intact.
The mappings of scalp current density for the comparison subjects showed that the tone-duration condition is characterized by generation of the right temporal component of mismatch negativity, corroborating a previous report of right-hemisphere predominance of pure-tone mismatch negativity
(23). On the other hand, under the phoneme-duration and across-phoneme conditions, the left temporal component and the right frontal/temporal component complex (frontal component is predominant) were more evident. In the latter two conditions, the temporal components may correspond to processing of acoustic features of speech sounds in the auditory cortex. The right frontal component may reflect the automatic attention-orienting process
(22) and/or the top-down, modulatory control
(24,
25) to facilitate human processing of complex signals with higher information content, such as phonemes. In our study, the schizophrenic patients showed bilaterally lower mismatch negativity only in the across-phoneme condition, supporting the finding that they had statistically weaker left temporal and right frontal/temporal cortical activity in this condition as shown by analysis of scalp current density. Thus, our results indicate that the frontotemporal cortical network for mismatch processing of language-related stimuli is less synchronized during across-phoneme perception of speech sounds in schizophrenic patients. In addition, an abnormal lateralization of the phoneme-duration mismatch negativity without lower amplitudes, which is consistent with the left-sided lower scalp current density in this condition, may be related to a general alteration of speech sound processing in schizophrenic patients.
In the schizophrenia group, the lower mismatch negativity in the right hemisphere in the phoneme-duration condition was associated with negative symptoms. This result agrees somewhat with findings in previous studies
(12,
13), although not with those in another study
(10); the investigators in the latter study claimed that a significant association between mismatch negativity attenuation and negative symptoms might characterize mismatch negativity as a chronicity marker for schizophrenia. However, we hesitate to apply this interpretation to our study, since there was only one significant correlation among the six indices (three conditions, two hemispheres) of mismatch negativity and the present study was a cross-sectional investigation of patients with relatively mild symptoms. We failed to find associations between across-phoneme mismatch negativity and language-related clinical measures. One reason may be that our schizophrenic patients were less symptomatic and their values on the clinical measures extended across only a limited range. Another reason may be that mismatch negativity is a physiological measure of sensory memory function at the early stage of information processing, while language-related clinical manifestations are much more integrated, thus involving deficits in multiple stages of information processing. We assume that specific neuropsychological assessments, such as tests of short-term verbal memory, may show associations with the phonetic mismatch negativity, which is an interesting topic for future studies.
A discrepancy between previous studies and ours that should be explained is the absence in our study of significantly lower mismatch negativity in the conditions in which subjects detected change in the duration of a sound (tone and vowel); previous investigations
(10,
12,
14,
15) showed abnormalities in mismatch negativity under duration-change conditions. One possible reason may be that in an easier task, such as watching a film, as adopted in this study, rather than a more attention-demanding one, such as a visually distracting task, patients could allocate a certain amount of attention to the auditory stimuli instead of performing the primary task, leading to the lack of between-group difference; this situation was discussed by Alain et al.
(26). However, this is unlikely to be the main reason, since the condition of across-category change for which the between-group difference was significant cannot be regarded as the task that keeps attention away from the stimuli more easily for the schizophrenic patients. Another possibility may be that the patients’ symptoms were relatively mild and the duration of illness was relatively short in this study; thus, certain types of mismatch negativity were not as disturbed as in chronic illness. Partially supporting findings have been provided by two studies
(12,
13) showing an association between mismatch negativity and severity of negative symptoms and by a published abstract
(27) suggesting that mismatch negativity may be normal in first-episode schizophrenia. Nevertheless, whatever the reason for the lack of significantly lower mismatch negativity in response to duration change of pure-tone or vowel stimuli, the markedly lower mismatch negativity under the across-phoneme condition for schizophrenic patients should still be regarded as a significant finding. In addition, the between-group comparison of event-related brain activity based only on scalp voltages may obscure some aspects of electrophysiological abnormalities in schizophrenia. Namely, since the scalp current density emphasizes regionalized activities in the vicinity of the cortical surface, the weak detection of scalp current-density sinks and/or sink/source combinations in the former two conditions shown by qualitative inspection may indicate less localized cortical activities subserving generation of mismatch negativity in schizophrenic patients. This speculation is tentative, however, since our analyses of scalp current density detected statistically significant group difference only for the left temporal component in the phoneme-duration condition and not for that in the tone-duration condition. Further studies, using MEG and hemodynamic assessment of mismatch negativity in response to pure-tone and speech sounds in schizophrenic patients, will be necessary.
One possibility that should be ruled out is that the degree of deviance of physical features of the stimuli in the across-phoneme condition incidentally resulted in maximum differentiation between the schizophrenia group and the comparison group, rather than reflecting language-specific dysfunction. Javitt et al. (11) suggested that the degree of deficit of mismatch negativity in schizophrenic patients is largest under the conditions in which mismatch negativity is normally greatest. In this study, the overall amplitudes of mismatch negativity in the phoneme-duration and across-phoneme conditions were comparable in the comparison group; thus, we could not ascribe differential abnormality in across-phoneme mismatch negativity to the possibility just mentioned. Moreover, Michie et al. (15) demonstrated that their schizophrenic patients showed more deficits in mismatch negativity in response to duration change than in response to frequency change, thereby ruling out the possibility that group difference under the across-phoneme condition might reflect a difference in the ability to detect frequency changes of acoustic stimuli.
In conclusion, this study, using an EEG recording system with high spatiotemporal resolution, demonstrated that schizophrenic patients have more deficits in preattentive detection of across-phoneme change in vowel sounds than in detection of change in the duration of a vowel or a pure tone. Thus, we believe that this study provides the first electrophysiological evidence of impaired frontotemporal cortical networks for preattentive processing of speech sounds in schizophrenic patients. Our next goal will be to elucidate the relationship of this physiological abnormality in perception of speech sounds to higher-order, language-related cognitive dysfunctions in schizophrenic patients.