Schizophrenia has been consistently associated with a range of early neurodevelopmental abnormalities (1–3). One measure that may reflect early neurodevelopmental abnormality and has received considerable attention is general intellectual functioning, or IQ. Estimates of premorbid IQ are attainable through several study designs, including follow-back studies of school-, conscript-, or clinic-based testing, longitudinal birth or conscript cohort studies, and studies of samples at genetic risk for schizophrenia. The last quantitative review of the literature, which was published in 1984, suggested that premorbid IQ in individuals with schizophrenia was, on average, 0.43 standard deviations below that of comparison samples (4).
However, the diagnostic criteria for schizophrenia and related disorders became more restrictive with the publication of the third edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-III) (5) and its subsequent editions. This diagnostic narrowing is also reflected in related international systems (6) and diagnostic assessment tools such as the Research Diagnostic Criteria (RDC) (7). Studies published during the past 20 years have provided data from samples diagnosed using these narrower criteria. Several have also used more comprehensive psychometric measures of IQ. In addition, a substantially larger group of studies with actual premorbid data is currently available.
The purpose of the present quantitative review was to re-evaluate the mean premorbid effect size of IQ in schizophrenia samples relative to comparison samples using more recent studies. The following five specific questions were considered in our assessment: 1) What is the mean effect size for premorbid IQ impairment in schizophrenia? 2) Is the mean effect size reliable? 3) Are discrepancies in verbal and nonverbal IQ present during the premorbid period? 4) Is there evidence of gender differences in premorbid IQ? 5) Is there evidence of increasing impairment over time or with age?
Method
Literature Search and Study Selection
Our literature search included both an online PubMed database search and a careful evaluation of the references from reviews and original studies pertaining to premorbid IQ in schizophrenia that were published before March 1, 2007. The keywords used in the computer search included all combinations of the following words: “IQ,” “intelligence,” “cognition,” “neuropsychological,” “neurocognitive,” “schizophrenia or psychosis,” and “premorbid or predictors.” All potentially relevant studies were examined manually to assess inclusion and exclusion criteria. Inclusion criteria were the following: 1) published in English, 2) use of standardized psychometric IQ (or equivalent) tests, 3) testing conducted prior to the onset/diagnosis of schizophrenia, and 4) test results provided separately for a group consisting solely of subjects diagnosed with schizophrenia, schizoaffective disorder, or schizophreniform disorder and for a group of healthy comparison subjects. Exclusion criteria were as follows: 1) IQs estimated from achievement tests or single tests, such as word reading, or solely from verbal IQ or performance IQ assessment; 2) absence of premorbid IQ data from a relatively healthy comparison group or comparison data reported solely from a group at high risk for psychosis, with known cognitive delays in childhood or matched for childhood IQ; 3) insufficient data to closely estimate effect sizes (with each study required to have one of the following combinations: means, standard deviations, and number of subjects for each group; either a t or F score and the degrees of freedom or number of subjects for this statistic; or group differences expressed in standard deviation or z score units); and 4) data reported on the same or overlapping sample as a more complete or relevant study.
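The third exclusion criterion allows effect sizes to be recovered from a t or F statistic when group means and standard deviations are not reported. A minimal sketch of those conversions follows, assuming the standard textbook formulas for two independent groups; the function and parameter names are hypothetical and are not taken from the studies reviewed. Because an F statistic carries no sign, the direction of the group difference must be supplied separately.

```python
import math


def d_from_t(t, n_patient, n_comparison):
    """Standardized mean difference from an independent-samples t statistic."""
    return t * math.sqrt((n_patient + n_comparison) / (n_patient * n_comparison))


def d_from_f(f, n_patient, n_comparison, patient_lower=True):
    """Standardized mean difference from a two-group F statistic.

    With one numerator degree of freedom, F equals t squared, so the
    magnitude is recovered from sqrt(F). The sign is not recoverable from F
    and is set by the (assumed) patient_lower flag.
    """
    magnitude = d_from_t(math.sqrt(f), n_patient, n_comparison)
    return -magnitude if patient_lower else magnitude


if __name__ == "__main__":
    # Illustrative values only, not data from any study in this review.
    print(round(d_from_t(-2.0, 50, 50), 3))
    print(round(d_from_f(4.0, 50, 50), 3))
```

Differences already expressed in standard deviation or z score units need no conversion and can be used directly as the standardized mean difference.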
Methodological Categorization
Although there are several important differences in the methodologies of studies on premorbid IQ in schizophrenia, we primarily focused on two issues we thought were likely to have significant impact on estimates of premorbid IQ deficits in schizophrenia patients. The first relates to diagnostic assessment criteria and methods. A significant change was made in the diagnostic criteria for schizophrenia from DSM-II to DSM-III (i.e., narrowing criteria for schizophrenia and altering the boundary with affective psychoses) and with the advent of the RDC. The only prior meta-analysis of premorbid IQ drew primarily from studies in which samples were diagnosed according to pre-DSM-III criteria or before RDC. Thus, we categorized the studies in our review based on whether or not diagnoses were made by criteria according to ICD-10, RDC, DSM-III, DSM-III-R, or DSM-IV.
The second important methodological issue considered was the type and number of tests used to estimate IQ. Individually administered test batteries of both verbal and nonverbal subtests are now considered the gold standard for assessing IQ, with the Wechsler scales (Wechsler Adult Intelligence Scale [WAIS] [8], Wechsler Intelligence Scale for Children [WISC] [9]) being the best established. However, given the extensive time needed to administer these test batteries, studies with large samples or many neuropsychological measures typically estimate IQ using two to four subtests. Although these estimates are known to be highly comparable with IQ scores based on full test batteries in standardization samples, it has been argued that short forms of IQ tests may attenuate reliability and validity, particularly in clinical or racially diverse samples (10–12). Similarly, group-administered IQ tests, typical of school settings, reportedly yield more variable estimates relative to a full, individually administered IQ test battery (12). For this reason, we characterized each study based on the number of tests used to estimate IQ and whether tests were individually or group administered. Studies using at least one-half of an individually administered IQ test battery were considered to have “long” IQ estimates, and studies using two to four subtests of an IQ test battery or group-administered IQ tests were considered to have “short” IQ estimates.
Using these two methodological considerations, we organized studies a priori into the following three levels: 1) level 1, studies that used recent diagnostic criteria (RDC, ICD-10, DSM-III, or DSM-IV) and “long” IQ estimates; 2) level 2, studies that used recent diagnostic criteria and “short” IQ estimates; and 3) level 3, studies that used older diagnostic systems (pre-DSM-III or pre-RDC) and “short” IQ estimates. We found no studies using older diagnostic systems and “long” IQ estimates.
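The a priori levels described above amount to a simple two-factor rule. The sketch below illustrates that rule under stated assumptions: the names `RECENT_CRITERIA`, `iq_estimate_length`, and `methodological_level` are hypothetical, the set of recent criteria follows the list given earlier in this section (including DSM-III-R), and the "fraction of battery" input is a simplification of how test length would actually be judged.

```python
# Diagnostic systems treated as "recent" in this sketch (assumed from the text).
RECENT_CRITERIA = {"RDC", "ICD-10", "DSM-III", "DSM-III-R", "DSM-IV"}


def iq_estimate_length(individually_administered, fraction_of_battery):
    """Return "long" for at least half of an individually administered battery.

    Group-administered tests and shorter subtest sets are treated as "short".
    """
    if individually_administered and fraction_of_battery >= 0.5:
        return "long"
    return "short"


def methodological_level(criteria, iq_length):
    """Map diagnostic criteria and IQ estimate length to level 1, 2, or 3.

    Returns None for older criteria with "long" estimates, a combination
    for which no studies were found.
    """
    recent = criteria in RECENT_CRITERIA
    if recent and iq_length == "long":
        return 1
    if recent and iq_length == "short":
        return 2
    if not recent and iq_length == "short":
        return 3
    return None


if __name__ == "__main__":
    length = iq_estimate_length(individually_administered=True, fraction_of_battery=1.0)
    print(methodological_level("DSM-IV", length))
```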
Other (secondary) methodological issues that were considered in our analyses included age at premorbid IQ testing and sample ascertainment. In order to analyze effect sizes by age at testing, we categorized studies according to the following general age ranges during which testing was conducted: 1) exclusively during childhood (age <13); 2) exclusively during adolescence or early adulthood (age ≥13); and 3) across a range of ages, including both childhood and adolescence, or without specifying age. If IQ estimates declined with the onset of acute psychosis or during the “prodrome” (the period of active increase of subthreshold symptoms leading into psychosis), we expected effect sizes to be larger in samples of older individuals who were presumably closer to psychosis onset.
Similarly, the method of sample ascertainment might lead to discrepant estimates of premorbid impairment in schizophrenia. We categorized studies according to whether patient samples were identified through 1) diagnostic screening or hospital record linkage of large cohort or population samples; 2) follow-up of genetic risk samples; 3) follow-up or hospital record linkage of conscript samples; 4) selective hospital samples with follow-back assessment of school, conscript, or clinic records; and 5) follow-up of prodromal samples. Comparison samples were either 1) remaining members of cohort or population samples or 2) healthy individuals matched with a patient sample on various demographic variables.
Statistical Analysis
Effect sizes for each study were calculated using Hedges’ adjusted, standardized mean differences (13). We calculated both weighted and unweighted means. Weighted means were calculated using the inverse variance weight (14). A single mean effect size was calculated for studies with multiple effect sizes. For studies with independent samples (e.g., effect size reported by gender), the study effect size was the weighted mean. An unweighted mean effect size was calculated for studies that used repeated measures or other dependent-sample measures (e.g., effect size for verbal and nonverbal subtests). One effect size ([15]; for the comparison of WISC scores only) was an extreme outlier (z>3) and was therefore excluded from all analyses.
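To make the effect size calculation concrete, the following sketch computes a bias-corrected standardized mean difference and an inverse-variance weighted mean. It assumes one common textbook form of the small-sample correction and of the sampling variance; the exact formulas in the cited sources (13, 14) may differ in detail, and all function names are hypothetical.

```python
import math


def hedges_g(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Bias-corrected standardized mean difference (group 1 minus group 2).

    Returns the adjusted effect size and its approximate sampling variance.
    """
    df = n_1 + n_2 - 2
    pooled_sd = math.sqrt(((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / df)
    d = (mean_1 - mean_2) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)  # small-sample adjustment factor
    var_d = (n_1 + n_2) / (n_1 * n_2) + d ** 2 / (2 * (n_1 + n_2))
    return correction * d, correction ** 2 * var_d


def weighted_mean_effect(effects):
    """Inverse-variance weighted mean of (effect, variance) pairs.

    Returns the weighted mean and its standard error.
    """
    weights = [1 / variance for _, variance in effects]
    total = sum(weights)
    mean = sum(w * g for w, (g, _) in zip(weights, effects)) / total
    return mean, math.sqrt(1 / total)


def unweighted_mean_effect(effect_sizes):
    """Simple average, as used here for dependent effect sizes within a study."""
    return sum(effect_sizes) / len(effect_sizes)


if __name__ == "__main__":
    # Illustrative values only, not data from any study in this review.
    g, var_g = hedges_g(92.0, 15.0, 50, 100.0, 15.0, 50)
    print(round(g, 3), round(var_g, 4))
    print(weighted_mean_effect([(-0.5, 0.04), (-0.6, 0.01)]))
```

Weighting by the inverse of the variance gives larger, more precise studies more influence on the mean, which is why the weighted and unweighted means can differ.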
We used the analogue to the analysis of variance method (13) to model between-study variance due to key study design factors, such as a priori methodological level, patient and comparison sample ascertainment methods, and age at testing. Heterogeneity was determined by a significant Q statistic based on chi-square tables (14). Finally, we calculated a mean IQ for all studies that reported IQ scores by group for tests with known means and standard deviations.
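The heterogeneity test and the analogue to the analysis of variance can be sketched as follows. This is a minimal fixed-effect illustration under the usual assumption that the total Q partitions into within-group and between-group components; the function names and the example groupings are hypothetical. The resulting Q values would be compared against a chi-square table with the stated degrees of freedom.

```python
def q_statistic(effects):
    """Homogeneity Q for a list of (effect, variance) pairs.

    Returns Q and its degrees of freedom (number of effects minus one).
    """
    weights = [1 / variance for _, variance in effects]
    mean = sum(w * g for w, (g, _) in zip(weights, effects)) / sum(weights)
    q = sum(w * (g - mean) ** 2 for w, (g, _) in zip(weights, effects))
    return q, len(effects) - 1


def q_between(groups):
    """Between-group Q for a dict mapping group name to (effect, variance) pairs.

    Q_between = Q_total - sum of within-group Q, with (number of groups - 1)
    degrees of freedom.
    """
    all_effects = [pair for pairs in groups.values() for pair in pairs]
    q_total, _ = q_statistic(all_effects)
    q_within = sum(q_statistic(pairs)[0] for pairs in groups.values())
    return q_total - q_within, len(groups) - 1


if __name__ == "__main__":
    # Illustrative values only, not data from any study in this review.
    example = {
        "level_1": [(-0.5, 0.04), (-0.5, 0.04)],
        "level_2": [(-0.7, 0.04), (-0.7, 0.04)],
    }
    print(q_between(example))
```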
Discussion
To our knowledge, this is the first quantitative review of the literature on premorbid IQ in schizophrenia since the well-cited meta-analysis by Aylward et al. (4) was published in 1984. As such, it is the first to incorporate studies that used the most recent diagnostic criteria for schizophrenia and samples that reached the age of schizophrenia onset since the Aylward et al. (4) review. With the increase in both the number of available studies and diagnostic specificity, we were able to apply more rigorous inclusion criteria. Analyzing only studies that used standardized measures of psychometric IQ administered during the premorbid period in samples of individuals who later developed schizophrenia, schizoaffective disorder, or schizophreniform disorder, we found a moderate and reliable effect size (Cohen’s d=–0.54) for premorbid IQ impairment (relative to IQ estimates in comparison samples) that was comparable with the mean effect size reported by Aylward et al. (Cohen’s d=–0.43). (As noted in Table 1, we used different methods than Aylward et al. Applying our methods to the data summarized in Table 2 of the review by Aylward et al. [4, p. 437], we obtained a mean Cohen’s d of –0.49.) The effect size we found is almost exactly one-half of the mean effect size found in chronically ill samples (Cohen’s d=–1.10 [37]). Furthermore, it is consistent across studies using more recent as well as older diagnostic criteria and across studies using longer, individually administered IQ test batteries as well as shorter or group-administered ones (with one exception [29]).
Interestingly, in examining factors that might account for the unusually small premorbid IQ deficit found in the schizophrenia group in the Lubin et al. (29) study, we found that the mean IQ score of the comparison group (standard score equivalent=95.3) was slightly below the standard score mean for IQ (100). While this was not the lowest IQ score of a comparison group among the studies we examined (mean IQ=90.6 for comparison subjects at age 7 [17]), it may represent a more poorly matched comparison sample relative to other studies. More precisely, the study sample examined by Lubin et al. is the only one in which comparison subjects were recruited from specific subsamples of a larger cohort. An earlier published report on this same study sample indicated that nearly one-third of the comparison subjects were recruited from the same hospital patient population as schizophrenia subjects. The other two-thirds were recruited from specific work sites, a field hospital, and a troop command (36). It is possible that this recruitment method yielded comparison subjects with a lower mean IQ score relative to the larger conscript cohort from which the schizophrenia subjects were identified.
Appropriateness of comparison group sampling is important to this type of analysis. Based on standardized psychometric IQ test scores (mean=100 [SD=15]), one would expect an effect size of Cohen’s d=–0.54 to equate to a mean premorbid IQ score of 91.8 in schizophrenia samples. The mean premorbid IQ score of 94.7 in the studies we reviewed reflects a mean IQ in comparison groups that was slightly above the standardization mean of 100. In other words, the mean deficit of Cohen’s d=–0.54 may be slightly inflated because of a potential confound in comparison sampling. However, effect sizes in large samples with comparison subjects, either well matched on several demographic variables or highly representative of the larger population (17, 23, 26, 30–32), were homogeneous in demonstrating a mean premorbid IQ deficit of approximately one-half the standard deviation. This suggests that comparison sampling was not a significant overall confound, even if comparison subjects who were poorly matched or had lower mean IQ scores contributed to atypical results at an individual study level (16, 25, 29).
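The conversion between an effect size and an IQ score in the paragraph above is simple arithmetic, sketched here under the stated norms (mean=100, SD=15). The function names are hypothetical. With d rounded to –0.54, the expected patient mean works out to about 91.9; the 91.8 reported in the text presumably reflects an unrounded effect size. The implied comparison-group mean is an illustrative back-calculation only, and it assumes that the same d and an SD of 15 apply to the studies that reported IQ scores.

```python
IQ_MEAN = 100.0  # standardization mean stated in the text
IQ_SD = 15.0     # standardization SD stated in the text


def expected_iq(d, reference_mean=IQ_MEAN, sd=IQ_SD):
    """Mean IQ expected for a group lying d standard deviations from the reference."""
    return reference_mean + d * sd


def implied_comparison_mean(patient_mean, d, sd=IQ_SD):
    """Comparison-group mean implied by a patient mean and an effect size d."""
    return patient_mean - d * sd


if __name__ == "__main__":
    print(round(expected_iq(-0.54), 1))
    print(round(implied_comparison_mean(94.7, -0.54), 1))
```

Under these assumptions, the implied comparison mean falls a few points above 100, which is in line with the observation that comparison groups scored slightly above the standardization mean.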
The modest impairment of premorbid IQ in schizophrenia samples only underscores the persistent question of whether IQ declines with illness progression. While the discrepancy between pre- and postmorbid effect sizes is consistent with theories of a decline in IQ over the course of schizophrenia onset, it might also be explained by differences in sampling, medication, or clinical state. However, there is no obvious indication that the samples from which premorbid IQ measures were obtained in our analyses were less chronic overall relative to those in the Heinrichs and Zakzanis meta-analysis of post-onset neurocognition (37). In addition, a significant correlation between neuroleptic dose and IQ deficit was not found in the Heinrichs and Zakzanis review. Moreover, while Heinrichs and Zakzanis did find a smaller mean effect size in studies that used non-WAIS-R estimates of full-scale IQ compared with WAIS-R estimates, most of the non-WAIS-R estimates were based on single tests or word reading tests, which were not included in our analyses. This evidence suggests that the discrepancy between pre- and postmorbid IQ in schizophrenia might be related to clinical state or illness progression instead of sampling, medication effects, or measurement artifact.
Although nine of the 18 studies provided multiple measures of IQ over time in a given sample (including two studies with reports on similar, although not exactly the same, samples), the data were insufficient to calculate a mean change in effect size over time. Only three longitudinal studies reported significantly greater impairment in IQ over time, and these changes in impairment were from pre- to postmorbid testing. Since these changes could have occurred entirely during the postmorbid period, there remain no reliable longitudinal data supporting a premorbid decline in IQ in schizophrenia. Prospective longitudinal data from pre-adolescence through the premorbid, prodromal, and illness stages are needed to determine if and when IQ declines over time for individuals who develop schizophrenia (relative to comparison subjects). However, as demonstrated in the study conducted by Lane and Albee (35), careful matching of comparison groups is critical to analyses of change over time. It is only through the careful matching of comparison groups that we can properly consider the significance of age at assessment, age at symptom and psychosis onset, and age and duration of illness at retesting.
Analyses of premorbid functioning in specific and separable neurocognitive domains over time may also be important in identifying patterns of change not evident in global measures of cognition such as IQ. Deficits in some domains, such as attention and verbal memory, may be apparent during the premorbid or early prodromal phases and represent markers of vulnerability or predictors of illness (38). Cognitive functioning in other domains, such as executive or olfactory functioning, may become increasingly impaired over time or with illness progression (34, 39). However, a focus on single neurocognitive functions may be less promising in indexing vulnerability and predicting outcome than combinations of relatively distinct neurocognitive and behavioral variables (e.g., 40). In any case, interpretations regarding the deterioration of IQ, or lack thereof, should not be generalized to other neurocognitive functions that are not sufficiently measured by IQ tests (e.g., executive functions, memory).
In summary, with the exception of one study that had potentially problematic comparison group sampling, studies that incorporated several different methodologies and IQ measures provided a notably consistent and reliable report of IQ deficit prior to the onset of schizophrenia. This medium-sized premorbid deficit 1) was approximately one-half that found after diagnosis; 2) could be reliably measured in childhood; and 3) did not appear to progress with age or over time (within the limited data available), even during the onset of early prodromal symptoms. In contrast to findings from several individual studies, evidence is lacking at the meta-analytic level to support the theory of a larger premorbid deficit in performance versus verbal IQ or in male versus female subjects. While the overall finding is highly consistent with theories of schizophrenia as a neurodevelopmental disorder, the size of premorbid relative to postmorbid IQ estimates supports the presence of additional progressive deterioration over the transition to acute psychosis. It is not at all clear whether and when this deterioration occurs or whether it is best accounted for by specific or widespread changes. These questions can only be answered with repeated measures across a number of cognitive domains during the premorbid, prodromal, and very early stages of psychosis onset as well as over time.