A considerable proportion of patients with schizophrenia perform an average of 1.5 to 2.5 standard deviations below population norms on standardized psychometric tests, a performance consistent with mild to moderate cognitive impairment
(1 –
3) . Other patients perform within age-adjusted norms, yet their performance is worse than predicted based on parental education
(4) . The cognitive impairment often appears before the onset of psychosis
(5 –
7), persists throughout the patient’s life regardless of remission of psychosis, and is correlated with functional outcomes
(8,
9) . For all these reasons, cognitive impairment in schizophrenia is an important target for therapeutic intervention
(10,
11) . However, despite the importance of cognitive dysfunction to outcome in schizophrenia, the benefit exerted by antipsychotic medications on cognitive performance remains questionable
(12,
13) .
Investigation of the cognitive effects of antipsychotics intensified after the introduction of second-generation (“atypical”) antipsychotic drugs. It has been hypothesized that second-generation antipsychotic drugs differ in their mechanism of action relative to first-generation antipsychotic drugs. The equivocal results of the plethora of individual studies and meta-analyses comparing the effects of second-generation antipsychotic drugs with those of first-generation antipsychotic drugs on cognitive performance reflect the unresolved debate regarding this issue. Some studies reported that the cognitive benefits of second-generation antipsychotics were superior to those of first-generation antipsychotics
(12,
22 –
21), but the effect size of these differences was not large and the results were inconsistent. Furthermore, the sample size of some of these trials was small, the statistical analyses employed were not always preplanned, and these trials did not always properly account for relevant confounders such as symptom change, anticholinergic treatment, change in extrapyramidal symptoms, and practice
(14,
16,
21 –
23) . Additionally, in many of these trials the investigator(s) or sponsor(s) selected a relatively high dose of the first-generation antipsychotic drug as the comparator, which might have biased the results in favor of the second-generation antipsychotic drug, since high doses of first-generation antipsychotics both impair motor performance and require anticholinergic drugs, which in turn may impair cognitive performance
(13) . Finally, many of the studies of the cognitive effects of second-generation antipsychotic drugs were funded by companies that manufacture these medications, a fact cited by some investigators as a possible source of bias
(24) . The doubts about the superior procognitive effects of second-generation antipsychotic drugs, taken together with doubts regarding the advantages of these drugs in other noncognitive areas of schizophrenia, became the impetus for several publicly funded trials intended to provide more definitive answers. The Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE) study
(25), funded by the U.S. government, addressed many of these biases. In a preplanned analysis, the CATIE study reported that although cognitive performance improved in all groups after 2 months of treatment with olanzapine, perphenazine, quetiapine, or risperidone, there were no statistically significant differences among the groups
(26) . Despite the many strengths of the CATIE trial, the following arguments were raised: 1) chronically ill patients are less likely to benefit from the procognitive advantages of second-generation antipsychotics than younger first-episode schizophrenia patients, since chronically ill patients who participate in trials often do so because they failed to respond to previous treatment(s); 2) haloperidol, which is the most prescribed first-generation antipsychotic, should have been used as the comparator instead of perphenazine, which has many characteristics of second-generation antipsychotics
(27) and is much less frequently prescribed; 3) results obtained from blinded trials are not generalizable, since patients who participate in such trials differ from the patients seen in clinical practice; and 4) in blinded trials, the psychiatrist cannot adjust the dose to the optimum for the individual patient.
The present study addresses some of the aforementioned limitations of the CATIE study by comparing the cognitive performance of first-episode patients randomly assigned in an open-label design to haloperidol or one of four second-generation antipsychotics (amisulpride, olanzapine, quetiapine, and ziprasidone). Since it has been argued that trials showing second-generation antipsychotics to be superior to haloperidol used doses of haloperidol that were too high, the drug was administered in the low dose of 1 to 4 mg/day. To improve the study’s generalizability and acceptability to patients and treating physicians, the design was pragmatic and included patient groups that are often excluded in efficacy trials (e.g., patients with substance misuse and patients who could not participate in blinded trials for safety or other reasons). The overall objective of this randomized, open-label trial was to compare the effectiveness of haloperidol with that of second-generation antipsychotics as reflected by the time to discontinuation of the randomized drug. The study (European First Episode Schizophrenia Trial [EUFEST]) has been described in detail elsewhere
(28,
29) .
Method
Participants (N=498) were 18- to 40-year-old patients who met DSM-IV criteria for schizophrenia or schizophreniform disorder, confirmed using the Mini-International Neuropsychiatric Interview–Plus
(30), and were recruited at 50 sites in 13 European countries and Israel.
Inclusion criteria were 1) recent onset of psychosis with <2 years between the onset of positive symptoms and recruitment into the trial; 2) <2 weeks of exposure to antipsychotic drugs during the preceding year; and 3) <6 weeks of lifetime exposure to antipsychotic drugs. Data were obtained on demographics, diagnoses, severity of psychopathology (Positive and Negative Syndrome Scale [PANSS]
[31], Clinical Global Impression [CGI] Scale
[32] ), and extrapyramidal symptoms (St. Hans Rating Scale
[33] ). The trial was conducted in accordance with the Declaration of Helsinki, Good Clinical Practice, and national regulatory requirements.
Patients were randomly assigned via a centralized, computerized online randomization system to open-label treatment with haloperidol, 1 to 4 mg/day (N=103); amisulpride, 200 to 800 mg/day (N=104); olanzapine, 5 to 20 mg/day (N=105); quetiapine, 200 to 750 mg/day (N=104); or ziprasidone, 40 to 160 mg/day (N=82). (Ziprasidone was not available in all participating countries at all times throughout the trial, which explains the smaller number of patients in this arm.) At all times during the trial, the dose administered to patients was at the discretion of the treating physician, providing that it remained within the range permitted by the protocol. Mood stabilizers, benzodiazepines, antidepressants, and anticholinergics were allowed before and during the trial. Additional antipsychotics were not permitted.
The following five cognitive tests were administered:
Rey Auditory Verbal Learning Test . This assessment is a list learning task in which the participants are read a list of words and are then tested for what they have learned by recall. Five trials are administered, and the outcome measure is the total number of words recalled
(34) .
Trail Making Test, Part A and Part B (two tests). These assessments are tests in which participants must first draw lines to connect consecutively numbered circles on one worksheet (Part A) and then connect the same number of consecutively numbered and lettered circles on another worksheet by alternating between the two sequences (Part B). The outcome measures are the time to complete the tests. These are tests of complex visual scanning, motor speed, and ability to shift strategy
(35) .
Purdue Pegboard Test . This assessment is a task in which participants must place pegs in the holes of a board. It tests motor speed and motor coordination and is sensitive to subtle motor dysfunction, including extrapyramidal symptoms. The outcome measure is the number of pegs placed correctly within 30 seconds by the dominant hand
(36) .
Digit Symbol Coding . This assessment is a test in which the participant is allowed 120 seconds to move through a grid of 133 numbers (1 to 9) and pair them with the correct symbol by using a key consisting of nine symbol-digit pairs. The test is part of the WAIS-III
(37) and measures symbolic representational abilities, visual motor skill/dexterity, processing speed, attention/concentration, visual perception of abstract stimuli, and short-term visual memory. The outcome measure is the number of correct symbols drawn within 120 seconds
(37) .
The composite score, rather than the individual test scores, was assigned as the main outcome measure to increase statistical power. The five cognitive tests were selected based on practicality of administration for a large, multisite, international, multilingual clinical trial of antipsychotics in schizophrenia. Motor abilities were emphasized in order to assess the effect of haloperidol on cognitive performance. All of the countries that participated had validated versions of the tests, except Bulgaria for the Rey Auditory Verbal Learning Test. Since lengthy testing is not always acceptable to patients, the tests selected represent a compromise between the comprehensiveness of the test battery and the desire to reduce the amount of missing data. The tests were administered at baseline and after 6 months of treatment (SD=3 weeks). For patients in whom the severity of psychosis at baseline was such that cognitive testing was not feasible, testing could be postponed until the end of the first month of the trial. The tests were administered by psychologists and psychiatrists who were trained at investigator meetings to administer the specified tests.
Of the 498 patients randomly assigned to one of the five trial drugs, 40 were missing baseline data and 172 completed ≤2 of the five neurocognitive tests at both time points and were therefore excluded from analyses using the composite score. The remaining 286 patients included in the analysis had valid data at baseline and again at 6 months for ≥3 of the five neurocognitive assessments. Analyses regarding each subtest included all patients who had valid data for the specific test at both baseline and the 6-month follow-up. Comparison of demographics and clinical characteristics between the group of 286 patients included in the composite score analyses and the group of 212 patients excluded from these analyses as a result of incomplete data revealed that excluded patients were less educated and had worse cognitive performance on two of the individual tests, but these differences were very small (
Table 1 ). Chi square tests revealed no differences in the distribution of gender and race between the group of included patients and the group of excluded patients. Before receiving trial medication, of the 286 patients included in the analyses, 29% were antipsychotic naive, 24.5% had received treatment with a first-generation antipsychotic drug, and 46.5% had received treatment with a second-generation antipsychotic drug. Comparisons among the patients randomly assigned to the five medication arms revealed no baseline differences on demographic characteristics (
Table 2 ). There were no differences in rates of extrapyramidal symptoms between the five study arms, except patients who were randomly assigned to haloperidol exhibited more akathisia (χ²=9.77, df=4, p=0.04), but these differences were very small.
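The sample accounting and the akathisia test reported above can be rechecked with a few lines of arithmetic. The sketch below is only an illustrative consistency check using the figures given in the text; the variable and function names are hypothetical, and the closed-form tail probability applies only to chi-square distributions with an even number of degrees of freedom.

```python
import math

# Arm sizes as reported in the randomization paragraph.
ARM_SIZES = {"haloperidol": 103, "amisulpride": 104, "olanzapine": 105,
             "quetiapine": 104, "ziprasidone": 82}

RANDOMIZED = 498
MISSING_BASELINE = 40
TOO_FEW_TESTS = 172  # completed <=2 of the five tests at both time points

# Prior treatment status (percent) of the patients included in the analyses.
PRIOR_TREATMENT_PCT = {"naive": 29.0, "first_generation": 24.5,
                       "second_generation": 46.5}


def analyzed_and_excluded():
    """Return (patients analyzed, patients excluded) for the composite score."""
    excluded = MISSING_BASELINE + TOO_FEW_TESTS
    return RANDOMIZED - excluded, excluded


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate; valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms


analyzed, excluded = analyzed_and_excluded()
akathisia_p = chi2_sf_even_df(9.77, 4)
```

Under these inputs the arm sizes sum to the number randomized, the exclusions leave the 286 analyzed patients, and the tail probability for χ²=9.77 with df=4 rounds to the reported p=0.04.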
Before randomization to the trial medication, 11.2% of patients were already receiving anticholinergic drugs as treatment for extrapyramidal symptoms resulting from medications taken before the present study. During the 6-month period between randomization and follow-up, anticholinergic drugs were added to the treatment regimen of an additional 18.5% of the participating patients. At the time of the 6-month follow-up, the mean daily antipsychotic doses were as follows: 2.5 mg for haloperidol, 455 mg for amisulpride, 12 mg for olanzapine, 458 mg for quetiapine, and 98 mg for ziprasidone.
Statistical Analyses
To analyze the results, a composite score of the five neurocognitive tests was defined as the mean of the patients’ results on all tests. Scores were standardized to create z scores (mean=0 [SD=1]) based on the means and standard deviations on the baseline assessments of the 286 patients included in the analyses. Effect size was calculated using Cohen’s formula
(38) . The Pearson correlation between baseline and the end-of-trial score was 0.79, indicating high within-subject consistency of the composite score. Within-group improvement in cognitive performance over time was evaluated using paired-sample t tests. Secondary analyses compared change in the five standardized test scores between baseline and month 6 and included all patients who had valid data for the specific test at both baseline and the 6-month follow-up. Treatment groups were compared using analysis of covariance and independent-sample t tests with Bonferroni corrections for multiple comparisons. Linear regression analysis was used in an attempt to predict improvement in the cognitive composite score from baseline to 6 months. Cox proportional hazards regression analyses were performed to examine the association between change in the cognitive composite score from baseline to month 6 and time to all-cause discontinuation or time until discontinuation as a result of lack of efficacy. The statistical analysis, similar to the method described by Keefe et al.
(26), was determined before access to the cognitive data was available.
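The scoring steps described above can be sketched as follows. This is a minimal illustration, not the trial's analysis code: the test keys are hypothetical, the sign flip for the time-based Trail Making scores is an assumption (the text does not state how timed tests were oriented), and the effect size shown is one common variant of Cohen's d (mean change divided by the root mean square of the two standard deviations), since the text does not specify which form was used.

```python
import math
from statistics import mean, stdev

# Assumption: timed tests are sign-flipped so a higher z is always better.
LOWER_IS_BETTER = {"trails_a", "trails_b"}
MIN_VALID_TESTS = 3  # at least 3 of the 5 tests must be valid


def z_score(raw, baseline_mean, baseline_sd, test):
    """Standardize a raw score against the baseline mean and SD."""
    z = (raw - baseline_mean) / baseline_sd
    return -z if test in LOWER_IS_BETTER else z


def composite(raw_scores, norms):
    """Mean z across valid tests, or None if fewer than MIN_VALID_TESTS.

    raw_scores: {test: value or None}; norms: {test: (baseline_mean, baseline_sd)}.
    """
    zs = [z_score(v, *norms[t], t) for t, v in raw_scores.items() if v is not None]
    return mean(zs) if len(zs) >= MIN_VALID_TESTS else None


def paired_t(baseline, follow_up):
    """Paired-sample t statistic (baseline minus follow-up) and its df."""
    diffs = [b - f for b, f in zip(baseline, follow_up)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n)), n - 1


def cohens_d(baseline, follow_up):
    """Mean change divided by the root mean square of the two SDs (assumed form)."""
    pooled = math.sqrt((stdev(baseline) ** 2 + stdev(follow_up) ** 2) / 2)
    return (mean(follow_up) - mean(baseline)) / pooled


# Hypothetical norms and scores, for illustration only.
norms = {"ravlt": (40, 10), "trails_a": (40, 10), "trails_b": (100, 20),
         "pegboard": (12, 2), "digit_symbol": (50, 10)}
patient = {"ravlt": 50, "trails_a": 30, "trails_b": 80,
           "pegboard": None, "digit_symbol": None}
before = [-1.0, -0.5, 0.0, 0.5, 1.0]
after = [-0.4, 0.0, 0.3, 1.1, 1.5]
t_stat, df = paired_t(before, after)
```

With baseline minus follow-up as the difference, improvement yields a negative t statistic, which matches the sign convention of the t values reported in the Results.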
Results
Change in the cognitive composite score from baseline to 6 months showed improvement in each of the five treatment groups (haloperidol: t=–3.43, df=51, p=0.001 [effect size=0.43]; olanzapine: t=–7.11, df=73, p<0.001 [effect size=0.56]; quetiapine: t=–6.25, df=59, p<0.001 [effect size=0.51]; ziprasidone: t=–4.75, df=44, p<0.001 [effect size=0.49]; and amisulpride: t=–4.15, df=54, p<0.001 [effect size=0.33]). There was no overall difference between the five treatment groups in the composite scores (F=0.56, df=4, 281, p=0.69) or individual tests (all p values >0.11) (
Table 3 ). Random effects analyses, including subjects with partial missing data, yielded similar results. A significant improvement in the cognitive composite score was seen over time (p<0.001), but there were no differences between the five study arms (p=0.42). Controlling for multiple comparisons using Bonferroni corrections, t tests comparing the change in the subtest scores from baseline to month 6 for haloperidol versus the four second-generation antipsychotic drugs (both separately and grouped together) revealed no significant differences.
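The degrees of freedom reported for the within-group t tests imply the group sizes, which can be checked against the total sample and the F test. The sketch below is only a consistency check on the reported figures; the variable names are hypothetical.

```python
# Degrees of freedom of the paired-sample t tests, as reported above.
PAIRED_T_DF = {"haloperidol": 51, "olanzapine": 73, "quetiapine": 59,
               "ziprasidone": 44, "amisulpride": 54}

# A paired t test on n patients has n - 1 degrees of freedom.
group_sizes = {arm: df + 1 for arm, df in PAIRED_T_DF.items()}
total_n = sum(group_sizes.values())

# Degrees of freedom for a one-way comparison of k groups.
k = len(group_sizes)
between_df = k - 1
within_df = total_n - k
```

The implied group sizes sum to the 286 analyzed patients, and the resulting degrees of freedom (4 and 281) match those reported for the F test. An analysis with one covariate would be expected to lose one further error degree of freedom, so the reported values appear to correspond to a one-way layout without a covariate.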
Since there were no differences among the five treatment groups on the cognitive composite and subtest scores, presentation of the individual measures at baseline and the 6-month follow-up included all patients who had valid data for both time points collapsed across treatment groups (
Table 4 ). Change in the individual measures showed consistent improvement.
The change in the composite score of patients who were drug naive before baseline testing did not differ significantly from that of patients who received drug treatment before baseline testing (effect size=0.37 [SD=0.58] versus effect size=0.45 [SD=0.65], respectively; p=0.31).
The analyses were repeated with trial site, years of education, and baseline alcohol and substance use included in the model. These covariates did not produce results that differed significantly from the unadjusted analyses.
A model of prediction of improvement in the cognitive composite score from baseline to 6 months suggested that a lower (worse) baseline composite score was a predictor of greater cognitive improvement (r²=0.25, p<0.001). Drug and alcohol abuse, trial site, and PANSS subscale scores were not significant predictors of cognitive improvement. Similar results were found when examining each of the five study arms separately.
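The form of such a prediction model can be illustrated with a single-predictor least-squares fit. The sketch below uses made-up data and a hypothetical function name; it shows only how a negative slope of change on baseline score, and the proportion of variance explained, would be obtained, and it does not reproduce the trial's model or covariates.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x, with r squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical data: baseline composite z score and 6-month change.
baseline_z = [-2.0, -1.0, 0.0, 1.0, 2.0]
change = [1.0, 0.8, 0.4, 0.3, 0.0]
slope, intercept, r_squared = fit_line(baseline_z, change)
```

A negative slope in this setup corresponds to the pattern described above, in which patients with lower baseline scores show larger improvement.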
The Pearson correlations between change in the cognitive composite score from baseline to the 6-month follow-up and change in PANSS scores from baseline to the 6-month follow-up revealed negative correlations (
Table 5 ). These negative correlations indicated that cognitive improvement was associated with symptom amelioration, but the magnitude of these correlations was very small. Changes in the composite scores of the 53 patients for whom anticholinergic medications were added between the baseline and 6-month assessments did not differ from those of the rest of the patients (effect size=0.43 [SD=0.6] versus effect size=0.41 [SD=0.75], respectively).
Cox proportional hazards regression analyses suggested that change from baseline to 6 months in the cognitive composite score was not a significant predictor of time to discontinuation as a result of lack of efficacy (p=0.99) or time to all-cause discontinuation (p=0.96). The results were unchanged when baseline covariates that were found previously to predict time to discontinuation were included
(26) .
Discussion
The present study compared cognitive performance in first-episode schizophrenia patients after 6 months of treatment with haloperidol, olanzapine, amisulpride, ziprasidone, or quetiapine. Compared with the results of baseline testing, cognitive scores improved moderately, with no differences among the five treatment groups. Lower (worse) baseline composite scores predicted greater cognitive improvement. Change in cognitive performance was weakly related to change in symptom scores.
Results of this randomized, open-label trial, reporting no advantage of second-generation antipsychotic drugs over first-generation antipsychotic drugs in terms of cognitive performance, differ from findings of some previous studies and meta-analyses
(22 –
21) . A likely explanation for this difference might be the relatively high doses of haloperidol used in some, but not all, related studies
(13) . High doses increase the risk of extrapyramidal symptoms and the use of anticholinergic medications, which in turn impair motor and cognitive speed. It is therefore possible that results of previous trials comparing second-generation antipsychotics with first-generation antipsychotics on cognitive functioning in schizophrenia in fact reflect the diminished propensity of second-generation antipsychotics to induce extrapyramidal symptoms rather than an intrinsic cognitive benefit of these new compounds. In the present trial, the mean daily dose of haloperidol was only 2.5 mg, which is a lower dose than that used in most other studies.
Results of the present trial are in agreement with the results of the CATIE schizophrenia trial, which reported cognitive improvements after 2, 6, and 18 months of treatment, relative to baseline, but no differences among treatment groups, with the exception of a slight advantage for perphenazine
(26) . The larger effect sizes reported in the present study compared with those of the CATIE study could be attributable to the younger and less chronic population examined in our study. The magnitude of the improvement observed in the present study is similar to that reported in other studies of first-episode schizophrenia patients
(15,
39,
40) . One study
(39) of first-episode patients reported an advantage of risperidone over haloperidol, but the advantage was very small and the dose of haloperidol was higher than the dose used in the present trial.
Part of the improvement in cognitive test scores between baseline and 6-month testing observed in the current study might have been a result of practice or placebo effects
(41), such as prior exposure to the tests, familiarity, and procedural learning. This was demonstrated in a recent study, conducted by Goldberg et al.
(23), in a sample of first-episode schizophrenia patients treated with second-generation antipsychotics as well as in a comparison sample of healthy subjects, reporting the same order of magnitude of improvement found in the present study (i.e., effect size=0.33). In the present trial, the changes between the baseline and 6-month assessments ranged from an effect size of 0.56 to 0.33, depending on the individual test. Although patients were tested more often in the study conducted by Goldberg et al.
(23) than in the present study, it is still possible that at least part of the improvement observed in the current trial might have been the result of practice. However, the 6-month interval between tests was sufficiently long and the effect size was sufficiently large to suggest that at least part of the effect was drug related and not practice related. It has also been suggested that first-generation antipsychotics, particularly haloperidol, might attenuate the practice effect
(23) . However, this is not supported by the present study, since the cognitive improvement from a low dose of haloperidol was not significantly different from that of the second-generation antipsychotic drugs. Nevertheless, it is still conceivable that higher doses of haloperidol might exert a deleterious effect on cognitive performance
(22) . Furthermore, these data do not support the contention that treatment with haloperidol plus anticholinergic drugs is equivalent to treatment with second-generation antipsychotics.
There are several limitations to the present study. First, a more comprehensive cognitive battery might have revealed different results. However, there is little biological rationale to assume that a particular cognitive task not assessed in our study might have preferentially been affected by one but not another of the five drugs tested. Second, the open-label design, together with the possibility that some investigators had been convinced by previous trials reporting cognitive advantages of second-generation antipsychotics over first-generation antipsychotics, might have led to an “expectation bias,” which could have conferred an advantage to the second-generation antipsychotic drugs over haloperidol
(29) . However, this potential bias is not supported by the results. Third, since antipsychotics can be detected in both blood and the CNS for many days and weeks after discontinuation of treatment, it is conceivable that the effect of the antipsychotic drugs administered before baseline testing could have affected baseline cognitive performance. In the present trial, like in many other similar trials
(42), some patients had very brief washout periods, others had no washout period, and a few patients who could not be tested off medication because of the severity of their psychosis were allowed to have baseline testing shortly after random assignment. However, this is unlikely to have affected the results, since there were no baseline differences in cognitive performances among the five treatment groups, nor were there differences between the group of patients who were drug naive before baseline testing and the group of patients who received drugs before baseline testing. Furthermore, the NIMH Measurement and Treatment Research to Improve Cognition in Schizophrenia (MATRICS) Task Force on the design of clinical trials to assess the cognitive effects of drug treatment
(43) and other similar trials evaluating cognition in first-episode patients
(42) suggest that the interference of severe psychosis with testing might not justify antipsychotic washout. Finally, it cannot be ruled out that a first-generation antipsychotic drug or a second-generation antipsychotic drug not included in the present trial, such as risperidone or clozapine, would have proved to be superior to the drugs tested.
Despite the limitations of the present study, as well as other similar studies, it is unlikely that different trial designs or different patient populations would help differentiate any of the currently marketed antipsychotic drugs in terms of cognitive benefits. First, the discovery platform common to all antipsychotic drugs for the last 50 years has been the same, and its target has been the reduction of psychotic symptoms, not cognitive improvement. Antipsychotic drugs are screened based on their preclinical in vitro and in vivo similarities to other antipsychotics already proven to ameliorate psychosis in humans. The proof of concept and the go-/no-go decisions in phases II and III clinical trials have been based on the antipsychotic properties of these drugs and not on their potential procognitive benefit. If second-generation antipsychotic drugs differ from first-generation antipsychotic drugs in areas other than adverse effect profile, they do so in terms of global effectiveness as reflected by the results of the trials conducted by Fleischhacker et al.
(28) and Kahn et al.
(29) .
The present study, similar to many previous studies, found only very weak correlations between psychotic symptoms at baseline or improvement in psychotic symptoms and cognitive impairment at baseline or improvement in cognitive performance, further underlining the separation between the desired procognitive and antipsychotic effects. It is therefore not surprising that not a single antipsychotic drug distinguishes itself in terms of cognitive benefits.
Interestingly, individual trials, even the most pragmatic and independently funded trials, very rarely provide definitive answers to the clinical question they intend to address. This is particularly true when the debate is about the advantages and distinguishing characteristics of drugs aimed at the same indications but perceived as older versus newer generations of drugs. Neither the CATIE study
(44) nor the Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial
(45 –
50), a trial showing very little difference between diuretics and novel antihypertensive drugs, settled the debate or immediately and radically affected prescribing practices. It appears that only a continuous stream of quality publications in high-impact journals and educational efforts can alter established perceptions and affect prescribing practices
(51) .
In summary, we found that antipsychotics, whether a low dose of haloperidol or several second-generation antipsychotics, were associated with improvement on cognitive test performance in patients in the early stages of schizophrenia but that the magnitude of improvement among these drugs did not differ.