Drug choice in the treatment of schizophrenia has been controversial. Second-generation antipsychotic drugs, which have a low propensity to cause extrapyramidal side effects, were introduced in the 1990s. As their cost represents a large proportion of mental health budgets, totaling $11.7 billion in the United States in 2005 (1), there is a debate as to their superior effectiveness compared with lower-cost first-generation antipsychotics, such as haloperidol. Meta-analyses have shown that some second-generation antipsychotics (amisulpride, clozapine, olanzapine, and risperidone) are more efficacious than first-generation antipsychotics (2, 3). Some evidence suggests that even these superiorities may be due to an inappropriate choice of the comparator first-generation antipsychotic, the dosage of the comparator antipsychotic, or lack of prophylactic antiparkinson medication (4, 5). Despite these controversies, second-generation antipsychotics have become the most frequently prescribed drugs in some countries, including the United States. The question of whether there are efficacy differences between these drugs thus becomes very important.
Results
The search yielded 3,620 citations. Of 612 studies inspected, 319 were excluded: 44 studies with no or inadequate randomization; 23 studies with no appropriate drug group; 230 open-label studies (181 from China); one study with inappropriate participants; six studies with no usable data; and 15 studies that used groups of second-generation antipsychotics. We included 293 publications on 78 studies with 167 relevant arms and 13,558 participants (only the principal publications are referenced). Nine studies included amisulpride, four aripiprazole, 28 clozapine, 48 olanzapine, 21 quetiapine, 44 risperidone, two sertindole, nine ziprasidone, and two zotepine.
Forty-nine studies were mainly sponsored by pharmaceutical companies, and 22 were publicly funded; funding was uncertain for seven studies despite written queries. The participants had relatively chronic courses of illness, with mean ages in the mid-30s, but five trials included only first-episode patients. The diagnostic criteria used were mainly those of DSM-III-R, DSM-IV, and ICD-10 (for details, see the online data supplement).
Primary Outcome Measure: PANSS Total Score
The pooled effect sizes of each second-generation antipsychotic versus every other one are shown in Figure 1 (forest plots with the single studies can be found in the online data supplement). It should be noted that all results are shown twice. For example, the comparison between amisulpride and olanzapine is described under “amisulpride versus other second-generation antipsychotics” as well as under “olanzapine versus other second-generation antipsychotics.” Despite the redundancy, the results are easier to understand in this format; otherwise the reader interested in a given drug would have to look up the findings in different sections, making it difficult to see the gestalt. To save space, we present here, for the significant results, only the number of participants combined for the two second-generation antipsychotics compared, the difference in PANSS scores (weighted mean difference), and the p value, and, for nonsignificant results, the number of participants combined for the two second-generation antipsychotics compared. Negative values mean superiority of the first second-generation antipsychotic throughout. All statistical details are presented in figures and tables. The data were rather homogeneous, and the few cases of significant heterogeneity are reported in the text.
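As an illustration of how a pooled weighted mean difference and its heterogeneity statistic can be obtained, the following is a minimal sketch. It assumes a DerSimonian-Laird random-effects model, which this section does not confirm as the method used, and the function name and the study values in the example are hypothetical, not data from the included trials. The sign convention follows the text: a negative pooled difference favors the first drug of the pair.

```python
import math


def pool_weighted_mean_difference(studies):
    """Pool (mean_difference, standard_error) pairs.

    Assumption: DerSimonian-Laird random-effects pooling. Each mean
    difference is (first drug minus second drug) in PANSS points.
    """
    if not studies:
        raise ValueError("need at least one study")
    diffs = [d for d, _ in studies]
    variances = [se ** 2 for _, se in studies]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / v for v in variances]
    fixed = sum(wi * d for wi, d in zip(w, diffs)) / sum(w)
    q = sum(wi * (d - fixed) ** 2 for wi, d in zip(w, diffs))

    # Between-study variance (tau^2), truncated at zero.
    df = len(studies) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if df > 0 and c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * d for wi, d in zip(w_re, diffs)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    p = math.erfc(abs(pooled / se) / math.sqrt(2.0))  # two-sided normal p
    return {"wmd": pooled, "se": se, "p": p, "q": q, "tau2": tau2}


if __name__ == "__main__":
    # Hypothetical studies: (PANSS mean difference, standard error).
    example = [(-4.0, 1.5), (-2.5, 2.0), (-6.0, 2.5)]
    print(pool_weighted_mean_difference(example))
```

When the studies agree closely, tau^2 is estimated as zero and the result coincides with the fixed-effect estimate; heterogeneous studies widen the standard error of the pooled difference.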
Amisulpride
There were no significant differences between amisulpride and olanzapine (N=701), risperidone (N=291), and ziprasidone (N=122).
Aripiprazole
Aripiprazole was less efficacious than olanzapine in two studies sponsored by aripiprazole’s manufacturer (N=794, weighted mean difference=5.0, p=0.002). Two further studies found no significant difference compared with risperidone (N=372).
Clozapine
Clozapine was not significantly different from olanzapine (N=619), quetiapine (N=232), risperidone (N=466), and ziprasidone (N=146). Clozapine was significantly more efficacious than zotepine (N=59, weighted mean difference=–6.0, p=0.002). The comparison with risperidone was significantly heterogeneous due to one study sponsored by clozapine’s manufacturer (17); excluding the study did not change the overall results.
Olanzapine
Olanzapine was significantly more efficacious than aripiprazole (N=794, weighted mean difference=–5.0, p=0.002), quetiapine (N=1,449, weighted mean difference=–3.7, p<0.001), risperidone (N=2,404, weighted mean difference=–1.9, p=0.006), and ziprasidone (N=1,291, weighted mean difference=–8.3, p<0.001). No significant difference between olanzapine and amisulpride (N=701) or clozapine (N=619) emerged.
Quetiapine
Quetiapine was significantly less efficacious than olanzapine (N=1,449, weighted mean difference=3.7, p<0.001) and risperidone (N=1,953, weighted mean difference=3.2, p=0.003). There was no significant difference compared with clozapine (N=232) and ziprasidone (N=710).
Risperidone
Risperidone was significantly more efficacious than quetiapine (N=1,953, weighted mean difference=–3.2, p=0.003) and ziprasidone (N=1,016, weighted mean difference=–4.6, p=0.002). It was less efficacious than olanzapine (N=2,404, weighted mean difference=1.9, p=0.006). No difference compared with amisulpride (N=291), aripiprazole (N=372), clozapine (N=466), and sertindole (N=493) emerged.
Sertindole
There was no significant difference between sertindole and risperidone in two studies sponsored by sertindole’s manufacturer (N=493). One study, in treatment-resistant patients, found results with risperidone to be 7 points better; the other, without this criterion, found sertindole to be 3.5 points better. This discrepancy led to significant heterogeneity.
Ziprasidone
Ziprasidone was less efficacious than olanzapine (N=1,291, weighted mean difference=8.3, p<0.001) and risperidone (N=1,016, weighted mean difference=4.6, p=0.002). No significant differences compared with amisulpride (N=122), clozapine (N=146), and quetiapine (N=710) were found.
Zotepine
Zotepine was less efficacious than clozapine (N=59, weighted mean difference=6.0, p=0.002).
Secondary Outcomes: Positive and Negative Symptoms
The findings suggest that a substantial portion of the efficacy differences was due to superior improvement in positive symptoms. Results for positive symptoms paralleled those found for overall symptoms except that olanzapine was not significantly more efficacious than risperidone (Figure 2; see also the online data supplement). There were no significant differences for negative symptoms, with the exception of a superiority of quetiapine compared with clozapine in two small Chinese studies of first-episode schizophrenia (Figure 3). The comparisons of quetiapine with risperidone and olanzapine with ziprasidone were heterogeneous, and the results did not change when outliers were excluded (see also the online data supplement).
Dropout Due to Inefficacy of Treatment
The rates of dropout due to poor efficacy were consistent with the primary outcome measure, except that clozapine was significantly more effective than risperidone, and amisulpride was superior to ziprasidone. Furthermore, there was no significant difference in a single study comparing aripiprazole and olanzapine, and no significant difference between risperidone and ziprasidone (Figure 4; see also the online data supplement).
Metaregressions
Metaregression did not detect significant effects of study duration, antipsychotic dosages or dose ratios, or study quality. Nor were there significant effects for sponsorship, with the exception of clozapine versus risperidone (coefficient=6.3 in the expected direction, p=0.015).
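To make the idea of a study-level moderator analysis concrete, the sketch below regresses study effects on a single moderator using inverse-variance weights. It is a simplified fixed-effect weighted least squares version with one covariate, chosen here for brevity; the model actually used for the analyses above is not described in this section and may have been a random-effects metaregression. The function name and the example values are hypothetical.

```python
import math


def meta_regress(effects, std_errors, moderator):
    """Inverse-variance weighted regression of study effects on one moderator.

    Assumption: fixed-effect model with known within-study variances.
    Returns (slope, intercept, se_slope, p_value).
    """
    if not (len(effects) == len(std_errors) == len(moderator)):
        raise ValueError("inputs must have the same length")
    w = [1.0 / se ** 2 for se in std_errors]
    sw = sum(w)
    x_bar = sum(wi * x for wi, x in zip(w, moderator)) / sw
    y_bar = sum(wi * y for wi, y in zip(w, effects)) / sw

    sxx = sum(wi * (x - x_bar) ** 2 for wi, x in zip(w, moderator))
    if sxx == 0:
        raise ValueError("moderator does not vary across studies")
    sxy = sum(wi * (x - x_bar) * (y - y_bar)
              for wi, x, y in zip(w, moderator, effects))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    se_slope = math.sqrt(1.0 / sxx)
    p = math.erfc(abs(slope / se_slope) / math.sqrt(2.0))  # two-sided
    return slope, intercept, se_slope, p


if __name__ == "__main__":
    # Hypothetical data: moderator is 1 if the study was sponsored by the
    # manufacturer of the first drug, 0 otherwise.
    effects = [-1.0, -2.0, 4.0, 5.5]
    std_errors = [2.0, 2.5, 2.0, 3.0]
    sponsored = [0, 0, 1, 1]
    print(meta_regress(effects, std_errors, sponsored))
```

Because each study contributes a single data point, a regression of this kind has few degrees of freedom, which is one reason moderator effects are hard to detect at the study level.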
Sensitivity Analyses
The results of the extensive sensitivity analyses (pharmaceutical sponsorship, single-blind studies, lower-quality studies, effectiveness studies, CATIE phase 2, first-episode studies, Chinese studies, etc.) did not alter the primary findings (see the online data supplement). The important results are summarized below.
Pharmaceutical sponsorship and study quality
Excluding studies sponsored by pharmaceutical companies (see the online data supplement) or excluding studies with a Jadad quality score <3 did not change the results.
First-episode and treatment-resistant populations
The five first-episode studies showed no difference between second-generation antipsychotics. Most studies of treatment-resistant patients involved clozapine, which was not more efficacious than olanzapine (seven studies, N=570, weighted mean difference=–0.2, 95% CI=–3.6 to 3.2), risperidone (five studies, N=471, weighted mean difference=–1.3, 95% CI=–5.8 to 3.2), or ziprasidone (one study, N=146, weighted mean difference=0.5, 95% CI=–6.7 to 7.7).
Clozapine dose
In studies with mean clozapine dosages above 400 mg/day, clozapine was superior to risperidone (two studies, N=335, weighted mean difference=–6.6, 95% CI=–11.5 to –1.7), but not to olanzapine (two studies, N=154, weighted mean difference=2.4, 95% CI=–2.4 to 7.3). There were no data for other drugs.
Sensitivity analysis using Hedges’ g
The only change when adding studies using scales other than the PANSS (using Hedges’ g) was that amisulpride was significantly more efficacious than risperidone for negative symptoms (see the online data supplement).
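For readers unfamiliar with the statistic, the following sketch shows the standard calculation of Hedges’ g, a standardized mean difference with a small-sample correction that allows studies using different rating scales to be combined. The function name and the example numbers are hypothetical, and the correction factor uses the common approximation J = 1 − 3/(4·df − 1).

```python
import math


def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Standardized mean difference between two arms with small-sample correction.

    Negative values favor arm 1 when lower scores mean fewer symptoms.
    """
    df = n1 + n2 - 2
    if df <= 0:
        raise ValueError("need at least three participants in total")
    # Pooled standard deviation across the two arms.
    pooled_sd = math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df)
    d = (mean1 - mean2) / pooled_sd        # Cohen's d
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)  # approximate small-sample factor
    return correction * d


if __name__ == "__main__":
    # Hypothetical end-point scores on an arbitrary symptom scale.
    print(hedges_g(mean1=10.0, sd1=4.0, n1=50, mean2=12.0, sd2=4.0, n2=50))
```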
Publication bias
Funnel plots failed to detect publication bias. The fail-safe number of unpublished studies that would make the results no longer statistically significant was a median of 19 (see the online data supplement).
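A fail-safe number can be computed in several ways, and this section does not state which one was used. As one possibility, the sketch below implements Rosenthal’s approach, which asks how many unpublished studies with a mean z score of zero would be needed to bring the combined one-tailed result down to p=0.05. The function name and the z scores in the example are hypothetical.

```python
def fail_safe_n(z_scores, z_alpha=1.645):
    """Rosenthal's fail-safe N for a set of study z scores.

    Assumption: one-tailed alpha of 0.05 (z_alpha = 1.645) and null
    studies that average z = 0. Returns 0.0 if the combined result is
    already nonsignificant.
    """
    k = len(z_scores)
    if k == 0:
        raise ValueError("need at least one study")
    n_fs = (sum(z_scores) ** 2) / (z_alpha ** 2) - k
    return max(0.0, n_fs)


if __name__ == "__main__":
    # Hypothetical z scores from three studies of one comparison.
    print(fail_safe_n([2.0, 2.0, 2.0]))
```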
Discussion
This is the first meta-analysis of the efficacy of all second-generation antipsychotics in randomized trials that compare two or more of these agents head-to-head. The main results are summarized in Table 1, and selected side effects are summarized in Table 2. Olanzapine was more efficacious than aripiprazole, quetiapine, risperidone, and ziprasidone, and its efficacy was similar to that of amisulpride and clozapine. Risperidone was less efficacious than olanzapine, but more efficacious than quetiapine and ziprasidone. The pattern of superior outcomes in terms of positive symptoms subscore and dropouts due to inefficacy paralleled that of the PANSS total score, whereas there was little difference on negative symptom score. The findings are similar to those in phase 1 of CATIE (olanzapine > [risperidone=quetiapine=ziprasidone]) for its efficacy measures: time to dropout due to poor efficacy, time with good response, and PANSS total score (4). They are also consistent with meta-analyses comparing second-generation antipsychotics with first-generation antipsychotics (2, 3, 21). In the most recent of these (3), the following pattern emerged: clozapine (Hedges’ g=–0.52), amisulpride (Hedges’ g=–0.31), olanzapine (Hedges’ g=–0.28), and risperidone (Hedges’ g=–0.13) were significantly more efficacious than first-generation antipsychotics, while aripiprazole, quetiapine, ziprasidone, and zotepine were only as efficacious as the older drugs. Thus, the only clear deviation was that we expected better results from clozapine, which was only more efficacious than zotepine, and more efficacious than risperidone in terms of dropout due to inefficacy.
The results were remarkably consistent, although the studies were sponsored by different companies. Our recent analysis (22) using blind raters showed that pharmaceutical sponsors highlight the benefits of their own second-generation antipsychotics in the abstract. In meta-analysis, the outcomes that will be presented are predefined, which avoids the selective presentation of favorable results. Indeed, Davis et al. (23) recently extracted data in a predefined manner and found no industry bias in comparing the numerical results of industry-sponsored and non-industry-sponsored studies. We performed sensitivity analyses and metaregressions with sponsor, dose, study quality, treatment resistance, study origin, and trial duration as moderators and found only a few differences. We also included only blinded randomized trials because we recently found that lack of blinding can clearly bias the results in this competitive area (3).
The analysis of positive and negative symptom subscores on the PANSS suggested that most of the differences between drugs were due to positive symptoms rather than negative symptoms. Many studies selected patients with predominantly positive symptoms, and this design may be less sensitive for detecting negative symptom changes. The results based on dropouts due to insufficient efficacy were by and large consistent with results on the primary outcome measure. The few first-episode studies found no differences between agents. Such patients differ substantially from multiepisode patients, and more research on treatment of this population is needed.
Most studies available for analysis involved olanzapine (N=48), followed by risperidone (N=44), clozapine (N=28), and quetiapine (N=21), while few or no studies compared the other second-generation antipsychotics against one another (see the online data supplement). For example, since amisulpride was not distinguishable in efficacy from olanzapine and risperidone, which is consistent with amisulpride’s better efficacy compared with first-generation antipsychotics (2, 3, 21), comparisons with other second-generation antipsychotics would be essential to verify that it really is a more effective second-generation antipsychotic, as suggested by the open randomized European First-Episode Schizophrenia Trial (24). In the only comparison with ziprasidone at full dosage (sponsored by its manufacturer), amisulpride was given in low doses (100–200 mg/day), and positive symptoms were omitted from the report (25). Nevertheless, amisulpride was superior in terms of dropouts due to inefficacy.
In our opinion, clozapine is a more efficacious drug because it has consistently been shown to be more effective than first-generation antipsychotics (2, 21, 26), as well as other second-generation antipsychotics in CATIE phase 2 and in the Cost Utility of the Latest Antipsychotic Drugs in Schizophrenia Study (CUtLASS), which could not be included here (12, 13, 27). The clozapine group of CATIE phase 2 was a nonblinded study arm (12), and CUtLASS compared clozapine with a number of second-generation antipsychotics as a group. Weekly blood tests in the clozapine group may have improved monitoring (27). Clinical experience, including relapses when patients are switched from clozapine and antisuicidal properties of clozapine, also suggests its superiority (28).
The most likely explanation for not finding clozapine superior here is that most studies used low or very low clozapine dosages; five used dosages under about 210 mg/day, and several had an upper limit of 400 mg/day. In the few studies in which mean dosages exceeded 400 mg/day, clozapine was more efficacious than risperidone. In two pivotal studies (29, 30) demonstrating clozapine’s superiority to first-generation antipsychotics, the mean dosages were 600 mg/day and 523 mg/day. Randomized, blinded dose-finding studies (in which a dosage of 600 mg/day was found to be best [31]) and our review of five controlled plasma-level studies (32) showed that low clozapine dosages do not produce results as good as high dosages. Clozapine requires slow titration and a long enough trial (33), but most studies in our meta-analysis were relatively long, and metaregression/sensitivity analyses of studies lasting at least 3 or 6 months did not show a superiority of clozapine. The sensitivity analysis of treatment-resistant patients failed to show a superiority of clozapine, but the criteria varied and the patients may not have been as refractory as those in the pivotal studies (29, 30). Hardly any studies had a run-in phase to confirm refractoriness.
Meta-analysis can only reflect (and be as good as) the underlying studies. Since 20 of 22 studies did not find clozapine superior (the exceptions were references 17 and 34), a narrative review would have reached similar conclusions. Our finding taken together with the dose-response findings tentatively suggests that relatively high doses of clozapine might be needed for some patients. A sufficiently powered double-blind independent study with high dosages in treatment-refractory patients is needed.
Many limitations of meta-analyses must be considered in assessing our findings. In essence, meta-analysis is an observational technique based on originally controlled studies. The studies are a sample from a hypothetical population of studies, and all depends on how well they represent the population. Meta-analysis summarizes similar but not identical studies, and despite extensive sensitivity analyses, the “averaging apples with oranges” (heterogeneity) problem can never be fully ruled out. For example, we combined studies using different statistics (e.g., last observation carried forward, mixed models). Missing standard deviations had to be estimated from the other studies. Adjustments for multiple testing are usually not made in meta-analyses, but because of the many tests applied, a number of findings may well be due to chance alone, although most were consistent across measures (PANSS total and positive symptom scores, dropout due to inefficacy). In contrast, many comparisons of the less studied second-generation antipsychotics were underpowered. The statistical power of metaregression is especially weak, one reason being that it analyzes the influence of moderators at the level of studies rather than of patients.
What are the implications of our findings for the choice of drug? First, we must consider the magnitude of the efficacy differences. We found the effect sizes to range between 1.9 (olanzapine versus risperidone) and 8.3 (olanzapine versus ziprasidone) PANSS points, or 0.11 to 0.29 effect size units. These effect sizes are small to medium differences according to Cohen’s classification (35). The clinical relevance of the difference between olanzapine and risperidone (effect size 0.11) based on a large sample size (N=2,404) is especially doubtful. For perspective, the average difference between second-generation antipsychotics and placebo in a recent meta-analysis was only 10 PANSS points and the effect size was 0.51 (20). To a certain extent, the advantage of one over another second-generation antipsychotic can be viewed in the context of the antipsychotic-placebo difference of 10 PANSS points, but we would hasten to add that this is an indirect and imprecise comparison, particularly considering the differences between active-control trials and placebo-control trials (e.g., reference 36). On the other hand, schizophrenia afflicts patients for life, and even a small benefit may be important.
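The comparison with the antipsychotic-placebo difference can be expressed as simple arithmetic. The sketch below takes the significant PANSS total score differences reported in the Results and divides each by the 10-point drug-placebo difference cited above. The ratio is only an illustrative device introduced here, not an analysis from the study, and it inherits the caveat already stated: it rests on an indirect and imprecise comparison across different trial designs.

```python
# Drug-placebo difference in PANSS points cited in the text (reference 20).
PLACEBO_DIFFERENCE = 10.0

# Absolute weighted mean differences (PANSS points) for the significant
# head-to-head comparisons reported in the Results; the first drug of each
# pair was the more efficacious one.
SIGNIFICANT_DIFFERENCES = {
    ("olanzapine", "aripiprazole"): 5.0,
    ("olanzapine", "quetiapine"): 3.7,
    ("olanzapine", "risperidone"): 1.9,
    ("olanzapine", "ziprasidone"): 8.3,
    ("risperidone", "quetiapine"): 3.2,
    ("risperidone", "ziprasidone"): 4.6,
    ("clozapine", "zotepine"): 6.0,
}


def share_of_placebo_difference(points, reference=PLACEBO_DIFFERENCE):
    """Express a PANSS point difference as a fraction of the reference."""
    return points / reference


if __name__ == "__main__":
    for (better, worse), points in SIGNIFICANT_DIFFERENCES.items():
        share = share_of_placebo_difference(points)
        print(f"{better} vs {worse}: {points:.1f} points ({share:.0%} of 10)")
```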
Second, exploding health care costs are making the price of medication important, and some second-generation antipsychotics (amisulpride, risperidone, and zotepine), as well as the first-generation antipsychotics, are now off patent.
Third, side effects are extremely important in choice of drug, because the efficacy of a medication can be interpreted only in the context of its adverse effect profile. There are many side effects that we would have to analyze in 36 possible drug comparisons. Some of these side effects are measured in several ways, and we would have to do a number of metaregressions. It would not be possible to present the resulting information in one paper. Rare side effects would require epidemiological databases. The large differences in side effects among second-generation antipsychotics are well known. As summarized in the APA Practice Guideline for the Treatment of Patients With Schizophrenia, Second Edition (18), clozapine and olanzapine are the most likely to lead to weight gain and glucose and lipid abnormalities (see Table 2), followed by quetiapine and then risperidone (see also references 7, 37, 38). Aripiprazole and ziprasidone are relatively benign. There can be dramatic consequences, such as type 2 diabetes, diabetic ketoacidosis, heart attack, and stroke. Extrapyramidal symptoms and tardive dyskinesia are important (see reference 39 for a review). Amisulpride and risperidone carry a risk of some dose-related extrapyramidal symptoms and substantial increase in prolactin levels (more so than haloperidol [40]), which can be associated with osteoporosis, galactorrhea, amenorrhea, and sexual dysfunction. The use of clozapine is restricted to refractory patients because of the risk of agranulocytosis, but it also carries a risk of seizures, anticholinergic effects, constipation, sedation, postural hypotension, hypersalivation, myocarditis, and pancreatitis. Sertindole and ziprasidone carry the strongest risk of QTc prolongation among second-generation antipsychotics. Large differences in side effects are often more important than small efficacy differences for individual patients. The clinician should also consider the patient’s past treatment responses, including efficacy and adverse events, the patient’s preferences, and medical vulnerabilities, including family history. For example, olanzapine would not be a first choice for a patient with a vulnerability to diabetes, and the high risk of weight gain does not make it a drug for every patient. Finally, there are substantial differences between individual patients in how they respond to these drugs. The balancing of efficacy and side effects must be tailored to the individual patient, the setting, and the health system.