Results
212 studies reported between October, 1955, and September, 2012, with 43049 participants, were included in the analysis (details of included studies are shown in
appendix pp 41–65; PRISMA
41 flowcharts are shown in
appendix pp 70–76). The mean duration of illness was 12·4 years (SD 6·6) and the mean age of trial participants was 38·4 years (SD 6·9). Nine studies exclusively examined first-episode patients. In terms of study quality, the reports often did not provide details about randomisation procedures and allocation concealment (
appendix pp 77–84); however, 144 studies (68%) were done by pharmaceutical companies, which (in those cases in which they responded to our requests for information) had used appropriate methods throughout. 13 studies were single-blinded (with allocation concealed from assessors), and the rest (199, 94%) were double-blinded, but few details were reported about the methods of concealment or how successful they were. Our analysis accorded with the known high numbers of withdrawals in clinical studies of schizophrenia (35% overall for the studies included in our analysis); the effect of withdrawals was examined by meta-regression. The main reason for selective reporting was that the use of antiparkinson drugs was often not reported. The networks of eligible comparisons are shown in
figure 1 and in the
appendix (pp 85–91). The results of the direct comparisons for all outcomes are shown in the
appendix (pp 92–96).
We created hierarchies of effect size on the basis of SUCRA rankings for all outcomes.
Figures 2 and
3 show these results for overall efficacy (
appendix pp 97–104). Most of the differences between drugs are gradual rather than discrete. As a rule of thumb, Cohen
42 has suggested that a standardised mean difference of −0·2 is small, −0·5 medium, and −0·8 large. All drugs were superior to placebo (range of mean effect sizes −0·33 to −0·88;
figure 3), and clozapine was significantly more effective than all the other drugs (
figure 2). After clozapine, amisulpride, olanzapine, and risperidone were significantly more effective than the other drugs apart from paliperidone and zotepine. These effect sizes were small (range −0·11 to −0·33;
figure 2).
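As a rough illustration of how a SUCRA-based hierarchy can be derived, the following minimal sketch computes SUCRA from a treatment's rank probabilities by averaging its cumulative ranking probabilities over all ranks except the last. The function name, the treatment labels, and the rank probabilities are hypothetical and are not taken from the study data.

```python
def sucra(rank_probabilities):
    """Return SUCRA for one treatment.

    rank_probabilities[j] is the probability that the treatment holds
    rank j + 1, where rank 1 is the best rank.
    """
    n_treatments = len(rank_probabilities)
    if n_treatments < 2:
        raise ValueError("SUCRA needs at least two competing treatments")
    if abs(sum(rank_probabilities) - 1.0) > 1e-9:
        raise ValueError("rank probabilities must sum to 1")

    cumulative = 0.0
    area = 0.0
    # Sum the cumulative probabilities of ranks 1 .. a-1; the last rank is
    # omitted because its cumulative probability is always 1.
    for probability in rank_probabilities[:-1]:
        cumulative += probability
        area += cumulative
    return area / (n_treatments - 1)


# Hypothetical rank probabilities for three treatments (not study data).
example = {
    "drug A": [0.7, 0.2, 0.1],
    "drug B": [0.2, 0.5, 0.3],
    "placebo": [0.1, 0.3, 0.6],
}

# Higher SUCRA means a better position in the hierarchy.
hierarchy = sorted(example, key=lambda name: sucra(example[name]), reverse=True)
```

A SUCRA of 1 corresponds to a treatment that is certain to be the best, and a SUCRA of 0 to one that is certain to be the worst.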
All-cause discontinuation was used as a measure of acceptability. All drugs were significantly better than placebo apart from zotepine (
figure 2,
figure 4A). ORs and NNTs ranged from 0·43 and 6 for amisulpride to 0·80 and 20 for haloperidol. Amisulpride (range of significant mean ORs 0·53–0·71; NNTs 8–14), olanzapine (0·58–0·76; 9–17), clozapine (0·57–0·67; 9–12), paliperidone (0·60–0·71; 9–14), and risperidone (0·66–0·78; 11–18) had significantly lower all-cause discontinuation than several other drugs. Haloperidol was worse than quetiapine (OR 1·32; NNT 15) and aripiprazole (OR 1·33; NNT 15;
figure 2; for NNTs and NNHs see
appendix pp 133–39).
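To show how an OR relates to an NNT, the sketch below converts an odds ratio into a number needed to treat for an assumed baseline risk in the comparison group. The baseline risk is an assumption: the 35% overall withdrawal rate reported above is used purely for illustration, and the baseline risks and rounding behind the published NNTs are not stated in this section. The function names are hypothetical.

```python
def risk_from_odds_ratio(odds_ratio, control_risk):
    """Event risk in the treated group implied by an OR and a control risk."""
    if not 0.0 < control_risk < 1.0:
        raise ValueError("control_risk must lie strictly between 0 and 1")
    control_odds = control_risk / (1.0 - control_risk)
    treated_odds = odds_ratio * control_odds
    return treated_odds / (1.0 + treated_odds)


def nnt_from_odds_ratio(odds_ratio, control_risk):
    """Unrounded NNT (or NNH when the OR exceeds 1) for a given control risk."""
    treated_risk = risk_from_odds_ratio(odds_ratio, control_risk)
    absolute_risk_difference = abs(control_risk - treated_risk)
    if absolute_risk_difference == 0.0:
        raise ValueError("an OR of 1 gives no risk difference, so no finite NNT")
    return 1.0 / absolute_risk_difference


# Illustrative baseline only (assumption): 35% all-cause discontinuation.
ASSUMED_CONTROL_RISK = 0.35

nnt_low_or = nnt_from_odds_ratio(0.43, ASSUMED_CONTROL_RISK)
nnt_high_or = nnt_from_odds_ratio(0.80, ASSUMED_CONTROL_RISK)
```

With this illustrative baseline, the two odds ratios give unrounded values of roughly 6·2 and 20·4. Because the NNT depends on the baseline risk, the same OR yields a different NNT in a population with a different discontinuation rate.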
Apart from haloperidol, ziprasidone, and lurasidone, all drugs produced more weight gain than placebo (
figures 4B,
5). Olanzapine produced significantly more weight gain than most other drugs, followed by zotepine (
figure 5). Clozapine, iloperidone, chlorpromazine, sertindole, quetiapine, risperidone, and paliperidone produced significantly more weight gain than haloperidol, ziprasidone, lurasidone, aripiprazole, amisulpride, and asenapine (with the exception that asenapine did not differ significantly from paliperidone). Standardised mean differences for these comparisons ranged from −0·18 to −0·57 (
figure 5). Other differences were not statistically significant apart from iloperidone causing more weight gain than paliperidone, risperidone, and quetiapine (
figure 5).
Clozapine, sertindole, olanzapine, quetiapine, aripiprazole, iloperidone, amisulpride, and asenapine did not cause significantly more extrapyramidal side-effects than placebo. The ranges of mean ORs and NNHs for the other drugs were 1·61–4·76 and 3–11, respectively (
figure 4C). Clozapine produced fewer extrapyramidal side-effects than all other drugs and placebo (mean ORs 0·06–0·40; NNTs 5–9), and was followed in ranking by sertindole, olanzapine, and quetiapine (
figure 5, for NNTs see
appendix pp 133–39). Haloperidol caused significantly more extrapyramidal side-effects than the other drugs apart from zotepine and chlorpromazine, for which the differences were not significant (mean ORs 0·06–0·52; NNHs 5–11; in favour of other drugs). Zotepine, chlorpromazine, lurasidone, risperidone, and paliperidone were among the least well tolerated drugs, because they produced significantly more extrapyramidal side-effects than several others in the analysis (
figure 5).
Aripiprazole, quetiapine, asenapine, chlorpromazine, and iloperidone did not cause significantly increased prolactin concentrations compared with placebo (
figure 4D). Paliperidone and risperidone were associated with significantly more prolactin increase than all other drugs including haloperidol, and haloperidol was associated with significantly more than the rest apart from chlorpromazine and sertindole (
figure 6). Clozapine and zotepine could not be included in the analysis, because the one direct comparison between them (ie, with each other) was not linked with any other drug in the network (standardised mean difference −1·23, 95% CrI −1·8 to −0·64, in favour of clozapine; n=52).
44 No usable data were available for amisulpride.
Lurasidone, aripiprazole, paliperidone, and asenapine were not associated with significant QTc prolongation compared with placebo (
figure 4E). The standardised mean differences of the other drugs compared with placebo ranged from marginal (0·11, haloperidol) to large (0·90, sertindole). Results for the comparisons between drugs with respect to QTc prolongation are shown in
figure 6.
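For readers unfamiliar with standardised mean differences, the sketch below shows one common way to compute them: the difference in group means divided by the pooled standard deviation, with Hedges' small-sample correction. Whether the analysis used exactly this estimator is an assumption here, and the function name and input values are hypothetical.

```python
import math


def hedges_g(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardised mean difference (group A minus group B), Hedges' g."""
    degrees_of_freedom = n_a + n_b - 2
    if degrees_of_freedom <= 0:
        raise ValueError("need more than two participants in total")
    # Pooled standard deviation, weighting each group by its degrees of freedom.
    pooled_variance = (
        (n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2
    ) / degrees_of_freedom
    cohen_d = (mean_a - mean_b) / math.sqrt(pooled_variance)
    # Approximate small-sample correction factor.
    correction = 1.0 - 3.0 / (4.0 * degrees_of_freedom - 1.0)
    return cohen_d * correction


# Hypothetical trial arms (not study data): mean change, SD, sample size.
g = hedges_g(mean_a=10.0, sd_a=4.0, n_a=50, mean_b=8.0, sd_b=4.0, n_b=50)
```

Because the result is expressed in standard deviation units, outcomes measured on different scales can be compared across trials.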
Amisulpride, paliperidone, sertindole, and iloperidone were not significantly more sedating than placebo (
figure 4F). For the other drugs compared with placebo, mean ORs and NNHs ranged from 1·84 and 10 (aripiprazole) to 8·82 and 2 (clozapine). ORs for the comparisons between drugs with respect to sedation are shown in
figure 7 and their NNTs are shown in the
appendix (pp 133–39).
The assumption of consistency was generally supported by a better trade-off between model fit and complexity when consistency was assumed than when it was not (
appendix pp 105–14). Significant disagreement between direct and indirect estimates (inconsistency) was identified in only very few cases: for efficacy seven of 80 loops; for all-cause discontinuation three of 80 loops; for weight gain one of 62 loops; for extrapyramidal side-effects one of 56 loops; for prolactin increase three of 44 loops; for QTc prolongation two of 35 loops; and for sedation none of 49 loops were inconsistent (
appendix pp 105–14). Data were double-checked and we could not identify any important variable that differed across comparisons in these loops. The number of included studies in the inconsistent loops was typically small, so the extent of inconsistency was not substantial enough to change the results.
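The loop counts reported above can be expressed as proportions with a short tally such as the following sketch. The counts are those given in the text; the variable names are hypothetical, and the pooled share across outcomes is a descriptive summary only, since loops for different outcomes are not independent.

```python
# (inconsistent loops, total loops) per outcome, as reported in the text.
loops = {
    "efficacy": (7, 80),
    "all-cause discontinuation": (3, 80),
    "weight gain": (1, 62),
    "extrapyramidal side-effects": (1, 56),
    "prolactin increase": (3, 44),
    "QTc prolongation": (2, 35),
    "sedation": (0, 49),
}

# Share of inconsistent loops for each outcome.
shares = {
    outcome: inconsistent / total
    for outcome, (inconsistent, total) in loops.items()
}

# Descriptive totals across all outcomes.
total_inconsistent = sum(inconsistent for inconsistent, _ in loops.values())
total_loops = sum(total for _, total in loops.values())
overall_share = total_inconsistent / total_loops

for outcome, (inconsistent, total) in loops.items():
    print(f"{outcome}: {inconsistent}/{total} = {shares[outcome]:.1%}")
print(f"all outcomes: {total_inconsistent}/{total_loops} = {overall_share:.1%}")
```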
Results for efficacy and extrapyramidal side-effects were robust against the sensitivity and meta-regression analyses (
appendix pp 115–32). The most notable exceptions were that the relative efficacy of asenapine increased from the 13th to the seventh rank when placebo comparisons were removed. A large, failed study had driven its primary result, so asenapine was also more effective (ninth rank) when such trials were excluded. Haloperidol doses lower than 12 mg per day (or 7·5 mg per day) caused significantly fewer extrapyramidal side-effects than did higher doses, but still more than any other antipsychotic drug; for the efficacy outcome, lower doses of haloperidol did not significantly differ from higher doses. Chlorpromazine doses higher than 600 mg per day (or 500 mg per day) were associated with higher efficacy (sixth rank) than lower doses (14th rank), with little difference in extrapyramidal side-effects. Small studies tended to show higher efficacy of the active interventions compared with placebo (regression coefficient=1·31, 95% CrI 0·58–2·03). However, this had only a small effect on the ranking of the treatments (
appendix pp 115–32). None of the other meta-regression or sensitivity analyses led to any important changes in the efficacy and extrapyramidal side-effect hierarchies (
appendix pp 115–32).
Discussion
Our multiple-treatments meta-analysis provides evidence-based hierarchies for the efficacy and tolerability of antipsychotic drugs, overcoming the major limitation of conventional pairwise meta-analyses.
3,4,21 Results for our primary outcome challenge the dogma that the efficacy of all antipsychotic drugs is the same. This notion originated from an influential narrative review published in 1969,
45 but it has not been scientifically addressed since.
The efficacy hierarchy generated by our analysis was robust against many sources of bias, including various analyses related to dose. In particular, findings from pairwise meta-analyses
3–5 suggested that some, but not all, second-generation antipsychotics were more effective than haloperidol, but these findings have been criticised for differences in haloperidol doses used by the included studies, which might have affected the efficacy outcomes.
35 However, the fact that exclusion of all haloperidol comparisons in our analysis did not affect the efficacy hierarchy refutes this criticism. The FDA still requires placebo-controlled trials for all new antipsychotic drugs. Increasing placebo response in such trials is a concern,
36 but exclusion of all placebo comparisons did not change the results much in our analysis, apart from asenapine turning out more effective than in the primary analysis. That the four most effective second-generation antipsychotic drugs were the first to be developed could also suggest a cohort effect in terms of changes in study populations. However, two meta-regression analyses (one with publication year as a continuous moderator and the other comparing the results of trials published in the past 15 years with those published earlier) did not change the efficacy hierarchy to an important extent. The example of paliperidone (approved by the FDA in 2007), which is the active metabolite of risperidone (approved by the FDA in 1993), and has essentially the same receptor-binding profile,
46 also contradicts this suggestion, because both drugs ranked next to each other in most domains (apart from sedation and QTc prolongation) and because paliperidone was more effective than several antipsychotic drugs that had been developed previously (
figures 2,
3).
We emphasise that the differences in efficacy between drugs were small (standardised mean differences 0·11–0·55, median 0·24), and smaller overall than those for side-effects. However, for perspective, the efficacy differences compared with placebo were of only medium size (0·33–0·88, median 0·44), so the differences in efficacy between drugs are possibly substantial enough to be clinically important. Finally, because most clozapine studies were done in refractory patients, clozapine is thought to be superior only in this subtype, but in our analysis of non-refractory patients it was also more effective than all the other drugs. However, this result has the limitation that it was mainly based on older comparisons of clozapine with first-generation drugs. As in our previous conventional meta-analysis,
47 clozapine was not more effective than any other second-generation antipsychotic in direct pairwise comparisons (
appendix pp 92–96). A European Union-funded study to examine the early use of clozapine in first-episode patients is underway.
All-cause discontinuation has previously been used as a measure for the acceptability of treatments, because it encompasses efficacy and tolerability.
7,8 In our analysis, the results paralleled the efficacy findings in that the most effective drugs also had the lowest discontinuation rates (although haloperidol, the worst drug with respect to all-cause discontinuation, had a middle rank for efficacy). In randomised controlled trials in patients with schizophrenia, more participants withdraw because of inefficacy (40% overall for the studies included in our analysis) than because of side-effects (17%; other reasons for withdrawal were not assessed),
36 and some evidence suggests that patients prioritise efficacy over tolerability.
37 We have used the neutral term all-cause discontinuation, because clinicians might intuitively associate the word acceptability more with tolerability than with efficacy.
Haloperidol caused the most extrapyramidal side-effects, followed by zotepine and chlorpromazine. Chlorpromazine did not produce significantly more extrapyramidal side-effects than did most second-generation antipsychotics. Haloperidol doses lower than 7·5 mg per day (the lowest dose in multiple-episode patients was 4 mg per day) produced outcomes for efficacy and extrapyramidal side-effects similar to those of higher doses. However, five second-generation drugs were associated with significantly more extrapyramidal side-effects than was placebo. These findings show that extrapyramidal side-effects cannot be used for a dichotomous classification into first-generation and second-generation antipsychotics. Curiously, clozapine was associated with less use of antiparkinson drugs than was placebo. Abrupt withdrawal of prestudy treatment and washout phases that are too short (sometimes only 48 h) can lead to rebound and carry-over extrapyramidal side-effects.
48 Furthermore, involuntary movements are present in 9–17% of antipsychotic drug-naive people with schizophrenia.
49 Clozapine has a low intrinsic risk of extrapyramidal side-effects and might suppress both of these effects.
Weight gain and associated metabolic problems are regarded as the major issues associated with new antipsychotic drugs. Indeed, olanzapine, zotepine, and clozapine were the worst in this respect, and some guidelines recommend against the first-line use of olanzapine for first-episode patients.
12 However, ziprasidone and lurasidone (along with haloperidol) were the only antipsychotic drugs without significantly more weight gain than placebo in adults. By contrast, chlorpromazine was among the worst drugs in this respect. This finding shows that sedating, low-potency, first-generation antipsychotics also cause weight gain, and that a dichotomy between first-generation and second-generation antipsychotics based on weight gain is another oversimplification.
3
Sedation is unpleasant for patients. Overall, our results with respect to sedation were reasonable, and direct and indirect comparisons were consistent. For example, clozapine and chlorpromazine are certainly sedating drugs; the good results for amisulpride can be accounted for by the absence of blockade of histaminergic receptors associated with sedation; and the small sedative effects of paliperidone can possibly be accounted for by its slow-release mechanism limiting plasma peaks after ingestion. Although the highest ORs for sedation were almost twice as high as the highest for extrapyramidal side-effects, sedation is sometimes transient and is measured only by spontaneous reports, and the potential confounder of concomitant use of benzodiazepines in the studies should not be ignored.
QTc prolongation can lead to life-threatening torsades de pointes.
50 The antipsychotic drugs assessed differed enormously with respect to this outcome, with some not differing from placebo, and one (sertindole) being almost one standard deviation worse. Indeed, sertindole was associated with increased cardiac mortality compared with risperidone in a large, pragmatic, randomised controlled trial
51 (n=9858, all-cause mortality not different). In another study,
52 no difference in frequency of sudden death was seen between ziprasidone (the third worst drug in our analysis) and olanzapine (n=18154).
51 We emphasise that amisulpride was regarded as benign in some guidelines,
13 but our findings show that it might not be—a result that is consistent with an analysis of amisulpride overdoses.
50 This result has the limitation that the evidence is indirectly derived from two comparisons with olanzapine, since direct comparisons with placebo were not available (
appendix pp 92–96). QTc data were not available for the older drugs (clozapine, chlorpromazine, and zotepine).
Prolactin increase can be associated with several side-effects such as amenorrhoea, galactorrhoea, sexual dysfunction, and osteoporosis; a possible association with breast cancer has also been discussed, but the link is not proven.
53 The causes of some of these side-effects are multifactorial; for example, decreased libido can also be the expression of schizophrenic negative symptoms, and osteoporosis can be caused by immobility in schizophrenia. However, the differences between drugs with respect to this outcome were large. For example, paliperidone and risperidone increased prolactin by more than one standard deviation compared with placebo; aripiprazole reduced prolactin (although not significantly) because of its partial dopamine agonist properties. Despite the collaboration of its manufacturer, no usable data on amisulpride were available, but its high prolactin risk is well known.
54
Our study has several limitations. The network could be expanded to old drugs such as perphenazine and sulpiride, which have had good results in effectiveness studies,
55,56 but only a few relevant perphenazine trials have been done.
57 As more and more second-generation antipsychotics are losing their patent protection, the debate about the costs of the original second-generation antipsychotics becomes less important. The present debate is about whether the newest drugs are cost-effective. These new drugs do have favourable properties, such as acceptable weight gain (especially lurasidone and asenapine, and to a lesser extent iloperidone and paliperidone). De Hert and colleagues
22,58 additionally noted that these drugs might be fairly benign with respect to increases of lipids and glucose, which partly correspond to weight gain. Reporting of side-effects is unsatisfactory in randomised controlled trials in patients with psychiatric disorders,
59 and some side-effects were not recorded at all for some drugs (
figure 4). The meta-regression with percentage of withdrawals as a moderator could not rule out all potential bias associated with high attrition in schizophrenia trials.
Our findings cannot be generalised to young people with schizophrenia, patients with predominant negative symptoms, refractory patients, or stable patients, all of whom were excluded to enhance homogeneity as required by multiple-treatments meta-analysis. A funnel-plot asymmetry was seen, which is not necessarily the expression of publication bias, but rather of higher efficacy in small trials than in larger ones, for various reasons.
24 For example, sample size estimates for drugs with low efficacy might have needed higher numbers of participants to attain statistical significance than in trials with more effective drugs. However, accounting for trial size did not substantially change the rankings. Finally, because multiple-treatments meta-analysis requires reasonably homogeneous studies, we had to restrict ourselves to short-term trials. Because schizophrenia is often a chronic disorder, future multiple-treatments meta-analyses could focus on long-term trials,
60 but these remain scarce.
3 In any case, it is important for clinicians to know which drugs patients are most likely to respond to within a reasonable duration, such as 6 weeks.
Antipsychotic drugs differ in many properties and therefore cannot be categorised into first-generation and second-generation groupings. The suggested hierarchies in seven major domains should help clinicians to adapt choice of antipsychotic drug to the needs of individual patients, and should lead to modification of clinical practice guidelines.