Since their introduction in the 1990s, second-generation antipsychotics have become the drugs of choice in the treatment of schizophrenia at a cost of over $10 billion annually in the United States, 75% of which is paid through Medicaid (1). Studies involving patients with chronic schizophrenia reported that these medications are more effective and have fewer side effects and a lower risk of hospitalization than older drugs, generating sufficient savings to offset greater drug costs (2–5). However, many of these studies were based on nonexperimental designs, and a small number of randomized trials have shown either smaller net savings (6, 7) or increased total costs (8). Two recent 12-month trials failed to find advantages for the newer drugs in clinical effectiveness, parkinsonian side effects, or cost (8, 9), and an economic analysis showed increased costs to the California Medicaid program in association with the introduction of these medications (1).
To further evaluate these agents from a public health perspective, the National Institute of Mental Health (NIMH) initiated the Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE) investigation. The CATIE study used an experimental study design to compare the effectiveness of one first-generation antipsychotic (perphenazine) and all four second-generation antipsychotics (olanzapine, risperidone, quetiapine, and ziprasidone), other than clozapine, that were available in the United States in January 2002. A report on the primary clinical outcomes from the CATIE study, considering only treatment with the initial randomly assigned drug (phase 1), found that patients receiving olanzapine 1) stayed on their medicine longer than others, 2) were less likely to switch drugs for lack of efficacy, 3) had lower symptom levels during the early months of the trial, and 4) had fewer hospitalizations for schizophrenia (10). None of the other three second-generation drugs showed greater efficacy or tolerability than the first-generation drug perphenazine. Weight gain with olanzapine was substantial, averaging 2 pounds per month, with concomitant increases in hemoglobin A1C, cholesterol, and triglycerides.
The present analysis of CATIE data presents pairwise comparisons of the five treatments on measures of health costs and symptoms and on several measures of effectiveness that address health-related quality of life. The primary outcomes are total health costs and quality-adjusted life year (QALY) ratings. The primary objective of the cost-effectiveness component of the CATIE study was to compare alternative intention-to-treat strategies, i.e., to determine whether choice of the first drug in the CATIE algorithm resulted in differences in cost-effectiveness over the entire study. An analysis was also planned to compare the cost-effectiveness of treatments exclusively while patients were receiving the initially assigned treatment.
Method
Study Setting and Design
The CATIE investigation was conducted between January 2001 and December 2004 at 56 U.S. sites and included an algorithmically determined series of treatment phases. Patients were initially assigned to olanzapine, perphenazine, quetiapine, risperidone, or ziprasidone under double-blind conditions. However, patients with tardive dyskinesia (15% of the sample) were prohibited from assignment to perphenazine and thus were not available for comparisons involving that drug, limiting the generalizability of perphenazine comparisons to patients without pre-existing tardive dyskinesia. Patients who discontinued their first treatment were invited to receive other second-generation antipsychotics, including clozapine if they so desired. Open treatment was also offered to patients who refused a second blind assignment or whose treatment failed after a second assignment (phase 3), when a small number chose first-generation antipsychotics.
Participants
The study was approved by an institutional review board at each site. Patients 18–65 years of age with a diagnosis of schizophrenia (11) who were able to take oral antipsychotic medication were eligible. Patients or their guardians provided written informed consent. Patients were excluded if they had a diagnosis of schizoaffective disorder, mental retardation, or other cognitive disorders; an unstable serious medical condition; past adverse reactions to a proposed treatment; or treatment-resistant schizophrenia or if they were in their first episode of schizophrenia, pregnant, or breast-feeding.
Interventions
Identical capsules contained olanzapine (7.5 mg), quetiapine (200 mg), risperidone (1.5 mg), perphenazine (8 mg), or ziprasidone (40 mg). Ziprasidone was approved for use by the Food and Drug Administration (FDA) during the trial and was added in January 2002, after 40% of the subjects had been recruited. Medications were flexibly dosed with one to four capsules daily, as judged by the study doctor. Concomitant medications were permitted, except for additional antipsychotic agents. Further details about blinding, later phases of treatment, and modal dosing have been presented elsewhere (10, 12).
Outcome Measures
This study followed measurement and analytic methods recommended by the U.S. Public Health Service Task Force on Cost-Effectiveness in Health and Medicine (13). The primary outcomes were total health costs and quality-adjusted life year (QALY) ratings.
Costs
The economic perspective addressed total health costs (health service use plus medications), which were estimated by multiplying the number of units of each type of service received by the estimated local unit cost of that service, and then summing the products across different services.
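The costing rule described above (units of each service multiplied by its unit cost, then summed across services) can be sketched in a few lines. This is a minimal illustration, not the study's costing code: the function name `total_health_cost`, the service names, the unit counts, and the unit costs are all hypothetical placeholders.

```python
from typing import Mapping


def total_health_cost(units: Mapping[str, float],
                      unit_costs: Mapping[str, float]) -> float:
    """Sum (units received x unit cost) over every type of service."""
    missing = set(units) - set(unit_costs)
    if missing:
        # Refuse to cost a service silently at zero.
        raise KeyError(f"no unit cost available for: {sorted(missing)}")
    return sum(count * unit_costs[service] for service, count in units.items())


# Hypothetical month of service use for one patient (illustrative values only).
units = {"psychiatric_inpatient_day": 2, "outpatient_visit": 3, "er_visit": 1}
unit_costs = {"psychiatric_inpatient_day": 600.0,
              "outpatient_visit": 90.0,
              "er_visit": 250.0}

monthly_cost = total_health_cost(units, unit_costs)
print(f"Total monthly health cost: ${monthly_cost:,.2f}")
```

Raising an error for a service without a unit cost is a design choice made for this sketch; it keeps an unpriced service from being counted as free.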
Service use was documented every month through a self-report questionnaire that recorded four kinds of hospital days (medical, surgical, psychiatric, and substance abuse) across six different facilities (state mental hospitals, private psychiatric hospitals, VA hospitals, non-federal general hospitals, community mental health centers, and detoxification facilities). Nights spent in nursing homes, halfway houses, board and care homes, and supervised apartments were also recorded. Use of 16 types of outpatient mental health care, including psychiatric and psychosocial rehabilitation services, was documented along with eight different types of medical or surgical outpatient visits and use of both psychiatric and medical emergency room services.
Unit costs of these services were estimated from published reports and administrative data sets and are presented in a supplement that accompanies the online version of this article. Antipsychotic medication costs were based on published wholesale prices for the specific capsule strengths used in the CATIE study (14), adjusted downward for discounts and rebates affecting patients whose medication costs would have been paid by Medicaid (with costs about 25% less than wholesale prices) (15) or by the Department of Veterans Affairs (VA) (40% less than wholesale prices) (16). Costs of over 200 different ancillary medications were estimated on the basis of average daily medication costs for specific agents in the 2002 MarketScan® data set, representing typical medication costs for insured patients (17). The unit of analysis for cost evaluation is the total average health cost per month, including costs of all health service use, ancillary drugs, and study medications at the prescribed doses.
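The payer-specific price adjustment can be illustrated as follows. This is a sketch under stated assumptions: the discount fractions are the approximate figures given in the text (about 25% for Medicaid, 40% for the VA), while the payer labels, the function name `discounted_price`, and the example wholesale price are hypothetical.

```python
# Approximate discounts off wholesale price, as described in the text.
# The "other" category (no discount) is an assumption made for this sketch.
DISCOUNT_BY_PAYER = {"medicaid": 0.25, "va": 0.40, "other": 0.0}


def discounted_price(wholesale_price: float, payer: str) -> float:
    """Wholesale price reduced by the payer's assumed discount fraction."""
    if payer not in DISCOUNT_BY_PAYER:
        raise ValueError(f"unknown payer: {payer!r}")
    return wholesale_price * (1.0 - DISCOUNT_BY_PAYER[payer])


# Hypothetical wholesale price per capsule (illustrative only).
for payer in ("medicaid", "va", "other"):
    print(payer, round(discounted_price(10.00, payer), 2))
```

Applying a single payer's discount to every patient, rather than each patient's own payer, gives the kind of sensitivity analysis in which all drug costs are priced at Medicaid or VA rates.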
Effectiveness
Cost-effectiveness analysis requires a single measure of health-related quality of life that reflects both health gains and health losses due to side effects. The U.S. Public Health Service Task Force on Cost-Effectiveness in Health and Medicine (13) specifically recommended that health states be expressed as QALY ratings, in which a year of life is rated on a cardinal scale from 0 (worst possible health) to 1 (perfect health), as evaluated by members of the general public.
A recent series of studies (18–20) has demonstrated a method for evaluating QALY ratings in schizophrenia. First, a factor analysis of Positive and Negative Syndrome Scale (PANSS) data gathered from a sample of almost 400 patients (21) was used to identify positive, negative, and cognitive factors on the PANSS. Then cluster analysis was used to identify eight disease-specific health states on the basis of these three factors. With input from expert clinicians, PANSS subscale scores for each health state were used to develop script and video materials to convey to lay individuals, representing the general public, the health impairments experienced with each schizophrenia state, as well as with five commonly co-occurring side effects (orthostatic hypotension, weight gain, tardive dyskinesia, pseudoparkinsonism, and akathisia) (18). Using these eight video presentations, the states were rated by 620 members of the general public using the standard gamble, the recommended method for determining QALY ratings (13). QALY ratings for the eight schizophrenia health states ranged from 0.44 to 0.88, while side effect weights ranged from a low of 0.857 for tardive dyskinesia through 0.959 for weight gain and 1.0 when a side effect was not present. The final QALY rating estimate is the product of the QALY rating for the schizophrenia state and the QALY ratings for each side effect. Following the recommendations of the U.S. Public Health Service Task Force (13), this measure represents the health state of each subject on symptoms and side effects weighted for societal preferences (outcome measures based on individual preferences and their analysis are described in the supplement that accompanies the online version of this article). The Patient Perspective contains a clinical description of a typical patient in the study and describes the major features on which the QALY ratings are based.
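The multiplicative rule for the final QALY rating can be written out as a short worked example. The two side effect weights below are the values quoted in the text; pairing them with a health state rated 0.88 is a hypothetical combination chosen only for illustration, and the function name `qaly_rating` is an assumption of this sketch.

```python
from math import prod
from typing import Iterable


def qaly_rating(state_rating: float, side_effect_weights: Iterable[float]) -> float:
    """Product of the health-state rating and one weight per side effect.

    A side effect that is not present contributes a weight of 1.0.
    """
    weights = list(side_effect_weights)
    for value in [state_rating, *weights]:
        if not 0.0 <= value <= 1.0:
            raise ValueError("ratings and weights must lie between 0 and 1")
    return state_rating * prod(weights)


# Side effect weights reported in the text.
TARDIVE_DYSKINESIA = 0.857
WEIGHT_GAIN = 0.959
NOT_PRESENT = 1.0

# Hypothetical patient: mildest health state with two side effects present.
rating = qaly_rating(0.88, [TARDIVE_DYSKINESIA, WEIGHT_GAIN, NOT_PRESENT])
print(round(rating, 3))
```

Because every weight is at most 1.0, each additional side effect can only lower the combined rating, which is how the measure captures health losses from treatment alongside symptom improvement.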
Statistical Analysis
For consistency and comparability, the statistical methods used in the analysis of continuous measures in this study were the same as those used in the original publication from the CATIE investigation (10). Two hundred thirty-one patients with tardive dyskinesia were prohibited from assignment to perphenazine, and ziprasidone was added to the trial after 40% of the patients had been enrolled. Thus, randomization took place under four separate regimens: including and excluding patients with tardive dyskinesia and including and excluding ziprasidone. Analyses were thus conducted on four different datasets with overlapping membership. Each dataset only included patients with an equal chance of being randomly assigned to the treatments under comparison. Perphenazine patients, in particular, were only compared to equivalent patients who did not have tardive dyskinesia at baseline.
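One way to picture the four overlapping datasets is as four filters over the same patient list. The sketch below is an interpretation, not the study's code: the field names, the `Patient` class, and the exact membership rules (for example, that dataset II leaves out ziprasidone patients because it compares the three originally available second-generation drugs) are assumptions drawn from the descriptions in this section.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Patient:
    patient_id: int
    drug: str                      # initially assigned drug
    has_td: bool                   # tardive dyskinesia at baseline
    after_ziprasidone_added: bool  # randomized once ziprasidone was available


def analytic_datasets(patients: List[Patient]) -> Dict[str, List[Patient]]:
    """Split patients into four overlapping comparison sets (assumed rules)."""
    return {
        # I: four original drugs, patients without tardive dyskinesia
        "I": [p for p in patients
              if not p.has_td and p.drug != "ziprasidone"],
        # II: three original second-generation drugs, with or without TD
        "II": [p for p in patients
               if p.drug not in ("perphenazine", "ziprasidone")],
        # III: all five drugs after ziprasidone was added, patients without TD
        "III": [p for p in patients
                if p.after_ziprasidone_added and not p.has_td],
        # IV: four second-generation drugs after ziprasidone was added
        "IV": [p for p in patients
               if p.after_ziprasidone_added and p.drug != "perphenazine"],
    }


# Hypothetical patients (illustrative only).
patients = [
    Patient(1, "olanzapine", has_td=False, after_ziprasidone_added=False),
    Patient(2, "perphenazine", has_td=False, after_ziprasidone_added=True),
    Patient(3, "ziprasidone", has_td=True, after_ziprasidone_added=True),
    Patient(4, "risperidone", has_td=True, after_ziprasidone_added=False),
]
members = {name: [p.patient_id for p in group]
           for name, group in analytic_datasets(patients).items()}
print(members)
```

The same patient can appear in more than one dataset, which is what "overlapping membership" means here.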
The primary comparison between the four treatments available at the beginning of the trial was an overall 3 degree of freedom test. This test was performed on analytic dataset I, which excluded patients with tardive dyskinesia and those randomly assigned to ziprasidone. If the overall test was significant at p<0.05, the three second-generation drugs were compared with perphenazine with a Hochberg adjustment for multiple comparisons (22) in which the smallest p value was compared to 0.05/3=0.017 and the largest to p=0.05. Dataset II, which excluded perphenazine and included tardive dyskinesia patients, was used to compare the three second-generation drugs to each other via step-down testing. If the overall 2 degree of freedom test was significant at p<0.05, an alpha of p<0.05 was applied for all comparisons.
Datasets III and IV were used to compare ziprasidone to the other drugs among patients randomly assigned to a treatment group after ziprasidone became available. Hochberg adjustment for four pairwise comparisons was used to compare ziprasidone and perphenazine in dataset III, which excluded tardive dyskinesia patients, and ziprasidone to the other three second-generation drugs using dataset IV. The smallest p value was considered significant if p<0.05/4=0.013. Tables and figures in this article present findings using dataset I, with the ziprasidone data taken from dataset III. Data from the other datasets are presented individually in the supplemental tables and figures that accompany the online version of this article.
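The Hochberg adjustment used in these comparisons is a step-up procedure: the largest p value is compared with alpha, the next largest with alpha/2, and so on down to alpha/m for the smallest of m p values. A minimal sketch follows; the function name `hochberg_reject` and the example p values are hypothetical, and the comparison uses "less than or equal to", the conventional form of the rule.

```python
from typing import List, Sequence


def hochberg_reject(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Hochberg step-up procedure.

    Returns one flag per p value (in the original order) that is True when
    the corresponding null hypothesis is rejected.
    """
    m = len(p_values)
    # Indices sorted from the largest p value to the smallest.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    reject = [False] * m
    for position, index in enumerate(order):
        # The k-th largest p value (k = position + 1) is tested at alpha / k.
        if p_values[index] <= alpha / (position + 1):
            # This hypothesis and every one with a smaller p value is rejected.
            for j in order[position:]:
                reject[j] = True
            break
    return reject


# Hypothetical p values for three pairwise comparisons (illustrative only).
all_significant = hochberg_reject([0.01, 0.04, 0.03])
one_significant = hochberg_reject([0.01, 0.06, 0.03])
none_significant = hochberg_reject([0.02, 0.06, 0.03])
print(all_significant, one_significant, none_significant)
```

With three comparisons, the thresholds are 0.05, 0.025, and 0.017, matching the 0.05/3=0.017 bound for the smallest p value described above.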
The central cost analysis was a paired comparison between treatment groups of average monthly costs from all 18 months using a mixed model including terms representing treatment group, the baseline value of the dependent cost variable, time (treated as a classification variable for months 1–18), site, a history of recent clinical exacerbation, and baseline-by-time interactions. The baseline-by-time term adjusts for baseline differences in characteristics of patients who dropped out early and thus are less well represented at later time points. Group-by-time interactions, to evaluate differences in time trends between groups, were also tested. A random subject effect and a first-order autoregressive covariance structure were used to adjust standard errors for the correlation of observations from the same individual.
Use of any hospital days in each month was examined using a dichotomous measure (0=no, 1=yes) analyzed with generalized estimating equations using the GENMOD procedure of SAS.
Because of the skewed distribution of service use (i.e., nondrug) cost data, log-transformed data were used in the analysis of both 1) nondrug health service costs and 2) total costs, including medications, and both mean and median values are presented (23). Adjusted average log-transformed costs were then re-transformed into average costs using the “smearing estimation” method of Duan (24), after testing the data for heteroscedasticity (25). Untransformed monthly data were also averaged for each individual and compared using the Kruskal-Wallis nonparametric test.
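The re-transformation step can be illustrated with a small example. Simply exponentiating a mean of log costs returns the geometric mean, which understates the arithmetic mean for skewed data; the smearing estimator corrects this by multiplying by the average of the exponentiated residuals. The sketch below assumes strictly positive costs and homoscedastic residuals, uses an intercept-only "model" (the group mean of log costs) in place of the adjusted mixed model, and relies on hypothetical cost values and function names.

```python
import math
from typing import Sequence


def duan_smearing_factor(log_residuals: Sequence[float]) -> float:
    """Average of exp(residual) from a model fitted on the log scale."""
    return sum(math.exp(r) for r in log_residuals) / len(log_residuals)


def retransformed_mean(predicted_log_cost: float,
                       log_residuals: Sequence[float]) -> float:
    """Back-transform a log-scale prediction to the cost scale."""
    return math.exp(predicted_log_cost) * duan_smearing_factor(log_residuals)


# Hypothetical, right-skewed monthly costs (illustrative only).
costs = [100.0, 200.0, 400.0, 3200.0]
log_costs = [math.log(c) for c in costs]
mean_log = sum(log_costs) / len(log_costs)
residuals = [value - mean_log for value in log_costs]

naive = math.exp(mean_log)                         # geometric mean
smeared = retransformed_mean(mean_log, residuals)  # smearing estimate
print(round(naive, 2), round(smeared, 2))
```

In this intercept-only case the smearing estimate reproduces the arithmetic mean of the raw costs exactly; with covariates in the model it gives an adjusted mean on the original dollar scale.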
The same mixed model analysis was used for effectiveness outcomes based on scores from month 1 and then quarterly from months 3 through 18, again using a random subject effect and a first-order autoregressive covariance structure.
Results
Although 1,493 patients were enrolled in the study, all data from one site (33 patients) were excluded prior to analysis due to concerns about data integrity, and 19 never took their assigned study drug (Figure 1). Baseline utilization data were not available for an additional 19 patients, leaving 1,424 patients for analysis. Comparison of all patients on baseline assessments showed significant differences, as expected, on measures of tardive dyskinesia and akathisia, reflecting the exclusion of patients with tardive dyskinesia from randomization to perphenazine, as per the study design. There were no significant differences on these measures among patients who participated in the randomization that included perphenazine. Details of the baseline assessment data and comparisons of treatment groups both with and without perphenazine are presented in supplemental tables A and B that accompany the online version of this article.
In the intention-to-treat analysis using all available follow-up data, 68.2% of patients were still participating in follow-up interviews at 6 months, with significant differences in the proportion of participants across randomized treatments (χ²=16.4, df=4, p=0.003). At 18 months, 45.7% were still participating, and differences in participation across these agents were no longer significant (χ²=8.6, df=4, p=0.07).
In contrast to data on participation in follow-up interviews, data on treatment continuation show that only 25.9% of all patients completed 18 months with their original assigned treatment, with significant differences in the proportion of completers between groups (χ²=31.5, df=4, p<0.0001).
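The p values attached to these chi-square statistics can be checked by hand, because the upper-tail probability of a chi-square distribution with an even number of degrees of freedom has a closed form. The sketch below uses only the standard library; the function name `chi2_sf_even_df` is an assumption of this example, and the formula applies only when df is even (as with df=4 here).

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with even df.

    For df = 2k the survival function is exp(-x/2) * sum_{j<k} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    if x < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j)
                                 for j in range(df // 2))


# Chi-square statistics reported in the text, each with df=4.
p_values = {stat: chi2_sf_even_df(stat, 4) for stat in (16.4, 8.6, 31.5)}
for stat, p in p_values.items():
    print(f"chi-square={stat}, df=4, p={p:.4g}")
```

A statistic must exceed about 9.49 to reach p<0.05 with 4 degrees of freedom, which is a convenient reference point when reading the three results above.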
Data on drug treatment following the first change in treatment after randomization show that virtually all treatments administered were second-generation drugs (range 96.0% to 99.6% across groups for all prescriptions following the first drug change) with a balanced distribution of agents across initial treatment groups (Table 1).
Service Use and Costs
Examination of all outcome data based on intention-to-treat analyses, which attributed all costs to the initially assigned drug, showed that total medication costs for patients initially assigned to perphenazine were $200–$300/month (about 40%–50%) lower than drug costs for patients assigned to each of the four second-generation antipsychotics (Table 2, Figure 2) (p<0.0001). Significant group-by-time statistical interactions (p<0.0001) reflect the narrowing of differences in drug costs during the first 8 months, after which perphenazine remained consistently less costly (p<0.0001 at each time point).
There were no significant differences in the proportion of patients who received inpatient care each month, the single greatest source of cost among people with schizophrenia (online supplemental figure A). The average total inpatient and residential treatment costs per month were also not significantly different between groups (Table 2, online supplemental figure B), nor were there any significant differences in the sum of inpatient, residential, and outpatient health service costs (i.e., all nondrug costs) (Table 2, online supplemental figure C). Group-by-time interactions for these costs were not statistically significant, indicating continuous equivalence of these nondrug health services costs across groups over time.
When health service and drug costs were summed to generate total health costs (i.e., the primary cost outcome), average total monthly health care costs were $300–$600 (20%–30%) lower for perphenazine than for second-generation antipsychotics (Table 2, Figure 3) (p<0.0001). The modest rise in total costs for the perphenazine group during the later months of the trial reflects increased drug costs as perphenazine patients switched to second-generation antipsychotics, but there was no late increase in nondrug costs among perphenazine patients. Significant group-by-time interactions reflected convergence in log-cost between groups, but perphenazine log-costs remained significantly lower than those of other groups (p<0.001 at all time points). Although raw average cost data presented in Figure 3 overlap in the latter months of the trial, time-specific analytic comparisons of log-transformed data, in which the effect of outlier values is reduced, showed consistently lower costs for perphenazine. In a sensitivity analysis in which Medicaid or VA drug prices (approximately 25% and 40% lower, respectively) were applied to all patients, the advantage for perphenazine on cost comparisons remained statistically significant.
Drug costs for the olanzapine group were significantly higher than for the quetiapine, risperidone, and ziprasidone groups (Table 2). Due to lower inpatient and outpatient costs for the olanzapine group (statistically significant only for inpatient costs in comparison to quetiapine), there were no significant differences in total health care costs between second-generation drugs.
Examination of median values revealed similar cost patterns with significantly lower costs for the perphenazine group (data presented in supplemental table C that accompanies the online version of this article).
Effectiveness
As seen in Figure 4, for the intention-to-treat analysis, significant improvement in QALY ratings was observed from baseline (mean for all groups: 0.683) to 18 months (mean for all groups: 0.747). Perphenazine was associated with higher (i.e., better) QALY ratings than other agents, but there was only one significant difference (perphenazine was greater than risperidone) (Table 3). Differences between perphenazine and second-generation antipsychotics on the PANSS and other quality of life measures were not statistically significant (online supplemental tables F and G; online supplemental figures D–G).
Analysis of Period During the Initial Drug Assignment
When the analyses were limited to observations of patients while they received the first assigned medication (phase 1), the patterns of statistically significant cost results did not change, although, as expected, without the additional costs of switching to second-generation antipsychotics, the differences between perphenazine and second-generation antipsychotics in both medication costs and total health cost increased by an additional $100–$200/month (Table 2, online supplemental table D). Group-by-time interactions in total costs were not statistically significant, since there was less convergence of cost in phase 1 during the later months of the trial (i.e., excluding observations after the switch from perphenazine to second-generation antipsychotics) (Figure 5).
The pattern of statistically significant results on QALY ratings and other effectiveness measures was also unchanged in the analysis that only included observations on the initially randomized drug (Table 3).
Re-transformed log-cost data showed similar patterns of results to the raw mean cost data for both the intention-to-treat and initial assignment analyses (online supplemental table E).
Discussion
This study found that during the 18 months of the CATIE trial, initial assignment to a first-generation antipsychotic, perphenazine, was less costly and no less effective than assignment to each of four second-generation antipsychotics as measured by QALY ratings that combined measures of symptoms and side effects. Several different analytic strategies all yielded the same pattern of significant results, including: 1) analyses of all available outcome data, 2) analyses limited to the period of treatment with the initially assigned drug (phase 1), 3) comparison of both means and medians using parametric and nonparametric statistics, respectively, 4) examination of re-transformed log-cost data, and 5) sensitivity analyses in which less expensive Medicaid- and VA-discounted drug prices were applied to all patients. Since in this study perphenazine was consistently and significantly less costly and not less effective than the next most effective treatment, as measured by QALY ratings, calculation of the cost-effectiveness ratio was not performed (13).
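The reasoning behind omitting the cost-effectiveness ratio can be made concrete. An incremental cost-effectiveness ratio (difference in cost divided by difference in QALYs) is informative only when one option is both more costly and more effective; when one option costs no more and is no less effective, it dominates and no ratio is needed. The sketch below is a simplified illustration that compares point estimates only and ignores statistical significance; the function name `icer` and all numbers are hypothetical.

```python
from typing import Optional


def icer(cost_new: float, qaly_new: float,
         cost_ref: float, qaly_ref: float) -> Optional[float]:
    """Incremental cost per QALY gained, or None when one option dominates.

    Dominance: one option costs no more and is no less effective than the
    other, so no trade-off between cost and effect has to be priced.
    """
    d_cost = cost_new - cost_ref
    d_qaly = qaly_new - qaly_ref
    if d_cost >= 0 and d_qaly <= 0:
        return None  # the new option is dominated by the reference
    if d_cost <= 0 and d_qaly >= 0:
        return None  # the new option dominates the reference
    return d_cost / d_qaly


# Hypothetical annual costs and QALYs (illustrative only).
dominated_case = icer(cost_new=15000.0, qaly_new=0.72,
                      cost_ref=11000.0, qaly_ref=0.73)
tradeoff_case = icer(cost_new=15000.0, qaly_new=0.80,
                     cost_ref=10000.0, qaly_ref=0.75)
print(dominated_case, tradeoff_case)
```

In the first case the costlier option is also no more effective, so the function returns `None`, mirroring the situation in which no ratio is calculated.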
These results extend the efficacy and safety outcomes analysis from the first report of the CATIE study, which used time to discontinuation of the initial treatment for any cause as the primary measure of effectiveness (10). The median time to discontinuation or completion of the trial in that study was 9.2 months for olanzapine compared with 5.6 months for perphenazine, a 64% increase for olanzapine in the length of time when both patient and doctor felt that no increase in benefit could be obtained by switching to another drug. The cost of treatment during these initial treatment periods, including the drugs, was $1,404 per month for olanzapine versus $960 per month for perphenazine, a 46% increase in cost per month for olanzapine. Among those who did not complete the study with their initial assigned medication and who were switched to other drugs, the difference in average monthly costs for the remainder of the 18-month study period, after the treatment period on their initially assigned drug, was only 3%.
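The two percentage increases quoted in this paragraph follow directly from the reported figures, as the short check below shows. The helper name `percent_increase` is introduced only for this illustration.

```python
def percent_increase(new: float, old: float) -> float:
    """Relative increase of `new` over `old`, expressed in percent."""
    if old == 0:
        raise ValueError("the reference value must be nonzero")
    return (new - old) / old * 100.0


# Figures reported in the text: olanzapine versus perphenazine.
duration_increase = percent_increase(9.2, 5.6)   # median months on initial drug
cost_increase = percent_increase(1404.0, 960.0)  # dollars per month

print(f"time on initial treatment: +{duration_increase:.0f}%")
print(f"monthly cost:              +{cost_increase:.0f}%")
```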
Strengths of the study were its large sample size, long follow-up duration, and recruitment of patients from diverse representative sites with minimal exclusion criteria—all of which increase the generalizability of the results. The study was also enhanced by the use of a rigorously developed algorithm for evaluating health states specific to schizophrenia in terms of QALY ratings that take both symptoms of schizophrenia and side effects into account.
At the time the study was initiated, it was widely believed that perphenazine increased the risk of tardive dyskinesia, and differential randomization was used to minimize that risk. While a recent review suggested that second-generation antipsychotics are associated with less risk of tardive dyskinesia than first-generation antipsychotics (26), it noted that only three of 11 year-long studies were based on randomized trials and many others used relatively high doses of haloperidol for comparison. Results from the CATIE investigation (10) are consistent with the results of other recent studies (8, 9, 27–31) that have questioned the extent to which the risk of either tardive dyskinesia or pseudoparkinsonism is greater with older drugs, especially when lower potency drugs are used in moderate doses (29, 32). The maximum dose of perphenazine allowed in the CATIE study was 32 mg, about half the maximum clinically recommended dose (64 mg for inpatients). It is notable, however, that only 40% of perphenazine patients reached the maximally allowed 32-mg dose, the same percentage as reached maximal doses in the olanzapine and risperidone arms (10).
Data loss from attrition was considerable. However, differences in loss-to-follow-up rates across treatments were generally small and not significantly different between olanzapine and perphenazine, the two most effective treatments, at 16 of 18 time points. In the comparisons between these two drugs, data were obtained at 65% of scheduled assessments for perphenazine and at 70% for olanzapine. To the extent that this difference introduces selection bias, it would tend to favor olanzapine, since patients tend to do better at later assessments, a greater fraction of which were completed by patients in the olanzapine group. We also conducted a series of analyses to determine whether there were any differences between treatments in baseline characteristics (e.g., symptoms or substance abuse) that predicted duration of participation in the trial. If patients with severe symptoms at baseline were especially likely to drop out with one of the treatments, long-term findings could be systematically biased in favor of that treatment. However, there were no significant interactions between treatment assignment and any baseline characteristics in association with duration of participation in the trial. Thus, differences between treatments in baseline predictors of dropout do not seem to have biased our results. Re-analysis of primary outcomes using multiple imputation to address missing data (presented in the data supplement that accompanies the online version of this article) was consistent with our results.
Furthermore, the failure to find significant differences between perphenazine and olanzapine on any effectiveness measure, other than the duration of treatment with the initially assigned antipsychotic, is not likely to be attributable to a lack of statistical power. Differences of small magnitude (4 points on the PANSS and 0.016 in QALY rating) were found to be statistically significant in some comparisons of olanzapine or perphenazine with other treatments, and 95% confidence interval analysis showed small differences between olanzapine and perphenazine on effectiveness measures.
The CATIE study was not long enough to detect differences in time-dependent, longer-term side effects such as tardive dyskinesia or medical comorbidity (e.g., diabetes or cardiovascular disease), and findings cannot be generalized beyond 18 months. Lack of long-term cost data from either the CATIE study or other randomized trials related to these serious outcomes limits the conclusions regarding cost-effectiveness that can be drawn at this time. It should also be reiterated that the comparisons with perphenazine only pertain to treatment of patients without tardive dyskinesia at the time of treatment initiation. Similarly, the results cannot be generalized to other clinical populations such as first-episode patients, refractory patients, the elderly, those with unstable medical problems, long-term institutionalized patients, patients who refuse to take medication, or patients with diagnoses other than schizophrenia. The relevance of these findings to first-generation antipsychotics other than perphenazine at modest doses, or to other second-generation antipsychotics, and especially to clozapine, is also unknown. Furthermore, patients who believed that their current medication (whether first or second generation) was uniquely effective for them were unlikely to have participated, and these results are thus not applicable to patients satisfied with their current medication.
Treatment decisions must be based on the clinical situation of each individual patient. This study would clearly not justify policies that would unconditionally restrict access to any particular medication or that would thoughtlessly force patients or doctors who are satisfied with a current treatment to change to a treatment just because it might be less expensive. The development and implementation of formulary policies that might allow cost savings while protecting individualized clinical decision making should be based on consideration of all available research and on a consensus among relevant stakeholders, including patients and families as well as providers (33).
These results should encourage consideration of older intermediate potency drugs like perphenazine when a medication change is indicated. Risperidone is likely to be available in generic form in the next few years, followed by olanzapine 4 or 5 years later. Generic versions of these drugs are likely to have even lower prices than those used in our discounted cost estimates, and the coming availability of generic second-generation antipsychotics will undoubtedly alter the cost profiles described here.