Antidepressants are commonly used as a first-line treatment for major depressive disorder. In particular, second-generation antidepressants such as selective serotonin reuptake inhibitors (SSRIs), serotonin-norepinephrine reuptake inhibitors, and other drugs that selectively affect the activity of neurotransmitters play a prominent role in treatment. Although these drugs are believed to have efficacy similar to that of first-generation agents (for example, tricyclic antidepressants and monoamine oxidase inhibitors), they are recommended over the first-generation agents because of their relatively favorable side effect profile and reduced risk of harm in overdose or in combination with certain medications or food (
1 ).
Current treatment guidelines for major depression (
2,
3 ) suggest an acute-phase treatment duration of six to 12 weeks. For patients who demonstrate an adequate response (usually defined as remission) to acute-phase treatment, continuation-phase treatment of four to nine months is recommended. The goal of continuation-phase treatment is to prolong the absence of depressive symptoms such that the patient's episode can be considered completely resolved. Effective continuation-phase treatment prevents relapse, defined as the return of depressive symptoms during the current depressive episode. After successful continuation-phase treatment, a maintenance phase of treatment to prevent recurrence of a new, distinct episode is considered. For patients with a history of recurrent depression, maintenance-phase treatment can frequently last for years.
For acute treatment of depression, approximately 60% of patients respond to second-generation antidepressants (
4 ). Evidence from the one-year naturalistic follow-up phase of the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) trial indicates that at least 40% of acute-phase responders relapse during continuation treatment (
5 ). Randomized, double-blinded, controlled trials have assessed how treatments compare with placebo and with each other for preventing relapse and recurrence, but results for second-generation antidepressants have not been systematically reviewed. Two reviews have systematically assessed relapse prevention during continuation-phase treatment or recurrence prevention during maintenance-phase treatment (
6,
7 ), but these reviews included all antidepressants rather than just second-generation antidepressants. They are limited by the dates of their literature searches—searches censored at 1987 (
6 ) and 2000 (
7 )—and thus exclude more recent studies. More recently, Zimmerman and colleagues (
8 ) focused on second-generation antidepressants, although the intent of their review was to illustrate how conclusions differ between extension trials and placebo substitution trials.
Because second-generation drugs are now the most frequently prescribed antidepressants, our goal was to systematically evaluate data on the efficacy of second-generation antidepressants for preventing relapse and recurrence. We conducted a systematic review and meta-analysis of comparative and placebo-controlled evidence for 12 second-generation antidepressants (bupropion, citalopram, duloxetine, escitalopram, fluoxetine, fluvoxamine, mirtazapine, nefazodone, paroxetine, sertraline, trazodone, and venlafaxine). We refer to these agents collectively as antidepressants. We had two key questions for this review. First, for adults with a depressive syndrome, do antidepressants differ in their efficacy or effectiveness for maintaining remission (specifically, preventing relapse during the continuation phase and preventing recurrence during the maintenance phase)? Second, for adults with a depressive syndrome, what is the overall effect size for active treatment compared with placebo, and does this effect size persist over time?
Methods
Key questions
Key questions designed to address efficacy, effectiveness, and tolerability of antidepressants for maintaining remission guided our work. The key questions were formulated through a process involving the public, the Scientific Resource Center for the Effective Health Care program of the Agency for Healthcare Research and Quality (AHRQ), and various stakeholder groups. AHRQ provided funding for the initial review, although this update and analysis were unfunded.
Literature search
To identify articles relevant to each key question, we searched MEDLINE, EMBASE, the Cochrane Library, PsycINFO, and International Pharmaceutical Abstracts. Searches covered studies published during the period of January 1980 through April 2007. In addition, we manually searched reference lists of relevant review articles and letters to the editor. We also manually searched the Center for Drug Evaluation and Research (CDER) database to identify unpublished research submitted to the U.S. Food and Drug Administration (FDA).
Study selection
Two persons independently reviewed article titles and abstracts. We included head-to-head trials comparing one antidepressant with another and placebo-controlled trials. Studies included adult inpatient and outpatient populations with depressive illness in which individuals demonstrated response to treatment or remission. Head-to-head trials were included if they reported relapse or recurrence rates, regardless of whether participants were randomly assigned to treatment groups after successful acute-phase or continuation-phase treatment (that is, extension versus randomized substitution trials). For our meta-analysis, inclusion criteria were more stringent for placebo-controlled evidence; we included only studies that randomly assigned participants after they had demonstrated either an acute-phase response or lack of relapse during the continuation phase (randomized placebo-substitution trials).
Data abstraction
Trained reviewers abstracted data from each study; a senior reviewer read each abstracted article and evaluated completeness of data extraction. We recorded intention-to-treat results if available. When intention-to-treat results were not explicitly presented, we derived intention-to-treat results by using as the numerator the number of patients who experienced the outcome (relapse for continuation phase or recurrence for maintenance phase) and using as the denominator the number randomly assigned to that arm of the trial.
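The derivation of intention-to-treat results described above can be illustrated with a minimal sketch. The function name and the trial counts are hypothetical and are not taken from any included study; the sketch only shows why the denominator is the number randomly assigned rather than the number completing follow-up.

```python
def itt_risk(events: int, randomized: int) -> float:
    """Intention-to-treat risk: outcome events divided by all patients
    randomly assigned to the arm (not only those who completed follow-up)."""
    if randomized <= 0:
        raise ValueError("randomized must be positive")
    if not 0 <= events <= randomized:
        raise ValueError("events must be between 0 and randomized")
    return events / randomized


# Hypothetical arm: 120 patients randomized, 95 completers, 30 relapses.
itt = itt_risk(30, 120)      # denominator = everyone randomized
completers_only = 30 / 95    # denominator = completers only, for contrast
```

In this hypothetical arm the completers-only proportion is larger than the intention-to-treat proportion because patients lost to follow-up are dropped from the denominator.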
Quality and strength assessment
In terms of quality, we assessed the internal validity of trials on the basis of predefined criteria—ratings of good, fair, or poor—from the U.S. Preventive Services Task Force (
9 ) and the National Health Service Centre for Reviews and Dissemination (
10 ). Elements of internal validity assessment included randomization, allocation concealment, similarity of compared groups at baseline, use of intention-to-treat analysis, and overall and differential loss to follow-up. Discrepancies in quality assessment were resolved by discussion and, when necessary, consultation with a third party.
Data synthesis
We qualitatively summarized all studies. For placebo-controlled trials, which made up a majority of the included studies, we also conducted quantitative analyses. We calculated the relative risk of loss of response for active treatment compared with placebo. The primary outcome measure was defined as loss of response or remission (in other words, continuation-phase relapse or maintenance-phase recurrence). In most trials this was defined as an increase in the Hamilton Rating Scale for Depression (HAM-D) or Montgomery-Asberg Depression Rating Scale (MADRS) score above a predefined cutoff point. We conducted relative risk meta-analyses of relapse rates for trials stratified by duration of follow-up: less than one year and one year or more. Stratification of trials lasting one year or more was intended to represent a conservative delineation of maintenance-phase treatment (thus referring to recurrence), whereas trials lasting less than one year were assumed to represent relapse prevention during the continuation phase. Risk-difference meta-analyses were used to calculate numbers of patients needed to treat to prevent one relapse or recurrence overall and for each time period.
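For readers less familiar with these measures, the following is a minimal single-trial sketch of a relative risk with a 95% confidence interval (computed on the log scale) and a number needed to treat derived from the risk difference. The function names and the counts are hypothetical. The numbers needed to treat reported in this review came from pooled risk-difference meta-analyses, not from a single trial as shown here.

```python
import math


def relative_risk(e_t, n_t, e_c, n_c, z=1.96):
    """Relative risk of the outcome (treatment vs. control) with a
    confidence interval computed on the log scale."""
    rr = (e_t / n_t) / (e_c / n_c)
    # Standard error of log(RR) for two independent binomial arms.
    se = math.sqrt(1 / e_t - 1 / n_t + 1 / e_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


def number_needed_to_treat(e_t, n_t, e_c, n_c):
    """NNT = 1 / absolute risk reduction (control risk minus treatment risk)."""
    risk_difference = e_c / n_c - e_t / n_t
    if risk_difference <= 0:
        raise ValueError("no risk reduction; NNT is undefined")
    return 1 / risk_difference


# Hypothetical trial: 25 of 100 relapse on active drug, 50 of 100 on placebo.
rr, lower, upper = relative_risk(25, 100, 50, 100)
nnt = number_needed_to_treat(25, 100, 50, 100)
```

In practice a number needed to treat is rounded up to the next whole patient when it is reported.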
Our analyses included trials with multiple placebo comparisons. For example, for trials that compared multiple dosing arms with a single placebo group, we combined the dosing arms for a single comparison as long as doses were within the range of FDA-approved doses. In trials that compared more than one drug with placebo, we included each drug-placebo comparison as an observation but reduced the sample size of the placebo group proportionately so as not to overrepresent the placebo group (
11 ). For example, if 300 patients were randomly assigned to receive drug A (N=100), drug B (N=100), or placebo (N=100), our analysis compared drug A (N=100) with placebo (N=50) and drug B (N=100) with placebo (N=50). Although the proportion of placebo-treated participants having the outcome does not change, this approach inflates the variance of the log relative risk and ultimately results in a more conservative confidence interval. For each meta-analysis, we tested for heterogeneity of treatment effects, using I² statistics. We report the results of the more conservative random effects models (
12 ). To estimate possible publication bias, we used funnel plots, the Begg adjusted rank correlation test, and the Egger regression approach (
13,
14 ). However, because these tests have low statistical power when the number of trials is small (
15 ), undetected bias may still be present.
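The placebo-splitting step, the heterogeneity statistic, and random-effects pooling described in this paragraph can be sketched as follows. This is an illustration under stated assumptions: the event counts are hypothetical, the function names are ours, and the DerSimonian-Laird estimator is assumed as one common random-effects method because the text does not name the estimator that was used.

```python
import math


def split_placebo(events, n, k):
    """Divide a shared placebo arm among k drug comparisons,
    keeping the proportion with the outcome unchanged."""
    return events / k, n / k


def log_rr(e_t, n_t, e_c, n_c):
    """Log relative risk and its variance for one drug-placebo comparison."""
    y = math.log((e_t / n_t) / (e_c / n_c))
    v = 1 / e_t - 1 / n_t + 1 / e_c - 1 / n_c
    return y, v


def random_effects(effects):
    """Pool (log RR, variance) pairs with the DerSimonian-Laird estimator
    (an assumption). Returns pooled RR, 95% CI limits, and I-squared (%)."""
    y = [e[0] for e in effects]
    w = [1 / e[1] for e in effects]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-trial variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (e[1] + tau2) for e in effects]
    pooled = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return (math.exp(pooled), math.exp(pooled - 1.96 * se),
            math.exp(pooled + 1.96 * se), i2)


# Hypothetical three-arm trial: drug A 20/100, drug B 24/100, placebo 40/100.
pl_events, pl_n = split_placebo(40, 100, 2)            # 20 events of 50
y_full, v_full = log_rr(20, 100, 40, 100)              # unsplit placebo arm
y_split, v_split = log_rr(20, 100, pl_events, pl_n)    # split placebo arm

# Pool both comparisons with a second hypothetical two-arm trial.
effects = [log_rr(20, 100, pl_events, pl_n),
           log_rr(24, 100, pl_events, pl_n),
           log_rr(30, 150, 60, 150)]
pooled_rr, ci_low, ci_high, i2 = random_effects(effects)
```

Splitting the placebo arm leaves the log relative risk unchanged but increases its variance, which is the more conservative behavior described in the text.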
The most common trial design was an open-label acute treatment phase of six to 15 weeks, followed by a randomized, double-blind, placebo-controlled continuation phase, maintenance phase, or both for acute-phase responders or remitters. Because trials differed in the length of open-label treatment before randomization and in the duration of treatment after randomization, we conducted a meta-regression to explore how heterogeneity in design influenced estimates of relative risk of relapse or recurrence. Similarly, we used meta-regression to explore whether pooling antidepressants as a class was a reasonable approach. For simplicity, this analysis explored heterogeneity by comparing SSRI trials (citalopram, escitalopram, fluoxetine, fluvoxamine, paroxetine, and sertraline) with other second-generation antidepressant trials (bupropion, duloxetine, mirtazapine, nefazodone, trazodone, and venlafaxine).
To balance our assessment of benefits, we examined reported rates of adverse events and rates of loss to follow-up that were attributed to adverse events. We qualitatively compared the rates from studies on relapse prevention and recurrence prevention with rates reported in acute-phase trials (
4,
16 ). We also conducted a relative risk meta-analysis for active treatment compared with placebo for both overall loss to follow-up and loss to follow-up attributed to adverse events. However, because of variability in study populations and in adverse event assessment and reporting among trials, caution should be taken in interpreting this evidence.
All statistical analyses were conducted with Stata 9.1 software.
Results
Our search identified 2,318 article titles and abstracts. [A flow diagram of the study selection for the meta-analysis is available as an online supplement to this article at ps.psychiatryonline.org.] Of these, we reviewed 902 full-text articles and retained 29 articles describing 27 unique trials that addressed relapse or recurrence prevention (
Tables 1 and
2 ). The most common reason for exclusion was "wrong study design"; many excluded studies assessed acute-phase treatment. Included studies differed in their design (such as timing of randomization and eligibility criteria), although most studies randomly assigned acute-phase responders or remitters to ongoing treatment with active drug or placebo. Most trials used a predefined cutoff point on a standardized scale (such as the HAM-D or MADRS) to determine eligibility for randomization, although the cutoff point varied among trials. Likewise, operational definitions of relapse and recurrence varied among trials.
We gave most of the included trials a quality rating of fair. They represent a broad range of methodological quality.
The mean age of trial participants was generally between 40 and 50 years. We excluded trials of children and adolescents (under 18 years of age). Two trials were conducted in older populations (age range 65–87 years); the mean age of participants was 75 in one trial comparing citalopram with placebo (
17 ) and 77 in one trial comparing sertraline with placebo (
18 ).
Comparative trials
Four head-to-head trials (five publications) directly compared the efficacy of one second-generation antidepressant with another for maintaining remission (
Table 1 ) (
19,
20,
21,
22,
23 ). Comparisons included duloxetine with paroxetine (
23 ), fluoxetine with sertraline (
19 ), fluvoxamine with sertraline (
20,
21 ), and trazodone with venlafaxine (
22 ). Relapse and recurrence rates did not differ significantly in any of these comparisons. Although none of these studies was designed to test an equivalence hypothesis, absolute differences in relapse and recurrence rates were consistently modest and unlikely to be clinically significant. Further, these comparative trials did not use a substitution design to randomly assign participants after successful acute-phase or continuation-phase treatment.
Placebo-controlled trials
Twenty-three randomized controlled trials provided placebo-controlled evidence to support the general efficacy of second-generation drugs for preventing relapse or recurrence among patients with depressive disorders (
Table 2 ) (
17,
18,
24,
25,
26,
27,
28,
29,
30,
31,
32,
33,
34,
35,
36,
37,
38,
39,
40,
41,
42,
43,
44 ). For 12 trials (13 placebo comparisons) the randomized follow-up was shorter than one year, and the trials were deemed to represent relapse prevention during continuation-phase treatment (
24,
25,
26,
27,
28,
29,
30,
31,
32,
33,
34,
44 ). These trials provide consistent evidence in favor of active drug over placebo. The unadjusted frequency of relapse was 22% for active treatment, compared with 42% for placebo. An additional 11 randomized controlled trials had follow-ups of one year or longer and were deemed to represent recurrence prevention during maintenance-phase treatment (
17,
18,
35,
36,
37,
38,
39,
40,
41,
42,
43 ). These trials also provide consistent evidence in favor of active treatment over placebo. The unadjusted frequency of recurrence was 26% for active treatment, compared with 48% for placebo.
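As a rough arithmetic check, the unadjusted frequencies quoted above can be turned into crude relative risks and numbers needed to treat. This is only a back-of-the-envelope sketch: it ignores trial-level weighting, and the function name is ours. The pooled estimates reported in the meta-analysis (relative risks of .54 and .56, and a number needed to treat of five in each stratum) come from formal random-effects models and need not match these crude figures exactly.

```python
# Unadjusted percentages with the outcome, as quoted in the text: (active, placebo).
crude = {
    "relapse, <1 year": (22, 42),
    "recurrence, >=1 year": (26, 48),
}


def crude_summary(active_pct, placebo_pct):
    """Crude relative risk and number needed to treat from two percentages."""
    rr = active_pct / placebo_pct
    nnt = 100 / (placebo_pct - active_pct)  # 1 / absolute risk reduction
    return rr, nnt


results = {label: crude_summary(*pcts) for label, pcts in crude.items()}
relapse_rr, relapse_nnt = results["relapse, <1 year"]
recurrence_rr, recurrence_nnt = results["recurrence, >=1 year"]
```

The crude numbers needed to treat are 5.0 for relapse and about 4.5 for recurrence; the latter rounds up to five patients.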
Meta-analysis
Trials shorter than one year: relapse prevention. Our relative risk meta-analysis comprised 12 trials lasting less than one year (
Figure 1 ): one on bupropion (
24 ), three on citalopram (
25,
33,
44 ), one on escitalopram (
26 ), three on fluoxetine (
27,
28,
34 ), and one each on mirtazapine (
29 ), nefazodone (
30 ), sertraline (
31 ), and venlafaxine (
32 ). The pooled relative risk of relapse was .54 (95% confidence interval [CI]=.46–.62), and the number of patients needed to treat to prevent one additional relapse over a mean time of eight months was five (CI=4–6). Heterogeneity among these trials was moderate (I²=47%). Tests for publication bias were not statistically significant.
Trials one year or longer: recurrence prevention. Eleven trials provided data points for follow-up of one year or more (
Figure 2 ): one on citalopram (
17 ), one on escitalopram (
42 ), one on fluvoxamine (
35 ), one on nefazodone (
36 ), two on paroxetine (
37,
41 ), four on sertraline (
18,
38,
39,
43 ), and one on venlafaxine (
40 ). Trials consistently favored active treatment over placebo for preventing recurrence, although differences were not always statistically significant. For example, during a 100-week comparison of sertraline 50–100 mg per day with placebo, 45% of sertraline-treated participants and 54% of placebo-treated participants had a recurrence, but differences were not statistically significant (
18 ). The pooled relative risk of recurrence was .56 (CI=.48–.66), and the number of patients needed to treat to prevent one additional recurrence over a mean time of 16 months was five (CI=4–6). Heterogeneity among these trials was moderate (I²=30%). Tests for publication bias were not statistically significant.
Meta-regression
Our meta-regression explored heterogeneity among included trials with regard to the duration of open-label treatment before random assignment of responders, the length of the postrandomization phase, and drug type (SSRI or other type of second-generation antidepressant). None of these variables influenced our estimates of effect size at a statistically significant level.
Adverse events
The most common adverse event documented in continuation- and maintenance-phase studies was headache, followed by nausea (weighted mean incidence=15.5% and 7.4%, respectively). Compared with the incidence of adverse events in acute-phase studies (
4,
45 ), the relative incidence of these events during long-term treatment was slightly lower. On the basis of 22 trials that provided sufficient data, loss to follow-up in general and loss to follow-up because of adverse events represented an average of 50% and 7%, respectively, of patients randomly assigned to receive active treatment and 68% and 4%, respectively, of patients randomly assigned to receive placebo. Based on data pooled from 17 placebo-controlled trials, the relative risk of dropping out for any reason was statistically significantly lower for active treatment than for placebo (relative risk=.75, CI=.69–.83). Data pooled from 18 placebo-controlled trials demonstrated that loss to follow-up because of adverse events was not statistically significantly different between active treatment and placebo (relative risk=1.42, CI=.92–2.20).
Discussion
We systematically assessed the efficacy and tolerability of second-generation antidepressants for the prevention of relapse and recurrence during treatment in the continuation and maintenance phases of major depression, respectively. Only a small number of trials directly compared one antidepressant with another. Results of these trials did not demonstrate statistically significant differences between duloxetine and paroxetine (
23 ), fluoxetine and sertraline (
19 ), fluvoxamine and sertraline (
20,
21 ), or trazodone and venlafaxine (
22 ) for preventing relapse or recurrence.
Although results are relatively consistent, we consider the strength of comparative evidence to be moderate because additional well-conducted studies could change our conclusions. Pooled data for second-generation antidepressants as a class compared with placebo suggest a relatively large effect size that persists over time, reflecting high-strength evidence for continued treatment beyond the acute phase. The number needed to treat to prevent one additional relapse during continuation-phase treatment or recurrence during maintenance-phase treatment is in the range of four to six patients.
The tolerability profile of continuation- and maintenance-phase treatment is fair to good. In clinical trials, 7% of patients randomly assigned to receive active treatment and 5% of patients assigned to receive placebo discontinued their continuation or maintenance phase of treatment because of adverse events. Although loss to follow-up was high (that is, 50% for active drug and 68% for placebo), the relative risk of discontinuing treatment because of adverse events did not differ significantly between active treatment and placebo. Overall loss to follow-up in acute-phase studies has been estimated at approximately 24% (
16 ), which is considerably lower than our estimates from continuation- and maintenance-phase studies. Our estimates are likely high because of longer trial duration, the preventive aim of this treatment, and misclassification of clinical endpoints (relapse or recurrence) as loss to follow-up.
Current practice guidelines for major depression recommend continuation-phase treatment for four to nine months for patients who demonstrate an adequate response to acute-phase treatment (
2,
3 ). For patients with recurrent depression, maintenance treatment is recommended. Our systematic review and meta-analysis provide relatively strong support for these guidelines.
On the basis of consistency of effect sizes over time, our review illustrates stable benefits of active treatment over placebo for up to two years of treatment. Although we demonstrated continued benefits of drug treatment over time, we were unable to draw inferences as to the most appropriate duration of antidepressant treatment. We identified only one randomized controlled trial that compared relapse rates for differing lengths of antidepressant treatment (
27 ), but the longest treatment arm in this trial may have been too small to provide adequate statistical power. Still, fluoxetine was shown to be more efficacious than placebo for up to 38 weeks (or approximately nine months) in this trial. More research is needed to determine the most appropriate length of therapy.
In a well-conducted systematic review of relapse prevention with first- and second-generation antidepressants in depressive disorders, Geddes and colleagues (
7 ) reported a 70% reduction in the odds of relapse for patients continuing antidepressant treatment compared with patients discontinuing treatment. The effect sizes reported in their analysis were "similar for all classes of antidepressants," but such unadjusted indirect comparisons may not be valid. To explore this further, we used data reported in the Geddes and colleagues review and converted their odds ratios to relative risks: .45 for active drug compared with placebo (CI=.41–.49) and .41 for second-generation antidepressants compared with placebo (CI=.35–.48). The confidence intervals for the relative risk of relapse that we calculated for only second-generation antidepressants (less than one year, CI=.46–.62; one year or longer, CI=.48–.66) overlapped the confidence interval for the second-generation estimate derived from the Geddes and colleagues review, and our analysis included nearly twice as many trials of newer antidepressants.
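The text does not state how the odds ratios were converted to relative risks. One standard conversion, shown here purely as an illustration, uses the risk in the comparison group: RR = OR / (1 − p0 + p0 × OR), where p0 is the placebo-group risk. The odds ratio of .30 corresponds to the 70% reduction in odds quoted above; the placebo risks below are hypothetical values chosen only to show how the result depends on p0.

```python
def odds_ratio_to_relative_risk(odds_ratio, control_risk):
    """Convert an odds ratio to a relative risk given the control-group risk.

    Uses RR = OR / (1 - p0 + p0 * OR). This is one common conversion and is
    an assumption here, not necessarily the method used in the review.
    """
    if not 0 < control_risk < 1:
        raise ValueError("control_risk must be strictly between 0 and 1")
    if odds_ratio <= 0:
        raise ValueError("odds_ratio must be positive")
    return odds_ratio / (1 - control_risk + control_risk * odds_ratio)


# OR of .30 = the 70% reduction in odds quoted in the text.
# The placebo relapse risks are hypothetical.
rr_low_risk = odds_ratio_to_relative_risk(0.30, 0.20)
rr_mid_risk = odds_ratio_to_relative_risk(0.30, 0.40)
rr_high_risk = odds_ratio_to_relative_risk(0.30, 0.60)
```

The relative risk moves further from the odds ratio as the placebo-group risk rises, which is why an odds ratio of .30 can correspond to a relative risk above .40 when relapse is common.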
Even though a small number of comparative studies found no statistically significant differences between second-generation antidepressants, we were unable to draw firm conclusions as to whether one drug may be better than another for long-term treatment. These comparative trials were extension trials and did not reassign patients randomly to continuation or maintenance treatment but rather gave them the option to continue on with their blinded acute-phase treatment. This trial design has been shown to produce overall lower relapse rates, but it also yields larger differences between active treatment and placebo than placebo-substitution trials that randomly assign participants at the time of successful completion of acute-phase or continuation-phase treatment (
8 ). For this reason, we limited our meta-analyses to placebo-substitution trials. Although this refinement did not answer the question of whether the extension or placebo-substitution design provides a more accurate assessment of the benefits of drug treatment, we believe our results provide strong evidence for the benefits of continuing versus discontinuing antidepressant treatment after successful acute- or continuation-phase treatment.
Although it is tempting to draw inferences about one drug compared with another by indirectly comparing effect sizes among placebo-controlled trials (
Figures 1 and
2 ), we caution against such inferences because included trials differed in design and because unadjusted comparisons may be inaccurate. Adjusted indirect comparisons usually agree with results of head-to-head comparisons, but only when the trials being indirectly compared are similar (
46,
47 ). Because of differences in trial design and in operational definitions (such as definition of relapse or recurrence) used by investigators, we chose not to conduct adjusted indirect comparisons. More research is needed to verify whether second-generation antidepressants differ in relapse rates.
Our analysis is limited chiefly by the sparse quantity and quality of available evidence addressing our research questions. Only a handful of comparative studies have been published, making it difficult to generalize about one drug compared with any other. Selective publication of relapse and recurrence prevention trials could influence our conclusions, as has been shown with acute-phase treatment trials of antidepressants (
48 ). Although our statistical tests did not detect significant bias toward the publication of positive results, such bias may nevertheless be present, which would lead to an overestimation of the treatment effect. This possibility is especially important given that most of the included studies were sponsored by pharmaceutical companies. Evidence for some drugs was limited to a single study. Although we conducted a meta-regression to explore heterogeneity, data were insufficient to assess all important differences among trials. One important distinction that we could not address is whether presenting with a history of a single depressive episode versus recurrent episodes made a difference in relapse or recurrence rates. Because this is a primary decision point for psychiatrists in deciding whether to continue with maintenance-phase treatment, more work is needed in this area.
Finally, we could not determine whether demographic factors such as patients' age influenced relapse rates, although results from trials with older participants were generally consistent with evidence from younger adult populations. Evidence from children was not considered.