Relapse during antidepressant treatment is common and troublesome (1, 2), although the efficacy of continuation and maintenance pharmacotherapy is well supported (3–6). The etiology of such relapses, which occur in up to 40% of patients, is unknown (3, 7–14). Possible mechanisms include loss of placebo effect, diminished treatment adherence, increasing medication catabolism over time, and development of pharmacologic resistance to drug action.
Pattern analysis in antidepressant studies is based on the well-described delay in response to antidepressant medication (15) and the observation that medication responses tend to persist once attained (16). The predictive validity of retrospectively applied pattern analysis has been supported by three analyses of data from acute antidepressant clinical trials (17–19) and one analysis of continuation and maintenance treatment (14). A primary goal of this study was to confirm the findings of the latter study, but this time with a prospective design. A secondary goal was to replicate two earlier findings: that reverse neurovegetative symptoms (hypersomnia and weight gain) predict that fluoxetine will confer no more benefit than placebo in maintenance treatment, and that chronicity of illness predicts relapse with both fluoxetine and placebo (20). Finally, we wished to examine whether the atypical and melancholic subtypes of major depression were predictive of differential relapse rates between fluoxetine and placebo.
Method
Participants and Procedure
A total of 627 patients 18 to 65 years of age who met DSM-IV criteria for a current episode of major depressive disorder were recruited by research programs at the New York State Psychiatric Institute in New York City and the Depression Clinical and Research Program of Massachusetts General Hospital in Boston. The study was approved by institutional review boards at both sites, and all participants provided written informed consent. Diagnoses were established using the Structured Clinical Interview for DSM-IV Axis I Disorders, Patient Edition (21). No minimum score for severity of depressive symptoms was required for inclusion in the study. Baseline medical screening included medical history, physical examination, ECG, CBC, blood chemistry profile, thyroid function tests, urinalysis, and urine drug screen. Patients were excluded from the study if they were at significant risk of suicide; were pregnant or breast-feeding; were women not using effective contraception; had an unstable physical disorder; had a lifetime history of any organic mental disorder, psychotic disorder, or mania; had a history of seizures; had a neurological disorder that significantly affects CNS function; had been active substance abusers or had substance dependence in the previous 6 months, other than nicotine dependence; were taking medications that may cause or exacerbate depression; had clinical or laboratory evidence of hypothyroidism without adequate and stable replacement therapy; or had a history of nonresponse to an adequate trial of a selective serotonin reuptake inhibitor (defined as a 4-week trial of at least 40 mg of fluoxetine or the equivalent daily).
After a 1-week medication-free washout period, patients who continued to meet inclusion criteria and whose symptoms had not improved significantly began a 12-week course of open-label treatment with fluoxetine. They were seen weekly by a research psychiatrist for the first 6 weeks, biweekly for the next 4 weeks, and weekly for the remaining 2 weeks. Target fluoxetine doses were 10 mg/day for the first week, 20 mg/day for weeks 2–4, 40 mg/day for weeks 4–8, and 60 mg/day for weeks 5–12. The dose was increased to meet the target only if the patient tolerated the medication well, and it was increased to 40 mg daily for all patients who could tolerate it. Treatment response was rated on the 17-item Hamilton Depression Rating Scale (HAM-D) (22) and the Clinical Global Impression Scale (CGI) (23).
Chronicity was rated on a 6-point scale (1=single episode; 2=mainly well, with recurrent episodes; 3=chronic, with multiple remissions; 4=chronic, with no more than two remissions; 5=chronic, intermittent; 6=chronic, persistent). Patients’ neurovegetative symptom patterns were determined through a procedure we used previously (20). A modified HAM-D containing 20 items was administered at baseline. While the standard instrument contains items in which only typical positive symptoms such as insomnia and weight loss are ranked, the modified version contained in addition the reverse symptoms for those items, such as hypersomnia and weight gain. If a patient’s summary score for the standard 17 items was at least equal to his or her score with the positive-symptom items replaced by the reverse-symptom items, the patient was considered “positive neurovegetative”; if not, the patient was considered “reverse neurovegetative.”
Patients who responded to the medication by week 12 entered a discontinuation phase during which they underwent random assignment under double-blind conditions with computer-generated randomization, either to continue taking fluoxetine at the dose to which they had responded or to take placebo, for 52 weeks or until relapse. By convention, the first 6 months of this period were considered the continuation phase, and the remainder, the maintenance phase. Identical fluoxetine or placebo capsules were dispensed by a clinical research pharmacist who was blind to pattern of response. Compliance was monitored by counting returned capsules; participants whose adherence to the protocol was judged inadequate by the treating research psychiatrist were removed from the study.
A “true drug” response pattern was defined as a delayed and persistent response to treatment, with the first CGI improvement score of 2 or less occurring after the second week of treatment. A “placebo” response pattern was defined as an early and/or nonpersistent response, with the first CGI improvement score of 2 or less occurring before week 3 of treatment or a CGI improvement score above 2 occurring at any point after the start of response. Relapse during the double-blind discontinuation phase was defined as having at least 2 consecutive weeks of ratings of less than “much improved” on the CGI improvement scale compared with ratings at entry into the study.
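The classification rules above can be sketched in code. This is a minimal illustration, not the procedure the investigators used: the function names, the dictionary keyed by treatment week, and the handling of missed visits (simply ignored) are assumptions made for the example, and the relapse check assumes a list of consecutive weekly ratings.

```python
from typing import Dict, List, Optional


def first_response_week(cgi_by_week: Dict[int, int]) -> Optional[int]:
    """Return the first week with a CGI improvement score of 2 or less, or None."""
    for week in sorted(cgi_by_week):
        if cgi_by_week[week] <= 2:
            return week
    return None


def classify_response_pattern(cgi_by_week: Dict[int, int]) -> Optional[str]:
    """Label an open-label course as 'true drug' or 'placebo' pattern.

    'placebo': response begins before week 3, or any score above 2
    occurs after the start of response (nonpersistent).
    'true drug': response begins after week 2 and persists.
    Returns None if the patient never reached a score of 2 or less.
    """
    onset = first_response_week(cgi_by_week)
    if onset is None:
        return None
    early = onset < 3
    nonpersistent = any(
        score > 2 for week, score in cgi_by_week.items() if week > onset
    )
    return "placebo" if early or nonpersistent else "true drug"


def has_relapsed(weekly_cgi: List[int]) -> bool:
    """True if two consecutive weekly ratings are worse than 'much improved' (> 2)."""
    return any(a > 2 and b > 2 for a, b in zip(weekly_cgi, weekly_cgi[1:]))
```

For example, a patient whose scores first reach 2 at week 3 and stay at 2 or below is labeled "true drug", whereas a patient who reaches 2 at week 3 but is rated 3 at week 4 is labeled "placebo" because the response did not persist.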
Data Analysis
The analysis was based on modeling the time to relapse using Cox’s proportional hazards regression (24, 25), which expresses the hazard of relapse as a function of covariates and estimates regression coefficients to measure associations between survival (that is, continuing without relapse) and the covariates. The proportionality of the relevant hazards was confirmed before fitting the model, and the residuals from Cox’s regression model were examined for evidence of influential observations, outliers, and any violation of the proportional hazards assumption (26).
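For reference, the general form of the proportional hazards model can be written as follows. The notation is introduced here for illustration and is not taken from the study: h_0(t) denotes the unspecified baseline hazard, x_1, …, x_p the covariates, and β_1, …, β_p the regression coefficients.

```latex
h(t \mid \mathbf{x}) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\right),
\qquad
\frac{h(t \mid \mathbf{x}_a)}{h(t \mid \mathbf{x}_b)}
  = \exp\!\left(\boldsymbol{\beta}^{\top}(\mathbf{x}_a - \mathbf{x}_b)\right)
```

Because the baseline hazard cancels in the ratio, the hazard ratio between two covariate profiles does not depend on time; this is the proportionality assumption that the residual checks address.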
The inferences are based on a model for the hazard as a function of treatment (fluoxetine or placebo), response pattern group (“true drug” response or “placebo” response), the interaction between treatment and response pattern group, and eight covariates: age, gender, neurovegetative symptom pattern (positive or reverse), chronicity (as measured on the 6-point scale), subtype of major depression (atypical, melancholic, or neither), HAM-D score before the washout period, HAM-D score at randomization, and study site (New York City or Boston). First, a model containing two-way interactions between each of the covariates, treatment, and response pattern group was fitted. The model included all lower-order terms contained in the two-way interactions; no interactions between the covariates were considered. Inferences are based on the final model, which was arrived at by using one-term-at-a-time backward elimination that preserved the hierarchical principle (a lower-order term was not deleted if the model included a higher-order term containing it). All reported statistical tests are two-tailed.
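The hierarchical backward elimination described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code: the `p_values` callable is a hypothetical stand-in for refitting the Cox model and returning one p value per term, and the stopping threshold `alpha=0.05` is assumed because the text does not report the elimination criterion. Terms are represented as sets of variable names, so a main effect is a one-element set and a two-way interaction a two-element set.

```python
from typing import Callable, Dict, FrozenSet, List

Term = FrozenSet[str]  # {"age"} is a main effect; {"age", "treatment"} an interaction


def removable_terms(terms: List[Term]) -> List[Term]:
    """Terms that are not contained in any higher-order term still in the model."""
    return [t for t in terms if not any(t < other for other in terms)]


def backward_eliminate(
    terms: List[Term],
    p_values: Callable[[List[Term]], Dict[Term, float]],
    alpha: float = 0.05,  # assumed threshold; not stated in the text
) -> List[Term]:
    """Drop one term at a time, never deleting a term nested in a retained interaction."""
    current = list(terms)
    while True:
        candidates = removable_terms(current)
        if not candidates:
            return current
        pvals = p_values(current)  # refit the model on the current terms
        worst = max(candidates, key=lambda t: pvals[t])
        if pvals[worst] <= alpha:
            return current  # every removable term is significant
        current.remove(worst)
```

A main effect becomes eligible for deletion only after every interaction containing it has been removed, which is what preserves the hierarchy.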
Results
Open-Label Treatment
Of the 627 patients who consented to screening for the study, 34 (5.4%) were excluded for medical reasons, 18 (2.9%) did not return to begin treatment, and five (0.8%) improved significantly during the washout period and did not begin treatment. (We presume the low rate of improvement during the washout period was due to the absence of placebo treatment and the high chronicity among the participants, which may have limited placebo or nonspecific responses.) Of the 570 participants who began open-label treatment, 54.4% were women; 76.8% were Caucasian, 10.2% were African American, 8.2% were Hispanic, and the remainder were of other ethnicities. The participants’ mean age was 37.5 years (SD=11.3), and they had a mean of 14.7 years of education (SD=2.6); 72.8% were employed, students, or homemakers, and 17.9% were married. The mean HAM-D score at baseline was 17.7 (SD=4.5). A majority (59.1%) of participants had a chronic mood disorder and had experienced no more than two periods of well-being since the onset of illness.
The mean dose of fluoxetine taken by participants during the open-label phase was 45.8 mg/day (SD=15.1), and they took the medication for a mean of 9.7 weeks (SD=3.8). Of 366 participants who completed the 12-week open-label phase (64.2% of those who entered it), 292 (79.8%) were considered responders by CGI criteria at week 12; the intention-to-treat response rate was 51.2% (292 of 570 subjects). The mean HAM-D score at week 12 for responders was 5.1 (SD=3.3). Among the 204 patients who did not complete the open-label phase, the most common reason for noncompletion was removal from the study because of inadequate adherence to the protocol (34.4%), followed by dropping out because of inadequate response (18%) or side effects (14.4%).
Group Assignment
Of the 292 participants who were eligible, 30 elected not to continue in the study; of the remaining 262 participants, 131 each were randomly assigned to the fluoxetine and placebo groups. The participants who underwent random assignment had a mean age of 38.2 years (SD=10.9) and a mean of 15.1 years of education (SD=2.5); 55.3% were female, and 18.3% were married. Their mean HAM-D score was 17.1 (SD=4.1) at baseline and 4.9 (SD=3.1) at randomization; 22.7% of them had a history of dysthymia and thus currently had “double depression.” About one-third (35%) of the participants had one or more comorbid axis I disorders, most commonly panic disorder (13.3%), social phobia (12.4%), and alcohol dependence (10.6%) (percentages are nonexclusive).
During this phase, 85 participants left the study, on average 16.4 weeks (SD=2.0) after randomization; 34 of them were from the placebo group (26.0% of the placebo group), and 51 were from the fluoxetine group (38.9%) (χ²=4.5, df=1, p=0.035). The most common reasons for leaving during this phase were removal for inadequate adherence (30.6% of those who left the study), loss to follow-up (14.1%), and side effects (7.1%). Vigorous efforts were made to ensure that all dropouts were contacted and interviewed. If worsening of symptoms was one of the reasons a participant left the study, that patient was not counted as a dropout.
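The dropout comparison can be checked from the reported counts (34 of 131 in the placebo group and 51 of 131 in the fluoxetine group). The sketch below assumes a continuity-corrected (Yates) chi-square test on the 2×2 table; the text does not name the test, but the reported statistic matches the corrected value, whereas the uncorrected statistic is about 5.0. The function name is introduced for this example only.

```python
import math


def yates_chi_square(a: int, b: int, c: int, d: int):
    """Continuity-corrected chi-square (df=1) and p value for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * max(abs(a * d - b * c) - n / 2, 0) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # Upper-tail probability of a chi-square variable with 1 degree of freedom
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Rows: placebo, fluoxetine; columns: left the study, remained
chi2, p = yates_chi_square(34, 131 - 34, 51, 131 - 51)
print(round(chi2, 1), round(p, 3))
```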
Models for Survival Without Relapse
To examine the hypothesis that “true drug” and “placebo” response patterns during initial treatment are predictive of relapse, a survival model was constructed, and response pattern, treatment, and their interaction were entered. While fluoxetine treatment was significantly predictive of a lower relapse rate (hazard ratio=0.383, 95% confidence interval [CI]=0.198–0.742; χ²=8.10, df=1, p=0.004), there was neither a significant effect of response pattern (hazard ratio=1.097, 95% CI=0.676–1.779; χ²=0.14, df=1, p=0.71) nor an interaction between response pattern and treatment (hazard ratio=1.609, 95% CI=0.730–3.550; χ²=1.39, df=1, p=0.24). This finding fails to confirm our previous finding (20) that patients who had a placebo response pattern did no better with active medication than with placebo.
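The relationship between a reported hazard ratio, its confidence interval, and the chi-square statistic can be illustrated with a short calculation. The sketch assumes Wald tests with intervals computed on the log scale, which the text does not state explicitly; the function name and the constant are introduced here for illustration.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_from_ci(hr: float, lower: float, upper: float):
    """Recover an approximate Wald chi-square (df=1) and p value from a hazard ratio and 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_975)  # SE of the log hazard ratio
    z = math.log(hr) / se
    chi2 = z ** 2
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed p value
    return chi2, p


treatment = wald_from_ci(0.383, 0.198, 0.742)
pattern = wald_from_ci(1.097, 0.676, 1.779)
interaction = wald_from_ci(1.609, 0.730, 3.550)
```

Under this assumption, the three reported intervals reproduce the reported chi-square values to within rounding.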
In our previous study (20), we found that a neurovegetative symptom pattern was predictive of relapse rate during continuation treatment. Among patients who were positive neurovegetative (with the typical predominance of insomnia and decreased appetite), active medication showed a significant advantage over placebo, whereas among patients who were reverse neurovegetative (with a predominance of hypersomnia and increased appetite), it did not. We postulated that neurovegetative symptom pattern might be a surrogate for diagnosis, with patients categorized as reverse neurovegetative having an atypical subtype of depression.
To examine these predictions in the current study, we constructed survival models using neurovegetative symptom pattern, subtype diagnosis, and their interaction. In a model using neurovegetative symptom pattern and treatment, a positive neurovegetative symptom pattern was associated with a lower rate of relapse (hazard ratio=0.586, 95% CI=0.373–0.920; χ²=5.38, df=1, p=0.02); contrary to our previous findings (20), this association did not differ between fluoxetine and placebo (interaction term: ratio of hazard ratio=1.384, 95% CI=0.658–2.909; χ²=0.73, df=1, p=0.39). In the model examining diagnosis, neither melancholic subtype (hazard ratio=0.988, 95% CI=0.363–2.685; χ²=0.0006, df=1, p=0.98) nor atypical subtype (hazard ratio=1.079, 95% CI=0.477–2.444; χ²=0.03, df=1, p=0.85) interacted with treatment. Atypical subtype was significantly predictive of a greater risk of relapse, independent of treatment (hazard ratio=1.603, 95% CI=1.076–2.389; χ²=5.38, df=1, p=0.02). However, because atypical depression is more chronic than other types and chronicity was a strong predictor of relapse in our previous study, examination of this effect requires a model that includes chronicity.
To examine chronicity, its interaction with subtype diagnosis, and other potentially predictive variables, a full model was developed that incorporated treatment, response pattern, age, gender, neurovegetative symptom pattern (positive or reverse), chronicity (6-point scale), presence of a comorbid nonaffective axis I disorder, subtype of major depression (atypical, melancholic, or neither), pretreatment severity of depressive symptoms (HAM-D), degree of response at randomization (HAM-D), and study site (New York City or Boston). A second full model was fitted that substituted the presence or absence of lifetime dysthymia for chronicity, in order to assess the effect of double depression, a more widely used clinical construct describing chronicity.
Table 1 summarizes the best-fitting model for the hazard for relapse.
Treatment Effect
These results indicate that fluoxetine treatment during continuation and maintenance treatment was associated with continued remission (ratio of relapse hazard during placebo substitution to relapse hazard during fluoxetine continuation=1.73; 95% CI=1.20–2.51).
Figure 1 shows survival curves for the fluoxetine and placebo groups, with other variables set at their mean values. These curves depict a sharp but diverging decrement in survival for both groups during the first 6 months of double-blind treatment, and no further divergence after that point. The relapse rates at the end of the continuation phase (6 months after randomization) were 35.2% for the fluoxetine group and 61.8% for the placebo group; after 1 year, they were 45.9% for the fluoxetine group and 72.0% for the placebo group. This finding replicates previous findings that fluoxetine prevents relapse during both continuation and maintenance treatment (3, 14) and that the risk of relapse with discontinuation of medication is substantially greater during the continuation phase of treatment.
Gender
Independent of other variables, women were more likely to relapse during the continuation and maintenance phases, regardless of which treatment they received (Figure 2). This is an unexpected finding that was not seen in our previous continuation and maintenance study (14, 20) and, to our knowledge, has not been reported previously. Although gender differentials in antidepressant response rates have been reported, findings have been inconsistent, and differences have been in the direction of superior responses for women (27). In previous work examining this issue in 1,746 patients, our group did not find a gender effect for acute response to antidepressants, including fluoxetine (28).
Chronicity, Comorbidity, and Site
In this study, greater chronicity of illness was strongly associated with relapse during continuation and maintenance treatment (Figure 3; hazard ratio=1.709, 95% CI=1.140–2.563), and no difference was observed between treatment groups. This finding replicates our previous finding of an association between chronicity and relapse. The fairly high degree of chronicity in this sample may explain the relatively high relapse rates in both the fluoxetine and placebo groups. Once chronicity was included in the model, the subtype of depression was no longer significant, suggesting that the higher relapse rate among patients with atypical depression was accounted for by the greater chronicity in this group. The presence of double depression in 22.7% of the subjects who entered the randomization phase did not account for significant variance in outcome when presence or absence of double depression was substituted for the ordinal chronicity measure.
After accounting for chronicity in the model, a reverse neurovegetative symptom pattern was still associated with a significantly higher risk of relapse. The presence or absence of any comorbid nonaffective axis I diagnosis was not predictive of relapse. The most common comorbid diagnoses—panic disorder, social phobia, and alcohol dependence—were not predictive of relapse when included as individual covariates in the model. The New York City site was significantly more likely than the Boston site to rate patients as relapsed when the other indicated variables were controlled for, although this propensity did not differ by treatment.
Severity
Severity of depression after completion of initial treatment, as measured by the HAM-D, had a marked effect on hazard for relapse. This effect was independent of treatment and increased with level of pretreatment severity of illness. For example, a 3-point increase in posttreatment HAM-D score increased the hazard for relapse by 33% for those with a pretreatment HAM-D of 20, and by 82% for those with a pretreatment score of 25.
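The example above implies an interaction between posttreatment and pretreatment HAM-D scores. A minimal sketch follows, assuming the log hazard is linear in the posttreatment score with a slope that varies linearly with the pretreatment score. The coefficients are back-calculated from the two reported percentages purely for illustration; they are not the Table 1 estimates, and all function and variable names are hypothetical.

```python
import math


def implied_coefficients(pre_a, pct_a, pre_b, pct_b, delta=3):
    """Solve for the per-point log hazard slope (main) and its change per pretreatment point.

    Assumed model: log hazard ratio for a `delta`-point posttreatment increase
    equals delta * (main + interaction * pretreatment_score).
    """
    slope_a = math.log(1 + pct_a / 100) / delta
    slope_b = math.log(1 + pct_b / 100) / delta
    interaction = (slope_b - slope_a) / (pre_b - pre_a)
    main = slope_a - interaction * pre_a
    return main, interaction


def hazard_increase_pct(main, interaction, pre, delta=3):
    """Percentage increase in hazard for a `delta`-point rise in posttreatment HAM-D."""
    return 100 * (math.exp(delta * (main + interaction * pre)) - 1)


# Reported: +33% at a pretreatment score of 20, +82% at a pretreatment score of 25
main, interaction = implied_coefficients(20, 33, 25, 82)
```

The positive implied interaction term reflects the statement that the effect of residual symptoms increased with pretreatment severity. Values outside the range of the two reported points should not be extrapolated from this sketch.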
Discussion
In this prospective study, we were unable to replicate the findings of two previous studies (14, 19) that retrospectively applied pattern analysis to continuation and maintenance studies to predict relapse. Unlike in our previous study (14, 20), in this study we did not use a fixed dose of 20 mg of fluoxetine but rather raised the dose for patients who did not respond to treatment with the initial dose. This process may have limited the number of placebo responses observed, reducing the study’s statistical power to show a difference in relapse by treatment response pattern, although we did not even observe a numerical difference in the expected direction. Also, the dose increases that were allowed during the open-label phase of the study may have led to delayed placebo or nonspecific responses, since patients were informed that higher doses of fluoxetine were used to help with initial nonresponse. Hence, some patients who responded to nonspecific factors, such as expectations related to dose increases, would have been misclassified as true drug responders. Finally, both previous discontinuation studies (14, 19) required a minimum HAM-D score for entry and a maximum to enter the randomization phase, which the current study did not. It could be that pattern analysis is not as powerful with samples of subjects who have had a less robust response to medication treatment; if that is the case, the clinical utility of pattern analysis would be significantly limited.
Given these differences, we cannot resolve the discrepancy between these and previous findings. One possibility is that these findings simply resulted from a type II error in which, by chance, a true difference went undetected because of random error or insufficient statistical power. Another possibility is that the earlier studies represent false positives or type I errors. Since two of the three reported studies are positive, it seems unlikely that chance alone could account for this. Only independent replication by other investigators will resolve this issue.
This study has replicated our previous observation that chronicity is a strong predictor of relapse in both fluoxetine and placebo groups (20). This may be due to chronicity’s being associated with only a transient placebo response or to chronicity’s being a risk factor for true tachyphylaxis. Considering all the evidence, it seems likely that at least some apparent loss of medication efficacy during continuation and maintenance treatment is due to loss of placebo effect, but pattern analysis of treatment response is probably an imperfect way of determining how much improvement in patients receiving medication is due to placebo effect. The determination of whether true tachyphylaxis occurs with antidepressants is a more difficult problem that must be investigated with methods beyond those used in clinical trials. One possible approach would be to use brain imaging to determine whether changes in brain metabolism associated with symptom improvement during medication treatment (29) revert to baseline in some cases despite continued medication.
The relapse rates with both fluoxetine and placebo are relatively high compared with those reported by Montgomery et al. (3), who used a 40 mg dose of fluoxetine, and comparable to those observed in our previous study using a fixed dose of 20 mg (14, 20). The Montgomery et al. study also differed somewhat from ours in design, as patients took fluoxetine for a longer period before randomization; also, the patients’ illness was probably less chronic. In population and entry criteria, our previous study (14, 20) was comparable to this study. The apparent lack of difference in relapse rates between patients taking 20 mg of fluoxetine in our previous study and those taking the higher doses in this study suggests but does not prove the lack of a dose effect.
A reverse neurovegetative symptom pattern was found to be associated with an increased risk of relapse, which confirms our previous findings and suggests that patients with these features may have a poorer natural course even with active treatment. Finally, higher HAM-D scores at the end of the acute treatment phase were also found to be associated with an increased risk of relapse, confirming previous reports of higher relapse or recurrence rates among patients with residual symptoms than among relatively asymptomatic patients who are in remission (30).
Limitations that should be kept in mind when interpreting the results of this study include the fact that participants sought treatment at tertiary care centers, whose patient populations are probably not representative of those in other settings. Also, our assessments of treatment adherence were based on counts of returned capsules and clinical judgment. Although adherence was deemed to be good overall, poor adherence could have accounted for some of the relapses in our sample. Finally, fluoxetine blood levels were not obtained; these might have provided additional information, although in one large study (31), they were not found to be predictive of relapse.
This study has several implications for clinical treatment. First, maintenance treatment with fluoxetine in patients whose major depression has responded is effective across a range of severity and chronicity. Second, even among patients with a true drug response pattern, there is a significant risk of relapse despite maintaining active medication at the full dose to which the patient originally responded. This raises the concern that true tachyphylaxis rather than loss of placebo response accounts for some of these relapses. Finally, patients with highly chronic depression do respond to fluoxetine treatment and do benefit from maintenance treatment, although they relapse at a higher rate than patients whose illness is less chronic, both on active medication and placebo. Future studies of continuation and maintenance antidepressant treatment should focus on ways to distinguish loss of pharmacologic benefit with antidepressant medication from loss of placebo effect and explore ways to prevent and treat such relapses.
The replication in this study of the finding in previous work (20) that chronicity of depression is a strong predictor of relapse, with both medication and placebo treatment, has implications for future research. It strongly suggests that any maintenance study should be stratified by chronicity to ensure that this factor does not have a confounding effect on relapse rates. The finding of a higher relapse rate for women may be a statistical artifact, but this should be examined in future maintenance studies.