Practice guidelines for treating bipolar disorder discourage the use of long-term antidepressant therapy (1–5). Several expert panels recommend discontinuing antidepressants within 12 weeks of recovery from bipolar depression (1–3). One clinician survey recommends discontinuing antidepressants within 8–20 weeks after recovery from depression (2). This time frame is considerably shorter than that recommended for relapse-prevention of unipolar major depressive disorder (6). However, there is no evidence to suggest that patients with bipolar disorder require less prophylactic antidepressant therapy than do patients with unipolar depression. We previously compared survival rates in recovered bipolar II patients and unipolar depression patients receiving fluoxetine monotherapy for 26 weeks (7) and found that somewhat fewer bipolar (22%) than unipolar (33%) patients relapsed during prophylactic therapy. While practice guidelines recommend antidepressant maintenance for recovered patients with unipolar depression (6, 8), the opposite recommendation is made for recovered bipolar patients (1–5, 9, 10).
Most practice guidelines for bipolar disorder derive from studies of both patients with bipolar I and bipolar II disorder taking different antidepressant medications. These studies have generally reported an increase in mood conversion episodes and depressive relapses during antidepressant use (10–14). While one randomized study of bipolar I and bipolar II depressed patients reported good antidepressant efficacy with more manic conversions with venlafaxine compared with sertraline or bupropion (10), a study comparing adjunctive bupropion with desipramine therapy in addition to established mood stabilizer therapy found no difference in antidepressant efficacy and a higher manic conversion rate with desipramine (15). A retrospective study of bipolar I and II patients (16) found that 44% of patients reported at least one mood conversion during antidepressant therapy if they had a history of prior mood conversions. However, the authors acknowledged the limitation of using retrospective self-report data in patients taking various antidepressants and mood stabilizers. Although a recent literature review (11) found rates of 20%–40% for antidepressant-induced mood conversions, it also found that adjunctive mood stabilizer therapy provided limited protection against antidepressant-induced mood conversions. Finally, several case-control and randomized trials of bipolar II patients found that long-term antidepressant therapy may not produce an increase in mood conversion episodes and may prevent depressive relapse (17–20).
In this article, we present results from a randomized double-blind placebo-substitution study of the long-term safety and efficacy of fluoxetine monotherapy of bipolar II disorder. We hypothesized that fluoxetine monotherapy would be superior to lithium monotherapy in preventing relapse or recurrence of depression. We further hypothesized that there would be a similar mood conversion rate among treatment groups.
Method
Patients
Outpatients at least 18 years old who had a DSM-IV-TR diagnosis of bipolar II disorder with a current major depressive episode and a score ≥16 on the 17-item Hamilton Depression Rating Scale (HAM-D; 21) were enrolled. Exclusion criteria were prior mania or psychosis, substance abuse or dependence within the preceding 3 months, nonresponse to fluoxetine within the current episode, sensitivity to fluoxetine or lithium, presence of an unstable medical condition, pregnancy, nursing, renal insufficiency, dementia, malignancy, and concurrent use of antidepressant or mood stabilizer therapy.
Procedures
Patients provided informed consent in accordance with the ethical standards of the University of Pennsylvania institutional review board. The study was conducted using the Good Clinical Practice guidelines with oversight by the university's Office of Human Research and an independent data and safety monitoring board.
A psychiatric history was verified using the Structured Clinical Interview for DSM-IV format (22). Medical history, physical examination, and laboratory tests (including blood urea nitrogen and creatinine levels, thyroid panel, pregnancy test in women, drug screen, and ECG) were performed. Estimates of the number of prior depressive and hypomanic episodes were obtained as defined by DSM-IV criteria.
Structured HAM-D and Young Mania Rating Scale (YMRS) (23) scores were obtained by a study clinician. Symptom ratings were obtained with attribution as to the origin of the symptom. For example, insomnia could be recorded on the HAM-D as a depressive symptom or recorded on the YMRS as a hypomanic symptom. It could also be simultaneously recorded on the HAM-D and YMRS as a mixed hypomanic and depressive episode symptom if the evaluator attributed the insomnia to both conditions. This rating method sometimes resulted in baseline YMRS scores that were above zero. This procedure has been employed in bipolar trials as a real-world means of distinguishing mood conversion episodes from depressive symptoms (20, 21, 24, 25).
Treatment
Initial fluoxetine monotherapy, at 20–80 mg/day, was administered on the basis of response and tolerability. Patients who had a final HAM-D score ≤8 by week 12 of treatment were randomly assigned to continuation monotherapy for an additional 50 weeks with fluoxetine at 10–40 mg/day, lithium at 300–1200 mg/day (with a serum level of 0.5–1.5 mmol/liter), or placebo. Patients assigned to the fluoxetine group who previously took >40 mg/day of fluoxetine had their double-blind fluoxetine dosage reduced to 40 mg/day, and those who previously took ≤40 mg of fluoxetine daily had their established double-blind fluoxetine dosages maintained. Patients assigned to the lithium group had their previous fluoxetine therapy discontinued. Lithium therapy was initiated at 600 mg daily for 1 week, and a serum lithium level was obtained. Based on tolerability and serum lithium level, the lithium dose was increased to 900 mg daily during week 2 of therapy. This process was repeated until a serum lithium level of 0.5–1.5 mmol/liter was achieved by week 4 of therapy. Patients assigned to the placebo group had their previous fluoxetine therapy discontinued and placebo substituted. This procedure was based on prior safety data from double-blind fluoxetine discontinuation trials (26). Sham serum lithium levels were reported to the study clinician for patients assigned to receive fluoxetine or placebo. Short-term zolpidem (≤10 mg), lorazepam (≤2.0 mg), or trazodone (≤75 mg) was permitted for severe insomnia.
Outcome measures were obtained at baseline (i.e., the last week of open-label fluoxetine therapy) and during double-blind treatment after weeks 1, 2, 4, 6, 8, 12, 20, 28, 36, 44, and 50. Because outcome assessments were not equally spaced in time, the number of days since baseline was used for statistical analyses.
Outcome measures included HAM-D and YMRS scores. A treatment-emergent side effect profile of reported and elicited adverse experiences was obtained at each study visit, including the onset and cessation date, the severity of the event, the relationship of the event to treatment, and the outcome.
Full syndromal depressive relapse was defined as an increase in baseline HAM-D score to ≥14 plus meeting criteria for a major depressive episode (6, 7, 19). A subsyndromal depressive episode was defined as any increase in baseline HAM-D score without meeting criteria for a major depressive episode. Patients experiencing subsyndromal depressive episodes underwent double-blind rescue therapy via upward or downward adjustment of medication within the allowable dosage (and serum lithium level) parameters.
Syndromal and subsyndromal hypomania were assessed via patient telephone reports of mood change and by clinician-elicited assessment of mood conversion symptoms in the preceding treatment period performed at each study visit. Hypomania was defined by DSM-IV criteria as an episode lasting ≥4 days with ≥4 symptoms. Subsyndromal hypomania was defined as an episode lasting ≤3 days with ≥4 symptoms or an episode lasting ≥4 days with ≤3 symptoms. Hypomania was also examined using YMRS score cut-points of ≥8 and ≥12 at any study visit (19, 20, 24). Patients experiencing syndromal or subsyndromal hypomania underwent double-blind rescue therapy via upward or downward adjustment of medication within the allowable dosage (and serum lithium level) parameters.
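The duration and symptom thresholds above can be restated as a small decision rule. The following is a minimal sketch in Python, not the study's rating procedure: the function name `classify_episode` is hypothetical, and returning `None` for episodes that meet neither threshold (or that have no days or no symptoms) is an assumption, since the text does not define those cases.

```python
from typing import Optional


def classify_episode(duration_days: int, symptom_count: int) -> Optional[str]:
    """Apply the stated thresholds: >=4 days and >=4 symptoms is hypomania;
    meeting only one of the two thresholds is subsyndromal hypomania."""
    if duration_days < 1 or symptom_count < 1:
        return None  # assumption: an episode needs at least one day and one symptom
    long_enough = duration_days >= 4
    enough_symptoms = symptom_count >= 4
    if long_enough and enough_symptoms:
        return "hypomania"
    if long_enough or enough_symptoms:
        return "subsyndromal hypomania"
    return None  # assumption: short, low-symptom episodes are left unclassified


print(classify_episode(5, 4))
print(classify_episode(3, 4))
print(classify_episode(6, 2))
```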
Sample Size
The sample size was chosen to provide sufficient power to test the original hypothesis that fluoxetine monotherapy is superior to lithium, fluoxetine plus lithium, and placebo in preventing depressive relapse at 6 and 12 months. Before the start of the trial, the study design was modified to drop the fluoxetine-plus-lithium treatment condition to specifically compare fluoxetine monotherapy and lithium monotherapy, with placebo as the control condition. The assumptions for our calculations were based on the results from a previous study of long-term fluoxetine monotherapy (7). Using Kaplan-Meier analyses, after 6 months the cumulative proportion of patients not relapsing was 0.7 on fluoxetine and 0.1 on placebo, and after 12 months the proportions were 0.6 and 0.1, respectively. To be conservative, we based our power analysis on the expected survival frequencies of 0.6 for fluoxetine and 0.18 for placebo. The group size needed to detect a significant difference between survival proportions of 0.6 and 0.18 with the log-rank test at the p<0.05 level with 80% power was 23. Assuming a 50% remission rate during initial fluoxetine therapy, the minimum number of patients necessary was 4×23×2=184. However, because the number of double-blind treatment conditions was reduced to three at the outset of the trial, our sample size was conservatively adjusted to 168 (yielding 28 patients per treatment condition).
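The enrollment arithmetic in this paragraph can be checked directly. The sketch below only reproduces the multiplication and division stated in the text; it does not re-derive the per-group figure of 23 from the log-rank power calculation. The function names are hypothetical.

```python
from fractions import Fraction
from math import ceil


def patients_to_enroll(per_arm: int, arms: int, remission_rate: Fraction) -> int:
    """Enrollment needed so that per_arm * arms patients reach randomization."""
    return ceil(Fraction(per_arm * arms) / remission_rate)


def randomized_per_arm(enrolled: int, arms: int, remission_rate: Fraction) -> Fraction:
    """Expected number of randomized patients per arm for a given enrollment."""
    return enrolled * remission_rate / arms


remission = Fraction(1, 2)  # the 50% remission rate assumed in the text

original_design = patients_to_enroll(per_arm=23, arms=4, remission_rate=remission)
revised_design = randomized_per_arm(enrolled=168, arms=3, remission_rate=remission)

print(original_design)  # 4 x 23 x 2
print(revised_design)   # 168 x 0.5 / 3
```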
Statistical Procedures
Analyses were conducted in Stata, version 10.0 (Stata Corp., College Station, Tex.), according to the intent-to-treat principle, with two-sided tests of hypotheses and an alpha of 0.05. Initial analyses were descriptive, summarizing demographic and clinical variables by treatment condition. Kaplan-Meier survival curves were plotted, as well as overlaid individual-level profile plots of YMRS scores. The log-rank test was used to compare the survival distributions to relapse between treatment groups. The mean time to relapse or recurrence in days was estimated for each treatment group.
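For readers unfamiliar with the Kaplan-Meier (product-limit) estimator behind these survival curves, the following is a minimal sketch in plain Python. It is not the Stata code used in the study, and the example data are invented for illustration only.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float],
                 relapsed: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (time, estimated survival) pairs at each observed relapse time.

    `relapsed[i]` is False when patient i was censored at `times[i]`.
    """
    events = Counter(t for t, e in zip(times, relapsed) if e)
    exits = Counter(times)  # relapses plus censorings leaving the risk set
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Patients censored at time t still count as at risk at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical data: days to relapse or censoring for five patients.
days = [30, 60, 60, 120, 350]
relapse_flags = [True, True, False, True, False]
curve = kaplan_meier(days, relapse_flags)
for day, s in curve:
    print(day, round(s, 2))
```

The log-rank test then compares such curves between treatment groups by contrasting observed and expected relapse counts at each relapse time.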
We fit a Cox proportional hazards model with two indicator variables: one for fluoxetine (which took a value of 1 for fluoxetine and 0 otherwise) and one for lithium (which took a value of 1 for lithium and 0 otherwise). We tested the null hypothesis that the hazard ratios were equal in both groups by estimating the hazard ratio for fluoxetine compared with lithium (primary test) and fluoxetine compared with placebo (secondary test). The Cox model also allowed for comparison of fluoxetine with placebo as a secondary test of equality of regression coefficients (27). Fisher's exact test was used to compare the proportion of patients in each treatment group who relapsed.
Mood conversion rates were compared between treatment groups using quasi-least squares (28). Quasi-least squares allowed for the use of a linear regression model with adjustment for the correlation between repeated measurements on each patient (29). Quasi-least squares regressed YMRS scores on the following covariates: baseline YMRS score; time (days from baseline divided by 7); indicator variables for fluoxetine and lithium (which took a value of 1 for fluoxetine or lithium and 0 otherwise); and time-by-treatment group interaction terms. We tested whether the time-by-treatment group regression coefficients differed significantly from zero (indicating that change over time for that group differed significantly from placebo). Quasi-least squares was used to estimate the expected YMRS scores at baseline and 350 days after baseline, and to obtain the estimated change over 350 days. We estimated the proportion of patients in each treatment group that had YMRS scores ≥8 and ≥12 at any time during treatment and compared these proportions using Fisher's exact test.
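The mean model described in this paragraph can be written out explicitly. The notation below is an assumption introduced for illustration (the symbols, the intercept term, and the indicator names F and L for fluoxetine and lithium do not appear in the original); it is one reading of the covariate list, with placebo as the reference group.

```latex
\begin{aligned}
\mathrm{E}[\mathrm{YMRS}_{ij}] &= \beta_0 + \beta_1\,\mathrm{YMRS}_{i0} + \beta_2\,t_{ij}
  + \beta_3\,F_i + \beta_4\,L_i + \beta_5\,(F_i\,t_{ij}) + \beta_6\,(L_i\,t_{ij}),\\
t_{ij} &= \frac{\text{days from baseline for patient } i \text{ at visit } j}{7}.
\end{aligned}
```

Under this reading, the tests of the time-by-treatment terms correspond to the null hypotheses β5 = 0 and β6 = 0, and the expected change over 350 days (t = 50) is 50β2 for placebo, 50(β2 + β5) for fluoxetine, and 50(β2 + β6) for lithium.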
Analysis of variance (ANOVA) and Kruskal-Wallis tests were used to compare mean and median changes in YMRS scores. These analyses allowed us to compare the greatest elevations in YMRS scores (compared with baseline) for each patient.
Finally, Fisher's exact test was used to compare the proportion of patients who had adverse events in each treatment group. The frequency of each event type by treatment was examined to identify potential differences in event "signal" among treatment groups. This was done because the study was not powered to detect small to moderate differences in adverse events between treatments. We note that failure to identify a significant difference between treatment groups does not prove that no clinically meaningful difference exists.
Results
Enrollment
A total of 167 patients were enrolled (Figure 1): 89 were women, with a mean age of 36.9 years (SD=12.7, range=19–76), and 78 were men, with a mean age of 37.9 years (SD=12.9, range=20–84). Nineteen patients (11.4%) did not pass screening, and 148 patients received initial fluoxetine monotherapy. Sixty-five patients (43.9%) discontinued initial fluoxetine monotherapy: 26 (17.6%) because of lack of efficacy, five (3.4%) because of adverse events, two (1.4%) because of noncompliance, and 32 (21.6%) for withdrawn consent or loss to follow-up. None discontinued initial fluoxetine monotherapy for hypomania.
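The patient flow reported in this paragraph is internally consistent, as the short check below illustrates. It uses only the counts stated above; the variable names and the helper `pct` are hypothetical.

```python
def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * part / whole, 1)


enrolled = 89 + 78                      # women + men
screen_failures = 19
treated = enrolled - screen_failures    # received initial fluoxetine monotherapy

discontinued = {
    "lack of efficacy": 26,
    "adverse events": 5,
    "noncompliance": 2,
    "withdrawn consent or loss to follow-up": 32,
}
total_discontinued = sum(discontinued.values())
remaining = treated - total_discontinued

print(enrolled, treated, total_discontinued, remaining)
print(pct(screen_failures, enrolled))   # share of enrolled patients
print(pct(total_discontinued, treated)) # share of treated patients
for reason, n in discontinued.items():
    print(reason, pct(n, treated))
```

Note that the screening-failure percentage uses all enrolled patients as its denominator, whereas the discontinuation percentages use the 148 patients who started fluoxetine.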
Eighty-three patients (49.7%) recovered during initial fluoxetine monotherapy, and 81 patients were randomly assigned to receive double-blind therapy with fluoxetine (N=28), lithium (N=26), or placebo (N=27). There were no statistically significant differences in baseline clinical or demographic characteristics among treatment groups (Table 1). The mean average lithium dose was 1027 mg/day (SD=210.8), and the mean average serum lithium level was 0.69 mmol/liter (SD=0.27). The mean average maximum fluoxetine dose was 34.3 mg/day (SD=7.9).
Relapse Prevention Efficacy
The mean time to full syndromal depressive relapse or recurrence was 249.9 days (95% confidence interval [CI]=186.8–312.9) for the fluoxetine group, 156.4 days (95% CI=92.3–220.6) for the lithium group, and 186.9 days (95% CI=113.0–260.7) for the placebo group. The significance level for a comparison of time to relapse on fluoxetine and lithium or placebo was p=0.03 (Figure 2).
The Cox proportional hazards ratio for relapse was significantly lower for fluoxetine compared with lithium (ratio=0.4; 95% CI=0.2–0.9; p=0.04). In contrast, the hazards ratio was not significantly different for lithium compared with placebo (ratio=1.2; 95% CI=0.6–2.4; p=0.7) or for fluoxetine compared with placebo (ratio=0.5; 95% CI=0.2–1.1; p=0.1). Thus, the estimated relapse hazard with lithium was 1÷0.4=2.5 (95% CI=1.1–5.0) times higher compared with fluoxetine monotherapy.
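The inversion in the last sentence follows from the fact that reversing the comparison takes the reciprocal of the hazard ratio and swaps the reciprocals of the confidence limits. A minimal sketch using the rounded values reported above (the function name is hypothetical):

```python
from typing import Tuple


def invert_hazard_ratio(hr: float, ci_low: float, ci_high: float) -> Tuple[float, float, float]:
    """Reverse the direction of a comparison: the reciprocal of the upper
    limit becomes the new lower limit, and vice versa."""
    return 1 / hr, 1 / ci_high, 1 / ci_low


# Fluoxetine versus lithium, as reported: ratio=0.4, 95% CI=0.2-0.9.
hr, low, high = invert_hazard_ratio(0.4, 0.2, 0.9)
print(round(hr, 1), round(low, 1), round(high, 1))

# Consistency check: with placebo as the common reference group, the
# fluoxetine-versus-lithium ratio equals the ratio of the two
# placebo-referenced hazard ratios (approximately, given rounding).
indirect = 0.5 / 1.2
print(round(indirect, 1))
```

Because the published figures are rounded to one decimal place, these checks can only confirm agreement to that precision.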
The proportion of patients who relapsed was 32.1% on fluoxetine, 57.7% on lithium, and 51.9% on placebo (Fisher's exact test, p=0.14). (Note that Fisher's exact test ignores time to relapse and censored data.) There was no significant difference in mean serum lithium levels between patients who relapsed (0.74 mmol/liter; SD=0.30) and patients who did not relapse (0.65 mmol/liter; SD=0.24).
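To make the comparison of relapse proportions concrete, the sketch below implements an exact test for a table with several groups and two outcomes (the Freeman-Halton extension of Fisher's exact test). Two things here are assumptions rather than statements from the study: the relapse counts (9 of 28, 15 of 26, and 14 of 27) are back-calculated from the reported percentages, and this "sum the probabilities of all tables no more likely than the observed one" definition may differ from the variant implemented in the study's software, so the result need not match the reported p value exactly.

```python
from itertools import product
from math import comb
from typing import Sequence, Tuple


def fisher_exact_kx2(table: Sequence[Tuple[int, int]]) -> float:
    """Exact p value for a k x 2 table of (relapsed, not relapsed) counts,
    conditioning on the row and column totals."""
    row_totals = [yes + no for yes, no in table]
    total_yes = sum(yes for yes, _ in table)

    def weight(yes_counts: Sequence[int]) -> int:
        # Unnormalized hypergeometric probability of one table.
        w = 1
        for n, k in zip(row_totals, yes_counts):
            w *= comb(n, k)
        return w

    observed = weight([yes for yes, _ in table])
    extreme = 0
    for yes_counts in product(*(range(n + 1) for n in row_totals)):
        if sum(yes_counts) == total_yes:
            w = weight(yes_counts)
            if w <= observed:  # integer comparison, no rounding error
                extreme += w
    return extreme / comb(sum(row_totals), total_yes)


# Counts inferred from the reported percentages (an assumption).
relapse_table = [(9, 19), (15, 11), (14, 13)]  # fluoxetine, lithium, placebo
p_value = fisher_exact_kx2(relapse_table)
print(round(p_value, 3))
```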
Hypomanic Symptoms
Ten patients had hypomanic episodes: three in the fluoxetine group, two in the lithium group, and five in the placebo group. The mean number of episodes was 1.1 (SD=0.3, range=1–2), and the mean episode duration was 9.6 days (SD=5.7, range=5–16). Twenty-one patients had subsyndromal hypomanic episodes: 10 in the fluoxetine group, seven in the lithium group, and four in the placebo group. The mean number of episodes was 1.2 (SD=0.4, range=1–2), and the mean episode duration was 10.7 days (SD=14.8, range=1–48). Two hypomanic episodes resulted in treatment discontinuation: one in the fluoxetine group and one in the placebo group.
There were no statistically significant differences in YMRS scores among treatment groups over time (Table 2). The estimated change in YMRS score from baseline to day 350 was –6.3 (95% CI=–47.5 to 34.9) in the fluoxetine group, 7.2 (95% CI=–33.3 to 53.8) in the lithium group, and 0.1 (95% CI=–1.0 to 1.2) in the placebo group.
Figure 3 displays the individual-level profile plots of YMRS scores. The number of patients with a YMRS score ≥8 at any study visit was six (21.4%) in the fluoxetine group, two (7.7%) in the lithium group, and three (11.1%) in the placebo group. The number of patients with a YMRS score ≥12 at any study visit was three (10.7%) in the fluoxetine group, two (7.7%) in the lithium group, and two (7.4%) in the placebo group. The mean value of the highest minus the baseline YMRS score was 4.0 (median=2.0) in the fluoxetine group, 3.0 (median=1.0) in the lithium group, and 3.7 (median=2.0) in the placebo group.
Safety and Tolerability
Overall, three patients (3.7%) withdrew because of adverse events: one (3.6%) in the fluoxetine group, one (3.8%) in the lithium group, and one (3.7%) in the placebo group. There were no serious adverse events. The most frequent adverse occurrences (≥10%) for all treatment groups were headache, polyuria, polydipsia, tremor, weight gain, agitation, and minor depressive symptoms. With the exception of an increase in resting systolic or diastolic blood pressure in four (14.3%) patients in the fluoxetine group, two (7.7%) in the lithium group, and three (11.1%) in the placebo group, no other clinically meaningful physical or laboratory events were observed during double-blind therapy. Four patients briefly received concomitant treatment with zolpidem, lorazepam, or trazodone for severe insomnia.
Table 3 lists the frequencies of reported and elicited psychiatric adverse events during double-blind therapy. We compared the distribution of adverse events in each treatment group that might suggest subsyndromal hypomania (e.g., agitation, insomnia, hyperthymia, irritability, pressured speech, racing thoughts, akathisia, mood lability, and hypomanic symptoms). There was no statistically significant difference among groups.
Discussion
Long-term antidepressant therapy of bipolar disorder has received relatively little attention (30). Although some practice guidelines have recommended the cautious use of antidepressants in bipolar depression, these guidelines differ widely on the optimum duration of antidepressant therapy (9). Moreover, most guidelines extrapolate from studies of patients taking tricyclic antidepressants or from studies of mixed bipolar I and II patient populations (10–13, 30–32). While all practice guidelines recommend discontinuing antidepressants after recovery from depression (1–5), this approach has not been adequately examined.
Although the benefit of antidepressant therapy in bipolar II disorder is supported by some studies (7, 18, 20, 15, 33–37), other studies have not confirmed this benefit or have reported an increase in mood conversions during antidepressant use (10, 11, 15, 38–40). A naturalistic study that reviewed outcomes of 54 bipolar patients receiving combined therapy with lithium plus an antidepressant for up to 5 years found that 50% of them remained well for 17.2 months longer than patients receiving lithium monotherapy (32). Manic conversions occurred in only 14% of patients (primarily taking tricyclic antidepressants). More recently, Kupfer et al. (18) observed a sustained remission rate of 67% over 24 weeks in 21 bipolar type I and II patients who responded to adjunctive citalopram therapy in combination with lithium therapy.
A nonrandomized case-control study (17) reported significantly fewer depressive relapses in bipolar patients who continued antidepressants (32%) compared with those who did not (68%). Moreover, antidepressant use was not associated with a greater mood conversion rate. However, this observational study included only 15% of the patients who initiated antidepressant treatment and continued it for 6 months. A subsequent retrospective study (33) that examined relapse rates in recovered bipolar patients who either continued (N=19) or discontinued (N=25) antidepressant therapy found that after 1 year of treatment, depressive relapse occurred in 68% of those who discontinued therapy and 32% of those who did not. Moreover, patients who continued therapy for at least 6 months were less likely to relapse (33).
Finally, we conducted a 26-week randomized double-blind placebo-substitution study of fluoxetine monotherapy in recovered bipolar II patients (19, 24). In that study, 43% of patients receiving fluoxetine and all of those receiving placebo relapsed during continuation therapy (p=0.08). While the mean increase in YMRS score in that study was slightly higher in the fluoxetine group than in the placebo group (mean=3.0 [SD=1.8] compared with mean=0.2 [SD=0.4], p=0.01), the difference between groups was not clinically meaningful. Moreover, no hypomanic episodes were observed.
Findings from the present study are not definitive. Several limitations and caveats should be considered in their interpretation. For example, although the removal of the combined lithium-plus-fluoxetine condition increased the cohort sizes, the study still had limited power to detect anything less than a substantial superiority of fluoxetine monotherapy. Moreover, it is possible that the lithium-plus-fluoxetine condition would have provided even greater efficacy than fluoxetine or lithium monotherapy.
We did not employ a patient-rated chrono-record for identifying ultrashort mood conversion episodes (10). It is possible that we missed some subsyndromal hypomanic episodes that occurred between study visits. However, differences among groups in the proportion of patients with even modest increases in YMRS score were neither statistically significant nor clinically meaningful.
The frequency and severity of mood conversion symptoms might have been greater if depressive and hypomanic symptoms had been rated without attribution as to cause or if we had used different threshold criteria (10, 15–18). Moreover, rating insomnia on both the HAM-D and YMRS could have inflated YMRS scores. While such inflation is possible, a similar effect would be expected for all treatment conditions. We found no significant differences in YMRS scores among groups.
It is possible that the frequency and severity of mood conversion episodes would have been greater if treatment duration had been longer. However, we found no clinically meaningful increase in YMRS scores after day 70. This observation is consistent with earlier studies of antidepressant use in bipolar II patients (7, 20, 35, 39, 40).
It is possible that the low mood conversion rate observed in this study resulted from the use of an "enriched" bipolar II population with more mild illness and a lower propensity for fluoxetine-induced mood conversions. However, the illness severity of patients in this study was similar to that of bipolar patients in prior studies that found similar frequencies of mood conversions during antidepressant monotherapy (10, 20, 35, 39, 40).
It is possible that we underestimated the frequency of subsyndromal hypomanic episodes by employing the YMRS cut-point values of 8 and 12. Also, this study was not powered to detect a difference in YMRS score between treatment groups (and, as noted earlier, failure to identify significant differences in YMRS scores between groups does not constitute proof that differences do not exist).
Finally, while it is possible that a delayed effect of fluoxetine withdrawal affected relapse rates in the lithium and placebo groups, the rate of relapse during the initial 50 days of double-blind therapy was similar for all treatment groups (Figure 2).