However, results from randomized, controlled trials are difficult to generalize to clinical practice because they typically include only bipolar I patients and exclude those with substantial medical or psychiatric comorbidity, particularly substance abuse. Randomized trials also generally involve monotherapy, even though in clinical practice most patients receive multiple medications
(8). Furthermore, randomized pharmacotherapy trials generally do not allow adjunctive psychosocial therapies, even though such interventions have been shown to decrease the risk of recurrence
(6). Conversely, naturalistic studies often use select groups (e.g., first-episode patients, bipolar I/psychotic patients), many were conducted before the widespread use of newer pharmacotherapies for bipolar disorder, and all but two included fewer than 100 individuals with bipolar disorder (2, 9–13). Therefore, the extent to which modern treatment approaches may improve outcomes in actual clinical populations with bipolar I and II disorder remains to be established. Likewise, the limited size of most prior naturalistic studies yielded little power to detect clinical predictors of risk of recurrence.
Method
Study Overview
STEP-BD is a multicenter study designed to evaluate longitudinal outcomes in individuals with bipolar disorder. The overall study combines a large prospective naturalistic study using a common disease-management model and a series of randomized, controlled trials that share a battery of common assessments
(14). All participants receive standardized ongoing assessment, regardless of whether they are participating in randomized treatment; this report includes data from the prospective naturalistic study only.
Participants
The study was approved by the human research committees (institutional review boards) of all participating treatment centers and the data coordinating center, and oral and written informed consent was obtained from all participants before participation in study procedures. For participants ages 15–17, written assent was obtained, with written informed consent obtained from a parent or legal guardian. STEP-BD participation was offered to all diagnostically eligible patients seeking outpatient treatment. To enter STEP-BD, the participants were required to be at least 15 years of age and to meet DSM-IV criteria
(19) for bipolar I disorder, bipolar II disorder, cyclothymia, bipolar disorder not otherwise specified, or schizoaffective disorder, bipolar subtype. Exclusion criteria were limited to an unwillingness or inability to comply with study assessments, an inability to give informed consent, or being an inpatient at the time of enrollment (although hospitalized patients could enter STEP-BD after discharge). The present report draws on the first 2,000 participants to enter STEP-BD with up to 2 years of data from the point of enrollment, focusing on the 858 who entered in a symptomatic state and achieved recovery during those 2 years.
Assessments
The Affective Disorder Evaluation (ADE)
(14) uses adaptations of the mood and psychosis modules from the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID)
(19). It was administered by ADE-certified psychiatrists to all participants at study entry and was the primary means of establishing a diagnosis. The ADE also included a systematic assessment of lifetime and recent course of illness based on patient reports, including age at illness onset, number of lifetime prior episodes, number of episodes and proportion of days in each mood state in the prior year, and longest period of euthymia.
The Mini-International Neuropsychiatric Interview, version 4.4 (MINI) (20), was used to confirm bipolar diagnosis and establish comorbid axis I illness and was administered by MINI-certified study clinicians upon study entry. The MINI is a brief structured interview designed to identify the major axis I psychiatric disorders in DSM-IV and ICD-10. The MINI has been compared to DSM-III-R and has been found to have acceptably high validity and reliability scores
(20). The MINI and ADE were completed by different study clinicians, and a consensus diagnosis of one of the eligible bipolar disorders was required on both the ADE and MINI for study entry. Where the two instruments initially yielded discordant results, a consensus conference was convened to review all sources of data and determine best-estimate diagnosis and eligibility.
The clinical monitoring form (CMF) (14, 21), which collects DSM-IV criteria for depressive, manic, hypomanic, or mixed states, was administered by CMF-certified study clinicians at each follow-up visit. Each criterion is scored on a 0–2 scale, where 1 or greater is syndromal and 0.5 is subthreshold. In addition to characterizing current mood state (over the past 7 days), the CMF produces total scores for current depressive symptoms and manic symptoms, which have been shown to be highly correlated with standard mood rating scales, including the Montgomery-Åsberg Depression Rating Scale (MADRS) (22) and the Young Mania Rating Scale (14, 21, 23). In completing the CMF, as with similar rating scales, the clinician is expected to use any available sources of information, including direct observation and questioning, as well as collateral informants when available.
Acceptable interrater agreement was established by requiring all treating psychiatrists to complete standardized training in the administration of the CMF and ADE, which required scoring of videotaped interviews. Upon completing this training, intraclass correlation coefficients for individual SCID mood items ranged from 0.83 to 1.00, with most items greater than 0.95, suggesting a high degree of uniformity in symptom ratings. Periodic monitoring was continued during the study to ensure that rating standards were maintained
(14). The MADRS and the Young Mania Rating Scale were also administered on a quarterly basis for the first year and then every 6 months thereafter by trained raters as an independent validation of CMF ratings.
Intervention
Because STEP-BD was designed as an effectiveness study, participants in the Standard Care Pathway of STEP-BD could receive any intervention felt to be clinically indicated by their clinician. However, study clinicians were trained (with a minimum of 20 credit hours of basic teaching related to the management of bipolar disorder, described elsewhere
[14]) to use model practice procedures. In addition to the monitoring procedures noted, clinicians used pharmacotherapy guidelines based on published treatment guidelines (24–26). This approach does not require adherence to a specific treatment algorithm. Instead, it emphasizes application of evidence-based treatments at every decision point in treatment rather than mandating a single treatment.
Because adjunctive psychosocial interventions have been reported to augment the efficacy of pharmacotherapies and to reduce treatment costs (6, 7, 27, 28), STEP-BD also incorporated a core psychosocial intervention referred to as collaborative care (14). The protocol specified that participants entering STEP-BD received a workbook and videotape describing this model, which emphasizes alliance-building as well as techniques for managing stress, negative cognitions, problems in interpersonal interactions, and sleep disruption. In collaboration with the treating clinician, participants completed a treatment contract describing their typical mood symptoms and interventions for managing them (29).
Outcomes
The participants were seen in follow-up as often as clinically indicated rather than at a fixed interval. Their clinical state was assessed at each follow-up visit with the CMF and was used to define the mood states that represent the primary outcome measure. Recovery was defined as two or fewer syndromal features of mania, hypomania, or depression for at least 8 weeks, consistent with standard DSM-IV criteria for partial or full remission and with criteria used in the prior National Institute of Mental Health (NIMH) Collaborative Study of Depression
(30). Recurrence was defined as meeting the full DSM-IV criteria for a manic, hypomanic, mixed, or depressive episode on any one follow-up visit. Subsyndromal mood symptoms were defined as the presence of more than two syndromal features of either depression, mania, or hypomania without meeting the full DSM-IV criteria for a mood episode. Occurrence of subsyndromal mood symptoms during follow-up was not considered a recurrence.
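As an illustration only, the outcome definitions above can be written as a small classification rule. The sketch below is not the study's scoring software. The function names, the use of a single combined count of syndromal features per visit, and the choice to date recovery at the first visit where the low-symptom run reaches 8 weeks are all assumptions made for the example.

```python
from datetime import date

RECOVERY_DAYS = 8 * 7  # "at least 8 weeks" of two or fewer syndromal features


def classify_visit(n_features, meets_full_criteria):
    """Label one follow-up visit (hypothetical helper, not from STEP-BD)."""
    if meets_full_criteria:
        return "episode"        # full DSM-IV criteria for a mood episode
    if n_features > 2:
        return "subsyndromal"   # more than two features, no full episode
    return "low"                # two or fewer syndromal features


def first_recovery_date(visits):
    """Return the first visit date at which a 'low' run has lasted >= 8 weeks.

    visits: list of (date, n_features, meets_full_criteria), sorted by date.
    Returns None if recovery is never reached.
    """
    run_start = None
    for day, n_features, full in visits:
        if classify_visit(n_features, full) == "low":
            if run_start is None:
                run_start = day
            if (day - run_start).days >= RECOVERY_DAYS:
                return day
        else:
            run_start = None    # any subsyndromal or episode visit resets the run
    return None
```

Under these assumptions, a participant with low-symptom visits on February 1, March 1, and April 1 would be counted as recovered at the April visit, because only then has the low-symptom run lasted at least 56 days.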
Statistical Analysis
The at-risk population for this analysis, drawn from the first 2,000 participants enrolled in STEP-BD, was defined as the individuals (N=858) who had not recovered by study entry (i.e., were experiencing at least three clinically significant features of either mania, hypomania, or depression or had experienced these symptoms within 8 weeks) but who subsequently achieved a recovered state (i.e., two or fewer clinically significant features of mania, hypomania, or depression).
Time to an event/censoring was defined as the number of days from baseline (first recovered visit) to the time at first event (recurrence) or, for the participants who had no recurrence, the last CMF available within the 2 years of follow-up from the time of enrollment. In the analysis of depressive recurrence, a case with any manic, hypomanic, or mixed episode occurring before the depressive relapse was treated as censored at the time of first nondepressive relapse and vice versa for the manic/mixed recurrence analysis.
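The censoring rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function name, the polarity labels, and the input layout are hypothetical.

```python
from datetime import date


def time_to_event(baseline, episodes, last_cmf, target="depressive"):
    """Return (days_from_baseline, event_observed) for one participant.

    baseline: date of the first recovered visit.
    episodes: list of (date, polarity) sorted by date, with polarity in
              {"depressive", "manic", "hypomanic", "mixed"} (assumed labels).
    last_cmf: date of the last available CMF within the follow-up window.
    target:   "depressive" or "elevated" (manic/hypomanic/mixed analysis).
    """
    for day, polarity in episodes:
        if day <= baseline:
            continue  # ignore anything recorded before recovery
        is_depressive = polarity == "depressive"
        is_target = is_depressive if target == "depressive" else not is_depressive
        # The first recurrence ends follow-up: it is an event if it matches the
        # pole under analysis and a censoring time otherwise.
        return (day - baseline).days, is_target
    # No recurrence: censor at the last available CMF.
    return (last_cmf - baseline).days, False
```

For example, a participant whose first recurrence after recovery is manic contributes a censored observation to the depressive analysis and an observed event to the manic/hypomanic/mixed analysis, both at the same follow-up time.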
Cox regression models were used to examine the association between individual predictors and time to depressive recurrence and time to manic/hypomanic/mixed recurrence. All terms were then entered into a stepwise Cox regression model, with p value to enter or be removed from the model set at 0.05.
For the bivariate analyses of putative predictor variables, we elected to test primary (a priori) hypotheses with an alpha of 0.05. These variables included prior number of mood episodes, axis I comorbidity, and presence of residual manic or depressive symptoms at baseline. For secondary (hypothesis-generating) analyses of univariate predictors, we applied a Bonferroni correction. Thus, for 28 independent comparisons in each experiment (time to depression and time to manic/hypomanic/mixed episode), p<0.05/28=0.0018 was considered statistically significant. These variables included sociodemographic data (sex, age at study entry, years of education, marital status, and income), recent clinical course (rapid cycling at baseline; number of manic or depressive episodes in the past year; percent days of depression, anxiety, and mood elevation in the past year by patient report; polarity of most recent episode), lifetime clinical course (age at illness onset, duration of illness, number of hypomanic/manic/mixed episodes and depressive episodes, longest period of euthymia, prior suicide attempts, and lifetime history of psychosis), comorbid psychiatric illness (lifetime or current substance use disorders, anxiety disorders, or eating disorders), and family history of bipolar disorder.
We included p values <0.05 in the presentation of results for two reasons. First, given the clinical importance of recurrence and the paucity of long-term predictors in bipolar disorder, even modest associations were felt to be of potential clinical significance if they could be replicated, and thus, we were more concerned about type II errors. Second, many of the variables examined as predictors are expected to be correlated; thus, full Bonferroni correction for multiple comparisons is likely overly conservative.
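The two significance thresholds described in this section amount to a short piece of arithmetic, sketched here for concreteness. The constant and function names are illustrative assumptions; only the alpha of 0.05 and the 28 comparisons come from the text.

```python
ALPHA = 0.05       # threshold for primary (a priori) hypotheses
N_SECONDARY = 28   # independent comparisons in each secondary analysis


def significance_threshold(primary):
    """Return the p-value cutoff: unadjusted for primary, Bonferroni otherwise."""
    return ALPHA if primary else ALPHA / N_SECONDARY


def is_significant(p_value, primary):
    """True if p_value falls below the cutoff for its hypothesis class."""
    return p_value < significance_threshold(primary)


# Bonferroni-adjusted cutoff, rounded as reported in the text
print(round(significance_threshold(primary=False), 4))  # 0.0018
```

A secondary predictor with p=0.01 would therefore be presented in the results, since it is below 0.05, but would not meet the corrected threshold.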
Discussion
In this prospective cohort study, only slightly more than half of the participants (58.5%) who were symptomatic at study entry achieved recovery during up to 2 years of follow-up. Furthermore, 48.5% of the participants experienced recurrence during up to 2 years of follow-up; the majority of recurrences (70%) were to the depressive pole, with a ratio of 2.5:1 for depressive recurrence versus manic/mixed/hypomanic episodes. Taken together, these results demonstrate that mood episodes in bipolar disorder, and particularly depressive episodes, are prevalent and likely to recur in spite of guideline-based treatments. Indeed, participants in STEP-BD received evidence-based care from specialized clinicians with training in the use of standardized assessments, combination pharmacotherapy, and psychosocial treatments where appropriate. In addition, participants received at minimum a core psychoeducational intervention. The finding that nearly half of the study participants nonetheless suffered at least one recurrence during follow-up highlights the need for development of new interventions in bipolar disorder.
Numerous other recent longitudinal studies report the results of follow-up without guideline-based treatment. Three studies in the past decade followed patients prospectively but naturalistically for at least 2 years; none included hypomania as an endpoint. In the McLean-Harvard First-Episode Mania Study, Tohen et al.
(2) reported that 34% of 154 individuals with bipolar I who had achieved recovery from a first manic or mixed episode suffered a recurrence within 2 years (55% of which were manic or mixed episodes); median time to recurrence was 32.3 weeks. A similar study by Bromet et al.
(9) examined 103 individuals with bipolar I disorder who were psychotic at the time of initial hospitalization and subsequently achieved remission; these patients were followed for 4 years. Median time to recurrence in this sample was 87 weeks, with 24.3% suffering recurrence by 6 months, 35.9% by 1 year, and 61.2% by 4 years. Three-quarters of this sample were manic at study entry, so it is perhaps not surprising that the majority of recurrences were also to manic or mixed episodes. In an outpatient study that did not examine first-admission patients (N=82), Gitlin and colleagues
(10) found that mean survival was nearly 3 years, and the 1-year recurrence rate was 37%. Finally, a fourth cohort, the NIMH Collaborative Study of Depression, included 152 individuals with bipolar I who achieved recovery during prospective follow-up
(30). That study found 1-year recurrence rates between 48% and 57%; of note, it was completed before more modern pharmacotherapies for bipolar disorder other than lithium were available and considered minor depression and hypomania as recurrence, unlike the three more recent studies.
Prospective longitudinal outcome studies in bipolar disorder thus exhibit substantial heterogeneity in terms of patient populations, available treatments (including psychosocial interventions), definitions of recurrence, and use of algorithms or guidelines. Such heterogeneity precludes direct comparison of our outcomes with those of prior studies. What is clear is that in all of these studies, recurrence rates remain substantial.
We confirmed several predictors of recurrence. In particular, residual mood symptoms early in recovery appear to be a powerful predictor of recurrence, particularly for depression. Risk of depressive recurrence increases by 14% for every DSM-IV depressive symptom present at recovery and by 20% for every manic/hypomanic symptom present at recovery. This is consistent with the work of Keller et al. (15), who found that subsyndromal symptoms were associated with risk of recurrence, although that study examined lithium-treated patients only and did not specifically examine residual symptoms at the time of initial recovery. Data from antidepressant trials in major depressive disorder also highlight the importance of complete symptom remission as a treatment goal
(32). Given the prevalence of residual or subthreshold symptoms among bipolar I and II patients (33, 34), this finding suggests that aggressively targeting subthreshold symptoms may offer a substantial opportunity to improve outcomes. Of note, residual manic symptoms appear to confer risk for both manic and depressive recurrence. The elevated risk for depressive recurrence is only evident beyond about 12 weeks (Figure 1), suggesting that it is unlikely to indicate solely what has been referred to as “postmanic depression,” i.e., individuals who transition directly from mania to depression (35).
Because many of the hazard ratios presented refer to continuous variables (e.g., percent of days with anxiety symptoms), the magnitude of effect can appear small. However, these statistical differences do appear to be clinically significant. For example, with a hazard ratio of 1.008 for percent days anxious, if two otherwise similar participants have a 25% difference in days with anxiety, the more anxious patient is nearly 5% more likely to have illness recurrence before the less anxious patient.
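The arithmetic behind this example can be reproduced in a few lines. The sketch below assumes proportional hazards and independent recurrence times for the two participants, and it reads "nearly 5% more likely" as the probability that the more anxious participant recurs first (about 0.55 rather than 0.50). That reading is an interpretation, not something the text states explicitly, and the function names are hypothetical.

```python
def scaled_hazard_ratio(hr_per_unit, delta):
    """Hazard ratio for a difference of `delta` units in a continuous predictor."""
    return hr_per_unit ** delta


def prob_recurs_first(hr):
    """P(higher-risk participant recurs before the other) under proportional hazards."""
    return hr / (1.0 + hr)


hr_25 = scaled_hazard_ratio(1.008, 25)  # 25 percentage-point difference in anxious days
p_first = prob_recurs_first(hr_25)

print(f"hazard ratio for a 25-point difference: {hr_25:.2f}")       # 1.22
print(f"probability more anxious patient recurs first: {p_first:.2f}")  # 0.55
```

On this reading, a per-unit hazard ratio that looks negligible (1.008) compounds to roughly a 22% higher hazard across a 25-point difference in percent days anxious.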
Two previous naturalistic studies suggested poorer outcomes for individuals with psychiatric comorbidity (13, 18), and previous analyses of STEP-BD data likewise found evidence of a poorer retrospective (16) and prospective course (unpublished paper by M.W. Otto et al.) among participants with anxiety disorders. Consistent with these findings, a greater proportion of days with significant anxiety in the year prior to study entry was associated with a greater risk for depressive recurrence. We found no evidence of an association between lifetime substance use disorders and earlier recurrence beyond the modest association evident for current substance abuse or dependence. Of interest, the presence of a comorbid eating disorder also appears to increase the risk for depressive recurrence; whether it is simply a marker for more severe illness or represents an opportunity for intervention to improve outcomes merits further study.
We note four important limitations for these analyses. First, although STEP-BD was designed as an effectiveness study
(14), several study features affect its generalizability. The participants were enrolled in outpatient settings, primarily in bipolar specialty care clinics, and were predominantly Caucasian. Patients with a particularly severe or chronic course requiring frequent hospitalizations may have been too ill to attend the baseline visit and adequate follow-up. This bias may also be reflected in the limited number of individuals with current substance abuse or dependence participating in the study. Conversely, many bipolar patients are managed in primary care practices or general psychiatric practices; generally stable patients might be less willing to change clinicians to enter a study such as STEP-BD. In essence, then, the STEP-BD sample probably best captures individuals with moderate illness: ill enough to require regular visits and seek specialty care but stable enough to comply with study entry procedures.
An additional limitation is the absence of data on clinician adherence to treatment guidelines. Clinicians received standardized education in the application of evidence-based guidelines at critical decision points, but individual treatment decisions were not monitored in an ongoing fashion. Our results, therefore, are not directly comparable to those of monitored and supervised intervention studies (36, 37), which also have limitations in generalizability. Moreover, even guidelines that purport to be evidence-based are often limited by a paucity of randomized, controlled data for next-step interventions.
A third limitation is our decision to first examine general predictors of recurrence rather than the effects of individual treatments. The “effectiveness” design of the STEP-BD standard care pathway allows for multiple reasonable interventions at each step in treatment rather than dictating a single intervention. Identifying non-treatment-specific factors associated with outcome will, however, facilitate future examinations of treatment response in this and other cohorts.
Finally, to mirror clinical practice, STEP-BD used visits at clinically appropriate but varying intervals and relied on an assessment of recent mood symptoms. It is therefore possible that participants could avoid follow-up visits during acute episodes; this might account in part for the far greater proportion of depressive than manic episodes, for example. However, the CMF could be completed based on other sources of information (hospitalization, contact with family members), which decreases this risk. Similarly, patients with residual symptoms might be seen more frequently and thus have recurrence detected earlier. In fact, however, in a comparison of frequency of visits, no significant differences were noted (results not shown).
Overall, these results suggest that in spite of modern evidence-based treatment, bipolar disorder remains a highly recurrent, predominantly depressive illness. Predictors of risk of recurrence, which might be useful in stratifying patients to more or less intensive maintenance follow-up and treatment, included early residual symptoms, highlighting the need to target full remission, as in major depressive disorder (32). A better understanding of the way in which these predictors may moderate (or mediate) risk of recurrence could also suggest directions toward novel strategies to modify this risk.