Almost two out of three patients with major depressive disorder do not achieve remission, even after a well-delivered 12- to 14-week trial of optimally dosed antidepressant medication (
1). Nonremission amplifies suffering, disability, costs, and suicide mortality (
2,
3). Thus, providing effective, feasible, and safe “next-step” treatments aimed at increasing remission rates is a clinical imperative.
To help address this need, we conducted a large, multisite, randomized, single-blind, parallel-assignment, three-arm U.S. Department of Veterans Affairs (VA) cooperative study called the VA Augmentation and Switching Treatments for Improving Depression Outcomes (VAST-D) (
4). VAST-D included 1,522 Veterans Health Administration patients with nonpsychotic major depressive disorder who had not achieved an optimal response to at least one antidepressant trial. Our initial report (
5) focused on the relative effectiveness and safety of three next-step treatment strategies: switching to another antidepressant—sustained-release bupropion; combining the current medication with sustained-release bupropion; or augmenting the current medication with an antipsychotic—aripiprazole. As previously reported, augmentation with aripiprazole had higher 12-week remission and response rates than switch to bupropion and a higher response rate than combination with bupropion. Relapse rates over a 6-month period did not differ among the treatments.
For the treating clinician, perhaps even more important than overall remission and relapse rates is predicting which patients will do better with which treatments (
6). Previous studies have been able to identify general, nonspecific predictors of overall response (
7,
8), but identifying specific sociodemographic and clinical features that point the way to selecting one antidepressant treatment over another has been a daunting task, and empirically based guidelines have not been established (
9). Finding specific moderators of outcomes to alternative subsequent treatments could help clinicians tailor treatment to individual patients with more precision, thereby minimizing exposure to inefficacious trial-and-error pharmacotherapy.
In this study, we reproduced and extended previous studies on general predictors and explored factors that predict which of three commonly used next-step medications are best suited for individual patients. Specifically, we examined the roles of selected sociodemographic and clinical variables as general prognostic predictors and as potential specific moderators of remission across treatment groups among VAST-D participants. These variables include gender, age, ethnicity, and employment; depressive symptom severity (
1), chronicity (
1), and subtype (
10); anxiety (
11); mixed hypomanic symptoms (
12); childhood adversity (
13); grief (
14); co-occurring general medical (
15) and psychiatric (
16) conditions; positive mental health (
17); and quality of life (
18). Although many of the variables we selected have been shown to be useful as overall prognostic features, their value in helping clinicians decide between alternative next-step treatments has not yet been demonstrated (
9). One previous trial that sought to identify specific symptom clusters that might help guide treatment decisions showed promising results, but the positive findings are preliminary and require replication (
10). Other variables that we explored here have not previously been systematically studied as predictors or moderators of treatment outcome for major depressive disorder; these include bereavement and complicated grief, baseline positive mental health, and quality of life. Because so many of the clinical features we examined in this study have not previously been studied in next-step-medication populations, our analyses were exploratory, and we had no a priori hypotheses to test.
Methods
The VA Office of Research and Development and the VA Central Institutional Review Board (CIRB) approved the study, and a Certificate of Confidentiality was obtained from the National Institutes of Health. The CIRB conducted annual continuing review, and a data monitoring committee reviewed the study biannually; both bodies reviewed adverse events throughout the study. All participants provided written informed consent and privacy authorization.
Sample
Participants were U.S. veterans diagnosed with major depressive disorder who were receiving care at one of 35 VA medical centers and had been referred by their VA clinician. Diagnostic eligibility was further established by research staff using DSM-IV-TR criteria, supplemented by the Patient Health Questionnaire–9 (PHQ-9) (
19). Participants with a suboptimal response to a treatment course with a serotonin reuptake inhibitor, a serotonin-norepinephrine reuptake inhibitor, or mirtazapine that met or exceeded minimal standards for dosage and duration of treatment were eligible (
4,
5,
20). Suboptimal response was defined by a score ≥16 (severe depression) on the 16-item Quick Inventory of Depressive Symptomatology–Clinician Rated (QIDS-C) (
21) after at least 6 weeks of treatment, or a score ≥11 (moderately severe depression) after at least 8 weeks of treatment, with the 3 most recent weeks at a stable “optimal” dosage.
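The two-branch definition of suboptimal response above can be written as a small check. This is an illustrative sketch only, not the study's screening procedure: the function name and parameters are hypothetical, and it assumes that the requirement of 3 weeks at a stable dosage applies to both branches of the definition.

```python
def is_suboptimal_response(qids_c: int, weeks_treated: float,
                           weeks_at_stable_dose: float) -> bool:
    """Hypothetical helper mirroring the suboptimal-response definition in the text."""
    # Assumption: the 3 most recent weeks at a stable dosage are required
    # for both branches, not only for the 8-week branch.
    if weeks_at_stable_dose < 3:
        return False
    severe_after_6_weeks = qids_c >= 16 and weeks_treated >= 6
    moderate_after_8_weeks = qids_c >= 11 and weeks_treated >= 8
    return severe_after_6_weeks or moderate_after_8_weeks
```

Under these assumptions, a QIDS-C score of 12 would qualify after 8 weeks of treatment but not after 7.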
As previously described (
20), participants in this sample were largely male (85%); 69% were white, 26% African-American, and 10% Hispanic. Mean age was 54.4 years (SD=12.2, range=21–85). Most participants had at least some college education (72%). Almost half were married (43%) and almost half (45%) reported being unemployed, on disability, or receiving social assistance. Mean age at first diagnosis of major depressive disorder was 37.2 years (SD=15.6). Most participants had recurrent major depressive disorder (63%), with a mean of just over nine lifetime episodes (SD=19.1). More than a third of the participants (35%) had been treated with three or more medication trials (mean=2.4, SD=1.7). The mean duration of the current depressive episode was 87 months (SD=132), and the mean QIDS-C score at baseline was 16.7 (SD=3.3), indicating severe depression.
Participants were excluded if they were currently receiving treatment with bupropion or any antipsychotic medication; had a history of bipolar disorder, schizophrenia, schizoaffective disorder, or other psychosis; had current dementia, an eating disorder, or a seizure disorder; had suicidal ideation requiring inpatient treatment; had an unstable medical condition; were in need of immediate psychiatric hospitalization; had substance dependence requiring detoxification in the past 30 days; were pregnant, lactating, or planning to become pregnant; were unable or unwilling to provide informed consent; or declined to participate prior to randomization.
Interventions
Patients were randomly assigned to one of three treatment strategies: switching to another antidepressant (sustained-release bupropion); combining their current medication with another antidepressant (sustained-release bupropion); or augmenting their current medication with an antipsychotic (aripiprazole). Treatments included titration (cross-titration for the switching arm) from standard starting dosages: bupropion was started at 150 mg/day and titrated to 300 mg/day or 400 mg/day, and aripiprazole was started at 2 mg/day and titrated to 5, 10, and then 15 mg/day, until depressive symptoms remitted or side effects were intolerable. Dosage adjustments were guided by measurement-based care (
22) using the PHQ-9 (
19) and a global side effects measure (the Frequency, Intensity, and Burden of Side Effects Rating) (
23) at each visit. The acute treatment period lasted up to 12 weeks. Treatment visits occurred at baseline and at weeks 1, 2, 4, 6, 8, 10, and 12.
Outcome Measures
The primary outcome measure was remission (close to asymptomatic status), operationalized as a QIDS-C score ≤5 at two consecutive scheduled follow-up visits during the 12-week acute treatment phase. The QIDS-C was administered at every visit by a study-certified independent evaluator who was blind to treatment condition.
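The remission rule above (a QIDS-C score ≤5 at two consecutive scheduled follow-up visits) can be sketched as follows. The function name is hypothetical, and two choices are assumptions rather than details given in the text: a missed visit (recorded as None) interrupts a run of qualifying visits, and remission is dated to the second of the two qualifying visits.

```python
from typing import Optional, Sequence

REMISSION_CUTOFF = 5  # QIDS-C score <= 5, as stated in the text


def first_remission_week(weeks: Sequence[int],
                         scores: Sequence[Optional[int]]) -> Optional[int]:
    """Return the week of the second of two consecutive visits with QIDS-C <= 5.

    Returns None if the criterion is never met. Assumption: a missing score
    (None) breaks the run of consecutive qualifying visits.
    """
    previous_in_range = False
    for week, score in zip(weeks, scores):
        in_range = score is not None and score <= REMISSION_CUTOFF
        if in_range and previous_in_range:
            return week
        previous_in_range = in_range
    return None
```

For example, with the scheduled follow-up weeks 1, 2, 4, 6, 8, 10, and 12 and scores of 14, 9, 5, 7, 4, 3, and 6, the criterion is first met at week 10, because the qualifying score at week 4 is followed by a nonqualifying score at week 6.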
The rationale for, properties of, and baseline values for the variables considered as potential predictors and moderators of remission have been described previously (
20) and include the following:
•
Basic demographic information, such as age, gender, race, marital status, and ethnicity, was obtained through chart review and confirmed with the participant at the baseline interview. Age was operationalized as both a continuous variable and a categorical variable, with cutoff points at 50 and 65 years.
•
Depression severity was assessed by mean total QIDS-C scores and ratings of markedly or severely ill on the Clinical Global Impressions severity scale (CGI-S) (
24).
•
Chronicity of the current episode was measured by months since the current depressive episode began, as determined by participant recall at the baseline interview.
•
Duration of the index medication trial was determined by patient recall at the baseline interview and confirmed by chart review.
•
Three depression clinical clusters that previous studies suggested might moderate responses to different antidepressants were based on mean scores on QIDS-C subset items: sleep (midnocturnal insomnia, sleep-onset insomnia, and early morning insomnia); core emotional (energy/fatigability, concentration/decision making, loss of interest, sadness, and feelings of worthlessness); and atypical (psychomotor agitation, psychomotor slowing, suicidal ideation, and hypersomnia) (
10).
•
Anxiety was measured with the Beck Anxiety Inventory (BAI) (
25).
•
Childhood adversity was assessed with the 10-item self-report Adverse Childhood Experiences survey (
26).
•
General medical co-occurring conditions at baseline were assessed with the Cumulative Illness Rating Scale–Geriatrics (CIRS-G) (
27). To better understand the role of nonpsychiatric, general medical comorbidities, and because all participants experienced substantial depressive symptoms, the category of psychiatric symptoms was not used in calculating the CIRS-G total score and severity index. The total severity score was the sum of the ratings across the remaining bodily systems evaluated. The severity index, ranging from 0 (no problem) to 4 (extremely severe), was the total score divided by the number of remaining bodily systems evaluated.
•
Other psychiatric conditions were identified with the clinician-administered Mini-International Neuropsychiatric Interview (
28).
•
Depression-specific quality of life was measured with the 16-item Quality of Life Enjoyment and Satisfaction Questionnaire–Short Form (Q-LES-Q-SF) (
29).
•
Three novel questionnaires, designed specifically for this study, measured the constructs of mixed features, complicated grief, and positive mental health. Mixed features (
28,
29) were measured with a self-rated 12-item mixed features scale, based on items taken directly from the DSM-5 criteria for mixed features (
30). A brief self-rated complicated-grief screening instrument (
31) was used to assess whether the patient had experienced the death of an important friend or relative, and if so, the time since the death, the deceased’s relationship to the patient, and the patient’s emotional, cognitive, and behavioral responses to the death. Positive mental health (
17) was measured with a seven-item self-rated positive mental health questionnaire.
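For the CIRS-G item in the list above, the modified scoring (psychiatric category left out, total score divided by the number of remaining systems) can be illustrated with a short sketch. The function and the category names in the example are hypothetical placeholders, not the instrument's official labels, and the 0 to 4 range check simply reflects the rating range described in the text.

```python
from typing import Dict, Tuple


def cirs_g_summary(ratings: Dict[str, int],
                   excluded: str = "psychiatric") -> Tuple[int, float]:
    """Return (total score, severity index) with one category left out.

    `ratings` maps a category name (hypothetical labels) to a 0-4 rating.
    """
    kept = {name: r for name, r in ratings.items() if name != excluded}
    if not kept:
        raise ValueError("no categories left to score")
    for name, rating in kept.items():
        if not 0 <= rating <= 4:
            raise ValueError(f"rating for {name} must be between 0 and 4")
    total = sum(kept.values())
    # Severity index: total divided by the number of remaining systems.
    return total, total / len(kept)
```

With illustrative ratings of 2, 0, and 1 for three medical categories plus a psychiatric rating of 4, the psychiatric rating is ignored, giving a total of 3 and a severity index of 1.0.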
Statistical Methods
We used descriptive statistics, including means and standard deviations for continuous variables and numbers and percentages for discrete variables. Analyses were carried out using SAS, version 9.3 (SAS Institute, Cary, N.C.), and were conducted on the full cohort using the intention-to-treat principle.
We used the MacArthur approach to evaluate moderators of treatment outcome (
6). According to this conceptualization, nonspecific predictors of treatment outcome are pretreatment variables that predict the outcome equally in each treatment. In contrast, moderators predict outcome differently depending on the treatment, and thus specify for whom or under what conditions a particular treatment works. We decided a priori to test several demographic and clinical variables as potential moderators of treatment outcome in individual life table regression models. Each model included treatment allocation and one predictor/moderator and their two-way interactions (treatment by moderator). When the main effect of a variable was significant but the treatment-by-moderator interaction was not, the variable was considered a nonspecific predictor of outcome. When the interaction was significant, whether or not there was a significant main effect, the variable was considered a potential moderator.
We report hazard ratios and 95% confidence intervals for the main effects in each model, estimating relative risk of remission compared with a reference level for categorical variables and per unit increase in value for continuous variables. We used a type I error rate (p value threshold) of 0.05 to identify covariates of interest as nonspecific predictors. To avoid false negatives (i.e., type II error), a p threshold of 0.10 was used as a filter to identify covariates as potential moderators of treatment effects for further exploratory analyses (
32). Because of the exploratory nature of the analyses, no adjustments were made for multiple comparisons, and these results should be interpreted accordingly.
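The decision rule described in the two preceding paragraphs can be summarized in a few lines. This is a sketch of the classification logic only, not of the regression models themselves; the function name and labels are hypothetical. Treating the interaction threshold as inclusive (p ≤ 0.10) and the main-effect threshold as strict (p < 0.05) is an assumption, since the text gives only the threshold values.

```python
def classify_covariate(p_main: float, p_interaction: float,
                       alpha_main: float = 0.05,
                       alpha_interaction: float = 0.10) -> str:
    """Label a baseline covariate from its main-effect and interaction p values."""
    # The interaction is checked first: a covariate with a qualifying
    # treatment-by-covariate interaction is a potential moderator whether
    # or not its main effect is significant.
    if p_interaction <= alpha_interaction:  # assumption: inclusive threshold
        return "potential moderator"
    if p_main < alpha_main:  # assumption: strict threshold
        return "nonspecific predictor"
    return "neither"
```

Checking the interaction before the main effect mirrors the rule that a significant interaction defines a potential moderator regardless of the main effect.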
To further explain the moderating effects found in models with acceptable treatment-by-moderator interaction terms, we conducted exploratory Cox regression models for each potential moderating variable, categorizing each continuous variable into three levels (e.g., prespecified age groups and three-way splits based approximately on terciles for other variables). In this exploratory analysis, a relative treatment effect was considered meaningful if the 95% confidence interval excluded 1.0. As a measure of effect size, we calculated number needed to treat (NNT) for treatment comparisons at each level of the three covariates. NNT was calculated as the inverse of the absolute reduction in remission between treatment groups across the 12-week acute treatment phase.
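As a worked illustration of the NNT calculation just described, the sketch below takes the remission proportions in two treatment groups and returns the inverse of their absolute difference. The function name is hypothetical, and the sketch covers only the arithmetic, not the estimation of remission rates.

```python
def number_needed_to_treat(remission_rate_a: float,
                           remission_rate_b: float) -> float:
    """NNT as the inverse of the absolute difference in remission proportions."""
    difference = abs(remission_rate_a - remission_rate_b)
    if difference == 0:
        raise ValueError("NNT is undefined when the remission rates are equal")
    return 1.0 / difference
```

For instance, remission rates of 37.6% and 20.5% (passed as 0.376 and 0.205) differ by 17.1 percentage points, giving an NNT of roughly 5.8.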
To visually depict the interaction between scores on potential moderators and treatment outcome, we plotted cumulative probability curves of time to remission for high, medium, and low values.
Results
Association of Baseline Characteristics and Remission (General Prognostic Indicators)
Most demographic features (mean age, gender, marital status, race, and ethnicity) did not predict overall remission at 12 weeks. However, participants who were unemployed achieved remission at lower rates than participants who were employed (
Table 1).
Several measures of depression severity and chronicity at baseline were significantly negatively associated with remission: higher (worse) scores on the PHQ-9, QIDS-C, and CGI-S and longer duration of depressive symptoms. All three QIDS-C clusters also were negatively associated with overall remission. Similarly, greater anxiety (higher BAI score), childhood adversity, and complicated grief symptoms were negatively associated with remission.
Participants with longer durations of the index medication trial, better general quality of life (as measured by the Q-LES-Q-SF), and positive mental health were more likely to achieve remission.
The degree of mixed hypomanic symptoms and medical comorbidity at baseline were not predictive of remission.
Moderators of Treatment Effect
As
Table 1 illustrates, only one demographic feature, age group (p=0.10), and two additional clinical features, the core emotional QIDS-C cluster (p=0.07) and mixed hypomanic symptoms (p=0.01), met our criteria for potentially meaningful moderators of treatment effect.
The post hoc analysis of age by treatment revealed that the effect was driven by higher remission rates with augmentation with aripiprazole compared with switch to bupropion among those age 65 years or older (37.6% compared with 20.5%; hazard ratio=1.97; 95% CI=1.18, 3.28), corresponding to an NNT of 5.8 (
Table 2,
Figure 1).
Dividing core emotional symptom scores into three strata (low, score ≤10; medium, score 11–12; and high, score ≥13) revealed no meaningful differences in treatment effects, although the relative effect of augmentation with aripiprazole was greatest in the less severe symptom groups (score ≤10: hazard ratio=1.29; 95% CI=0.96, 1.75; NNT=9.8 and score 11–12: hazard ratio=1.39; 95% CI=0.86, 2.27; NNT=12.6) and lowest in the more severe symptom group (score ≥13: hazard ratio=1.05; 95% CI=0.46, 2.39; NNT=221) (
Table 2,
Figure 2).
Figure 2 demonstrates a clear association of decreasing remission with increasing baseline QIDS-C emotional score, with modest evidence of a treatment interaction with remission.
In contrast, dividing mixed hypomanic symptom scores into three strata demonstrated that the interaction could be explained by lower remission rates with switch to bupropion among those endorsing the greatest levels of mixed symptoms (score ≥13), compared with both augmentation with aripiprazole (13.9% compared with 30.1%; hazard ratio=2.19; 95% CI=1.29, 3.72), corresponding to an NNT of 6.2, and combination with bupropion (13.9% compared with 29.0%; hazard ratio=2.21; 95% CI=1.30, 3.77), corresponding to an NNT of 6.6 (
Table 2,
Figure 3).
Discussion
In this study, we sought to enhance clinicians’ ability to determine which outpatients with depression requiring next-step treatment should be treated by switching to another antidepressant (bupropion), by combining the current medication with another antidepressant (bupropion), or by augmenting the current medication with an antipsychotic medication (aripiprazole). We hoped to find sociodemographic, clinical, or treatment features that might lead to improved methods for making this determination, thus enabling clinicians to make more informed decisions about next-step treatments. Although this exploratory study identified several nonspecific predictors of next-step depression treatment outcomes, we found only two potential moderators—older age and mixed hypomanic symptoms—that might serve as signposts for specific, targeted interventions.
Our key findings regarding general predictors of remission replicated and extended previous reports. For example, the first and largest study to assess the differential likelihood of achieving remission with next-step switching treatments, the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study, also found that unemployment, health-related quality of life, severity of depression, and anxiety were associated with nonremission (
9). Unlike the STAR*D investigators, we did not find that age or gender was related to overall remission, perhaps because of the mostly older male demographic distribution of the VAST-D population. STAR*D also reported that melancholic features were associated with nonremission. While VAST-D did not characterize participants as members of a melancholic subtype, our QIDS-C core emotional cluster, consisting of low energy, loss of interest, sad mood, and feelings of worthlessness, is comparable to the melancholic subtype. In VAST-D, higher scores for the core emotional cluster, as well as for the sleep and atypical clusters, were indeed associated with nonremission. Additional general clinical predictors of nonremission among VAST-D participants included chronicity of depression, shorter duration of index medication treatment, childhood adversity, symptoms of complicated grief, impaired baseline quality of life and functioning, and a lack of positive mental health. Most of these features have not previously been studied in the context of next-step treatments. When present, they may indicate a need for extraordinary patience and lower expectations on the part of patients and clinicians or, alternatively, different interventions than those provided in this study.
The few features that were found to potentially moderate remission were somewhat unexpected. Most previous attempts to identify clinical features that favor one treatment over another for major depressive disorder have been conducted with individuals undergoing first-step treatment and have not yielded clinically useful findings (
9,
33,
34). Rush et al. (
9) hypothesized that the chances of finding such moderators of outcome may be higher during trials of next-step treatments, because individuals who will remit solely from nonpharmacological treatment are unlikely to enter next-step treatment. Yet, even after we applied a higher critical p value of 0.10 to identify potential moderator variables, only two showed any evidence of moderating treatment effects: older age and more severe mixed symptoms. While the core emotional cluster met our initial cutoff point as a potential moderator, it did not demonstrate significant effects when sorted into high, medium, and low levels. Because this cluster is comparable to the melancholic features subtype, the absence of a moderating effect on outcome is consistent with previous studies showing that the melancholic subtype does not moderate response to specific antidepressants either as first-step (
1,
33,
35) or next-step (
9) treatments.
Our sample, comprising individuals with a broad range of ages and a mean age above that of most published samples of individuals with treatment-resistant depression, gave us a unique opportunity to assess the role of age as a predictor and moderator of remission. Although age did not predict overall remission, participants age 65 or older were more likely to achieve remission with augmentation with aripiprazole than with switch to bupropion. Coupled with the lack of a significant difference in the younger age groups, this finding raises the possibility that the previously reported superiority of augmentation with aripiprazole over switch to bupropion (
5) may have been driven primarily by the advantage of augmentation with aripiprazole in the participants older than age 65. Previous studies have found aripiprazole augmentation to be more effective than placebo in inducing remission in older patients with treatment-resistant depression (
36), but we are not aware of similar controlled trials with bupropion switching or augmenting strategies in older adults. This finding may be particularly important in view of some previous studies suggesting that standard first-step antidepressants may be less efficacious in elderly compared with younger patients (
37). It will be important to balance the more robust remission rates and shorter time to remission with augmentation with aripiprazole in older adults with the higher rates of weight gain, laboratory abnormalities, extrapyramidal symptoms, and daytime somnolence associated with augmentation with aripiprazole (
5).
Participants with severe mixed symptoms did better with either augmentation with aripiprazole or combination with bupropion than with switch to bupropion. The finding that augmentation with aripiprazole did better than switch to bupropion is consistent with emerging clinical guidelines favoring mood stabilizers or second-generation antipsychotics for major depressive disorder with mixed features (
38,
39); but the novel finding that combination with bupropion was comparable to augmentation with aripiprazole and did better than switch to bupropion for those with severe mixed symptoms has not been previously reported and has not made its way into clinical guidelines. This finding bears further study because evidence-based options for the treatment challenges of this population are needed.
When interpreting our results, certain limitations must be kept in mind. First, this secondary analysis of the VAST-D trial was not powered to discover predictors or moderators of remission. Thus, our failure to find more than two features that potentially moderate remission may be a function of inadequate power. In addition, only one of the two identified moderating features was significant at a p value <0.05, often considered the minimal standard, and we did not adjust p values for multiple comparisons because of the exploratory nature of the analyses. Thus, all results must be interpreted with caution. Further, we had no measures for several potentially revealing moderating features, such as age at onset of major depressive disorder, several DSM-5 subtypes, personality disorders, concurrent life stressors, and concurrent nonpharmacological treatments. Moreover, only one antidepressant (bupropion) and one antipsychotic (aripiprazole) were evaluated, and the generalizability of the results to other medications is unknown. Additionally, some of our more innovative measures, such as the scales for mixed symptoms, complicated grief, and positive mental health, were designed for this study and have not yet been psychometrically validated. Finally, the generalizability of the findings must consider the unique nature of the VAST-D participants, who were mostly male veterans with chronic and recurrent major depressive disorder and multiple comorbidities (
5,
20).
In conclusion, in this VAST-D study, we found several clinical features that predicted remission overall, but only two that might help guide clinicians to specific next-step treatment selections: age ≥65 years (select augmentation with aripiprazole over switch to bupropion) and severe mixed hypomanic symptoms (select combination with bupropion or augmentation with aripiprazole; avoid switch to bupropion). These preliminary findings begin to offer clinicians tentative evidence for tailoring next-step medication treatment. If replicated, these findings should enhance clinicians’ ability to determine which depressed outpatients requiring next-step treatment are most effectively treated with specific augmentation, combination, or switching strategies.
Acknowledgments
VA Cooperative Studies Program (CSP) study 576 was performed at 35 VA medical centers across the United States, including Atlanta VA Medical Center; Bruce W. Carter VA Medical Center, Miami; Central Texas Veterans Healthcare System, Temple, Tex.; Charles George VA Medical Center, Asheville, N.C.; Cincinnati VA Medical Center; Clement J. Zablocki VA Medical Center, Milwaukee; Edward Hines, Jr. VA Hospital, Hines, Ill.; James A. Haley VA Hospital, Tampa, Fla.; Kansas City VA Medical Center, Kansas City, Mo.; Long Beach VA Healthcare System, Long Beach, Calif.; Louis A. Johnson VA Medical Center, Clarksburg, W.Va.; Louis Stokes VA Medical Center and Case Western Reserve University School of Medicine, Cleveland; Memphis VA Medical Center, Memphis; Minneapolis VA Health Care System; New Mexico VA Healthcare System, Albuquerque; Philadelphia VA Medical Center; Phoenix VA Health Care System; Richard L. Roudebush VA Medical Center, Indianapolis; Salem VA Medical Center, Salem, Va.; San Francisco VA Health Care System; Southern Arizona VA Healthcare System, Tucson; Tuscaloosa VA Medical Center, Tuscaloosa, Ala.; University of California, San Diego; VA Eastern Colorado Healthcare System, Denver; VA Loma Linda Healthcare System, Loma Linda, Calif.; VA Maryland Healthcare System, Baltimore; VA Nebraska Western Iowa Healthcare System, Omaha; VA New England Mental Illness Research, Education, and Clinical Center, VA Connecticut Healthcare System, West Haven; VA Palo Alto Healthcare System, Palo Alto, Calif.; VA Pittsburgh Healthcare System; VA Puget Sound Health Care System, American Lake/Tacoma, Wash.; VA San Diego Healthcare System; Washington DC VA Medical Center; W.G. Hefner VA Medical Center, Salisbury, N.C.; William S. Middleton Veterans Hospital, Madison, Wisc.
The authors thank Sarah Pospos, M.D., for her editorial assistance and the VAST-D staff and patient participants at each of the 35 sites.