Despite a proliferation of pharmaceutical options for the treatment of major depression over the past 20 years, reported remission rates among patients given antidepressants have remained stubbornly low. Clinicians confronted with an inadequate antidepressant response have four options open to them: a dose adjustment, a switch to an alternative antidepressant, the introduction of another drug not considered itself an antidepressant (augmentation), or the addition of another antidepressant. Unfortunately, controlled trials have yielded very limited support for the first, second, and last of these options, and evidence for augmentation strategies concerns chiefly the use of lithium or T3 with tricyclic antidepressants, rather than with the antidepressants now in wide use (1). Indeed, the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) project revealed discouraging response rates when either of these was used in the face of inadequate response to a selective serotonin reuptake inhibitor (SSRI) (2). There do now exist a number of large, industry-sponsored trials that have shown the addition of atypical antipsychotics to be helpful in the face of inadequate responses to SSRIs or to serotonin-norepinephrine reuptake inhibitors (SNRIs) (1). Their use, though, often entails adverse metabolic consequences and/or considerable added expense.
Given the low overall response rates seen among patients whose depression does not remit after one or more antidepressant trials (3, 4), might better outcomes be achieved by treatment that is more aggressive from the outset? Encouraging evidence exists for the treatment of other common and often refractory conditions in medicine, and in two recent trials by Blier et al., depressed patients whose treatment began with a combination of mirtazapine and paroxetine, fluoxetine, or bupropion experienced remission rates roughly twice those of patients who received fluoxetine or paroxetine monotherapy (5, 6). These findings set the stage for the Combining Medications to Enhance Depression Outcomes (CO-MED) comparison of escitalopram with placebo, escitalopram with bupropion, and venlafaxine with mirtazapine (7). Unfortunately, the results are satisfying chiefly in their clarity. No trend emerged to favor either combination therapy, and only the presence and severity of side effects significantly separated combination from monotherapy groups.
These findings complete an all too familiar sequence in which the encouraging results of small studies prompt a much larger and carefully designed effort, one that then ends with surprisingly negative conclusions.
Among the possible explanations considered for such disparities are variances in drug dosing and in trial duration, but these have, I believe, been credibly dismissed here. This leaves subject differences as the chief suspect. Did the CO-MED study group include far fewer individuals likely to respond preferentially to antidepressant regimens that affect multiple receptors?
At first glance, those who participated in the CO-MED study seemed as responsive to antidepressant treatment generally as did those described in the two reports by Blier et al.; pooled remission rates were 38.5% for CO-MED and 39.2% for Blier et al. The remission rate with monotherapy, however, was significantly higher in the CO-MED study than in the combined studies by Blier et al.: 38.8% versus 23.5%, respectively (χ²=5.4, df=1, p<0.03). The reverse was true for remission rates among those given combination treatments; these were 38.3% for the CO-MED group and 50.0% for the Blier et al. combined groups (χ²=4.6, df=1, p<0.05). To the extent that response differences between combination treatment and monotherapy are analogous to those between active drug and placebo, the reconciliation of findings hinges on characteristics of the CO-MED study group that are associated with smaller drug effects.
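For readers who wish to see how a comparison of two remission rates of this kind is computed, the following is a minimal sketch of a Pearson chi-square test on a 2x2 table (df=1, no continuity correction). The function name and the counts in the example are hypothetical and chosen only for illustration; they are not the CO-MED or Blier et al. sample sizes, which are not given here, and the sketch does not attempt to reproduce the statistics quoted above.

```python
import math


def chi_square_2x2(remit_a, total_a, remit_b, total_b):
    """Pearson chi-square (df=1, uncorrected) comparing two remission rates.

    Returns (chi2, p), where p is the upper-tail probability of a
    chi-square distribution with one degree of freedom.
    """
    a, b = remit_a, total_a - remit_a  # group A: remitted, not remitted
    c, d = remit_b, total_b - remit_b  # group B: remitted, not remitted
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # For df=1, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical counts, for illustration only: 80 of 200 patients remit in
# one group (40%) and 25 of 100 remit in the other (25%).
chi2, p = chi_square_2x2(80, 200, 25, 100)
print(f"chi2={chi2:.2f}, df=1, p={p:.3f}")
```

The closed form used here is the standard shortcut for a 2x2 table; with small cell counts, a continuity-corrected or exact test would be the more cautious choice.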
What might these characteristics be? Sizable literatures exist concerning the correlates both of placebo response and of antidepressant response in major depression. The latter, of course, is confounded by the fact that many perceived “drug responses” are, in reality, placebo responses. A more direct approach to the question of who is likely to benefit from the drug component of an administered antidepressant seeks to identify attributes associated with larger drug-placebo differences. Such studies are quite scarce, though, because they require the entry of large numbers of subjects into placebo-controlled trials, and such efforts have been largely consigned to industry. There, marketing considerations have discouraged designs that would identify subgroups for which the proprietary drug being tested would not be indicated.
The clinical trial database of the Food and Drug Administration (FDA) has permitted the clear establishment of one feature predictive of greater antidepressant-placebo differences: mean severity as measured by the scale common to nearly all antidepressant trials, the Hamilton Depression Rating Scale (HAM-D). It is notable, then, that of 10 placebo-controlled trials with a mean baseline HAM-D score resembling that of the CO-MED cohort, between 23 and 24, only one showed a significant treatment effect (8). The Blier et al. groups, though, had similar mean HAM-D scores at baseline, so if differences in illness severity played a role in the contrasting results, it is not a straightforward one.
Perhaps the HAM-D as administered across the CO-MED sites did not adequately capture the element of symptom severity most relevant to the emergence of clear drug effects. The CO-MED study group contained far fewer subjects with melancholia than did either of the groups of Blier et al., and depressed patients with melancholic features have been shown to have higher severity ratings on global and other symptom-based measures in comparison to those without melancholia, despite having nearly identical HAM-D scores (9). Group differences were only somewhat larger in the CO-MED subset with melancholic features than in the patients without them. The performance of melancholia as a response predictor, though, varies considerably by how its components are defined and applied (10), and the CO-MED report does not state how melancholic symptoms were assessed. The relative scarcity of melancholia in the CO-MED patients nevertheless indicates a correspondingly lower severity level on some dimension.
A particularly striking feature of the CO-MED study group is the coexistence of the absence of treatment resistance, as specified by the entry criteria, and a high rate of chronicity. Although none of the participants had had an adequate monotherapy trial of an FDA-approved antidepressant within the current episode, over one-half of the group had a depressive episode that had been fully syndromal for at least the preceding 2 years. Why had they not undergone even one adequate antidepressant trial before the CO-MED effort came to pass? Whatever the answer, it seems likely that the average interval between episode onset and the receipt of first treatment was quite long. A number of prospective studies of major depressive disorder have shown the no-treatment interval to be as robust a predictor of poor outcome with treatment as neuroticism (11). Why this is so is a matter of speculation, but the measure's association with poorer treatment response generally may well have narrowed differences between the regimens applied here.
Finally, there is the issue of study size. Difficulties in maintaining adherence to entry criteria and to standards with which diagnostic criteria and outcome measurements are applied necessarily grow with the number of sites involved in a trial. Such problems are likely to be compounded by competition among sites to meet recruitment goals. It is not surprising then that a review of multicenter antidepressant randomized controlled trials showed the number of sites to be the strongest determinant of placebo response rates and, therefore, of small effect sizes (12).
The truth, of course, may lie with the CO-MED trial, and the conclusions of Blier et al. may constitute false leads. The trade-offs necessary in the pursuit of statistical power through multiple sites are widely recognized, though, and two remedies should come into play. One is the use of centralized ratings to ensure standard applications of entry criteria and outcome measures. This would limit measurement error from intersite differences and rater drift and thus improve the odds of revealing true treatment effects. The other is an increasing focus on tailored treatment. It may indeed prove much more efficient, particularly in the case of such highly heterogeneous disorders as major depressive disorder, to identify subgroups that are expected, from mechanistic considerations, to show a preferential response to a new intervention and to then test this with small study groups before a targeted population is put to a larger and more definitive trial. This is not a new idea but is one that appears to be finally gathering momentum.