The current study presents an opportunity to articulate challenging aspects of study design in randomized, controlled trials of psychotherapies. The field of psychotherapy outcome research is evolving, and the considerations that follow are not unique to this study. No study design is ever perfect; weaknesses always exist. As the authors note, budgetary limitations affected their design. This common consideration makes power calculations crucial during the planning phase of any study.
Most researchers test a favored psychotherapy: here, supportive-expressive psychotherapy (5). A more challenging decision facing investigators is the choice of which comparison condition to select in a trial for efficacy. Tests of new medication treatments face no such conundrum, as pill placebo is in many ways the perfect solution. Pill placebo is not an apt comparison for a psychotherapy, however.
In fact, no comparison condition for psychotherapy trials mirrors the parsimony of placebo for medication trials (6). A no-treatment or delayed-treatment condition is ethically questionable for anxious patients like those with generalized anxiety disorder. Furthermore, such a comparison predetermines disappointment and exaggerated symptoms by definition, as patients are told they have a disorder yet cannot qualify for immediate treatment. A waiting list may even be a nocebo (7). Unfortunately, at least in the United States, “treatment as usual” comparisons often amount to much the same thing. For patients with generalized anxiety disorder, who are sensitive to separation (8), it is essential to balance treatment time and therapist attention across conditions, as this study did. Therapist experience and training must be matched to avoid results that reflect an artifact of experience.
Ultimately, the first-line utility of any psychodynamic psychotherapy for any anxiety disorder will depend on its efficacy relative to CBT, which has a solid research evidence base in this area (9). There are serious disadvantages to conducting direct comparisons between these two treatments, however, before having initially established the efficacy of a psychodynamic psychotherapy for generalized anxiety disorder. Leichsenring and colleagues chose this design, comparing supportive-expressive therapy to a standard reference treatment (CBT), rather than to a minimal psychotherapy control. This strategy carries risks. If both treatments were to perform comparably, which was not quite the case in this report (CBT did somewhat better for some symptoms), the study could not distinguish whether this occurred because both treatments had been efficacious or because neither worked, even if patients improved, as they did here. This problem is called “assay sensitivity” (10, 11). Generally accepted margins for testing equivalence in generalized anxiety disorder have not yet been established, and this is a sine qua non for an equivalence or noninferiority study (12). The Food and Drug Administration (13) has made it clear: “In order to implement an equivalence or non-inferiority trial, the magnitude of [medication] effect must be stable and well-established in the literature, with consistent results seen from one trial to the next” (12, p. 32). Despite the experience accumulated with CBT, we have not yet reached this research benchmark for studies of generalized anxiety disorder, one of the most heterogeneous DSM disorders (14). While most researchers would likely agree that a 1-point difference on the Hamilton Anxiety Rating Scale (HAM-A) (15) does not constitute a clinically significant difference, not all of those who study generalized anxiety disorder would agree about the significance of 2- or 3-point differences in HAM-A outcome. Furthermore, the margin of equivalence (the mathematical point defining when two treatment conditions can be said to perform equally well, or equally poorly, for that matter) must be substantially smaller than the hypothesized treatment effect that is used to determine sample size in a superiority trial. So even if the field had agreed on a margin of equivalence, conducting a true equivalence study would require a very large study group, far exceeding the N of 57 in the trial by Leichsenring et al. Thus, the present study did not find an overall difference, nor did it establish that the two treatments are equivalent.
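The sample size argument can be made concrete with a rough sketch. The code below uses the standard normal-approximation formulas for comparing two group means; it is not the power calculation used by Leichsenring and colleagues. The numerical inputs are purely illustrative assumptions and are not taken from the trial or from the literature: a HAM-A standard deviation of 7 points, a hypothesized superiority effect of 5 points, an equivalence margin of 2 points, and 80% power. The function names are likewise hypothetical.

```python
from math import ceil
from statistics import NormalDist

# Quantile function of the standard normal distribution.
_z = NormalDist().inv_cdf


def n_superiority(delta, sd, alpha=0.05, power=0.80):
    """Patients per group to detect a true mean difference `delta`
    with a two-sided test at level `alpha` (normal approximation)."""
    z_alpha = _z(1 - alpha / 2)
    z_beta = _z(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)


def n_equivalence(margin, sd, alpha=0.025, power=0.80):
    """Patients per group to show equivalence within +/- `margin`
    using two one-sided tests, each at level `alpha`, assuming the
    true difference between treatments is zero."""
    z_alpha = _z(1 - alpha)
    z_beta = _z(1 - (1 - power) / 2)
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / margin ** 2)


if __name__ == "__main__":
    SD = 7.0      # assumed HAM-A standard deviation (illustrative only)
    EFFECT = 5.0  # assumed superiority effect in HAM-A points (illustrative)
    MARGIN = 2.0  # assumed equivalence margin in HAM-A points (illustrative)
    print("superiority, per group:", n_superiority(EFFECT, SD))
    print("equivalence, per group:", n_equivalence(MARGIN, SD))
```

Because the required sample size scales with the inverse square of the margin, an equivalence margin much smaller than the superiority effect multiplies the number of patients needed. On these illustrative inputs the equivalence design requires several times as many patients per group as the superiority design, which is the sense in which an N of 57 falls far short.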