Compared with nonchronic forms of depression, chronic depression is associated with more severe functional impairment (1), greater use of health care, and higher hospitalization rates (2, 3). Randomized controlled trials (RCTs) have reported promising results for the efficacy of several pharmacological, psychotherapeutic, and combined pharmacological and psychotherapeutic interventions for chronic depression (4–11).
RCTs often focus on high internal validity, that is, whether potential group differences regarding the outcome can be attributed with a high degree of certainty to the received treatment. Yet the relevance of these trials for routine care depends on external validity (or generalizability), that is, whether the results can be reasonably applied to a definable group of patients in a particular clinical setting in routine practice (12). External validity is often neglected in comparison with internal validity for several reasons. For example, drug licensing bodies, such as the U.S. Food and Drug Administration, do not require evidence that a drug has a clinically useful treatment effect or that a trial population is representative of routine clinical practice (12). In addition, guidance from ethics committees or funding agencies and reporting guidelines for RCTs emphasize internal validity over external validity (12). The focus on internal validity is often justified on the grounds that without internal validity, external validity or generalizability would be irrelevant or misleading (13). In recent years, practical clinical trials (PCTs) that aim to provide generalizable answers to important clinical questions without bias have been proposed; however, support for PCTs is still limited (14).
One strategy often used in RCTs to increase internal validity is to conduct research in a homogeneous sample of patients defined by strict eligibility criteria (15). However, the use of rigorous eligibility criteria to select patients for efficacy trials may limit the generalizability of results (16).
In recent years, the generalizability of findings from efficacy trials of interventions to treat depression has been questioned (17). Central to the question of generalizability is whether there are differences in clinical, demographic, and psychosocial characteristics and in treatment outcome between patients who would qualify for an antidepressant efficacy trial and patients who would not (18).
Between 9% and 76% of depressed outpatients (mainly those without chronic depression) who seek routine treatment would qualify for efficacy trials (18–22). Compared with these patients, those who would be excluded were found to be a more chronically ill group, with more previous episodes of depression, greater psychosocial impairment, more personality pathology (18), longer average duration of illness, higher rates of family history of substance abuse, more prior suicide attempts, and a greater number of anxious and atypical symptom features (20).
Inconsistent findings have been reported regarding differences in treatment outcome between patients who would or would not be included in a trial. More favorable outcomes were reported for patients who would typically be included in a trial of citalopram, compared with those who would typically be excluded (20). However, another study found no between-group differences in outcome for psychotherapy (22). Treatment effects among patients in routine practice were found to be smaller than effect sizes reported in relevant RCTs (22). However, the influence of eligibility on treatment outcome was found to be rather small (19, 23).
Table 1 presents a detailed description of these studies, addressing differences between potentially included and excluded patients.
None of the studies mentioned above focused on chronically depressed inpatients. Therefore, little is known about the generalizability of findings from antidepressant efficacy trials for this subgroup in this setting. This gap is of particular relevance because, compared with nonchronic depression, chronic depression is more often characterized by a longer duration of the depressive episode, a family history of mood disorders, and psychiatric comorbidity (24), and these characteristics are more often found among patients who are excluded from antidepressant efficacy trials (18, 20).
The objectives of this study were to identify typical eligibility criteria used in RCTs of treatments for chronic depression, to examine whether these criteria are met by patients with chronic depression treated in routine inpatient care, to examine whether patients meeting RCT eligibility criteria are treated differently in routine care, and to determine whether these patients respond differently to routine care than patients who do not meet RCT eligibility criteria.
Methods
Eligibility criteria were extracted from RCTs of treatments for chronic depression. Trials were selected for this study if they included patients with chronic depression who were treated in at least one study arm with a combined psychotherapeutic and pharmacological intervention. We focused on combined interventions because these are recommended by current treatment guidelines for chronic depression (25–27) and because many patients in routine care are treated with more than one type of intervention (28). A total of eight RCTs were identified through a systematic database search (conducted in January 2010) including MEDLINE and EMBASE (4–11). A detailed description of the systematic search and of all included trials can be found elsewhere (29–31).
We conducted an exploratory secondary analysis of data from a large multicenter trial of routine inpatient depression treatment in Germany (32). The studied population included adult patients with any depressive disorder who received a minimum of three days of routine inpatient depression treatment in one of ten cooperating hospitals during the recruitment phase between December 2001 and February 2003. Because our secondary analysis focused only on chronic depression, we included patients with an onset of depression at least two years before admission to inpatient treatment who were currently treated because of either a “worsening of a chronic state” or “persistence of a long-lasting state” according to the documentation of the psychiatrist in charge at admission.
The psychiatrists documented the patients’ demographic characteristics, history of depression, and psychopathology with the Psychiatric Basic Documentation System (33) and rated the patients’ level of depression at admission and discharge with the Hamilton Rating Scale for Depression (HAM-D) (34). We operationalized each of the identified RCT eligibility criteria with data from the Psychiatric Basic Documentation System. On the basis of these operationalizations, we calculated absolute and relative frequencies of whether the inpatients in routine care met the criteria. We also calculated the number and proportion of patients who met all eligibility criteria and who would thus be eligible to participate in RCTs (referred to below as “RCT patients”) and the number and proportion who would be excluded from RCTs for any reason (“non-RCT patients”).
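The frequency counts described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the study (the analyses were run in PASW and Mplus): the function name `summarize_eligibility`, the criterion labels, and the four toy patients are all hypothetical, and each patient is assumed to be represented as a mapping from criterion name to a met/not-met flag.

```python
from typing import Dict, List


def summarize_eligibility(patients: List[Dict[str, bool]]) -> dict:
    """Count how many patients meet each criterion and how many meet all of them.

    Each patient is a dict mapping a criterion name to True (criterion met)
    or False (criterion not met). All patients must share the same keys.
    """
    n = len(patients)
    criteria = list(patients[0].keys())

    # Absolute and relative frequency per criterion.
    per_criterion = {}
    for criterion in criteria:
        met = sum(1 for p in patients if p[criterion])
        per_criterion[criterion] = (met, met / n)

    # "RCT patients" meet every criterion; everyone else is a "non-RCT patient".
    rct = sum(1 for p in patients if all(p[c] for c in criteria))
    return {
        "n": n,
        "per_criterion": per_criterion,
        "rct": (rct, rct / n),
        "non_rct": (n - rct, (n - rct) / n),
    }


# Hypothetical toy data: four patients, three illustrative criteria.
patients = [
    {"informed_consent": True, "age_under_75": True, "no_somatic_comorbidity": True},
    {"informed_consent": True, "age_under_75": True, "no_somatic_comorbidity": False},
    {"informed_consent": True, "age_under_75": False, "no_somatic_comorbidity": True},
    {"informed_consent": True, "age_under_75": True, "no_somatic_comorbidity": True},
]
summary = summarize_eligibility(patients)
```

Because a patient counts as an RCT patient only when every flag is true, the proportion eligible can be far lower than the proportion meeting any single criterion.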
We used three definitions of treatment outcome: response, defined as symptom reduction on the HAM-D between admission and discharge of at least 50%; remission, defined as severity of depression at discharge ≤7 points on the HAM-D; and absolute symptom reduction on the HAM-D between admission and discharge. Chi-square tests and t tests were used to examine differences in treatment outcome between RCT patients and non-RCT patients.
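The three outcome definitions can be written out as a short Python sketch. The function name `classify_outcome` and its return format are hypothetical; the thresholds (at least 50% reduction for response, a discharge score of 7 or less for remission) are taken from the definitions above. How an admission score of zero should be handled is not stated in the text, so treating it as a nonresponse is an assumption.

```python
def classify_outcome(hamd_admission: float, hamd_discharge: float) -> dict:
    """Derive the three outcome measures from two HAM-D scores."""
    # Absolute symptom reduction between admission and discharge.
    reduction = hamd_admission - hamd_discharge

    # Response: reduction of at least 50% relative to the admission score.
    # Assumption: an admission score of 0 cannot yield a response.
    response = hamd_admission > 0 and (reduction / hamd_admission) >= 0.5

    # Remission: HAM-D score at discharge of 7 points or less.
    remission = hamd_discharge <= 7

    return {"reduction": reduction, "response": response, "remission": remission}


# Hypothetical scores for illustration only.
example = classify_outcome(hamd_admission=24, hamd_discharge=12)
```

Note that response and remission can diverge: a patient admitted with a score of 24 who is discharged with 12 is a responder but not in remission, whereas a patient who improves from 10 to 7 is in remission without being a responder.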
We also compared the RCT and non-RCT groups on the effectiveness of various treatment strategies used during the inpatient stay. More specifically, we focused on selective serotonin reuptake inhibitors (SSRIs), tricyclic antidepressants (TCAs), monoamine oxidase inhibitors (MAOIs), modern antidepressants (venlafaxine, mirtazapine, and reboxetine), and psychotherapeutic interventions. In a multigroup logistic regression model, we included the various treatment strategies as predictors of treatment outcome. We included all strategies in the same model to estimate the effectiveness of a certain treatment strategy while statistically controlling for the effects of all other treatment strategies.
We decided not to control for any patient characteristics (such as baseline difference in depression severity) because we assumed that relevant patient characteristics influenced both the chance to be eligible for an RCT and the choice of treatment strategy in routine care. These baseline patient characteristics are therefore not considered to be “confounders” that need to be controlled to estimate unbiased treatment effects in our analysis. Rather, differences in these characteristics are inherent to the groups we want to compare (and if the differences lead to patients being eligible or not eligible, they also define the groups of interest). Our central research question was whether these differences themselves lead to differential effects of specific treatment strategies.
Response rates were used as the outcome criterion, and odds ratios (ORs) were used as a measure of effect size. Regression coefficients were estimated separately in both groups (RCT and non-RCT patients) within the multigroup model, and group differences in regression coefficients were tested for statistical significance with z tests.
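A common form of such a z test divides the difference between two coefficients by the square root of the sum of their squared standard errors. The Python sketch below illustrates that form; it assumes the two groups are independent, and it is not necessarily the exact computation performed by Mplus. The function names and the coefficient values in the example are hypothetical.

```python
import math


def odds_ratio(coefficient: float) -> float:
    """Convert a logistic regression coefficient (log odds) to an odds ratio."""
    return math.exp(coefficient)


def z_test_coefficient_difference(b1: float, se1: float, b2: float, se2: float):
    """Two-sided z test for the difference between two coefficients.

    Assumes the estimates come from independent groups, so the variance of
    the difference is the sum of the two squared standard errors.
    """
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    # Two-sided p value under the standard normal distribution:
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical coefficients for one treatment strategy in each group.
z, p = z_test_coefficient_difference(b1=0.8, se1=0.3, b2=0.2, se2=0.4)
```

With these made-up inputs the coefficients differ by 0.6 on the log-odds scale, but the combined standard error of 0.5 gives z of about 1.2, which is not significant at the .05 level. This shows how sizable differences in ORs can remain nonsignificant when standard errors are large.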
Analyses were performed using PASW Statistics for Windows, version 18.0, and Mplus 6.1 (35).
Results
We identified 13 eligibility criteria used in eight RCTs of combined psychotherapeutic and pharmacological interventions. All criteria were used in more than one study, and all studies used more than one eligibility criterion. Most studies explicitly mentioned the ability to provide written informed consent, a certain age range, and a specific diagnosis of chronic depression. Most studies excluded patients older than 75 and those who had various concurrent psychiatric or somatic disorders. The criteria are listed in Table 2, along with our operationalization of these criteria.
A total of 2,133 patients were included in the primary multicenter trial of routine inpatient depression treatment in Germany (32), of whom 231 were treated because of chronic depression and could be included in this secondary analysis. The mean±SD age of the 231 patients was 51.6±12.9 years, and more than half were female (62%). The mean inpatient stay for these patients was 55.8±38.3 days, and patients had a high severity of depression at admission (HAM-D score of 23.3±8.2). Most patients received a combination of psychotherapy and pharmacotherapy; seven (3%) were treated exclusively with psychotherapeutic interventions and 27 (12%) exclusively with pharmacological interventions. The most frequent psychotherapeutic intervention was cognitive-behavioral therapy, followed by psychodynamic psychotherapy and interpersonal psychotherapy.
Of the 231 inpatients with chronic depression, only 25% met all 13 RCT eligibility criteria (Table 2). Four of the 13 criteria were met by nearly all patients: ability to give informed consent (98%), operationalized as being treated of their own will; sufficient language skills (97%); younger than 75 years (96%); and not pregnant at admission (100%). However, a rather high percentage of these patients would not be eligible for RCTs because of somatic (38%) or psychiatric (16%) comorbidities, a high risk of suicide (29%), or a rather low severity of depression (HAM-D ≤16) at admission (17%).
We did not detect any statistically significant differences at admission in demographic characteristics between RCT patients and non-RCT patients (Tables 3 and 4). Differences between RCT and non-RCT patients were found in clinical characteristics that were used as eligibility criteria. The non-RCT patients had various concurrent somatic or psychiatric disorders, which the RCT patients did not have, and a higher risk of suicide than the RCT patients. No statistically significant differences were found for baseline depression severity.
Of the RCT patients, 38% received SSRIs during their inpatient stay, compared with 24% of the non-RCT patients (p=.042). No statistically significant differences between the two groups were found in regard to treatment with TCAs, modern antidepressants, MAOIs, psychotherapy, or combined psychotherapeutic and pharmacological interventions or in regard to length of inpatient stay (Table 3).
At the end of inpatient treatment, 46% of the RCT patients and 50% of the non-RCT patients reached remission, which was not a significant difference. At this time, 77% of the RCT patients and 65% of the non-RCT patients were classified as responders, and the average reduction in HAM-D scores was 15.8±7.9 among RCT patients and 13.5±9.1 among non-RCT patients. Differences in treatment outcome at the end of inpatient treatment between RCT patients and non-RCT patients did not reach statistical significance (Table 3).
For the RCT and non-RCT groups, the ORs of response rates of patients who were or were not treated with a certain treatment (SSRIs, TCAs, modern antidepressants, MAOIs, or psychotherapy) were estimated. A comparison of the ORs indicated no statistically significant between-group differences in effectiveness (Table 4).
Discussion
We identified a broad range of eligibility criteria typically used in RCTs that examine the effectiveness of combined psychotherapeutic and pharmacological interventions for chronic depression. Applying these eligibility criteria to a sample of inpatients with chronic depression in routine inpatient care, we found that a rather low percentage of patients would be eligible for RCTs (25%). This finding is consistent with previous research involving patients with nonchronic forms of depression who were screened for eligibility for clinical trials or screened in outpatient psychiatric care (18–21). Patients would have been excluded mainly because of suicide risk, low severity of depression at admission, and concurrent psychiatric or somatic illnesses.
We compared patients who met RCT eligibility criteria and those who did not, and we did not find statistically significant differences in demographic characteristics, length of inpatient stay, treatment outcome, or effectiveness of certain antidepressants. The only exception was that the percentage of patients receiving SSRIs was larger in the RCT group (38% versus 24%). Our results indicated substantial heterogeneity of the effectiveness of single treatment strategies both between groups (effect sizes differed strongly between groups) and within groups (large confidence intervals of ORs in each group). However, the group differences were not statistically significant. One explanation may be a lack of power resulting from the limited sample size. Another explanation may be that the effectiveness of single treatment strategies in inpatient depression treatment varies mainly within each group of patients rather than systematically between RCT and non-RCT groups. This explanation would indicate that factors other than eligibility criteria modify treatment effects in routine care.
It should be noted that we could examine the effectiveness of only five specific interventions. Additional treatment modalities used in inpatient treatment, such as occupational therapy or somatic interventions, were not examined. Interaction effects between the examined interventions could not be accounted for because of the small sample. It is therefore possible that response variability to specific interventions was “overridden” by exposure to intensive treatment in an inpatient setting.
Because this study was based on routine data from inpatient depression treatment, we could not apply all eligibility criteria typically used in RCTs, such as contraindications in regard to a specific study drug or the formal assessment of a specific subgroup diagnosis. Thus, on the one hand, our operationalized eligibility criteria were not as strict as those actually applied in RCTs. On the other hand, because we combined eligibility criteria extracted from multiple clinical trials, our criteria were stricter than those of a single study.
A major strength of this study was the use of routine data, which allowed investigation of the role of eligibility criteria in a sample of inpatients with chronic depression who did not undergo any kind of preselection. In contrast, previous research included patients who sought treatment in clinical trials (20).
Conclusions
Our study provides empirical evidence to inform the discussion of whether findings from clinical trials of treatments for chronic depression can be generalized to routine inpatient care. First, we showed that most of the inpatients with chronic depression in routine care would normally not be included in clinical trials. Second, we did not find evidence that treatment outcome significantly differed between patients who would and would not have been eligible for RCTs.
Our findings imply that even though most inpatients in routine care would not be suitable for clinical trials, the generalizability of RCT findings to routine care is less limited than frequently suspected. Furthermore, in clinical trials that apply commonly used eligibility criteria, a large number of patients (about four times the planned sample size) would need to be screened to yield a sufficient number of patients with chronic depression.
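The screening estimate follows directly from the eligibility rate: with 25% of patients eligible, the number to screen is the planned sample size divided by 0.25. The short Python sketch below makes this arithmetic explicit. The function name is hypothetical, and the calculation deliberately ignores refusal of consent and dropout, which would raise the number further.

```python
import math


def patients_to_screen(planned_sample_size: int, eligibility_rate: float) -> int:
    """Estimate how many patients must be screened to reach a planned sample.

    Simplifying assumption: every eligible patient is enrolled, so refusals
    and dropouts are not modeled.
    """
    if not 0 < eligibility_rate <= 1:
        raise ValueError("eligibility_rate must be in the interval (0, 1]")
    return math.ceil(planned_sample_size / eligibility_rate)


# Illustrative planned sample of 100 with the 25% eligibility rate found here.
needed = patients_to_screen(planned_sample_size=100, eligibility_rate=0.25)
```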
In our study we focused on possible effects of patient characteristics on the generalizability of findings from RCTs to routine care, which is a central aspect in assessing whether evidence from RCTs is useful in establishing evidence-based practice (13). However, for a broader evaluation of the generalizability of findings of clinical trials, further research is needed on other factors relevant to the evaluation of generalizability. Such research would examine the implementation of the intervention and maintenance of effects, for example, by comparing whether antidepressants are delivered in the same way in clinical trials and routine care (dosage, duration, experience of staff members, and so forth). Another approach to account for the generalizability of findings from clinical trials is to choose a priori an appropriate research methodology for conducting clinical trials in fields as complex as the treatment of chronic depression in routine care, as proposed by the Medical Research Council guideline for developing and evaluating complex interventions (36).
Acknowledgments and disclosures
For this study, trial-level data were collected in a project funded by a grant from the German Ministry of Education and Research (METACHRON; 01KG0923), and patient-level data were collected in a project supported by the German Research Network on Depression.
Dr. Hölzel has received a speaker's honorarium from AstraZeneca. The other authors report no competing interests.