The past few years have seen a new emphasis on translational research that takes findings generated by clinical research and translates them into effective treatments for patients who are seen in nonresearch settings. A recent National Institute of Mental Health (NIMH) report (1) recommended more research to optimize the development and facilitate the use of research-based mental health interventions by the large community of service providers. This translational strategy is best exemplified by recent work completed in the area of depression.
Over the past decade, a number of studies have addressed the gap between clinical knowledge about treatment and the quality of care for depression in usual care settings by developing and testing a variety of health service approaches to the more effective delivery of evidence-based psychotherapies and antidepressant medications in primary care. Using strategies designed to overcome patient, provider, and process-of-care barriers to mental health treatment, along with judicious use of specialty consultation and close and sustained follow-up of patients, these approaches have demonstrated superior clinical and cost-effectiveness compared with usual care in the primary care setting (2–5). More recently, a quality-improvement strategy based on principles derived from these approaches to changing the care process in primary care systems has been shown to improve depression outcomes compared with usual care (6).
Depression research has thus progressed from studies that strictly control all variables—for example, the type of patient, intervention, provider, and setting—to isolate a treatment effect (efficacy), to studies that develop approaches for delivering evidence-based treatments more effectively in community clinical settings (effectiveness), to quality improvement studies that seek to improve care across a variety of community clinic settings (system dissemination) (7). In contrast, few attempts have been made among patients with anxiety to bridge the gap between efficacy and effectiveness trials in a similarly gradual fashion.
Efficacy versus effectiveness
The elements of efficacy models and effectiveness models have been outlined in an article by Wells (8) and represent two ends of the spectrum of intervention research design. Efficacy studies seek to answer the question, Does the treatment work better than placebo under ideal conditions? And does one treatment work better than another? Efficacy studies seek to isolate the treatment effect—a signal of treatment efficacy within the “noise” of naturalistic recovery, placebo effects, side effects, and patient dropout—by maximizing internal validity. Such isolation is commonly achieved through the use of placebos, standardized treatments, homogeneous patient populations, and blinding.
In their purest form, such studies involve specialized treatment settings in which care is free, compliance with treatment is artificially induced—for example, by enrolling only highly motivated participants and by scheduling visits more frequently—persons with comorbid medical and psychiatric conditions are excluded, and patients’ treatment preferences are not taken into account. Outcomes are symptom focused, although efficacy studies have recently adopted the routine use of function and quality-of-life measures. These factors all diverge from usual care conditions.
By contrast, effectiveness and dissemination studies ask, What changes in service delivery need to occur if efficacious treatment is to be delivered to the widest populations? These studies seek to promote greater generalizability of findings by enhancing external validity. They emphasize flexible, innovative methods for delivering treatments that are more applicable to usual care or community settings and often include larger populations that are more representative in terms of socioeconomic and ethnic status and comorbid medical and psychiatric illness. The questions of how much these health-system innovations will cost and who benefits from improved health outcomes are also inherently important to these trials. Therefore, these designs often include measurement of societal costs, such as medical costs, disability days, missed workdays, work productivity, and family burden.
An ongoing effectiveness study in primary care
Getting started in translational research
Studies that attempt to bridge the gap between efficacy and effectiveness often combine selected elements of the two models. Such hybrid designs are essential if this translational research agenda is to be carried out for other mental disorders. Developing, refining, and operationalizing such designs require the active and ongoing collaboration of research experts in both efficacy and health services research. Study goals and questions, along with feasibility concerns, typically guide selection of various design elements.
In this article we review these design considerations and illustrate them with examples from our three-site NIMH-funded study of panic disorder in primary care, one of the first translational treatment studies to focus on an anxiety disorder. The study began in July 1999. After discussing the overall design strategy, we review the rationale for decisions about specific components of the study. For each area, the different choices dictated by the efficacy and effectiveness perspectives are resolved by balancing study goals and questions with feasibility concerns. The final design and its elements are a hybrid that includes parts of both efficacy and effectiveness models. Although the decisions described here are influenced by considerations specific to panic disorder and primary care, we believe the general approach can be applied to treatment studies for a variety of other mental disorders.
Creating the study design
The rationale for our study was based on the high prevalence (9–17), poor recognition, and inadequate treatment (18,19) of panic disorder in the primary care setting as well as the existence of evidence-based pharmacotherapeutic and psychotherapeutic treatments (20–22). Because of the absence of any intervention research for panic disorder in this setting—in contrast with the substantial amount for depression (2–4,6)—a basic goal was to show that an evidence-based intervention was superior to usual care. Determining which kinds of treatments for panic disorder or methods of delivery were best—for example, self-help versus clinician-administered—was premature. The goals of the study were to evaluate whether the best available treatments for panic—cognitive-behavioral therapy and selective serotonin reuptake inhibitors—could be modified for easy implementation in a primary care setting and be more clinically effective and cost-effective than usual care.
To maximize the likelihood of demonstrating improved outcomes, and to minimize cost, we preserved some elements of the efficacy model. First, because early first-generation effectiveness studies failed to show an effect on depression, partly because of inadequate standardization of the intervention (16), treatment had to be standardized. Second, to maximize treatment adherence, we excluded certain patient populations who were unlikely to participate in the protocol—for example, unstable patients with bipolar disorder and patients with active substance abuse. Third, we did not strictly accommodate patients’ treatment preferences—medications versus psychotherapy—because such an approach would have required a much larger sample. Finally, we chose not to randomize by provider or care system, for two reasons. First, both types of randomization would have required an extremely large sample, because correlation among outcomes of patients seen by the same physician or in the same clinic inflates the required sample size. Second, previous depression effectiveness research that randomized by patient (2,3) found no evidence of a spillover effect whereby usual care patients benefited from the intervention, probably because the intervention focused more on the patient and the care process than on provider factors.
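To make the clustering point concrete, consider the standard variance-inflation formula (the cluster size and intraclass correlation below are hypothetical values chosen for illustration, not estimates from our data). Randomizing by provider multiplies the required sample size by the design effect

DEFF = 1 + (m − 1)ρ,

where m is the mean number of enrolled patients per provider and ρ is the intraclass correlation of outcomes within a provider's panel. With, say, m = 20 patients per physician and a modest ρ = .05, DEFF = 1 + 19 × .05 = 1.95, so randomizing by provider would have nearly doubled the sample required to preserve the power of patient-level randomization.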
Study setting and service delivery context
In an effort to contain costs and maintain control of the implementation, we restricted the number of clinics to two per site. To maximize statistical power, and given the limitation of only six clinics in total, we chose not to compare different types of settings, such as health maintenance organizations (HMOs), private settings, and public-sector settings. To maximize the feasibility of implementing the intervention—our most important goal, given that this was a “first-step” study—we chose university-affiliated primary care sites. We believed that these clinics, because of the investigators’ affiliation with them, would allow greater control and ease of implementation. It was important to show that our intervention would work at one site before we tried to show that it would work at all sites. A review of rates of usual care received by depressed patients did not indicate that clinicians in university-affiliated settings (23) were more likely than those in HMOs (2,24) to diagnose and successfully treat depression.
How broad a sample to include
An effectiveness model requires that we think in terms of broad and representative groups of patients or clinics (populations) rather than simply in terms of use of specific services, such as specialty mental health care. However, there should also be a balance between generalizability and practicality within the research environment. Samples that are large enough to demonstrate differences in clinical outcomes are far too small to demonstrate differences in cost-effectiveness (25,26). Our study was powered to detect differences in clinical outcomes, allowing for an attrition rate of 25 percent. A total of 360 patients from three sites would be randomly assigned to study groups, a sample size that would provide adequate power for comparing clinical and functional outcomes between the intervention and control groups.
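The implied arithmetic, assuming equal allocation to the intervention and control groups (an assumption made here for illustration; the allocation ratio is not restated at this point), is straightforward:

360 randomized × (1 − .25 attrition) = 270 patients with complete follow-up, or about 135 per group.

Read in the other direction, a target of 270 completers inflated for 25 percent attrition requires 270 / (1 − .25) = 360 randomized patients.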
Cost-effectiveness estimates derived from this sample could be used to model likely outcomes in a future, larger dissemination study. To accommodate this medium-range sample size, we aimed to be more inclusive than most efficacy studies of panic disorder, with broader socioeconomic and ethnic diversity, while retaining some exclusions.
All patients between the ages of 18 and 70 years who were seeking care at the participating primary care clinics were eligible to participate in the study. Although patients who visit a physician might be inherently sicker than those who are randomly selected from clinic rolls, treatment-seeking patients are more likely to be motivated to accept and benefit from treatment (24). Patients had to be willing to consider—but not necessarily willing to accept or to continue to accept—an intervention that combined psychotherapy and medication. Offering both treatment modalities was a compromise that standardized treatment while partially accommodating patients’ preferences. To ensure generalizability, we limited the exclusion criteria to patients over the age of 70 years (because they would have more impairment from comorbid medical conditions and possible difficulty tolerating the interoceptive exposure used in cognitive-behavioral therapy), patients with comorbid conditions that were life threatening (such as active suicidality or terminal medical illness) or that would limit their participation in the study (psychosis, substance abuse, dementia, illiteracy, or pregnancy), and patients who were receiving or applying for disability benefits (because of their compromised motivation to improve).
Efficient methods for selecting patients
Effectiveness studies often use two-stage recruitment methods to select representative samples. We selected a highly sensitive—but only modestly specific—two-question panic disorder screen (27), necessitating a follow-up interview to exclude false-positive results. To minimize cost, we screened on high-volume clinic days, which maximized yield at the expense of obtaining a true epidemiologic sample. However, to maximize the size of the sample, we also elected to recruit by physician referral, a method similar to the recruitment strategies of many efficacy studies. The confirmatory diagnostic interview for panic disorder was conducted by telephone with use of the World Health Organization’s telephone-validated 12-Month Composite International Diagnostic Interview (CIDI) (28). A preliminary psychopharmacology effectiveness study that used similar methods (29) demonstrated minimal recruitment bias and maximal participation by eligible patients.
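A worked example illustrates why the confirmatory interview is indispensable after a sensitive but nonspecific screen. Suppose, purely for illustration (these are not the published operating characteristics of the two-question screen), a sensitivity of .95, a specificity of .70, and a true prevalence of panic disorder of 5 percent among screened patients. The positive predictive value of the screen is then

PPV = (.95 × .05) / [(.95 × .05) + (1 − .70)(1 − .05)] = .0475 / .3325 ≈ .14,

so only about one in seven screen-positive patients would actually have panic disorder, and the telephone CIDI serves to remove the large majority of false positives before randomization.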
Timing, method, and type of outcome assessment
We chose to assess outcomes at three, six, nine, and 12 months after study entry. The first time point ensured that most patients would have completed the cognitive-behavioral therapy, thus allowing measurement of acute treatment effects. Although we knew that follow-up intervals longer than 12 months would be beneficial for demonstrating cost-effectiveness, this concern was balanced by the absence of data on the short-term efficacy of our intervention. A one-year follow-up would increase the chance that effects of treatment could be detected in health-related quality of life and functional status (30), and preliminary cost-effectiveness differences could be used to model effect sizes for a larger study.
We elected to perform assessments by telephone to reduce patients’ burden and maximize successful follow-up, thus minimizing problems related to missing data. This choice was supported by extensive data on the reliability and validity of telephone assessments in multiple outcome domains (31–33) and by their effective use in previous studies of panic disorder (34,35). We used the depression studies of Wells and colleagues (7) to identify relevant outcome domains—for example, economic domains and barriers-to-care domains—and specific assessment tools that are generally not available from efficacy studies. Weighing issues of breadth versus depth of measurement, we selected the assessment battery to maximize coverage of a range of health, economic, and treatment domains while minimizing patients’ burden.
The initial assessment established study eligibility and gathered baseline data by using selected modules from the lay-interviewer-administered CIDI (28) (to eliminate the need for clinician expertise), dimensional measures of major panic-related clinical areas, and measures of functional status and quality of life (36–39). We added measures of direct costs (for example, service use) and indirect costs (for example, lost workdays) to allow cost-effectiveness analysis.
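These cost measures feed the standard incremental cost-effectiveness ratio (the effectiveness denominator named below is illustrative; the metric is not specified at this point in the article):

ICER = (C_intervention − C_usual care) / (E_intervention − E_usual care),

that is, the additional cost of the intervention per additional unit of clinical effect gained, for example, dollars per anxiety-free day.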
We also established criteria for assessing the quality of care received, in terms of both medication and cognitive-behavioral treatment, to allow an as-treated analysis (an assessment of how well patients who received “good” treatment fared). In addition, we developed several new measures for the study—for example, beliefs about medications—and chose, from larger batteries, subsets of the items we needed—for example, the Mini International Neuropsychiatric Interview (MINI) (40) for a bipolar diagnostic screen, the Alcohol Use Disorders Identification Test (AUDIT) (41) for alcohol use, the World Health Organization’s Disability Scale (42) for a broadly used disability measure, and items about service use, employment, income, and insurance from work by Wells (7,8). The readability of the assessment instrument was adjusted to a sixth-grade level, and three experts revised it to ensure its acceptability and appropriateness for persons from ethnic minority groups.
Making effective interventions more practical
The decision about what type of intervention to use was based on efficacy studies showing that a combination of cognitive-behavioral therapy and pharmacotherapy was at least as good as either approach alone (20,21,43) and, among patients who were more severely ill (44), was somewhat better. Because previous primary care depression intervention studies documented the effectiveness of fusing evidence-based treatments with a collaborative care treatment model that targeted patient, physician, and care-process barriers to obtaining high-quality psychiatric care (2–4), we integrated the cognitive-behavioral and medication treatment modalities within this disease management framework.
Once we had our ideal treatment package, we tried to adapt the treatment to usual care conditions in an effort to maximize generalizability. First, we reduced the cognitive-behavioral therapy from 12 to six sessions, preserving the original content (20) while expanding it to include assessment of and intervention for medication adherence issues as well as additional modules on coping with the phobic behavior, depression, and social anxiety that commonly accompany panic.
Next, to approximate what would be feasible in nonuniversity community primary care settings, we chose master’s-level clinicians who were not familiar with cognitive-behavioral therapy to administer treatment. They received training of a kind that might be carried out in an HMO setting, with periodic telephone supervision from the expert who wrote the cognitive-behavioral therapy manual (the third author). Because primary care physicians were the usual care providers who prescribed medications for their patients who had panic disorder, it was important to optimize their medication administration by having consultant psychiatrists advise them about medication type, dosage, and side effect management. This use of expert knowledge systems delivered by allied mental health professionals, such as nurses, is a core part of chronic disease management programs (5) and has been used to improve care of other chronic illnesses (45–47).
Although the selection of combined treatment did not specifically address patients’ treatment preferences, it was meant to facilitate recruitment by indirectly accommodating them. Patients were asked only to agree to a treatment package that included medications, cognitive-behavioral therapy, and elements of collaborative care. In practice, the preferences of the patient might emerge and reduce compliance with one or another of the components. Such an outcome, although it would be tracked, would not compromise the patient’s participation in the protocol.
Patients received a videotape about the treatment of panic as well as a workbook geared to the cognitive-behavioral therapy sessions. The patient workbook and the structured cognitive-behavioral therapy were revised after review by an ethnic advisory board, and readability was adjusted to a sixth-grade level. All sessions were conducted in primary care settings to minimize patients’ burden, maximize treatment acceptance (48), and facilitate communication with the primary care physician. Patients received the cognitive-behavioral therapy sessions, which were not part of their usual care, free of charge. However, they were required to pay for medications according to their insurance plan. The study was approved by the human subjects review committee of the University of Washington.
Conclusions
Our experience suggests that clinical and services researchers can work together to develop hybrid study designs that contain key elements drawn from efficacy and effectiveness research. The trade-offs and compromises necessary for optimal design of such studies are informed by study goals and feasibility concerns. By conducting our study in primary care, we preserved the usual care setting and providers, although we added a “physician extender” (therapist) and used a less typical primary care site—a university. We allowed a broad range of comorbid psychiatric and medical illnesses but preserved some exclusions for safety and feasibility reasons. Although we recruited participants by conducting screenings in the waiting room—consistent with a population-based approach—we also accepted referrals from physicians in the primary care clinics to minimize costs.
Because a standardized treatment was necessary, we tried to preserve usual care conditions by delivering pharmacotherapy and psychotherapy (cognitive-behavioral therapy) in a manner similar to the chronic disease self-management approaches commonly used in medical settings, with novice therapists trained to deliver cognitive-behavioral therapy, as might be done if the program were later initiated by an HMO. We tried to partially accommodate treatment preferences by letting patients’ adherence to particular elements of the treatment package be an outcome measure, without having preferences influence initial treatment allocation, which would have complicated the design.
We used a modestly large sample, which allowed for assessment of functional and quality-of-life outcomes but only preliminary cost estimates. Our assessments were broad based, socially relevant, conducted by telephone, and adapted for persons with low levels of education and for persons from different ethnic groups. An intermediate (one-year) duration of follow-up was chosen to balance the cost of the study against the need to assess both the durability of the clinical effect and the functional status and quality-of-life domains, which take longer to change than short-term clinical response.
Both the emphasis on generalizability and the larger samples demanded by effectiveness research dictate that study participants be recruited from multiple sites. If clinical researchers want to ensure that the interventions they test will be effective not only in controlled, clinical research settings but also in situations that mirror more closely the “real world,” it will be necessary for them to consider many of the points presented in this article in their future research plans.