Identifying Frequent Patterns of Care
Claims data assembled by MEDSTAT for the period 1991–1996 from four large self-insured U.S. firms (representing 426,000 employees and dependents), whose mental health benefits were generous by industry standards, were examined to identify outpatient episodes of acute-phase major depressive disorder in adults ages 18–64. The operational definition of a first or recurrent depressive episode required a course of care of up to 16 weeks prompted by an ICD-9 major depressive disorder diagnosis and preceded by a period of 8 or more weeks without depression care. Upon identifying 13,098 such episodes, we analyzed the demographic and clinical information contained in the claims and classified episodes according to the observed treatment, type of provider (primary or specialty care), and patient characteristics.
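The episode definition above can be illustrated with a minimal sketch. It is not the extraction logic actually applied to the MEDSTAT claims; the function name `find_episodes`, the handling of a patient's first observed claim, and the decision to ignore claims that fall beyond 16 weeks without an intervening 8-week gap are all assumptions made for illustration.

```python
from datetime import date, timedelta

WASHOUT = timedelta(weeks=8)      # depression-free gap required before an episode
MAX_COURSE = timedelta(weeks=16)  # acute-phase window counted from the index claim


def find_episodes(claim_dates):
    """Group one patient's depression claim dates into acute-phase episodes.

    Assumption: the first observed claim opens an episode, although a real
    analysis would need 8 weeks of observed enrollment before it.
    """
    episodes = []
    previous = None
    for d in sorted(claim_dates):
        if previous is None or d - previous >= WASHOUT:
            # 8 or more weeks without depression care: a new episode begins
            episodes.append([d])
        elif d - episodes[-1][0] <= MAX_COURSE:
            # still within 16 weeks of the index claim
            episodes[-1].append(d)
        # otherwise: care past 16 weeks with no washout; left out (assumption)
        previous = d
    return episodes


claims = [date(1992, 1, 6), date(1992, 1, 20), date(1992, 3, 2), date(1992, 9, 14)]
episodes = find_episodes(claims)
```

With these hypothetical dates, the first three claims form one episode and the September claim, which follows a gap of more than 8 weeks, opens a second.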
Treatments consisted of an intervention defined according to current procedural codes (Physician’s Current Procedural Terminology, 4th ed. [CPT-4]) and a time descriptor (e.g., tricyclic antidepressant for more than 30 days). Interventions included medications (antidepressants, lithium, and antipsychotics), psychotherapy, combined antidepressant drugs and psychotherapy, brief office visits (i.e., clinical management), unspecified mental health interventions (namely CPT-4 unspecified “focused” visits), and medical interventions only (e.g., diagnostic tests). Drugs were grouped according to known conventions (e.g., selective serotonin reuptake inhibitors [SSRIs]), except for trazodone, amoxapine, and bupropion, which were bundled as “other antidepressants.” Duration of drug treatments was calculated by using pharmacy claims, which provide both number of days and dosages.
Patient characteristics were demographic and clinical variables associated with either the choice of or response to treatment: gender, age (which for women was dichotomized to differentiate by menopausal status: 18–49 years versus 50–64 years), and medical and substance abuse comorbidity, either concurrent or within the previous year. Medical comorbidity was selected on the basis of its impact on treatment choice or prognostic relevance (e.g., ICD-9 disorders of the circulatory system).
After cataloguing all episodes as a combination of treatment, provider, and patient characteristics (or treatment “cell”), we excluded combinations with fewer than 30 episodes over the 6-year period unless deemed clinically relevant (e.g., lithium monotherapy). The final data set contained 9,054 episodes of depression distributed across 120 cells (Table 1). These cells corresponded to 30 treatments modified by two provider types and seven patient characteristics (e.g., treatment with an SSRI for less than 30 days by a specialty care provider for a female subject 18–49 years of age with no medical or substance abuse comorbidity).
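The cell-exclusion rule can be sketched as follows. The representation of a cell as a (treatment, provider, patient group) tuple, the function name `retain_cells`, and the example counts are hypothetical; the set of clinically relevant treatments was a matter of judgment and is simply passed in here.

```python
from collections import Counter

MIN_EPISODES = 30  # threshold over the 6-year period


def retain_cells(episode_cells, clinically_relevant=frozenset()):
    """Count episodes per cell and keep cells that meet the threshold.

    episode_cells: one (treatment, provider, patient_group) tuple per episode.
    clinically_relevant: treatments kept regardless of their frequency.
    """
    counts = Counter(episode_cells)
    return {
        cell: n
        for cell, n in counts.items()
        if n >= MIN_EPISODES or cell[0] in clinically_relevant
    }


# Made-up episode counts, for illustration only
cells = (
    [("SSRI <30 days", "specialty", "female 18-49")] * 40
    + [("lithium monotherapy", "specialty", "male")] * 5
    + [("tricyclic >30 days", "primary", "male")] * 12
)
kept = retain_cells(cells, clinically_relevant={"lithium monotherapy"})
```

In this example the tricyclic cell is dropped for having fewer than 30 episodes, whereas the lithium cell is retained despite its low count.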
Because we wished to characterize the effectiveness of treatment practices commonly used in 1990s usual depression care, the focus of this paper is on the 10 most frequent treatments, observed in 8,160 episodes (90%) of the larger pool (Table 1).
Expert Selection
From suggestions by National Institute of Mental Health staff and other highly regarded depression researchers, we created a short list of U.S.-based practitioners recognized by their peers as expert clinician-researchers in the field of depression. If they had been in clinical practice for at least 5 years and had treated depressed patients in the past year, we invited them to participate. All experts approached accepted our invitation. The expert panel (listed at the end of this article) included four psychiatrists, four clinical psychologists, and two primary care clinicians, all of whom were involved in academic research and clinically active (years practicing, mean=20, range=11–42), with a mean of 64 major depressive disorder patients treated over the previous year (range=4–200).
The Elicitation Process
For a thorough discussion of the elicitation process, please refer to Normand et al. (23). In the first stage of the elicitation, experts independently estimated outcome, defined as treatment-related changes in score on the Hamilton Depression Rating Scale. Experts were asked to consider 100 outpatients with major depression and moderate to severe symptoms (i.e., Hamilton depression scale score=22) seeking treatment in 1998 and estimate what number would fall into each of four outcome categories after 16 weeks of usual care: remission (Hamilton score <8), significant improvement (score <13), mild improvement (score <18), and no change (score ≥18).
First-stage ratings were analyzed to calculate expert agreement. In the second stage of the elicitation, experts met face to face to discuss openly the treatments on which they had disagreed significantly and then independently rerated those treatments. At the meeting, experts agreed that effectiveness ratings should be anchored to likely outcomes given no treatment. They estimated that the no-treatment condition would have an approximate 16-week remission rate of 15%.
Computation of Effectiveness Estimates
After the elicitation was completed, ratings were averaged across the 10 experts. Treatment response (i.e., Hamilton depression scale score <13) was not elicited but rather computed as the sum of the mean probabilities of remission and significant improvement. To illustrate the effectiveness of usual depression care, we report on expert-estimated rates of remission, response, and no change.
For clarity, we operationalized three categories of effectiveness: minimal, moderate, and high. Treatments with expert-estimated rates of ≤20% for remission, ≤45% for response, and ≥25% for no change were classified as minimally effective. Treatments with rates of ≥30% for remission, ≥60% for response, and ≤15% for no change were classified as highly effective. Treatments with outcomes between these two poles were classified as moderately effective.
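The response computation and the three effectiveness categories can be expressed as a short sketch. It assumes that all three criteria must hold jointly for a treatment to be classified as minimally or highly effective, which is one reading of the thresholds above; the function name `classify` and the example rates are hypothetical.

```python
def classify(remission, significant_improvement, no_change):
    """Classify a treatment from mean expert-estimated rates (in percent).

    Response is not elicited directly; it is the sum of the remission and
    significant-improvement rates.
    """
    response = remission + significant_improvement
    if remission <= 20 and response <= 45 and no_change >= 25:
        return "minimal"
    if remission >= 30 and response >= 60 and no_change <= 15:
        return "high"
    # everything between the two poles
    return "moderate"


# Made-up rates, for illustration only
low = classify(remission=15, significant_improvement=25, no_change=30)
mid = classify(remission=25, significant_improvement=25, no_change=20)
top = classify(remission=35, significant_improvement=30, no_change=10)
```

Under this reading, a treatment that meets only some of the criteria for either pole falls into the moderate category.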
Standard errors for the weighted mean probabilities were calculated by using bootstrap methods (26). We did not evaluate the statistical significance of the differences among all 10 treatments at each outcome category because of the inevitable methodological problems associated with multiple comparisons. When we did test for differences, empirically calculated 95% confidence intervals (CIs) were used.
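As a generic illustration of the bootstrap approach, the sketch below resamples expert ratings with replacement and derives a standard error and a percentile 95% CI for the mean. It is not the procedure described in reference 26: the resampling unit (experts), the use of an unweighted mean, the percentile interval, the function name `bootstrap_mean`, and the ratings themselves are all assumptions.

```python
import random
import statistics


def bootstrap_mean(ratings, n_boot=2000, seed=0):
    """Return the mean, its bootstrap standard error, and a percentile 95% CI."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(ratings)
    # Mean of each resample of size n drawn with replacement, sorted ascending
    means = sorted(
        statistics.fmean(rng.choices(ratings, k=n)) for _ in range(n_boot)
    )
    se = statistics.stdev(means)
    lower = means[int(0.025 * n_boot)]
    upper = means[int(0.975 * n_boot) - 1]
    return statistics.fmean(ratings), se, (lower, upper)


# Hypothetical remission ratings (percent) from 10 experts
ratings = [20, 25, 30, 35, 30, 25, 40, 20, 30, 35]
mean, se, (lower, upper) = bootstrap_mean(ratings)
```

Two treatments could then be compared by checking whether their empirically calculated intervals overlap, in the spirit of the approach described above.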
System effectiveness, the sum of all the effects associated with depression care in a population, was operationalized as a weighted average of expert-estimated rates of remission, response, and no change for the 10 most frequent treatments.
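A minimal sketch of this weighted average follows, assuming that each treatment is weighted by its number of observed episodes (the weights are not specified above). The function name `system_effectiveness`, the dictionary keys, and the example figures are hypothetical.

```python
def system_effectiveness(treatments):
    """Episode-weighted average of each outcome rate across treatments.

    treatments: dicts with an 'episodes' count and percent rates for
    'remission', 'response', and 'no_change'.
    """
    total = sum(t["episodes"] for t in treatments)
    return {
        outcome: sum(t["episodes"] * t[outcome] for t in treatments) / total
        for outcome in ("remission", "response", "no_change")
    }


# Made-up figures for two treatments, for illustration only
example = [
    {"episodes": 300, "remission": 30, "response": 60, "no_change": 15},
    {"episodes": 100, "remission": 10, "response": 40, "no_change": 35},
]
overall = system_effectiveness(example)
```

Because the first treatment accounts for three-quarters of the episodes in this example, the overall rates lie closer to its rates than to those of the second treatment.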