The high prevalence of major and subthreshold depression in the primary care setting (1, 2, 3, 4, 5), in conjunction with high levels of associated disability (6, 7, 8), makes it imperative that primary care physicians provide effective treatment for their depressed patients. Randomized controlled trials have clearly shown that enhancing the quality of pharmacotherapy and psychotherapy improves the outcomes of depressed patients (9, 10, 11, 12, 13). However, to our knowledge only one observational study conducted in routine practice settings has shown that higher-quality care is associated with better clinical outcomes (14). In the Medical Outcomes Study, patients with severe depression who received higher-quality care, especially those receiving both counseling and appropriate antidepressant medication, experienced greater improvement in functional limitations (14).
The hypothesis of the nonexperimental study reported here was that patients receiving care for depression provided in concordance with treatment guidelines developed by the Agency for Healthcare Research and Quality (AHRQ) (15) would have better outcomes than patients receiving care that is not in concordance with the guidelines (9). Depression treatment is defined as guideline concordant if either the physician prescribes an adequate dosage of an antidepressant medication and the patient adheres to the pharmacotherapy regimen over time or the patient makes regular visits to a mental health specialist for psychotherapy (15). However, in routine practice, fewer than half of patients treated for depression receive care in concordance with AHRQ guidelines (10, 16, 17, 18, 19), and practice patterns appear to vary substantially across clinical settings (20).
Although subthreshold or minor depression is defined differently in different studies (1), it has been found to be associated with substantial impairment and to be an important risk factor for the onset of a major depressive episode (6, 21, 22, 23, 24, 25). An important area of debate is whether patients with subthreshold depression should be treated or whether a period of watchful waiting should be initiated instead (4, 26, 27). The empirical evidence concerning the effectiveness of treating subthreshold depression is mixed, with some studies reporting that treatment is effective (8, 28, 29, 30) and others that it is not (10, 11, 12). To contribute to the literature on treating subthreshold depression, this analysis estimated separate treatment effects for providing guideline-concordant care to patients with subthreshold depression and those with major depression.
Methods
The sampling procedures have been described in detail elsewhere (31). Briefly, 11,078 people in rural and urban areas of Arkansas were screened for depression over the telephone in 1993. Of 636 eligible participants, 470 (73.9 percent) participated in a face-to-face interview at baseline. We conducted follow-up telephone interviews with 455 of these subjects (96.8 percent) six months after the baseline interview to determine the type and extent of services used. To adjust for the sampling design and to make the study participants more representative of those who screened positive, all descriptive and inferential statistics were calculated using statistical weights.
Subjects were asked to identify by name and address all health care professionals visited between baseline and the six-month follow-up. Additional health care professionals not named by the subjects were identified from their insurance or pharmacy records or from the medical records of the identified health care professionals. Complete sets of billing, medical, and pharmacy records were collected for 435 subjects (95.6 percent) who completed the six-month follow-up interview.
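As a quick arithmetic check on the sample flow described above, the following sketch recomputes the reported retention percentages from the counts given in the text. The constant and function names are illustrative assumptions and are not taken from the study's materials.

```python
# Counts taken from the sampling description; names are illustrative only.
ELIGIBLE = 636
BASELINE_INTERVIEWS = 470
FOLLOW_UP_INTERVIEWS = 455
COMPLETE_RECORDS = 435


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Each rate uses the preceding stage of the sample flow as its denominator.
baseline_rate = percent(BASELINE_INTERVIEWS, ELIGIBLE)
follow_up_rate = percent(FOLLOW_UP_INTERVIEWS, BASELINE_INTERVIEWS)
records_rate = percent(COMPLETE_RECORDS, FOLLOW_UP_INTERVIEWS)

print(baseline_rate, follow_up_rate, records_rate)
```

Each percentage is computed against the preceding stage (eligible, interviewed at baseline, interviewed at follow-up), which is the reading that reproduces the 73.9, 96.8, and 95.6 percent figures in the text.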
A rigorous protocol was developed to abstract accurate information on service use from the medical, pharmacy, and billing records. For each outpatient visit, we abstracted information about the diagnoses made, medications prescribed, and symptoms mentioned in the clinical notes. A depression visit was defined as any outpatient encounter during which a diagnosis of depression was made, an antidepressant medication was prescribed, or depression was mentioned in the clinical notes.
Using these criteria, 115 of the 435 subjects with current depressive symptoms had a depression visit in the six months after the baseline interview. Of the 115 patients, 106 had complete data for all variables used in the analysis. These 106 patients were treated for depression by 105 different providers. The 106 patients constituted the sample used to compare the outcomes of those receiving guideline-concordant care with those not receiving concordant care. Outcomes included disease-specific measures of depression severity and generic measures of mental health and physical health.
Clinical improvement was measured by the change between baseline and six months in the severity score on the Inventory to Diagnose Depression (IDD). The IDD is a self-reported depression severity scale that has been shown to be reliable and highly correlated with interviewer-rated scales (32). Possible IDD scores range from 0 to 100. Lower scores indicate less severe depression; therefore, change scores (pre minus post) that are positive indicate improved outcomes.
Outcomes not specific to depression were measured using two subscales from the Short Form Health Survey (SF-36) (33, 34): the mental component summary score (MCS) and the physical component summary score (PCS). The dependent variables representing functioning were measured by the change in the MCS and PCS scores between baseline and six months. The change scores were reverse-coded so that positive values would represent improvement.
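The two change-score conventions described in the preceding paragraphs can be sketched as follows. This is a minimal illustration, assuming that "reverse-coded" for the SF-36 summary scores means follow-up minus baseline (because higher MCS and PCS scores indicate better health); the function names are hypothetical.

```python
def idd_change(baseline: float, follow_up: float) -> float:
    """IDD change score (pre minus post).

    Lower IDD scores mean less severe depression, so a positive
    value indicates improvement.
    """
    return baseline - follow_up


def sf36_change(baseline: float, follow_up: float) -> float:
    """MCS or PCS change score, assumed here to be post minus pre.

    Higher summary scores mean better health, so this direction
    also makes a positive value indicate improvement.
    """
    return follow_up - baseline


# Hypothetical patient: depression severity falls, mental health rises.
print(idd_change(30.0, 20.0), sf36_change(40.0, 45.0))
```

With both outcomes coded this way, a positive regression coefficient can be read as improvement regardless of which outcome is being modeled.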
Quality of depression care was defined dichotomously. Treatment received in the six months after baseline was categorized as high quality if it was provided in concordance with AHRQ guidelines for depression care during the acute stage of illness. Treatment was defined as guideline concordant if during the six months after baseline, the patient reported taking an adequate dosage of antidepressant medications for 75 percent or more of the days during an eight-week period or the administrative records indicated that the patient made eight or more visits to a specialty mental health provider over a 12-week period. It should be noted that this definition of guideline-concordant care is based on both patient self-report and administrative records, that is, medical, billing, and pharmacy records.
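The dichotomous quality definition above can be sketched in code. This is only an illustration under stated assumptions: it supposes that adherence is available as one flag per day (adequate dosage taken or not) and that specialty visits are recorded as day offsets from baseline, and it uses sliding windows to look for any qualifying eight-week or 12-week period. The study's actual data layout and abstraction rules are not described at this level, so all names and data structures here are hypothetical.

```python
from typing import List, Sequence

MEDICATION_WINDOW_DAYS = 56   # eight weeks
VISIT_WINDOW_DAYS = 84        # 12 weeks
MIN_ADHERENCE = 0.75          # 75 percent or more of days
MIN_SPECIALTY_VISITS = 8


def best_adherence(daily_adequate: Sequence[bool],
                   window_days: int = MEDICATION_WINDOW_DAYS) -> float:
    """Highest share of adequately dosed days in any full window."""
    if len(daily_adequate) < window_days:
        return 0.0  # assumption: a full window must be observed
    running = sum(daily_adequate[:window_days])
    best = running
    for i in range(window_days, len(daily_adequate)):
        # Slide the window forward by one day.
        running += daily_adequate[i] - daily_adequate[i - window_days]
        best = max(best, running)
    return best / window_days


def max_visits(visit_days: List[int],
               window_days: int = VISIT_WINDOW_DAYS) -> int:
    """Largest number of visits falling within any single window."""
    days = sorted(visit_days)
    best = 0
    start = 0
    for end in range(len(days)):
        # Shrink the window until it spans fewer than window_days.
        while days[end] - days[start] >= window_days:
            start += 1
        best = max(best, end - start + 1)
    return best


def is_guideline_concordant(daily_adequate: Sequence[bool],
                            specialty_visit_days: List[int]) -> bool:
    """Concordant if either the medication or the visit criterion is met."""
    return (best_adherence(daily_adequate) >= MIN_ADHERENCE
            or max_visits(specialty_visit_days) >= MIN_SPECIALTY_VISITS)
```

For example, a hypothetical patient with 42 adequately dosed days inside one 56-day window meets the medication criterion exactly (42/56 = 0.75), whereas eight specialty visits spread over 20 weeks would not meet the visit criterion.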
Covariates in the analysis included age, gender, the ratio of family income to the poverty line, physical comorbidity, psychiatric comorbidity, and treatment sector. Physical comorbidity was measured at baseline by the number of chronic physical problems the subject reported, using a standardized checklist. Psychiatric comorbidity was measured at baseline using the Quick Diagnostic Interview Schedule and represents a count of nine potential lifetime psychiatric diagnoses other than depression or dysthymia (35). Patients making any visits to a mental health specialist were categorized as being treated in the mental health sector.
We used ordinary least-squares regression analysis in conjunction with the change scores to test the hypothesis that those receiving guideline-concordant care had better outcomes. To determine whether the effectiveness of higher-quality care was the same for patients with major depression and subthreshold depression, an interaction term between quality of care and meeting diagnostic criteria for major depression was specified. A count of depression symptoms (zero to nine symptoms) experienced in the two weeks before the baseline interview was made using the IDD. Subjects reporting five or more symptoms were coded as meeting diagnostic criteria for current major depression, and those with fewer than five symptoms were coded as meeting criteria for subthreshold depression. The IDD has been shown to have high specificity and sensitivity compared with diagnostic assessment by a clinician (32, 35).
Because receiving guideline-concordant care and meeting diagnostic criteria were both measured dichotomously, the main effect for guideline-concordant care represents the treatment effect for those with subthreshold depression, and the interaction term represents the added treatment effect for those meeting diagnostic criteria for current major depression. The overall treatment effect for those meeting diagnostic criteria is given by the sum of the main effect and the interaction effect. The significance of the overall treatment effect for those meeting diagnostic criteria was determined using an F test, which tests whether the sum of the coefficients for the main effect and the interaction effect is different from zero.
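The regression specification described above can be written out explicitly. The notation below is an assumption introduced for illustration (the source does not give symbols): $\Delta Y_i$ is the change score for patient $i$, $G_i$ indicates guideline-concordant care, $D_i$ indicates meeting diagnostic criteria for current major depression (five or more symptoms), and $\mathbf{X}_i$ holds the covariates.

```latex
\begin{align*}
\Delta Y_i &= \beta_0 + \beta_1 G_i + \beta_2 D_i + \beta_3 (G_i \times D_i)
              + \boldsymbol{\gamma}^{\top}\mathbf{X}_i + \varepsilon_i \\[4pt]
\text{Treatment effect, subthreshold depression } (D_i = 0) &: \quad \beta_1 \\
\text{Treatment effect, major depression } (D_i = 1) &: \quad \beta_1 + \beta_3 \\[4pt]
H_0 &: \quad \beta_1 + \beta_3 = 0
\end{align*}
```

Under this notation, the F test described in the text is a test of the single linear restriction $H_0$, while the ordinary t test on $\beta_1$ addresses the treatment effect for patients with subthreshold depression.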
Because this was an observational study, the estimated treatment effect is negatively biased if patients receiving guideline-concordant care have greater unmeasured illness severity than patients not receiving guideline-concordant care. It should be noted that both provider and patient behavior may contribute to negative selection bias. Negative selection bias may result from clinicians providing higher-quality care to patients with greater unmeasured severity of illness and from patients with greater unmeasured severity of illness being more likely to adhere to treatment regimens.
Because selection bias was potentially an important problem in this nonexperimental study (36), we conducted a sensitivity analysis using a Heckman selection model. In the probit selection equation, which predicted guideline-concordant care, the instrumental variable was specified to be the travel time to the provider (37). Travel time has been shown to significantly and substantially predict receiving guideline-concordant care, but it should not be associated with outcomes directly (31). The Heckman selection model was estimated using full information maximum likelihood and included all observations in both the probit selection equations and the linear outcomes equations.
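One common way to write a two-equation model of this kind is sketched below. This is a hedged illustration, not the authors' stated specification: the symbols, the covariate vector $\mathbf{W}_i$ in the selection equation, and the bivariate normal error assumption are all assumptions made here for exposition. $T_i$ denotes travel time to the provider, $G_i$ guideline-concordant care, $D_i$ major depression, and $\Delta Y_i$ the change score.

```latex
\begin{align*}
G_i^{*} &= \alpha_0 + \alpha_1 T_i + \boldsymbol{\alpha}^{\top}\mathbf{W}_i + u_i,
  \qquad G_i = \mathbf{1}\,[\,G_i^{*} > 0\,] \\[4pt]
\Delta Y_i &= \beta_0 + \beta_1 G_i + \beta_2 D_i + \beta_3 (G_i \times D_i)
  + \boldsymbol{\gamma}^{\top}\mathbf{X}_i + \varepsilon_i \\[4pt]
(u_i, \varepsilon_i) &\sim \text{bivariate normal}, \quad
  \operatorname{Var}(u_i) = 1, \quad
  \operatorname{Var}(\varepsilon_i) = \sigma^2, \quad
  \operatorname{Corr}(u_i, \varepsilon_i) = \rho
\end{align*}
```

In a formulation like this, travel time $T_i$ appears only in the selection equation, which is what allows it to serve as the instrument, and $\rho = 0$ corresponds to no selection on unobserved factors, in which case the outcome equation reduces to the ordinary least-squares model.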
Results
Descriptive statistics for the 106 patients in the study are provided in Table 1. Sixty percent met diagnostic criteria for current major depression, and the remaining 40 percent met the criteria for subthreshold depression. Eighty-seven percent of subjects met lifetime criteria for major depression. Approximately 29 percent of the sample received depression care in concordance with AHRQ guidelines. Almost all of the patients defined as receiving guideline-concordant care were getting an adequate dosage and duration of antidepressant medication: 100 percent of the patients in the general medical sector and 85 percent of those in the mental health sector.
The mean±SD IDD change score was 3.22±14.65; change scores ranged from -32.0 to 56. A quarter of the sample (24 percent) improved by more than 10 points, and 10 percent improved by more than 20 points. In contrast, 17 percent of the patients worsened by more than 10 points, and 7 percent worsened by more than 20 points.
The two diagnostic groups (those with major depression and those with subthreshold depression) and the two quality-of-care groups (those who received guideline-concordant care and those who did not) define four distinct groups of patients. Although the regression analysis used an interaction term to test the study's hypothesis, it is also helpful to look at outcomes in each of these four groups separately. The unadjusted outcomes and 95 percent confidence intervals for each of the four groups are graphically displayed in Figure 1. For patients with major depression, improvements in depression severity and mental health were greater for those receiving guideline-concordant care than for those not receiving guideline-concordant care, but the difference was not statistically significant. For patients with subthreshold depression, those receiving guideline-concordant care did not appear to have better outcomes than those not receiving guideline-concordant care. Patients who met diagnostic criteria for major depression appeared to have had greater improvement in depression severity than patients who did not meet diagnostic criteria.
Table 2 shows the results of the standard regression analyses. For the regression predicting the change in the IDD score, a positive parameter estimate indicated improvement in depression severity. The most substantial and significant predictor was the variable indicating whether the patient met diagnostic criteria for major depression. Patients with five or more symptoms had a 9.2-point improvement over patients with four or fewer symptoms. Guideline-concordant care had a nonsignificant main effect and a significant (p=.02) interaction effect with a diagnosis of major depression.
The nonsignificance of the main effect in conjunction with the significance of the interaction effect indicates that patients with major depression who received guideline-concordant care showed greater improvement in depression severity than patients who did not receive guideline-concordant care, but that patients with subthreshold depression who received guideline-concordant care did not show greater improvement than patients who did not receive guideline-concordant care.
The total impact of receiving guideline-concordant care for patients with major depression is given by the sum of the regression coefficients for the main effect (-4.2) and the interaction effect (+14.3). Thus, among patients with major depression, those receiving guideline-concordant care were predicted to improve 10.1 points more in depression severity than those not receiving such care. The total treatment effect for patients with major depression was significant (F=6.04, df=1, 90, p=.02).
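The sum of coefficients described above can be reproduced with a short calculation. The sketch below uses only the two coefficients reported in the text; the constant and function names are illustrative assumptions.

```python
# Coefficients reported for the regression predicting IDD change.
MAIN_EFFECT = -4.2          # guideline-concordant care (main effect)
INTERACTION_EFFECT = 14.3   # concordant care x major depression


def treatment_effect(major_depression: bool) -> float:
    """Predicted effect of guideline-concordant care on the IDD change score.

    The interaction coefficient is added only for patients who meet
    diagnostic criteria for current major depression.
    """
    effect = MAIN_EFFECT + (INTERACTION_EFFECT if major_depression else 0.0)
    return round(effect, 1)  # round away floating-point noise


print(treatment_effect(True), treatment_effect(False))
```

The value for patients with subthreshold depression is simply the nonsignificant main effect, so its negative sign should not be read as evidence of harm.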
The MCS and PCS change scores were reverse-coded so that a positive parameter estimate indicates improvement. Patients with major depression were not significantly more likely than those with subthreshold depression to show an improvement in the MCS and PCS scores. Neither the main effect nor the interaction effect for guideline-concordant care was significant in either regression. Likewise, the total treatment effect for patients with major depression who received guideline-concordant care was not statistically significant according to the F tests.
Table 3 shows the results of the Heckman selection model. The first set of parameter estimates is from the selection equations—the probit regressions predicting receipt of guideline-concordant care. The second set of parameters is from the corrected outcomes equations—the linear regression predicting the change score.
In the model, decreasing travel time by one hour in this rural sample increased the odds of receiving guideline-concordant care by a factor of 3.4 (31). Although travel time is not a weak instrument, it is also not a particularly strong instrument. Therefore, the instrumental variables analysis should be interpreted only as a sensitivity analysis.
Compared with the standard analysis of change in depression severity, the Heckman selection model estimated a substantially smaller (negative) main effect for guideline concordance and a smaller and less significant (p=.06) interaction effect for a diagnosis of major depression. The estimate of rho was negative (-.56) and significantly different from zero (p=.001), suggesting that unobserved case-mix factors were positively correlated with receiving guideline-concordant care and negatively correlated with improved depression severity—that is, the instrumental variables analysis suggests that there was negative selection bias in the standard analysis.
The total treatment effect estimated by the selection model was larger than the total effect in the standard analysis (-.3+12.5=12.2). The Wald test of whether the main effect plus the interaction effect was different from zero was significant (χ²=6.82, df=1, p=.01), which indicates that there was a positive treatment effect for patients with major depression. Results of the selection model suggest that the standard analysis may have overestimated the negative treatment effect of guideline-concordant care for patients with subthreshold depression and underestimated the total positive treatment effect (main effect plus interaction effect) for patients with major depression. In the Heckman models predicting change in MCS and PCS scores, the estimates of rho were not significantly different from zero, indicating no selection bias.
Discussion and conclusions
This study monitored patients treated for current depression over a six-month period to determine the impact of quality of care, as measured by concordance with AHRQ treatment guidelines, on clinical outcomes. Only 29 percent of depressed patients treated in routine practice settings received guideline-concordant care during the study period, which is consistent with results obtained from the National Comorbidity Survey (18).
Results of the regression analyses indicated that patients with major depression who received guideline-concordant care experienced significantly and substantially greater improvement in depression severity than patients who did not receive such care. Thus this purely observational study has documented a positive relationship between guideline-concordant care and clinical depression outcomes in routine practice settings. The findings support the results of previous experimental studies on the effectiveness of enhanced depression treatment (9, 10, 11, 12, 13).
A possible reason why so few previous observational studies have supported the hypothesis that higher-quality care leads to better outcomes is that nonrandom selection into the high-quality group may result in selection bias, which causes the regression analysis to underestimate the true treatment effect. Selection bias probably caused the treatment effect to be underestimated in this analysis as well, and thus the treatment effects reported here should be interpreted as conservative estimates. In fact, the sensitivity analysis conducted using a Heckman selection model estimated a treatment effect 21 percent larger ([12.2-10.1]/[10.1]) than in the standard analysis.
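The comparison of the two estimates can be checked with a few lines of arithmetic. The sketch below uses only figures reported in the text (a total effect of 10.1 points in the standard analysis, and a main effect of -.3 plus an interaction effect of 12.5 in the selection model); the variable names are illustrative assumptions.

```python
# Total treatment effect for major depression in the standard analysis.
STANDARD_TOTAL = 10.1

# Selection-model coefficients: main effect and interaction effect.
SELECTION_MAIN = -0.3
SELECTION_INTERACTION = 12.5

# Total effect under the selection model, rounded to one decimal place.
selection_total = round(SELECTION_MAIN + SELECTION_INTERACTION, 1)

# Relative difference between the two totals, as a whole percentage.
relative_increase = (selection_total - STANDARD_TOTAL) / STANDARD_TOTAL
percent_larger = round(100 * relative_increase)

print(selection_total, percent_larger)
```

The unrounded relative difference is about 20.8 percent, which rounds to the 21 percent cited in the text.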
However, it should be noted that the treatment effect estimated from the selection model is generalizable only for the subsample of patients whose care could theoretically have been affected by the instrumental variable used, travel time to care (38). Specifically, the larger treatment effects estimated by the selection model are not generalizable to patients whose severity of illness was so low or so high that geographic accessibility would not have influenced whether they received guideline-concordant care.
Results also indicated that there was not a positive treatment effect for patients with subthreshold depression. The positive impact of receiving guideline-concordant care for patients meeting diagnostic criteria was statistically significant and clinically relevant. However, the estimated treatment effect for those with subthreshold depression was not significant in this sample. This finding suggests that patients meeting diagnostic criteria for major depression may respond better to medication management than patients with subthreshold depression. However, the larger treatment effect observed for patients with major depression than for patients with subthreshold depression may also simply reflect a floor effect. Patients meeting diagnostic criteria for major depression had greater depression severity at baseline than patients with subthreshold depression (35.4 versus 13.4, t=12.7, df=104, p<.01). Therefore, patients meeting criteria for current major depression had greater room for improvement than those with subthreshold depression.
Although the severity of depression is closely linked with overall physical and mental health, it should not be surprising that a strong positive relationship was not observed in this relatively small sample during the relatively brief study period. The statistical power to detect the modest increase in incremental R² associated with adding guideline-concordant care to the MCS outcomes regression equation was only .4. The complete lack of association between guideline-concordant care and the PCS change score may have been an artifact of how the PCS score is calculated. Simon and colleagues (39) have argued that because the mental health and role-emotional subscales of the SF-36 actually make negative contributions to the PCS score, improvements in these two subscales may offset potential increases in PCS scores resulting from improvements in the physical health subscales, such as physical function, role-physical, bodily pain, and general health perception.
The design of this study has both strengths and limitations. A major strength is its focus on a single tracer condition, major depression, which allows the relationship between concordance with disease-specific treatment guidelines and disease-specific clinical outcomes to be examined accurately. Another strength is that the research was based on a community sample of individuals and their clinicians. The fact that this sample included just as many providers as patients increases the generalizability of the findings across many different types of treatment settings throughout the state of Arkansas.
The biggest threat to generalizability in this study involves the antidepressant medications that were being prescribed during the study period. The prescribing patterns of providers in this sample reflected a transition period from tricyclic antidepressants to selective serotonin reuptake inhibitors. Consequently, the estimated treatment effect of guideline-concordant care provided during the study period likely underestimates the current treatment effect.
Acknowledgments
This research was supported by grants R01-MH-53629 to John Fortney and R01-MH-49116 and R01-MH-54444 to Kathryn Rost from the National Institute of Mental Health. Dr. Pyne was supported by a Veterans Affairs career development award.