Medicare or other claims have been used as case-finding tools for a variety of purposes, such as quality measurement or improvement, comparative effectiveness research, and economic assessments (
5,
7–
9) that are aimed at improving depression outcomes. However, the quality of such research depends on how well administrative claims data identify individuals with depression and those who receive care. For example, if the goal is to determine the quality of treatment among those treated, case-finding methods based on pharmacy claims or procedure codes may be adequate. The needs of researchers also likely differ from those of administrators seeking to use claims data for public reporting or quality monitoring. However, because claims may cover large populations and are readily available, using claims data is considerably less expensive than alternative approaches requiring primary data collection. The key is to understand when and how these data can be used appropriately.
The purpose of this study was to examine how closely codes for depression from claim files correlate with symptoms and identification of depression as measured by independent research assessment instruments. We chose to use two types of data that are frequently available in clinical and policy research settings: Medicare administrative claims and patient self-assessed depressive symptoms.
Discussion
Claims data are used for a variety of purposes, and the desired accuracy depends largely on the use. The key is to understand when and how they can be used appropriately. Our findings demonstrate that Medicare administrative data are more accurate for ruling out depression among older adults than for identifying those with depression. The 10%–46% sensitivity we found is similar to the 17% reported for chronic kidney disease (
24) and 35% for stroke (
25) when medical records and Medicare data were used. At the same time, the sensitivity estimates we report are lower than the 50%–90% sensitivity reported for breast cancer (
26,
27) and Parkinson's disease (
28). Our findings were robust regardless of the length of the follow-up period and were consistent across different codes used to define depression.
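The accuracy measures discussed above all derive from a two-by-two table that cross-classifies the claims-based flag against the reference assessment. The following is a minimal sketch of that arithmetic, not the study's analysis code. The class and function names are hypothetical, and the counts are invented solely to illustrate a low-sensitivity, high-specificity pattern; they are not the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from cross-classifying claims flags against a reference assessment."""
    tp: int  # claims-positive, reference-positive
    fp: int  # claims-positive, reference-negative
    fn: int  # claims-negative, reference-positive
    tn: int  # claims-negative, reference-negative


def validity(table: TwoByTwo) -> dict:
    """Return sensitivity, specificity, PPV, and NPV for one table."""
    return {
        "sensitivity": table.tp / (table.tp + table.fn),
        "specificity": table.tn / (table.tn + table.fp),
        "ppv": table.tp / (table.tp + table.fp),
        "npv": table.tn / (table.tn + table.fn),
    }


# Hypothetical counts chosen for illustration only (not study results).
example = TwoByTwo(tp=30, fp=40, fn=70, tn=860)
metrics = validity(example)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a table of this shape, a high negative predictive value can coexist with low sensitivity when the condition is uncommon, which is the sense in which claims are better suited to ruling out depression than to finding it.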
A number of studies have examined the accuracy of administrative or claims data for identifying depression (
29–
33). Overall, they have reported a substantial discrepancy between the presence of depression diagnoses in medical claims and those in patient charts. Moreover, claims and charts did not correlate very well in identifying the time of symptom onset (
30,
32). Requiring the presence of two or more depression-related visits to a health care provider decreased the number of identified cases and improved the specificity of the identification strategy compared with a less restricted approach (
29). However, the small sample size of several validation studies raises concerns about generalizability, patient and provider self-selection, and statistical robustness (
34).
Spettell and colleagues (
31) identified several challenges that depression adds to those normally encountered in health plan data: lack of recognition of depression by the physician, the physician's or patient's unwillingness to report depression because of stigma, and the role of medical comorbidity in confounding the diagnosis (competing risks). They also mentioned three commonly encountered challenges of health plan data: submission of incomplete data by providers; limited clinical detail in the ICD-9-CM, Current Procedural Terminology (CPT), and diagnosis-related group (DRG) systems; and inaccurate demographic data in administrative files.
In this study we required one health care visit or encounter with a depression code in any of six types of Medicare claims files, including outpatient visits and hospitalizations, which we considered more consistent with the current pattern of care for patients with depression. Also, our study used a Medicare population exclusively, in which the vast majority of participants were not members of managed care health plans (88%) and were aged 65 or older (90%).
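A claims-based identification rule of this kind can be sketched as follows. This is an illustration under stated assumptions, not the study's algorithm: the function names and the input layout are hypothetical, and the ICD-9-CM prefixes (296.2x, 296.3x, 300.4, and 311) are a commonly used depression code set assumed here for illustration, because this section does not list the codes actually used. The `min_claims` parameter allows the single-claim rule to be compared with a stricter rule requiring two claims.

```python
from collections import defaultdict

# Assumed ICD-9-CM depression codes with decimal points removed
# (296.2x, 296.3x, 300.4, 311). Illustrative only; not taken from the study.
DEPRESSION_PREFIXES = ("2962", "2963", "3004", "311")


def is_depression_code(code: str) -> bool:
    """True if a diagnosis code falls in the assumed depression code set."""
    normalized = code.replace(".", "").strip().upper()
    return normalized.startswith(DEPRESSION_PREFIXES)


def flag_beneficiaries(claims, min_claims: int = 1) -> set:
    """Return IDs of beneficiaries with at least `min_claims` depression claims.

    `claims` is an iterable of (beneficiary_id, diagnosis_codes) pairs pooled
    across all claim file types; each pair represents one claim.
    """
    counts = defaultdict(int)
    for beneficiary_id, diagnosis_codes in claims:
        if any(is_depression_code(code) for code in diagnosis_codes):
            counts[beneficiary_id] += 1
    return {b for b, n in counts.items() if n >= min_claims}


# Hypothetical claims for three beneficiaries.
claims = [
    ("A", ["296.20", "401.9"]),
    ("A", ["311"]),
    ("B", ["300.4"]),
    ("C", ["250.00"]),
]

one_claim_rule = flag_beneficiaries(claims)
two_claim_rule = flag_beneficiaries(claims, min_claims=2)
```

With these hypothetical inputs, the single-claim rule flags beneficiaries A and B, whereas the two-claim rule flags only A. Note that this sketch counts claims rather than distinct visits; a visit-based rule would also need service dates.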
There are several possible explanations for the limited accuracy of the claims-based identification approach for depression in our study. First, despite recent efforts, depression remains underdiagnosed in primary care (
5,
35,
36). Even with the best clinical expertise and even when there are no time or system constraints, it has been shown that unaided physician detection correctly recognizes only about 45% of depressed patients in primary care (
37). This is comparable to the positive predictive values that we report. If a large portion of depressed patients are not diagnosed and are most likely not being treated, we cannot expect that using claims to identify depression would yield better results (that is, positive predictive value >45%). Because many patients do not receive care for their depression, or receive it late, the number of cases identified via claims is expected to be lower than the number of persons with depressive symptoms. Second, the diagnosis of depression is often complicated by the presence of comorbid health problems (
38,
39), which may either overlap with depressive symptoms or simply take priority during provider visits because of the limited visit time.
Third, physicians may even underreport depression among poor elderly persons, assuming that these patients would not be able to afford treatment (
40,
41). Also, some older patients with clinically significant depression may underreport their symptoms (
42). There is also evidence that some physicians deliberately substitute other diagnostic codes for major depression to ensure that their patients are not penalized by high insurance copayments for depression (
34). Finally, it is possible that even when a physician recognizes and treats depression, a depression diagnosis may not appear on the medical claim because of stigma concerns.
We must acknowledge a number of limitations. First, the Medicare demonstration project was not intended to be representative of all Medicare patients but rather of seniors with substantial disability and health care use. Second, using the MINI-MDE, with a kappa of .55 compared with the SCID, as the gold standard comparator is a limitation. However, it is better than any other gold standard previously used. Third, although there was a lack of agreement between the baseline MINI-MDE and GDS, Adams (
43) recently reported that the GDS at standard cutoff points had 70%–75% agreement with the MINI-MDE. However, some GDS items did not adequately represent depression among functionally impaired seniors or those in the oldest-old group. Fourth, a beneficiary could have been diagnosed as having depression before the baseline or 12-month administration of the MINI-MDE or GDS and have been treated successfully; thus, a negative independent research assessment at baseline or 12 months would have caused the claims-based identification to be counted incorrectly as a false positive, lowering the specificity and the positive predictive value. Fifth, prescriptions for outpatient antidepressants were not covered by Medicare at the time of the demonstration, and therefore Medicare pharmacy claims data were not available for inclusion in a depression identification algorithm. Sixth, asking the study participants and their caregivers questions about depression at baseline may have resulted in participants' perceptions of less stigma related to depression at 12 and 22 months (
44). Therefore, study participants may have been more willing to answer “yes” to questions about depression symptoms at 12 and 22 months. A similar effect could have occurred in the nurse and combination groups because the nurses also asked questions about depression. However, we do not know whether these effects occurred.
In addition, although using the presence of two claims for identifying a condition is common (
29), our base case identification of depression required the presence of only one depression diagnosis. Using two claims may have led to different findings. Finally, the HIPAA (Health Insurance Portability and Accountability Act) 4010 standard for electronic claims transfer limits the number of diagnoses that can be reported to the Centers for Medicare & Medicaid Services. This may partly explain the low sensitivity we found and may also help mitigate the concern that requiring only a single claim with a depression diagnosis limits specificity.