Analysis of administrative databases is critically important for disease surveillance, program planning, and health services research (
1). Such analyses have the advantage of including larger numbers of participants than is typically feasible in clinical studies. They rely on diagnoses rendered as part of clinical care, which are secondarily compiled into large data sets for administrative and research purposes. However, diagnoses made during clinical care tend to be less accurate than diagnoses made in clinical research protocols (
2–
5). This may be attributable to differences, for example, in ascertainment method or clinician diagnostic tendencies. Thus analyses of administrative data sets may be sensitive to the rules used to convert these clinical diagnoses to administrative case definitions.
There is no universally accepted standard for utilizing encounter- or claims-based diagnoses to establish mental health diagnoses in administrative database analyses, although evidence indicates that multiple service encounters or claims with a given diagnosis are required to provide acceptable predictive values for at least some mental health conditions (
6–
8). Criteria for assigning a particular diagnosis for database studies have varied on several dimensions, including the number of required clinical encounters, whether these encounters were tied to inpatient or outpatient visits, the time frame in which they occurred, and the occurrence of competing diagnoses. For example, some studies have required just one clinical encounter (inpatient or outpatient) to qualify for a diagnosis (
9,
10), and others have required two such encounters (
11). Other studies have required one inpatient encounter (
12), and others have required one inpatient or two outpatient visits (
6,
13,
14).
Several studies with various methodologies have investigated the validity of diagnostic schemata for some mental health conditions (
6–
8,
15–
18). However, we are aware of no studies that have characterized the impact of population definition on prevalence or population composition across several serious mental health conditions in administrative databases. Such data are necessary to assess the impact of decisions related to diagnoses on research and administrative analyses, which is of critical importance for large health care systems such as the Department of Veterans Affairs (VA) and health maintenance organizations (
19), as well as for commercial insurers, Medicare, and Medicaid. The impact of population definition on case finding and case mix is also of increasing importance to accountable care organizations (
20), which are responsible for overall management of beneficiary populations.
We therefore utilized a large administrative database from the VA to address two questions. What is the impact of varying diagnostic inclusion and exclusion criteria on the prevalence of bipolar disorder, schizophrenia, and posttraumatic stress disorder (PTSD)? What is the impact of varying diagnostic inclusion criteria on key demographic and clinical characteristics in these populations?
Methods
Population Definitions
The VA Central Institutional Review Board approved all study procedures following Declaration of Helsinki principles. We obtained administrative encounter records from the VA Corporate Data Warehouse for fiscal years (FY) 2003–2010. Three populations were compiled for separate analyses: all VA service users who had at least one clinical encounter for a diagnosis of bipolar disorder (ICD-9 codes 296.0, 296.1, and 296.4–296.89, including bipolar type I, type II, and not otherwise specified [NOS]); at least one clinical encounter for a diagnosis of schizophrenia spectrum disorder, including schizoaffective disorder (ICD-9 codes 295.0–295.9); or at least one encounter for a diagnosis of PTSD (ICD-9 code 309.81).
To address our first study question, we utilized the above single-encounter inclusion criterion (
9,
10) as the base case (group A) against which to compare subpopulations by using two more restrictive inclusion criteria sets: at least one inpatient or two outpatient encounters with that diagnosis (
6,
13,
14) in a given year (group B), or at least two inpatient or outpatient encounters with that diagnosis (
11) in a given year (group C).
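The three inclusion rules can be illustrated with a minimal Python sketch. The input layout (a list of encounter settings for one person, one diagnosis, and one year) and the function name are assumptions made for illustration, not the study's actual code.

```python
from collections import Counter


def inclusion_groups(encounters):
    """Return the inclusion groups ('A', 'B', 'C') a person meets in one year.

    `encounters` is a hypothetical list of settings, 'inpatient' or
    'outpatient', one entry per encounter carrying the index diagnosis.
    """
    counts = Counter(encounters)
    inpatient = counts["inpatient"]
    outpatient = counts["outpatient"]
    total = inpatient + outpatient

    groups = set()
    if total >= 1:
        groups.add("A")  # base case: at least one encounter in any setting
    if inpatient >= 1 or outpatient >= 2:
        groups.add("B")  # at least one inpatient or two outpatient encounters
    if total >= 2:
        groups.add("C")  # at least two encounters in any setting
    return groups
```

Under this reading of the rules, a single inpatient encounter meets group B but not group C, whereas any two encounters meet both.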
In regard to use of competing diagnoses as exclusion criteria, there is well-documented uncertainty about potential diagnostic overlap between bipolar disorder and schizophrenia because of variability in patient presentation, provider diagnostic trends, and intermediate disease states (
21,
22). We therefore identified individuals who had a diagnosis of bipolar disorder or schizophrenia on the basis of at least one inpatient or two outpatient encounters in a given year (group B), without consideration of the competing diagnosis (schizophrenia or bipolar disorder, respectively). This population (group B) served as the base group against which we compared subpopulations in which more restrictive criteria were applied: those with no competing diagnoses in the prior 12 months (group D) or those with no competing diagnoses since FY 2002 (group E). [A figure illustrating subpopulation construction is available in an online
data supplement to this article.]
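The competing-diagnosis exclusions can be sketched in the same way. The sketch below makes several assumptions that the text does not specify: encounters are dated, the look-back windows end at the index date, the 12-month window is approximated as 365 days, and "since FY 2002" starts on October 1, 2001 (the start of U.S. federal FY 2002). All names are hypothetical.

```python
from datetime import date, timedelta

# Assumption: the look-back for group E starts with U.S. federal FY 2002.
FY2002_START = date(2001, 10, 1)


def exclusion_groups(index_date, competing_dates, history_start=FY2002_START):
    """Return the groups ('B', 'D', 'E') met by a person already in group B.

    `competing_dates` holds the dates of encounters carrying the competing
    diagnosis (schizophrenia for bipolar disorder, and vice versa).
    """
    year_before = index_date - timedelta(days=365)  # approximate 12 months
    in_prior_year = any(year_before <= d <= index_date for d in competing_dates)
    in_history = any(history_start <= d <= index_date for d in competing_dates)

    groups = {"B"}
    if not in_prior_year:
        groups.add("D")  # no competing diagnosis in the prior 12 months
    if not in_history:
        groups.add("E")  # no competing diagnosis since FY 2002
    return groups
```

With these windows, everyone in group E is also in group D, so group E is the most restrictive definition.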
Population Characteristics
To address our second study question, determining the impact of diagnostic inclusion criteria on case mix, we identified a priori potential correlates by reviewing the literature to identify patient characteristics examined in prior administrative database studies—primarily common psychiatric and general medical comorbidities and specific general medical conditions that may affect treatment choice (for example, metabolic syndrome criteria in studies of antipsychotic use). Clinical diagnoses were considered present if identified on a single encounter within 12 months before the patient met criteria for the index diagnosis (ICD-9 diagnoses available on request). Demographic characteristics included age, gender, racial-ethnic minority status, marital status, and disability status (≥50% service-connected VA disability). Psychiatric diagnoses included major depressive disorder with or without psychotic features, alcohol use disorders, drug use disorders, PTSD (for bipolar disorder and schizophrenia), and other anxiety disorders. Medical diagnoses included diabetes, obesity, hyperlipidemia, liver disorder, kidney disorder, thyroid disorder, cardiac dysrhythmia, tobacco use disorder, traumatic brain injury, and sleep disorder.
Construction of Mutually Exclusive Subpopulations
The subpopulations of interest are subsets of the larger reference population (groups B and C versus group A; groups D and E versus group B). To obtain groups of independent observations, we constructed four sets, forming two disjoint pairs, for each of the three mental health conditions on the basis of the above definitions: group B versus members of group A but not group B, and group C versus members of group A but not group C. Similarly, to investigate the impact of excluding competing diagnoses for bipolar disorder and schizophrenia, we again constructed two disjoint pairs: group D versus members of group B but not group D, and group E versus members of group B but not group E.
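Each comparison pair described above amounts to a set difference against the reference group. A minimal sketch follows, using made-up person identifiers and a hypothetical helper name.

```python
def split_against_reference(reference, restricted):
    """Split a reference group into two disjoint comparison sets.

    Returns (restricted, reference minus restricted), for example
    group B and "group A but not group B".
    """
    if not restricted <= reference:
        raise ValueError("restricted group must be a subset of the reference")
    return restricted, reference - restricted


# Illustrative identifiers only, not study data.
group_a = {101, 102, 103, 104, 105, 106}
group_b = {101, 102, 103, 104}

in_b, a_not_b = split_against_reference(group_a, group_b)
```

Because the two returned sets share no members and together make up the reference group, each person contributes to exactly one side of a given comparison.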
Statistical Analyses
We first calculated prevalence rates for the three populations—bipolar disorder, schizophrenia, and PTSD—for FY 2003–2010 using the base case and more restrictive definitions. We calculated rates without regard to overlapping membership, because the focus of interest was the overall “yield” according to each definition. We then conducted our primary analyses of population composition for individuals who met diagnostic inclusion criteria for the given diagnosis in FY 2010, with identical analyses run on FY 2005 data to investigate stability of findings. There were no substantive differences, and only the FY 2010 data are presented here.
Comparisons focused on determining the odds of having a particular characteristic, given membership in a specific subpopulation compared with the reference population. Analyses were conducted utilizing odds ratios (ORs), which, because of the large sample sizes, were almost all highly statistically significant (p<.001). However, the focus of interest in these analyses was magnitude of effect rather than statistical significance. We therefore characterized ORs as roughly equivalent to small (.20), medium (.50), and large (.80) effect sizes (
23): respectively, ≥1.46 or ≤.68; ≥2.50 or ≤.40; and ≥4.14 or ≤.24 (
24).
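The effect-size thresholds above can be applied mechanically. The following Python sketch computes an OR from a 2×2 table and assigns a label using the cutoffs stated in the text. The function names, the argument layout, and the "negligible" label for ORs below the small threshold are assumptions made for illustration.

```python
def odds_ratio(with_sub, without_sub, with_ref, without_ref):
    """Odds ratio for a characteristic: subpopulation versus comparison set.

    Arguments are counts of people with and without the characteristic in
    each set; all counts are assumed to be nonzero.
    """
    return (with_sub * without_ref) / (without_sub * with_ref)


def effect_size_label(or_value):
    """Map an OR to the effect-size category used in the text."""
    if or_value >= 4.14 or or_value <= 0.24:
        return "large"
    if or_value >= 2.50 or or_value <= 0.40:
        return "medium"
    if or_value >= 1.46 or or_value <= 0.68:
        return "small"
    return "negligible"  # hypothetical label for ORs below the small cutoff
```

For example, `effect_size_label(2.55)` returns "medium" and `effect_size_label(0.55)` returns "small". Checking the larger thresholds first means each OR receives the largest category it qualifies for.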
Results
Effects of Inclusion Criteria on Prevalence
Prevalence over time for the diagnostic inclusion criteria sets for bipolar disorder, schizophrenia, and PTSD in 2003–2010 is summarized in
Figure 1. For bipolar disorder, the number of VA service users increased according to all definitions, with the least restrictive group (group A) increasing from 82,131 to 120,382 and the most restrictive group (group E) increasing from 53,587 to 85,856 (p<.001 for each). [A table in the online
supplement presents individual time trend statistics.] The 2010 prevalence rates among all VA service users (N=5,536,465) for the least and most restrictive definitions were, respectively, 2.35% and 1.55%. In contrast, the number of service users treated for schizophrenia decreased from 100,853 to 91,977 (group A) and from 74,870 to 59,909 (group E) (p<.002 for each), with 2010 prevalence rates for the least and most restrictive definitions, respectively, of 1.66% and 1.08%. The PTSD population grew substantially, from 244,930 to 554,028 (group A) and from 180,261 to 411,526 (group E) (p<.001 for each), with 2010 prevalence rates for the least and most restrictive definitions, respectively, of 9.83% and 7.43%.
In FY 2010, moving from the least (group A) to the most (group E) restrictive definition for bipolar disorder, schizophrenia, and PTSD reduced the number of individuals by 28.7%, 34.9%, and 25.7%, respectively.
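These reductions follow directly from the FY 2010 counts reported above, as the short Python check below illustrates. The function and variable names are hypothetical.

```python
def percent_reduction(base_count, restricted_count):
    """Percentage by which a restricted definition shrinks the base population."""
    return 100 * (base_count - restricted_count) / base_count


# FY 2010 counts from the text: (least restrictive, most restrictive).
fy2010_counts = {
    "bipolar disorder": (120_382, 85_856),
    "schizophrenia": (91_977, 59_909),
    "PTSD": (554_028, 411_526),
}

reductions = {
    name: round(percent_reduction(base, restricted), 1)
    for name, (base, restricted) in fy2010_counts.items()
}
```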
Effects of Inclusion Criteria on Population Characteristics
Demographic and clinical characteristics of the bipolar, schizophrenia, and PTSD populations in FY 2010 (group A) are summarized in
Table 1.
Table 2 compares the effects on population composition of applying more restrictive definitions (group B or C) to the single-encounter base case (respectively, group B versus group A-not-B, and group C versus group A-not-C) for each disorder.
Only 22 of 132 (17%) ORs reached even small effect sizes. Moreover, there were few differences in relative effects between the two more restrictive definitions (ORs for group B versus group A-not-B compared with ORs for group C versus group A-not-C). Only the percentage of individuals with PTSD who had a ≥50% service-connected disability reached a consistent medium effect size (ORs=2.55 and 2.58). Only the frequency of diagnosis of major depression with or without psychotic features was sensitive to change of diagnostic inclusion criteria across all three populations, with decreasing frequency in the bipolar disorder population (ORs=.55–.76) and schizophrenia population (ORs=.52–.65) and increasing frequency in the PTSD population (ORs=1.71–2.50).
Effects of Competing Diagnoses: Bipolar Disorder and Schizophrenia
Table 3 summarizes the effects on population composition for bipolar disorder and schizophrenia of requiring no competing diagnoses of the other disorder, either in the prior 12 months (group D versus group B-not-D) or since 2002 (group E versus group B-not-E). Analyses revealed ORs of small effect in 37 of 88 (42%) comparisons, with no ORs indicating a medium effect. There were few differences in relative effects between the two more restrictive definitions (ORs for group D versus group B-not-D, compared with ORs for group E versus group B-not-E).
Small, same-direction ORs across the two conditions were seen for drug use disorders and major depression with psychosis. Reciprocal differences in ORs across the two populations appeared in the percentage of white individuals when the definition of bipolar disorder was restricted to those without a competing diagnosis of schizophrenia (ORs=2.00–2.03) and when the definition of schizophrenia was restricted to those without a competing diagnosis of bipolar disorder (ORs=.60–.65). Similar, but somewhat smaller, reciprocal effects were seen in the percentage of males when more restrictive definitions were used for bipolar disorder (ORs=.66–.73) and for schizophrenia (ORs=1.88–1.89).
Discussion
Effects of Criteria on Population Prevalence
To our knowledge, this is the only study of the effects of varying administrative data set diagnostic inclusion criteria across several serious mental health conditions on resulting population composition, an issue of clear relevance both to health services research and administrative analyses. Comparing single-encounter definitions for bipolar disorder, schizophrenia, and PTSD to more restrictive definitions, we found expected reductions in population size. The most restrictive definitions for these conditions reduced population size in 2010 by 25.7%–34.9%.
Though time trends were not the focus of this study, the prevalence differences over time were roughly consistent over the study years. The substantial increase in PTSD prevalence over time among VA service users is consistent with other reported data (
25). However, we are not aware of data comparing the relative prevalence of bipolar disorder and schizophrenia, which were notably divergent over time for all definitions. The relative increase in diagnosis of bipolar disorder and decrease in schizophrenia among VA service users may be attributable to several factors. First, excess mortality among persons with schizophrenia compared with the general population has been well documented (
26), although this rate may (
27) or may not (
28) exceed that for bipolar disorder. Second, the prevalence of the two disorders may be changing as a result of changes in the rate of incident cases. Comparison with U.S. community samples may be instructive, although prevalence rates in the population of VA service users are not necessarily expected to replicate those of the general population (
29), and within-study longitudinal prevalence data are surprisingly scant. Although the National Comorbidity Survey (NCS) documented 12-month prevalence of 1.3% for mania and .5% for nonaffective psychoses (
30), the NCS replication documented a prevalence of 1.1% for bipolar I disorder but did not assess nonaffective psychoses (
31). In contrast, a community study of bipolar disorder in Australia described an up-to-twofold increase in bipolar disorder between 1998 and 2008 (
32).
Third, trends in diagnostic practice, independent of true prevalence, may also be contributing factors. Some have bemoaned a trend to “overdiagnosis” of milder forms of bipolar disorder as a result of multiple factors (
33). Others have pointed out that diagnosis of milder forms of bipolar disorder identifies a high-morbidity subpopulation that warrants attention (
31). Our data indicate that the proportion of patients with milder bipolar disorder (type II or NOS) increased from 8.7% in 2003 to 24.4.6% in 2010 for group A; however, this growth accounted for only a minority of cases and cannot completely explain the rise in bipolar disorder prevalence.
Effects of Criteria on Population Composition
We found that increasing the restrictiveness of the definitions resulted in small to medium effects in only a minority of contrasts, and most of these were not consistent across populations. Thus population composition for all three mental health conditions was relatively robust across commonly used diagnostic inclusion criteria. This does not, however, imply that there are no meaningful effects of increasing diagnostic restrictiveness or that investigators and administrators should utilize the least restrictive, single-encounter definition to maximize case finding (
6–
8). Rather, these findings demonstrate predominantly small impacts on specific population characteristics of selecting diagnostic inclusion criteria, which may or may not be important in specific analyses.
If maximizing case finding is a priority, the least restrictive definition might be most appropriate, and investigators can expect few shifts in population composition when more restrictive definitions are applied in subsequent database analyses. On the other hand, if more restrictive definitions are used to maximize predictive value, our analyses indicate it is unlikely that the investigators’ choice would substantively impact the population composition compared with less restrictive definitions.
Three Mechanisms of Effect
When differences in population characteristics are seen across diagnostic inclusion criteria sets, they may be attributable to one or more of at least three mechanisms: real differences in population composition, differences in clinician diagnostic tendencies, or differences in ascertainment.
For example, the reciprocal changes in the percentage of males found when bipolar disorder and schizophrenia were defined more restrictively (
Table 3) correspond to well-documented gender differences in clinical studies of the two disorders—that is, the proportion of women is larger among persons with bipolar disorder than among those with schizophrenia (
34). Thus, when more restrictive definitions remove borderline cases, the gender differences move in the direction predicted by clinical samples: the proportion of women increases in the bipolar disorder population, and the proportion of men increases in the schizophrenia population.
The reciprocal differences in the proportion of whites as the diagnosis becomes more restrictive may also be attributable to underlying population differences. However, the differences may also result from the well-documented tendency of clinicians to diagnose schizophrenia more readily than bipolar disorder among individuals from racial-ethnic minority groups (
35,
36).
Finally, ascertainment effects are also relevant whenever comorbidity rates are considered: increased diagnostic rates may be linked to increasing surveillance of those assessed more frequently or comprehensively. Despite these likely effects, there were remarkably small, inconsistent differences in the odds of comorbidities across inclusion criteria for any of the three populations (
Table 2).
However, it is instructive to consider the effect on major depressive disorder of changing the inclusion criteria for each of these disorders, which illustrates the potential complexity of using administrative data to extrapolate to true clinical characteristics of the underlying population. Specifically, the increased odds of major depression with increasing restrictiveness of diagnosis of PTSD (ORs=1.71–2.50) may reflect a true difference among those with more certain diagnoses of PTSD or may reflect more extensive care; however, it may also be that those who are treated more frequently undergo greater surveillance for other disorders, with major depression correspondingly diagnosed more frequently because of increased surveillance rather than because of true increased prevalence.
In contrast, the decreasing odds of major depression with increasingly restrictive criteria for bipolar disorder (ORs=.55–.76) and schizophrenia (ORs=.52–.65), although similar in magnitude, may be due to different mechanisms. An ICD-9 diagnosis of major depression is incompatible with bipolar disorder, and thus it is likely that clinicians, appropriately, attribute depressive symptoms to bipolar disorder among persons with a more certain bipolar diagnosis. For schizophrenia, however, the diagnoses are not incompatible, and the lower rates of major depression diagnoses may reflect a “negative” ascertainment effect, where symptoms of depression are ascribed to schizophrenia in clearer cases of the latter.
Limitations
These analyses were affected by the limitations of any study of administrative data sets, which rely on diagnoses made by clinicians rather than on research diagnoses (
1). Thus differences in prevalence and sample composition cannot be confidently ascribed to underlying differences in the population rather than to differences in the diagnostic tendencies of clinicians. In addition, lack of differences does not necessarily increase confidence that the populations are similar, again because of the reliance on clinician diagnosis. Estimates of the validity and reliability of administrative diagnoses are not possible without research diagnoses. However, the purpose of these analyses was not to establish validity, reliability, or relative scientific merit of the diagnostic inclusion criteria but rather, working within administrative data sets, to determine the impact of typical diagnostic rules on prevalence and population composition. These analyses were also limited to VA service users, and it is likely that the impacts of choosing diagnostic inclusion criteria may differ in other care settings (
37). However, the VA may also serve as a “microcosm” (
29), which can inform methods used in other health care systems. Finally, findings may differ for other mental health conditions such as major depression or other anxiety disorders.
Conclusions
These findings can guide analyses of data from VA, health maintenance organizations, commercial insurers, Medicare, Medicaid, and accountable care organizations. Investigators can make decisions about diagnostic inclusion and exclusion criteria with the knowledge that population composition in all three mental health conditions is relatively stable across commonly used diagnostic criteria. An awareness of three potential mechanisms that produce differences in population composition across inclusion criteria can help guide investigators in their choices.
Acknowledgments
This material is based on work supported by Health Services Research and Development grant IIR-10-314 (to Dr. Bauer) from the Office of Research and Development, Veterans Health Administration, U.S. Department of Veterans Affairs. The funding organization provided competitive grant support for these analyses but was not otherwise involved in the design or conduct of the study or review or approval of the manuscript.
The authors report no financial relationships with commercial interests.