Each year across the United States, approximately 100,000 adolescents or young adults experience first-episode psychosis (1–3). Early intervention improves longer-term outcomes for this debilitating condition, and specialized treatment clinics are increasing in number (4–11). However, information on the number of patients who develop a schizophrenia spectrum disorder and the extent to which they receive early treatment remains limited. Furthermore, it can be difficult to accurately measure both the number and proportion of cases within a population. For example, survey samples or data relying on patient self-reports may be affected by response bias, particularly for conditions that impair cognition and insight. Similarly, data from hospitals or clinics reflect only individuals who present for treatment at those sites.
Large population-based data sets are needed to examine disease incidence. Most large data sets rely on insurance claims data from a limited number of payers. Moreover, the United States does not have a unique national patient identifier that is standardized across payers and data sources, hindering efforts to construct a longitudinal patient history. Thus, the payment fragmentation in the United States, the absence of standard identifiers, and patients’ movement across different payers make it difficult to identify the first diagnosed episode of a disease for an individual in a single payer’s data set (10, 12–16).
Previous surveys suggest that psychosis often first occurs between ages 15 and 35 years, with a slightly higher number of cases among males (15, 17, 18). This timing of psychosis onset coincides with the transition from adolescence to adulthood and with changes in insurance coverage for many individuals. For example, Medicaid eligibility for children ends at age 19 in most states (21 in Massachusetts) (19), and group commercial insurance plans limit coverage for dependent children to those ages <26 years, such that many individuals leave their family plan on turning 26. Moreover, the onset of first-episode psychosis itself could alter insurance coverage, for example, when employment changes affect employer-sponsored insurance coverage or when individuals become eligible for Medicaid because of decreased income.
One common approach in the literature for identifying individuals with initial diagnoses of psychosis is to examine the 12 months before a diagnosis within the available data (14, 20). Other studies have not used historical review or have not specified the period of review and, unsurprisingly, have had variable incidence estimates (15). To date, no assessments have explored the adequacy of reviewing 12 months of medical history or the implications of doing so.
This study used the Massachusetts All-Payer Claims Database (APCD) to identify individuals with new diagnoses of schizophrenia spectrum disorder within insurance claims after reviewing each patient’s history. The Massachusetts APCD contains standardized patient identifiers that permit an individual-level, longitudinal historical review. This study examined the number of health insurance payers in each patient’s history and the implications of reviewing 12, 24, 36, or 48 months of medical history for capturing previous diagnoses and estimating population-level rates of new diagnoses.
Methods
Framework
Previous studies have examined the incidence of schizophrenia spectrum disorder by using a range of diagnosis codes or treatments and months of historical data (12–14, 21). When attempting to identify the initial diagnosis, it is necessary to define an index diagnosis and the period preceding it (i.e., the patient’s history). The extent of patient history reviewed must be feasible; evaluating each patient’s complete life history is impractical at a population level. Most data sets include data from limited periods or from a single payer or are unable to track individuals across payers; thus, many studies review only the 12 months preceding an index event.
For schizophrenia spectrum disorder specifically, 12 months of history may not be sufficient, particularly given concerns about variable access to care. Assessing the adequacy of a patient’s history requires that patients have available data from the period of interest and involves a review for historical diagnoses. The ability to track patients throughout these preceding months is important because adolescents and young adults frequently change payers as they age out of some forms of insurance or gain or lose employer-sponsored insurance.
Study Data
This study used the Massachusetts APCD, release 6.0, which contains medical claims for commercially and publicly insured individuals and provides longitudinal information on diagnoses and treatments (22). In 2016, 97% of individuals in Massachusetts had health insurance coverage, among the highest rates in the United States (23); thus, the APCD captures most of the Massachusetts population until individuals’ death or departure from the state.
The data set includes group commercial, individual commercial, Medicare Advantage, and Medicaid data but does not include fee-for-service Medicare data. Eligibility for Medicare, however, is limited to those ages ≥65 years or those <65 who have qualified through disability; the latter group has a 2-year waiting period after becoming eligible for Social Security Disability Insurance benefits. The APCD does not include data from workers’ compensation, TRICARE and the Veterans Health Administration, or the Federal Employees Health Benefit Plan (22). In addition, starting in 2016, self-funded employer plans were not required to submit data to the Massachusetts APCD, which reduced the proportion of Massachusetts residents with available APCD data to 84.5% in that year (24). The Mass General Brigham Institutional Review Board approved this study and did not require individual patient consent.
Diagnoses for Psychosis
We identified all individuals with a diagnosis of schizophrenia spectrum disorder on the basis of ICD-9 and ICD-10 diagnosis codes. The study used information from inpatient, emergency department, observational, and outpatient claims. We categorized individuals as having a diagnosis of schizophrenia spectrum disorder by using definitions from the literature (see the online supplement to this article) (14, 16). We also restricted the sample to individuals ages 15–35 years on the basis of previous literature about age at onset and divided the cohort into those with early onset (ages 15–17), typical onset (ages 18–29), or late onset (ages 30–35).
Analysis
We identified the earliest diagnosis of schizophrenia spectrum disorder for each patient in the 2016 observation window (i.e., the index diagnosis), yielding up to 4 years of historical data before the index diagnosis, because the version of APCD we used contained 5 years of data (2012–2016). We examined total months of APCD enrollment across all payers, gaps in enrollment, and number of payers during the 4 years before the index diagnosis. We also examined the percentage of patients who were continuously present in the APCD (i.e., had complete data capture in the 12-, 24-, 36-, and 48-month periods before their index diagnosis). We used logistic regression to assess the probability of having historical data as a function of individual (e.g., sex, age, and insurance type) and diagnostic (e.g., site of diagnosis) characteristics.
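The continuous-enrollment check described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' code: the `Span` record, the month-index convention (0 = January 2012), and the function names are hypothetical, and the actual layout of the APCD enrollment files is not described in this article.

```python
from typing import Iterable, NamedTuple, Set


class Span(NamedTuple):
    """One enrollment span for a patient (hypothetical record layout)."""
    payer: str
    start: int  # first covered month as an index, 0 = January 2012 (assumed)
    end: int    # last covered month, inclusive


def covered_months(spans: Iterable[Span]) -> Set[int]:
    """Pool coverage across all payers into one set of covered months."""
    months: Set[int] = set()
    for span in spans:
        months.update(range(span.start, span.end + 1))
    return months


def has_continuous_history(spans: Iterable[Span], index_month: int, lookback: int) -> bool:
    """True if every one of the `lookback` months before the index month is covered."""
    months = covered_months(spans)
    return all(m in months for m in range(index_month - lookback, index_month))


def payers_in_window(spans: Iterable[Span], index_month: int, lookback: int) -> Set[str]:
    """Distinct payers with any coverage overlapping the lookback window."""
    first, last = index_month - lookback, index_month - 1
    return {s.payer for s in spans if s.start <= last and s.end >= first}
```

Under this convention, a patient whose index diagnosis falls in July 2016 (month 54) needs coverage in months 6 through 53 to count as having 48 months of history, and that coverage may come from any combination of payers.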
We then examined the cumulative percentage of individuals with additional schizophrenia diagnoses in their history (i.e., during the preceding 12, 24, 36, and 48 months). We repeated the analyses for each age and sex combination. In the analyses, we categorized an individual as having a new 2016 diagnosis if they had no historical diagnoses of schizophrenia spectrum disorder and had adequate information to make this determination (i.e., 48 months of history).
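The classification rule in this paragraph can be expressed as a short function. The sketch below is illustrative only; the function name, the month-index representation of claim dates, and the three labels are assumptions introduced here rather than the study's implementation.

```python
from typing import Iterable


def classify_index_diagnosis(
    index_month: int,
    prior_dx_months: Iterable[int],
    lookback: int,
    has_full_history: bool,
) -> str:
    """Label an index diagnosis after reviewing `lookback` months of history.

    prior_dx_months: months (as integer indices) with a schizophrenia
    spectrum disorder diagnosis on any claim before the index month.
    has_full_history: whether the patient has complete data for the window.
    """
    if not has_full_history:
        # Without complete data, a prior diagnosis cannot be ruled out.
        return "indeterminate"
    window = range(index_month - lookback, index_month)
    if any(month in window for month in prior_dx_months):
        return "historical"
    return "new"
```

A patient with a single earlier diagnosis 34 months before the index month would be labeled "new" under a 12-month review but "historical" under a 48-month review, which is the misclassification the longer lookback is meant to catch.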
Logistic regression models were used to examine the association between individual characteristics and having a new diagnosis of schizophrenia spectrum disorder among those with 48 months of history. All analyses were stratified by age and sex because of a priori hypotheses about incidence differences.
Sensitivity Analyses
We used several sensitivity analyses to complement the main analyses by using varying information sources, diagnosis codes, or other study parameters and accounting for previous treatments. First, we identified additional historical schizophrenia diagnoses from claims obtained from other clinical settings, such as postacute care or laboratory claims. The main analyses did not include these data sources because of a priori concerns about insufficient information quality. The sensitivity analysis added historical diagnoses of schizophrenia spectrum disorder from the expanded types of claims.
Next, we repeated the analyses, including additional diagnostic codes for other and unspecified psychosis (see online supplement), because for some individuals, these codes may appear before a diagnosis of schizophrenia spectrum disorder. Then, we used information about historical receipt of antipsychotic medications, in addition to historical diagnoses, to classify patients as having care potentially consistent with a schizophrenia spectrum disorder diagnosis. Other sensitivity analyses restricted the review of historical data to the information obtained only from a single payer or allowed for having fewer continuous months of data.
Results
Population With Schizophrenia Spectrum Disorder in the Massachusetts APCD
We identified 7,217 individuals ages 15–35 years with an index 2016 diagnosis of schizophrenia spectrum disorder (420 per 100,000 residents ages 15–35 in the Massachusetts APCD) (Table 1) (25). Among these individuals, 67.7% (N=4,885) had at least 48 continuous months of historical data. In addition, 14.6% (N=1,052) had a single gap in insurance coverage (mean=37.1 months of enrollment per person over the preceding 48 months), and the remaining 17.7% (N=1,280) had either more than one gap or <48 months of historical data (see the online supplement).
Among those with at least 48 continuous months of information before their index schizophrenia spectrum disorder diagnosis (i.e., adequate historical data), 3,749 (77%) had at least one similar historical diagnosis, and 1,136 (23%) did not (i.e., had new diagnoses in 2016). Multivariable logistic regression models showed that index diagnoses from inpatient psychiatric admissions were more likely to represent new diagnoses of schizophrenia spectrum disorder than were index diagnoses from most other diagnosis locations, as were index diagnoses among female patients (compared with males) and among those ages 15–17 (compared with older age groups). Outpatient diagnoses were less likely to represent new diagnoses than were diagnoses from most other locations (Table 2).
Payment Fragmentation
Table 3 displays the proportion of the 4,885 individuals with at least 48 months of historical data whose data came from multiple payers. Of these patients, 64% (N=3,093) had at least two payers during the 48-month period (Table 3). As expected, patients’ number of payers changed with age: for example, 65% of females ages 15–17 years and 72% of females ages 18–29 years had multiple payers during the 48 months before their index diagnosis (Table 3).
Historical Diagnoses
Figure 1 displays the percentage of patients who had diagnoses of schizophrenia spectrum disorder before their index diagnosis, as detected in each patient’s historical 12, 24, 36, or 48 months of data. Among all patients with at least 48 months of historical data (N=4,885), 61% (N=2,958) had a historical diagnosis in the 12 months before their index diagnosis; expanding the review to the previous 24, 36, and 48 months increased the percentage of patients with a historical diagnosis to 71% (N=3,446), 75% (N=3,641), and 77% (N=3,749), respectively (see the online supplement). Patients at younger ages at their index diagnosis were less likely to have historical diagnoses, compared with older patients (Figure 1). When the historical review was expanded to 36 or 48 months from 24 months, the incremental increase in percentage of patients ages 15–17 years (at index diagnosis) with a historical diagnosis was only modest (e.g., a 2.6 percentage-point difference among males) (see the online supplement).
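The cumulative percentages reported above follow directly from the counts, and the gain from each additional year of review can be read off the same numbers. The short Python sketch below reproduces that arithmetic; the variable and function names are introduced here for illustration only.

```python
from typing import Dict

# Counts reported in the text: patients with 48 months of history, and
# those with a historical diagnosis found within each lookback window.
TOTAL_WITH_48_MONTHS = 4885
HISTORICAL_BY_LOOKBACK: Dict[int, int] = {12: 2958, 24: 3446, 36: 3641, 48: 3749}


def cumulative_percentages(counts: Dict[int, int], total: int) -> Dict[int, int]:
    """Share of patients with a historical diagnosis, by months reviewed."""
    return {months: round(100 * n / total) for months, n in counts.items()}


def incremental_gains(counts: Dict[int, int]) -> Dict[int, int]:
    """Additional patients detected when the window grows to each length."""
    ordered = sorted(counts)
    return {b: counts[b] - counts[a] for a, b in zip(ordered, ordered[1:])}


percentages = cumulative_percentages(HISTORICAL_BY_LOOKBACK, TOTAL_WITH_48_MONTHS)
gains = incremental_gains(HISTORICAL_BY_LOOKBACK)
```

The gains shrink with each added year (488, 195, and 108 additional patients at 24, 36, and 48 months, respectively), which illustrates the diminishing but nonzero yield of longer review.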
Rates of New Diagnoses
After we reviewed 48 months of medical history for each patient with available data (N=4,885), the rate of new diagnoses of schizophrenia spectrum disorder was 66 per 100,000 individuals ages 15–35 years in the Massachusetts APCD, ranging from 37 per 100,000 for females ages 15–17 to 100 per 100,000 for males ages 18–29 (see the online supplement). Reviewing less extensive historical data would have erroneously inflated the rates of new diagnoses (e.g., up to 112 per 100,000 individuals if examining only 12 months of history). More than 40% (N=791) of patients who appeared to have new diagnoses in 2016 after we examined only 12 months of medical history (N=1,927) in fact had additional diagnoses of schizophrenia spectrum disorder that we detected after reviewing 48 months of history.
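The rates in this paragraph can be checked with simple arithmetic. The sketch below assumes a population denominator back-calculated from the reported 7,217 index cases at 420 per 100,000; that implied denominator is an approximation introduced here (the reported rate is rounded), not a figure stated by the authors, and the names are illustrative.

```python
# Reported counts from the Results.
INDEX_CASES = 7217           # all 2016 index diagnoses, ages 15-35
INDEX_RATE_PER_100K = 420    # reported rate for those index cases
WITH_48_MONTHS = 4885        # patients with adequate historical data
HISTORICAL_12_MONTHS = 2958  # prior diagnosis found within 12 months
NEW_AFTER_48_MONTHS = 1136   # no prior diagnosis within 48 months

# Assumption: denominator implied by the reported rate (about 1.72 million).
implied_population = INDEX_CASES / INDEX_RATE_PER_100K * 100_000


def rate_per_100k(cases: int, population: float = implied_population) -> float:
    """Cases per 100,000 individuals in the assumed denominator."""
    return cases / population * 100_000


# Patients who look "new" when only 12 months are reviewed.
apparent_new_12_months = WITH_48_MONTHS - HISTORICAL_12_MONTHS
# Of those, patients whose earlier diagnosis surfaces with 48 months of review.
reclassified = apparent_new_12_months - NEW_AFTER_48_MONTHS
share_reclassified = reclassified / apparent_new_12_months
```

Under this assumption, the 48-month rate rounds to 66 per 100,000 and the 12-month rate to 112 per 100,000, and about 41% of the apparent new diagnoses under a 12-month review are reclassified.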
Sensitivity analyses yielded comparable findings (see the online supplement). Notably, restricting the analysis to individuals with insurance coverage under a single payer and with sufficient data before their index event resulted in lower estimated rates of new diagnosis (i.e., 22 per 100,000 individuals). Expanding the diagnosis codes of schizophrenia spectrum disorder resulted in additional patients (N=9,670) identified as having a 2016 diagnosis of schizophrenia spectrum disorder; however, this sample had a proportion of patients with adequate historical data and with historical diagnoses that was similar to the proportion in the sample derived from using the more restrictive diagnosis codes. Many patients received antipsychotic medications before their index diagnosis; thus, removing data from patients with historical medication use but no historical diagnosis of schizophrenia spectrum disorder decreased the estimated rates of new diagnosis, but we note that some antipsychotic prescriptions could have been for other indications.
Discussion
Identifying individuals with new diagnoses of schizophrenia spectrum disorder is critical for understanding the burden of this disease in the United States and for determining whether new policies and delivery options are adequate to meet the needs of this population. This study offered the first examination of approaches for defining new diagnoses of schizophrenia spectrum disorder by using comprehensive information about historical care received for a well-defined population. Review of up to 4 years of patient history may be necessary to prevent misclassification; for younger patients (i.e., ages 15–17 years), a 2-year history may be sufficient because extending the review beyond 24 months detected few additional historical diagnoses in this age group. Failure to consider patients’ histories or limiting the review to medical history based on data from a single payer may result in the misclassification of many patients.
Estimated incidence rates of new diagnoses of schizophrenia spectrum disorder in previous studies have been as high as 86 per 100,000 individuals, depending on age; however, these estimates are based on review of ≤1 year of patient history, if any history was reviewed (14, 15, 26). In this study, we found a lower overall rate of new diagnoses (66 per 100,000 individuals, ranging from 37 per 100,000 for females ages 15–17 to 100 per 100,000 for males ages 18–29) when we considered several years of patient history, because review of this extended historical period enabled us to detect additional diagnoses before the index diagnosis. Estimates based on self-reported diagnoses and survey methods can be challenging to interpret, given the impact of schizophrenia spectrum disorder on patients’ insight and cognition (27).
These findings have several implications. Population-wide rates of new diagnoses of schizophrenia spectrum disorder are likely lower than suggested in the literature, with larger errors for rates among older individuals, because many patients have earlier diagnoses. The greater number of years of patient history we reviewed and the ability to track data across payers were both critical to detecting historical diagnoses. Indeed, this study’s estimates when restricting the historical review to 1 year were comparable to those from previous studies, but total cases and incidence rates decreased when we reviewed additional years of history. Although rates of new diagnoses may be lower after reducing misclassification of diagnoses as new ones, many people still develop this serious chronic disease; our refined estimates should not be misinterpreted as an indication that the burden of this disease is any less significant.
Although review of additional psychiatric history has the benefit of reducing misclassification, this approach reduces the amount of data available for observation of outcomes after a diagnosis and decreases the fraction of the population for whom such observation is possible when using a real-world data set. With a 5-year data set, such as the one used in this study, requiring 4 years of history to reliably identify a first diagnosis of schizophrenia spectrum disorder leaves at most 1 year for postdiagnosis observation. Similarly, because of high patient turnover in large data sets, requiring longer periods of continuous enrollment decreases the size of the population that meets the criterion of 4 years of available history. Data sets always have inherent limitations and might not be appropriate for all research questions; in particular, data sets with only a few consecutive years of information would not be appropriate for examining new disease among older patients with schizophrenia spectrum disorder.
For clinicians, a better understanding of the true rates of new diagnoses may increase the likelihood that patients are connected to necessary specialized services. For policy makers, improved estimates of disease rates can help guide decisions about what types of services should be delivered, particularly with respect to investments in coordinated specialty care (CSC) clinics. For example, the 1,136 patients who had a new 2016 schizophrenia diagnosis exceeded the total capacity of all CSC clinics in Massachusetts for both new and existing patients.
This study had several limitations. First, no gold standard exists for identifying schizophrenia spectrum disorder diagnoses in administrative data. Our analyses used multiple definitions of the disease but may have missed individuals whose clinicians tended to start with unspecified psychosis diagnoses, even when they suspected that the patient had a schizophrenia spectrum disorder. However, including unspecified psychosis in the case definition yielded similar findings. Furthermore, there may be an even broader range of codes (e.g., including affective psychosis) that this approach also did not capture.
Greater diagnostic uncertainty may have been a factor early in the disease course (e.g., some patients may have been initially diagnosed as having major depressive disorder with psychotic features but were later diagnosed as having schizoaffective disorder). Coding may have varied across areas or among clinicians and longitudinally among patients. Moreover, clinical heterogeneity that is not captured through diagnosis codes was likely. We note that current approaches that use large data sets do not address the level of diagnostic certainty across different clinical and diagnostic settings and symptoms or differential standards for clinical certainty before clinicians make a diagnosis.
Finally, this study was conducted in a single state, and the results might not generalize to other states or individuals without insurance. Indeed, loss of insurance could either be a risk factor or reflect part of the disease course.
Conclusions
In summary, careful examination of multiple years of patients’ medical history is essential for identifying the true incidence of new diagnoses of schizophrenia spectrum disorder. Such diagnostic accuracy is critical to estimating population rates of new disease onset and understanding the early course of the disease. To prevent erroneous estimates and ensure that a clinical review encompasses the longitudinal history of each patient, the review should use data spanning all payers.