In 2007 the Joint Commission established a set of seven hospital-based inpatient psychiatric measures in collaboration with the National Association of Psychiatric Health Systems, the National Association of State Mental Health Program Directors (NASMHPD), and the NASMHPD Research Institute, Inc. Collectively referred to as the Hospital-Based Inpatient Psychiatric Services (HBIPS) measure set, the measures were considered an important step toward creating accountability in behavioral health inpatient care (1,2).
It has been repeatedly observed that health care organizations that implement and report national quality measures tend to improve steadily over time (3–5). This trend also appears to apply to the HBIPS measures. From 2011 to 2015, there was an overall improvement of 3 percentage points in the average HBIPS composite score nationwide (87.3% to 90.3%) (6). Although these trends can be informative, they can also obscure differences in performance based on other factors.
Implementation of HBIPS reporting followed a protracted path that was influenced by changing reporting requirements. These changes led to staggered implementation of measures for large hospital cohorts, providing an opportunity to examine trends and variations in results associated with hospitals’ unique characteristics. This study examined these differences by analyzing 31 quarters of HBIPS data, comparing cohorts on the basis of their first reporting quarter and subsequent changes in measure performance.
Methods
Specifications for the HBIPS measures were developed by using the Joint Commission’s standard methodology (7). A technical advisory panel was established to develop draft measures, which were posted for public comment and revised on the basis of the comments. The revised measure specifications were pilot-tested in hospitals to evaluate implementation feasibility and data collection reliability. On the basis of test results, the final measure specifications were refined and published.
The measures address the following topic areas: percentage of patients screened at admission for risk of violence, substance use, psychological trauma, and psychological strengths (HBIPS 1); ratio of restraint hours per 1,000 patient hours (HBIPS 2); ratio of seclusion hours per 1,000 patient hours (HBIPS 3); percentage of patients discharged with documented treatment justification for two or more antipsychotic medications (HBIPS 5); percentage of patients discharged with a continuing care plan (HBIPS 6); and percentage of patients discharged with a continuing care plan that was transmitted to the next level of care (HBIPS 7).
HBIPS data were collected quarterly from hospitals that treat psychiatric inpatients (freestanding psychiatric hospitals and acute care hospitals with psychiatric units) and that are accredited by the Joint Commission. Clinical data were abstracted from each eligible patient’s medical record by specialists using data element definitions and collection protocols (8). For the percentage measures, record sampling was permitted for hospitals with more than 44 eligible patients in a given quarter (62% [N=552] of hospitals submitted sampled records). Sampling was not permitted for the ratio measures (restraint and seclusion) because they require calculation of total inpatient psychiatric days for each facility as well as total seclusion and restraint time. Sampling details are available in the Specifications Manual for Joint Commission National Quality Measures (7). HBIPS data were submitted for each quarter through an approved performance measurement system vendor.
Hospital Cohorts
Overall, data were available across 31 quarters, starting with the last quarter of 2008 and ending with the second quarter of 2016. In order to be included in the analysis, hospitals were required to report data for four to six consecutive quarters and to begin their reporting as part of an identifiable cohort. The 2009 cohort (N=243) consisted of hospitals that volunteered to report HBIPS measures to the Joint Commission after the measures were first released. About 71% (N=173) of the hospitals in this cohort were freestanding psychiatric hospitals, and the other 29% (N=70) were acute care hospitals reporting data for an inpatient psychiatric unit. The 2011 cohort (N=139) consisted of hospitals that began reporting HBIPS data during the first quarter of 2011. This coincided with a Joint Commission requirement mandating that all accredited freestanding psychiatric hospitals report HBIPS data. Over 95% (N=133) of the hospitals in the 2011 cohort were freestanding psychiatric hospitals.
The hospitals in the 2014 cohort (N=137) likely began reporting in response to two drivers: the adoption of the measures by the Centers for Medicare & Medicaid Services (CMS) and the decision by the Joint Commission to increase the number of measures that hospitals were required to report, from a four-measure set to a six-measure set. About 96% (N=131) of hospitals in this cohort were acute care hospitals with inpatient psychiatric units. The final cohort (N=372), which began reporting in 2015, consisted almost entirely of hospitals with inpatient psychiatric units (N=368, 99%). The 2015 cohort entered the HBIPS reporting cycle after the Joint Commission had introduced a flexible reporting option that allowed hospitals to report combinations of six sets of measures.
Figure 1 shows a timeline of requirements for reporting performance measures and the number of organizations in each cohort reporting HBIPS data to the Joint Commission. Despite small quarter-to-quarter changes in the number of reporting hospitals, the selected cohorts represent four major influxes of reporting hospitals.
Statistical Methods
Because cohort membership and hospital characteristics were either ordinal or categorical data, chi-square tests were used to examine differences in characteristics by hospital cohort. The hospital characteristics used to compare the cohorts were ownership (for profit, government, or not for profit), bed size (<100 beds, 100–299 beds, or ≥300 beds), location (rural or urban), psychiatric treatment facility type (freestanding hospital or psychiatric unit within a general hospital), and teaching status (major teaching, minor teaching, or nonteaching).
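As a minimal illustration of this comparison (not part of the study’s analysis), a chi-square test of independence could be run in Python with SciPy. The ownership counts below are invented for illustration; only the row totals match the cohort sizes reported here, and the study’s actual distributions appear in Table 1.

# Illustrative sketch (Python/SciPy): chi-square test of independence
# comparing one hospital characteristic (ownership) across the four
# reporting cohorts. Counts are hypothetical.
import numpy as np
from scipy.stats import chi2_contingency

# Rows: 2009, 2011, 2014, 2015 cohorts; columns: for profit, government,
# not for profit. Row totals equal the cohort sizes (243, 139, 137, 372).
counts = np.array([
    [90, 40, 113],
    [70, 30,  39],
    [35, 25,  77],
    [80, 60, 232],
])
chi2, p_value, dof, expected = chi2_contingency(counts)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.4g}")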
For each measure, quarterly percentages or ratios were calculated by aggregating data from all eligible patients within a hospital. Percentages reflect the number of times a hospital treated a patient in a manner consistent with clinical practice guidelines divided by the number of patients who were eligible to receive such care. They were analyzed by using the generalized estimating equation (GEE) model. The GEE model, an extension of the generalized linear model, is a semiparametric approach to the longitudinal analysis of categorical responses (repeated categorical measures) (8). The GEE model was used for two reasons. First, the measurement data collected across quarters were longitudinal in nature and thus correlated within hospital. Second, the assessment concerned longitudinal trends and cohort differences in these trends.
In addition to the covariates listed above, reporting-year cohort and reporting quarter (for the first six quarters of reporting) were included to detect cohort and trend effects. Because the two discharge planning measures (HBIPS 6 and 7) were retired at the end of 2015, before this analysis, only four quarters of these data were reported by the 2015 cohort. For the remaining HBIPS measures, six quarters of data were reported by all cohorts.
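The following Python sketch, which uses statsmodels rather than the authors’ SAS implementation, illustrates the general form of such a GEE analysis for one percentage measure. The column names, the synthetic data, and the patient-level formulation are assumptions made for illustration, not a reproduction of the study’s model.

# Illustrative GEE sketch (not the authors' code): binary indicator of
# guideline-consistent care, clustered by hospital, with cohort, quarterly
# trend, cohort x quarter interaction, and an example covariate.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for h in range(60):                                  # 60 synthetic hospitals
    cohort = rng.choice(["2009", "2011", "2014", "2015"])
    ownership = rng.choice(["for_profit", "government", "not_for_profit"])
    hospital_effect = rng.normal(0, 0.5)             # induces within-hospital correlation
    for q in range(1, 7):                            # six reporting quarters
        p = 1 / (1 + np.exp(-(1.0 + 0.1 * q + hospital_effect)))
        for _ in range(25):                          # 25 eligible patients per quarter
            rows.append({"hospital_id": h, "cohort": cohort, "quarter": q,
                         "ownership": ownership,
                         "met_measure": rng.binomial(1, p)})
df = pd.DataFrame(rows)

model = smf.gee(
    "met_measure ~ C(cohort) * quarter + C(ownership)",
    groups="hospital_id",                            # responses correlated within hospital
    data=df,
    family=sm.families.Binomial(),                   # binary outcome
    cov_struct=sm.cov_struct.Exchangeable(),
)
print(model.fit().summary())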
The restraint and seclusion measures (HBIPS 2 and 3) were calculated as the total number of restraint and seclusion hours per quarter divided by the total number of psychiatric inpatient hours in the quarter and multiplied by 1,000. Thus restraint and seclusion are reported as hours of restraint or seclusion per 1,000 patient hours. Because of the large number of zero values, a two-component regression analysis was used. First, the seclusion and restraint ratio measures were coded as zero if no hours were reported or one if a positive number of hours was reported. A preliminary analysis of the ratios showed that the nonzero values were log-normally distributed. Consequently, the data were analyzed by using a two-component model for longitudinal data with clumping at zero (9). The first component models the dichotomous variables for seclusion and restraint, respectively. The second component models the mean number of nonzero hours of seclusion and restraint, respectively, per 1,000 patient hours. The model accommodates the longitudinal nature of the data by using random effects and allows for correlation between the two components. The models were fit by using SAS, version 9.4, PROC NLMIXED and the MIXCORR macro.
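A simplified sketch of this two-part approach is shown below in Python with statsmodels. It constructs the restraint ratio, dichotomizes it, and fits the occurrence and log-intensity components separately; unlike the joint SAS PROC NLMIXED/MIXCORR model used in the study, it does not estimate the correlation between the two components. All column names and data are hypothetical.

# Simplified two-part sketch (Python; synthetic hospital-quarter data).
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
rows = []
for h in range(80):
    cohort = rng.choice(["2009", "2011", "2014", "2015"])
    for q in range(1, 7):
        patient_hours = rng.integers(20_000, 60_000)
        # many hospital-quarters report zero restraint hours
        restraint_hours = 0.0 if rng.random() < 0.4 else rng.lognormal(1.0, 0.8)
        rows.append({"hospital_id": h, "cohort": cohort, "quarter": q,
                     "restraint_hours": restraint_hours,
                     "patient_hours": patient_hours})
q_df = pd.DataFrame(rows)

# HBIPS 2-style ratio: restraint hours per 1,000 patient hours
q_df["ratio"] = 1000 * q_df["restraint_hours"] / q_df["patient_hours"]

# Part 1: occurrence of any restraint use (0/1), clustered by hospital
q_df["any_use"] = (q_df["ratio"] > 0).astype(int)
part1 = smf.gee("any_use ~ C(cohort) * quarter", groups="hospital_id",
                data=q_df, family=sm.families.Binomial(),
                cov_struct=sm.cov_struct.Exchangeable()).fit()

# Part 2: intensity among nonzero quarters, modeled on the log scale
# (nonzero ratios treated as lognormal), with a hospital random intercept
nonzero = q_df[q_df["ratio"] > 0].copy()
nonzero["log_ratio"] = np.log(nonzero["ratio"])
part2 = smf.mixedlm("log_ratio ~ C(cohort) * quarter",
                    data=nonzero, groups="hospital_id").fit()

print(part1.summary())
print(part2.summary())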
Results
Characteristics of hospitals that reported HBIPS measures are shown in Table 1. The majority of organizations were freestanding psychiatric hospitals located in urban areas. A plurality were nonteaching, not-for-profit, and midsize (100–299 beds) facilities. There were significant differences among the four cohorts by hospital characteristic. As a group, for-profit and freestanding hospitals tended to report early, and acute care hospitals with psychiatric units tended to report later (p<.001).
Table 2 shows results of regression analyses for HBIPS 1, 5, 6, and 7. After an adjustment was made for covariates, the 2011 cohort had higher average performance on the admission screening measure (p=.04 [HBIPS 1]) and the two discharge planning measures (p=.01 [HBIPS 6], p=.02 [HBIPS 7]), compared with the 2009 cohort. The 2014 cohort had lower performance on admission screening (p<.001) than the 2009 cohort, but it performed better than the 2009 cohort on justification for multiple antipsychotics (p=.01 [HBIPS 5]) and transmission at discharge of the continuing care plan (p=.02 [HBIPS 7]). The 2015 cohort performed better than the 2009 cohort on providing justification for multiple antipsychotics (p<.001 [HBIPS 5]), preparing written continuing care plans at discharge (p<.001 [HBIPS 6]), and transmitting the plans to the next care provider (p<.001 [HBIPS 7]).
Differences in measure performance over the first few reporting quarters may be partly explained by differences that existed at the initial reporting point. After adjusting for covariates, we found that at the initial reporting point, the 2011 and 2015 cohorts had significantly higher performance on each of the four measures compared with the 2009 cohort (p<.001). The 2014 cohort showed lower initial performance than the 2009 cohort on admission screening (p<.001) but higher initial performance for antipsychotic medication justification at discharge (p=.02) and the continuing care plan measures (p<.001). In addition, the reporting quarter coefficients were positive and significant for each measure (p<.001 for all comparisons; Table 2), indicating overall progress by each cohort.
The results for the two-part regression model of restraint and seclusion are reported in Table 3. The coefficients for occurrence indicate the percentage of positive (nonzero) hours per 1,000 patient hours for each cohort. The nonsignificant slope × cohort coefficients indicate that the percentage of nonzero hours of restraint did not change during the initial six quarters.
Table 3 also examines the intensity of use of restraint and seclusion, reported as the average number of hours per 1,000 patient hours (excluding zero hours) spent in restraint or seclusion. Only the 2011 cohort had significantly fewer overall hours of restraint (p<.001) and seclusion (p<.03) per 1,000 patient hours compared with the 2009 cohort. The slopes for hours of restraint and hours of seclusion per 1,000 patient hours indicate that their use decreased significantly over the initial six reporting quarters in the 2009 (restraint, p=.002; seclusion, p=.017) and the 2015 (restraint, p<.001; seclusion, p=.005) cohorts.
For HBIPS 1, 5, 6, and 7, each cohort improved its performance from its initial level to the last measured point (Figure 2). For these four measures, the weighted average percentage of patients who were treated at baseline in a manner consistent with the guideline was 76.4%. For the last reporting quarter, the weighted average was 89.2%. The difference of 12.9 percentage points provides a rough estimate of overall improvement. The geometric mean of hours per 1,000 patient hours for the first and last reporting quarters was calculated separately for restraint and seclusion and was compared across cohorts. Mean decreases (improvements) in these measures were less than 2 percentage points.
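For readers who wish to apply these summary calculations to their own data, the back-of-envelope Python sketch below shows how a patient-weighted average percentage and a geometric mean ratio can be computed. The rates, denominators, and ratios in the sketch are invented and do not correspond to the study’s results.

# Back-of-envelope sketch of the summary calculations (hypothetical values).
import numpy as np

# Weighted average percentage across the four cohorts, weighting each
# cohort's rate by its number of eligible patients.
rates_first = np.array([0.78, 0.82, 0.70, 0.76])    # first reporting quarter
rates_last = np.array([0.90, 0.92, 0.86, 0.89])     # last reporting quarter
patients = np.array([24_000, 14_000, 13_000, 37_000])

first = np.average(rates_first, weights=patients)
last = np.average(rates_last, weights=patients)
print(f"overall improvement: {(last - first) * 100:.1f} percentage points")

# Geometric mean of nonzero restraint ratios (hours per 1,000 patient hours)
# for a single reporting quarter.
ratios = np.array([0.42, 0.31, 0.55, 0.28])
print(f"geometric mean: {np.exp(np.log(ratios).mean()):.2f}")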
When each cohort’s performance for each measure was examined within the context of the entire time period, the percentage measures appeared to show consistent improvement. For cohorts with already high levels of performance, there was evidence of a ceiling effect. For the ratio of restraint hours per 1,000 patient hours, the 2014 and 2015 cohorts showed consistent improvement. The 2009 cohort appeared not to change much across quarters, although the level of restraint and seclusion for this cohort began and remained low.
Measure performance differed significantly by some hospital characteristics. The most consistent finding was related to hospital ownership type. Compared with not-for-profit hospitals, for-profit hospitals had significantly higher performance on writing a continuing care plan (p=.01) and transmitting the plan to the next treatment facility (p<.001) (Table 2). Somewhat less consistently, for-profit hospitals had a significantly higher ratio of restraint hours (p<.001) but a lower ratio of seclusion hours (p=.012) (Table 3).
Discussion
This research was designed primarily to examine performance trends across time for psychiatric treatment facilities that report on the HBIPS measures. The first cohort (2009) consisted of hospitals that were early adopters of the measures and voluntarily reported their data. This cohort was the most varied of the four cohorts in terms of facility type and other hospital characteristics. For this reason, this cohort provides an interesting point of comparison with the other three cohorts.
It is worth noting that the 2009 early adopters did not report consistently better or worse performance at baseline compared with hospitals in later cohorts, nor did the 2009 cohort stand out in terms of its overall rate of improvement. That is not altogether surprising, given that results from previous studies of health care organizations that voluntarily report data have been mixed. A study of women’s health care quality among managed care organizations showed that higher-performing organizations were more likely to voluntarily report their data (10). In contrast, a study comparing quality rankings of hospitals that voluntarily participated in the 2008 Leapfrog Survey found little upward bias in the volunteer group and some evidence of downward bias (11). Similarly, hospitals that voluntarily reported surgical outcomes to CMS, which then posted the information on a Web site, did not differ on most quality indicators from hospitals that chose not to have their information presented (12).
One finding related to measure performance, however, was remarkably consistent across the cohorts. Hospitals in the earlier reporting cohorts were outperforming hospitals in later reporting cohorts by the time each new cohort began reporting. Examination of the longitudinal data presented in Figure 2 clearly supports the benefits of earlier measurement and reporting. This finding suggests that hospitals in the early adopter group had the most time to use the feedback from reporting to improve performance. Although this conclusion should be interpreted with some caution, it is worth noting that the specifications for the HBIPS measures were available to all hospitals simultaneously. The freestanding psychiatric hospitals in the 2011 cohort would likely have been aware that the measures would focus on specific evidence-based practices and that reporting on these measures would soon be required. It is possible that many of the hospitals in the 2011 cohort chose to implement the measures earlier but elected not to report them to the Joint Commission until 2011. If that is the case, it would suggest that performance measurement initiatives may be enhanced by public reporting and accountability and not just by measurement and feedback.
Examination of the cohort × quarter interactions for HBIPS 1, 5, 6, and 7 showed that not every cohort improved at the same rate on each measure over the initial reporting quarters. However, for these measures, results for the last reporting quarter showed improvement over the initial reporting quarter for every cohort. The greatest variability in cohort performance was observed in the restraint and seclusion measures. This variation may be related to a combination of factors associated with ownership type and hospital type.
The 2011 and 2014 cohorts, which had significantly higher proportions of government-owned hospitals, had higher restraint intensity compared with the 2009 cohort. The 2011 cohort, however, had seclusion ratios that were similar to those of the freestanding psychiatric hospitals in the 2009 cohort and significantly lower than those of the 2014 and 2015 cohorts, which largely comprised hospitals with psychiatric units. Differences in restraint and seclusion ratios may be due to differences in patient mix associated with ownership type and hospital type. Although differences by hospital type were mixed and are difficult to interpret, one set of findings, that for-profit hospitals performed better on writing and communicating discharge plans, was consistent with recent research by Shields and Rosenthal (13).
This study had a number of limitations. First, although performance differences among cohorts, hospital types, and ownership types were observed, the study was not designed to explain such differences. Second, the magnitude of differences across some of the measures may be affected by ceiling or floor effects. Hospitals with high starting points on performance measures (or low starting points, in the case of restraint and seclusion hours) have little room to demonstrate improvement, reducing the value of slope comparisons across cohorts. In addition, performance measure rates may be affected by changes in specifications, although the impact of such changes is likely to be small.
Third, limited data were available for the hospitals in the 2014 and 2015 cohorts, and a longer reporting period may have revealed different patterns compared with the 2009 and 2011 cohorts. It is also important to note that quality measures, such as the HBIPS measures, were developed by experts to help health care facilities provide better treatment. Although they address important aspects of care, they do not represent every dimension of good psychiatric treatment. Moreover, although the measures were selected and constructed on the premise that improved treatment processes lead to better treatment outcomes, they are not outcome measures. Finally, the study included only organizations accredited by the Joint Commission. This group represents the vast majority of freestanding psychiatric hospitals in the United States, but the results cannot be generalized to facilities that are unaccredited or accredited by another organization.
Conclusions
This study found significant differences in performance among cohorts of hospitals that began reporting psychiatric measure data at various points in time. The cohorts were marked by significant variations in hospital type, ownership type, and other hospital characteristics, and their participation in measure reporting was likely influenced by changes in external reporting requirements. Examining measures separately by cohort provides an interesting perspective that is not generally included in reports on trends in measurement. The data did not support the assumption that hospitals that chose early adoption of measurement, electing to report data voluntarily before being required to do so, were motivated by confidence that they were performing at a higher level than other hospitals.
Perhaps the most interesting finding from the study, however, was that previously observed associations between measure reporting and improvement appear to be quite robust. A trend toward improvement was found across nearly all measures and nearly all cohorts, even though the cohorts started at different performance levels, improved at different rates, and may have elected to begin reporting at different times and for different reasons. Furthermore, the consistent association of measurement and reporting with steady improvement appears to have a corollary: hospitals that begin reporting early have a comparative advantage in performance over hospitals that delay.