Health care costs in the United States surpass those in other industrialized nations, but higher costs are not accompanied by superior health outcomes. The Institute of Medicine recently highlighted this deficiency in two reports and recommended developing pay-for-performance (P4P) structures as one vehicle to link reimbursement to quality with the aim of reducing costs and encouraging innovation. P4P is an umbrella term for providing financial incentives for achieving specific targets that promote quality.
Use of P4P in health care is increasing rapidly in the United States. More than 100 programs provide financial incentives to either the physician or the agency for meeting certain standards. In the past decade, at least six reviews have evaluated evidence for P4P in general medical care. Results are mixed, and few studies have had rigorous designs; however, the findings suggest that P4P can lead to improved quality and efficiency. In behavioral health care, the most recent survey counted 24 P4P programs and indicated increasing use. [An online supplement lists references for the reviews and survey.] To our knowledge, no systematic review has assessed studies of the use of financial incentives to improve performance in behavioral health care.
Review of P4P in Behavioral Health Care
We identified articles that assessed the effect of P4P strategies on mental health care and substance abuse treatment by searching four databases (PubMed, PsycINFO, Web of Science, and Google Scholar), one search engine (Google), and one market intelligence firm report database (Open Minds). The following terms were used: pay for performance, value-based purchasing, performance-based contracting, and either behavioral health, mental health, or substance abuse. Titles and abstracts were assessed for eligibility, and studies were included if they evaluated the effects of P4P on mental health care or substance abuse services. We reviewed bibliographies of retrieved articles for relevant additions. Reviews, descriptive studies, and nonempirical studies were not included.
More than 100 descriptive studies and opinion pieces on “pay-for-value” and 15 empirical studies evaluating the effects of P4P on mental health or substance abuse services were identified. The study designs were diverse and included pre-post designs, cross-sectional designs, and randomized controlled trials. All 15 studies were conducted in the United States. Four (1–4) examined P4P programs implemented nationally, and 11 examined programs implemented in a state or county: Maine (five studies [5–9]); Delaware (three [10–12]); Massachusetts (one [13]); Baltimore, Maryland (one [14]); and King County, Washington (one [15]). Outcomes included organizational, clinician, and patient factors, such as client retention, wait time, occupancy, length of stay, target threshold of treatment provided, and program completion (Table 1).
Eighteen outcome domains were identified, including access, continuity, satisfaction, efficiency, engagement, and treatment fidelity. Twelve studies found only positive effects of P4P, and each of the 18 outcomes showed a positive effect in at least one study. All therapist-level measures showed effects in the intended direction. Two studies found no effect of P4P on four of the outcomes at the patient or organizational level (2, 6), and two found a negative effect on four of the patient-level outcomes (4, 6) (Table 1). Null and negative results were hypothesized to have resulted from a concurrent initiative that opposed the goals of the performance contract (6) or to have been an unintended consequence of the incentive itself, which prioritized one outcome over another (4). Eight studies (4–9, 11, 12) also evaluated the effects of P4P on gaming (such as patient selection and dumping) and other unintended consequences. Four studies found evidence of changes in patient characteristics (11, 12), changes in information reporting (9), and reallocation of time to provision of fewer services (4).
Discussion
Empirical data on the effect of P4P in behavioral health care are scarce. The limited data suggest that external incentives lead to positive outcomes. Only three of 15 studies had negative or null results; for each of the five outcomes on which a negative or null result was found, another study found a positive effect on the same outcome. Differences may have resulted from the substantial heterogeneity among studies in setting, patient population, type of incentive, and specific outcome measures. Many of the studies had pre-post and quasi-experimental designs and did not include a control group; only two included randomization. Of note, the measures identified in this review were primarily measures of access, continuity, treatment fidelity, and engagement. There were no symptom or medication outcome measures, and it is not possible at this time to generalize about P4P for particular disorders or types of patients. More research is needed to determine whether P4P can directly improve clinical outcomes for patients, rather than proxies of good clinical care.
Our review also revealed a large amount of unevaluated P4P activity in the public sector, suggesting that P4P has proliferated in advance of experimental support. More research is needed on the effect of performance-based incentives, including isolating the effect of specific incentives and level of implementation (organizational versus practitioner), and on unintended consequences, such as gaming. Also needed is research on the cost-effectiveness of incentive programs. Collaborative stakeholder approaches to identifying fair and transparent incentives would be particularly beneficial.
P4P in behavioral health is simple in concept but complex in execution. Behavioral health does not have definitive outcome measures, such as presence or absence of disease or mortality. At best, as evidenced in this review, we measure processes (or provision of health care services) that research has indicated are likely to improve patient outcomes. Operationalizing patient outcomes is complicated by case mix and heterogeneity of disorder presentation; for many disorders, the outcome is a change in social or functioning status, which is difficult to measure. Compounding these challenges is the difficulty of choosing which quality metric to incentivize (for example, efficiency, safety, satisfaction, outcome, or progress) to achieve the best effect. With any P4P implementation, incentives may lead to behavior that runs against the payer’s or patient’s interest, including behaviors that harm the consumers whom the incentives are most intended to help.
Beyond P4P and performance-based contracting, there is significant interest in “value-based reimbursement and purchasing.” Despite multiple webinars, white papers, and blog posts broadcasting the market shift to value-based reimbursement, our team was unable to locate a published definition or description of the term or a discussion of how it is distinguished from P4P. In some presentations, value-based reimbursement is synonymous with P4P, whereas in other venues, value-based reimbursement appears to include a menu of alternative payment models (including P4P, case rates, bundled payments, and capitation) that incentivize “value” (rather than volume, as in fee for service). As with performance or quality, value-based purchasing raises questions about how the value of mental health services is defined. This purportedly revolutionary concept of pay for value and value-based purchasing may be a “rebranding” of P4P. It appears that the rhetoric and discussion regarding pay for value have outpaced the evidence, and we caution against the use of terms for which there is no agreed-upon definition.
P4P systems have been increasingly promoted to improve care quality, and as a result, P4P has been widely implemented across the United States. Given the complexity of behavioral health care, it is especially important to study the effects of P4P on care provision and health outcomes and the financial implications of these changes. The current state of the science of P4P in mental health care and substance abuse treatment suggests that P4P improves, or at the very least does not diminish, the behaviors each particular payment system encourages. However, current research is not uniform; is largely payer driven; evaluates P4P by using a range of outcome measures across diverse locations, specialties within behavioral health, and levels of incentives; and does not directly address clinical outcomes. In addition, unintended consequences, such as patient selection and dumping and other gaming behaviors, have been identified. With the implementation of the Affordable Care Act and Medicaid expansion, accountability for quality and measures for incentive-based approaches such as P4P will become even more critical. More research is necessary to determine the overall efficacy of P4P in behavioral health. Given the high prevalence of P4P systems in behavioral health and the initial evaluation efforts, we are optimistic about the potential for future high-quality evaluations.