In 2003 the President’s New Freedom Commission on Mental Health issued a report that put forth six goals and 19 recommendations for transforming the mental health system to “ensure that mental health services and supports actively facilitate recovery, and build resilience to face life’s challenges” (1). To facilitate implementation of the commission’s report, in September 2005 the Substance Abuse and Mental Health Services Administration (SAMHSA) awarded mental health transformation state incentive grants (MHTSIGs) to the offices of the governors of nine states. Each grantee state received approximately $2 million to $3 million in each of the five years of the program, so the total cost of the program exceeded $100 million.
SAMHSA specified that each state was to use the MHTSIG funds to implement infrastructure improvements consistent with a comprehensive mental health plan developed by the state but not to provide direct services (2). In keeping with the commission’s emphasis on the principles of federalism, states had considerable leeway in the types of infrastructure improvements they could make. In the second year of the program, SAMHSA contracted for an evaluation of the program, which is described in this report.
From an evaluator’s perspective, the MHTSIG program had a number of problems. Neither federal program staff nor grantees were provided with a detailed program theory or logic model. In fact, the core idea that services and consumer outcomes could be improved by supporting infrastructure change alone seemed to be contradicted by earlier studies, some supported by SAMHSA. Moreover, no consumer service utilization or outcome measures were developed specifically for the program. In addition, the number of participating states was small (N=9), the evaluation did not begin until one year after the program started, and the evaluators had limited access to federal program staff.
With these limitations in mind, the multisite evaluators and local evaluators at some of the sites concluded that an evaluation summarizing program outcomes would be difficult to interpret. So although a summative evaluation was designed, the evaluators also put a great deal of effort into deploying a Web-based implementation monitoring tool that could be used formatively to help grantees report their efforts to improve infrastructure. The tool, called the Transformation Tracker, was adapted from the Transformation Manager, a Web-based project management tool developed by the Human Services Research Institute. The goal of the Transformation Tracker was to give federal program and state staff an opportunity to evaluate the states’ formative decisions, in the hope that a clear program theory (albeit a post hoc one) would develop and that grantee programs generating large effect sizes would emerge.
This report describes the evaluation of the MHTSIG program. We discuss the evaluability of the program, use of the Transformation Tracker, the way in which the program was implemented, and lessons learned.
Methods
Grantees varied by geographical region, percentage of residents with serious mental illness, state versus county emphasis, and state political leanings.
Grantees were required to report progress on infrastructure changes in six categories that were based on infrastructure performance indicators developed by a small committee of SAMHSA staff members in consultation with SAMHSA subject matter experts. The indicators, listed in Table 1, were designed to meet federal requirements for measurement of infrastructure change contained in the Government Performance and Results Act of 1993 (GPRA) (P.L. 103–62). The infrastructure indicators measure policy changes, training of persons in the mental health care and related workforce in service improvements, financing of policy changes, organizational changes, the number of organizations regularly obtaining and analyzing data relevant to the goals of the comprehensive mental health plan, and the number of consumers and family members in statewide consumer- and family-run networks.
Infrastructure changes were entered in the Transformation Tracker, which produced automated reports enabling SAMHSA project staff, grantees, and the evaluation team to monitor goal attainment.
Changes in services provided before and after the grant were assessed with two measures, neither of which was collected specifically for the MHTSIG project: the National Outcomes Measures (NOMS), which states report annually to SAMHSA as a condition of community mental health block grant funding, and items on unmet need for mental health services and receipt of mental health treatment from the National Survey on Drug Use and Health (NSDUH).
To assess statewide changes in client outcomes, we used data from the Behavioral Risk Factor Surveillance System (3), the NSDUH (4), and the NOMS (5).
Staff members’ perceptions of the adequacy of federal staffing and of their ability to influence grantee activities were drawn from semistructured interviews with SAMHSA program staff conducted at two time points, as well as from observations by the federal project officer for the evaluation.
Although the federal program had no overall logic model, we adopted a model predicated on the hypothesis that there would be positive, but small, correlations between all program elements (infrastructure changes, service changes, and consumer outcomes). [Details of the model are available in the online data supplement.]
We used Spearman’s rho, a nonparametric measure of association well suited to small samples, to calculate the correlations between variables. Because we predicted only positive correlations, we used one-tailed statistical tests, and because of the large number of tests performed, we adjusted for alpha inflation with the Benjamini-Yekutieli modified false discovery rate procedure (6).
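As a rough illustration of this analytic approach, the sketch below computes one-tailed Spearman correlations for a set of hypothetical state-level change scores and applies the Benjamini-Yekutieli correction; the variable names and simulated data are assumptions for illustration only, not the actual MHTSIG measures.

```python
# Minimal sketch of the correlational analysis described above, using
# hypothetical data: one-tailed Spearman correlations with
# Benjamini-Yekutieli false discovery rate control.
import numpy as np
from scipy.stats import spearmanr
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
n_states = 9  # one observation per grantee state

# Hypothetical state-level change scores (not the actual MHTSIG data).
infrastructure_change = rng.normal(size=n_states)
service_change = rng.normal(size=n_states)
outcome_change = rng.normal(size=n_states)

pairs = {
    "infrastructure vs. services": (infrastructure_change, service_change),
    "services vs. outcomes": (service_change, outcome_change),
    "infrastructure vs. outcomes": (infrastructure_change, outcome_change),
}

labels, rhos, pvals = [], [], []
for label, (x, y) in pairs.items():
    # alternative="greater" gives a one-tailed test of a positive correlation.
    rho, p = spearmanr(x, y, alternative="greater")
    labels.append(label)
    rhos.append(rho)
    pvals.append(p)

# The Benjamini-Yekutieli procedure ("fdr_by") controls the false discovery
# rate under arbitrary dependence among the tests.
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="fdr_by")

for label, rho, p, p_by, sig in zip(labels, rhos, pvals, p_adj, reject):
    print(f"{label}: rho={rho:.2f}, one-tailed p={p:.3f}, "
          f"BY-adjusted p={p_by:.3f}, significant={sig}")
```

With only nine states, even moderately large correlations are unlikely to remain significant after correction, which is consistent with the concern about type II error raised in the Discussion.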
Results
All grantees were able to enter data into the Transformation Tracker, showing the feasibility of Web-based implementation tools for tracking infrastructure improvements. However, differences in “planning styles” made it difficult to compare the data. For example, some states made entries only for activities at completed “end points,” whereas others also entered the steps taken to reach end points. These findings suggest that planning style (a preference for reporting end points only or for reporting both end points and the steps taken to reach them) should be controlled for in future studies involving planned infrastructure changes. However, in interviews, federal staff and some grantees did not report finding the tool particularly useful. This may have been because the tool did not become available until year 2 of the project, after most planning activities were completed, and because federal project officers did not report having used it.
Early in the cleaning of the infrastructure improvement data, it became apparent that grantees differed in the types of efforts they reported. Some made entries reflecting actions taken as prerequisites to infrastructure changes, for example, actions related to project management, marketing, or other purposes. To account for these differences in reporting styles, only entries identified as an end-point infrastructure goal were included in our analyses. Data cleaning also identified infrastructure changes that occurred incrementally over several years; each such sequence was counted as a single change. [Examples of infrastructure change, incremental infrastructure change, and supporting entries are available in the online data supplement.]
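To make these cleaning rules concrete, the following sketch applies them to a small, entirely hypothetical table of Tracker entries; the column names and values are assumptions for illustration and do not reflect the Transformation Tracker’s actual export format.

```python
# Hypothetical illustration of the two cleaning rules; the schema is assumed,
# not the Transformation Tracker's actual export format.
import pandas as pd

entries = pd.DataFrame(
    {
        "state": ["A", "A", "A", "B", "B"],
        "gpra_category": [2, 2, 1, 5, 5],
        "entry_type": ["endpoint", "supporting", "endpoint", "endpoint", "endpoint"],
        "change_id": ["A-01", "A-01", "A-02", "B-01", "B-01"],
        "year": [2007, 2006, 2008, 2007, 2008],
    }
)

# Rule 1: keep only entries identified as end-point infrastructure goals.
endpoints = entries[entries["entry_type"] == "endpoint"]

# Rule 2: count an infrastructure change reported incrementally across years
# once, here by collapsing rows that share the same change identifier.
changes = (
    endpoints.sort_values("year")
    .groupby(["state", "gpra_category", "change_id"], as_index=False)
    .first()
)

print(changes)
```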
Grantees completed a total of 405 infrastructure changes, or approximately nine infrastructure changes per grantee per year. Changes in category GPRA 2 (training) were far more frequent than those in any other category: grantees completed 159 training-related infrastructure changes over the grant period, or approximately 3.5 per grantee per year, accounting for 39% of all infrastructure changes.
Policy changes (GPRA 1) made up the next most frequently assigned category, accounting for 28% of all infrastructure changes; the total of 112 such changes yielded 2.5 infrastructure changes per grantee per year. The category with the fewest infrastructure changes was the number of organizations obtaining and analyzing data (GPRA 5), which accounted for 23 (6%) of the infrastructure changes, or 0.5 infrastructure changes per grantee per year.
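The per-grantee-per-year rates and percentages above follow from dividing each category total by 45 grantee-years (nine grantees over five years) and by the overall total of 405; the short check below simply reproduces that arithmetic.

```python
# Reproduce the reported rates from the category totals
# (9 grantees x 5 years = 45 grantee-years).
category_totals = {"GPRA 2 (training)": 159, "GPRA 1 (policy)": 112, "GPRA 5 (data)": 23}
all_changes = 405
grantee_years = 9 * 5

for category, count in category_totals.items():
    per_grantee_year = count / grantee_years
    share = count / all_changes
    print(f"{category}: {count} changes, {per_grantee_year:.1f} per grantee "
          f"per year, {share:.0%} of total")

print(f"All categories: {all_changes} changes, "
      f"{all_changes / grantee_years:.0f} per grantee per year")
```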
Analyses of correlations between changes in infrastructure and services, changes in services and consumer outcomes, and changes in infrastructure and consumer outcomes showed few statistically significant correlations and a number of negative ones. Overall, this pattern of relationships did not support the study hypothesis. [The nonparametric correlations between infrastructure changes and consumer outcomes are available in the online data supplement.]
Discussion
Programs supporting infrastructure change may consume resources and attention, so understanding the effects of such programs on service improvement and consumer outcomes is important. This study showed that implementation of infrastructure changes can be monitored and that variations in planning style should be taken into account in project management and data analysis.
Because this evaluation was subject to the substantial limitations cited above, its negative findings by themselves cannot be taken as unambiguous evidence that infrastructure changes did not affect services and outcomes. Nevertheless, the findings are consistent with previous studies (7–10).
This consistency raises the question of why, given the previous studies cited above, the infrastructure improvement program was implemented with only very general requirements for infrastructure improvements and without requiring that the improvements be accompanied by services of proven effectiveness (recall that the program prohibited spending grant funds on services). Several possibilities, taken singly or in combination, suggest themselves. One possibility is that the focus of the program was more on disseminating and sustaining system improvements than on improving consumer outcomes. Another is that SAMHSA assumed that general infrastructure changes and infrastructure changes directly linked to services are equally effective in improving consumer outcomes, an assumption that previous studies suggest may not be correct.
A third possibility is that rather than pursue an evidence-based assessment of what would most improve consumer outcomes, SAMHSA felt compelled to follow up on the New Freedom Commission’s recommendations for certain broadly conceived infrastructure improvements because of the federal government’s investment in the commission. A final possibility is that funding general infrastructure improvements appeared to be a less expensive way of improving outcomes than funding statewide service improvements.
The evaluation’s finding that transformation of infrastructure had no effect on service changes or consumer outcomes may be spurious, reflecting type II error, given the limitations of the program design and the evaluation. However, assuming the findings are not wholly spurious, we believe it is useful for future evaluations to consider why infrastructure changes alone may not have affected services and outcomes as hypothesized. One theory is that the causal chain between infrastructure change and consumer outcomes is too long to produce reliable impacts. Another is that infrastructure change alone cannot succeed unless it is coupled with the implementation of effective services.
Another theory, specific to the MHTSIG program, is that the types of infrastructure change envisioned for the program were not implemented. Our interviews with SAMHSA program staff and the program’s federal project officer suggested several possible reasons why the infrastructure changes differed from those envisioned by SAMHSA. One reason is that although SAMHSA specified areas of infrastructure improvement, it lacked the technical assistance capability to provide a great deal of detailed guidance to grantees as to which changes would be effective. SAMHSA initially contracted with an outside organization without public mental health system experience to provide “outside the box” technical assistance. This technical assistance was ineffective, and the contract was terminated in the program’s second year. A more effective contract was eventually put in place, but only after the grantee programs had already set their directions.
Another possible reason for the lack of effective transformation is the influence of the “new federalism,” a political philosophy operating over the past three decades that imposes constraints on federal agencies such as SAMHSA (11–13). An additional factor may have been constraints on federal staffing of the MHTSIG program, part of what some have referred to as the “hollowing out” of many federal agencies, which is associated with the new federalism (14,15). One symptom of this phenomenon may have been the decision by federal staff to decline to formally approve state entries in the Transformation Tracker. Both of these explanations are consistent with statements by SAMHSA staff in formal interviews that at various times staffing and management support were not sufficient to provide adequate technical assistance to states, that more program staff were needed, and that the program needed “more teeth.”
Conclusions
Previous studies of more evaluable programs, and the consistency of this evaluation with that previous work, indicate that future federal investments, especially those involving large sums of money, should take into account existing evidence, support operationally defined infrastructure improvements with accompanying service improvements, and go against the grain of the “hollow state” by providing sufficient federal or other resources, such as staff and Web-based implementation monitoring tools, to review and oversee grantee activities. In addition, to be more useful, future evaluations should not be subject to the methodological limitations noted above. Notably, a study of how sites were selected for this program was beyond the scope of this evaluation; given the lack of random sampling or assignment, however, site selection may also have influenced program results and should be examined in future evaluations of programs of this type.
Acknowledgments and disclosures
This evaluation was conducted by Manila Consulting through contract HHSS280200600023C with the Substance Abuse and Mental Health Services Administration (SAMHSA). The authors thank Crystal Blyler, Ph.D., project officer for the evaluation; Grace Gonzalez, M.P.H.; and the local evaluators at the grantee sites. The views expressed in this article are those of the authors and are not those of SAMHSA.
The authors report no competing interests.