Among the most controversial interventions in community-based treatment for people with severe and persistent mental illness is the imposition of limit-setting restrictions to curtail certain maladaptive behaviors. Whether in its most comprehensive form, such as outpatient commitment (1,2), or in less globally restrictive forms, such as assignment of a representative payee (3,4) or supervision by mental health courts (5), or through informal restrictions that are part of services delivered in assertive community treatment programs (6), limit setting is regarded by some experts as necessary in the treatment of certain patients. Some have argued that behavior controls are needed to coerce or limit the freedom of some patients in their own best interests, especially because of the increasing numbers of forensic and dually diagnosed patients (7,8,9,10,11). Despite considerable debate on whether such restrictions are beneficial or harmful, few empirical studies have examined these interventions. Although one study suggested that persons under outpatient commitment remain more clinically stable (1), another study found no benefit (2). Studies of the impact of representative payeeship have also had mixed results (3,4).
In a previous descriptive study, we used data from a 40-site Department of Veterans Affairs (VA) program that provides treatment similar to assertive community treatment to severely mentally ill veterans (6). We described the frequency with which 25 different limit-setting or coercive activities were used during the first six months after program entry. That study identified various types of limit-setting activities and client characteristics that were associated with their use. However, it did not examine the relationship between the use of limit-setting interventions and subsequent outcomes.
The study reported here extends the previous findings by examining the association of limit-setting interventions and outcomes. We hypothesized that if these interventions are effective, outcomes of patients for whom limit-setting interventions were used would, at a minimum, be no worse than outcomes of other patients, after the analyses controlled statistically for potentially confounding differences between these clients.
A previous study based on this data set showed that clients for whom limit-setting interventions were used had more severe problems at baseline than other clients (6). Therefore, in the study reported here, we used multiple regression analysis to minimize the potential influence of selection biases on our results. First, we examined the association of the use of each of five types of limit-setting interventions with client baseline measures at the time of program entry. Next, in an analysis of the relationship of limit-setting interventions to six-month outcomes, we used multiple regression analysis to control for these potentially confounding differences between clients exposed to limit-setting interventions and those who were not.
Finally, because use of limit-setting techniques may have a global effect on the tenor and effectiveness of service delivery by assertive community treatment teams, we compared client outcomes for programs in which the use of limit-setting interventions was high with client outcomes for programs in which such use was low. We thus attempted to evaluate the association of limit-setting interventions with clinical outcomes at both the individual client level and the program level, using multiple regression analysis to control for baseline characteristics.
Methods
As part of an administratively mandated performance monitoring system used by all VA mental health intensive case management teams, data were obtained from new clients of 40 different teams that were in operation between June 1995 and December 1997 and from the clients' case managers. The teams provide services similar to assertive community treatment (12,13). The teams have high staff-client ratios, deliver services in community settings, and use a practical problem-solving approach. Continuity of care is high. On a standard measure of fidelity to assertive community treatment (14), the VA teams scored an average of 4 out of a possible 5 (13), which placed them in the same range as model assertive community treatment programs.
Community adjustment and clinical data were obtained at the time of program entry and six months later through highly structured face-to-face interviews. Delivery of assertive community treatment services was documented on structured service delivery summaries completed by the case managers six months after treatment started. The institutional review board of the VA Connecticut Health Care System agreed to a waiver of informed consent to use these administrative data.
Sample
Study participants were 1,564 veterans who had received treatment from a mental health intensive case management team for at least six months. Patients were eligible for referral to these teams if in the previous year they had spent at least 30 days in the hospital or had had three admissions, if they had at least one non-substance-related clinical psychiatric diagnosis (24 percent also had comorbid substance use disorders), and if they were judged by the referring clinician to need intensive community-based care because of significant functional problems with community adjustment or with adherence to treatment.
Clinical measures
Data were obtained on sociodemographic characteristics, including age, gender, race, education, marital status, disability status, and days incarcerated. Subjective psychological distress was measured by using the Brief Symptom Inventory (15) and psychopathology by using the Brief Psychiatric Rating Scale (16). The Global Assessment Scale was used to measure functional capacity (17). The CAGE questionnaire was used to measure the severity of alcohol and drug use (18). The Addiction Severity Index was used to obtain information about the number of days of paid employment in the previous 30 days (19). In addition, clients rated their quality of life using items from the Lehman Quality of Life Inventory (20). Violent behavior in the past month was rated with use of a scale from the National Vietnam Veterans Readjustment Study (21). Current working clinical diagnoses were also documented.
Limit-setting interventions
Each client's primary clinician assessed limit setting by using a 25-item questionnaire. Each item described a limit-setting behavior that the clinician rated on frequency of use during the previous six months: never or rarely, 0; sometimes or occasionally, 1; and often or always, 2. The 25 different limit-setting behaviors are described elsewhere (6). A previously published factor analysis identified five factors or subscales with eigenvalues greater than 1 and with item loadings ranging from .47 to .94 (22). Five types of behaviors were derived based on nonoverlapping items.
The subscales were verbal discussion of harmful behaviors (six items); passive sanctions in which the team withheld certain types of assistance until the client curtailed behaviors that would preclude taking advantage of that assistance (four items); invocation of external authorities, such as a representative payee or a probation officer (two items); seeking a declaration of incompetence to manage funds or initiation of a request for a payee (four items); and forced hospitalization through civil commitment (four items). A sixth subscale was included because the intervention it measures has been found to be clinically effective in other studies (23,24). It reflected behavioral contracting in which specific goals were identified and linked to reinforcers if the goals were achieved (two items). Because verbal discussions are not active limit-setting interventions and cannot be readily distinguished from other supportive, educational, or problem-solving kinds of verbal interaction, the subscale "verbal discussion of harmful behaviors" was dropped from subsequent analyses. Five dichotomous variables were constructed to reflect the use of each of the five remaining kinds of limit setting.
A limit-setting activity was considered to have been used with a particular client if at least two items in the subscale were coded 1 (used sometimes or occasionally) or one item was coded 2 (used often or always). An overall limit-setting index was also created, which was the sum of the five individual indicators. This measure was used to differentiate program sites on their overall use of limit-setting interventions.
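The scoring rule described above can be illustrated with a minimal Python sketch. It is not the authors' code: the subscale labels, function names, and example item codes are hypothetical, and the rule is read here as "at least two items coded 1, or at least one item coded 2."

```python
from typing import Dict, Sequence

# Hypothetical labels for the five subscales retained in the analyses.
SUBSCALES = (
    "withholding",         # passive sanctions (four items)
    "external_authority",  # invocation of external authorities (two items)
    "money_management",    # incompetence declaration or payee request (four items)
    "hospitalization",     # forced hospitalization (four items)
    "contracting",         # behavioral contracting (two items)
)


def subscale_used(item_codes: Sequence[int]) -> int:
    """Return 1 if the subscale counts as used for a client, else 0.

    Items are coded 0 (never or rarely), 1 (sometimes or occasionally),
    or 2 (often or always).
    """
    if any(code not in (0, 1, 2) for code in item_codes):
        raise ValueError("item codes must be 0, 1, or 2")
    n_sometimes = sum(1 for code in item_codes if code == 1)
    n_often = sum(1 for code in item_codes if code == 2)
    return int(n_sometimes >= 2 or n_often >= 1)


def overall_index(client_items: Dict[str, Sequence[int]]) -> int:
    """Sum the five dichotomous indicators (possible range 0 to 5)."""
    return sum(subscale_used(client_items[name]) for name in SUBSCALES)


# Made-up ratings for one client, for illustration only.
example_client = {
    "withholding": [1, 1, 0, 0],       # two items coded 1 -> used
    "external_authority": [1, 0],      # only one item coded 1 -> not used
    "money_management": [0, 0, 2, 0],  # one item coded 2 -> used
    "hospitalization": [0, 0, 0, 0],   # not used
    "contracting": [0, 1],             # only one item coded 1 -> not used
}
print(overall_index(example_client))
```

Under this reading, a two-item subscale counts as used only when both items are rated at least "sometimes" or either item is rated "often or always."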
Analysis
First, we examined the bivariate relationship of limit-setting measures to baseline client characteristics using simple Pearson correlation coefficients. Analysis of the association of limit-setting interventions and individual client outcomes at six months relied on two types of models. In the first type, a series of multiple regression analyses was conducted that examined the association of each individual limit-setting indicator, taken by itself, and each of eight outcome variables; the analyses controlled for baseline characteristics found to be associated with the use of those interventions in the first set of analyses. Because the observations represented clients "nested" within the 40 sites, hierarchical linear modeling was used to address the nonindependence of observations (25).
The second type of model examined the relationship of outcomes to all five of the limit-setting measures simultaneously to identify their independent relationship to outcomes, over and above the impact of the other interventions; these analyses also controlled for potentially confounding baseline measures.
Finally, because use of limit-setting interventions may affect the overall therapeutic environment of a clinical team, we also examined the relationship of limit setting to outcomes at the level of the overall team. For this analysis, sites were sorted into tertiles by using the mean score on the overall limit-setting index. For teams with the highest mean±SD scores on the overall index, the score was .91±.22, and scores at these sites ranged from 1.4 to .69. The mean midrange limit-setting score was .61±.06, and scores at these sites ranged from .67 to .51. The lowest mean score was .36±.09, and the range at these sites was .50 to .14.
Analyses of six-month outcomes were then repeated with replacement of the measures of individual client exposure to limit setting by dichotomous variables representing clients treated at sites with high and midrange use of limit-setting interventions. Clients from low limit-setting sites constituted the reference condition in these analyses.
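The site-level coding can be sketched as follows. The article does not report how the 40 sites were divided into tertiles, so this sketch assumes that sites are ranked by their mean overall index and cut into three groups of nearly equal size; the site names, values, and function names are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def site_means(index_by_site: Dict[str, List[int]]) -> Dict[str, float]:
    """Mean overall limit-setting index of the clients at each site."""
    return {site: mean(values) for site, values in index_by_site.items()}


def tertile_labels(means: Dict[str, float]) -> Dict[str, str]:
    """Rank sites by mean index and label them low, mid, or high.

    Assumption: tertiles are formed by rank (ties broken by site name),
    not by fixed cut points.
    """
    ranked = sorted(means, key=lambda site: (means[site], site))
    names = ("low", "mid", "high")
    n = len(ranked)
    return {site: names[rank * 3 // n] for rank, site in enumerate(ranked)}


def site_dummies(label: str) -> Dict[str, int]:
    """Dichotomous site variables; low limit-setting sites are the reference."""
    return {"high_site": int(label == "high"), "mid_site": int(label == "mid")}


# Made-up site means, for illustration only.
example_means = {"A": 0.2, "B": 0.4, "C": 0.55, "D": 0.62, "E": 0.8, "F": 1.3}
labels = tertile_labels(example_means)
print(labels)
print(site_dummies(labels["E"]))
```

Each client then carries the two dichotomous variables of his or her site into the outcome models in place of the individual limit-setting indicators.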
Because these were exploratory analyses, a significance level of .05 was used for all analyses. However, because we analyzed a total of eight dependent variables in each set of analyses, a more conservative alpha of .001 was also applied.
Results
Sample characteristics
As shown in Table 1, for the 1,564 veterans who participated in the study, the mean age at program entry was 49.35 years. Ninety-three percent (N=1,455) were men, and 71 percent were white. More than three-quarters of the sample (78 percent) had a diagnosis of psychotic disorder. Twenty-four percent had dual diagnoses of both a psychiatric and a substance use disorder. Diagnoses included schizophrenia (58 percent), schizoaffective disorder (18 percent, or 282 participants), bipolar disorder (16 percent), major affective disorder (9 percent, or 141 participants), alcohol abuse or dependence (19 percent), and drug abuse or dependence (10 percent). Clients had long histories of illness. The mean duration of illness was 27±11 years. The mean number of lifetime hospitalizations was 20±38. Sixty percent of the sample (N=938) had a lifetime cumulative length of hospitalization of two years or more.
The baseline mean score for the sample on the Brief Symptom Inventory was 2.1 (range, 1 to 4). The baseline mean score on the Brief Psychiatric Rating Scale was 22.4 (range, 0 to 80). The baseline mean score on the Global Assessment Scale was 45 (range, 0 to 100). At baseline the mean score on the CAGE questionnaire for alcohol use was .66, and the mean score for drug use was .43. The mean number of days of paid employment in the past 30 days was 1.2 (range, 0 to 30). The mean client rating using the Lehman Quality of Life Inventory was 25.3 (range, 0 to 42). The mean rating for violent behavior as measured on a scale from the National Vietnam Veterans Readjustment Study was .49 (range, 0 to 4).
Limit-setting interventions and baseline characteristics
As shown in Table 1, with one exception, limit-setting measures were weakly intercorrelated (less than .25). The exception was hospitalization and contracting, which were correlated at .57. Limit-setting measures were significantly associated with 17 different baseline characteristics, which were included as covariates in all subsequent analyses.
Separate analysis of individual limit-setting interventions and outcomes, presented in Table 2, showed that each of the interventions was associated with poorer outcomes (p<.05) on four to six of the eight outcome measures and that each of the outcome measures was associated with a negative outcome for at least one of the limit-setting interventions (models 1 to 5 in Table 2).
Thus these negative associations do not seem to be specific to a particular intervention or a particular outcome. The consistency of these relationships (63 percent were significant in the undesirable direction) suggests that they are unlikely to be an artifact of the large number of analyses conducted. Two associations, withholding support with violent behavior and money management with drug use, were significant at the Bonferroni-corrected level of .001.
Similar overall results were observed when all five of the measures were included in each outcome analysis, although fewer relationships were significant (35 percent of the total examined) and no significant relationships were found with involuntary hospitalization (Table 2, model 6) or with employment outcomes. Notably, this set of analyses showed greater differentiation across measures of limit setting, with the largest number of adverse effects appearing in association with withholding (five of eight) and none with hospitalization, although the latter result likely reflects the substantial shared variance with contracting (r=.57) noted above.
The comparison of outcomes based on site-level use of limit setting showed more violent behavior at sites that made more extensive use of these interventions but also greater employment (Table 2, model 7).
Discussion
This study of six-month outcomes in a large sample of veterans who received services in a multisite VA program similar to assertive community treatment showed that all five types of limit-setting interventions were associated with significantly poorer outcomes on at least four—and as many as six—of the eight outcome measures, after the analyses controlled for potentially confounding baseline characteristics. Similar results were obtained in a simultaneous analysis of all of these interventions together, although somewhat fewer relationships were significant.
These findings may be fairly interpreted as indicating that limit-setting interventions are not associated with outcomes that are as good as those found for clients who do not receive such interventions. The findings should not be interpreted as indicating that limit-setting interventions are harmful or cause worse outcomes. Limit-setting interventions are invoked on clinical grounds to assist clients who are having especially difficult problems, and it could be argued that it is not surprising—in fact that it could be expected—that such clients would have poorer outcomes than those who do not receive limit-setting interventions. It could be further argued that the direction of causality is reversed—that it is because some clients have poor outcomes that they receive limit-setting interventions. In the end, because the analyses showed only associations, causal interpretations are not warranted. However, we feel that a fair test criterion for these interventions in nonexperimental studies is that clients who receive them should do at least as well as clients who do not receive them, after differences in clinical status at baseline are controlled for. We feel that our data are consistent with the inference that use of limit-setting interventions in this sample did not achieve this modest objective of outcome parity.
We also conducted an analysis in which we compared sites that made more extensive use of limit-setting interventions with those that made less use of them, and we found less consistently negative results. These analyses suggest that the use of limit-setting activities does not have a diffuse adverse impact on clients treated by teams that use them.
Several limitations of this study require comment. First, as noted above, because of the observational nature of our data, we cannot conclude that the outcomes we observed are caused by the use of limit-setting interventions, although we attempted to address this limitation through the use of multiple regression analysis to reduce the effect of selection bias. A far preferable methodology would rely on random assignment to limit-setting procedures, but random assignment of seriously ill patients to such interventions would not be ethically acceptable in our view. To study these interventions we must rely on statistical methods that reduce the impact of selection biases.
Second, the analysis that compared teams with high, medium, and low limit-setting scores relied on a measure that combined all of the measures rather than on each of the subscales. It could be argued that these analyses should have been based on one of the subscales that reflected more intrusive limit setting. We did not repeat these analyses for each of the subscales because the risk of finding spuriously significant results after conducting six times as many analyses would have been substantial.
Third, the six-month time frame for outcome assessment was relatively brief. It is possible that the benefits of limit-setting interventions emerge only after more extended periods of time and would be more apparent in a longer-term study. However, if such gains were observed over a longer follow-up period, it would be more difficult to attribute them to the use of limit-setting interventions because such improvement might occur over the course of case management, even without such interventions.
Finally, it should be noted that the clients were predominantly male, and all were veterans treated in the VA health care system. The generalizability of our findings to other populations is unknown.
The ideal research design for isolating the effects of limit-setting interventions would be an experimental study in which a pool of clients judged to need limit-setting interventions would be randomly assigned either to receive such interventions or to be precluded from receiving them. Although such a study is greatly needed from a scientific and policy point of view, it would be difficult to justify ethically because it would entail either placing restrictions on clients or withholding potentially necessary behavioral restraints "by the flip of a coin." Even if an ethical experimental design were approved by a human studies review committee, clients who agreed to participate in such a study most likely would not be representative of those to whom these interventions are typically applied.
As is often the case, social and clinical policies that are highly controversial are among the most difficult to evaluate scientifically. Although limit-setting interventions may seem to represent an intuitively reasonable, common-sense approach to socially aberrant behavior, their only justification is that the restriction they impose on clients' personal freedom does indeed result in improved well-being of the clients themselves. Although the design of this study does not allow a definitive conclusion that limit-setting interventions have either ill or beneficial effects, the study could well have provided more reassuring evidence that such interventions are harmless; instead, it suggests that they do not achieve outcome parity. We cannot rule out the possibility that limit setting results in significant improvement that nevertheless falls short of this modest standard. Further research is urgently needed in this area.