Historically, serious mental illnesses were viewed as chronic, noncurable, deteriorating disorders (1, 2). Recent research, however, suggests that significant proportions of individuals with these diagnoses improve greatly or recover completely (3–10). In general, recovery rates tend to be consistent with Warner’s (11) analysis of 85 outcome studies of people with schizophrenia, in that approximately 20%–25% of individuals make a complete recovery (defined as absence of psychotic symptoms and return to the level of functioning present before illness onset) and 40%–45% achieve social recovery (economic and residential independence and low social disruption). Gitlin and colleagues (12) found similar outcomes among individuals with bipolar disorder who were on a maintenance level of pharmacotherapy; 27% did not relapse. Angst and Sellaro (1) found slightly lower rates of recovery and remission in their review of studies of bipolar disorder.
Less is known about patterns or predictors of recovery trajectories. Cortese and colleagues (13) found three longitudinal patterns of the clinical course of psychotic disorders (including schizophrenia and bipolar disorder) over a 12-month period, which they called positive incline, stable, and fluctuating. Strauss and colleagues (14) also found evidence of longitudinal patterns indicating recovery; the most important predictors of greater severity of psychiatric symptoms and disability were the percentage of time experiencing psychotic symptoms in the first two years, younger age at study entry, and a baseline diagnosis of schizophrenia (in contrast to acute schizophrenia or bipolar disorder). The purpose of this study was to identify trajectories of recovery from serious mental illness and their predictors.
Methods
Setting
The setting for this study was Kaiser Permanente Northwest (KPNW), a not-for-profit integrated health plan serving about 480,000 members in Oregon and Washington State. KPNW provides outpatient and inpatient medical and behavioral health care and maintains an integrated electronic medical record that contains comprehensive administrative and treatment data for all of its members. Clinicians are salaried employees of either the health plan or the Permanente Medical Group.
Study design
The Study of Transitions and Recovery Strategies (STARS) was a mixed-methods, exploratory, longitudinal study of recovery among individuals with schizophrenia, schizoaffective disorder, bipolar disorder, or affective psychosis. Participants completed four in-depth interviews (two at baseline; one each one year and two years after study enrollment) and three paper-and-pencil questionnaires (one each at baseline, one year, and two years). In-depth interviews covered a wide range of domains, including mental health history, experiences affecting mental health and recovery, psychiatric symptoms, and mental health care. We also sought information about relationships with family and friends, current life circumstances, and role models that influenced participants’ recovery processes. Questionnaires assessed quality of life, happiness, psychiatric symptoms, recovery progress, stigmatizing experiences, substance use, typical activities of work and home life, living situation, and sociodemographic characteristics. Questionnaire data were linked to health plan records of diagnoses and service use. The study was approved and monitored by KPNW’s Institutional Review Board and Research Subjects Protection Office. All participants provided informed consent prior to participation.
Participant identification, inclusion criteria, and recruitment
Individuals with diagnoses of schizophrenia, schizoaffective disorder, bipolar disorder, or affective psychosis were identified via health plan membership and diagnostic records. Additional inclusion criteria were a minimum of 12 months of health plan membership before study enrollment, age ≥16 years, and no plans to leave the area for 12 months. Individuals were excluded if they had a diagnosis of dementia, mental retardation, or organic brain syndrome or if their mental health clinician felt they were unable to participate at the time of recruitment.
Health plan records identified a pool of 1,827 eligible participants. Of these, we attempted to recruit 418 individuals to reach our planned sample size. Potential participants were stratified according to diagnosis (mood versus schizophrenia spectrum diagnoses) and gender, and they were selected randomly within strata to achieve roughly equal representations of men and women within each diagnostic group. Recruitment letters were signed by the principal investigator and members’ primary mental health clinician (213 clinicians signed letters). Of the potential participants whose letters were sent to clinicians for review, 16% were screened out as unable to participate; 15 clinicians did not return letters, which eliminated 17 individuals from the pool. We mailed letters to potential participants over a ten-month period and telephoned individuals who did not contact us after receiving the letter. Of the 418 individuals to whom we sent letters, we made contact with 350, received 127 refusals, found 22 individuals ineligible, and enrolled 184. This represented 46% of those who were eligible and surpassed our recruitment goal of 170 participants. Of the 184 enrolled, three did not complete the baseline interviews and four were excluded because study staff determined that diagnoses were in error. Thus the total number of participants was 177.
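The recruitment figures above can be checked with a short calculation. This is only a sketch: the counts are taken from the text, but the denominator used for the 46% enrollment rate (letters sent minus individuals found ineligible) is an assumption, because the text does not state how "those who were eligible" was computed. All variable names are illustrative.

```python
# Counts reported in the recruitment paragraph.
letters_sent = 418
found_ineligible = 22
enrolled = 184
no_baseline_interview = 3
diagnosis_in_error = 4

# Assumption: "those who were eligible" means everyone who was sent a
# letter, minus those later found ineligible.
eligible = letters_sent - found_ineligible
enrollment_rate = enrolled / eligible

# Final analytic sample after post-enrollment exclusions.
final_sample = enrolled - no_baseline_interview - diagnosis_in_error

print(f"eligible: {eligible}")
print(f"enrollment rate: {enrollment_rate:.0%}")
print(f"final sample: {final_sample}")
```

Under this assumption the calculation reproduces both the reported enrollment rate and the final sample size.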
Participants
Study participants were 92 women (52%) and 85 men (48%); mean±SD age at baseline was 48.8±14.8 years (range 16–84). Sample distributions for age and sex, within diagnosis, did not differ from the eligible population. Additional descriptive information is provided in the Results section and in published reports (15, 16).
Data sources and measures
Self-reported standardized measures used in analyses presented here include the Recovery Assessment Scale (17); the Wisconsin Quality of Life Index (W-QLI) (18–20); the National Opinion Research Center’s General Social Survey happiness question (21); a modified version of Link and colleagues’ (22) stigma measures assessing perceived devaluation and discrimination, rejection experiences, secrecy, and withdrawal-employment; the Medical Outcomes Study 12-Item Short-Form health inventory (SF-12) (23); the Colorado Symptom Index (24, 25); the Patient Activation Measure for Mental Health (26); Drake and colleagues’ (27) self-report rating scales for assessing alcohol and drug use and consequences; regularity of use and satisfaction with psychiatric medications (from the W-QLI); satisfaction with mental health clinician (clinicians’ interest and attention, competence and skills, and amount of information and explanation provided); quality of care (28); perceived stress; number of traumatic events experienced as an adult; health practices; and medication use.
Quality-of-care indicators included calls returned within 24 hours, able to see a clinician when desired, treated with respect, adequate time during visits, adequate explanation during visits, help developing own treatment goals, providers sensitive to cultural background, given information on services available, given information about rights as a consumer, and given adequate information to handle condition. Health practices included exercise frequency, smoker or not, and self-reported alcohol or drug problem. Medication use included number of psychiatric medication classes and number of medication starts and stops.
Interviewer-rated measures included the DSM-IV Global Assessment of Functioning (GAF) (29) and the Brief Psychiatric Rating Scale (30, 31) subscales for conceptual disorganization, excitement, motor retardation, blunted affect, tension, mannerisms and posturing, uncooperativeness, emotional withdrawal, motor hyperactivity, and distractibility.
In the interviews, we asked participants about the history of their mental health problems, including when they first began to “feel different.” These responses were used to assess age of symptom onset, and we converted responses to a categorical variable (always, during grade school, during high school, and as an adult); we were unable to obtain this information from 6% of participants (N=10). To assess illness severity, we asked about the worst symptoms participants had experienced and their best year since receiving their psychiatric diagnosis. On the basis of each participant’s descriptions, interviewers completed GAF ratings for these two periods.
Measures derived from health plan data included continuity-of-care measures for most frequently seen mental health provider, calculated according to Chien and colleagues’ (32) methods; ICD-9-CM diagnoses for mental and substance use–related disorders; counts of mental health outpatient visits; and a count of psychoactive and associated medications dispensed, linked to create episodes of medication use and continuity of use, according to Johnson and McFarland’s (33) methods. All data collection occurred between April 2003 and February 2008.
Managing missing data
For scales, missing data were handled according to each instrument’s instructions; if no instructions were available, we required valid responses for at least 75% of the items to compute scales. We then used the expectation-maximization (EM) algorithm to estimate missing values for outcome variables. Given that this method is sensitive to the variables included for imputation, we used as predictors only scale scores and Likert-type items with less than 20% missing values. Before using the EM algorithm, we regressed these variables on each outcome variable for each time point to ensure that the equations explained an adequate amount of variance. The variance explained in each of the outcome variables ranged from 57% to 71%, a sufficient amount to support replacement.
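The 75% rule for computing scales can be illustrated with a minimal sketch. The function name `score_scale` is hypothetical, and scoring a scale as the mean of its answered items is an assumption; the text does not say whether scales were scored as means or sums, and instruments with their own missing-data instructions would follow those instead. The EM imputation step is not shown.

```python
from typing import Optional, Sequence


def score_scale(
    items: Sequence[Optional[float]], min_valid: float = 0.75
) -> Optional[float]:
    """Return the mean of answered items, or None if too few were answered.

    `None` marks a missing item response. The scale is scored only when
    at least `min_valid` (here 75%) of the items have valid responses.
    """
    valid = [x for x in items if x is not None]
    if not items or len(valid) / len(items) < min_valid:
        return None  # too much missing data to compute the scale
    return sum(valid) / len(valid)


# Three of four items answered (75%): the scale is scored.
scored = score_scale([4, 5, None, 3])
# Two of four items answered (50%): the scale is left missing.
unscored = score_scale([4, None, None, 3])
print(scored, unscored)
```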
Analyses
Our data included multiple measures designed to assess different dimensions of recovery. Thus for analytic purposes, we conceptualized recovery as a latent construct, based on seven measures (total score on the Recovery Assessment Scale, the SF-12 social functioning subscale, the SF-12 role emotional subscale, the W-QLI occupational subscale, GAF ratings, and the General Social Survey happiness question). We then computed factor analyses, using principal axis factoring, saving factor scores for each participant for each wave of interviews. Using the resulting factor scores, we fit a quadratic function of time for each participant and calculated the intercept, linear slope, and quadratic slope of that participant’s recovery trajectory. We then entered the intercept, linear slope, and quadratic slope parameters into a hierarchical cluster analysis, using Ward’s method with the squared Euclidean distance measure, followed by K-means cluster analysis, to identify groups of individuals with similar recovery trajectories. To assist in understanding cluster differences at baseline, we computed analyses of variance (ANOVAs) between each of the recovery measures and cluster membership. We then used discriminant analyses to explore relationships between cluster membership and blocks of conceptually related variables (evaluated as change scores for the follow-up interviews). Blocks of potentially discriminating variables included psychiatric symptoms, physical health status, satisfaction with mental health clinician, mental health service use, quality of care, resources and strains (social support, married or cohabiting, satisfaction with finances, perceived stress, patient activation, number of traumatic events experienced as an adult, and stigma and discrimination), health practices, medication use, and satisfaction with medication (among participants taking medications).
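The trajectory and clustering steps can be sketched as follows. This is an illustration under stated assumptions rather than the study's actual procedure: the three waves are coded as time 0, 1, and 2 (the text does not give the time coding), so a quadratic passes exactly through each participant's three factor scores; the Ward hierarchical step that selects the number of clusters is omitted; and K-means is a plain Lloyd's algorithm started from centers supplied by the caller. The function names and the example factor scores are hypothetical.

```python
from typing import List, Sequence, Tuple

Params = Tuple[float, float, float]  # (intercept, linear slope, quadratic slope)


def quadratic_coefficients(y0: float, y1: float, y2: float) -> Params:
    """Coefficients of y(t) = a + b*t + c*t**2 through scores at t = 0, 1, 2."""
    quad = (y2 - 2.0 * y1 + y0) / 2.0
    linear = y1 - y0 - quad
    return (y0, linear, quad)


def sq_dist(p: Sequence[float], q: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(
    points: Sequence[Params], centers: Sequence[Params], max_iter: int = 100
) -> List[int]:
    """Lloyd's algorithm: assign each point to its nearest center, then
    recompute centers as member means, until assignments stop changing."""
    centers = list(centers)
    labels: List[int] = []
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the previous center if a cluster is empty
                centers[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


# Hypothetical factor scores at baseline, follow-up 1, and follow-up 2.
scores = [(1.0, 1.0, 1.0), (1.1, 0.9, 1.0), (-1.0, -1.0, -1.0), (-0.9, -1.1, -1.0)]
params = [quadratic_coefficients(*s) for s in scores]
labels = k_means(params, centers=[params[0], params[2]])
print(params)
print(labels)
```

In practice, the number of clusters and the starting centers would come from the hierarchical solution, and the three parameters might be standardized before clustering; neither detail is specified in the text.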
Results
Basic demographic information appears in Table 1.
Recovery factor analyses
The factor analyses for each wave of interviews produced single factors with eigenvalues greater than 1 and accounted for 44%, 43%, and 45% of the variance at baseline, follow-up 1, and follow-up 2, respectively. All variables had strong factor loadings, ranging from .58 to .81 across time points.
Cluster analysis of recovery trajectories
We used each individual’s recovery factor score at each of the three time points to compute the cluster analysis (N=164, because critical data were missing for some individuals). We selected four clusters on the basis of the agglomeration schedule, dendrogram, and interpretability.
Figure 1 illustrates the mean recovery factor scores at each time point for each cluster. Using the baseline ANOVA results and the pattern of the trajectory, we named these clusters “high, stable” (N=46), “moderate-high, fluctuating” (N=36), “moderate-low, fluctuating” (N=43), and “low, stable” (N=39).
Table 2 shows mean baseline values on the recovery measures for the four clusters and presents ANOVA results for each. Results were remarkably consistent across clusters, with the high-stable group generally showing the highest levels of recovery and better scores on measures of functioning, followed by the moderate-high and moderate-low fluctuating groups and the low-stable group.
Table 3 describes the clusters in terms of demographic characteristics and history of mental health problems at baseline. Clusters did not differ on gender, adjusted household income, education level, disability status, most racial-ethnic categories, mental diagnosis, mental health or addiction-related diagnoses, antipsychotic medication use, or history of psychiatric hospitalization.
Some differences were apparent, however. Individuals in the low-stable and moderate-low fluctuating clusters were younger than those in the other two clusters, and participants in the low-stable and moderate-high fluctuating clusters were less likely than those in the other clusters to be currently employed or students. We found a higher proportion of Asian or Pacific Islanders in the moderate-low fluctuating cluster. The clusters also differed on three indicators of severity of mental health problems. Participants in the low-stable and moderate-low fluctuating clusters reported earlier ages at which they first “felt different” (age-of-onset indicator) and lower lifetime “worst” GAF scores compared with high-stable and moderate-high fluctuating cluster members. Low-stable members had the least, and high-stable members the greatest, difference between “worst” GAF score and GAF score at baseline. Finally, the groups differed on levels of mental health–related patient activation, with the high-stable and low-stable groups having the highest and lowest levels, respectively.
Discriminant analyses for trajectory clusters
To understand the relationships between key variables and membership in recovery trajectory cluster, we explored these relationships using discriminant analyses. Because we had multiple measures of related constructs, variables were tested in blocks. We examined variables collected at baseline and their changes from baseline to follow-up 1 and from follow-up 1 to follow-up 2.
The blocks tested at baseline included psychiatric symptoms, physical health, satisfaction with mental health clinician, quality of care, mental health service use, resources and strains, health practices, medication use (number of psychiatric medication classes and number of starts and stops), and medication satisfaction for those who were taking medications. Means for discriminant functions and associated canonical correlations for significant blocks are presented in Table 4. The canonical correlation is a measure of the degree of relationship between the block of variables and group membership, with higher values reflecting stronger relationships.
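To give a sense of what a canonical correlation expresses, the sketch below covers the simplest case only: a single discriminating variable, for which the canonical correlation reduces to the square root of the between-group sum of squares divided by the total sum of squares. The blocks in this study contained several variables, which requires a full discriminant analysis; the function name and the data here are hypothetical.

```python
from math import sqrt
from typing import Dict, Sequence


def canonical_correlation_1d(groups: Dict[str, Sequence[float]]) -> float:
    """Canonical correlation between one variable and group membership.

    With a single variable this equals sqrt(SS_between / SS_total).
    """
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    if ss_total == 0:
        raise ValueError("the variable has no variance")
    ss_between = sum(
        len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
        for vals in groups.values()
    )
    return sqrt(ss_between / ss_total)


# Hypothetical symptom scores for two clusters with well-separated means.
r = canonical_correlation_1d({"high": [1.0, 2.0, 3.0], "low": [4.0, 5.0, 6.0]})
# Identical group means: membership carries no information about the variable.
r_none = canonical_correlation_1d({"high": [1.0, 3.0], "low": [2.0, 2.0]})
print(round(r, 3), r_none)
```

Values near 1 indicate that the variable separates the groups almost completely, and values near 0 indicate that group means barely differ relative to the overall spread.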
In general at baseline, participants in the high-stable cluster had fewer psychiatric symptoms, better physical health, greater satisfaction with mental health clinicians, more resources and fewer strains, greater medication satisfaction, and lower levels of service use. In contrast, the low-stable cluster fared worse on all of these constructs except symptoms and service use, where their scores were highest. Those in the moderate-low fluctuating cluster had more psychiatric symptoms and worse physical health at baseline; were moderate on satisfaction with mental health clinician, resources and strains, and medication satisfaction; and used more mental health services. Participants in the moderate-high fluctuating cluster had moderate psychiatric symptoms, satisfaction with mental health clinician, resources and strains, and medication satisfaction; higher physical health scores; and lower use of services. Level of satisfaction with mental health clinicians, resources and strains, and medication satisfaction (among those taking medications) appeared to differentiate the two clusters that were lower at baseline. The moderate-low fluctuating cluster was associated with higher satisfaction with mental health clinicians, better resources (and lower strains), and greater medication satisfaction than the low-stable cluster.
For follow-up 1 we used change scores from baseline to follow-up 1 for each set of variables and tested the same sets of conceptually derived blocks. Cluster membership was significantly associated with changes in psychiatric symptoms, physical health, and resources and strains.
Participants in the high-stable cluster had a slight increase from baseline to follow-up 1 in psychiatric symptoms and a slight decline in physical health. They also had a reduction in resources and increased strains. Those in the moderate-high fluctuating cluster had increased psychiatric symptoms, worse physical health, and worse resources and strains from baseline to follow-up 1, whereas those in the moderate-low fluctuating cluster improved in all three areas. The low-stable cluster had no change in psychiatric symptoms or physical health but improved in resources and strains.
In the final set of discriminant analyses, we tested change from follow-up 1 to follow-up 2 using the same set of blocks as in previous analyses. Change in psychiatric symptoms and medication use were the only significant blocks for these analyses. Individuals in the high-stable and low-stable clusters showed little change in psychiatric symptoms. In contrast, symptoms decreased among those in the moderate-high fluctuating cluster and increased among those in the moderate-low fluctuating cluster. Individuals in the moderate-low fluctuating cluster also showed increased medication starts and stops at follow-up 2 compared with follow-up 1, whereas those in the low-stable cluster had fewer medication starts and stops. Participants in the high-stable and moderate-high fluctuating clusters showed little change in medication use.
Discussion
We found evidence for four recovery trajectories: two stable (high and low) and two fluctuating (higher and lower). Analyses of cluster characteristics at baseline suggest that few demographic or diagnostic factors differentiated the clusters. There were two exceptions. Older individuals were more likely to be further along in the recovery process, as expected, given opportunities for learning and adapting to chronic illness that occur over time (35, 36). In addition, participants in the low-stable cluster had the lowest activation levels, had experienced the worst lifetime symptom levels compared with the other groups, and were less likely to be working or students.
Discriminant analyses were useful for understanding trajectory cluster membership. At baseline, we found differences in psychiatric symptoms, physical health, satisfaction with mental health clinicians, resources and strains, satisfaction with medications, and service use. Generally, those with higher scores on our recovery factors had fewer psychiatric symptoms, better physical health, greater satisfaction with their mental health clinicians, fewer strains and greater resources, less service use, and greater medication satisfaction. In addition, there was a trend toward receiving care of better quality among those with higher recovery levels. The most consistent factor predicting recovery trajectory was psychiatric symptoms and changes in those symptoms. Changes in resources and strains and use of psychiatric medications were also predictive of recovery trajectories, but less consistently.
Our sample differed from persons in public-sector settings, so generalizability may be limited. Participants were more likely to be married and employed and to have higher education and income levels than individuals receiving care in the public sector. It may be that individuals with such characteristics are more likely to have access to a private health plan. An alternative explanation, however, is that good clinical relationships and long-term continuity of care (34) affected these outcomes. Another limitation may result from our decision to have mental health clinicians screen participants for ability to participate at the time of recruitment. It is possible that individuals with lower levels of recovery may have been more likely to be screened out as part of this process.
Acknowledgments and disclosures
This research was supported by grant R01 MH062321 (Recoveries From Severe Mental Illness) from the National Institute of Mental Health. The authors thank Elizabeth Shuster and Jeff Showell for help with data extraction, and interviewers Sue Leung, Ph.D., Alison Firemark, M.A., L.P.C., and Micah Yarborough, M.A., for conducting excellent interviews, for help with developing the qualitative coding scheme, and for coding interviews.
The authors report no competing interests.