Accurately identifying young people at risk for developing schizophrenia or related psychotic illnesses is the critical first step for developing effective prevention programs (1–5). At present, preventive research typically focuses on individuals who have symptoms associated with the prodromal or prepsychotic phases of illness (6–8). The primary goal is to preemptively intervene to slow, modify, or, optimally, stop progression to full illness (5). Moreover, establishing risk characteristics prior to onset of psychosis generates critically needed information about underlying mechanisms. Although several major rating systems are used extensively around the world (9–11), the Structured Interview for Prodromal Syndromes (SIPS), along with its companion rating scale, the Scale of Prodromal Symptoms, is most widely adopted throughout North America (12–14). At present, the North American rating system identifies three separate prodromal subgroups, defined by having 1) attenuated positive symptoms, considered an early, subtle form of the hallmark symptoms of psychosis; 2) positive symptoms at full psychotic intensity for brief, limited periods, referred to as “brief intermittent psychotic symptoms”; or 3) a trait/state combination of genetic risk or schizotypy accompanied by functional decline, labeled “genetic risk and deterioration.” Subgroups and criteria are similar to those specified in the Australian rating system (15), and in both, the three subgroups are combined into a single high-risk entity. However, the rates of conversion to psychosis among high-risk subjects so defined have remained moderate to low (2). One factor possibly contributing to the moderate predictive validity of the high-risk criteria is the implicit but untested assumption that the three subgroups are alike in risk factors and developmental pathways. Combining these three populations may mask or confound predictors and affect overall conversion rates (16) and, as a result, contribute to inconsistencies across sites. For example, patients with intermittent psychosis are by definition more seriously impaired and develop full-blown psychosis at a much higher rate than do individuals at genetic high risk (16). It is possible that a more accurate set of predictors may emerge by deconstructing the overall high-risk category back into its three major subgroups.
A second possible factor limiting predictive accuracy is reliance on attenuated positive symptoms as the primary criterion defining risk. Several studies (17–24) have reported a number of biobehavioral abnormalities (e.g., cognitive, functional, social) that appear, ad hoc, to improve prediction. However, none of these potential indicators has as yet been incorporated into the selection of high-risk subjects, a task that continues to depend on the SIPS or comparable symptom criteria despite the problematic conversion rates (2). One factor limiting inclusion of vulnerability indicators in the high-risk criteria is the marked inconsistency across studies in the specific variables identified. The selection of domains for inclusion in prediction research has varied substantially from study to study, and even within similar domains, measures are often not comparable. Predictive accuracy might be improved by adding theory-driven deficits in domains other than positive symptoms, as we did in the present study, in which domains selected for prediction were derived from a theoretical neurodevelopmental model.
In this study, individuals in both the genetic-risk and the brief-intermittent-psychosis groups were excluded. Individuals were included only if they exhibited attenuated positive symptoms at clinically significant severity levels as defined by the SIPS. Duration of symptoms was treated as a dependent variable and was not limited to an onset or worsening over the year prior to baseline, as specified by the criteria in standard use. In addition, selection of potential predictor variables was based primarily on the set of vulnerability indicators included in the neurodevelopmental model described by Cornblatt and colleagues at the Recognition and Prevention program in New York (1), and consistent with Weinberger (25), who proposed that a biologically based vulnerability resulting from genetic abnormalities is a major etiological component of psychosis. Four domains were proposed in the model to be major contributors to the underlying vulnerability: 1) cognitive deficits, 2) anxiety and depression, 3) social difficulties leading to increasing isolation, and 4) role functioning problems at school or work. These domains were originally selected because they had been broadly implicated in all phases of schizophrenia and appeared modifiable with early treatment and thus seemed optimal for early intervention (1).
We hypothesized that adding the selected vulnerability indicators to SIPS clinical criteria would improve predictive accuracy in a homogeneous subgroup of at-risk young people who were selected according to attenuated positive symptom severity and who were also within the optimum age window (ages 12–22, prior to typical onset age). A major goal of this study was to validate the selected vulnerability indicators for the specific at-risk population tested, with the long-term expectation that, if replicated, any one or more of these indicators might be added to the clinical high-risk criteria. The applicability of this approach to treatment is also illustrated. This involves further stratifying the initial at-risk sample according to the risk index derived here and then indicating how interventions might be developed to directly target the different stratified groups.
Method
Sample
The data reported here were collected as part of the Recognition and Prevention (RAP) program of the North Shore-Long Island Jewish Health System in New York, first funded by the National Institute of Mental Health in 2000. The RAP program is divided into two phases. Phase I participants, the focus of the present analyses, were recruited from 2000 to 2006, with follow-up completed in 2010. In phase II, an independent replication sample was recruited from 2006 to 2012, and follow-up is expected to be completed in December 2016. Of the 192 treatment-seeking adolescents between 12 and 22 years old in RAP phase I, 101 participants met clinical high-risk criteria, which require one or more moderate to severe attenuated positive symptoms (i.e., unusual thoughts, suspiciousness, grandiosity, perceptual abnormalities, disorganized communication) as rated on the SIPS (scores of 3–5 on a scale of 0–6) (12–14). Individuals with a score of 6 (severe and psychotic) on any positive symptom item are excluded. Symptom duration is measured as time from when an attenuated symptom first appears as moderate to severe but not at psychotic levels, based on all information available at baseline. Adolescents who have only attenuated negative symptoms or who meet criteria for brief intermittent psychosis or for psychotic disorder not otherwise specified are excluded from the present analyses, although they are included in the larger RAP study (see reference 1).
Healthy comparison subjects (N=68) were recruited through announcements in local newspapers and bulletin boards at the medical center. Comparison subjects had to be 12–22 years old and English speaking. Exclusion criteria were a schizophrenia spectrum diagnosis, a mood disorder with psychosis, a medical or neurological disorder affecting the brain, an estimated IQ <70, or a first-degree relative with an axis I psychotic disorder.
Patient referrals were made to the RAP program by affiliated outpatient and inpatient psychiatry departments, local mental health providers, school psychologists or counselors, or by self-referral.
All procedures were approved by the North Shore-Long Island Jewish Health System Institutional Review Board. Written informed consent (with assent from participants under age 18) was obtained from all participants.
Baseline Measures
Full details of the baseline neurocognitive and clinical assessment have been reported elsewhere (26–29). All participants were evaluated on a comprehensive battery of clinical, behavioral, functional, and neurocognitive measures, including the Schedule for Affective Disorders and Schizophrenia for School-Age Children–Epidemiologic Version (30) for axis I disorders and the SIPS for prodromal ratings. The Beck Depression Inventory (31) and the Beck Anxiety Scale (32) were used to measure self-reported symptoms of depression and anxiety. Social and role (academic or work) functioning was assessed using the Global Functioning: Social and Role scales (33). Neuropsychological measures assessed eight cognitive domains: processing speed, verbal memory, executive function, working memory, visuospatial processing, motor speed, sustained attention, and language (27–29).
Follow-Up Procedures
Follow-up interviews included SIPS ratings, the two global functioning scales, and the depression and anxiety self-reports. Conversion to psychosis was defined as the presence of a psychotic-level positive symptom (a score of 6 with a minimum duration of 1 week). Of the initial 101 subjects considered to be at clinical high risk, 92 (91%) had follow-up clinical ratings. The remaining nine subjects were lost to follow-up. The mean follow-up period (time to conversion to psychosis or to last follow-up in those who did not convert to psychosis) was 3.0 years (SD=1.6).
Statistical Analysis
All analyses were conducted in SPSS, version 16.0 (SPSS, Inc., Chicago). Comparisons of demographic and clinical characteristics were performed with analysis of variance for continuous variables and chi-square tests or Fisher’s exact test for categorical variables (two-tailed, p<0.05). The cumulative incidence rates of transition to psychosis during the follow-up period were estimated with Kaplan-Meier survival analysis (34).
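The Kaplan-Meier estimate mentioned above can be illustrated with a minimal Python sketch. This is not the SPSS procedure used in the study; the function name `kaplan_meier` and the toy follow-up data are hypothetical and serve only to show how censored participants (those who did not convert before their last follow-up) stay in the risk set until they leave the study.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  -- follow-up time per participant (e.g., years)
    events -- 1 if conversion was observed at that time, 0 if censored
    """
    conversions = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)          # everyone leaving the risk set at t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = conversions.get(t, 0)
        if d:
            # Multiply by the conditional probability of not converting at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]         # drop converters and censored cases
    return curve

# Hypothetical toy data, NOT the study data.
times = [0.5, 1.0, 1.0, 2.0, 3.0, 4.0]
events = [1, 0, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
# Cumulative incidence is the complement of the survival function.
cumulative_incidence = [(t, 1.0 - s) for t, s in curve]
```

In this sketch, a participant censored at the same time as a conversion is still counted as at risk for that conversion, which is the usual convention.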
A Cox proportional hazards model was used to evaluate the effects of potential predictors on outcome event (conversion) and time to each outcome event (time to transition). Predictor variables were generated by the RAP neurodevelopmental model (neurocognition, depression/anxiety, social and role deficits at baseline and follow-up). In addition, clinical characteristics were entered from the SIPS, and included total scores for the four scales (positive, negative, disorganized, and general symptoms) and the durations of positive and negative symptoms. Demographic variables were also included (age, race, gender, ethnicity, and socioeconomic status), as well as comorbidity and medications at baseline and follow-up (see Tables S1 and S2 in the data supplement that accompanies the online edition of this article).
Consistent with the analyses described by Cannon et al. (17) and Ruhrmann et al. (20), in the first stage of variable selection, all potential variables were computed individually in a univariable Cox regression analysis at a more liberal statistical threshold (p<0.25) (35). Variables that remained after the initial screening procedure were entered into domain-specific regressions (p<0.10). A final multivariable model was built with the remaining variables using backward stepwise selection (likelihood ratio method; p<0.05). The −2 log-likelihood ratio test was used to evaluate the overall significance of the predictive equation. The Wald chi-square statistic was used to test the significance of individual variables in the model. Model selection was guided by the Akaike information criterion modified for survival analyses (36). Bootstrap resampling with replacement (B=10,000 bootstrap samples) was used to internally validate the final prediction model (see the online data supplement) (37).
After deriving a final Cox regression model, a weighted risk index was calculated by multiplying each coefficient estimate from the model by the corresponding observed value for each individual (i.e., $b_1X_1 + b_2X_2 + \dots + b_kX_k$). The risk index was then converted into an estimated probability of converting to psychosis with the inverse logistic function. Model discrimination and diagnostic accuracy were determined with the C statistic (area under the curve). Model calibration was assessed with the Hosmer-Lemeshow (38) goodness-of-fit test (p≥0.10). Missing data were handled using expectation-maximization estimates (39). Overall, 4.9% of the data (392 of 8,004 values) were missing. No significant relationship was found between the patterns of missing data and outcome.
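The risk index calculation described above can be sketched as follows. The coefficients and predictor coding here are hypothetical placeholders, not the estimates from the final model (those are reported in Table 3), and the sketch assumes the plain inverse logistic transform with no intercept or centering, which the text does not specify.

```python
import math

def risk_index(coefficients, values):
    """Weighted sum b1*X1 + b2*X2 + ... + bk*Xk for one individual."""
    if len(coefficients) != len(values):
        raise ValueError("need one observed value per coefficient")
    return sum(b * x for b, x in zip(coefficients, values))

def inverse_logistic(z):
    """Map a risk index on the real line to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients for four binary predictors (illustration only).
hypothetical_b = [1.0, 0.8, 0.6, 0.4]
# One hypothetical individual: first two predictors present, last two absent.
observed_x = [1, 1, 0, 0]

index = risk_index(hypothetical_b, observed_x)
probability = inverse_logistic(index)
```

A higher index always maps to a higher estimated probability, so ranking individuals by index or by probability gives the same order.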
Results
Table 1 presents baseline demographic and clinical data for the 92 participants at clinical high risk and the 68 healthy comparison subjects. There were no significant differences between groups in age, years of education, gender, or socioeconomic status. Healthy comparison subjects had higher estimated premorbid and current IQ, although the high-risk participants were well within the normal IQ range.
Clinical Characteristics
The total scores for all four SIPS scales are listed in Table 1. As expected, the healthy comparison subjects had either no symptoms or low symptom levels (scores of 0–2), in contrast to the high-risk participants, who had clinically significant scores across the four scales. In terms of functioning, healthy comparison subjects exhibited good to above-average social and role functioning (scores close to 9), whereas the clinical high-risk participants’ scores fell in the moderate to seriously impaired range (scores of 6.5 to 5.5). Compared with healthy comparison subjects, the high-risk participants had significantly higher symptom scores across self-report and interview measures of depression, anxiety, and substance abuse.
Duration of Symptoms Prior to Study Entry
As shown in Table 1, the clinical high-risk participants exhibited both positive and negative symptoms considerably longer than the 1 year specified by the SIPS for attenuated positive symptoms (3.07 years for positive symptoms and 4.50 years for negative symptoms). No significant differences were found in conversion rates for participants with a symptom duration of 1 year or less (rate=24.1%, SE=0.095, N=28) compared with those with a longer prebaseline symptom duration (rate=20.0%, SE=0.065, N=57).
Medication
Of the participants who had already been treated for at least 4 weeks before study entry, about 30% received antidepressants, 20% antipsychotics, 10% anxiolytics, and 4% mood stabilizers. At last follow-up, the percentages of high-risk participants receiving antipsychotics (39.1%; p<0.001), antidepressants (56.5%; p<0.001), and mood stabilizers (14.1%; p=0.04) were significantly higher than at baseline, indicating active pharmacological treatment throughout the follow-up period.
Neurocognition
The high-risk participants showed overall poorer neurocognitive performance than the healthy comparison subjects (Wilks’s lambda=0.77; F=5.32, df=8, 146, p<0.001). As shown in Table 2, the largest effect size differences were seen in verbal memory and processing speed.
Conversion Rates
Figure 1 presents the Kaplan-Meier estimate of the survival function for time to onset of psychosis in the clinical high-risk group. The mean time from baseline assessment to conversion to psychosis was 1.78 years (SD=1.40, median=1.46). The cumulative incidence rate of conversion was 5.6% for year 1 (SE=0.02, 95% CI=1.28–9.91), 11.9% for year 2 (SE=0.04, 95% CI=4.84–18.96), 16.6% for year 3 (SE=0.04, 95% CI=8.17–25.0), and 19.4% for year 4 (SE=0.05, 95% CI=9.60–29.20). There were no additional conversions during year 5. One additional conversion occurred during year 6, resulting in a cumulative rate of 28.3% (SE=0.10, 95% CI=9.68–46.92).
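As a worked check, the reported intervals appear consistent with a normal-approximation interval of the form rate ± 1.96 × SE. That reading is an inference from the numbers, not something stated in the text, and the helper name `normal_interval` is hypothetical. Using the year 4 figures (19.4%, SE=0.05, i.e., 5.0 percentage points):

```python
def normal_interval(rate_pct, se_pct, z=1.96):
    """Normal-approximation confidence interval, in percentage points."""
    half_width = z * se_pct
    return rate_pct - half_width, rate_pct + half_width

# Year 4 cumulative incidence as reported: 19.4%, SE = 5.0 percentage points.
low, high = normal_interval(19.4, 5.0)
```

The result rounds to 9.6 and 29.2, matching the reported year 4 interval. For other years the printed SE is rounded to two decimals, so recomputing from it will not reproduce the interval exactly.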
Prediction of Conversion
The final predictor model, adjusting for age, consisted of four variables: disorganized communication (any score above 2), suspiciousness (a rating of 5), verbal memory deficits (2 standard deviations below normal), and declining social functioning (as assessed with the Global Functioning: Social scale, from baseline to last follow-up). Surprisingly, Kaplan-Meier analyses indicated that the youngest age group (ages 12–14, N=34) had a very low conversion rate of 3% (N=1). Therefore, the final multivariable model was adjusted for baseline age. Combining the four variables resulted in an area under the curve of 91.9% (95% CI=85.7–98.0, p<0.001), indicating an excellent discriminative ability, with a sensitivity of 60.0%, a specificity of 97.4%, and a positive predictive value of 81.8%. (The negative predictive value was 92.6%; the positive likelihood ratio was 23.1, and the negative likelihood ratio was 0.41.) To determine whether active medication affected rate of conversion, medication at baseline and follow-up were added to the final prediction model. Treatment with antipsychotics, antidepressants, or mood stabilizers did not significantly predict conversion.
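The accuracy figures above are all functions of a single 2×2 classification table. The sketch below shows the arithmetic. The cell counts are not reported in the text; they are back-calculated here as one set of counts (9 true positives, 6 false negatives, 2 false positives, 75 true negatives, N=92) that reproduces the reported percentages, and should be treated as an assumption.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard accuracy measures from a 2x2 classification table."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                       # positive predictive value
        "npv": tn / (tn + fn),                       # negative predictive value
        "lr_pos": sensitivity / (1 - specificity),   # positive likelihood ratio
        "lr_neg": (1 - sensitivity) / specificity,   # negative likelihood ratio
    }

# Assumed counts, back-calculated from the reported percentages (not reported data).
metrics = diagnostic_metrics(tp=9, fn=6, fp=2, tn=75)
```

With these assumed counts the function returns a sensitivity of 60.0%, specificity of 97.4%, positive predictive value of 81.8%, negative predictive value of 92.6%, and likelihood ratios of 23.1 and 0.41 after rounding.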
The estimates from the final Cox regression model were used to generate a risk index score and subsequent probability of conversion for each individual (Table 3). Figure 2 presents risk index scores derived from the regression weights plotted with the probability of conversion on an individual case level for those who converted to psychosis and those who did not. The figure illustrates a high accuracy level, with participants who did not convert to psychosis nearly all having low scores and most participants at the high end of the curve having converted to psychosis. Confusion in risk status is shown primarily for subjects falling in the middle, where index scores were not fully consistent with converter status. The Hosmer-Lemeshow statistic was 3.80 (p=0.88), suggesting a well-calibrated predictive model.
Discussion
Our results indicate that a profile consisting of four variables (disorganized communication, suspiciousness, verbal memory, and declining social functioning) can significantly improve accuracy in predicting future psychosis in a selected, homogeneous risk population. In this study, based on current high-risk criteria alone (presence of one or more attenuated positive symptoms), the rate of conversion approached 30% with longitudinal follow-up, a rate that is consistent with the literature (2). However, when the expanded predictor profile was used instead of the standard symptom criteria, accuracy of prediction (positive predictive value) increased to 81.8%. This level of increased accuracy mirrors similar results previously reported (17–22; see the summary in reference 40). Nevertheless, there is insufficient consensus across studies as to which specific vulnerability indicator (e.g., cognitive, social) should be added to the clinical high-risk criteria. As a result, the clinical criteria have not changed substantively since they were originally introduced. Adding criteria to improve predictive accuracy would be the next step toward moving the field forward (41).
Neurodevelopmental Model as the Basis for Prediction
The present findings support the neurodevelopmental RAP model proposal that abnormalities other than attenuated positive symptoms have the potential to predict future psychosis. Of the four vulnerability indicators included in the model, verbal memory and social functioning were identified as critical predictors of psychosis. Contrary to expectation, however, depression/anxiety and role functioning did not predict future psychosis. Duration of symptoms prior to baseline, comorbidity, substance abuse, and medication also did not play a role in predicting psychosis. Similarly, none of the symptoms in the remaining three SIPS scales (negative, disorganized, and general) entered into the prediction model. Despite consistent evidence that cognitive deficits characterize all phases of psychosis, including the prodrome (42–44), few studies have included cognition in predictor analyses. In one of the few exceptions, Riecher-Rössler et al. (19) found processing speed to be a major predictor of the onset of psychosis. In the present study, by contrast, verbal memory but not processing speed predicted the development of psychosis. Also contrary to expectation, a low conversion rate (3%) was observed among at-risk subjects ranging in age from 12 to 14 years, suggesting that including very young adolescents may be a factor in the inconsistent rates reported across studies. Overall, these findings highlight the problem of sample specificity, where predictors differ depending on the sample under study.
In contrast with the majority of potential vulnerability indicators, impaired functioning is one domain that has been associated with conversion to psychosis in a number of studies (17–19, 45–47). However, as pointed out by Nelson et al. (22), functioning is a complex construct in its own right and can incorporate many factors. Previous studies have typically relied on the Global Assessment of Functioning to generate predictor scores in the functioning domain. However, there is no way to parse these global scores into their social, clinical, and role components and thus no way to directly tailor intervention to the deficit most in need of remediation. In the present study, functioning was divided into two specific dimensions, social and role, with both domains assessed independently of clinical characteristics. These two independent aspects of overall functioning appear to have substantially different predictor potential. Social difficulties start at a relatively young age, and evidence of further decline over adolescence appears to be a solid predictor of illness (48). Role functioning appears to be more responsive to environmental conditions (e.g., change of school) than any of the other vulnerability indicators and may be primarily a state factor, most directly reflective of poor long-term functional outcome or disability (27, 28, 48).
Personalized Prediction
The goal of prediction studies is to improve prevention by increasing the accuracy of positive identification and thus increasing the precision of intervention. According to the strategy proposed here, it may be most feasible to initially generate predictors of psychosis that are specific to narrower, well-defined populations than is currently the case, with subsequent replication across similar populations being essential. Validation would additionally depend on the effectiveness of interventions tailored to the specific deficits identified, also across similar homogeneous groups of subjects. Developing accurate prediction would then involve a gradual progression from a specialized, local predictor system to one that can be generalized to a more diverse population.
This study provides an example of how this might proceed. The first step is to improve prediction by calculating a risk index for a homogeneous high-risk group (e.g., use of a narrow definition and a restricted age range) from a set of theoretically derived potential predictors. The second step is to apply the resulting index to stratify risk within the overall group, consistent with the approach used by Ruhrmann et al. (20, 49, 50). The risk index score distribution generated in the present sample suggests that adolescents with the lowest scores can be considered to be at minimal risk and those with scores at the high end at very high risk, with the risk of the middle group somewhat uncertain. The third step is then to personalize intervention to individuals, for example, by monitoring low-risk individuals, providing psychosocial treatment for those with intermediate risk, and providing more aggressive intervention, possibly involving medication, for those with the highest risk (i.e., those who exhibit serious verbal memory problems and social decline in addition to disorganized communication and high levels of suspiciousness).
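The three-tier stratification outlined above could be sketched along the following lines. The cut points of 0.20 and 0.60 are hypothetical values chosen only for illustration; the study does not report specific thresholds, and any real cut points would need to be derived and validated on the relevant sample.

```python
def stratify(probability, low_cut=0.20, high_cut=0.60):
    """Assign an illustrative intervention tier from an estimated probability.

    low_cut and high_cut are hypothetical thresholds, not values from the study.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie between 0 and 1")
    if probability < low_cut:
        return "monitor"
    if probability < high_cut:
        return "psychosocial treatment"
    return "more intensive intervention"

# Hypothetical estimated probabilities for three individuals.
tiers = [stratify(p) for p in (0.05, 0.40, 0.85)]
```

Keeping the thresholds as parameters makes it straightforward to re-derive the tiers if a replication sample suggests different cut points.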
Translating a statistical index into clinical terms is complex. The preliminary findings from this study suggest that for our adolescent sample, individuals are at particularly high risk if they display clinically significant evidence of disorganized communication, very high levels of suspiciousness, verbal memory problems that would be more severe than those displayed by at least 95% of the subjects tested, and a substantial decline in social interactions (in the case of the Global Functioning scales used in this study, a decline of one point or more). Conceptual disorganization, as measured by the SIPS, is weighted most heavily in the predictor algorithm. Considered the precursor to formal thought disorder, conceptual disorganization is characterized by speech observed to be perseverative and tangential and by a lack of ability to organize one’s thoughts, ranging from vague or confusing speech to not giving direct answers to questions. Severe levels of suspiciousness on the SIPS (a score of 5), with the second highest algorithm weight, are exemplified by beliefs, without any objective evidence, that others are intentionally provoking or intending to harm the individual. Preoccupation with such beliefs sometimes leads to confrontations and verbal altercations with others. The verbal memory deficit is associated with extremely impaired performance on the California Verbal Learning Test, a measure of list learning. In the present sample, difficulty encoding auditory verbal information appears to reflect a developmental cognitive abnormality that severely disrupts academic achievement in most of the adolescents at the high end of the risk spectrum. Lastly, the majority of cases with high risk index scores had problems with social functioning. While this is a common problem in treatment-seeking adolescents, the participants considered at particularly high risk had few friends at study entry and continued to withdraw from relationships over the follow-up period, with a trajectory of decline in social interactions.
It should be emphasized that application of the algorithm for screening in community and clinical settings should be done with considerable caution at this point, since predictors weighted on specific high-risk samples do not apply to other populations. For example, outreach to general school populations that recruits individuals who are not treatment seeking may generate a wholly different sample for which the algorithm used here would not be valid.
Limitations
It may be too early in the course of prodromal research to generate predictor profiles that will generalize across many different sites or populations. Developing site-specific predictor profiles that may have direct clinical application, as proposed here, has a number of limitations, the need for cross-validation being a primary one. In the short term, it will be necessary to determine whether the risk index used in this study will generalize to similar homogeneous clinical high-risk populations recruited independently either at the same or at different sites. Applicability to samples selected on the basis of the more heterogeneous standard criteria is a longer-term goal. The personalized prediction reported is intended to be an example of how a model such as the one used here might be applied in a clinical setting to identify individuals most and least in need of aggressive intervention based on sample-specific deficit profiles. This strategy will need to be evaluated in terms of the success of interventions structured to accommodate the deficits specifically identified. A limitation of the model illustrated here is that while those at very high and very low risk are clearly identified, there is more ambiguity among subjects in the middle range; these individuals may require additional screening, most likely involving biological measures, such as neuroimaging or genetics, for more conclusive accuracy of prediction (51, 52).
Acknowledgments
The authors thank the study participants and staff of the RAP program for their time and effort from the very onset of these studies. The authors also thank Pradeep Nagachandran, M.D., for his assistance in carrying out this study.