Priority Data Letter
Published Online: 4 October 2023

Are We There Yet? Predicting Conversion to Psychosis Using Machine Learning

Publication: American Journal of Psychiatry
The ability to accurately predict conversion to psychosis from clinical and other measurable features of an at-risk state is critically important to clinicians. To demonstrate clinical utility, prediction models should predict conversion with at least 80% sensitivity and specificity (1) with concurrently high positive and negative predictive value (PPV/NPV). Over the past decade, substantial progress has been made in this area with the development of “risk calculators,” which consider various demographic, clinical, and neurocognitive factors in addition to family history to predict future conversion (e.g., 2, 3). The most well-studied of these, the calculator based on the second North American Prodrome Longitudinal Study (NAPLS-2), achieved a 71% model concordance index (analogous to the area under a receiver operating characteristic curve) (2).
These encouraging results helped motivate the NAPLS-3 study (4). The NAPLS-3 includes longitudinal measurements from 710 individuals at clinical high risk for psychosis and 96 age- and sex-matched healthy control participants (4). To our knowledge, the ability of the features specified in the NAPLS-2 calculator to predict conversion in the NAPLS-3 sample has not yet been evaluated. We thus examined the ability of these features, as well as cortisol (assessed at baseline), to predict conversion in clinical high-risk individuals using various linear (e.g., Cox proportional hazards regression, logistic regression, support vector machine) and nonlinear (e.g., random forest) machine learning algorithms. We hypothesized that these features would predict conversion with performance in line with models from other data sets, with some variability depending on the machine learning algorithm. We also hypothesized that nonlinear machine learning methods would perform qualitatively better than linear machine learning methods due to their ability to model complex nonlinear relationships.

Participants

The NAPLS-3 is an NIMH-funded study conducted at nine sites. All participants provided written informed consent, including parental consent for minors. The study was approved by all sites’ institutional review boards.
A detailed description of NAPLS-3 participants (including exclusion criteria) is provided in Addington et al. (4). Briefly, 710 clinical high-risk individuals and 96 individuals in the healthy control group were recruited and followed for up to 2 years, with some longer exceptions (see Results). Participants were between 12 and 30 years old. Predictors included those used by the NAPLS-2 calculator (riskcalc.org/napls; see Table 1 for list). As a recent study found that salivary cortisol improved prediction in the NAPLS-2 (5), we examined models both with and without cortisol as a predictor. Participants in the healthy control group and clinical high-risk participants who lacked follow-up data were not included in machine learning models.
TABLE 1. Healthy control group and clinical high-risk group demographic and clinical information, excluding participants with missing data (except for cortisol) [a]

| Measure | Healthy control (HC) group: N or mean | HC: SD | Clinical high-risk (CHR) group: N or mean | CHR: SD | CHR versus HC (t or χ2) | p | Missing from CHR group (N) |
|---|---|---|---|---|---|---|---|
| Starting sample (N) | 96 | | 710 | | | | |
| No follow-up data (N) | 8 | | 76 | | | | |
| Missing baseline data (N) | 3 | | 36 | | | | |
| Final sample (N) | 85 | | 598 | | | | |
| Sex | | | | | 0.35 | 0.55 | 0 |
| Male (N) | 43 | | 323 | | | | |
| Female (N) | 42 | | 275 | | | | |
| First-degree relative with psychosis [b] | | | | | 17.61 | <0.001 | 9 |
| Yes (N) | 1 | | 117 | | | | |
| No (N) | 84 | | 481 | | | | |
| Age in years [b] | 19.20 | 4.31 | 18.79 | 4.03 | 0.86 | 0.39 | 0 |
| BACS symbol coding raw score [b] | 60.61 | 10.72 | 54.74 | 13.21 | 4.58 | <0.001 | 15 |
| HVLT raw score [b] | 27.60 | 4.2 | 26.39 | 5.21 | 2.05 | 0.041 | 16 |
| Number of trauma types [b] | 0.64 | 1.12 | 1.87 | 1.61 | −8.91 | <0.001 | 7 |
| Decrease in global social functioning score over the past year [b] | 0.58 | 0.78 | 1.03 | 0.97 | −4.14 | <0.001 | 0 |
| Number of undesirable life events [b] | 5.87 | 4.19 | 9.56 | 4.86 | −7.44 | <0.001 | 12 |
| SIPS delusions plus suspicions [b] | 0.44 | 0.81 | 6.80 | 1.87 | −54.65 | <0.001 | 0 |
| Salivary cortisol, μg/dL [b] | 0.12 | 0.08 | 0.14 | 0.13 | −1.84 | 0.067 | 11 |
| Days from baseline to conversion | | | 278.0 | 286.1 | | | |
| Converters (N) | 0 | | 62 | | 9.69 | 0.002 | 0 |
| Non-converters (N) | 85 | | 536 | | | | |

[a] T and chi-square tests comparing converters versus non-converters were not performed as this would constitute a circular analysis vis-à-vis the machine learning-based modeling. Numbers in parentheses represent the standard deviation unless otherwise stated. BACS=Brief Assessment of Cognition in Schizophrenia; HVLT=Hopkins Verbal Learning Test; SIPS=Structured Interview for Psychosis-Risk Syndromes.
[b] Included as features in Cox regression and machine learning models.

Consistent with prior work (6), conversion to psychosis was defined as meeting the Presence of Psychotic Symptoms criteria: one of the five SIPS Scale of Psychosis-Risk Symptoms positive symptoms must reach a psychotic level of intensity (rated 6) for ≥1 hour per day for 4 days per week during the past month in the clinical high-risk individual, and/or the clinical high-risk person must show that these symptoms seriously impact their functioning.

Analyses

First, as performed previously (2), a Cox proportional-hazards regression analysis was performed using these predictors (SAS v.9.4) to examine consistency with prior NAPLS-2 findings.
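As a rough illustration of the concordance index reported for the Cox models, the sketch below computes Harrell's C in plain Python. It is not the SAS procedure used in the study: the function name `harrell_c`, the toy follow-up data, and the choice to skip pairs with tied follow-up times are all assumptions made for illustration.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    times: follow-up time per participant (e.g., days)
    events: 1 if conversion was observed, 0 if censored
    risk_scores: model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the two times differ and the
        # earlier time is an observed event (simplification: ties skipped).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier converter had the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: two converters, three censored participants.
times = [100, 300, 500, 730, 730]
events = [1, 1, 0, 0, 0]
risk = [0.9, 0.4, 0.6, 0.2, 0.1]
c_index = harrell_c(times, events, risk)
print(round(c_index, 3))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is why the index is read as analogous to the area under a receiver operating characteristic curve.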
For machine learning, standard algorithms were employed using Weka software (University of Waikato, New Zealand) and included logistic regression, naive Bayes, a three kernel support vector machine, KStar, J48 decision tree, random forest, decision stump (with 100 iterations of AdaBoost), and multilayer perceptron. Classifier accuracies were calculated by averaging performance across 100 random assortments of 90% training data and 10% test data for each algorithm. Individuals with missing data were excluded from analysis. Because of class imbalance (see Results), training data for the minority (converter) class were upsampled before model training using the Synthetic Minority Oversampling Technique (SMOTE) (7); the minority class was oversampled by 400%, with k (the number of nearest neighbors) set to five. We also determined feature importance ranking for the best classifier based on contributions to receiver operating characteristic area under the curve.
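The oversampling step can be sketched in a few lines of plain Python. This is a minimal illustration of the SMOTE idea, not the Weka filter used in the study: the function name `smote`, the two-feature toy rows, and the reading of "400%" as four synthetic rows per original converter are assumptions.

```python
import math
import random


def smote(minority, n_per_sample=4, k=5, seed=0):
    """Return synthetic minority-class rows.

    n_per_sample=4 is taken here to correspond to 400% oversampling;
    k is the number of nearest minority-class neighbors considered.
    """
    rng = random.Random(seed)
    synthetic = []
    for i, x in enumerate(minority):
        # k nearest minority neighbors of x (Euclidean), excluding x itself.
        others = [m for j, m in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda m: math.dist(x, m))[:k]
        for _ in range(n_per_sample):
            nb = rng.choice(neighbors)
            gap = rng.random()  # position on the segment from x to nb
            synthetic.append([xi + gap * (ni - xi) for xi, ni in zip(x, nb)])
    return synthetic


# Hypothetical converter rows: [symptom score, verbal learning score].
converters = [[7.0, 24.0], [8.0, 22.0], [9.0, 20.0],
              [6.0, 27.0], [10.0, 25.0], [7.5, 21.0]]
new_rows = smote(converters)
print(len(new_rows))  # 24: four synthetic rows per original converter
```

In a repeated 90%/10% evaluation, rows like these would be generated from the training portion only, so that the held-out test cases remain real observations.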

Results

Demographic and clinical information for participants (including the healthy control group) is provided in Table 1 and Table 2. As previously reported (8), relative to the healthy control group, clinical high-risk participants had lower Brief Assessment of Cognition in Schizophrenia (BACS) symbol coding and Hopkins Verbal Learning Test (HVLT) scores, more trauma, greater decrease in social functioning over the past year (i.e., prior to baseline), more undesirable life events, higher SIPS delusions plus suspicions score, and greater salivary cortisol. A higher percentage of clinical high-risk participants also had a first-degree relative with psychosis.
TABLE 2. Converter versus non-converter demographic and clinical information, excluding participants with missing data (except for cortisol) [a]

| Measure | Converters (N=62): N or mean | Converters: SD | Non-converters (N=536): N or mean | Non-converters: SD | Missing from converters (N) | Missing from non-converters (N) |
|---|---|---|---|---|---|---|
| Sex | | | | | 0 | 0 |
| Male (N) | 35 | | 288 | | | |
| Female (N) | 27 | | 248 | | | |
| First-degree relative with psychosis [b] | | | | | 3 | 6 |
| Yes (N) | 15 | | 102 | | | |
| No (N) | 47 | | 434 | | | |
| Age (years) [b] | 19.31 | 4.04 | 18.73 | 4.02 | 0 | 0 |
| BACS symbol coding raw score [b] | 50.29 | 15.64 | 55.26 | 12.82 | 4 | 11 |
| HVLT raw score [b] | 24.87 | 5.89 | 26.57 | 5.10 | 5 | 11 |
| Number of trauma types [b] | 1.90 | 1.89 | 1.86 | 1.57 | 2 | 5 |
| Decrease in global social functioning score over past year [b] | 1.11 | 1.04 | 1.02 | 0.97 | 0 | 0 |
| Number of undesirable life events [b] | 9.35 | 5.94 | 9.58 | 4.72 | 3 | 9 |
| SIPS delusions plus suspicions [b] | 7.61 | 1.74 | 6.71 | 1.87 | 0 | 0 |
| Salivary cortisol (μg/dL) [b] | 0.13 | 0.09 | 0.15 | 0.13 | 3 | 8 |

[a] T and chi-square tests comparing converters versus non-converters were not performed as this would constitute a circular analysis vis-à-vis the machine learning-based modeling. Numbers in parentheses represent the standard deviation unless otherwise stated. BACS=Brief Assessment of Cognition in Schizophrenia; HVLT=Hopkins Verbal Learning Test; SIPS=Structured Interview for Psychosis-Risk Syndromes.
[b] Included as features in Cox regression and machine learning models.

Examining conversion rates, out of 598 clinical high-risk participants with complete data, 62 converted and 536 did not over the course of the follow-up period. The average time from baseline to conversion was 278 days, with a range of 4 to 1,361 days. Four clinical high-risk individuals converted more than 2 years after their baseline assessment.
Results of the Cox regression analysis without cortisol suggested that the overall model was significant (likelihood ratio χ2=26.04, p=0.001, Harrell’s concordance index=0.70 [SE=0.04], mean specificity [across time]=0.67, mean sensitivity=0.62, mean PPV=0.15, mean NPV=0.95). Including cortisol did not substantially improve the model (likelihood ratio χ2=24.93, p=0.003, Harrell’s concordance index=0.70 [SE=0.03], mean specificity=0.54, mean sensitivity=0.75, mean PPV=0.14, mean NPV=0.96).
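To see why PPV can stay low even when sensitivity and specificity are moderate, the sketch below derives PPV and NPV from sensitivity, specificity, and the conversion rate using the standard confusion-matrix identities. The function name `predictive_values` is hypothetical, and plugging in the time-averaged sensitivity and specificity from the model without cortisol is a simplification, so the output is only expected to approximate the reported means.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV implied by sensitivity, specificity, and prevalence."""
    tp = sensitivity * prevalence              # true positives (proportion)
    fn = (1 - sensitivity) * prevalence        # false negatives
    tn = specificity * (1 - prevalence)        # true negatives
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    return tp / (tp + fp), tn / (tn + fn)


# 62 converters among 598 clinical high-risk participants.
prevalence = 62 / 598
ppv, npv = predictive_values(0.62, 0.67, prevalence)
print(round(ppv, 2), round(npv, 2))
```

With roughly one converter per ten participants, false positives from the much larger non-converter group outnumber true positives, which pulls PPV down while leaving NPV high.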
Machine learning performance metrics for each machine learning algorithm are provided in Table 3. Briefly, all models performed significantly above chance. The algorithm that showed the best overall performance was random forest. Including cortisol as a predictor did not appreciably alter performance metrics of most algorithms. Feature importance in order of greatest to lowest for the random forest algorithm was as follows: baseline SIPS P1 and P2 (delusions plus suspiciousness), HVLT raw score, number of undesirable life events, number of trauma types, BACS symbol coding raw score, decrease in global social functioning over the past year, age, having a first-degree relative with psychosis, and cortisol.
TABLE 3. Predicting conversion to psychosis from baseline NAPLS-3 clinical/demographic data and cortisol levels using various machine learning methods [a]

| Method | Accuracy | 95% CI | Sensitivity | 95% CI | Specificity | 95% CI | PPV | 95% CI | NPV | 95% CI |
|---|---|---|---|---|---|---|---|---|---|---|
| Without cortisol | | | | | | | | | | |
| Random forest | 90 | 89–90 | 79 | 78–80 | 96 | 95–96 | 92 | 91–92 | 89 | 88–89 |
| J48 decision tree | 85 | 84–86 | 76 | 75–77 | 90 | 89–91 | 82 | 81–83 | 87 | 86–87 |
| KStar | 78 | 78–79 | 77 | 76–78 | 79 | 79–80 | 69 | 68–70 | 86 | 85–86 |
| SVM (3 Kernel) | 74 | 74–74 | 41 | 40–41 | 93 | 93–94 | 78 | 77–80 | 73 | 73–73 |
| MLP [b] | 73 | 72–74 | 48 | 45–50 | 88 | 87–89 | 70 | 69–72 | 75 | 74–75 |
| Decision stump with AdaBoost | 71 | 71–72 | 43 | 40–46 | 88 | 87–89 | 68 | 66–70 | 73 | 72–74 |
| Naive Bayes | 70 | 69–70 | 45 | 45–46 | 84 | 83–84 | 62 | 61–62 | 73 | 72–73 |
| Logistic regression | 68 | 67–68 | 35 | 34–35 | 87 | 86–87 | 61 | 60–62 | 70 | 70–70 |
| With cortisol | | | | | | | | | | |
| Random forest | 89 | 89–90 | 76 | 75–77 | 97 | 97–97 | 94 | 93–94 | 88 | 87–88 |
| J48 decision tree | 84 | 83–85 | 74 | 72–76 | 90 | 89–90 | 81 | 80–82 | 86 | 85–87 |
| Decision stump with AdaBoost | 75 | 74–76 | 57 | 56–58 | 85 | 84–87 | 70 | 68–72 | 78 | 78–79 |
| KStar | 75 | 75–76 | 77 | 76–78 | 75 | 74–75 | 64 | 63–64 | 85 | 85–86 |
| SVM (3 Kernel) | 72 | 72–72 | 37 | 36–38 | 92 | 91–92 | 73 | 72–75 | 72 | 72–72 |
| MLP [b] | 71 | 70–72 | 54 | 51–58 | 81 | 79–82 | 63 | 61–64 | 76 | 75–77 |
| Logistic regression | 66 | 66–67 | 30 | 29–31 | 87 | 87–87 | 58 | 56–59 | 69 | 69–69 |
| Naive Bayes | 65 | 64–65 | 56 | 55–57 | 70 | 69–70 | 52 | 51–52 | 74 | 73–74 |

[a] Values are percentages. Methods are sorted from highest to lowest accuracy. Confidence intervals are measured over 10 repetitions of random 90% training/10% test data allocations. NPV=negative predictive value; PPV=positive predictive value.
[b] Multilayer perceptron (MLP) with two hidden layers (five nodes in the first and two in the second).
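
One plausible way to obtain intervals like those in Table 3 from repeated random splits is a normal-approximation interval around the mean of the per-repetition scores. The source does not state how its intervals were computed, so the method, the function name `mean_with_ci`, and the accuracy values below are assumptions for illustration only.

```python
import math
import statistics


def mean_with_ci(values, z=1.96):
    """Mean and normal-approximation 95% CI across repetitions."""
    mean = statistics.fmean(values)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, mean - z * sem, mean + z * sem


# Hypothetical test-set accuracies from 10 random 90%/10% splits.
accuracies = [0.90, 0.88, 0.91, 0.89, 0.92, 0.87, 0.90, 0.91, 0.89, 0.90]
mean, lower, upper = mean_with_ci(accuracies)
print(f"{mean:.3f} ({lower:.3f} to {upper:.3f})")
```

Because repeated splits of the same sample overlap heavily, an interval of this kind describes variability across splits rather than uncertainty about performance in new participants.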

Discussion

As expected and previously reported (8), clinical high-risk participants in the NAPLS-3 had a greater percentage of first-degree relatives with psychosis, worse neurocognition, more trauma and deleterious life events, greater decrease in social functioning prior to baseline, and higher levels of psychotic symptoms compared to the healthy control group. Clinical high-risk participants also had higher cortisol, possibly indicative of greater chronic stress levels compared to the healthy control group. Cox regression performance was comparable to previous clinical high-risk studies (2, 9). Logistic regression performance (66%–68% accuracy, depending on inclusion of cortisol) was in line with prior studies (2, 3, 5, 9–13). All machine learning algorithms performed above chance, with accuracies 65% and higher. As hypothesized, linear methods (Cox regression, logistic regression, support vector machine) showed worse performance compared with most nonlinear methods (e.g., random forest). Furthermore, the highest performing algorithm (random forest with or without cortisol) achieved ∼90% accuracy while maintaining >75% sensitivity and >85% specificity, PPV, and NPV. Baseline SIPS delusions plus suspiciousness score was found to be the most important predictor.
Although it was expected that all algorithms would perform better than chance at predicting conversion to psychosis in clinical high-risk individuals, it was somewhat surprising to find that the best algorithm (random forest) performed at such a high level given that previous studies suggest that these features predict conversion with accuracies (or metrics related to accuracy, e.g., concordances) between ∼70% and 80% (2, 3, 5, 9–13). Notably, however, the majority of these studies used regression-based modeling to predict conversion (logistic regression performed worse than most other methods in this study), and no studies used the random forest algorithm. What aspects of the random forest may have enhanced performance to this degree? First, unlike most classifiers, a random forest is an “ensemble” classifier, in which the predictions (converter or non-converter) of several decision trees are tallied and the majority vote is used to make an overall prediction (14). These individual trees are composed of random combinations of features, such that each tree makes its vote independent of and decorrelated from all others. The decision boundary induced by a random forest is therefore highly nonlinear compared with some other methods (e.g., logistic regression). Because not all the features are used in each tree, the random forest is relatively immune to the “curse of dimensionality,” where increasing the number of features causes overfitting unless the sample size is also exponentially increased in parallel. Averaging the votes of decision trees also helps reduce the overall variance. As the generalizability of this performance enhancement is unclear, an interesting future direction would be to apply the random forest algorithm to predict conversion in clinical high-risk individuals using other data sets (e.g., the NAPLS-2).
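The majority-vote mechanism described above can be illustrated with a deliberately simplified ensemble. In this sketch each "tree" is a one-split decision stump fit on a bootstrap sample using a single randomly chosen feature, whereas a real random forest (14) grows deeper trees and samples a feature subset at every split. The function names and the toy data are hypothetical, and this is not the Weka implementation used in the study.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """Best single-threshold split on one feature, by training accuracy."""
    best = None
    for threshold in sorted({r[feature] for r in rows}):
        for above in (0, 1):  # label predicted when value > threshold
            preds = [above if r[feature] > threshold else 1 - above
                     for r in rows]
            correct = sum(p == y for p, y in zip(preds, labels))
            if best is None or correct > best[0]:
                best = (correct, threshold, above)
    _, threshold, above = best
    return feature, threshold, above


def fit_forest(rows, labels, n_trees=25, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw rows with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        # Random feature choice keeps the stumps decorrelated.
        feature = rng.randrange(len(rows[0]))
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest


def predict(forest, row):
    """Tally each stump's vote and return the majority label."""
    votes = Counter(above if row[f] > t else 1 - above
                    for f, t, above in forest)
    return votes.most_common(1)[0][0]


# Hypothetical rows: [symptom score, verbal learning score]; 1 = converter.
rows = [[8, 20], [9, 22], [7, 21], [8, 23],
        [3, 28], [2, 30], [4, 27], [3, 29]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
forest = fit_forest(rows, labels)
print(predict(forest, [9, 19]), predict(forest, [2, 31]))
```

An odd number of stumps avoids tied votes in the two-class case, and any single stump fit on an unlucky bootstrap sample is outvoted by the rest, which is the variance-reduction effect the paragraph describes.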
Limitations of the present analyses were the small sample size (particularly for converters) and heterogeneity of sample outcome (time to conversion ranged from 4 to 1,361 days). The imbalanced data set also necessitated the use of a minority class oversampling procedure (SMOTE) to prevent models from defaulting to predict the majority class (results without SMOTE showed poor sensitivity and PPV [data not shown]). The converter/non-converter distribution for training models in this study may not be representative of the general clinical high-risk population. Our result also requires replication in an independent data set to determine if overfitting occurred during machine learning as a result of SMOTE. Overall, however, the relatively high level of performance of random forest and other methods suggests that when features selected from previous, independent studies are combined with modern machine learning methods, performance levels of clinical outcome prediction may approach the performance standards needed for a predictive biomarker that provides early identification of individuals likely to transition to psychosis. Provided these results can be replicated in other clinical high risk data sets, researchers can thus begin searching for the primary causes of this transition while preparing for delivery of palliative care. In the context of study limitations, when asking “are we there yet?” in regard to the development of predictive biomarkers for psychiatric practice, the answer may be, “We’re on the way, but we need more data.”

References

1. First MB, Drevets WC, Carter C, et al: Clinical applications of neuroimaging in psychiatric disorders. Am J Psychiatry 2018; 175:915–916
2. Cannon TD, Yu C, Addington J, et al: An individualized risk calculator for research in prodromal psychosis. Am J Psychiatry 2016; 173:980–988
3. Fusar-Poli P, Rutigliano G, Stahl D, et al: Development and validation of a clinically based risk calculator for the transdiagnostic prediction of psychosis. JAMA Psychiatry 2017; 74:493–500
4. Addington J, Liu L, Brummitt K, et al: North American Prodrome Longitudinal Study (NAPLS 3): methods and baseline description. Schizophr Res 2022; 243:262–267
5. Worthington MA, Walker EF, Addington J, et al: Incorporating cortisol into the NAPLS2 individualized risk calculator for prediction of psychosis. Schizophr Res 2021; 227:95–100
6. Addington J, Liu L, Buchy L, et al: North American Prodrome Longitudinal Study (NAPLS 2): the prodromal symptoms. J Nerv Ment Dis 2015; 203:328–335
7. Chawla NV, Bowyer KW, Hall LO, et al: Synthetic minority over-sampling technique. J Artif Intelligence Res 2002; 16:321–357
8. Zaks N, Velikonja T, Parvaz MA, et al: Sleep disturbance in individuals at clinical high risk for psychosis. Schizophr Bull 2022; 48:111–121
9. Carrión RE, Cornblatt BA, Burton CZ, et al: Personalized prediction of psychosis: external validation of the NAPLS-2 psychosis risk calculator with the EDIPPP project. Am J Psychiatry 2016; 173:989–996
10. Fusar-Poli P, Werbeloff N, Rutigliano G, et al: Transdiagnostic risk calculator for the automatic detection of individuals at risk and the prediction of psychosis: second replication in an independent National Health Service Trust. Schizophr Bull 2019; 45:562–570
11. Puntis S, Oliver D, Fusar-Poli P: Third external replication of an individualised transdiagnostic prediction model for the automatic detection of individuals at risk of psychosis using electronic health records. Schizophr Res 2021; 228:403–409
12. Oliver D, Wong CMJ, Bog M, et al: Transdiagnostic individualized clinically-based risk calculator for the automatic detection of individuals at-risk and the prediction of psychosis: external replication in 2,430,333 US patients. Transl Psychiatry 2020; 10:364
13. Moore TM, Calkins ME, Rosen AFG, et al: Development of a probability calculator for psychosis risk in children, adolescents, and young adults. Psychol Med 2021:1–9
14. Breiman L: Random forests. Machine Learn 2001; 45:5–32

Published In

American Journal of Psychiatry
Pages: 836 - 840
PubMed: 37789742

History

Received: 29 November 2022
Revision received: 13 March 2023
Revision received: 20 April 2023
Revision received: 30 May 2023
Accepted: 5 June 2023
Published online: 4 October 2023
Published in print: November 1, 2023

Keywords

  1. Psychosis see Schizophrenia Spectrum and Other Psychotic Disorders
  2. Machine Learning

Authors

Jason Smucny, Ph.D. [email protected]
Department of Psychiatry, University of California, Davis (Smucny, Carter); Department of Computer Science, University of California, Davis (Davidson).
Ian Davidson, Ph.D.
Cameron S. Carter, M.D.

Notes

Send correspondence to Dr. Smucny ([email protected]).

Acknowledgments

The authors thank the NAPLS-3 study investigators and research participants for making these data available for analysis on the NIMH Data Archive.

Competing Interests

The authors report no financial relationships with commercial interests.

Funding Information

This work was supported by NIMH grants K01-MH125096 and R01-MH122139.
