Rates of suicide deaths and suicidal thoughts and behaviors have risen by more than 50% among young people in the past decade (www.who.int/data/gho) (1, 2), making suicide the second leading cause of death among those ages 10–19 years (1, 3). Whereas individual, environmental, and clinical risk factors for suicidal thoughts and behaviors have been well established (4–13), including in the sample analyzed in the present study (14), these factors have demonstrated low predictive validity (15–17). In response, the number of studies examining neurobiological underpinnings of suicidal thoughts and behaviors has grown exponentially in the past two decades (18). Nevertheless, our understanding of the neural mechanisms underlying suicidal thoughts and behaviors and their utility remains poor, especially in young children, for several reasons.
First, it is still unclear whether findings from neuroimaging studies examining suicidal thoughts and behaviors apply to children, because most studies have been conducted using adult samples. Second, results of these studies have been inconsistent. Whereas systematic reviews on the topic suggest that suicidal thoughts and behaviors are associated with abnormalities in regions involved in affective processing and impulse regulation, the specific regions highlighted in each review differ, and all emphasize the modest sample sizes, heterogeneity, and lack of replicability across studies (18–21). In addition, meta-analyses of structural and functional imaging studies have failed to find differences between suicidal and non-suicidal participants (22–24), and those that found differences either were based on a small number of studies or reported inconsistent findings (24, 25). Third, it is unclear whether the effect sizes of any described neural correlates of suicidal thoughts and behaviors are large enough to have clinical utility. Studies with small sample sizes have limited power to detect differences (19). However, finding no difference does not mean that the difference equals zero; the observed effect size could still be large enough to be meaningful. On the other hand, studies with large sample sizes are better powered to detect small differences, yet the observed effect size of such differences may be too small for practical purposes (23). To examine whether an observed effect size is large enough to be considered meaningful, one can test for equivalence (26), an approach originally employed in the field of pharmacokinetics (27) with the aim of showing that a new, cheaper drug was practically as effective as an existing one.
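The logic of an equivalence test can be illustrated with the two one-sided tests (TOST) procedure. The following is a minimal sketch, not the analysis code used in this study: the function name `tost_welch` and the example numbers are hypothetical, the equivalence bounds are expressed in raw units, and a normal approximation replaces the t distribution, which is reasonable only for large groups.

```python
from math import sqrt
from statistics import NormalDist


def tost_welch(mean1, sd1, n1, mean2, sd2, n2, low, high, alpha=0.05):
    """Two one-sided tests for equivalence of two group means.

    Uses the Welch (unequal-variance) standard error and a normal
    approximation to the t distribution (an assumption of this sketch).
    Returns the TOST p-value and whether equivalence is concluded.
    """
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    diff = mean1 - mean2
    z = NormalDist()
    # First null hypothesis: the true difference is at or below `low`.
    p_lower = 1 - z.cdf((diff - low) / se)
    # Second null hypothesis: the true difference is at or above `high`.
    p_upper = z.cdf((diff - high) / se)
    # Equivalence is claimed only if BOTH one-sided nulls are rejected.
    p_tost = max(p_lower, p_upper)
    return p_tost, p_tost < alpha


# Hypothetical standardized data (SD = 1), bounds of -0.15 and 0.15.
p_big, equivalent_big = tost_welch(0.02, 1.0, 700, 0.0, 1.0, 7000, -0.15, 0.15)
p_small, equivalent_small = tost_welch(0.10, 1.0, 200, 0.0, 1.0, 7000, -0.15, 0.15)
```

In this illustration the larger group yields a narrow enough standard error to conclude equivalence, whereas the smaller group does not, even though neither difference would be statistically significant in a conventional test.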
In the present study, we employed data from a large population-based sample from the Adolescent Brain Cognitive Development (ABCD) study (https://abcdstudy.org) (28, 29) to examine the correlates of suicidal behaviors using a multi-informant approach. In children ages 9–10 years, we first examined the classification accuracy (i.e., predicted probability of an observation belonging to a class [e.g., suicide case compared with never-suicidal case]) of individual, environmental, and clinical correlates of suicidal thoughts and behaviors that have been shown to differ in the present sample (14). Next, we sought to identify associations between suicidal thoughts and behaviors and brain morphometry, functional connectivity at rest, and functional measures during three tasks. Specifically, we examined neural correlates of processes that have been implicated in suicidal behavior in clinical and at-risk samples of youths, namely, reward processing (30–33) with the monetary incentive delay task (34), inhibitory control (35, 36) with the stop-signal task (37), and working memory and affective processing (38–41) with an emotional version of the N-back task (42). We tested for differences in these measures using a traditional null hypothesis significance test and complemented our analyses with equivalence testing (26) to examine whether observed effect sizes were large enough to be considered meaningful based on a prespecified benchmark. Finally, we examined the ability of neural correlates to classify suicide cases in our sample and analyzed this in relation to non-imaging correlates of suicidal thoughts and behaviors.
Results
Prevalence of Suicidal Thoughts and Behaviors in the Sample
The four groups based on suicidal thoughts and behaviors were categorized as follows: child-reported suicidal thoughts and behaviors (N=684, 8.6%), caregiver-reported suicidal thoughts and behaviors (N=654, 8.2%), concordantly reported suicidal thoughts and behaviors (N=198, 2.5%), and never suicidal (N=6,854, 85.7%). Among participants with endorsed suicidal thoughts and behaviors, either by the child or the caregiver (N=1,140, 14.3%), there was an agreement of 17.4% (N=198).
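The agreement figure follows directly from the group counts. The short check below is only an arithmetic sketch; it assumes, as the reported totals imply, that the child-reported and caregiver-reported counts each include the concordant cases. Variable names are illustrative.

```python
# Counts reported in the text.
child_reported = 684
caregiver_reported = 654
concordant = 198
never_suicidal = 6854

# Cases endorsed by either informant (concordant cases counted once).
either = child_reported + caregiver_reported - concordant
total = either + never_suicidal

# Positive agreement: cases endorsed by both, out of cases endorsed by either.
agreement_pct = round(100 * concordant / either, 1)
either_pct = round(100 * either / total, 1)
```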
Lifetime suicidal ideation without attempts (including passive, active, and plans) by the child was endorsed by 577 children (7.2% of the sample, 84.4% of cases) and 615 caregivers (7.7% of the sample, 94% of cases). Lifetime suicide attempts by the child were endorsed by 107 children (1.3% of the sample, 15.6% of cases) and 39 caregivers (0.5% of the sample, 6% of cases). Current suicidal thoughts and behaviors in the child were endorsed by 183 children (2.3% of the sample, 27% of the cases) and 119 caregivers (1.5% of the sample, 18.2% of cases). The number of case subjects with lifetime suicidal thoughts and behaviors and family history of suicide attempt or death by suicide was 68 based on child reports (0.9% of the sample, 9.9% of cases) and 90 based on caregiver reports (1.1% of the sample, 13.8% of cases). The rate of suicidal behaviors reported by either the children or the caregivers and the rate of positive agreement for each item are presented in Table S2 in the online supplement.
Individual and Environmental Characteristics
A full report of suicidal thoughts and behavior correlates in the study sample has been published elsewhere (14). A summary of the descriptive statistics and differences of individual, environmental, and clinical characteristics between the suicidal thoughts and behaviors groups is presented in Table S3 in the online supplement. We further examined current and lifetime mental health problems among caregivers. Several variables differed at pFDR<0.05 between the three groups with suicidal thoughts and behaviors and the never-suicidal group. For these variables, we estimated effect sizes and classification accuracy with the AUPRC (Table 1). Absolute effect sizes (Cohen’s d) of non-imaging correlates of suicidal thoughts and behaviors ranged from 0.14 to 0.87 for child report, from 0.29 to 1.42 for caregiver report, and from 0.33 to 1.47 for concordantly reported suicidal thoughts and behaviors. Overall, the smallest effect sizes of significant correlates were for sex of the child, economic problems in the past year, and use of mental health services by the father. The largest effect sizes of significant correlates were for child-reported family conflict, child psychopathology (14), and caregiver current and lifetime mental health problems, especially in the mother. General psychopathological symptoms, behavioral disorders (i.e., oppositional defiant disorder and conduct disorder), and posttraumatic stress disorder (PTSD) had the largest effect.
However, classification accuracy was either very poor or no better than what would be expected by chance. Specifically, the AUPRC ranged from 0.04 to 0.20 for child-reported suicidal thoughts and behaviors, from 0.03 to 0.30 for caregiver-reported suicidal thoughts and behaviors, and from 0.01 to 0.15 for concordantly reported suicidal thoughts and behaviors.
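To see why an AUPRC close to the prevalence of the outcome amounts to chance-level classification, consider the simulation below. It is a hedged sketch rather than the procedure used in this study: it assumes a single normally distributed predictor with unit variance in both groups, a prevalence of 8.6% (the child-reported rate), and effect sizes of 0.14 and 0.87 (the reported range for child report). The helper names `average_precision` and `simulate` are hypothetical, and the AUPRC is computed as the step-wise average precision.

```python
import random


def average_precision(labels, scores):
    """Area under the precision-recall curve, computed as the mean of
    the precision values at the rank of each true case."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n_pos = sum(labels)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            hits += 1
            total += hits / rank
    return total / n_pos


def simulate(d, prevalence=0.086, n=20000, seed=0):
    """Simulate one unit-variance normal predictor whose mean is shifted
    by d in cases, and return its average precision."""
    rng = random.Random(seed)
    labels = [1 if rng.random() < prevalence else 0 for _ in range(n)]
    scores = [rng.gauss(d if y else 0.0, 1.0) for y in labels]
    return average_precision(labels, scores)


ap_chance = simulate(0.0)   # predictor unrelated to the outcome
ap_small = simulate(0.14)   # smallest reported child-report effect size
ap_large = simulate(0.87)   # largest reported child-report effect size
```

Under these assumptions, an uninformative predictor yields an average precision near the prevalence, and a small standardized difference moves it only slightly above that baseline.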
Differences and Equivalence of Neuroimaging Data
The sample sizes of the groups for each imaging modality analyzed are presented in Figure S1 in the online supplement. For each modality, we report the combined results of applying traditional null hypothesis Welch’s t tests and equivalence tests, after applying FDR correction for multiple comparisons. The distribution of results is shown in Figure 1.
Results at the corrected level and uncorrected level for all modalities are summarized in Tables S4 and S5 in the online supplement, along with brain measures, if any, that were statistically different and not statistically equivalent across two or more group comparisons.
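For readers unfamiliar with FDR correction, the sketch below shows how adjusted p-values can be obtained with the Benjamini-Hochberg step-up procedure. The text does not state which FDR procedure was applied, so the choice of Benjamini-Hochberg is an assumption, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from five tests.
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adjusted = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adjusted]
```

In this example the third and fourth tests are nominally significant before correction but not after it, which is the pattern that separates the uncorrected and corrected results summarized in the supplementary tables.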
Brain structural imaging.
Among the 86 regions examined, only the left bank of the superior temporal sulcus was found to be significantly thinner in the caregiver-reported suicidal thoughts and behaviors group than in the never-suicidal group after applying FDR correction (d=−0.17, 95% CI=−0.26, −0.08, pFDR=0.019) (Figures 1 and 2; see also Tables S6–S8 in the online supplement). In addition, on the basis of our prespecified bounds of ±0.15, this effect was large enough to be considered meaningful. Thickness in the superior temporal sulcus was not associated with having a parent with a history of suicide or attempt (t=−0.59, df=5969, p=0.557). However, it was associated with level of income (r=0.05, df=5679, p<0.001), having economic problems in the past year (t=3.11, df=6727, p=0.002, d=0.10), caregiver’s highest level of education achieved (r=0.05, df=6227, p<0.001), history of mental health hospitalizations in the mother (t=2.36, df=6016, p=0.018, d=0.14), and history of depression in either the mother or the father (t=2.13, df=5976, p=0.033, d=0.06). Nevertheless, in a multivariate analysis including caregiver-reported suicidal thoughts and behaviors as a predictor, only suicidal thoughts and behaviors remained significantly associated with lower thickness in the superior temporal sulcus (β=−0.04, t=−2.93, df=5040, p=0.003).
All the remaining regions were not statistically different (all pFDR values >0.05). Of these, most regions were statistically equivalent (i.e., effect sizes were practically zero) for the child-reported suicidal thoughts and behaviors comparison (62 regions [72.1%], d range=−0.07, 0.07) and for the caregiver-reported suicidal thoughts and behaviors comparison (76 regions [88.4%], d range=−0.06, 0.07). In contrast, for the concordantly reported suicidal thoughts and behaviors comparisons, all regions were found not to be statistically equivalent (i.e., 90% confidence intervals for Cohen’s d included zero and overlapped with at least one of the |0.15| bounds), with d ranging from −0.23 to 0.23 (Figure 1; see also Figures S2–S7 in the online supplement).
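The combined reading of difference and equivalence can be made concrete with confidence intervals. The sketch below takes a Cohen's d and its 95% confidence interval, recovers an approximate standard error, and checks whether the 90% interval lies entirely within the ±0.15 bounds. It assumes a symmetric, normal-theory interval and ignores FDR correction, so it illustrates the decision rule only; the function name `classify_effect` and the second example are hypothetical.

```python
from statistics import NormalDist


def classify_effect(d, ci95_low, ci95_high, bound=0.15):
    """Classify an effect as different and/or equivalent.

    different:  the 95% CI excludes zero (uncorrected).
    equivalent: the 90% CI lies entirely inside (-bound, +bound).
    """
    z = NormalDist()
    z95 = z.inv_cdf(0.975)
    z90 = z.inv_cdf(0.95)
    # Recover the standard error from the width of the 95% interval.
    se = (ci95_high - ci95_low) / (2 * z95)
    ci90 = (d - z90 * se, d + z90 * se)
    different = ci95_low > 0 or ci95_high < 0
    equivalent = -bound < ci90[0] and ci90[1] < bound
    return different, equivalent, ci90


# Superior temporal sulcus finding reported in the text.
sts = classify_effect(-0.17, -0.26, -0.08)

# Hypothetical near-zero effect with a narrow interval.
null_like = classify_effect(0.02, -0.05, 0.09)
```

Applied to the superior temporal sulcus result, the rule returns "different and not equivalent"; an effect whose 95% interval includes zero while its 90% interval crosses a bound would be neither different nor equivalent, which is the inconclusive pattern described for the concordant group.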
Resting-state functional imaging.
Among the 306 functional connectivity measures, none were statistically different after applying FDR correction (all pFDR values >0.05) (Figure 1; see also Tables S9–S11). In addition, most functional connectivity measures were statistically equivalent (i.e., effect sizes were practically zero) for the child-reported suicidal thoughts and behaviors comparison (170 [55.6%], d range=−0.04, 0.04). In contrast, for the caregiver-reported and concordantly reported suicidal thoughts and behaviors comparisons, all functional connectivity measures were found not to be statistically equivalent (i.e., 90% confidence intervals for Cohen’s d included zero and overlapped with at least one of the |0.15| bounds), with d ranging from −0.18 to 0.20 and −0.34 to 0.28, respectively (see Figures S8–S10 in the online supplement).
Task-based functional imaging.
Results of Welch’s t test and equivalence tests for each of the tasks and contrasts examined are shown in Figures 1 and 3 (see also Tables S12–S35 and Figures S11–S60 in the online supplement). Briefly, among the 167 region-of-interest mean activations examined for each of the three tasks and eight contrasts, none were statistically different after applying FDR correction (all pFDR values >0.05).
Of the region-of-interest mean activations, the number of statistically equivalent measures ranged between 0 and 126 (0%–75.4%, d range=−0.05, 0.05) for the child report comparison, 0 and 116 (0%–69.5%, d range=−0.04, 0.04) for the caregiver report comparison, and none for the concordantly reported suicidal thoughts and behaviors comparison (Figures 1 and 3; see also Figure S60 in the online supplement). No evidence of equivalence was found for 41–167 (24.6%–100%, d range=−0.17, 0.16), 51–167 (30.5%–100%, d range=−0.17, 0.20), and 167 (100%, d range=−0.34, 0.25) of region-of-interest mean activations for child-reported, caregiver-reported, and concordantly reported suicidal thoughts and behaviors comparisons, respectively. The monetary incentive delay task showed the highest rates of region-of-interest activations that were statistically equivalent, followed by the stop-signal task and the emotional N-back (Figures 1 and 3; see also Figure S60 in the online supplement).
Classification Accuracy of Neuroimaging Data
Overall, observed effect sizes were small, especially for child-reported and caregiver-reported suicidal thoughts and behaviors analyses. Maximum effect sizes for child-reported, caregiver-reported, and concordant analyses were |0.17|, |0.20|, and |0.34|, respectively (see Table S36 in the online supplement). On the basis of the lowest and highest 90% confidence interval bounds, all results would have been statistically equivalent if thresholds were |0.29|, |0.33|, and |0.56| for child-reported, caregiver-reported, and concordantly reported suicidal thoughts and behaviors comparisons, respectively.
For child-reported and caregiver-reported suicidal thoughts and behaviors comparisons, only 23 tests (0.67%) resulted in an effect size equal to or greater than our smallest effect size of interest (d≥|0.15|) (Figure 4; see also Table S37 in the online supplement). These included lower thickness of the left bank of the superior temporal sulcus, aberrant connectivity of the default and cingulo-parietal network with the hippocampus and other subcortical areas, and aberrant task-elicited activation of frontal, temporal, and parieto-occipital areas, and the insula. The AUPRC of these observed effect sizes ranged from 0.07 to 0.10. On the basis of the prevalence of suicidal thoughts and behaviors in child and caregiver reports in our sample (approximately 8.5%), these can be considered random classifiers. The AUPRC of the largest effect size, found in the sensorimotor mouth-visual area connectivity in the concordant group analysis (d=−0.34, 95% CI=−0.55, −0.12), was 0.02.
Neuroimaging Correlates of High-Risk Suicidal Thoughts and Behaviors Groups
The sample size of the groups for each imaging modality analyzed is presented in Table S38 in the online supplement, and a summary of the findings in high-risk groups in terms of difference and equivalence of means is presented in Table S39 in the online supplement. For case subjects with current suicidal thoughts and behaviors or lifetime suicidal thoughts and behaviors with a family history of suicide attempt or death, all results were not statistically different and not statistically equivalent after applying FDR correction (all pFDR values >0.05). The same was true for case subjects with lifetime suicide attempt, with two exceptions: during the stop-signal task, children with lifetime suicide attempt showed higher activation in the left pallidum when they were expected to inhibit a response but did not (d=0.47, 95% CI=0.20, 0.74, pFDR=0.028) and lower activation in the right ventral diencephalon when they succeeded in inhibiting a response (d=0.56, 95% CI=0.28, 0.84, pFDR=0.013). However, the classification accuracy of these two findings was not better than what would be expected by chance, with AUPRCs of 0.02 and 0.01, respectively.
Discussion
In a large U.S. population-based sample of school-age children, we found that endorsement of suicidal thoughts and behaviors was strongly associated with higher levels of psychopathology and social adversity, but these factors had poor classification accuracy. In terms of neural correlates, of the more than 5,000 tests performed to examine differences in structural MRI and resting-state and task-based fMRI, only one survived correction, in which suicidal thoughts and behaviors were associated with a thinner left bank of the superior temporal sulcus. Nevertheless, effect sizes of neural correlates were very small, and their ability to classify case subjects with suicidal thoughts and behaviors was not better than what would be expected by chance.
The rate of reported suicidal thoughts and behaviors in our sample was in line with rates found in prepubertal and school-age children (51, 52), which is lower than in community samples of adolescents and young adults (4–7, 53). Child and caregiver reports of suicidal thoughts and behaviors were not consistent, which is a common observation in adolescents and young adults, in whom nondisclosure may involve concerns about stigmatization, difficulties in communication, and unavailability of social and family support (54).
In this sample, suicidal thoughts and behaviors were associated with higher psychosocial adversity and clinical correlates, thus replicating a number of studies (14). The effect sizes of these associations ranged from small to large, with the largest effect sizes being linked to clinical characteristics in the child and caregivers and child-reported family conflict. For example, behavioral disorders such as oppositional defiant disorder and conduct disorder, as well as PTSD, had effect sizes >1. The strongest association of suicidal behaviors with externalizing rather than internalizing behaviors is consistent with reports on prepubertal children (10, 11). However, regardless of the effect sizes, the classification accuracy of non-imaging correlates either was no better than random selection or was extremely poor, which is in line with previous reports (16).
In terms of neuroimaging correlates, at the uncorrected level, we found several regions associated with suicidal thoughts and behaviors not consistently reported in the literature (18), and those regions that we found that had been reported (e.g., aberrant thickness in the medial orbitofrontal gyrus, aberrant connectivity in the default mode and salience networks, and aberrant task-elicited activations in the temporal lobe and insula) differed in directionality or specific regions involved (18, 40, 41, 55–58). Moreover, after false discovery rate correction, we only found a thinner left bank of the superior temporal sulcus in the caregiver-reported suicidal thoughts and behaviors analysis. Similar findings in cortical thickness have been found in adults with schizophrenia (59), and reduced volume but not thickness in this same region has been found in adolescents with a history of suicide attempt (60). The superior temporal region is part of a neural network involved in inhibitory control and emotion processing in social contexts and has been associated with lethality of attempts and impulsivity (61). Secondary analyses in those with lifetime suicide attempts showed altered activation during inhibitory control in the pallidum and the ventral diencephalon. The latter has been associated with suicidal ideation, but not attempts, in adults with depression (62). No associations were found with current suicidal thoughts and behaviors or lifetime suicidal thoughts and behaviors with family history of suicide attempt or death.
Regardless of differences, and based on our prespecified conservative bound of |0.15|, we showed that around half of the group means for child-reported suicidal thoughts and behaviors comparisons (approximately 48%) and one-fifth for caregiver-reported suicidal thoughts and behaviors comparisons (approximately 22%) were equivalent (i.e., not a meaningful effect). These would have been nearly 100% equivalent with a prespecified bound of |0.30|, which is still small. In the case of the concordant group and the high-risk groups, all observed effect sizes of mean differences were not statistically equivalent (i.e., meaningful effects). In these cases, where there was no difference but effects were not statistically equivalent, the data were insufficient to draw conclusions. With our conservative smallest effect size of interest of |0.15|, the equivalence bounds became narrower, and the concordant and high-risk groups would have needed a larger sample size in order to obtain a sufficiently narrow confidence interval to conclude that the observed effect sizes were statistically equivalent (i.e., not a meaningful effect). While widely employed, the choice of a d=|0.15| for the equivalence tests was, of course, arbitrary and only meant as an indicator of potential clinical meaningfulness. Our power calculations indicated that our sample sizes in this study could be used to detect such differences at a standard statistical significance threshold and power (α=0.05 and 1−β=0.80; see also the online supplement), at least for structural MRI. Obviously, even larger samples would be required to detect such differences after the application of correction for multiple comparisons. This points to the fact that, much like in the field of genetics, power, and therefore sample size, may be a limitation for the discovery of brain networks significantly contributing to clinical phenotypes, such as suicidality.
Equally, our findings indicate that, at least for the phenotypes studied here, brain effects are most likely to be small.
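The dependence of power on group size can be illustrated with a normal-approximation calculation for a two-sided, two-group comparison. This is a rough sketch, not the power analysis reported in the online supplement: it uses the full-sample group sizes from the prevalence results (the imaging subsamples in Figure S1 are smaller, so actual power would be lower), ignores correction for multiple comparisons, and approximates the standard error of d by sqrt(1/n1 + 1/n2). The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_groups(d, n1, n2, alpha=0.05):
    """Approximate two-sided power to detect a standardized mean
    difference d between groups of size n1 and n2."""
    z = NormalDist()
    se = sqrt(1 / n1 + 1 / n2)  # large-sample SE of d, ignoring the d**2 term
    z_crit = z.inv_cdf(1 - alpha / 2)
    ncp = abs(d) / se
    # Probability of exceeding the critical value in either tail.
    return (1 - z.cdf(z_crit - ncp)) + z.cdf(-z_crit - ncp)


# Smallest effect size of interest, d = 0.15, against the never-suicidal group.
power_child = power_two_groups(0.15, 684, 6854)
power_caregiver = power_two_groups(0.15, 654, 6854)
power_concordant = power_two_groups(0.15, 198, 6854)
```

Under these simplifying assumptions, the child-reported and caregiver-reported comparisons exceed the conventional 0.80 power level for d=0.15, whereas the concordant comparison falls well short of it, consistent with the inconclusive results for that group.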
Indeed, in this study, observed effect sizes were relatively small for all regions and connectivity indices tested (d<|0.30|), which is in line with studies conducted in large samples (23). This was even true within the concordant and high-risk groups (with the only exceptions being the two significant findings in those with suicide attempt). Small effect sizes can still be clinically relevant if they can predict clinical outcomes or treatment response or point to mechanistic pathways of disease (63). We therefore examined the classification accuracy of the largest effect sizes in our sample, as we did with the non-imaging correlates. We found that these were not better at classifying suicidal cases than what one would get by selecting cases randomly from the population. This is important because, ironically, the shift from studying psychosocial risk factors to studying neurobiological biomarkers of suicidal thoughts and behaviors was partly motivated by the poor sensitivity of the former in predicting suicide (15–17), which we also showed in this study. While the pattern of increasing suicide rates in young people does not show signs of stopping, it is not yet clear whether this change in focus of study is providing us with any benefit, especially given the cost of neuroimaging studies. The aim was to improve identification and prevention of suicidal thoughts and behaviors. However, to date, the evidence is still weak for this purpose as a result of small sample sizes, heterogeneity and inconsistency across studies, and, as further shown in this study, small effect sizes with limited classification accuracy. Non-imaging correlates, although they also have low classification accuracy, at least have moderate to large effect sizes, especially associated with clinical characteristics. Therefore, addressing psychosocial and clinical factors may be our best bet at this point. Nevertheless, there is an urgent need to improve the study of neurobiological biomarkers. In future studies, investigators should aim their efforts at using other methodologies, such as machine learning (64, 65) and building models based on aggregation of multiple variables, including neuroimaging, psychosocial, clinical, and genetic correlates of suicidal behaviors, to increase predictive validity. That said, what our results show is that vulnerability to suicidal thoughts and behaviors does not appear to have a brain signature with a strong enough effect in school-age children. However, this does not imply that suicidal thoughts and behaviors do not have brain correlates; rather, it indicates that such associations, if any, may not be discernible using common neuroimaging measures at this age, for various reasons. First, it is possible that what we call suicidal thoughts and behaviors in children is different from suicidal thoughts and behaviors in adolescents or adults. Historically, it was believed that prepubertal children could not have suicidal thoughts and behaviors because of their concrete operational thinking, which limits their concept of causality, as well as the concepts of death and finality (66, 67). However, current evidence suggests that even preschoolers can show suicidal behaviors with the intent to cause self-injury or death (66, 68), which is agreed to be the essential quality of suicidal thoughts and behaviors, regardless of an understanding of finality or lethality (67). Second, as brain organization evolves during the adolescent years, also a time when more active suicidal behaviors emerge, it is plausible that neural correlates of suicidal thoughts and behaviors become more evident as the brain matures. In that sense, investigation of the longitudinal data from the ABCD study cohort, when such data become available, will likely shed some light on these incongruent findings across samples of different ages. Finally, it is also plausible that we may not find a single correlate of suicidal thoughts and behaviors in those adolescents. Suicide is known to be a complex phenomenon, and based on works from affective neuroscience (69, 70), it is likely that suicidal behaviors are not discrete categories and thus do not have one-to-one brain signatures. Comprehensive reviews on the topic are a good reminder of that heterogeneity (18). Future studies should combine multiple types of correlates and examine interrelated trajectories across factors that might help us identify the shift to more active suicidal behaviors at peak ages, such as late adolescence and early adulthood.
Our study has some limitations. Because participants were drawn from the community, very few had active suicidal thoughts or behaviors, especially the latter, at the time of scanning and therefore were not necessarily comparable to clinical cases. This is important because some studies suggest that suicidal ideation may have distinct clinical, genetic, and imaging correlates compared with suicidal behaviors (18, 71). We attempted to address these limitations by analyzing only those individuals at highest risk, yet this approach considerably reduced the number of participants and limited the power to detect differences. Nevertheless, the majority of effect sizes were small, which suggests that they would be small even with larger sample sizes. Moreover, passive ideation has been shown to be associated with significant psychiatric comorbidity and to be similar to active ideation in terms of risk factors (72), as also shown in this study. Whereas using a community-based sample avoids referral bias and may aid in the identification of suicidal thoughts and behaviors in nonclinical samples, future waves of the ABCD study should capture the age-related increase in prevalence of more active suicidal behaviors.