Despite remarkable success identifying genetic risk factors for major psychiatric disorders, it remains unknown how genetic variants interact with environmental and epigenetic risk factors in the brain to impart risk for clinically distinct disorders (1, 2). We reasoned that brain transcriptomes, a quantitative, genome-wide molecular phenotype (3), would allow us to determine whether disease-related signatures are shared across major neuropsychiatric disorders with distinct symptoms and whether these patterns reflect genetic risk.
We first analyzed published gene-expression microarray studies of the cerebral cortex across five major neuropsychiatric disorders (3–11) using 700 cerebral cortical samples from subjects with autism (ASD) (n = 50 samples), schizophrenia (SCZ) (n = 159), bipolar disorder (BD) (n = 94), depression (MDD) (n = 87), alcoholism (AAD) (n = 17), and matched controls (n = 293) (12). These disorders are prevalent and disabling, contributing substantially to global disease burden. Inflammatory bowel disease (IBD) (n = 197) was included as a non-neural comparison.
Individual data sets underwent stringent quality control and normalization (Fig. 1) (12), including rebalancing so as to alleviate confounding between diagnosis and biological (such as age and sex) or technical (such as postmortem interval, pH, RNA integrity number, batch, and 3′ bias) covariates (figs. S1 and S2). Transcriptome summary statistics for each disorder were computed with a linear mixed-effects model so as to account for any sample overlap across studies (12). Comparison of differential gene expression (DGE) log₂ fold change (log₂FC) signatures revealed a significant overlap among ASD, SCZ, and BD and among SCZ, BD, and MDD (all Spearman’s ρ ≥ 0.23, P < 0.05, 40,000 permutations) (Fig. 2A). The regression slopes between ASD, BD, and MDD log₂FC effect sizes compared with SCZ (5.08, 0.99, and 0.37, respectively) indicate a gradient of transcriptomic severity with ASD > SCZ ≈ BD > MDD (Fig. 2B). To ensure robustness, we compared multiple methods for batch correction, probe summarization, and feature selection, including use of integrative correlations, none of which changed the qualitative observations (fig. S3) (12). Results were also unaltered after first regressing gene-level RNA degradation metrics, suggesting that systematic sample quality issues were unlikely to drive these correlations (fig. S3). Further, the lack of (or negative) overlap between AAD and other disorders suggests that similarities are less likely due to comorbid substance abuse, poor overall general health, or general brain-related postmortem artifacts.
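The cross-disorder comparison described above can be sketched in a few lines. This is a minimal illustration on simulated log₂FC vectors, not the pipeline used in the study: the variable names, the simulated effect sizes, the number of permutations, and the use of ordinary least squares for the slope are all assumptions made for the example (this section does not state which regression type was used).

```python
import random

def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def permutation_p(x, y, n_perm=2000, seed=0):
    """Spearman's rho plus a two-sided permutation P value.

    Gene labels of one signature are shuffled to build the null.
    """
    rng = random.Random(seed)
    rx, ry = ranks(x), ranks(y)
    observed = pearson(rx, ry)  # Spearman = Pearson on ranks
    shuffled = ry[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(rx, shuffled)) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

def ols_slope(x, y):
    """Ordinary least-squares slope of y on x (assumed regression type)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)

# Simulated per-gene effect sizes (hypothetical, for illustration only):
# the second disorder has the same direction of effect, scaled by 2.
rng = random.Random(1)
scz_log2fc = [rng.gauss(0.0, 0.2) for _ in range(200)]
asd_log2fc = [2.0 * s + rng.gauss(0.0, 0.1) for s in scz_log2fc]

rho, p_value = permutation_p(scz_log2fc, asd_log2fc)
slope = ols_slope(scz_log2fc, asd_log2fc)
print(f"rho={rho:.2f}, permutation P={p_value:.4f}, slope={slope:.2f}")
```

In this toy setting, a slope above 1 means the second disorder shows larger effect sizes for the same genes, which is the sense in which the text speaks of a gradient of transcriptomic severity.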
Disease-specific DGE summary statistics (data table S1) provide human in vivo benchmarks for determining the relevance of model organisms, in vitro systems, or drug effects (13, 14). We identified a set of concordantly down- and up-regulated genes across disorders (fig. S4) as well as those with more specific effects. Complement component 4A (C4A), the top genome-wide association study (GWAS)–implicated SCZ disease gene (15), was significantly up-regulated in SCZ (log₂FC = 0.23, P = 6.9 × 10⁻⁶) and in ASD [RNA sequencing (RNA-seq); log₂FC = 0.91, P = 0.014] (data table S1) but not in BD, MDD, or AAD. To investigate potential confounding by psychiatric medications, we compared disease signatures with those from nonhuman primates treated with acute or chronic dosing of antipsychotic medications. Significant negative overlap (fig. S5) (12) was observed, indicating that antipsychotics are unlikely to drive, but rather may partially normalize, these transcriptomic alterations, whereas the psychotomimetic phencyclidine partially recapitulates disease signatures.
To validate that these transcriptomic relationships are generalizable, we generated independent RNA-seq data sets for replication for three out of the five disorders (fig. S6) (12). We identified 1099 genes whose DGE is replicated in ASD [odds ratio (OR) 6.4, P = 3.3 × 10⁻²³⁶, Fisher’s exact test] (table S2), 890 genes for SCZ (BrainGVEX; OR 4.5, P = 7.6 × 10⁻¹⁵⁵), and 112 genes for BD (BrainGVEX; OR 3.9, P = 4.6 × 10⁻²⁶); the smaller number for BD is likely due to the relatively smaller RNA-seq sample size (12). We observed similarly high levels of transcriptomic overlap among ASD, SCZ, and BD and a similar gradient of transcriptomic severity (Fig. 2C and fig. S7). The SCZ and BD patterns were further replicated in the CommonMind data set, although gene-level overlap was lower (fig. S7) (12, 16). The ASD signature was qualitatively consistent across the four major cortical lobules, indicating that this pattern is not caused by regional differences (Fig. 2D).
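As an illustration of how a replication overlap of this kind can be scored, the sketch below computes an odds ratio and a one-sided Fisher’s exact P value from a 2 × 2 table of genes. The counts are invented for the example, and the one-sided alternative is an assumption; this section does not state whether the reported tests were one- or two-sided.

```python
from math import exp, lgamma

def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def fisher_enrichment(a, b, c, d):
    """Odds ratio and one-sided Fisher's exact P for a 2x2 gene table.

                              DE in replication   not DE in replication
    DE in discovery                  a                     b
    not DE in discovery              c                     d

    The P value is the hypergeometric probability of an overlap >= a.
    """
    odds_ratio = (a * d) / (b * c)
    row1, col1, total = a + b, a + c, a + b + c + d
    log_denom = log_comb(total, col1)
    p = 0.0
    for k in range(a, min(row1, col1) + 1):
        p += exp(log_comb(row1, k)
                 + log_comb(total - row1, col1 - k)
                 - log_denom)
    return odds_ratio, min(p, 1.0)

# Hypothetical counts: 100 discovery genes, 100 replication genes,
# 30 shared, out of 1000 genes tested in both data sets.
or_enriched, p_enriched = fisher_enrichment(30, 70, 70, 830)

# Same margins with the overlap expected by chance (10 genes).
or_null, p_null = fisher_enrichment(10, 90, 90, 810)

print(f"enriched: OR={or_enriched:.2f}, P={p_enriched:.2e}")
print(f"null:     OR={or_null:.2f}, P={p_null:.2f}")
```

Working in log space with `lgamma` keeps the calculation stable for genome-scale tables, where the binomial coefficients themselves would overflow a float.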
To more specifically characterize the biological pathways involved, we performed robust weighted gene coexpression network analysis (rWGCNA) (12, 17), identifying several shared and disorder-specific coexpression modules (Fig. 3). Modules were stable (fig. S8), showed greater association with disease than other biological or technical covariates (fig. S9), and were not dependent on corrections for covariates or batch effects (fig. S10). Moreover, each module was enriched for protein-protein interactions (fig. S8) and brain enhancer-RNA co-regulation (fig. S11) derived from independent data, which provides anchors for dissecting protein complexes and regulatory relationships.
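For readers unfamiliar with weighted coexpression networks, the first step of a WGCNA-style analysis can be sketched as follows: pairwise gene correlations are raised to a soft-thresholding power to give a weighted adjacency. The sketch assumes a signed network and a power of 12, neither of which is stated in this section; the gene names and expression values are placeholders. Later steps (topological overlap, hierarchical clustering into modules, and the resampling that makes the procedure "robust") are not shown.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def signed_adjacency(expression, power=12):
    """Signed weighted adjacency: ((1 + cor) / 2) ** power.

    `expression` maps gene name -> list of expression values across samples.
    `power` is the soft-thresholding exponent (assumed value).
    """
    genes = list(expression)
    adjacency = {}
    for g in genes:
        adjacency[g] = {}
        for h in genes:
            r = pearson(expression[g], expression[h])
            r = max(-1.0, min(1.0, r))  # guard against rounding error
            adjacency[g][h] = ((1.0 + r) / 2.0) ** power
    return adjacency

# Placeholder data: gene_b tracks gene_a, gene_c moves in the opposite direction.
expression = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0, 10.0, 12.5],
    "gene_c": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],
}
adjacency = signed_adjacency(expression)

for g in expression:
    print(g, {h: round(w, 3) for h, w in adjacency[g].items()})
```

In a signed network, strongly anticorrelated genes receive an adjacency near zero, so a module groups genes that move in the same direction, which is what allows a module as a whole to be described as up- or down-regulated.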
An astrocyte-related module (CD4; hubs GJA1 and SOX9) was broadly up-regulated in ASD, BD, and SCZ [false discovery rate (FDR)–corrected P < 0.05] (Fig. 3C and data table S2) (12) and enriched for glial cell differentiation and fatty-acid metabolism pathways. By contrast, a module strongly enriched for microglial markers (CD11) was up-regulated specifically in ASD (two-sided t test, FDR-corrected P = 4 × 10⁻⁹). Hubs include canonical microglial markers (HLA-DRA and AIF1), major components of the complement system (C1QA and C1QB), and TYROBP, a microglial signaling adapter protein (18). Results fit with convergent evidence for microglial up-regulation in ASD and an emerging understanding that microglia play a critical role regulating synaptic function during neurodevelopment (19).
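Because module-level associations are reported as FDR-corrected P values, a short sketch of one common correction may help. It assumes the Benjamini-Hochberg procedure, which this section does not name explicitly, and the module labels and raw P values are placeholders rather than results from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Placeholder raw P values for five hypothetical module-disease tests.
labels = ["module_1", "module_2", "module_3", "module_4", "module_5"]
raw_p = [0.001, 0.008, 0.039, 0.041, 0.6]
fdr_p = benjamini_hochberg(raw_p)

for name, p, q in zip(labels, raw_p, fdr_p):
    flag = "significant" if q < 0.05 else "not significant"
    print(f"{name}: raw P={p:.3f}, FDR-corrected P={q:.5f} ({flag})")
```

Note that two tests with raw P values just under 0.05 (0.039 and 0.041) no longer pass the 0.05 threshold after correction in this example, which is why corrected values are the ones quoted for module associations.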
One module, CD2, was up-regulated specifically in MDD (FDR-corrected P = 0.009) (data table S2) and was enriched for G protein–coupled receptors, cytokine-cytokine interactions, and hormone activity pathways, suggesting a link between inflammation and dysregulation of the hypothalamic-pituitary-adrenal (HPA) axis, which is consistent with current models of MDD pathophysiology (20). Several modules annotated as neuronal/mitochondrial were down-regulated across ASD, SCZ, and BD (CD1, CD10, and CD13) (Fig. 3C and data table S2) (12). The overlap of CD10 with a mitochondrial gene-enriched module previously associated with neuronal firing rate (21) links energetic balance, synaptic transmission, and psychiatric disease (data table S2).
The transcriptome may reflect the cause or the consequence of a disorder. To refine potential causal links, we compared single-nucleotide polymorphism (SNP)–based genetic correlations between disease pairs (22) with their corresponding transcriptome overlap. SNP coheritability was significantly correlated with transcriptome overlap across the same disease pairs (Spearman’s ρ = 0.79, 95% confidence interval 0.43 to 0.93, P = 0.0013) (Fig. 2C), suggesting that a major component of these gene-expression patterns reflects biological processes coupled to underlying genetic variation.
To determine how disease-associated variants may influence specific biological processes, we investigated whether any modules harbor genetic susceptibility for specific disorders or for relevant cognitive or behavioral traits (12). We identified significant enrichment among several of the down-regulated, neuronal coexpression modules (CD1, CD10, and CD13) for GWAS signal from SCZ and BD, as well as for educational attainment and neuroticism (FDR-corrected P < 0.05, Spearman) (Fig. 4A) (12). We also observed enrichment for the three down-regulated neuronal coexpression modules in the iPSYCH Consortium (23) ASD GWAS cohort (Fig. 4A and table S3) (12). By contrast, these modules showed no enrichment for MDD, AAD, or IBD. Further, none of the microglial- or astrocyte-specific modules showed psychiatric GWAS enrichment. Extending this analysis to disease-associated rare variants (data table S3) (2, 12), we found that the CD1 neuronal module was enriched for genes harboring rare, nonsynonymous de novo mutations identified in ASD (OR 1.36, FDR-corrected P = 0.03, logistic regression) and SCZ cases (OR 1.82, FDR-corrected P = 0.014) but not unaffected controls (Fig. 4B). A similar CD1 enrichment was observed for genes affected by rare, recurrent copy-number variation (CNV) in ASD (OR 2.52, FDR-corrected P = 0.008) and SCZ (OR 2.46, FDR-corrected P = 0.014). These results suggest convergence of common and rare genetic variation acting to down-regulate synaptic function in ASD and SCZ.
We next used LD score regression (24) to partition GWAS heritability (Fig. 4C and data table S4) into the contribution from SNPs located within genes from each module (Fig. 4D) (12). CD1 again showed significant enrichment for SCZ (2.5-fold, FDR-corrected P = 8.9 × 10⁻¹¹), BD (3.9-fold, FDR-corrected P < 0.014), and educational attainment (1.9-fold, FDR-corrected P < 0.0008; χ² test) GWAS, accounting for ∼10% of SNP-based heritability within each data set, despite containing only 3% of the SNPs. This illustrates how gene network analysis can begin to parse complex patterns of common variants, each of small effect size, to implicate specific biological roles for common variant risk across neuropsychiatric disorders.
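The relationship between fold enrichment, SNP share, and heritability share can be made explicit with a little arithmetic. In partitioned LD score regression, enrichment for an annotation is conventionally defined as the proportion of SNP heritability it explains divided by the proportion of SNPs it contains. The sketch below applies that definition to the figures quoted above; treating the 3% SNP share as exact and as applying identically to all three traits is an assumption made for illustration.

```python
def implied_h2_share(fold_enrichment, snp_share):
    """Heritability share implied by enrichment = h2_share / snp_share."""
    return fold_enrichment * snp_share

# Figures quoted in the text for the CD1 module.
cd1_snp_share = 0.03  # assumed exact for this illustration
cd1_fold_enrichment = {
    "SCZ": 2.5,
    "BD": 3.9,
    "educational attainment": 1.9,
}

implied = {
    trait: implied_h2_share(fold, cd1_snp_share)
    for trait, fold in cd1_fold_enrichment.items()
}

for trait, share in implied.items():
    print(f"{trait}: about {100 * share:.1f}% of SNP heritability")
```

Under these assumptions the implied shares range from roughly 6% to 12%, the same order as the ∼10% figure quoted in the text and several times the module's 3% share of SNPs.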
These data provide a quantitative, genome-wide characterization of the cortical pathology across five major neuropsychiatric disorders, providing a framework for identifying the responsible molecular signaling pathways and interpreting genetic variants implicated in neuropsychiatric disease risk. We observed a gradient of synaptic gene down-regulation, with ASD > SCZ ≈ BD. BD and SCZ appear most similar in terms of synaptic dysfunction and astroglial gene up-regulation, which may represent astrocytosis, activation, or both. ASD, an early-onset disorder, shows a distinct up-regulated microglial signature, which may reflect the role for microglia in regulation of synaptic connectivity during neurodevelopment (19). MDD shows neither the synaptic nor astroglial pathology but does exhibit dysregulation of HPA-axis and hormonal signaling not observed in the other disorders.
Our data suggest that shared genetic factors underlie a substantial proportion of cross-disorder expression overlap. Given that a minority of these relationships represent expression quantitative trait loci (fig. S12), most of the genetic effects are likely acting indirectly, through a cascade of developmental and cell-cell signaling events rooted in genetic risk. Genetic variation is also not the only driver of expression variation; there is undoubtedly a contribution from environmental effects. Hidden confounders could introduce a correlation structure that matches SNP-level genetic correlations, but parsimony and hidden covariate correction suggest that this is unlikely. Diagnostic misclassification could artificially elevate shared signals, but the results are robust to disorder removal (fig. S13), and misclassification would not account for the substantial overlap we observed with ASD, which has a highly distinct phenotypic trajectory from later-onset disorders. Last, we have replicated broad transcriptomic and cell type–specific patterns independently for ASD, SCZ, and BD, providing an organizing pathological framework for future investigation of the mechanisms underlying specific gene- and isoform-level transcriptomic alterations in psychiatric disease.