Family and twin studies have established that schizophrenia, bipolar disorder, and major depressive disorder are familial and heritable phenotypes and that genetic factors are the most robustly validated risk factors for each disorder (1–3). However, several findings have called into question whether these disorders are etiologically distinct. First, several key clinical features, including psychosis, neurocognitive impairment, and suicidality, may be observed in all three disorders. Second, genetic epidemiologic studies have documented that schizophrenia, bipolar disorder, and major depressive disorder share familial and genetic determinants. Family studies have shown familial coaggregation for schizophrenia and bipolar disorder (4–6) as well as bipolar disorder and major depressive disorder (1). In a population-based study of more than 2 million families, Lichtenstein et al. (7) demonstrated increased risks of schizophrenia among relatives of bipolar disorder probands and increased risks of bipolar disorder among relatives of schizophrenia probands. Comorbidity between the disorders was mainly attributable to overlapping genetic influences. Twin studies have similarly documented substantial shared genetic variance between psychotic disorders and bipolar disorder (8) and between bipolar disorder and major depressive disorder (9).
Although family and twin studies can estimate the shared heritability across disorders, they cannot identify the genetic loci contributing to this overlap. To date, evidence implicating specific chromosomal regions and genes in the shared liability to psychotic and mood disorders has largely been limited to linkage and candidate gene association studies. Some of the regions with the strongest linkage evidence for schizophrenia are also among the regions most strongly linked to bipolar disorder (10–12), although simulations suggest that such overlap could easily occur by chance (7). Several chromosomal abnormalities have also been associated with both mood and psychotic disorders. The balanced translocation t(1;11)(q42;q14.3) that disrupts the DISC1 gene was first identified because of its cosegregation with a broad phenotype comprising schizophrenia, bipolar disorder, and recurrent major depressive disorder (13). The 22q11 microdeletion responsible for velocardiofacial syndrome also appears to confer increased risk of both psychotic and mood disorders (14, 15). Candidate gene studies have also found association between specific genes and both psychotic and mood disorder phenotypes (11, 16, 17), although results have been inconsistent (18, 19).
Genomewide association studies (GWAS), which provide a survey of common genetic variation across the genome, offer a more comprehensive method for identifying risk loci at the genotypic level. Early efforts to apply this technology to major psychiatric disorders have begun to bear fruit, with GWAS implicating several susceptibility genes for schizophrenia (20), bipolar disorder (21), and major depressive disorder (22). A recent analysis examined genewide evidence of association using data from separate GWAS of bipolar disorder and schizophrenia and found nominal evidence that several genes influence both disorders (23). Data from the International Schizophrenia Consortium demonstrated that common genetic variation (involving thousands of small-effect alleles) accounts for at least one-third of the total variation in liability to schizophrenia and that these polygenic risks are substantially shared with bipolar disorder (24). To date, however, no GWAS has been reported that examines the specificity of genetic influences across all three disorders. In the present study, we report the first genomewide cross-disorder analysis incorporating samples from the three largest treatment effectiveness studies of schizophrenia, bipolar disorder, and major depressive disorder, respectively. To address the issue of multiple comparisons, we utilized a novel approach that examines the patterns of cross-disorder effects in a single-model selection framework that controls the type I error risk at an experiment-wise level.
Method
Clinical Samples
Bipolar disorder (Systematic Treatment Enhancement Program for Bipolar Disorder [STEP-BD]).
STEP-BD was a national, longitudinal public health initiative designed to examine the effectiveness of treatments and their impact on the course of bipolar disorder (25). Over a 7-year period, 4,361 participants were enrolled across 20 sites and followed for up to 2 years. To maximize external validity, enrollment was offered to all eligible patients seeking outpatient treatment at one of the participating sites (26). Eligibility for STEP-BD required a consensus DSM-IV bipolar diagnosis on both the Affective Disorders Evaluation and Mini-International Neuropsychiatric Interview-PLUS semistructured interviews, as previously described (25). From the parent STEP-BD study, 2,089 individuals were enrolled in a genetic substudy.
Schizophrenia (Clinical Antipsychotic Trials of Intervention Effectiveness [CATIE]).
CATIE was a multiphase randomized controlled trial of antipsychotic medications comprising 1,460 individuals with schizophrenia followed for up to 18 months (27, 28). Final study diagnoses of DSM-IV schizophrenia were established by CATIE clinicians using the Structured Clinical Interview for DSM-IV (29), including review of all available information (encompassing psychiatric and general medical records). As detailed previously (27), exclusion criteria included diagnosis of schizoaffective disorder, mental retardation or other cognitive disorder, single psychotic episode, and history of treatment resistance or serious adverse reaction to the study treatments. As previously described (30), the genetic substudy included 738 cases.
Major depressive disorder (Sequenced Treatment Alternatives to Relieve Depression [STAR*D]).
STAR*D was a multi-site, prospective, randomized multiphase clinical trial of outpatients with nonpsychotic major depressive disorder that enrolled 4,041 participants over a 3-year period (31). Eligibility required a single or recurrent nonpsychotic major depressive episode (by DSM-IV criteria) and a score of ≥14 on the 17-item Hamilton Depression Rating Scale. Relevant exclusion criteria included history of bipolar disorder, schizophrenia, schizoaffective disorder, or psychosis not otherwise specified. In the genetic substudy, blood samples were collected from 1,953 participants.
Control (National Institute of Mental Health [NIMH] Genetics Repository).
As previously described (32), control subjects were collected by Knowledge Networks, a survey and market research company whose panel contains approximately 60,000 households representative of the U.S. population. Subjects completed an online psychiatric screen that included questions regarding demographics, ancestry, and DSM-IV criteria for a range of psychiatric disorders. Participants who reported a history of schizophrenia, psychosis, or bipolar disorder were excluded from the GWAS analyses, as previously described (32). We also excluded individuals (N=126) who met criteria for a history of major depressive episode.
Genotyping
Genotyping of STEP-BD and CATIE samples was performed using the Affymetrix GeneChip Human Mapping 500K Array Set (Affymetrix, Inc., Santa Clara, Calif.), while one-half of the STAR*D sample was genotyped with the 500K array and one-half with the Affymetrix Human Single Nucleotide Polymorphism (SNP) Array, 5.0. Genotyping of the STEP-BD and control samples was performed at the Broad Institute, as previously described (33). Quality control processing of genotypes was described by Sklar et al. (32). Genotyping of the schizophrenia sample was performed at Perlegen Sciences (Mountain View, Calif.), as reported elsewhere (30). Genotyping of the STAR*D sample was performed at Affymetrix (500K) or at the University of California, San Francisco (5.0), as described previously (34).
Quality control and harmonization of genotype data.
Additional quality control for the combined genotypic data set was performed using PLINK (35), as previously described (32). In brief, individuals were excluded if they had overall call rates <95%, excess or insufficient heterozygosity, or apparent relatedness. We only included non-Hispanic Caucasian individuals with European ancestry based on self-reported race and ethnicity information. The PLINK nearest neighbor method (35) was used to filter out potential outliers based on the first 10 multidimensional scaling factors. SNPs were excluded if they had a call rate <98%, had a minor allele frequency <1%, were inconsistent with Hardy-Weinberg equilibrium at a p value of <1×10⁻⁶, or showed differential rates of missingness between patients and control subjects (32). After quality control steps were performed, 224,395 genotyped markers were retained, with a total genotyping rate in the final sample >99%. This set of genotyped markers is smaller than that reported for the primary GWAS reports of each sample (30, 32) because of the aforementioned additional quality control steps. We used BEAGLE, Version 3.1.1 (http://www.stat.auckland.ac.nz/~bbrowning/beagle/beagle.html), to impute missing genotypes, with HapMap (Centre d'Etude du Polymorphisme Humain from Utah population, release 23, forward strand) as the reference panel. We excluded SNPs with an imputation quality R² score <0.8 and obtained a total of 1,574,154 SNPs for the final analysis. For each SNP, imputed dosage was then summed for each of the five phenotype groups.
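The SNP-level filters described above (call rate, minor allele frequency, and Hardy-Weinberg equilibrium) can be illustrated with a minimal sketch. This is not the PLINK implementation used in the study: the function names are hypothetical, the Hardy-Weinberg check uses a simple 1-df chi-square goodness-of-fit approximation rather than an exact test, and the differential-missingness filter is omitted.

```python
import math

def hwe_chisq_p(n_aa, n_ab, n_bb):
    """Approximate HWE p value from a 1-df chi-square goodness-of-fit test.

    Assumption: a chi-square approximation stands in for whatever test
    PLINK applied in the actual analysis.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    return math.erfc(math.sqrt(stat / 2.0))

def snp_passes_qc(n_aa, n_ab, n_bb, n_missing,
                  min_call_rate=0.98, min_maf=0.01, hwe_p_min=1e-6):
    """Apply the three SNP filters with the thresholds quoted in the text."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / (called + n_missing)
    if call_rate < min_call_rate:
        return False
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    if min(freq_a, 1.0 - freq_a) < min_maf:
        return False
    if hwe_chisq_p(n_aa, n_ab, n_bb) < hwe_p_min:
        return False
    return True
```

For example, `snp_passes_qc(360, 480, 160, 0)` keeps a SNP whose genotype counts match Hardy-Weinberg proportions exactly, whereas the same counts with 50 missing calls fail the 98% call-rate filter.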
Further control for confounding by population stratification was performed by calculating the first 10 quantitative ancestry indices based on the merged data set using multidimensional scaling analysis (35). To examine the effect of each of the 10 quantitative indices (C1–C10), we calculated the genomic inflation factor (λ) from a GWA analysis of genotypes using each quantitative index as the dependent variable. We plotted the λ's against each of the 10 indices (analogous to a scree plot) and found excessive values only for C1–C4. Thus, we used these four indices as covariates in our GWAS of the phenotypic groups.
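As a rough illustration of the genomic inflation factor used here, the sketch below computes λ in the conventional way, as the median of the observed 1-df chi-square association statistics divided by the median of the chi-square distribution with 1 df (approximately 0.455). The function names and the numeric cutoff are hypothetical; the study judged "excessive" values by visual inspection of the plot, not by a fixed threshold.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(chisq_stats):
    """Genomic inflation factor: observed median statistic / expected median."""
    return statistics.median(chisq_stats) / CHI2_1DF_MEDIAN

def flag_inflated(lambda_by_index, cutoff):
    """Return the ancestry indices whose lambda exceeds a caller-chosen cutoff.

    Assumption: a numeric cutoff replaces the visual, scree-plot-style
    judgment described in the text.
    """
    return [name for name, lam in lambda_by_index.items() if lam > cutoff]
```

A value of λ near 1 for a given ancestry index suggests that the index captures little genomewide structure, which is the rationale for retaining only the indices with clearly elevated λ as covariates.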
Statistical Analysis
Examining the pleiotropic effects of genetic variants across disorders in the GWAS context requires additional attention to problems of multiple testing and type I error. One approach to the analysis is to conduct a series of pairwise comparisons of the individual disorders and their combinations. However, this would effectively entail conducting multiple GWAS analyses, each of which would require correction for multiple testing. We have taken an alternative approach that involves a single genomewide omnibus test of association across disorders followed by a model selection approach that asks which configuration of phenotypes is most likely to be associated with a given variant. The sequence of analytic steps was as follows:
We first computed a likelihood ratio statistic (the omnibus test statistic) from a multinomial logistic regression in which allele frequencies can vary for each sample (schizophrenia, bipolar I disorder, bipolar II disorder, major depressive disorder, and control) relative to a null model in which allele frequencies are the same across all groups. This 4-df test is essentially a test of whether there is any association between the variant and any of the four target disorders.
Next, for each SNP whose omnibus test resulted in p<5×10⁻⁵, we fit nine additional log-linear models corresponding to the following nine patterns of allele frequency configurations: 1) shared by all disorders model ([schizophrenia, bipolar I, bipolar II, major depressive disorder]≠control); 2) shared by psychotic disorders model ([schizophrenia, bipolar I]≠[major depressive disorder, bipolar II, control]); 3) shared by mood disorders model ([bipolar I, bipolar II, major depressive disorder]≠[schizophrenia, control]); 4) shared by depressive disorders model ([bipolar II, major depressive disorder]≠[bipolar I, schizophrenia, control]); 5) schizophrenia-specific (schizophrenia≠[major depressive disorder, bipolar I, bipolar II, control]); 6) bipolar disorder-specific ([bipolar I, bipolar II]≠[major depressive disorder, schizophrenia, control]); 7) bipolar I-specific (bipolar I≠[bipolar II, major depressive disorder, schizophrenia, control]); 8) bipolar II-specific (bipolar II≠[bipolar I, major depressive disorder, schizophrenia, control]); and 9) major depressive disorder-specific (major depressive disorder≠[schizophrenia, bipolar I, bipolar II, control]).
We then identified the best-fit model based on the Bayesian information criterion. Last, in order to control the marker-wise type I error at alpha, only the omnibus test p value (4 df) and the best-fit model are reported as primary results. Thus, the result of this analytic procedure is a p value for the single omnibus test and a best-fit model that indicates the phenotype(s) providing the best fit for genotype-phenotype association for a given variant.
As is customary in pooled GWAS analyses, we do not adjust p values for the prior GWAS results of the individual samples.
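The two-stage procedure described above (a 4-df omnibus test followed by selection among nine configurations using the Bayesian information criterion) can be sketched for a single SNP as follows. This is a simplified illustration under stated assumptions, not the analysis code: it works on allele counts per group with binomial likelihoods, ignores the ancestry covariates and imputed dosages, and uses the total number of alleles as the sample size in the BIC. The group labels and function names are hypothetical.

```python
import math

GROUPS = ["SCZ", "BP1", "BP2", "MDD", "CTRL"]

# Nine configurations from the text: the listed groups share one allele
# frequency; all remaining groups (including controls) share another.
MODELS = {
    "all disorders": {"SCZ", "BP1", "BP2", "MDD"},
    "psychotic disorders": {"SCZ", "BP1"},
    "mood disorders": {"BP1", "BP2", "MDD"},
    "depressive disorders": {"BP2", "MDD"},
    "schizophrenia-specific": {"SCZ"},
    "bipolar-specific": {"BP1", "BP2"},
    "bipolar I-specific": {"BP1"},
    "bipolar II-specific": {"BP2"},
    "MDD-specific": {"MDD"},
}

def binom_loglik(minor, total):
    """Binomial log-likelihood at the MLE frequency (constant term dropped)."""
    if minor == 0 or minor == total:
        return 0.0
    p = minor / total
    return minor * math.log(p) + (total - minor) * math.log(1.0 - p)

def pooled_loglik(counts, groups):
    """Log-likelihood when the given groups share a single allele frequency."""
    minor = sum(counts[g][0] for g in groups)
    total = sum(counts[g][1] for g in groups)
    return binom_loglik(minor, total)

def omnibus_test(counts):
    """Likelihood ratio test: one frequency per group vs. one common frequency."""
    ll_full = sum(binom_loglik(*counts[g]) for g in GROUPS)
    ll_null = pooled_loglik(counts, GROUPS)
    stat = max(0.0, 2.0 * (ll_full - ll_null))
    # Chi-square survival function has a closed form for 4 df.
    p_value = math.exp(-stat / 2.0) * (1.0 + stat / 2.0)
    return stat, p_value

def best_model(counts):
    """Return the configuration with the lowest BIC, plus all BIC values."""
    n = sum(counts[g][1] for g in GROUPS)
    bic = {}
    for name, affected in MODELS.items():
        rest = [g for g in GROUPS if g not in affected]
        ll = pooled_loglik(counts, affected) + pooled_loglik(counts, rest)
        # Each model has two frequency parameters, so in this simplified
        # setting the BIC ranking coincides with the likelihood ranking.
        bic[name] = -2.0 * ll + 2 * math.log(n)
    return min(bic, key=bic.get), bic
```

Here `counts` maps each group label to a `(minor_allele_count, total_alleles)` pair. In keeping with the procedure in the text, `best_model` would be called only for SNPs whose omnibus p value falls below 5×10⁻⁵, and no p value is attached to the selected model.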
Discussion
We combined genomewide association methods and a model selection procedure to examine the specificity of genetic influences on three major mental illnesses: schizophrenia, bipolar disorder, and major depressive disorder. This study is the first, to our knowledge, to systematically examine genetic effects across these disorders in a genomewide framework. We observed genomewide significant evidence of association for SNPs near the ADM gene on chromosome 11p. The best-fitting model for our strongest single marker results indicated effects specific to bipolar II disorder. Family studies have supported the hypothesis that bipolar I and II disorder are at least partly genetically distinct (1). Risks of bipolar II disorder tend to be highest among relatives of bipolar II disorder probands as opposed to those with bipolar I disorder or major depressive disorder (39–44). Twin data support the heritability of bipolar II disorder but also suggest shared genetic influences with bipolar I disorder (45). Linkage analyses of bipolar II disorder have found modest evidence of linkage on chromosomes 9p and 18q21 (46). Interestingly, the sixth-ranked region (top SNP: rs11875674; p=3.97×10⁻⁶) in our GWAS is located on 18q21, and model selection implicates a bipolar II disorder effect, although this is 7 Mb from the reported linkage peak.
Our results are the first evidence, to our knowledge, implicating specific genetic variants in the risk for bipolar II disorder and provide support for the hypothesis that the genetic etiology of this disorder is distinguishable from other DSM-IV mood disorders. The genomewide significant signals we observed are located within an expressed sequence tag (EF537581) that has not been well-characterized. This expressed sequence tag does not appear to be brain-expressed (47), making it an implausible candidate for bipolar II disorder. The nearest gene to the associated SNPs is ADM, the gene encoding adrenomedullin. Adrenomedullin is widely expressed in the brain, and mice lacking CNS adrenomedullin exhibit hyperactivity, increased anxiety, and increased sensitivity to the neurotoxic effects of hypobaric hypoxia (48). Elevations of serum adrenomedullin have been reported in bipolar disorder (49), and a recent study identified a functional ADM variant (rs11042725) that was associated with self-reported response to paroxetine among patients with depression (50). We were unable to examine this SNP (or proxies for it) because it is not included in our GWAS or in the HapMap database. However, the SNPs for which we found association do not appear to be in strong linkage disequilibrium with SNPs in adrenomedullin.
Our results also provide preliminary evidence that several SNPs, including markers within several gene regions, have pleiotropic effects that cross traditional DSM-IV boundaries. Although these results did not reach genomewide thresholds for statistical significance, they are consistent with emerging evidence of shared genetic influences on psychotic and mood disorders. In particular, we note a group of SNPs in intron 1 of NPAS3 (top SNP with a p value of 3.96×10⁻⁶), for which the best-fitting model indicated effects on schizophrenia, bipolar disorder, and major depressive disorder. The NPAS3 gene encodes a neuronal transcription factor and was identified as a candidate locus for schizophrenia after a balanced translocation, t(9;14)(q34.2;q13), disrupting the gene was observed to segregate with schizophrenia and learning disability (37). Mice deficient for NPAS3 display behavioral abnormalities, including altered prepulse inhibition, impaired recognition memory, and dysregulation of glutamatergic, dopaminergic, and serotoninergic neurotransmission as well as diminished hippocampal neurogenesis (51–53). Pickard et al. (38) recently reported evidence that haplotypes of NPAS3 are associated with both schizophrenia and bipolar disorder, although the SNPs reported in Table 2 of the present study are not in linkage disequilibrium with those reported by Pickard et al. Nevertheless, our evidence that variation in NPAS3 may have pleiotropic effects on mood and psychotic disorders is intriguing, given the diverse neurobiologic functions of this gene.
Our results have several implications for future genetic studies of major mental illness. First, the analytic approach illustrated provides a basis for identifying genes that underlie the common genetic basis of schizophrenia, bipolar disorder, and major depressive disorder. It is likely that genes influencing these disorders consist of a combination of disorder-specific and shared susceptibility loci. Similar evidence for cross-disorder genetic effects has emerged from GWAS of other medical disorders, most notably autoimmune diseases, in which specific genetic markers are strongly associated with multiple, clinically distinctive conditions (54, 55). For example, nearly one-half of the risk alleles identified as influencing either type I diabetes or celiac disease have been shown to influence both disorders (55, 56). In addition, GWAS have demonstrated that a coding SNP in the PTPN22 gene is associated with reduced risk of Crohn's disease but increased risk of systemic lupus erythematosus, rheumatoid arthritis, type I diabetes, and Graves disease (54). Second, our results may inform case definition in future genetic studies of schizophrenia, bipolar disorder, and major depressive disorder. Specifically, the power of future analyses may be enhanced by combining affected individuals across disorders for loci that appear to have cross-disorder effects. Finally, identifying susceptibility loci that have pleiotropic effects may reveal shared etiologic mechanisms underlying multiple diseases. Biological pathways involving these loci may represent targets for the development of broad-spectrum treatments with efficacy for a wider range of psychopathology than currently available options. We note that the STAR*D sample excluded patients with psychotic depression. While this would not bias our results, it might have implications for their generalizability. To the extent that psychosis itself may be under genetic influence, this exclusion might have made the diagnostic groups more distinct than they would have been had psychosis been present in the major depressive disorder group.
Our results should be considered in light of several limitations. First, given the available sample size, our study had limited power to detect modest genetic effects or rare susceptibility variants using a strict genomewide threshold for significance. Because this is the first report of its kind and the goal was to characterize genotype-phenotype relationships, we present results down to a threshold of p<5×10⁻⁵ to inform future analyses. Although our study detected only one region exceeding genomewide significance thresholds, recent pooled analyses of GWAS data have demonstrated that many variants that fall short of this threshold in a single study emerge as confirmed susceptibility loci when additional data sets are analyzed (20, 21). Thus, the top hits in our study provide preliminary evidence that awaits replication in future studies. A large-scale effort to use genomewide data to dissect within-disorder and cross-disorder influences has begun through the Psychiatric GWAS Consortium (57). Our results may inform these analyses, which ultimately may include GWAS data on >80,000 individuals across five disorders: schizophrenia, bipolar disorder, major depressive disorder, autism, and attention deficit hyperactivity disorder (57).
A second limitation is that by design, our model fitting approach did not involve statistical testing of each model versus the null hypothesis. Instead, to control inflation of type I error, we performed only the single omnibus test followed by model selection. A single test of the true underlying model would be expected to have greater power than the omnibus test. However, since the true model is unknown a priori, we would have to test multiple models to cover the range of possible true models. We have shown by simulation that the omnibus test has power comparable to a test of the true model, especially when the latter is corrected for multiple testing (unpublished data available upon request from Purcell et al.).
In summary, in a genomewide association analysis of three major psychiatric disorders, we identified a region of genomewide significant association on chromosome 11p15, near the ADM gene, which appears to be specific to bipolar II disorder. We also found preliminary evidence for both cross-disorder and disorder-specific influences on schizophrenia, bipolar disorder, and major depressive disorder. Consistent with previous reports in independent samples, we observed evidence that variation in NPAS3 has effects that transcend DSM-IV diagnostic categories. The methods illustrated here may inform larger-scale efforts to examine cross-disorder genetic effects.
Acknowledgments
The authors thank Rutgers Cell and DNA Repository for extracting DNA and providing samples. The STEP-BD project was funded in whole or in part with Federal funds from the National Institute of Mental Health (NIMH), National Institutes of Health, under contract N01MH-80001 to Gary S. Sachs, M.D. (principal investigator), Michael E. Thase, M.D. (co-principal investigator), and Mark S. Bauer, M.D. (co-principal investigator). Active STEP-BD sites and principal investigators included: Baylor College of Medicine (Lauren B. Marangell, M.D.); Case University (Joseph R. Calabrese, M.D.); Massachusetts General Hospital and Harvard Medical School (Andrew A. Nierenberg, M.D.); Portland VA Medical Center (Peter Hauser, M.D.); Stanford University School of Medicine (Terence A. Ketter, M.D.); University of Colorado Health Sciences Center (Marshall Thomas, M.D.); University of Massachusetts Medical Center (Jayendra Patel, M.D.); University of Oklahoma College of Medicine (Mark D. Fossey, M.D.); University of Pennsylvania Medical Center (Laszlo Gyulai, M.D.); University of Pittsburgh Western Psychiatric Institute and Clinic (Michael E. Thase, M.D.); and University of Texas Health Science Center at San Antonio (Charles L. Bowden, M.D.). Collection of DNA from consenting participants in STEP-BD was supported by N01-MH-80001 (Gary S. Sachs, M.D., principal investigator). The Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study was supported by federal funds from NIMH under contract N01 MH-90003 to the University of Texas Southwestern Medical Center at Dallas (A. John Rush, M.D., principal investigator). Genotyping of STAR*D samples was funded by NIMH (R01 MH-072802; Steven P. Hamilton, M.D., Ph.D., principal investigator). The CATIE project was funded by NIMH contract N01 MH-90001. Control subjects from the NIMH Schizophrenia Genetics Initiative (NIMH-GI) data and bio-materials were gathered by the Molecular Genetics of Schizophrenia II (MGS-2) collaboration.
The investigators and co-investigators are: Evanston Northwestern Healthcare/Northwestern University, Evanston, Ill., MH-059571, Pablo V. Gejman, M.D. (collaboration coordinator; principal investigator), Alan R. Sanders, M.D.; Emory University School of Medicine, Atlanta, MH-59587, Farooq Amin, M.D. (principal investigator); Louisiana State University Health Sciences Center, New Orleans, MH-067257, Nancy Buccola A.P.R.N., B.C., M.S.N. (principal investigator); University of California-Irvine, Irvine, Calif., MH-60870, William Byerley, M.D. (principal investigator); Washington University, St. Louis, Mo., U01 MH-060879, C. Robert Cloninger, M.D. (principal investigator); University of Iowa, Iowa City, IA, MH-59566, Raymond Crowe, M.D. (principal investigator), Donald Black, M.D.; University of Colorado, Denver, Colo., MH-059565, Robert Freedman, M.D. (principal investigator); University of Pennsylvania, Philadelphia, MH-061675, Douglas Levinson, M.D. (principal investigator); University of Queensland, Queensland, Australia, MH-059588, Bryan Mowry, M.D. (principal investigator); Mt. Sinai School of Medicine, New York, MH-59586, Jeremy Silverman, Ph.D. (principal investigator).