Published Online: 1 July 2010

Genetic Studies of Schizophrenia and Bipolar Disorder


Since the early 20th century, both schizophrenia (SZ) and bipolar disorder (BP) have been observed to run in families. This link was conclusively determined to be at least in part genetic in a number of twin and/or adoption studies. With the introduction of genetic polymorphisms, the identification of these factors accelerated. Dozens of whole-genome linkage studies clearly implicated a few genomic regions in each disorder, such as 13q, 8q, and 6q in BP and 5q, 6p, and 8p in SZ. Even then, debates about their etiological relationship followed findings of shared linkage regions. Since 2002, genes such as DTNBP1, NRG1, and G72/G30 have been identified in schizophrenia samples in regions of linkage and subsequently replicated in independent samples, including some BP samples. Since 2007, whole-genome association studies have been performed in ever larger samples. These have led to some novel and subsequently replicated findings such as ZNF804A markers associated in both disorders, the identification of widespread association in the major histocompatibility complex, and genome-wide correlation in small-effect polygenic variation between them. The identification of numerous copy number variants on a genome-wide scale in independent samples highlighted commonalities between SZ and developmental and learning disorders. Although the overlap in genetic findings between BP and SZ is incontrovertible, it remains to be established whether this is due to the presence of a single underlying genetic signature manifesting as a clinically variable phenotype (i.e., a unitary psychosis model). The impact of psychotic BP in many BP samples on overlap findings has not been fully investigated. A change in nosological conventions based on these data is premature, and careful examination of clinical subtypes in large samples could be informative. 
At the current time, large-scale sequencing and investigations of epigenetic factors, gene-gene, and gene-environment interactions, and other sources of complexity are in very early stages but are likely to contribute substantially in the future.
Schizophrenia (SZ) is marked by positive, negative, and disorganization symptoms (1) as defined in DSM-IV-TR. The defining characteristics of bipolar disorder (BP) in DSM-IV-TR are episodes of depression alternating with mania and/or hypomania; positive (psychotic) symptoms and thought disorder are also common. Schizoaffective disorder and BP with psychotic features share characteristics of both disorders, as they may include mood and psychotic symptoms in varying degrees of admixture. Both are potentially severe neuropsychiatric conditions, the etiology of which is poorly understood. Genetic studies may be particularly fruitful endeavors in such conditions, as the underlying etiology might be illuminated by a detailed knowledge of risk alleles. Furthermore, both disorders are diagnosed solely on the basis of behavioral signs and symptoms as neither currently has a specific, clearly defined, pathophysiological foundation on which to base diagnostic tests. Because genetic studies inherently examine biological markers, they may provide clues in the development of such tests, which may be clinically useful in risk prediction.
The distinction between SZ and BP is generally credited to Emil Kraepelin, who distinguished dementia praecox from manic-depressive illness on the basis of course of illness (2). He observed the former, which we now refer to as SZ, as having a chronic, deteriorating course, whereas the latter, now called BP disorder, had a relapsing-remitting course. These two illnesses have generally been thought to be genetically distinct, as numerous epidemiological studies failed to convincingly demonstrate familial coaggregation until very recently. Their etiological distinctness has been further supported by the fact that antipsychotics and mood stabilizers are the mainstays of the treatment of SZ (3) and BP (4), respectively, although drugs of both classes are often used to varying degrees in the course of both conditions. However, recent studies have strongly suggested significant commonalities in their genetic underpinnings. Furthermore, both disorders are associated with less severe manifestations segregating in family members, suggesting a spectrum phenomenon. Unipolar depression occurs more frequently in relatives of BP probands (5). In SZ, family members have an increased risk of delusional disorder, as well as schizotypal, paranoid, schizoid, and avoidant personality disorders (6–9), whereas clinically unaffected relatives have increased levels of schizotypal traits (e.g., magical thinking, odd speech, and odd behavior) (10). These findings taken together suggest that current systems of operationalized criteria for categorical disease traits may not adequately reflect their underlying genetic architecture. This hypothesis has been widely recognized and is informing present debate in the development of DSM-5 (11).
Genetic studies in general have proceeded separately for each illness. Only in the last few years has it been feasible to examine the genetic overlap of these two illnesses. Two main factors have contributed to this. First, major technological advances have made it possible to genotype more than a million markers using GeneChips or DNA microarrays. This ability has allowed unprecedented sensitivity in detecting allelic association signals on a genome-wide basis. Second, informatics, database, and web-based applications have matured greatly. Sharing very large data sets within and across countries or continents has become routine, allowing individual samples to be combined into ever larger pooled data sets, with a concomitant considerable increase in statistical power. In this article, the current status of genetics studies of both disorders will be reviewed. Particular attention will be paid to studies of their genetic relationship.


Family studies test whether the family members of affected probands have an increased risk of the disorder compared with the general population. Whereas establishing an increased risk suggests that genetic factors are etiologically relevant, familial aggregation of illness can also be caused by environmental factors affecting sibships or even whole families. Adoption studies can separate the family environment from genetic causes by establishing differences in risk between adoptive and biological relatives (12). Twin studies exploit the fact that monozygotic twins share 100% of their DNA sequences, whereas dizygotic twins share only 50% on average (13). This 2:1 ratio of genetic resemblance makes it possible to calculate the proportion of risk to any trait or disorder that is due to genetic factors. This proportion is known as the heritability, often called h2. It is also possible to calculate c2, the proportion of risk due to environmental factors shared by family members (e.g., nutrition, parenting, socioeconomic status, infectious agents, and others).
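The 2:1 logic described above can be illustrated with a minimal sketch. It assumes Falconer's classical approximation, which derives h2 and c2 directly from the monozygotic and dizygotic twin correlations in liability; this is a simplification and not necessarily the model-fitting approach used in the twin studies and meta-analyses cited in this review. The function name and the example correlations are hypothetical and chosen only for illustration.

```python
def falconer_estimates(r_mz: float, r_dz: float) -> dict:
    """Approximate variance components from twin correlations.

    Assumes Falconer's formulas (an illustrative simplification):
      h2 = 2 * (r_mz - r_dz)   additive genetic share of risk
      c2 = 2 * r_dz - r_mz     shared (common) environment share
      e2 = 1 - r_mz            nonshared environment and error
    """
    h2 = 2.0 * (r_mz - r_dz)
    c2 = 2.0 * r_dz - r_mz
    e2 = 1.0 - r_mz
    return {"h2": h2, "c2": c2, "e2": e2}


# Hypothetical twin correlations, not taken from any cited study.
example = falconer_estimates(r_mz=0.80, r_dz=0.50)
```

By construction the three components sum to 1, so a larger gap between the monozygotic and dizygotic correlations shifts the estimate toward genetic factors, whereas similar correlations in both twin types shift it toward the shared environment.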


A recent family study in Sweden examined approximately 9 million subjects in approximately 2 million families using national registries (14). First-degree relatives of individuals with SZ had an approximately ninefold increase in risk compared with that of the general population, confirming the findings of previous studies. Adopted-off offspring of schizophrenic mothers have an elevated risk of SZ compared with adopted-off offspring of control subjects (15). Furthermore, schizophrenic adoptees have been found to more frequently have schizophrenic relatives than do control adoptees (7, 16). A number of twin studies have been performed in SZ (17). Individually, there has been very little evidence of the importance of common environmental factors. However, a meta-analysis of 14 twin studies from 6 countries has demonstrated a heritability of 81%, as well as a significant common environmental component of 11% (17). This fact may have gone undetected in individual studies due to insufficient power and suggests that it may be possible to identify risk factors in the family environment and develop interventions against them, which was further supported in the Swedish population study (13).

Bipolar disorder

The risk of BP in first-degree relatives of BP probands approaches 10-fold that of the general population (5). This fact has been confirmed in the recent large Swedish population study as well (14). Twin studies indicate that the heritability of BP is high, estimated at 80–90% (18, 19). Twin studies do not support the importance of common environment, but the Swedish population study does (14). Although the risk of major depressive disorder in relatives of BP probands is increased, there is evidence of nonshared genetic factors for mania and depression (19). Adoption studies in BP have largely been underpowered (18).


Linkage is the tendency of two or more loci that lie close together on the same chromosome to be inherited together. By testing for cosegregation within families of diseases and alleles of specific marker loci of known location, the probability of a disease gene being on the same chromosome as (i.e., linked to) that marker can be calculated. Levels of significance have been based on the ratio of the likelihood of linkage to the likelihood of nonlinkage, the base-10 logarithm of which is called the logarithm of the odds (lod) score (20). Linkage studies of mendelian traits have demonstrated a high degree of sensitivity and accuracy in localizing genomic regions harboring susceptibility genes, often leading to the identification of the genes themselves, as in the case of Huntington's disease (21). Linkage studies in complex traits in general and psychiatric disorders in particular have been hampered by low power and probable genetic heterogeneity (22). Numerous linkage studies have been published in BP and SZ (23–26). These have led to a number of reports of suggestive linkage, defined as a lod score expected to occur by chance once per genome scan (27). However, there has been a disappointing level of replication in independent samples, as well as relatively few findings of significant linkage, defined as a lod score expected to occur by chance once per 20 genome scans (27). These results have been attributed to low power or the presence of different risk genes in different populations (i.e., genetic heterogeneity). Another limitation of linkage studies is their low spatial resolution (28). Regions of linkage to a trait can be tens of millions of base pairs (tens of megabases) in length and display considerable stochastic variation in locations of maximum linkage (29). Such limitations make identifying linkage the first step in a long process in gene discovery.
Despite these limitations and the apparent eclipse of linkage studies by genome-wide association studies (GWASs), which can examine common alleles only, linkage studies may still have a role in identifying genomic regions harboring rare alleles that may have an effect in one or a few families.
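As a rough illustration of how a lod score is computed, the sketch below assumes the simplest textbook case: fully informative, phase-known meioses in which each meiosis can be scored as recombinant or nonrecombinant between the marker and the disease locus. Real linkage analyses of complex traits use far more elaborate likelihoods (unknown phase, incomplete penetrance, model-free statistics); the function names here are hypothetical.

```python
import math


def lod_score(n_meioses: int, n_recombinants: int, theta: float) -> float:
    """Lod score at recombination fraction theta for phase-known meioses.

    lod = log10( L(theta) / L(0.5) ), where theta = 0.5 means no linkage.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    k, n = n_recombinants, n_meioses
    likelihood_linked = (theta ** k) * ((1.0 - theta) ** (n - k))
    likelihood_unlinked = 0.5 ** n
    if likelihood_linked == 0.0:
        # Recombinants were observed, so theta = 0 is impossible.
        return float("-inf")
    return math.log10(likelihood_linked / likelihood_unlinked)


def max_lod(n_meioses: int, n_recombinants: int, steps: int = 500):
    """Grid search for the theta in [0, 0.5] that maximizes the lod score."""
    thetas = [0.5 * i / steps for i in range(steps + 1)]
    best = max(thetas, key=lambda t: lod_score(n_meioses, n_recombinants, t))
    return best, lod_score(n_meioses, n_recombinants, best)
```

Under these assumptions, ten informative meioses with no recombinants give a lod score of about 3 at theta = 0, which shows why small families or a single recombinant can leave a true linkage signal below conventional thresholds.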


More than two dozen genome-wide linkage studies of SZ have been published, as has been extensively reviewed (25, 30). The first meta-analysis used the multiple scan probability (MSP) method, which combines the region-specific p values of individual studies (26). The most significant regions of the genome were 8p, 13q, and 22q. The second used the genome scan meta-analysis (GSMA) method, which determines the significance of chromosomal regions based on how often they are highly ranked across genome scans (25). The GSMA reported 2q as the most significant region, although few individual studies reported linkage there. In addition, less stringent significance criteria were met by 5q, 3p, 11q, 6p, 1q, 22q, 8p, 20q, and 14p (25).

Bipolar disorder

More than 20 genome-wide linkage scans have been conducted in BP, the vast majority in European ancestry populations. As with SZ, no chromosomal region has been uniformly supported across all scans, and significant linkage has been reported rarely. Regions that have been supported in several independent samples have included 1q, 4q, 6q, 8p, 8q, 10q, 11p, 12q, 13q, 18p-q, 21q, and 22q (24). Three meta-analyses have been published. As with SZ, the MSP (26) and GSMA (24) methods have been used. However, unlike in SZ, a more powerful method, that of pooling marker data (PMD), has also been used (23). In this method, genetic marker data from multiple studies are pooled and analyzed de novo using a sex-averaged marker map. In the PMD meta-analysis, a model-free lod score of 4.19 was obtained at chromosome 6q22. The 6q region had previously demonstrated significant linkage in two individual studies (31, 32). Chromosome 8q was the only other region demonstrating significant linkage in the PMD analysis.


Association studies are based on allele frequency differences between case patients and control subjects. Association studies have a much higher spatial resolution than linkage studies and can implicate specific risk genes or alleles. Furthermore, association is more powerful than linkage for detecting common alleles of modest effect (33). Candidate gene studies select markers in specific genes for association analysis based on a priori evidence of involvement of biological systems containing such genes in the etiopathogenesis of a given disorder. Candidate gene selection is difficult in neuropsychiatric disorders because etiopathogenesis is poorly understood. However, a number of hypotheses have been generated and tested in the last several decades, such as the neurodevelopmental (34) and dopamine (35) hypotheses of SZ, as well as the monoamine (36), neuroendocrine (37), and more recently, the cellular signaling and plasticity hypotheses of BP and other mood disorders (38, 39).
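The comparison of allele frequencies between case patients and control subjects can be sketched as a 2×2 allelic chi-square test. This is a minimal example assuming a single biallelic marker and simple allele counts; it ignores population stratification, covariates, and genotype-based models that published studies typically address. The function name and the counts used in any call are hypothetical.

```python
import math


def allelic_association(case_risk: int, case_other: int,
                        control_risk: int, control_other: int):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Returns (chi2, p_value, case_frequency, control_frequency).
    """
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column must contain counts")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value, a / (a + b), c / (c + d)
```

For instance, calling `allelic_association(60, 40, 40, 60)` compares a risk allele frequency of 0.60 in case patients with 0.40 in control subjects.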
In more recent years, significant progress has been made in positional cloning, or association mapping, approaches. These studies usually begin with regions linked to the disorder using genome-wide linkage surveys, which use a few hundred microsatellite markers. Additional markers, including single nucleotide polymorphisms, are typed to allow for greater spatial resolution of the linkage “signal” in a process called “fine mapping.” An increasing density of markers within the linked region makes it possible to test specific genes for association. This, when combined with confirmatory approaches of such association in independent samples, expression studies, or other types of in vitro functional studies (40, 41), allows for the identification of specific risk genes or alleles. Such genes may suggest novel neurobiological pathways that may themselves become candidate etiopathogenic mechanisms.
Hundreds of association studies have been published in the last two decades. Hence, our coverage in this review will necessarily be selective. The most promising candidates, which have been supported by numerous replicated studies and/or meta-analyses, are discussed. No association studies have yet reported a sequence variant that is directly causal, i.e., either causing an amino acid substitution, frameshift, or expression change that directly results in a biological defect that can explain the symptoms of the illness. It is noteworthy that GWASs (reviewed below), which generally use denser genotyping per gene than candidate gene studies, have not supported previously associated candidate genes. They have, rather, pointed to previously unsuspected genes and intergenic regions.


More than 900 genes have been tested in SZ in worldwide samples (42). The majority, however, have been in populations of European and East Asian ancestry. The candidate gene approach in SZ has not met with much success. The first genes identified using positional cloning have been in the 6p and 8p linkage regions. Dystrobrevin-binding protein 1 (DTNBP1) (43) and neuregulin-1 (NRG1) (40) were identified in the Irish and Icelandic populations, respectively, in 2002. Since then, they have been tested and replicated in numerous other samples (44, 45). However, the pattern of replication has been complicated by frequent failures to replicate. Furthermore, there has been an inconsistent pattern of associated alleles and haplotypes across samples in different populations. This has seldom been studied in depth. However, a recent meta-analysis of 32 studies of DTNBP1 indicated the presence of distinct association subpopulations that were not obviously explained by ethnicity (44). More thorough investigation into the causes of heterogeneity will necessitate denser genotyping of samples in which association has been observed.
Several other genes have been identified in regions of linkage and subsequently replicated in multiple samples. Regulator of G-protein signaling 4 (RGS4) (46, 47) and NRG1 (45) have both been supported by meta-analyses of associated haplotypes, but not individual SNPs. G72/G30 has been supported by a meta-analysis examining individual SNPs in both European ancestry and Asian populations (48). A meta-analysis of nine central European samples demonstrated relatively homogeneous association in 10 SNPs in disrupted in schizophrenia 1 (DISC1) (49). Although not discovered using association mapping, DISC1 occurs in the 1q42 region, which has been linked to SZ in several samples (50, 51). This gene has been supported not only by replicated association, but also by neurobiological studies. It is an interacting partner with PDE4B, which, along with DISC1 itself, is involved in cAMP signaling (52). Animal models of DISC1 underexpression are associated with brain structural abnormalities and concomitant behavioral deficits suggestive of aspects of the SZ and/or BP phenotypes (53, 54). A number of other genes have received support as well, including Akt1 (55, 56) and catechol-O-methyltransferase (COMT) (57, 58). The status of studies of these genes, including functional studies, has been recently reviewed (59).

Bipolar disorder

A number of SZ susceptibility genes have also been associated in BP. The evidence, however, is generally not as strong as that in SZ. For example, a meta-analysis of markers in the gene for G72/G30, which was initially identified in SZ and subsequently replicated in both disorders, was significant in SZ, but not in BP (48). This finding may have been due to power differences, as the sample size of SZ subjects was severalfold that for BP subjects. DISC1 (60, 61) and NRG1 (62, 63) have both been associated in more than one independent sample. However, consistent associations of alleles and haplotypes have been as elusive as in SZ. Meta-analyses have yet to be published for these genes, and they have received little support in GWASs.
A number of genes have been studied because of their function in pathways involved in biological systems thought to mediate mood symptoms. These include monoamine oxidase A (MAOA), which is involved in the degradation of biogenic amines, such as neurotransmitters. Very few individual studies have reported significant association, but a meta-analysis reported association with a CA repeat polymorphism (64). The serotonin transporter (SERT) is one of the main targets of antidepressant medications and is therefore a natural candidate gene for mood disorders. A 44-bp insertion/deletion in the promoter region (5HTTLPR), thought to alter transcription levels, has been supported in two of three meta-analyses (65, 66). The dopamine type 2 receptor (DRD2) has been studied in multiple samples, because of its blockade by antipsychotic medications. A multicenter European study reported significant association in the 5-repeat allele of a CA repeat, which has been the most studied marker in this gene (67). A number of other genes have been tested, as discussed comprehensively elsewhere, but results have shown moderate or inconsistent levels of association (68). These include DTNBP1, TPH, DRD4, and BDNF.


GWASs of numerous disorders have been published. The National Human Genome Research Institute GWAS catalog lists 583 publications as of June 14, 2010. The objective of these studies is to survey the entire genome, in the most systematic and unbiased way possible, for specific alleles associated with a disease (69). They have most commonly been performed in case-control samples, in which allele frequencies in case patients are compared with those of control subjects, much as in other association studies. However, family-based methods, examining the transmission of alleles from parents to affected offspring, may also be used. To cover as much of the entire genome as possible, approximately 1 million SNP markers are used across all chromosomes, including the X, Y, and mitochondrial chromosomes. This unprecedented density of markers is possible using GeneChips, the most recently developed of which are the Affymetrix 6.0 and Illumina 1M chips, which can genotype 906,000 and 1,199,187 SNPs, respectively. Although there are many times more SNPs in the genome, because of the correlation of SNPs close to each other (called linkage disequilibrium) (70), it is possible to detect association signals throughout the genome with only a subset of existing SNPs.
Because of the extremely large number of markers tested in a typical GWAS, the problem of multiple testing is a daunting one. An early and influential benchmark has been p<1×10⁻⁵ and p<5×10⁻⁷ as moderately strong and strong evidence for association, respectively, as suggested and used by the Wellcome Trust Case Control Consortium (WTCCC) (71). Naturally, such stringent significance thresholds necessitate samples of much greater power than in previous association studies. Such sample sizes have been made possible by combining samples from multiple sites. Nevertheless, achievement of these criteria has been rare. Many studies report not only those markers that meet criteria, but also clusters of markers, which individually have lower statistical significance but may index regions that could be pathologically relevant. The review below is necessarily selective, as numerous regions or markers are often reported in any given study and discussion of all of these is beyond the scope of this article. An attempt has been made to emphasize regions/genes with high levels of clustering (i.e., numerous associated markers close to each other), biological significance, or both.
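To make the scale of the multiple-testing problem concrete, the following sketch contrasts a simple Bonferroni correction with the two WTCCC benchmarks quoted above. The Bonferroni calculation is shown only as a familiar point of comparison and assumes independent tests; this review does not state how the WTCCC benchmarks were derived, and the sketch does not claim they come from a Bonferroni correction. Function and constant names are hypothetical.

```python
WTCCC_MODERATE = 1e-5  # "moderately strong" benchmark quoted in the text
WTCCC_STRONG = 5e-7    # "strong" benchmark quoted in the text


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-marker p value threshold that keeps the family-wise error
    rate at alpha, assuming n_tests independent tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def wtccc_evidence(p_value: float) -> str:
    """Classify a single-marker p value against the WTCCC benchmarks."""
    if p_value < WTCCC_STRONG:
        return "strong"
    if p_value < WTCCC_MODERATE:
        return "moderately strong"
    return "below benchmark"
```

With approximately 1 million markers, `bonferroni_threshold(0.05, 1_000_000)` is about 5×10⁻⁸. Because neighboring SNPs are correlated through linkage disequilibrium, the assumption of independent tests makes this correction conservative.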


A total of seven GWASs of SZ have been published to date. The first, in a sample of 178 case patients and 144 control subjects, reported a SNP near colony stimulating factor 2 receptor alpha chain (CSF2RA), which is in the pseudoautosomal region of the X and Y chromosomes (72). A study based on pooled DNAs observed association in the reelin (RELN) gene, which was specific to females, in an Ashkenazi Jewish sample. This was replicated in three European and one Chinese sample. RELN is known to be involved in corticogenesis, and a mutation is associated with lissencephaly in humans (73). The Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE) study, one of the largest studies of antipsychotic efficacy ever performed, also contained a genetic substudy (74). This sample formed the basis of a GWAS of SZ, which reported no SNPs with genome-wide significance. However, it may be much more informative as a genetic study of treatment response and adverse effects. In a U.K. sample, a zinc finger protein gene of unknown function (ZNF804A) was identified and then subsequently replicated in two independent replication samples that included European, Ashkenazi, Chinese, and Japanese subsamples (75). In 2009, three large GWASs of SZ performed in three independent consortia were published in the same issue of Nature (76–78). The International Schizophrenia Consortium (ISC) combined samples from several sites in the United Kingdom, Ireland, Bulgaria, Sweden, and Portugal, comprising 3,322 case patients and 3,587 control subjects (76). Significant association was observed at a marker in the major histocompatibility complex (MHC) region, very close to NOTCH4. However, 450 SNPs were associated to a lesser degree throughout the MHC region. The two other GWASs were in the Genetic Analysis Information Network (GAIN), a multicenter U.S. and Australian sample (77), and S-GENE, a European consortium (78).
Neither of these two samples yielded genome-wide significant markers on its own. However, each of the groups shared at least some of their data with the other two, allowing for an unprecedentedly large meta-analysis of 8,008 case patients and 19,057 control subjects. This resulted in multiple SNPs in the MHC region meeting the criteria for genome-wide significance, as well as additional significant SNPs in or near neurogranin (11q24.2), involved in N-methyl-d-aspartate signaling, and TCF4 (18q21.2), which is essential to normal brain development. The ISC study further described a novel method in which subjects are “scored” for the number of risk alleles they carry. This method is used to test whether a set of markers only nominally associated in one sample can predict the risk of illness in a second, independent GWAS sample. Furthermore, the illness in the second sample may be the same as or different from that tested in the initial sample. With this method, the authors found evidence of the effects of possibly thousands of SNPs, supporting the long-held polygenic theory of SZ. They also observed significant genetic overlap between SZ and BP, using the WTCCC and Systematic Treatment Enhancement Program for Bipolar Disorder (STEP-BD) samples (covered below). Such overlap was not seen in a range of nonpsychiatric disorders, such as type II diabetes and hypertension, which were also studied in the WTCCC. This result suggested that the genetic risk to both disorders resulting from common polymorphisms was shared and that this was not due to an artifact of the DNA microarray or statistical methods used.
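The scoring approach described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the ISC's actual pipeline: SNPs are selected from a discovery sample by a liberal p value threshold, each subject in an independent target sample is scored by summing risk allele counts weighted by the discovery log odds ratio (a weighting assumed here for illustration), and mean scores are compared between case patients and control subjects. All names and data structures are hypothetical.

```python
from statistics import mean


def select_snps(discovery_results: dict, p_threshold: float) -> dict:
    """Keep SNPs whose discovery p value is below p_threshold.

    discovery_results maps SNP id -> (p_value, log_odds_ratio), with the
    log odds ratio oriented to the allele counted as the risk allele.
    Returns SNP id -> weight (the log odds ratio).
    """
    return {snp: log_or
            for snp, (p_value, log_or) in discovery_results.items()
            if p_value < p_threshold}


def risk_score(genotype: dict, weights: dict) -> float:
    """Weighted count of risk alleles (0, 1, or 2 per SNP) for one subject.
    SNPs missing from the genotype contribute nothing."""
    return sum(weight * genotype.get(snp, 0) for snp, weight in weights.items())


def mean_score_difference(cases: list, controls: list, weights: dict) -> float:
    """Mean score in target case patients minus mean score in control subjects."""
    case_mean = mean(risk_score(g, weights) for g in cases)
    control_mean = mean(risk_score(g, weights) for g in controls)
    return case_mean - control_mean
```

A positive difference in the target sample, assessed with an appropriate significance test, would indicate that alleles only nominally associated in the discovery sample carry information about risk in the second sample, whether that sample has the same disorder or a different one.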

Bipolar disorder

To date, six GWASs using individual genotyping have been published, as well as one using pooled genotypes. The first study, the WTCCC, used 1,900 case patients and 3,000 control subjects (71). The latter were shared among GWASs of other disorders as well. It reported a SNP on 16p12 near the genes PALB2, NDUFAB1, and DCTN5. The latter interacts with DISC1. The second study combined the DNA of 461 case patients and 562 control subjects into 50–80 sample pools and used a German sample of 772 case patients and 876 control subjects for replication (79). Several markers were associated in the diacylglycerol kinase eta (DGKH) gene, which is part of the lithium-sensitive phosphatidyl inositol pathway, and is located in chromosome 13q14, which has been linked to BP (26). The third combined a London sample with that of STEP-BD, a longitudinal U.S. cohort study assessing treatment response in BP, as well as replication samples from the United States and Scotland (80). This is often referred to as the STEP-UCL study. Haplotype analyses of SNPs in MYO5B were significant genome-wide. This gene is involved in vesicle trafficking. Other genes that were implicated in this study included TSPAN8 and EGFR. This study was followed by one that added additional samples from Edinburgh, Scotland, Dublin, Ireland, and another STEP-BD subset to the original STEP-UCL sample (81). The latter study resulted in association of ANK3, which regulates voltage-gated sodium channels in axons, as well as confirmation of CACNA1C, which had previously shown suggestive associations in the WTCCC and STEP-UCL studies. This class of calcium channels mediates a number of processes in neurons. Interestingly, calcium channel blockers have been considered a possible treatment for BP, although their use for this indication is probably now rare.
The fifth study was also funded by the GAIN network, and included samples of both European American (EA) (1,001 case patients and 1,033 control subjects) and African American (AA) subjects (345 case patients and 670 control subjects) (82). With use of haplotype analysis, it reported evidence consistent with ANK3 harboring risk variants. However, it failed to support CACNA1C. For the EA sample, the most associated SNP was in an intergenic region of chromosome Xq27.1. In the AA sample, the most associated SNP was in the DPY19L3 gene at 19q13.11. Several regions that contained clusters of SNPs with low p values (p<1×10⁻⁴) were also observed. Of these, the NAP5 gene was associated in the EA and the NTRK2 gene in the AA sample. The latter is a tyrosine kinase receptor that binds to brain-derived neurotrophic factor (BDNF), an important neurotrophin. Haplotype analysis also implicated SLITRK2, which is known to regulate neurite outgrowth. The seventh and most recent GWAS of BP was conducted in a combined sample from the National Institute of Mental Health and Prechter repositories as well as GlaxoSmithKline along with the WTCCC (83). The strongest association in the combined sample of 3,683 case patients and 14,507 control subjects was in an intron in the gene MCTP1, which binds Ca²⁺. Other regions containing clusters of strongly associated SNPs included 1p31.1, 3p21, and 5q15. Several of the genes located in associated regions were noted to have effects on neural development and plasticity.


Cytogenetic studies examine the structure of chromosomes to determine whether chromosomal segments have either been deleted or enlarged or have changed location. One of the most prominent among such findings has been a deletion on chromosome 22q11 known to cause velocardiofacial syndrome. In addition to the learning disability, palatal defects, and craniofacial abnormalities seen in this syndrome, about one-third of patients have psychiatric conditions, including SZ (84) and autism (85). This region contains genes that have been associated with SZ, including COMT and PRODH (59). Another seminal finding involved the discovery of DISC1. This gene, along with DISC2, is disrupted in a balanced translocation between chromosomes 1 and 11. The translocation was discovered in a Scottish family that segregated SZ, BP, and major depressive disorder (86).
Duplications are increases in the number of times a repetitive DNA segment occurs. Deletions, on the other hand, are decrements in the number of times such a segment occurs. Together, these are called copy number variants (CNVs). One of the benefits of DNA microarrays used in GWASs is that they not only allow the simultaneous genotyping of approximately 1 million SNPs, they also can be used to infer copy number using the intensity values observed for individual SNPs. For example, four copies of a given allele will have greater intensity than two copies, and so on. This same technology has therefore led to a rapid increase in studies of both common SNPs and rare CNVs. Other technologies, such as array-comparative genomic hybridization, may also be used. The last several years have witnessed a significant proliferation of CNV studies in both BP and SZ. One study group, the ISC, has published genome-wide association (76) and CNV studies (87) using the same DNA microarray data. Up to now there have been considerably more CNV studies in SZ than in BP. As with GWASs, these studies of necessity use very large sample sizes but for different reasons. Because individual CNVs can be quite rare (e.g., they may appear only a few times in several thousand subjects), their effect sizes must be large to be detectable. Effects of this magnitude had previously been observed very rarely in psychiatric genetics.
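The idea of inferring copy number from probe intensities can be illustrated with a deliberately simplified sketch. It assumes that intensity scales linearly with copy number and that a reference intensity for the normal two-copy state is available for each probe; production CNV callers use considerably more sophisticated statistical models and quality control. The function name and inputs are hypothetical.

```python
from statistics import median


def estimate_copy_number(sample_intensities: list,
                         reference_intensities: list,
                         normal_copies: int = 2) -> int:
    """Estimate the copy number of a segment from consecutive probes.

    Each sample intensity is divided by the reference intensity for the
    same probe (the expected value at normal_copies); the median ratio
    across probes is scaled to a copy number and rounded.
    """
    if not sample_intensities or len(sample_intensities) != len(reference_intensities):
        raise ValueError("need matching, non-empty intensity lists")
    ratios = [s / r for s, r in zip(sample_intensities, reference_intensities)]
    return round(normal_copies * median(ratios))
```

Using the median across several adjacent probes, rather than a single SNP, is one simple way to reduce the influence of noisy individual intensity values.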


A number of studies have reported that patients with SZ have a greater number of CNVs (i.e., CNV burden) than control subjects (87–90). Interestingly, the CNVs that had the largest effect were present in genes and also were the rarest CNVs, occurring in only one individual (i.e., singletons). Pathway analyses of genes containing CNVs demonstrated an enrichment of genes involved in synaptic transmission or neurodevelopment (90). Several specific deletions and duplications have been found to be recurrent, i.e., occurred in several individuals. Some of these have now been replicated in independent samples. As with GWASs, the earlier genome-wide CNV studies included at most a few hundred case patients and control subjects but were able to detect significant differences in CNV burden between the groups.
In 2008, two comparatively very large studies were published in the same issue of Nature (86, 88). Both identified the 1q21.1 deletion. It was rare, at 0.2% in control subjects, and had an odds ratio (OR) of 9.1. A combined analysis of these two samples, the ISC and Stefansson et al. studies (86, 88), along with a Cardiff sample, strongly supported a deletion at 15q13.3 (91). It was rare in schizophrenic patients, at 0.2% frequency, but more than 10-fold rarer in control subjects, yielding an OR of 11.4. This deletion has been found in non-SZ samples as well, where it has been associated with variable phenotypes including intellectual dysfunction, autism, and attention deficit/hyperactivity disorder (92). Recently, evidence for a deletion at 16p11.2 was reported in a meta-analysis to have an OR of 8.4 (93). This deletion has also been found in autism and intellectual disability (94). In addition to larger CNVs deleting one or more genes, some CNVs may be smaller and may disrupt only a single gene. One prominent example is in neurexin 1 (NRXN1), which, along with other neurexins, induces postsynaptic differentiation in dendrites. Several independent groups have identified SZ-related CNVs in this gene (90, 95, 96). One study examined 2,977 case patients and 22,746 control subjects from seven European populations, focusing specifically on the neurexins. Patients with SZ were much more likely to have exonic CNVs in NRXN1 than control subjects (0.24% versus 0.015%, OR=9) (97). The common finding that SZ-related CNVs are also associated with other developmental or behavioral disorders has driven hypotheses about their variable expressivity. For example, it may be possible that environmental, epigenetic, or stochastic factors can shape the form and outcome of a neuropsychiatric condition once a pathway has been interrupted by a specific CNV (98).
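For readers less familiar with the statistic, the following sketch shows how a crude odds ratio and an approximate confidence interval can be computed from carrier counts. The counts are hypothetical and chosen only for illustration. Published estimates such as those cited above may come from stratified or combined analyses, so a crude calculation from reported frequencies will not necessarily reproduce them. The function name and the use of the Woolf (log-based) interval are assumptions of this sketch.

```python
import math

def odds_ratio_with_ci(case_carriers, case_total, control_carriers, control_total, z=1.96):
    """Crude odds ratio and Woolf (log-based) 95% confidence interval for a 2x2 table."""
    a = case_carriers
    b = case_total - case_carriers
    c = control_carriers
    d = control_total - control_carriers
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio; dominated by the smallest cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

# Hypothetical counts: 10 carriers among 5,000 case patients, 2 among 10,000 control subjects.
estimate, lower, upper = odds_ratio_with_ci(10, 5000, 2, 10000)
print(f"OR = {estimate:.1f}, 95% CI {lower:.1f} to {upper:.1f}")
```

With so few carriers the interval is very wide even though the point estimate is large, which illustrates why rare CNVs require both large effects and very large samples to be detected.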

Bipolar disorder

The GAIN BP sample was analyzed for singleton deletions and duplications (i.e., those that appear only once in a data set) (99). Although no significant differences in number of duplications were found in patients with BP, the number of singleton deletions was enriched in case patients versus control subjects (0.18 versus 0.13, p=0.01). When only those with onsets at age 18 or younger were included in analysis, the proportion of case patients with deletions was even greater. Using Ingenuity Pathways Analysis software, the authors observed that genes disrupted in case patients, compared with those in control subjects, were overrepresented in pathways important to “psychological disorders and behaviors such as learning” (99).


Family studies have been used to determine whether the relatives of patients with SZ have an increased risk of BP and vice versa. Several studies have found an increased risk of psychotic BP in the relatives of SZ probands, but a number of studies have failed to find such familial coaggregation of SZ and BP in general (100, 101). Recently, a record linkage study in Sweden examined more than 2 million families merged in a multigenerational population registry with a national hospital discharge registry, which contained information on specific diagnoses (14). Significant familial coaggregation of these two disorders was observed. The authors were able to use the information available on family relationships (e.g., sibling versus half-sibling, parent, and others) to estimate the heritabilities of both disorders. Furthermore, they estimated the proportion of risk attributable to genetic factors common to both at 63%. This influential study of unprecedented size suggested that previous studies examining familial coaggregation of SZ and BP were underpowered to detect a true phenomenon.
Some chromosomal regions have been implicated in linkage studies of both disorders. Among these is 13q (26). Interestingly, one of the genes in this region, G72/G30, has also been associated with both disorders (48). Schulze et al. (102) reported that markers in this gene were associated with a subset of a German bipolar sample who had persecutory delusions. This finding was replicated in a Polish sample and suggested that the genetic overlap between BP and SZ may in part be mediated by psychotic symptoms common to both. A number of other genes have also been implicated in both disorders, such as NRG1 and DISC1, as discussed above in “Association Studies.”
GWASs have provided an unprecedented opportunity to investigate the genetic overlap between disorders on a genome-wide scale. Evidence for association in a GWAS in a U.K. SZ sample for ZNF804A increased when BP was added to SZ in the phenotype definition (75). A similar pattern was observed in an independent sample (103). Whereas twin studies can indicate genetic correlation between disorders, they can only infer the aggregate additive genetic effects using the correlations between relatives, such as monozygotic and dizygotic twins. Twin studies, furthermore, rest on the equal environment assumption, which posits that common environmental sharing is of the same magnitude in monozygotic as in dizygotic twins, but this assumption has not always been supported in all disorders (104). GWASs, on the other hand, can use the actual marker data in cohorts of unrelated subjects, allowing specific genomic regions causing overlap to be identified. Recently, the ISC described a novel method to test for genetic overlap between phenotypes using GWAS data. They reported that the observed polygenic risk for SZ was substantially shared with that for BP, using two independent BP GWAS samples, STEP-BD and WTCCC (76).
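The general logic of testing for polygenic overlap with GWAS data can be sketched as follows. This is a simplified illustration of polygenic scoring in general, not a reproduction of the ISC analysis: it assumes that markers are selected from a discovery sample at a liberal p-value threshold, weighted by their log odds ratios, and summed over allele counts in an independent target sample. All marker names, odds ratios, p values, and genotypes are hypothetical.

```python
import math

def select_weights(discovery, p_threshold):
    """Keep markers below the p-value threshold; weight each by its log odds ratio."""
    return {snp: math.log(odds_ratio)
            for snp, (odds_ratio, p_value) in discovery.items()
            if p_value < p_threshold}

def polygenic_score(genotype, weights):
    """Sum counts of the scored allele (0, 1, or 2) times the discovery-sample weights."""
    return sum(weight * genotype.get(snp, 0) for snp, weight in weights.items())

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

# Hypothetical discovery results (e.g., from an SZ sample): marker -> (odds ratio, p value).
discovery = {"snp1": (1.10, 0.01), "snp2": (0.90, 0.20), "snp3": (1.05, 0.60)}
weights = select_weights(discovery, p_threshold=0.5)

# Hypothetical target-sample genotypes (e.g., from a BP sample): allele counts per marker.
target_cases = [{"snp1": 2, "snp2": 0, "snp3": 1}, {"snp1": 1, "snp2": 1, "snp3": 2}]
target_controls = [{"snp1": 0, "snp2": 2, "snp3": 0}, {"snp1": 1, "snp2": 1, "snp3": 1}]

case_mean = mean([polygenic_score(g, weights) for g in target_cases])
control_mean = mean([polygenic_score(g, weights) for g in target_controls])
print(case_mean > control_mean)
```

If scores built from one disorder are on average higher in case patients than in control subjects for a second disorder, this is consistent with shared small-effect variation. A real analysis would involve many thousands of markers, pruning for linkage disequilibrium, and a formal test of the case-control difference.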
Discussion of the nosological implications of genetic overlap has accelerated since early reports of association of BP with genes discovered in SZ, most notably G72/G30 (105, 106). These findings, in conjunction with the convergence of findings in SZ and BP from the realms of structural and functional imaging, neuropathology, neuropsychology, and psychophysiology [reviewed by Maier et al. (107)] have led to debates about the usefulness of the “Kraepelinian dichotomy.” A widely accepted set of criteria put forth by Robins and Guze (108) to support the validity of any given illness included family inheritance and biological markers (of which sequence variants are a subset). These two criteria suggest that a “unitary psychosis” model (109) may “carve nature at its joints” more meaningfully than the Kraepelinian dichotomy. However, clinical course is another validating criterion, and one in which SZ and BP differ considerably. Furthermore, differential response to treatment, as clearly seen in BP and SZ, has also been posited to be a distinguishing factor (110). Therefore, despite the overlapping genetic findings reviewed in this article, important discontinuities between BP and SZ remain difficult to ignore or explain away. Moreover, the reliance on genetics as an arbitrating factor in nosological distinctions is limited by the increasing complexity and plasticity of our concept of the gene as a discrete entity, as argued by Kendler (111).


Both BP and SZ are known to be clinically heterogeneous even in their full-blown forms. BP may have psychotic, euphoric, mixed, and rapid cycling presentations, and puerperal onset, as well as comorbid panic disorder and obsessive-compulsive disorder (112). SZ has disorganized, catatonic, and paranoid subtypes (DSM-IV-TR), as well as clinically and genetically relevant presentations such as deficit syndrome (113). The remarkable variability in the clinical presentation of SZ has prompted debate over whether it is one or more disorders since the time of Eugen Bleuler (114). Both disorders can have early or late age of onset, good and poor course and outcome, and the presence of suicidality and substance abuse, not to mention symptomatic dimensions such as psychosis, thought disorder, and others. Many of these illness variables demonstrate heritability, suggesting that they may reflect genetic subtypes (115, 116). If latent genetic subtypes are in fact expressed as clinical subtypes, use of clinically nonspecific illness definitions (such as traditional SZ and BP operationalized criteria) may decrease the signal-to-noise ratio in gene-finding studies.


Even before Kraepelin's seminal dichotomy between dementia praecox and manic-depressive illness, a number of more specific clinical presentations were described. These included hebephrenia, paranoia, and catatonia (117). Although they are still seen as DSM subtypes (118), their etiological relationship with operationally defined SZ has been unclear. A number of investigations have suggested the presence of genes that affect more or less distinct illness subtypes. Family studies demonstrate that several symptomatic dimensions, as well as subtypes, aggregate within families (115). Twin studies of symptom dimensions or subtypes require samples of concordant monozygotic and dizygotic twin pairs (compared with twin studies of SZ, which examined both concordant and discordant pairs). The resulting smaller sample sizes and low power have made it difficult to draw conclusions from twin studies of such phenotypes.
A number of association studies have tested for allelic effects on symptom dimensions in SZ-only samples. Although these have been too few and underpowered to be conclusive, some interesting patterns have emerged. For example, DTNBP1 has been shown to be preferentially associated with negative symptoms in two independent samples (119, 120). It was also associated with a more progressive illness course (121) and cognitive decline (122) as well as low levels of manic symptoms (123), which may correlate with negative symptoms, and with reduced occipital and prefrontal brain volumes (124) and greater spatial working memory deficits (125). These findings taken together suggest that DTNBP1 variants may be risk factors for a more chronic, deficit-like clinical picture with more negative symptoms, cognitive deficits, and brain structural abnormalities, reminiscent of Crow's type II syndrome (126). Additional findings have included associations between DISC1 and delusions, Akt1 and hallucinations (56), and TAAR6 and delusions (127). Two independent samples reported association between COMT and affective symptoms (128, 129). Interestingly, these were the same two samples that reported associations between DTNBP1 and negative symptoms. Although these studies provide interesting leads, the general lack of replication and small sample sizes make it difficult to draw firm conclusions. However, studies of quantitative symptom dimensions and subtypes may critically inform nosological debate. In fact, the shift toward a more quantitative approach to diagnosis is one of the key components of the development of DSM-5 (11).
Several linkage studies have used illness subtypes as the phenotype of interest to reduce clinical heterogeneity. These have mostly used phenotypes based on empirical (i.e., statistical) subtyping methods, such as latent class analysis. Results include linkage of SZ with cognitive deficit to 6p24 (130), the latent classes “mania” and “schizomania” to 20p (131), and a latent class resembling deficit SZ to 1q23–25 (132). The finding of little or no linkage in these regions to traditional phenotype definitions suggests that genetic heterogeneity across clinical subtypes may contribute to negative findings. Testing of empirical subtypes and quantitative symptomatic dimensions in GWAS and other studies may therefore help clarify the pattern of findings across samples. This is planned as part of the Psychiatric GWAS Consortium (PGC) (133).

Bipolar disorder

The presentation of BP has been noted to be variable even since the adoption of Kraepelin's dichotomy, when manic-depressive illness subsumed the more contemporary concepts of major depressive disorder and BP with psychotic features (2). A number of illness variables, including subtypes, have been delineated. The one that has received by far the most attention in genetic studies is BP with psychotic features (psychotic BP). These findings have been reviewed in detail (112). A number of linkage studies have been performed using psychotic BP as the phenotype of interest. Interestingly, some of these have yielded linkage signals in chromosomal regions that have been linked to SZ in several independent samples. These have included 5q33-q34 (134) and 8p21-p22 (135). Chromosome 13q31-33 has been linked to both SZ and BP and has also shown linkage to psychotic BP (134–136). Although 2p11-q14 has not been linked to SZ in many individual samples, meta-analysis of 20 SZ samples showed it to be the most highly linked region of the genome (25). A study of psychotic BP with mood-incongruent psychotic features reported linkage to this region (136). A number of SZ candidate genes have also been associated with psychotic BP, such as DTNBP1 (137) and G72/G30 (102), as mentioned above. Although not a SZ candidate gene, GRIA1 is in the 5q linkage region of SZ and has been associated with psychotic BP (138). The number of significant findings in psychotic BP and SZ raises the question of whether the observed genetic overlap of SZ and BP (14, 76) is explained by that between SZ and psychotic BP specifically. A number of other putative subtypes and quantitative traits remain to be tested in informative samples. This may be facilitated by the Bipolar Phenome Database (139), which aims to catalog phenotypic data on numerous BP samples.
Work proposed in the PGC, which has a Cross-Disorders Group, may also shed light on a number of BP phenotypes as well as their genetic overlap with SZ and other disorders (133).


The studies reviewed heretofore involve the examination of known sequence variants and their impact on the clinical phenotype. These approaches may not be as sensitive to other kinds of genomic alterations, necessitating the development of new avenues of inquiry. Epigenetic changes are heritable modifications to the genome that do not involve sequence changes. For example, an addition of a methyl group to a cytosine residue can decrease expression by reducing binding of transcription factors. Methylation is mediated by DNA methyltransferase (DNMT1), the expression of which is up-regulated in brain tissue in both BP and SZ (140). Intriguingly, some antipsychotics and mood stabilizers may decrease methylation, which may explain a portion of their mechanism of action (141). This has recently been observed in peripheral blood lymphocytes as well (142).
Genome-wide gene expression in peripheral blood lymphocytes has been a primary focus in several studies. Although sequence variants are assumed to alter either protein function or expression, these studies examined RNA transcript levels directly. One of the criticisms of these studies is based on the assumption that the brain is the only organ relevant to the vulnerability to psychiatric illness. However, the recent strong implication of the MHC region, including several human leukocyte antigen genes, suggests that genetically mediated dysfunction of cell types in the immune system may affect risk as well. There have been few such studies, and replication has not been attempted yet, but expression changes have been reported in genes involved in myelination and growth factor signaling in mood disorders (143), as well as in myelination, synaptic function (144), and glutamate metabolism (145) in SZ.
Finally, a method that has long been practiced but that will assume increasing prominence is DNA sequencing. Studies using known markers, as reviewed above, may miss true causative variants, some of which may be rare, because the markers may not correlate with unobserved variants in regions of low linkage disequilibrium. Recent technological developments, such as the Illumina/Solexa, Roche/454, and SOLiD 3 systems, have allowed for unprecedented throughput, orders of magnitude greater than traditional Sanger sequencing (146). This has made it practical to sequence whole chromosomal regions and exomes, and, in the future, whole genomes of patients and control subjects, which will be facilitated by publicly available whole genome sequences such as the 1000 Genomes Project (


After many years of meager success, genetic studies of SZ and BP have borne considerable fruit in the last decade. Much of this has been due to the success of the Human Genome Project's rapid discovery and dissemination of genomic information. Technological advances have made it possible to cheaply and rapidly measure up to a million genotypes, which has led to numerous discoveries of both genomic regions likely to harbor susceptibility genes, as well as a plethora of individual putative susceptibility genes. Although the majority of these have failed to be replicated, a small but significant number are very unlikely to be false-positives. Examples of both common and rare variants are of particular interest. The first are common SNPs in genes such as ZNF804A, MHC genes, and CACNA1C, which have been discovered in GWASs and replicated in additional independent samples with a high level of statistical significance. The second are rare CNVs, such as the deletions in 1q21.1 and 15q13.3, which have been independently confirmed in large samples. The genome-wide consistency of small-effect alleles across large independent samples is also striking (76), especially given the rigorous effort to eliminate artifacts.
Intriguingly, in all of these examples, the effect of alleles has an impact on both SZ and BP. The association of ZNF804A in both SZ and BP supports the genetic overlap of these disorders strongly suggested by a large GWAS (76) and family studies (14). However, a true understanding of the causes of such phenotypic overlap is limited by our diagnostic system, especially as used in genetic studies. BP studies tend to “lump” bipolar I disorder with and without psychosis, along with schizoaffective disorder, bipolar type. It is therefore possible that the latter two entities are genetic subtypes of a putative psychotic spectrum and are in fact driving the observed genetic overlap. These questions can only be resolved by examining large, phenotypically homogeneous subtypes, such as BP with and without psychosis, and SZ with and without affective symptoms. This examination may now be possible in the largest consortia such as the PGC. It is therefore too soon to use genetic data to declare the demise of the Kraepelinian dichotomy, especially because there are other reasons not to. It could be more helpful to use the wealth of genetic information currently available to attempt to explain the seemingly infinite variation observed in the mood and psychotic disorders. Doing so may also speed gene discovery by decreasing heterogeneity.
Perhaps a more pressing issue than the knowledge that has already been gained involves the massive remaining gaps in our knowledge base. The allelic heterogeneity of susceptibility genes is a conundrum, especially given the large numbers of studies reporting association (e.g., with DTNBP1). Rather than eschew the evidence for such genes, it would be wise to continue to probe for an underlying explanation for the observed associations using deep sequencing and other modalities. GWAS signals also need to be probed to isolate nearby causative variants. Finally, there are a number of other factors that remain to be accounted for in explaining the impact of genetic variation on risk. These include environmental variables, gene-gene interactions, gene-environment interactions, the interaction between sequence and epigenetic variation, and the presence of very rare alleles and CNVs, which may not be detected using current technology. Some of these factors have been invoked in explaining the “missing heritability” of complex disease, and increasing attention should be paid to them (147).


