Schizophrenia (SCZD, OMIM 181500) is a serious neuropsychiatric illness estimated to affect approximately 1% of the general population. Family, twin, and adoption studies have demonstrated that schizophrenia is predominantly a genetic disorder, with a high heritability
(1) . Multiple genetic and nongenetic factors are likely to be involved
(2) . As part of a genome-wide search for loci contributing to risk for schizophrenia, we previously reported linkage, with a maximum heterogeneity lod score of 6.5, to chromosome 1q21-22 (SCZD9, OMIM 604906) in a group of 22 medium-sized Canadian families selected for study because multiple relatives were clinically diagnosed with schizophrenia or schizoaffective disorder
(3,
4) . Other studies have also reported linkage
(5 –
8) and linkage disequilibrium (LD)
(9,
10) of schizophrenia to this region. We previously tested markers from this region for evidence of LD to schizophrenia in our Canadian families, identifying significant LD with three markers within the gene for NOS1AP (nitric oxide synthase 1 [neuronal] adaptor protein; formerly known as CAPON)
(11) . The same allele for two of these markers was recently reported to be associated in a South American population isolate
(12) . Association of different single nucleotide polymorphisms (SNPs) within
NOS1AP to schizophrenia also has been reported in a Han Chinese sample
(13) .
NOS1AP is an attractive candidate for schizophrenia susceptibility. NOS1AP was first identified in the rat as a neuronal nitric oxide synthase (nNOS) binding protein, capable of disrupting the association of nNOS with the postsynaptic density scaffolding proteins PSD93 and PSD95 through the binding of the C-terminus of NOS1AP to nNOS
(14) . The interaction between nNOS and PSD93 and PSD95 is important in targeting nNOS to the postsynaptic
N -methyl-
d -aspartate receptor (NMDAR) complex and facilitates the tight coupling between activation of the NMDAR and nNOS, allowing nNOS activation by Ca
++ influx through the NMDAR, producing NMDAR-mediated NO release into the synaptic structures
(15) . This places NOS1AP at the scene of NMDAR glutamate neurotransmission, long proposed to be involved in schizophrenia
(16) .
NOS1AP is a large gene, with a genomic extent of approximately 300 kb. Sequencing of the 1.5 kb coding region of
NOS1AP in individuals from the Canadian linkage sample has failed to identify any coding mutations associated with illness
(11), suggesting that important regulatory mutations could be present within the noncoding sequence of this gene. We have reported significantly increased expression of NOS1AP in postmortem samples from Brodmann’s area 46 of the dorsolateral prefrontal cortex of individuals with schizophrenia and bipolar disorder relative to psychiatrically normal comparison subjects
(17), further supporting a role for NOS1AP misexpression in schizophrenia susceptibility. To more comprehensively survey
NOS1AP for association with schizophrenia in our Canadian families, we have now genotyped a total of 60 SNPs from this gene and analyzed them using a novel analysis technique, the posterior probability of linkage disequilibrium (PPLD). We have conducted functional assessments of the SNPs with the highest probabilities of being in LD with schizophrenia and have identified a specific sequence change that causes significantly increased gene expression, implicating the A allele of rs12742393 as a schizophrenia risk allele.
Method
Linkage and Association Subjects
The subjects for this study were used in our prior association study of
NOS1AP (11) . The sample consisted of 24 Canadian families of Celtic (N=23) or German (N=1) descent, recruited for study because schizophrenic illness appeared to be segregating in a unilineal (one side of the family) autosomal dominant-like manner. After complete description of the study to the subjects, written informed consent was obtained. Protocols were approved by the institutional review boards of Rutgers University, University of Toronto, and the Centre for Addiction and Mental Health (Toronto). For this study, we used a narrow definition of schizophrenia: only the 85 individuals with a diagnosis of schizophrenia or chronic schizoaffective disorder were coded as affected; 232 individuals were coded as unaffected, including 40 with the spectrum diagnoses of nonaffective psychotic disorder, schizotypal personality disorder, and paranoid personality disorder. DNA samples were available for 332 subjects. Subjects with available DNA but no assessment data were coded phenotype unknown. The ascertainment and assessment procedures and composition of the sample have been described elsewhere in greater detail
(4,
11,
18,
19) .
SNP Selection and Genotyping
We selected tag SNPs to cover the entire genomic extent of
NOS1AP, including 41 kb 5′ and 15 kb 3′ of the untranslated regions (UTRs). SNPs were chosen from the Perlegen database, which groups SNPs into LD bins based on r
2 (http://genome.perlegen.com/), and the Applera SNPbrowser program, version 3 (http://www.applera.com/), which chooses SNPs on the basis of LD units (LDUs) created with the measure r. First, one tag SNP was selected from each Perlegen LD bin containing two or more SNPs. To enhance our chances of locating a mutation in a regulatory region, we chose SNPs in close proximity to evolutionarily conserved sequences whenever possible. These were identified using a MultiPipMaker
(20) alignment of sequence from the human, chimpanzee, pig-tailed macaque, mouse, rat, dog, and opossum genomes, including 50 kb 5′ and 3′ of the gene. Regions were deemed to be conserved if the sequence showed ≥70% identity over 100 bp. This set was further supplemented by several novel SNPs detected by resequencing of evolutionarily conserved regions within a set of 16 founders from our sample. We imported all these SNPs into SNPbrowser and selected additional SNPs to a final density of one SNP per 0.33 LDUs. Fifteen SNPs from our previous study of
NOS1AP (11) were included in this calculation of coverage but were not retyped.
DNA was extracted from blood samples or lymphoblastoid cell lines using the GenePure system (Gentra Systems, Minneapolis). Thirty-eight SNPs were genotyped by a primer extension strategy (pyrosequencing) and were typed as simplex assays using the automated PSQ HS96A platform
(21) ; analysis of 15 of these by a different statistical method has been reported elsewhere
(11) . Twenty-two SNPs were genotyped by ligase detection reaction and Luminex 100 flow cytometry
(22,
23) . Reaction conditions were as previously described
(11,
22) . A list of all SNPs genotyped, their locations, and all genotyping primer sequences is provided in Tables S1–S5 in the data supplement that accompanies the online edition of this article.
SNP genotypes were checked for Mendelian errors with Pedcheck, version 1.1
(24), and were merged with existing genotype data from five microsatellites from the region
(11) and analyzed with Merlin, version 0.10.2
(25), for unlikely genotypes based on high-order recombination. Overall, less than 1% of genotypes were identified as potential errors, and these were regenotyped twice, resulting in 40 unresolved genotypes (0.17%) which were excluded from further analysis. Approximately 4% of attempted genotypes were unavailable because of failure of genotyping assays.
Linkage Disequilibrium Analysis
The PPLD is a variant of the LD-PPL, or posterior probability of linkage allowing for LD
(26) ; both are variants of the PPL
(27 –
29), which directly measures the probability of linkage between a trait and a given chromosomal location. Inclusion of an LD parameter in the underlying PPL likelihood allows for the possibility of trait-marker LD. It is then straightforward to condition on “linkage” while modeling LD by rescaling the LD-PPL
(30) . In this way we can disaggregate LD evidence from the underlying linkage evidence at any given locus. This is key here because the multipoint PPL is 99.7% over NOS1AP
(31), where 100% would represent statistical certainty of linkage. Thus, any measure that confounded linkage and LD would be expected to find effects even in the absence of LD.
Specifically, the PPLD has the following form: where
D ′ is the standardized LD parameter
(32) ; θ is the recombination fraction; γ is a vector of trait parameters, in this case comprising the parameters of an approximating single-locus model allowing for locus heterogeneity (disease allele frequency, three penetrances, and the admixture parameter α
[33] ); L is the likelihood; and f(D′, θ, γ) is the prior probability distribution for the parameters. The prior distributions are essentially uniform, except for the priors on D′ and θ, which allow for LD only for very small values of θ (see references
26,
30 for details) and assign a nonzero point mass over D′=0. All PPLD calculations were conducted using the software package KELVIN
(34) .
The PPLD integrates over the trait parameters rather than maximizing over them or fixing them at specific values. It also makes full use of all pedigree data. It intrinsically measures LD due only to close physical proximity (or perhaps epistasis), and it has been shown to be robust to departures from Hardy-Weinberg equilibrium at the marker
(26) .
In computing the usual form of the PPL, the prior probability of linkage is set to 2% on the basis of theoretical calculations
(35) . For comparability of scale, therefore, we also set the prior probability of LD given linkage to 2% (so that the joint prior probability of linkage and LD is just 0.04%). Values of the PPLD <2% therefore represent evidence against LD in the presence of linkage, while values >2% indicate evidence in favor of LD in the presence of linkage. As with any form of the PPL, the PPLD has a direct interpretation as a probability: a PPLD of, say, 25% can be interpreted in the same way as a 25% probability of rain, of winning a bet, or of a successful medical intervention.
The PPLD applies to one SNP at a time (two-point analysis) and measures the evidence for or against LD to that SNP. As a measure of evidence, the PPLD does not require “correction” for multiple testing, any more than, say, measurements of length for various pieces of rope would require “correction” as a function of the number of pieces measured. On the other hand, the very small prior probability implicit in the calculation effectively adjusts for the biological fact that even at a linked locus, any given SNP may have only a very small chance of exhibiting LD. If the true prior probability of LD is in fact higher than 2% for SNPs within NOS1AP, then our measure of evidence in favor of LD would be conservative.
Postmortem Expression Analysis
Samples from the Stanley Array Collection of the Stanley Brain Collection (http://www.stanleyresearch.org/programs/brain_collection.asp) were analyzed. This is a collection of biomaterials derived from postmortem brain specimens from 35 individuals with schizophrenia, 34 individuals with bipolar disorder, and 35 psychiatrically normal comparison subjects. We have previously quantified and reported
NOS1AP expression levels in Brodmann’s area 46 of the dorsolateral prefrontal cortex from these samples by real-time polymerase chain reaction (PCR) and normalization to beta-actin
(17) . Genotypes for rs12742393 for these samples were determined by ligase detection reaction assay. The dependence of
NOS1AP short isoform expression on genotype was tested with analysis of variance controlling for storage time, as suggested in our earlier reported work
(17) .
Luciferase Reporter Assay
Segments of the NOS1AP gene were tested for regulatory function in the human neural cell lines SK-N-MC and PFSK-1 (ATCC, Manassas, Va.). Endogenous NOS1AP expression was confirmed in these cell lines by reverse-transcriptase PCR. The dual luciferase reporter (DLR) assay (Promega, Madison, Wisc.) was used to test noncoding regions of the NOS1AP gene for function. This assay measures the amount of luciferase protein generated in cells transfected with a vector (pGL3) containing a firefly luciferase (luc) gene and the putative regulatory sequences to be tested for function. To test putative regulatory sequences in conjunction with the native NOS1AP promoter, a 1.3 kb sequence including the 5′UTR of NOS1AP and 800 bp 5′ to the UTR was cloned into the Nco I site of the pGL3-Basic plasmid (Promega). Test sequences of approximately 850 bp to 1.4 kb in length surrounding each SNP were then cloned into the Hind III site 5′ to the NOS1AP promoter sequence (see Table S6 in the online data supplement). These plasmids were co-transfected with the Renilla luc plasmid (phRL-null) for normalization purposes in a 50:1 ratio to minimize trans effects between the plasmids; approximately 0.8 μg plasmid DNA was transfected per assay. For both cell lines, cells were plated onto 24 well plates and grown until 70% confluent. The plasmids were then transfected with lipofectamine 2000 using the standard protocol (Invitrogen, Carlsbad, Calif.). Twenty-four hours after transfection, the plates were assayed for luciferase expression on a luminometer (Turner BioSystems, Sunnyvale, Calif.) following the standard DLR protocol.
Experiments on each plasmid were conducted in three or four parallel replicates; these replicate experiments were then repeated on three separate occasions (four for rs4145621 in PFSK-1). The controls included were a nontransfected control for background luminescence of the plate and cells; the pGL3-Basic plasmid for background luminescence from the empty vector; and the pGL3-Enhancer plasmid, which contains the SV40 promoter and enhancer sequences and was used as a positive control. After subtracting the background luminescence, the log of the ratio of firefly to Renilla luciferase values was fitted to a linear model incorporating both “occasion” blocks and plasmid effects. Linear contrasts were tested in the context of this model.
Electrophoretic Mobility Shift Assay
Nuclear extracts were prepared from cultured SK-N-MC and PFSK-1 cells using the Panomics (Fremont, Calif.) Nuclear Extraction Kit (AY2002). Various amounts of SK-N-MC and PFSK-1 nuclear extract (0.25, 0.40, 0.50, 1.0, and 1.5 μg) were incubated with 10 ng of biotin-labeled oligonucleotide sense probe for either the A or C allele with 1 μg of Poly d(I-C) for 5 minutes at room temperature; 2 μl of 5× binding buffer and 10 ng of biotin-labeled probes were then added to a final volume of 10 ml and incubated for 30 minutes at 20°C. For competition assays, 100-fold molar excess unlabeled probe was added to the mixture prior to the 30-minute incubation. The protein/DNA complex was then UV cross-linked and separated on a nondenaturing 6% acrylamide gel in 0.5× Tris-borate-EDTA buffer and wet-transferred onto a Biodyne nylon membrane (Pall Corporation, East Hills, N.Y.), which was exposed to a HyBlot CL Autoradiography film (Denville Scientific, South Plainfield, N.J.) for chemiluminescence detection. Probe sequences for rs12742393 were: A allele—GCATTTTACCAGTACAATCTG; C allele—GCATTTTACCCGTACAATCTG.
Discussion
We previously reported association between schizophrenia and SNPs within
NOS1AP using a small set of markers
(11) . We also reported significantly increased expression of
NOS1AP in postmortem samples from the dorsolateral prefrontal cortex of individuals with schizophrenia and demonstrated that specific SNP alleles associated with disease are also associated with increased
NOS1AP expression
(17) . In this study, we conducted a more thorough search for disease-associated SNPs throughout the large genomic extent of
NOS1AP . We identified rs12742393 as a functional SNP that exhibits strong LD with schizophrenia, is associated with higher levels of
NOS1AP expression in postmortem samples from the dorsolateral prefrontal cortex, and is capable of significantly increasing expression from the
NOS1AP promoter in two human neural cell lines. The allelic variants of this SNP also exhibit differential affinity for binding of nuclear proteins from these cell lines, suggesting that altered transcription factor binding may be the mechanism responsible for the allelic expression differences.
We also demonstrated the utility of LD mapping using the posterior probability of linkage framework. As a measure of evidence, the PPLD is straightforward to interpret. It answers the fundamental question, “What is the chance that each of the SNPs we are investigating is in LD with a risk mutation?” Variations in DNA that are located physically far apart (on the same chromosome or on different chromosomes) will randomly assort during meiosis, with combinations of alleles at these different loci conforming to the ratios predicted by the Hardy-Weinberg law. Alleles at loci that are physically very close to each other are often co-transmitted during meiosis, leading to a correlation between the presence of a specific allele at one locus and a specific allele at the other locus. This correlation between alleles at nearby loci is termed linkage disequilibrium. In this study, we evaluated the probability that each tested SNP was in LD with a functional DNA change in NOS1AP that increased risk for schizophrenia. It is important to point out that we cannot statistically distinguish a SNP that is in strong LD with a functional DNA change from the actual functional SNP itself; the functional DNA change and the associated alleles at nearby SNPs all will be inherited together, requiring additional laboratory experiments to determine which sequence change has biological function.
In this study, three SNPs out of the 60 evaluated were calculated to have more than a 40% chance of being in LD with a schizophrenia risk allele. The PPL software can also be used to estimate the magnitude of LD between each SNP and the risk locus. For all three of these SNPs, the estimate of the LD parameter D′ was 1.0, the maximum possible value. There are several important points to be made about the interpretation and utility of this result. First, this does not mean that there was a 40%–50% chance that each of these SNPs will be functional. The genotypes at these SNPs are very highly correlated; knowing the DNA sequence at one SNP allows the prediction, with a high degree of accuracy, of the DNA sequences at the others. So the PPLD result tells us there is a good chance that at least one of these SNPs (or a yet untested SNP that is also highly correlated with this group) will be functional. Second, the probability that these SNPs will be involved was calculated at 40%–50%; this is far from certainty of being involved (100% chance) but also far from certainty of being not involved (0% chance). But one SNP does appear to be functional, so was our calculation of the PPLD too low? Here it is important to remember that the PPL framework, like other measures of evidence, is limited by the sample to which it is applied. Our sample was modestly sized for association analysis; it may simply be too small to give PPLD values close to certainty. But while the probability of LD was estimated at 40%–50%, the magnitude of LD was estimated to be extremely high (D′=1.0). Such a high D′ value would be expected to be observed with an actual schizophrenia risk allele. It is also important to remember that these PPLD values only reflect the evidence from the genotypes in the Canadian families; other evidence that one of these SNPs is functional, such as the results of the postmortem study, is not reflected in the PPLD. Finally, as can be clearly seen in
Figure 1, the PPLD analysis did an excellent job of separating the SNPs into two distinct groups, those with high and those with low evidence for being involved in schizophrenia risk. This allowed us to efficiently prioritize the selection of SNPs for the costly and time-consuming step of functional validation.
We previously
(11) analyzed 15 of these 60 SNPs for association to schizophrenia using the program Pseudomarker
(36), identifying three SNPs (rs1415263, rs4145621, and rs2661818) as associated. Two of these (rs1415263 and rs4145621) provide strong evidence of LD using the PPLD. Interestingly, these same two SNPs showed significant association to schizophrenia in the recent study of Kremeyer et al.
(12), while rs2661818 did not, suggesting that the PPLD might be resistant to type I errors that may be generated by other association analysis methods.
While we have identified what appears to be a functional risk allele for schizophrenia, it is important to put this finding in context. First, it is not clear that we have identified the only functional polymorphism within this region.
NOS1AP is a large gene; analysis of the current HapMap data (version 23a)
(37) from this gene using the Caucasian population, the pairwise Tagger algorithm
(38), an r
2 threshold of 0.8, and a minimum minor allele frequency of 0.05 indicates that 97 SNPs would be necessary to fully tag this gene. However, the HapMap database is still incomplete; 20 of the SNPs genotyped in our study have no data in HapMap. Three of these SNPs, all with minor allele frequencies exceeding 10% in our sample, were recently discovered by resequencing efforts in our lab, which suggests that there is still an important role for resequencing of candidate genes to identify common variants.
It will be important to determine whether the rs12742393 functional allele acts to confer risk of schizophrenia in other populations. We believe that there is genetic heterogeneity in schizophrenia and hence do not expect that all samples will easily replicate this finding. This may be particularly true for large samples for which mixed ascertainment strategies were used
(39) . It is important to remember that failure to replicate this finding in other samples does not invalidate the finding in this population. The importance of genetic background to the manifestation of inherited traits is well known in animal genetics, where genetic differences between inbred strains can result in the same DNA change producing a profound effect in one strain while remaining silent in another. It is likely that we will need to consider the effects of interacting risk alleles in multiple genes in order to observe more consistent replications of association to any schizophrenia candidate gene across different samples
(40) . Schizophrenia is clearly not a single-gene disorder. The frequency of the rs12742393-A allele is ∼40% in Caucasian populations, ∼70% in Asian populations, and ∼80% in African populations. Given the population rates of schizophrenia, it is obvious that a single copy of this risk allele is far from sufficient to cause the illness, but the high frequency of this allele does mean that it could potentially play a role in a large number of cases.
Identification of a functional DNA change associated with schizophrenia in this Canadian population should help stratify other samples into more homogeneous subsets. Investigators will readily be able to determine which subjects carry the rs12742393-A allele that could increase NOS1AP expression. Given the interest in pharmacologic targeting of the glutamate receptor system in schizophrenia, identification of this functional risk allele may also have utility in identifying patients who may respond better to certain new medications. The results of this study may help lay the foundation for a better understanding of the complex genetic architecture of schizophrenia risk.