The principle to be kept in mind is to know what we see rather than to see what we know. (1)
It seems more and more to be the rule, rather than the exception, that identical symptoms, such as social or language impairment, are shared across distinct clinical syndromes such as autism, language disorders, attention deficit hyperactivity disorder (ADHD), schizophrenia, and anxiety disorders. Two simple interpretations of these shared phenotypes include comorbidity at random (coincidental comorbidity) or a shared etiologic substrate, consistent with biological pleiotropy. Sorting this out has proven to be challenging both for those who create nosological systems and for those looking for the etiologic substrates of disorders. In their article in this issue, Bartlett et al. (2) tackle this important problem by asking if there is a portion of the autism language phenotype that is shared with specific language impairment. They then go one big step further, by trying to examine whether there are specific genetic factors that account for the shared phenotype.
To accomplish their goals, Bartlett and colleagues conducted a genome-wide linkage scan of pedigrees ascertained by using a novel sampling design that required distinct probands with autism and with specific language impairment in each pedigree; the pedigrees were then used for linkage and association analyses. Using the posterior probability of linkage test for genome-wide linkage scans on two categorical language phenotypes (language and reading impairments), three quantitative language traits (derived from factor analyses of various language measures), and two nonlanguage quantitative phenotype traits (scores on the Yale-Brown Obsessive Compulsive Scale [Y-BOCS] and Social Responsiveness Scale) in 79 families, the authors found peaks for language on 15q23–26 and reading impairment on 16p12. Quantitative trait analyses yielded no linkage. Additionally, a nonlanguage phenotype, the score on the Social Responsiveness Scale, showed linkage on 15q for the quantitative phenotype and on 14q for a categorical phenotype; there was also linkage to a Y-BOCS phenotype on 13q.
This study used DSM-IV diagnostic criteria, which for autism require impairments in communication. Therefore, it may not be surprising to any of us that the authors found linkage peaks that are common to both autism and language impairment. However, looking past what seems to be the obvious, one can see the real value of this study: The use of an innovative and appropriate sampling strategy, along with proper genetic analysis, to dissect the phenotypes of multiple, overlapping clinical syndromes can offer insight into the biological substrates of those phenotypes.
Because of the broad heterogeneity of the language phenotype associated with autism (ranging from nonverbal to fluent language), it is highly unlikely that there will be a complete overlap between the autism language phenotype and specific language impairment. In order to address this issue, the investigators must clearly identify the specific shared language phenotype for autism and specific language impairment. To accomplish this goal, there must be careful and specific sampling tactics to ascertain the families that have both probands with autism and probands with specific language impairment within that single family, while also being certain that there is no comorbidity of the two conditions in any individual. The authors accomplished this difficult sampling goal. Then, in the genome-wide linkage analysis, both the autism and language-impaired probands were considered to be “affected.” If autism is not related to language impairment, then including persons with autism as “affected” in the same analysis with persons having specific language impairment will reduce linkage/association; it can be expected that this sampling design will effectively map only those loci that are etiologically relevant to both autism and specific language impairment and that positive findings will indicate shared loci and etiology. Using this innovative strategy, the authors found two novel loci; neither locus has been previously identified in either autism spectrum disorder (ASD) or specific language impairment. This finding clearly supports the authors’ hypothesis of shared etiology and creates new opportunities for examining the biological substrates of both disorders.
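The dilution argument above can be made concrete with a small numerical sketch. This is not the posterior probability of linkage analysis used by Bartlett et al.; it is a deliberately simplified allele-frequency illustration, and every name and number in it (the risk-allele frequencies, the 50/50 mix of probands, the helper function) is a hypothetical assumption chosen only to show the direction of the effect.

```python
# Simplified, hypothetical illustration of signal dilution when two
# diagnostic groups are pooled into a single "affected" class.
# This is an allele-frequency toy model, NOT the linkage method used in the study.


def pooled_case_frequency(p_sli: float, p_autism: float, frac_sli: float) -> float:
    """Risk-allele frequency in a pooled 'affected' group.

    p_sli     -- assumed risk-allele frequency among language-impaired probands
    p_autism  -- assumed risk-allele frequency among autism probands
    frac_sli  -- fraction of the pooled group that is language-impaired
    """
    return frac_sli * p_sli + (1.0 - frac_sli) * p_autism


# Hypothetical values, chosen only for illustration.
p_controls = 0.20   # risk-allele frequency in unaffected relatives
p_sli = 0.30        # enriched among language-impaired probands
frac_sli = 0.5      # half of the pooled "affected" group

# Scenario A: the locus is shared, so autism probands are equally enriched.
shared = pooled_case_frequency(p_sli, p_autism=p_sli, frac_sli=frac_sli)

# Scenario B: the locus is unrelated to autism, so autism probands
# look like unaffected relatives at this locus.
unshared = pooled_case_frequency(p_sli, p_autism=p_controls, frac_sli=frac_sli)

gap_shared = shared - p_controls      # signal retained when etiology is shared
gap_unshared = unshared - p_controls  # signal diluted when it is not

print(f"shared locus gap:   {gap_shared:.3f}")
print(f"unshared locus gap: {gap_unshared:.3f}")
```

Under these assumed values the case-versus-unaffected gap is halved when the locus is not shared, which is the sense in which a pooled "affected" class preferentially retains loci relevant to both conditions.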
Sampling strategies are a critical part of hypothesis testing: Specifically, how one samples determines what one can observe. Rather than simply seeking what was already known by considering autism and language impairment as two distinct clinical conditions, based on an arbitrary classification system, Bartlett et al. chose to test their hypotheses by using a clever sampling strategy and then combining distinct individuals with distinct diagnoses, autism and specific language impairment, into a single group of “affected” individuals. Considering seemingly distinct clinical syndromes as existing in a single “affected” state is a relatively new effort in genetic research; this stems from the observation that identical genetic variants consistently lead to distinct clinical syndromes. Among many examples, we might consider 16p11.2 copy number variations (CNVs) that lead to multiple phenotypes, such as autism, schizophrenia, intellectual disability, and language impairment (3–13). Such cross-condition sampling strategies are beginning to yield interesting and useful results. For example, the Psychiatric Genomics Consortium examined cross-disorder effects of genome-wide significant loci previously identified for bipolar disorder and schizophrenia. Investigators used genome-wide association study data on single nucleotide polymorphisms (SNPs) in 33,332 individuals with ASD, ADHD, bipolar disorder, schizophrenia, and major depressive disorder, along with 27,888 comparison subjects. The study showed genome-wide significance in intronic SNPs within ITIH3 and AS3MT, along with SNPs at two L-type voltage-gated calcium channel subunits, CACNA1C and CACNB2 (14). These results provide empirical evidence that these disorders have at least some level of shared genetic etiology.
My colleagues and I have demonstrated the importance of sampling strategies in an epidemiological study that can be used not only for prevalence estimates but also for explorations of phenotypic variations and genomics. Rather than sampling only high-risk individuals with a higher probability of having developmental psychopathology (such as those receiving educational or psychiatric services), we used a total population approach. We screened all children in a geographic area, including individuals at both high and low risk for developmental psychopathology. On the basis of sampling alone, we found that ASD prevalence increased from 0.8% in the high-risk group to 2.6% in the total population. We were also able to demonstrate that the ASD phenotype was different from that in previous reports. For example, the male-to-female ratio changed from 5:1 to 2.5:1 and mean performance IQ changed from 75 to 98 in the high-risk group and total population, respectively (15).
It has always been important to use scientifically rigorous, clever, and creative sampling strategies. But this critical part of science has often been neglected. In the current era of exploration in etiological substrates of disorder and dysfunction, it is absolutely essential that sampling strategies are carefully planned a priori and are then equally carefully executed. This is crucial for successful hypothesis testing, identification of both genetic and nongenetic etiological substrates, and comprehensive description of phenotype distribution. Bartlett and colleagues acknowledge that their study suffers from a small number of subjects, resulting in limited statistical power to test a portion of their hypotheses, and that their findings need to be replicated in larger-scale, independent populations. Nonetheless, these authors have demonstrated the value of a carefully constructed and well-executed sampling tactic in the search for a common genetic liability for shared phenotypes across distinct clinical conditions. This essential work offers us a strong lesson and will likely contribute to refinement of psychiatric nosology and identification of the target molecular systems that, in turn, will lead to the development of novel and effective treatments. The article by Bartlett et al. shows us that careful scientific technique, especially sampling, allows us to better know and understand what we see.