Applying the tools of genetic epidemiology, particularly when coupled with continued progress in the neurosciences and behavioral sciences, is likely to be one of the most fruitful approaches to resolving sources of complexity in the mental disorders and translating the progress in genomics to the public
(98).
Figure 1 shows the classic triangle that illustrates the major focus of epidemiologic investigations: the products of the interaction between the host, an infectious or other type of agent, and the environment that promotes the exposure
(99). The factors that may be associated with increased risk of human disease are shown under each of the three domains of influence. The field of genetic epidemiology focuses on the role of genetic factors that interact with other domains of risk to enhance vulnerability or protection against disease
(100–102). It is quite conceivable that several combinations of these risk factors could produce similar phenotypes in susceptible individuals. The test for epidemiology over the next decades will be to determine the extent to which the tools can be refined to capture these situations.
Sampling
The shift from systematic large-scale family studies to linkage studies in psychiatry has led to the collection of families according to very specific sampling strategies (e.g., many affected relatives, affected sibling pairs, affected relatives on one side of the family only, or availability of parents for study) in order to maximize the power of detecting genes according to the assumed model of familial transmission. Despite the increase in power for detecting genes, these sampling approaches have diminished the generalizability of the study findings and contribute little else to the knowledge base if genes are not discovered. As we learn more about the complexity of genetic risk factors, it may be advisable in the future to collect both families and control subjects from representative samples of the population in order to enable estimation of population risk parameters, enhance generalizability, and examine the specificity of endophenotypic transmission.
Study Designs
Epidemiologic studies generally proceed from retrospective case-control studies to develop specific hypotheses that can be addressed in prospective cohort studies in order to demonstrate causality. The major goal of analytic epidemiology is to identify risk and protective factors and their causal links to disease, with the ultimate goal of disease prevention. Genetic epidemiology employs traditional epidemiologic study designs to identify explanatory factors for aggregation in groups of relatives ranging from twins to migrant cohorts. The tools of genetic epidemiology will be employed in the era of genomics to derive estimates of the population distribution of disease genes, to test modes of disease transmission in systematic samples that are representative of the population, and to identify sources of gene-environment interactions for diseases. Since epidemiology has developed sophisticated designs and analytic methods for identifying disease risk factors, these methods can now be extended to include both genes and environmental factors as gene identification proceeds.
In general, study designs in genetic epidemiology either control for genetic background while letting the environment vary (e.g., migrant studies, half siblings, separated twins) or control for the environment while allowing variance in the genetic background (e.g., siblings, twins, adoptees-nonbiological siblings). Because each of the study designs has both strengths and limitations, it is important to evaluate aggregate evidence from multiple approaches before drawing conclusions about the role of genetic and environmental risk factors. Over the next decades, it will be important to identify and evaluate the effects of specific environmental factors on disease outcomes and to refine measurement of environmental exposures to evaluate specificity of effects.
Migrant studies are perhaps the most powerful study design to identify environmental and cultural risk factors. One of the earliest controlled migrant studies evaluated rates of psychosis among Norwegian immigrants to Minnesota compared with native Minnesotans and native Norwegians
(103). The higher rate of psychosis among the immigrants than in both the native Minnesotans and Norwegians was attributed to greater susceptibility to psychosis among the migrants who left Norway. It was found that migration selection bias was the major explanatory factor rather than an environmental exposure in the new culture
(103).
Another powerful study design is the nested case-control study built on an established cohort. Prospective cohort studies are also valuable sources of information on diagnostic stability, causal associations between risk factors and disease, and developmental aspects of psychiatric disorders. Langholz et al.
(104) described some of the world’s prospective cohort studies that may serve as a basis for studies of gene-disease associations or gene-environment interactions. Finally, the half-sibling approach may eventually replace the adoption paradigm for investigating genetic and environmental effects because of the recent trends toward selective adoption (i.e., maternal selection of adoptive parents and continued contact with biological mothers) and the diminishing frequency of adoptions in the United States and in numerous other countries.
Population-Based Studies
The importance of epidemiology to the future of genetics has been described by numerous geneticists and epidemiologists who conclude that the best strategy for gene identification will ultimately involve large epidemiologic studies from diverse populations
(39, 44, 47, 98, 105–108). It is likely that population-based association studies will assume increasing importance in translating the products of genomics to public health
(47). The term “human genome epidemiology” was coined by Khoury et al.
(108) to denote the emerging field that employs systematic applications of epidemiologic methods in population-based studies of the impact of human genetic variation on health and disease.
There are several reasons that population-based studies will be critical to the future of genetics. First, the prevalence of newly identified polymorphisms, whether SNPs or other variants, especially in particular population subgroups, is not known. Second, current knowledge of genes as risk factors is based nearly exclusively on clinical and nonsystematic samples. Hence, the significance of the susceptibility alleles that have been identified for cancer, heart disease, diabetes, and so forth is unknown in the population at large. In order to provide accurate risk estimates, the next stage of research needs to move beyond samples identified through affected individuals to the population in order to obtain estimates of the risk of specific polymorphisms for the population as a whole. Third, identification of risk profiles will require very large samples to assess the significance of vulnerability genes with relatively low expected population frequencies. Fourth, similar to the role of epidemiology in quantifying risk associated with traditional disease risk factors, applications of human genome epidemiology can provide information on the specificity, sensitivity, and impact of genetic tests to inform science and the individual
(107).
Because genetic polymorphisms involved in complex diseases are likely to be nondeterministic (i.e., the marker predicts neither disease nor nondisease with certainty), traditional epidemiologic risk factor designs can be used to estimate their impact
(101). As epidemiologists add genes to their risk equations, it is likely that the contradictory findings from studies that have generally employed solely environmental risk factors, such as diet, smoking, alcohol use, etc., will be resolved. Likewise, the studies that seek solely to identify genes will also continue to be inconsistent without considering the effects of nongenetic biologic parameters as well as environmental factors that contribute to the diseases of interest.
There are several types of risk estimates that are used in public health. The most common is relative risk, the magnitude of the association between an exposure and disease. It is independent of the prevalence of the exposure. The absolute risk is the overall probability of developing a disease in a particular population
(99). The population attributable risk relates to the risk of a disease in a total population (exposed and unexposed) and indicates the amount by which the disease could be reduced in a population if an exposure were eliminated. The population attributable risk depends on the prevalence of the exposure, or in the case of genes, the gene frequency. Genetic attributable risk would indicate the proportion of a particular disease that may be attributed to a particular genetic locus.
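The three measures can be made concrete with a small calculation. The sketch below is illustrative only: the cohort counts are hypothetical, the function names are our own, and the attributable fraction is computed with Levin's formula, which the text does not name and which assumes an unconfounded relative risk.

```python
def relative_risk(a, b, c, d):
    """Risk in the exposed divided by risk in the unexposed.

    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    """
    return (a / (a + b)) / (c / (c + d))


def absolute_risk(a, b, c, d):
    """Overall probability of disease in the whole cohort."""
    return (a + c) / (a + b + c + d)


def population_attributable_fraction(exposure_prevalence, rr):
    """Levin's formula: share of disease in the population that would be
    removed if the exposure were eliminated (assumes rr is causal)."""
    excess = exposure_prevalence * (rr - 1.0)
    return excess / (1.0 + excess)


# Hypothetical cohort of 10,000 people, 1,000 of them exposed.
a, b, c, d = 30, 970, 90, 8910

rr = relative_risk(a, b, c, d)                  # 3% vs. 1% risk
ar = absolute_risk(a, b, c, d)                  # 120 cases in 10,000
prevalence = (a + b) / (a + b + c + d)          # 10% exposed
paf = population_attributable_fraction(prevalence, rr)

print(f"relative risk: {rr:.2f}")
print(f"absolute risk: {ar:.3f}")
print(f"population attributable fraction: {paf:.3f}")
```

In this made-up cohort the relative risk is about 3, yet only about one-sixth of cases are attributable to the exposure, because just 10% of the population is exposed. The relative risk would be unchanged if the exposure were ten times more common; the attributable fraction would not.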
Figure 2 illustrates the known genetic and environmental risk factors for Alzheimer’s disease
(109). The orange areas on the left represent the roles of deterministic genes (β-amyloid precursor, presenilin-1 and -2) and the susceptibility gene apolipoprotein-E ε4 (APOE-ε4)
(110). The blue areas on the right indicate environmental risk and protective factors, respectively
(111–113). Individuals with mutations in deterministic genes appear to have a nearly 100% chance of developing Alzheimer’s disease (i.e., the mutations are almost fully penetrant). The relative risk associated with these genes is therefore also quite high. In contrast, because these mutations are presumed to be very rare in the population, the population attributable risk is quite low, meaning that were these mutations to be eliminated from the population, there would be little impact on the prevalence of Alzheimer’s disease.
APOE ε4
The APOE ε4 allele has been shown to increase the risk of Alzheimer’s disease in a dose-dependent fashion. Using data from a large multiethnic sample collected by more than 40 research teams, Farrer et al.
(114) reported odds ratios of 2.6–3.2 for Alzheimer’s disease among those with one copy of the APOE ε4 allele and 14.9 among those with two copies. Moreover, there was a significant protective effect among those with the ε2/ε3 genotype. As opposed to the deterministic mutations, the APOE ε4 allele has a very high population attributable risk because of its high frequency in the population.
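The contrast between rare deterministic mutations and a common susceptibility allele can be illustrated numerically. The following is a rough sketch, not an estimate for Alzheimer’s disease: the carrier frequencies and relative risks are hypothetical round numbers chosen only to show the pattern, and Levin's formula is our choice of method rather than one named in the text.

```python
def attributable_fraction(carrier_frequency, relative_risk):
    """Levin's formula for the population attributable fraction."""
    excess = carrier_frequency * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical values, for illustration only.
scenarios = {
    # very rare, very strong effect (deterministic-like mutation)
    "rare high-penetrance mutation": {"frequency": 0.0001, "rr": 50.0},
    # common, moderate effect (susceptibility-like allele)
    "common susceptibility allele": {"frequency": 0.25, "rr": 3.0},
}

results = {
    name: attributable_fraction(s["frequency"], s["rr"])
    for name, s in scenarios.items()
}

for name, paf in results.items():
    print(f"{name}: attributable fraction = {paf:.4f}")
```

Under these assumed inputs the rare mutation accounts for well under 1% of cases despite its large relative risk, whereas the common allele accounts for roughly a third. This is the same qualitative pattern the text describes for deterministic genes versus APOE ε4.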
Identification of Environmental Factors
The identification of gene-environment interactions will be one of the most important future goals of genetic epidemiology. Newman et al.
(115) credit the synergy between genetics and epidemiology for elucidating the initial gene findings as well as for the subsequent identification of other susceptibility alleles and the environmental factors that may influence the risk of breast cancer in susceptible persons. Study designs and statistical methods should focus increasingly on gene-environment interaction
(116–122). Evidence is emerging that gene-environment interaction underlies many of the complex human diseases. Some examples include inborn errors of metabolism, individual variation in response to drugs
(123), substance use disorders
(124, 125), and the protective influence of a deletion in the CCR5 gene on exposure to HIV (79, 126).
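One common way to quantify gene-environment interaction in a case-control study is to compare the joint effect of genotype and exposure with what the two separate effects would predict. The sketch below assumes hypothetical counts and uses two standard summary measures that the text itself does not specify: the ratio of odds ratios (departure from a multiplicative model) and the relative excess risk due to interaction, RERI (departure from an additive model, with odds ratios standing in for relative risks under a rare-disease assumption).

```python
def odds_ratio(cases, controls, ref_cases, ref_controls):
    """Odds of being a case in one stratum relative to the reference stratum."""
    return (cases / controls) / (ref_cases / ref_controls)


# Hypothetical (cases, controls) by genotype (G) and exposure (E).
counts = {
    ("G-", "E-"): (100, 400),  # reference: neither risk factor
    ("G+", "E-"): (30, 100),   # genotype only
    ("G-", "E+"): (60, 160),   # exposure only
    ("G+", "E+"): (45, 40),    # both
}

ref = counts[("G-", "E-")]
or_g = odds_ratio(*counts[("G+", "E-")], *ref)
or_e = odds_ratio(*counts[("G-", "E+")], *ref)
or_ge = odds_ratio(*counts[("G+", "E+")], *ref)

# Equals 1 if the joint effect is what a multiplicative model predicts.
multiplicative_interaction = or_ge / (or_g * or_e)

# Equals 0 if the joint effect is what an additive model predicts.
reri = or_ge - or_g - or_e + 1.0

print(f"OR genotype only: {or_g:.2f}")
print(f"OR exposure only: {or_e:.2f}")
print(f"OR both:          {or_ge:.2f}")
print(f"ratio of odds ratios: {multiplicative_interaction:.2f}")
print(f"RERI:                 {reri:.2f}")
```

With these invented counts the joint odds ratio exceeds both the product and the sum of the separate effects, so the example shows interaction on either scale. A real analysis would add confidence intervals and adjust for confounders, typically with logistic regression.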
With respect to mental disorders, recent reviews of prospective studies that evaluated environmental risk factors for the common mental disorders ascertained in population-based studies yielded few specific environmental factors that could be etiologically linked with any of the major mental disorders
(127). However, one promising exception is the increasing evidence from genetic epidemiologic studies that environmental exposures including pre- and perinatal factors, such as viral agents, may enhance the risk of schizophrenia
(128). Other informative study designs for identifying gene-environment interactions include migrant studies and genetic case-control studies in which the cases may be defined by a genetic susceptibility marker.
Future research designed to identify environmental factors that operate either specifically or nonspecifically on those with susceptibility to mental disorders may provide an important opportunity for prevention and intervention, once susceptibility genes have been identified. The recent advances in understanding the bidirectional communication of neural systems and experience
(76) provide an ideal opportunity to apply genetic epidemiologic methods such as case-control and prospective cohort studies. Increased knowledge of the developmental pathways of emotion, cognition, and behavior will expand our ability to identify specific environmental factors such as infection, poor diet, prenatal environment, and early life experiences that interact with the genetic architecture of mood regulation and cognition
(129).