The pursuit of innovative findings has advanced science throughout the years. Studies focusing on replication of initial results, although also recognized as essential in the scientific process (1–4), have received less attention and are frequently regarded as just “another brick in the wall” (5). In recent years, however, the low rate of replicability of results has produced incredulity, doubt, and distrust in the biomedical sciences in general (6–8), as well as in the field of mental health research (9, 10). Pioneering studies that focused on measured gene-environment interactions (G×E) had a tremendous impact on the psychiatric literature (11, 12), instigating many replication efforts. Discrepancies in subsequent results, however, have generated a heated debate over the past decade, casting doubt on whether and to what extent the initial findings are reproducible (13–17).
In the quest to understand why some but not all individuals exposed to stressful life events and maltreatment are at increased risk for major depression, Caspi et al. (12) presented an innovative framework, providing evidence that this phenomenon was moderated by one’s genetic background. In a cohort of individuals followed up to early adulthood, those carrying the short allele of a serotonin-transporter-linked polymorphic region (5-HTTLPR) exhibited an increased risk for depression after experiencing stressful life events or maltreatment compared with those who were homozygotes for the long allele.
Mixed findings among studies trying to replicate the initial report were summarized in meta-analyses that suggested an overall lack of this interaction in the prediction of depression (18) and that the “addition of the 5-HTT genotype did not improve the prediction of depression beyond that associated with negative life events” (13). After criticism regarding selection bias in the included studies (19), a broader meta-analysis was published 2 years later that posited that aggregated results were in accordance with a significant role for the 5-HTTLPR polymorphism in moderating the relationship between stress and depression (20). Evidence of publication bias against negative studies in the G×E literature and the overinclusiveness of the later meta-analysis (15) have been pointed out as possible reasons for these contrasting conclusions and for the failure to resolve this controversy.
An array of explanations has been proposed for these conflicting findings, among them the heterogeneity of study designs (cross-sectional, retrospective, prospective), the variety of environment assessment tools used (parent/teacher report, self-report, objective assessment), and the nature of the evaluated stressor itself (20). The convergence of all these issues on a single research question has been regarded as a methodological “perfect storm” (21). It has been argued that evidence for the 5-HTTLPR stress sensitivity hypothesis came not only from human but also from animal studies, as well as from research using observational and experimental designs (22).
Given the highly controversial state of current G×E research, unambiguous parameters for statistical replication are required. In line with this idea, recommendations and protocols have been developed to establish common and comparable designs and methodologies (
20,
23–
25). Culverhouse et al. (
26) have developed a sensitive and well-elaborated meta-analysis protocol to address the inconsistencies encountered in previous studies, emphasizing the sources of misinterpretations in G×E research results on depression. An important methodological aspect not prioritized in the protocol, however, is the relevance of the timing of environmental exposure. Multiple lines of evidence corroborate the notion that timing is essential in such investigations (
21,
24,
25,
27), as the absence of a clear definition of the temporal order between risk factor and outcome can markedly limit the interpretation of findings. In fact, the centrality of this issue was already demonstrated in the original Caspi et al. study (
12), in which the G×E effect was observed only when the reported stressful life events occurred before—but not after—depression.
To date, published G×E research has originated almost exclusively in high-income countries. In the last three meta-analyses reported, only five of the 54 included studies were from low- and middle-income countries. The underrepresentation of low- and middle-income countries in G×E research contrasts with both the greater frequency of known environmental risks in these low-resource settings and the higher proportion of youths in the population (28). The assessment of G×E in different settings can be useful not only to determine the validity of the G×E construct across multiple contexts but also to identify specific instances when the replication does not occur (22).
Our primary objective in the present study was to test the interaction between maltreatment and 5-HTTLPR in the development of depression in young adults, replicating to the extent possible the design and methods of the original study but in a different sociocultural context—a middle-income country. Given the importance of temporal order in G×E assessment, our analyses also aimed to ascertain the temporality between the stressor and outcome. Using a large and prospectively assessed birth cohort, we sought to employ definitions of variables, a coding strategy, definitions of outcomes, and a statistical approach as similar as possible to those of the original study. As secondary objectives, we also tested whether our model of G×E was specific for major depression and whether our results would stand for the 5-HTTLPR triallelic approach (
29).
Method
Design and Sample
The data for this study were from the 1993 Pelotas Birth Cohort Study. A detailed description of the cohort characteristics and study design can be found elsewhere (30, 31; see also the data supplement that accompanies the online edition of this article). In brief, all children born in Pelotas, Brazil, in 1993 were eligible for the study. All but 16 mothers agreed to take part, resulting in a total cohort of 5,249 children. Follow-up visits were made at multiple time points, the last in 2011 and 2012. The data used in the present study were collected at the perinatal assessment and the age 11, age 15, and age 18/19 assessments, which had retention rates of 87.5%, 85.7%, and 81.3%, respectively (32).
After initial recruitment, mothers were interviewed on a variety of perinatal health topics. Additional data were obtained on maternal education (number of years of education, categorized into three strata) and monthly family income (measured in number of minimum wages, a standard unit in Brazil valued around US$60 in 1993).
At the age 11 visit (in 2004), data on child mental health were obtained using the Brazilian Portuguese version of the Strengths and Difficulties Questionnaire, with parent and child assessments (33, 34). Data on maternal mental health status were obtained using the validated Brazilian Portuguese version of the Self-Report Questionnaire (35).
For the age 15 visit (in 2008), adolescents filled out a confidential form that included seven dichotomous questions about lifetime emotional, physical, or sexual abuse and child neglect (see Table S3 in the online data supplement for details). We followed the same approach adopted by Caspi et al. in regard to maltreatment classification (11, 36), in which individuals were grouped into three categories: no maltreatment (no positive answer), probable maltreatment (one positive answer), and severe maltreatment (two or more positive answers). Information on mental health problems was collected using the same strategy as in 2004, with the exception of the Strengths and Difficulties Questionnaire, which was available only in the parent version.
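The three-level maltreatment classification described above can be sketched in a few lines. This is an illustration only, not the cohort's actual coding script: the function name is hypothetical, and it assumes that all seven dichotomous answers are available as booleans (how missing answers were handled is not stated in the text).

```python
def classify_maltreatment(answers):
    """Map dichotomous maltreatment answers to the three exposure categories.

    `answers` is an iterable of booleans, one per question (seven in the
    form described in the text). Complete data are assumed here.
    """
    positives = sum(1 for answer in answers if answer)
    if positives == 0:
        return "no maltreatment"
    if positives == 1:
        return "probable maltreatment"
    return "severe maltreatment"  # two or more positive answers


# Example: a participant with two positive answers out of seven.
example = [False, True, False, False, True, False, False]
category = classify_maltreatment(example)
```

Keeping the rule as a pure function of the answer count makes it easy to check against the category definitions given in the text.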
Also at age 15, DNA samples were obtained from saliva using the Oragene OG-250 DNA Self-Collection kit, and the 5-HTTLPR polymorphism was amplified as described previously (
37–
39). The primary focus of interest was the biallelic variant (LL, LS, SS), since it was the genotype system analyzed in the original study as well as in the vast majority of replication attempts (
18,
25,
26,
40). However, a more recently identified single-nucleotide polymorphism (SNP) in the L allele (rs25531, a>g) differentiates the high-functioning La variant from the Lg variant, which is functionally closer to the S allele. We therefore performed secondary analyses with this so-called triallelic approach (29), using La, Lg, and S as allele coding.
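The two genotype codings can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the triallelic recoding shown here (counting Lg together with S as a low-expressing allele) is one common reading of the triallelic approach, not a rule confirmed by the text.

```python
BIALLELIC_ALLELES = {"L", "S"}
TRIALLELIC_ALLELES = {"La", "Lg", "S"}
# Assumption: Lg is grouped with S because it is described as
# functionally closer to the S allele.
LOW_EXPRESSING = {"Lg", "S"}


def s_allele_count(allele_1, allele_2):
    """Biallelic additive coding: number of S alleles (0=LL, 1=LS, 2=SS)."""
    pair = (allele_1, allele_2)
    if any(allele not in BIALLELIC_ALLELES for allele in pair):
        raise ValueError(f"unexpected biallelic genotype: {pair}")
    return sum(1 for allele in pair if allele == "S")


def low_expressing_count(allele_1, allele_2):
    """Triallelic coding (assumed): number of low-expressing alleles."""
    pair = (allele_1, allele_2)
    if any(allele not in TRIALLELIC_ALLELES for allele in pair):
        raise ValueError(f"unexpected triallelic genotype: {pair}")
    return sum(1 for allele in pair if allele in LOW_EXPRESSING)
```

Under this assumed recoding, a biallelic LL individual who carries La and Lg moves from 0 to 1, which is the kind of regrouping that distinguishes the triallelic model from the biallelic one.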
For the age 18/19 visit (2011–2012), all traceable young adults were assessed by trained psychologists for psychiatric diagnoses, using an instrument derived from the Mini International Neuropsychiatric Interview (41). Depressive episode diagnoses were evaluated with DSM-IV-TR criteria, assessing symptoms in the previous 2 weeks. Other psychiatric diagnoses assessed at this time point were bipolar disorder, generalized anxiety disorder, social anxiety disorder, and attention deficit hyperactivity disorder.
To allow the assessment of timing of events, a sensitive screening index for depressive symptoms was developed with the aim of identifying individuals with a possible major depression diagnosis prior to or at the time of the maltreatment assessment. For this screening index, we used scores from the emotional subscale of the Strengths and Difficulties Questionnaire and the impact supplement administered at the age 11 visit (self-report and parent-rated) and at the age 15 assessment (parent-rated only). To increase sensitivity, an overinclusive index was created using those scores, in which any available data compatible with the presence of emotional problems causing an impact was sufficient for a positive screening. The definition of positive screening followed the borderline cut-point on the emotional and impact subscales, as recommended for the Brazilian Portuguese version of the original instrument. A second strategy was also adopted in order to validate the screening index’s findings, this time using the dispersion of emotional scores within the sample. The 80th percentile, chosen a priori, was used as the cut-point to construct the same index described above.
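The overinclusive logic of the screening index can be sketched as an "any available report" rule. The code below is a hedged illustration: the function names, the data layout, and the requirement that the emotional and impact cut-points be met within the same report are assumptions, and the cut-point values are left as parameters because the text does not state them.

```python
from statistics import quantiles


def screen_positive(reports, emotional_cut, impact_cut):
    """Return True if any available report meets both cut-points.

    `reports` is an iterable of (emotional_score, impact_score) pairs, one
    per available assessment (for example age 11 self-report, age 11
    parent report, age 15 parent report). Missing scores are None and
    are skipped, so one qualifying report is enough to screen positive.
    """
    for emotional, impact in reports:
        if emotional is None or impact is None:
            continue
        if emotional >= emotional_cut and impact >= impact_cut:
            return True
    return False


def percentile_80(scores):
    """Sample-based cut-point for the second strategy (80th percentile)."""
    # quantiles(..., n=5) returns the 20th, 40th, 60th, and 80th percentiles.
    return quantiles(scores, n=5)[3]
```

Because the rule is a logical OR over informants and time points, it favors sensitivity over specificity, which matches the stated aim of excluding anyone with possible earlier depression.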
For a flowchart of the study design, see Figure S1 in the online data supplement. The research protocol was approved by the Federal University of Pelotas Ethics Committee. Written informed consent was obtained from primary caregivers or cohort members (in the last visit) prior to each assessment.
Statistical Analysis
Data analyses included a description of the sample characteristics for each variable group and comparison of frequencies between groups, using chi-square tests for categorical variables and t tests and analysis of variance (ANOVA) to compare the group means for quantitative variables. We also tested the independence of childhood maltreatment and 5-HTTLPR genotype variables, using ANOVA for the identification of gene-environment correlation.
To evaluate the association between a depressive episode and 5-HTTLPR genotype, childhood maltreatment experience, and their interaction, we performed a binary logistic regression with a first-order interaction model, coding childhood maltreatment exposure by counting the number of maltreatment experiences and coding 5-HTTLPR genotype to reflect the number of short (S) alleles (0=LL, 1=LS, 2=SS). Subsequently, we included all conceptually defined potential confounders in the model, such as gender, maternal education and mental health, family income, and skin color (classified according to self-identification, grouped as white and nonwhite).
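The structure of the first-order interaction model can be made concrete with a short sketch. The coefficient values below are hypothetical and chosen only for illustration; they are not estimates from this cohort, and the analyses themselves were run in SPSS rather than with this code.

```python
import math

# Hypothetical coefficients on the log-odds scale (illustration only).
ILLUSTRATIVE_COEF = {
    "intercept": -3.0,
    "gene": -0.2,    # per S allele (0=LL, 1=LS, 2=SS)
    "env": 0.5,      # per unit of maltreatment exposure
    "gxe": 0.5,      # gene-by-environment interaction term
    "female": 0.5,   # gender covariate (1=female, 0=male)
}


def linear_predictor(s_alleles, maltreatment, female, coef=ILLUSTRATIVE_COEF):
    """Log-odds of a depressive episode under the interaction model."""
    return (coef["intercept"]
            + coef["gene"] * s_alleles
            + coef["env"] * maltreatment
            + coef["gxe"] * s_alleles * maltreatment
            + coef["female"] * female)


def depression_probability(s_alleles, maltreatment, female, coef=ILLUSTRATIVE_COEF):
    """Predicted probability obtained through the logistic link."""
    eta = linear_predictor(s_alleles, maltreatment, female, coef)
    return 1.0 / (1.0 + math.exp(-eta))


def maltreatment_slope(s_alleles, coef=ILLUSTRATIVE_COEF):
    """Effect of one maltreatment unit on the log-odds, by genotype."""
    return coef["env"] + coef["gxe"] * s_alleles
```

In this parameterization the interaction coefficient is the amount by which the maltreatment slope changes for each additional S allele, which is why a positive interaction term implies a progressively stronger maltreatment effect from LL to LS to SS.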
Considering our replication focus, we also performed a sensitivity analysis with only participants who self-identified as white. Secondarily, we also assessed the specificity of the logistic regression model, following the analysis strategy described above but with the dependent variable changed to other psychiatric diagnoses, all assessed at the same time point as major depression.
All analyses were carried out with SPSS, version 20 (IBM Corp., Armonk, N.Y.).
Results
Sample Characteristics
From the original sample of 5,249 individuals, 4,052 were evaluated for depressive episode in early adulthood, and within that group, childhood maltreatment and genotype data were available for 3,558 individuals. (See the flowchart in Figure S1 in the online data supplement; see Table S1 for demographic information for the original sample and a comparison of participants who remained in the study and those who were lost to follow-up.)
We applied our sensitive screening index for depressive symptoms to the 3,558 retained participants to filter out those likely to have experienced previous depression, which left a total of 2,392 young adults for our analyses. The final sample characteristics across each group of measures of interest are summarized in Table 1.
Participants who screened positive for early depressive symptoms and were excluded from the main analyses (N=1,039) differed significantly from the analyzed sample on several measures, such as gender, skin color, maternal education, family income, and exposure to childhood maltreatment (all p values <0.01; see the online data supplement for further details).
For those included in the final analyses, the proportion of 5-HTTLPR polymorphism genotypes within the biallelic model was 32.4% for LL, 47.8% for LS, and 19.8% for SS. For the triallelic model, genotype distribution was 40.8% for SLa, 25.0% for LaLa, 19.8% for SS, 7.4% for LaLg, and 7.0% for SLg. Neither distribution deviated from Hardy-Weinberg equilibrium.
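A Hardy-Weinberg check for the biallelic genotypes can be sketched with a one-degree-of-freedom chi-square test. The genotype counts used in the example are approximations reconstructed from the reported percentages and N=2,392, not the actual counts, and the function name is hypothetical; the text does not state which test was used.

```python
def hardy_weinberg_chi2(n_ll, n_ls, n_ss):
    """Chi-square statistic (df=1) comparing observed genotype counts
    with those expected under Hardy-Weinberg equilibrium."""
    n = n_ll + n_ls + n_ss
    p = (2 * n_ll + n_ls) / (2 * n)  # frequency of the L allele
    q = 1.0 - p                      # frequency of the S allele
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_ll, n_ls, n_ss)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Approximate counts from 32.4% LL, 47.8% LS, 19.8% SS in 2,392 participants.
chi2 = hardy_weinberg_chi2(775, 1143, 474)
CRITICAL_VALUE_DF1 = 3.841  # chi-square critical value, alpha=0.05, df=1
in_equilibrium = chi2 < CRITICAL_VALUE_DF1
```

With these reconstructed counts the statistic stays below the 0.05 critical value, in line with the statement that the distribution did not deviate from equilibrium.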
Evidence of childhood maltreatment was distributed in the sample as follows: 70.8% were classified as having no maltreatment, 17.7% as having probable maltreatment, and 11.5% as having severe maltreatment. There were gender differences in childhood maltreatment distribution (χ2=38.74, df=2, p<0.001), with higher levels of maltreatment exposure for females. There was no identifiable evidence of gene-environment correlation, as maltreatment frequency did not differ according to 5-HTTLPR genotype groups.
Gene-Environment Interaction
The assessment of potential confounders indicated that, with the exception of gender, which was included in the final model as a statistically significant covariate, inclusion of other covariates in the model (either individually or as a group) did not change the main results, showing that skin color, maternal mental health, maternal education, and family income were not associated with the outcome in our sample.
Our final regression model showed that the association between depression and childhood maltreatment was moderated by 5-HTTLPR genotype (B=0.47, SE=0.19, Z=2.49, p=0.01), predicting depression in our cohort in the same direction as seen in the original study, and with a similar effect size (Table 2). The association between child maltreatment and youth depression was strongest among SS homozygotes (B=1.22, SE=0.30, Z=4.07, p<0.001), followed by LS heterozygotes (B=0.98, SE=0.21, Z=4.60, p<0.001) and LL homozygotes (B=0.50, SE=0.24, Z=2.11, p=0.03).
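For readers who want to relate the reported coefficients to their test statistics, the sketch below computes a Wald Z, a two-sided p value from the standard normal distribution, and the corresponding odds ratio. It assumes the reported Z values are Wald statistics (B divided by SE); because B and SE are rounded to two decimals in the text, recomputed values can differ slightly from the reported ones.

```python
import math


def wald_test(b, se):
    """Return (z, two-sided p) for a coefficient and its standard error."""
    z = b / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


def odds_ratio(b):
    """Convert a logistic regression coefficient to an odds ratio."""
    return math.exp(b)


# Interaction term reported in the text: B=0.47, SE=0.19.
z_interaction, p_interaction = wald_test(0.47, 0.19)
or_interaction = odds_ratio(0.47)
```

On the odds scale, a coefficient of 0.47 corresponds to an odds ratio of about 1.6 for the interaction term.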
Figure 1 shows the probability of depression in each 5-HTTLPR genotype group for the three maltreatment exposure strata, adjusted for gender. As identified in the original study, although carriers of the short 5-HTTLPR allele with severe maltreatment exposure represented only 8% of the eligible sample, they accounted for more than 28% of the total cases of depression identified in our study. Among individuals with severe maltreatment, 17.6% of those with the SS genotype became depressed, whereas only 7.3% in the LL group did so.
Sensitivity Analyses
Selecting only individuals who self-identified as white for analysis, using the same final regression model, the interaction term remained associated with depressive episode, in the same direction as in the main analysis (B=0.54, SE=0.24, Z=2.21, p=0.03). Analysis of all available individuals, regardless of early depression screening status, failed to show an interaction effect on depressive episode occurrence (B=0.21, SE=0.14, Z=1.51, p=0.13), reaffirming the importance of temporal order in G×E studies. Validity analysis using the 80th percentile as a cut-point showed results similar to the original screening index (B=0.35, SE=0.18, Z=1.99, p=0.04 for the interaction term).
Secondary Analyses
Extending the original study, we also assessed the triallelic model and the specificity in terms of outcome measure. Regrouping individuals with long 5-HTTLPR alleles according to presumed functionality of the a/g SNP resulted in an association for the interaction term that approached but fell short of significance (B=0.34, SE=0.19, Z=1.86, p=0.07). Examining the specificity of the model by substituting the outcome measure in the moderated regression framework revealed that the G×E was present only when major depression was the outcome, as shown in Table 2.
Discussion
This study replicates the 2003 findings of Caspi et al. (12), reinforcing the role of the interaction between genetic and environmental influences in the etiology of youth major depression. We identified an interaction similar to that observed in the original study, using a categorical outcome within a multiplicative model. The results showed that the association between childhood maltreatment and depression was related to the number of S alleles carried, with progressively higher risk for individuals with more copies of the short variant.
These findings are relevant for several reasons. First, they strengthen and add to the body of evidence that individuals exposed to maltreatment are at an increased risk of developing a depressive disorder. Second, they support the original hypothesis that, given comparable levels of childhood maltreatment, the risk of depression varies according to 5-HTTLPR genotype. The cross-study inconsistency in the measurement of variables has been pointed out as one of the most important factors confounding replication attempts (
15,
20,
22,
26,
42). As the quality of meta-analyses is limited by the quality of the studies reviewed, this has also led to difficult-to-interpret statistical summaries.
The low number of standard replications, in comparison to a multiplicity of “approximate replications,” has been suggested to be one of the major weaknesses of current psychiatric research (
43). In the present study, we adhered as closely as possible to the original design and methods (see
Table 3), which is the most recommended strategy for G×E replication (
13,
18,
42). Using this approach, we were able to identify a measured G×E, this time in a totally different sociocultural context, giving strength to the validity of the G×E concept in the etiology of depression. Importantly, however, because a previous study that followed a similar strategy failed to replicate the original findings (17), it is not possible to entirely rule out that no true G×E effect exists. As the magnitude of the variance attributable to this specific G×E is at most small, literature syntheses aggregating comparable studies are required to further disentangle this issue (44).
The replication of the original finding of G×E in a different culture, especially in a middle-income country, reinforces the G×E construct and provides more robustness to the reproducibility of this evidence. Of note, 91% of the studies included in the above-mentioned meta-analyses were from high-income countries, which account for only 17.8% of the world’s population (45) and only 10% of people under age 18 (28).
Our methods and data analyses are consistent with the recent recommendations and updated understanding about G×E research. Currently, childhood maltreatment is considered the key risk factor for depression, with stressful life events probably representing a proxy marker of early maltreatment (
42,
46). The additive genotypic model for 5-HTTLPR has been the most utilized strategy in G×E studies exploring this gene’s moderation (
18). We also evaluated the impact of substitution of this biallelic system for the triallelic model, which resulted in a p value of 0.07 for the interaction term.
The importance of timing in G×E research was also evidenced in our results. When the logistic regression including all 3,558 available participants was performed without taking into account the temporal relationship between the stressor and outcome, the model failed to show an effect of the interaction on depression. This result is consistent with the current literature (
21,
24,
25,
27), which emphasizes the importance of the assessment of temporal order in G×E studies, suggesting a possible explanation for the mixed results in other replication attempts.
Our study is not without limitations. Given that the 1993 Pelotas Birth Cohort Study was not primarily a mental health study, we had psychiatric diagnoses according to DSM-IV-TR criteria available only in the age 18/19 assessment. On previous visits, mental health problems were evaluated with the Strengths and Difficulties Questionnaire, a screening instrument whose accuracy and predictive capability have undergone extensive evaluation, with good results in different cultures, including the one studied (
34). However, for evaluation of temporality between stressor and outcome, we could not rely on DSM-based diagnoses to identify participants who might have had earlier depressive disorders. To mitigate this limitation, we used a highly inclusive screening assessment for emotional problems, decreasing the chances of false negatives, despite the risk of a high false positive rate. As our exclusion rate reached 30% of the eligible sample, we decided to run another secondary analysis, using the dispersion of emotional scale scores. After application of these strategies, our exclusion rate dropped to 21.2%, and the main results were maintained. Another potential limitation is the possible modification of the observed interaction of 5-HTTLPR and childhood maltreatment by other, unmeasured, genetic characteristics. Additionally, as our results were based on a limited number of depressed individuals, we cannot rule out possible unaddressed admixture in our sample or instability in the results of our analyses. Also, our findings were identified from a majority-white population, and our results may not be generalizable to populations with other ethnic backgrounds. Finally, although results from the city of Pelotas do not indicate population stratification in the region (
47), we were not able to rule out this potential bias in our sample.
Our results show not only that S allele carriers are at risk for depression after experiencing childhood maltreatment, but also that early exposure to maltreatment significantly increases the risk for depression regardless of 5-HTTLPR genotype, a finding that differs from the original study. Indeed, a difference of degree rather than an all-or-nothing phenomenon seems more plausible given the complexity of biological mechanisms. Importantly, the S allele conferred an increased risk for depression at age 18/19 only for those individuals who suffered maltreatment before age 15. For those who did not report any maltreatment, the presence of the S allele was associated with lower rates of depression. The interplay of 5-HTTLPR genotype and childhood maltreatment exhibited specificity for the development of major depression in our study, but not for the other psychiatric diagnoses tested, which is in accordance with previous findings (
24,
48).
In conclusion, these results provide further evidence supporting the initial hypothesis of interaction between 5-HTTLPR polymorphisms and childhood maltreatment in the development of youth depression. Despite some differences in study methods, adopting the general principles of the original study proved to be indispensable to adequately assessing replicability—an aspect that was evident especially in regard to the timing of measurements. At a time when high hopes after initial studies are usually followed by discrepant results and skepticism, we believe that groundbreaking advancements in psychiatric knowledge will be contingent on further replication of findings across diverse settings.
Acknowledgments
The authors thank Lucas Battel and the entire ProDIA group for their assistance in the development of this work, and they especially thank Giovanni Abrahão Salum for his thoughtful insights on the initial version of this study.