To the Editor: We are pleased that our findings of a relationship between the rs28365143 genotype and antidepressant outcomes received this level of attention and prompted a meta-analysis to explore the generalizability of our results. Here we respond to the points raised by Fabbri et al. and discuss aspects of the meta-analysis that may explain decreased sensitivity and lack of concordance with our original findings.
An important consideration in replication is the extent to which key features of the original study are matched in the replication. If they are not, the analysis serves more as a test of generalizability than as a replication. A sufficiently powered, large sample may still fail to find significant effects because of key differences between samples. Arguably, numerous such differences may have contributed to the lack of statistical significance in the meta-analysis.
Our samples differed in three categories: participants, data analysis, and outcome measures. First, the International Study to Predict Optimized Treatment in Depression (iSPOT-D) and the Predictors of Remission in Depression to Individual and Combined Treatments (PReDICT) study appear to have more stringent inclusion criteria than those used in the meta-analysis. This departure accords with the meta-analytic goal to consider broader generalizability. For example, current substance abuse, a condition that can affect the efficacy of treatment (1, 2), was excluded in the iSPOT-D and in the PReDICT study but not in the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study or the Genome-Based Therapeutic Drugs for Depression (GENDEP) project (although substance dependence was excluded) (3, 4). Similarly, benzodiazepines were permitted in both the STAR*D and GENDEP trials but were prohibited in the iSPOT and PReDICT trials. Indeed, approximately one-third of participants in the GENDEP study received benzodiazepines during the study period (3). While these less stringent inclusion criteria maximize clinical generalizability, increased patient heterogeneity could increase noise and, with it, the likelihood of type II error. Second, with respect to data analysis, while our recent iSPOT-D and PReDICT analysis focused on completers of an 8-week treatment with antidepressants, the studies incorporated in the meta-analysis imputed depression severity of patients who dropped out before the end of treatment. This difference may represent another source of noise that could increase the chances of a false negative result. Third, differing outcome measures between studies can further account for lack of generalizability. The associations reported in the iSPOT-D and PReDICT samples used the Hamilton Depression Rating Scale (HAM-D), while studies included in the meta-analysis used a variety of different outcome measures, including the HAM-D, the Beck Depression Inventory, and the Montgomery-Åsberg Depression Rating Scale (3–6). Although these measures all aim to assess the same underlying depression severity, the concordance between them is variable (7, 8), which could inflate confidence intervals and again increase the likelihood of a type II error.
When considering these differences between our studies, it is perhaps not surprising that the significance of rs28365143 was not maintained in the meta-analysis. We are encouraged that the relationship reported by the meta-analysis is in the same direction as the relationship we observed despite these differences. Future analyses exploring these conditions are needed to fully understand the association between genotype and treatment outcome in depression.
We also agree there are inherent limitations to relatively small candidate gene studies in racially nonhomogeneous populations, including confounding by genetic ancestry. Although we acknowledge this limitation in our original publication (9), there are several reasons we believe it does not explain the observed effect size. First, remission rates did not differ significantly between participants who identified as Caucasian, black, or Asian in the iSPOT-D sample (44%, 40%, and 47%, respectively, analysis of variance p=0.52), making it unlikely that differing remission rates are driving the observed association. We agree, however, that incorporating principal components based on ancestry is a valuable approach when genetic homogeneity cannot be attained in a sample. Although this could not be performed in the iSPOT sample, we did perform a principal components-adjusted analysis for the PReDICT sample. Critically, while it increased the degrees of freedom of the model and therefore increased the p values slightly, the effect of genotype remained, leading us to believe that the effect of the single-nucleotide polymorphism was not driven by population stratification.
In summary, we agree that collaboration is essential to progress in the field of psychiatric genetics, and we believe that the work of Fabbri et al. exemplifies an important step in assessing the generalizability of new findings. We advocate for additional meta-analyses like theirs to further characterize the role of corticotropin-releasing hormone binding protein in antidepressant response.