Efficacy and side effect data for psychotropic medications in the United States have been investigated primarily in non-Hispanic Caucasian populations. Currently, the Hispanic population is the largest ethnic minority group in the country, representing over 37 million people (1). Within this group, almost 70% are Mexican Americans. Although this population is growing dramatically, there is insufficient research regarding psychiatric diagnosis and treatment in this group (2).
Major depressive disorder is a serious public health problem worldwide, with a lifetime prevalence of 10%–20% in the general population (3). Vega et al. (4) reported that the lifetime prevalence of major depression in U.S.-born Mexican Americans is 14.8%. Although Hispanics have participated in antidepressant treatment studies, it has been difficult to ascertain whether there are any major differences in antidepressant efficacy in that population, for several reasons, including methodological differences among studies, small sample sizes, and the inclusion of several Hispanic subgroups in an attempt to illustrate a “Hispanic response” (5).
A retrospective review (6) suggested that pharmacokinetic factors play a role in the differential sensitivity to tricyclic antidepressants in depressed Puerto Rican American females as compared with Anglo females, resulting in greater efficacy, higher rates of adverse drug reactions, and higher dropout rates in the former group. In an open-label study, nefazodone (7) was found to have similar efficacy but a higher dropout rate in a predominantly Hispanic Caribbean female sample as compared with non-Hispanics from previous nefazodone trials. An open-label study of two selective serotonin reuptake inhibitors (SSRIs) (8) found that although the drugs were comparable in efficacy, Mexican American females had a higher dropout rate even though there were more severe adverse drug reactions reported in the non-Hispanic group. A more recent study of an SSRI reported no differences in efficacy, rates of adverse drug reactions, or dropout rates between Hispanic and non-Hispanic patients with HIV (9).
Three recent meta-analyses on the pharmacogenetics of antidepressants in major depression were not able to show genome-wide significant variations (10–12). The nonsynonymous single-nucleotide polymorphism (SNP) rs6265 in the BDNF (brain-derived neurotrophic factor) gene may have a minor impact on susceptibility to major depression (13) and antidepressant drug response (11); however, the overall conclusion of these meta-analyses was that the results do not support any major effect of any single gene variation in the pharmacogenetics of antidepressants in major depression.
In the present study, we focused on functional SNPs, based on 1) the likely significance of a functional SNP in the BDNF gene (rs6265) in the genetics and pharmacogenetics of major depression; 2) our own recent work proposing a predictive framework for the diagnosis of major depressive disorder using interactions of multiple functional SNPs and environmental factors (14); and 3) a growing body of evidence supporting the involvement of epigenetic mechanisms in major depression and antidepressant action (15–19). Here we present data on the efficacy and adverse drug reaction profiles of desipramine and fluoxetine, two extensively used off-patent antidepressants, and new pharmacogenetic leads that could advance our understanding of genetic variants implicated in antidepressant treatment response.
Method
The study protocol was approved by the University of California Los Angeles and University of Miami institutional review boards and the Australian National University Human Ethics Committee. This was a single-site prospective double-blind 8-week trial with fluoxetine and desipramine conducted in the greater Los Angeles area. All study participants had an initial medical evaluation consisting of a detailed history, physical examination, and blood collection for routine testing and genotyping, followed by two consecutive study phases: a 1-week single-blind placebo lead-in phase to minimize the effect of placebo response, and subsequent random assignment to receive either 10–40 mg/day of fluoxetine or 50–200 mg/day of desipramine, with weekly follow-up visits to assess clinical status. Participants provided written informed consent after receiving a complete description of the study.
Given the proven efficacy of these antidepressant medications, we used a placebo lead-in period followed by active treatment for all patients in order to minimize risk to participants (20, 21).
Participants
All participants met the following inclusion criteria (
22): at least three of their four grandparents born in Mexico (
23); a DSM-IV diagnosis of current unipolar major depressive episode; a score ≥18 on the 21-item Hamilton Depression Rating Scale (HAM-D) (
24), with item 1 (depressed mood) rated ≥2; and age between 18 and 70 years. Exclusion criteria were any axis I disorder other than major depressive disorder or primary anxiety disorders; an active medical illness that could be related to the ongoing depression (e.g., untreated hypothyroidism, myocardial infarction or cerebrovascular incident within the previous 6 months, uncontrolled hypertension or diabetes); current suicidal ideation with a plan and strong intent, or recent serious suicide attempt; history of ECT in the previous 6 months; current use of medications with CNS activity that interferes with EEG activity or any other antidepressant treatment within 2 weeks before enrollment; history of poor response to treatment with desipramine or fluoxetine; illicit drug use or alcohol abuse in the previous 3 months; and current enrollment in counseling or psychotherapy treatment. In addition, women who were pregnant or lactating or were of childbearing age and not using contraception were excluded.
Recruitment and Outcome Measures
Participants were recruited by advertisements in bilingual newspapers, radio, and television. Informed consent forms, questionnaires, and assessment scales were provided in each participant’s preferred language (English or Spanish). In addition, clinical staff participated in health fairs, conferences, and cable network programs through which they recruited participants, and some participants were referred by regional outpatient community clinics.
The presence of a current major depressive episode was determined by the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID) (
25) (mean kappa score for sensitivity and specificity among raters, 0.84–0.85), and diagnoses were confirmed by a research psychiatrist. Symptom severity was rated by experienced bilingual clinical personnel using Spanish or English versions of the HAM-D (
24), the Hamilton Anxiety Rating Scale (HAM-A) (
26), the Global Assessment Scale (GAS) (
27), the Beck Depression Inventory (BDI) (
28), and the Center for Epidemiologic Studies Depression (CES-D) Scale (
29). Participants also completed an acculturation questionnaire to determine their language preference, education level, and generation status.
Interventions and Treatment
All participants received 1 week of single-blind placebo to minimize the effect of placebo response, which was defined as a decrease of 25% or more in HAM-D score compared with the screening visit or a HAM-D score <18. Those who did not show a placebo response were randomly assigned in a 1:1 ratio to receive either fluoxetine or desipramine for an 8-week double-blind phase. Participants initially received 10 mg/day of fluoxetine or 50 mg/day of desipramine, which increased at week 2 to 20 mg/day of fluoxetine or 100 mg/day of desipramine. At week 4, for participants who had less than a 25% decrease in their HAM-D score, dosages were increased to 30 mg/day of fluoxetine or 150 mg/day of desipramine. At week 6, for participants whose HAM-D score was >12, dosages were increased to 40 mg/day of fluoxetine or 200 mg/day of desipramine. The staff and participants were aware of dosage escalation, which occurred only if the previous dosage was well tolerated, but they were blind to the drug. At the end of the study, participants were referred to a psychiatric clinic of their choice for follow-up treatment. Random antidepressant blood levels were collected to ascertain medication adherence but not to obtain therapeutic levels.
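The lead-in and titration rules above can be sketched in code. This is a minimal illustration, not the trial's actual procedure: the function names are hypothetical, the week 4 decrease is assumed to be measured against the week 0 HAM-D score, and a participant who qualifies at week 6 is assumed to move directly to the stated week 6 dose.

```python
# Dose steps (mg/day) taken from the titration schedule in the text.
DOSE_STEPS = {
    "fluoxetine": [10, 20, 30, 40],
    "desipramine": [50, 100, 150, 200],
}
# Index of the target dose step at each scheduled escalation visit.
WEEK_TARGET_STEP = {2: 1, 4: 2, 6: 3}


def is_placebo_responder(screening_hamd, lead_in_hamd):
    """Placebo response: HAM-D decrease of 25% or more, or HAM-D below 18."""
    decrease = (screening_hamd - lead_in_hamd) / screening_hamd
    return decrease >= 0.25 or lead_in_hamd < 18


def dose_after_visit(drug, week, current_dose, baseline_hamd, hamd, tolerated=True):
    """Return the dose (mg/day) prescribed after the visit at the given week."""
    if week not in WEEK_TARGET_STEP or not tolerated:
        return current_dose  # escalation only at weeks 2, 4, 6 and only if tolerated
    if week == 2:
        escalate = True  # every participant steps up at week 2
    elif week == 4:
        # Assumption: decrease is computed relative to the week 0 score.
        escalate = (baseline_hamd - hamd) / baseline_hamd < 0.25
    else:
        escalate = hamd > 12  # week 6 rule
    target = DOSE_STEPS[drug][WEEK_TARGET_STEP[week]]
    return max(current_dose, target) if escalate else current_dose


if __name__ == "__main__":
    print(is_placebo_responder(24, 18))
    print(dose_after_visit("desipramine", 6, 150, 24, 13))
```

Keeping the schedule in a lookup table separates the protocol constants from the decision logic, so the same function serves both drugs.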
Statistical Analysis
Analyses were performed using SAS, version 9.1.3 (SAS Institute, Cary, N.C.), and included all participants who received at least 1 week of study drug. The primary outcome measure was change in HAM-D score from week 0 to week 8. The secondary outcomes of interest were change in HAM-A, BDI, CES-D Scale, and GAS scores. Remission was defined as a HAM-D score <8, response was defined as a reduction of ≥50% in HAM-D score, and nonresponse was defined as a reduction of <50% in HAM-D score. Remission and response were compared between treatment groups. Student’s t test was used to compare the mean values for age, acculturation score, and baseline clinical measurements, and chi-square or Fisher’s exact test was used to compare the percentages of demographic characteristics and side effect events.
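The categorical outcome definitions can be made concrete with a short sketch. The function name and the dictionary layout are illustrative assumptions; only the thresholds come from the definitions above.

```python
def classify_outcome(baseline_hamd, week8_hamd):
    """Apply the remission, response, and nonresponse definitions to one participant."""
    reduction = (baseline_hamd - week8_hamd) / baseline_hamd
    return {
        "remission": week8_hamd < 8,      # HAM-D score below 8
        "response": reduction >= 0.50,    # reduction of at least 50%
        "nonresponse": reduction < 0.50,  # reduction of less than 50%
    }


if __name__ == "__main__":
    print(classify_outcome(24, 7))
    print(classify_outcome(24, 13))
```

Remission and response are not mutually exclusive under these definitions, so a participant can satisfy both, whereas response and nonresponse are complementary.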
For repeated continuous outcome measures, a likelihood-based mixed-effects model was used as the primary efficacy analysis to assess differences between treatment groups in changes from baseline across 8 weeks. The model included the categorical effects of treatment, treatment week, treatment-by-week interaction, and gender, as well as continuous covariates of baseline score and age. Mixed-model repeated-measures analyses of changes from baseline were conducted using PROC MIXED in SAS. A macro was written in SAS for covariance structure selection by comparing Akaike’s information criterion and Schwarz’s Bayesian criterion using compound symmetry, unstructured, first-order autoregressive (AR[1]), and Huynh-Feldt covariance structures based on the “smaller is better” criterion. For each outcome comparison, the covariance structure with the smallest sum of Akaike’s information criterion and Schwarz’s Bayesian criterion was used.
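The selection rule for the covariance structure can be illustrated outside SAS. The sketch below is not the authors' macro: the −2 log-likelihood values are hypothetical, eight weekly time points are assumed, and the Bayesian criterion is assumed to use the log of the number of subjects as its penalty term.

```python
import math


def n_cov_params(structure, t):
    """Number of covariance parameters for t repeated measurements."""
    return {
        "CS": 2,                 # compound symmetry: variance + common covariance
        "AR1": 2,                # first-order autoregressive: variance + correlation
        "HF": t + 1,             # Huynh-Feldt: t variances + one extra parameter
        "UN": t * (t + 1) // 2,  # unstructured: all variances and covariances
    }[structure]


def select_structure(neg2_loglik, t, n_subjects):
    """Pick the structure with the smallest AIC + BIC ("smaller is better")."""
    scores = {}
    for name, neg2ll in neg2_loglik.items():
        k = n_cov_params(name, t)
        aic = neg2ll + 2 * k
        bic = neg2ll + k * math.log(n_subjects)  # assumed penalty term
        scores[name] = aic + bic
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Hypothetical -2 log-likelihoods from four fits of the same mean model.
    fits = {"CS": 1000.0, "AR1": 990.0, "HF": 985.0, "UN": 950.0}
    print(select_structure(fits, t=8, n_subjects=100))
```

With these made-up values the unstructured fit has the best likelihood but carries 36 parameters, so the summed criterion favors AR(1).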
For repeated analyses of dichotomous outcomes (remission and response), a modified Poisson regression model with robust error variance, estimated by the generalized estimating equation approach (30), was fitted with PROC GENMOD in SAS to estimate the adjusted relative risk. To compare the time to response or remission between the two treatment groups, Cox regression analysis was performed with PROC TPHREG in SAS using the DISCRETE option to handle ties in event time. Gender, age, and baseline scores were included as covariates in all models. For all analyses, the threshold for significance was a p value ≤0.05 (two-sided).
Pharmacogenetics Procedures
Whole-exome genotyping.
Genomic DNA of 65 participants who completed the 8-week treatment course (36 of whom had remitted at week 8 and 29 of whom had not responded) was subjected to whole-exome genotyping, performed by the Australian Genome Research Facility (Melbourne), an Illumina Certified Service Provider for the Infinium Genotyping Service. We used the Illumina HumanExome-12v1_A BeadChip, which covers putative functional exonic variants selected from over 12,000 individuals. The exonic content consists of >250,000 markers representing diverse populations (including European, African, Chinese, and Hispanic individuals) and a range of common conditions, such as type 2 diabetes, cancer, metabolic disorders, and psychiatric disorders. Samples with calls below the Illumina-expected 99% SNP call rate were excluded. To test genotyping reliability and quality, an individual was duplicated. The identity by descent between all pairs of individuals was estimated and used for quality control.
Quality control and filtering.
GenomeStudio data were imported to SVS, version 7.6.7 (Golden Helix, Bozeman, Mont.;
http://www.goldenhelix.com), an integrated collection of analytic tools for managing, analyzing, and visualizing multifaceted genomic and phenotypic data.
Parameters for excluding markers from analyses included 1) deviations from the Hardy-Weinberg equilibrium with p<2×10⁻⁷ (0.05/250,000 markers) in both case and control subjects (this avoids the exclusion of major effect causal variants); 2) a genotype call rate <90%; 3) more than two alleles; and 4) monoallelism. Genotype and allelic frequencies were estimated by maximum likelihood.
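The four exclusion rules can be written as a single marker filter. The sketch below is an illustration, not the SVS implementation: the function names are hypothetical, and the Hardy-Weinberg test is assumed to be a 1-degree-of-freedom chi-square test, which the text does not specify.

```python
import math

HWE_THRESHOLD = 0.05 / 250_000  # Bonferroni-style cutoff, 2e-7


def hwe_p_value(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg proportions; an assumed test choice."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # maximum likelihood allele frequency
    q = 1 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df


def passes_qc(case_counts, control_counts, n_missing, n_alleles=2):
    """Return True if a marker survives all four exclusion rules."""
    called = sum(case_counts) + sum(control_counts)
    if called / (called + n_missing) < 0.90:  # rule 2: call rate below 90%
        return False
    if n_alleles > 2:  # rule 3: more than two alleles
        return False
    n_aa, n_ab, n_bb = (c + d for c, d in zip(case_counts, control_counts))
    if 2 * n_aa + n_ab == 0 or 2 * n_bb + n_ab == 0:  # rule 4: monoallelic
        return False
    # Rule 1: exclude only if HWE is violated in BOTH cases and controls.
    fails_cases = hwe_p_value(*case_counts) < HWE_THRESHOLD
    fails_controls = hwe_p_value(*control_counts) < HWE_THRESHOLD
    return not (fails_cases and fails_controls)


if __name__ == "__main__":
    print(HWE_THRESHOLD)
    print(passes_qc((10, 16, 10), (8, 14, 7), n_missing=0))
```

Requiring the deviation in both groups is what protects a causal variant with a large effect, since such a variant may depart from Hardy-Weinberg proportions in one group only.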
Genetic stratification analysis.
We estimated the inbreeding coefficient in order to detect the presence of hidden biological relatives in the sample, which might reduce the independence of the data. The fixation index between pairs of subpopulations, case subjects, and control subjects was estimated to evaluate the potential presence of genotype stratification (microdifferentiation), a common cause of spurious associations. Independently, an additional correction of putative population stratification was applied with 10 principal-components analyses to normalize genotypic data by its actual standard deviation.
Exome-wide association analysis.
The genotypic (additive model) and allelic tests of association were applied. Multiple test correction to determine exome-wide significance was performed using the false-discovery-rate approach. Mixed linear models were applied as a tool to include in the analyses fixed factors (sex, age, treatment) and random effects (family or population structure) and to contrast with other analytical tools, different from principal-components analysis, the effects of potential inbreeding (by inclusion of the kinship matrix as defined by identity by descent). We applied the single-locus mixed model (which assumes that all loci have a small effect on the trait) and the multilocus mixed model (which assumes that several loci have a large effect on the trait) as implemented in SVS. Linkage disequilibrium analysis was also implemented using SVS.
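The multiple-testing step can be illustrated as follows. The text states only that a false-discovery-rate approach was used; the Benjamini-Hochberg step-up procedure shown here is an assumption, chosen because it is the most common FDR method, and the p values in the example are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical marker-level p values
    print(benjamini_hochberg(raw))
```

A marker is declared significant at a given false-discovery rate when its adjusted p value does not exceed that rate.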
Advanced recursive partitioning (tree-based) approach (ARPA).
Rao has suggested that recursive partitioning techniques should be highly recommended for genetic dissection of complex traits (
31). ARPA is widely used in predictive analyses, as it accounts for nonlinear and interaction effects, offers fast solutions to reveal hidden complex substructures, and provides truly nonbiased statistically significant analyses of seemingly unrelated high-dimension data (
14).
ARPA accounts for the effect of hidden interactions better than alternative methods and is independent of the type of data (categorical, continuous, ordinal, etc.) and data distribution type (normal or nonnormal) (
31). Furthermore, results supplied by tree-based analytics are easy to interpret visually and logically (
14). Therefore, to generate the most comprehensive and parsimonious classificatory model to predict remission of major depression at the end of treatment, we applied ARPA using a set of different modules implemented in the Salford Predictive Modeler (SPM) software package, namely, CART, random forest, and TreeNet (
http://www.salford-systems.com). SPM is a highly accurate and ultra-fast analytics and data-mining platform for creating predictive, descriptive, and analytical models from databases of any size, complexity, or organization. One important advantage of SPM compared with other available software is that it can use raw data with sparse or empty cells (a common problem when dealing with genetic data).
CART is a nonparametric approach in which a series of recursive subdivisions separates the data by dichotomization (
32). The aim is to identify, at each partition step, the best predictive variable and its best corresponding splitting value while optimizing a splitting statistical criterion so that the data set is successfully divided into increasingly homogeneous subgroups (
32). We used a battery of different statistical criteria as splitting rules (e.g., Gini index, entropy, and twoing) to determine the splitting rule that decreased the relative cost of the tree the most while increasing the prediction accuracy of target variable categories (
32). The best split at each dichotomous node was chosen by either a measure of between-node dissimilarity or iterative hypothesis testing of all possible splits to find the most homogeneous split (lowest impurity) (
32). Similarly, we used a wide range of empirical probabilities (priors) to model numerous scenarios recreating the distribution of the targeted variable categories in the population (
32). Following this iterative process, each terminal node was assigned to a class outcome (remitter or nonresponder).
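The three splitting rules named above can be written out for a two-class problem. This is a generic sketch of the standard textbook definitions, not the SPM implementation, and the candidate split in the example is hypothetical; only the class totals (36 remitters, 29 nonresponders) come from the study.

```python
import math


def gini(counts):
    """Gini impurity of a node given its class counts."""
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)


def entropy(counts):
    """Shannon entropy (bits) of a node given its class counts."""
    n = sum(counts)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


def impurity_decrease(parent, left, right, impurity=gini):
    """Impurity reduction achieved by splitting parent into left and right."""
    n = sum(parent)
    weighted = sum(left) / n * impurity(left) + sum(right) / n * impurity(right)
    return impurity(parent) - weighted


def twoing(left, right):
    """Twoing criterion: (P_L * P_R / 4) * (sum of |p(j|L) - p(j|R)|) ** 2."""
    n_left, n_right = sum(left), sum(right)
    n = n_left + n_right
    diff = sum(abs(l / n_left - r / n_right) for l, r in zip(left, right))
    return (n_left / n) * (n_right / n) / 4 * diff ** 2


if __name__ == "__main__":
    parent = (36, 29)                  # (remitters, nonresponders)
    left, right = (30, 5), (6, 24)     # hypothetical split on one genotype
    print(impurity_decrease(parent, left, right, gini))
    print(impurity_decrease(parent, left, right, entropy))
    print(twoing(left, right))
```

Larger values indicate a better split under each rule; comparing candidate splits under several rules is one way to check that the chosen splitter does not depend on a single criterion.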
To avoid finishing with an overfitted CART predictive model (a common problem in CART analysis) and to ensure that the final splits were well substantiated, we applied tree pruning (
32). During this procedure, predictor variables that were close competitors (surrogate predictors with comparable overall classification error to the optimal predictors) were pruned to eliminate redundant commonalities among variables, so the most parsimonious tree would have the lowest misclassification rate for an individual not included in the original data (
32).
Furthermore, to exactly identify the most important set of variables predicting remission of major depression, we applied the random forest method using a bagging strategy (
33). The random forest strategy differs from CART in the use of a limited number of variables to derive each node while creating 100 to 1000 trees (
33). This strategy has proved to be immune to the overfitting generated by CART (
33). In the random forest strategy, variables that appeared repeatedly as predictors in the trees were identified. The misclassification rate was recorded for each approach.
The TreeNet strategy was used to complement the CART and random forest analyses because it reaches a level of accuracy that is usually not attainable by single models (CART) or by ensembles such as bagging (random forest) (
34). The TreeNet algorithm generates thousands of small decision trees built in a sequential error-correcting process converging on an accurate model (
34).
To obtain honest assessments of the derived models and have a better view of their performance on future unseen data, we applied a cross-validation strategy in which both training with all the data and then indirect testing with all the data were performed. To do so, we randomly divided the data into 10 separate partitions (folds). This strategy allowed us to conduct a cross-validation and review the stability of results across multiple replications (
32). Cross-validation is a model validation technique for assessing how the results of a statistical analysis will generalize to an independent data set; the n-fold cross-validation technique is designed to get the most out of data sets that are too small to accommodate a holdout or test sample. For our specific problem, we used a maximization algorithm that allowed us to validate the entire original results of our genome analysis by considering n-fold subsamples of the original set.
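The partitioning behind 10-fold cross-validation can be sketched as follows. The function name and the random seed are illustrative assumptions; the sample size of 65 matches the number of genotyped completers.

```python
import random


def k_fold_splits(n, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)      # random assignment to folds
    folds = [indices[i::k] for i in range(k)]  # k nearly equal partitions
    for held_out, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        yield train, test


if __name__ == "__main__":
    for train, test in k_fold_splits(65, k=10, seed=1):
        print(len(train), len(test))
```

Each participant appears in exactly one test fold and in the training set of the other nine, which is how every observation contributes to both model fitting and testing without being used for both in the same replication.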
We also applied a categorical approach to link the set of genotypes of the associated marker (rs1321744) by using latent class analysis to identify unobservable subgroups within the subset of completers who were genotyped (remitters and nonresponders).
Discussion
In this single-site double-blind antidepressant trial of predominantly first-generation Mexican American patients, we found that both fluoxetine and desipramine were effective; however, patients treated with fluoxetine had significantly better scores at endpoint on the HAM-D, HAM-A, BDI, and GAS than desipramine-treated patients across all analytical approaches. The advantage of fluoxetine over desipramine was also evidenced by a shorter time to response in the survival analysis, a lower occurrence of anticholinergic and cardiovascular adverse drug reactions, and a lower dropout rate. These clinical outcomes contrast with previous studies showing similarities in efficacy between SSRIs and tricyclic antidepressants (
36) and no difference in the response rate between SSRIs and tricyclics in an intent-to-treat analysis (
37).
Our results are in concordance with data showing higher dropout rates among patients treated with tricyclic antidepressants because of lack of efficacy or adverse reactions (30.0% and 24.7%, respectively) (
37). Tricyclic antidepressants have historically been found to be associated with moderate to severe adverse drug reactions (
38); SSRIs are associated with milder adverse drug reactions that may diminish as treatment continues, and they also have lower toxicity and lower lethality when taken in an overdose situation (
39). The overall comparability we observed in tolerability between desipramine and fluoxetine could be related to the close monitoring of our patients, a practice that improves outcomes of treatment with tricyclic antidepressants in depression (
40). Our data supported a higher dropout rate and occurrence of anticholinergic and cardiovascular side effects in desipramine-treated patients. Previous studies have indicated that Africans are more sensitive to tricyclic antidepressants (
41,
42), and populations with higher African genetic admixture rates, such as Puerto Rican Americans, have elevated adverse drug reaction rates and dropout rates with tricyclic antidepressants (
43). This could also have contributed to a greater sensitivity to tricyclic antidepressants in our Hispanic subgroup with an African genetic admixture.
To our knowledge, this is the first randomized double-blind placebo lead-in trial conducted in the United States comparing the clinical efficacy and tolerability of antidepressants in depressed individuals of Mexican descent. The reasons for the lack of prospective randomized clinical trials in ethnic minorities are multiple and include difficulties in recruitment and retention of appropriate subjects and significantly less adherence to antidepressant therapy during the initial 100-day period (
44). Considerable resources were needed for our study. Recruitment was challenging despite the fact that over 8 million persons of Mexican descent live in California (
1), including 3 million in Los Angeles County. We conducted more than 4,300 telephone interviews and scheduled 1,223 screening visits in order to obtain 166 study completers.
A World Health Organization study (
45) found that treatment with tricyclic antidepressants was more cost-effective than treatment with SSRIs in 14 different populations (including Mexico and the United States), particularly in lower-income subregions. However, in our study, fluoxetine treatment produced a better and faster response than desipramine in first-generation Mexican Americans with mild to moderate major depressive disorder, which suggests that fluoxetine may constitute a better drug of first choice for patients of Mexican descent.
Our whole-exome genotyping approach identified one functional intergenic SNP in chromosome 6 with exome-wide association with remission (p=1.98×10⁻⁶; false-discovery-rate-corrected p=0.05), which is remarkable given the small number of remitters (N=36) and nonresponders (N=29). Mexicans are an admixture of Europeans, Native Americans, and Africans, but a considerable proportion of their ancestry is Caucasian (
46). For European-descendent populations, a p value ≤7.2×10⁻⁸ is regarded as compellingly significant for a genome-wide effect (
47), but this assumption is appropriate for hypotheses that are tested on a genome scale (
48). Our results strongly support the involvement of common functional variants in antidepressant drug response, specifically of brain methylation sites. Not much is known about the function of the intergenic exm-rs1321744, but growing clinical and preclinical evidence has implicated brain epigenetic changes in stress, depression, and antidepressant action (
15–
19); however, that body of work has focused mainly on regulation of the hypothalamic-pituitary-adrenal axis.
Even though the linkage disequilibrium block harboring exm-rs1321744 does not contain any gene, three genes surround the significantly associated peak: TBX18, NT5E, and SNX14. TBX18 encodes a member of an evolutionarily conserved family of transcription factors that plays a crucial role in embryonic development. NT5E encodes a plasma membrane protein that catalyzes the conversion of extracellular nucleotides to membrane-permeable nucleosides. SNX14 encodes a member of the sorting nexin family that is involved in intracellular trafficking. The encoded protein also contains a regulator of G protein signaling domain; such domains act as GTPase activating proteins for G alpha subunits of heterotrimeric G proteins.
Our ARPA results suggest that the phenotype for antidepressant response is polygenic, as tree analysis showed that three common functional SNPs could predict remitter versus nonresponder status with 94% accuracy in our population. The top main splitter, exm-rs1321744, was discussed above, but, intriguingly, the other main splitter variants are located in regions relevant to lipoprotein function; rs7679 (in the
PCIF1 gene, chromosome 20) has genome-wide association with blood lipoprotein concentrations (
49), and intergenic exm-rs350035 on chromosome 5 is 10 nucleotides away from rs351629, a genome-wide SNP associated with triglycerides phenotype (
50) (source, dbGaP;
http://www.ncbi.nlm.nih.gov/gap).
Our study had several limitations. One is that the genotyping was done in only 65 of 166 participants who completed 8 weeks of treatment and 232 intent-to-treat subjects. We found no differences in week 8 HAM-D scores between completers who were genotyped and those who were not (see Table S3 in the online data supplement), which supports the absence of any ascertainment bias. A second limitation is that antidepressant blood levels were not included as a covariate, as they were randomly collected at different times of the day; some patients could come for follow-ups only in the evening. The absence of a placebo arm is another important limitation. However, as the efficacies of desipramine and fluoxetine were well known when we designed our study, it would have been ethically problematic to justify a placebo arm. To minimize placebo response in our study, patients received placebo during the first week of their treatment. Placebo was given single-blinded; patients knew that they were going to receive 1 week of placebo at some point in their treatment course, but only staff knew that this would occur in the first week of treatment. In our study, we determined a priori that placebo response was a reason to remove participants from the study, and therefore we believe that the SNPs we identified are likely to predict response to antidepressant treatment and not merely improvement.