In patients with major depressive disorder, nonresponse to treatment with selective serotonin reuptake inhibitors (SSRIs) and serotonin-norepinephrine reuptake inhibitors (SNRIs) is common, particularly in women, occurring in about 70% of patients despite adequate dosing (1). Additional well-tolerated augmentation strategies are needed, particularly ones that do not cause or exacerbate symptoms such as fatigue and sexual dysfunction. Modest preclinical and clinical data suggest that transdermal testosterone, in dosages designed to raise levels into or near the physiologically normal range for women (10%−20% of male levels), is a candidate for such a therapy.
Preclinical and animal models are consistent with the hypothesis that androgens are modulators of mood (2–4). Testosterone affects brain function directly as well as by aromatization of testosterone to estradiol and conversion to the potent androgen dihydrotestosterone (5–10). Additionally, the testosterone metabolite 5α-androstane-3α,17β-diol (3α-diol) is a neuroactive steroid that has been shown to exert important GABAergic-related effects on affective symptoms. Preclinical models also suggest androgen effects on central serotonin neurotransmission, which is relevant to the modulation of core depression symptoms (11).
Clinical data also suggest a beneficial effect of testosterone on mood, both in hypogonadal men with refractory depression (12) and in small randomized placebo-controlled trials in women not selected for major depression. In the latter studies, conducted in women after bilateral oophorectomy with sexual dysfunction (13) or in women with hypopituitarism (14), low-dose testosterone administration improved mood, fatigue (14), and sexual dysfunction (15). Our preliminary data in women with treatment-resistant depression (N=9) treated with transdermal testosterone in an 8-week open-label trial (16) demonstrated that 67% achieved categorical response (defined as a decrease of ≥50% from baseline in Montgomery-Åsberg Depression Rating Scale [MADRS] score), and 33% achieved remission (defined as a MADRS score ≤10) after 8 weeks of therapy. Moreover, in our studies and in other investigations, in total evaluating more than 2,000 women followed for as long as 1 year, low-dose transdermal testosterone was found to be extremely well tolerated, without significant hyperandrogenic or metabolic side effects (17).
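The categorical outcomes defined above reduce to two simple rules on MADRS scores. The following Python sketch encodes them as stated in the text (response: a decrease of at least 50% from baseline; remission: a score of 10 or lower). The function names and the example scores are illustrative assumptions, not study data or study code.

```python
def is_responder(baseline: float, endpoint: float) -> bool:
    """Categorical response: MADRS decreased by >= 50% from baseline."""
    if baseline <= 0:
        raise ValueError("baseline MADRS score must be positive")
    return (baseline - endpoint) / baseline >= 0.5


def is_remitter(endpoint: float) -> bool:
    """Remission: endpoint MADRS score of 10 or lower."""
    return endpoint <= 10


# Hypothetical (baseline, endpoint) scores for illustration only.
examples = [(30, 14), (30, 16), (18, 10)]
for baseline, endpoint in examples:
    print(baseline, endpoint, is_responder(baseline, endpoint), is_remitter(endpoint))
```

The two criteria are independent: a participant with a low baseline score can meet the remission threshold without meeting the response threshold.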
Based on these data, we hypothesized that low-dose adjunctive testosterone would result in greater improvement in depression symptom severity in women with antidepressant-resistant major depression compared with placebo. We additionally hypothesized that testosterone would be well tolerated and, compared with placebo, would improve fatigue and sexual dysfunction, specific symptoms that are commonly associated with major depression and with many medications used to treat major depression. In addition, because use of compounded and male-branded testosterone products by women is common (18), we sought to establish whether adjunctive transdermal testosterone was safe and well tolerated for women with major depression. Finally, we also explored whether adjunctive low-dose testosterone would increase activation of the subgenual and dorsal anterior cingulate cortex (ACC), a brain region important in the regulation of mood.
Methods
Participants
The protocol was approved by the Partners Human Research Committee and the Butler Hospital Institutional Review Board, and written informed consent was obtained from all participants before any procedures were performed. Inclusion criteria were female sex, age between 21 and 75 years, a primary diagnosis of major depressive disorder by DSM-IV criteria using the Structured Clinical Interview for DSM-IV Axis I Disorders (SCID-IV) (19), a MADRS score ≥12, and current treatment with an antidepressant (a first, second, or third trial in the current depressive episode) at an adequate dosage for at least 8 weeks, with sufficient source documentation to confirm a high level of confidence in treatment details, using the MGH Antidepressant Treatment Response Questionnaire (20). In addition, a baseline free testosterone level no higher than the third quartile of the normal range was required. Exclusion criteria included significant suicide or homicide risk, history of psychotic features, or bipolar disorder as assessed by the SCID-IV. Medical exclusions included untreated hypothyroidism, current use of androgens, and a history of a hormone-responsive cancer. Sixty-five participants were recruited and studied at the Massachusetts General Hospital site, and 36 at the Butler Hospital site.
Study Design
The study was a two-site 8-week randomized placebo-controlled parallel-groups trial. Participants were assigned, in a randomized 1:1 ratio, to receive low-dose adjunctive testosterone cream (21) or identical-appearing placebo made by the same manufacturer. The starting dosage of 10 mg/day was chosen to target the upper-normal range of testosterone for young women (21). Dosage titration was performed by an unblinded study monitor and was based on serum levels of free testosterone. Placebo sham dosage adjustments were made to maintain investigator blinding. Testosterone cream was applied as an adjunct to ongoing stable pharmacotherapy.
Psychiatric assessments, hormone level measurements, and dosage adjustments were performed as outlined in Table 1. To ensure reliability and quality control, ratings at Massachusetts General Hospital were supervised by psychiatrists and psychologists at Massachusetts General Hospital and Butler Hospital, using the same standards and principles to ensure consistency across the two sites. Psychiatric evaluators at both sites were extensively trained in the use of the SCID and the MADRS by gold-standard videos; prior assessments of interrater reliability in these measures yielded kappa coefficients greater than 0.75 and intraclass correlation coefficients greater than 0.8.
Hormone Assessment
Samples were collected before 10 a.m., stored at −80°C, and batched for analysis. Serum testosterone, free testosterone, and cortisol concentrations were assayed by Mayo Medical Laboratories (Rochester, Minn.). Serum testosterone was measured using liquid chromatography with tandem mass spectrometry, and free testosterone by equilibrium dialysis.
Statistical Analysis
The primary efficacy endpoint was change in MADRS score. Chi-square tests or t tests were used, as appropriate, to compare the distributions of baseline variables and evaluate baseline group equivalence. Outcome variables were measured at five time points: at baseline (before treatment) and at weeks 2, 4, 6, and 8. The efficacy analysis used a repeated-measures analysis of variance, with both treatments set to placebo at baseline, and to their respective groups at follow-up. The treatment effect using this model measures the average difference between the treatments at weeks 2 through 8, corrected for baseline values. The variance-covariance matrix was left unspecified. With 50 participants in each treatment group, we predicted greater than 80% power to detect a 5-point difference in the change in MADRS scores (baseline to 8 weeks) between the testosterone and placebo groups (22, 23).
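To make the power statement concrete, the sketch below applies the standard normal-approximation formula for a two-sided, two-sample comparison of means. The standard deviation of the change in MADRS score is not given in this section, so the value used here (8 points) is purely an illustrative assumption, as are the function name and the 0.05 alpha level; the actual calculation behind the stated power may have used different inputs or methods.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(delta: float, sd: float, n_per_group: int,
                     alpha: float = 0.05) -> float:
    """Approximate power of a two-sided two-sample z test for a mean difference."""
    std_normal = NormalDist()
    se = sd * sqrt(2.0 / n_per_group)            # standard error of the difference
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # two-sided critical value
    shift = delta / se                           # standardized true difference
    # Probability of rejecting in either tail under the alternative.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# ASSUMPTION: sd=8 is a hypothetical value, not taken from the study.
power = two_sample_power(delta=5, sd=8, n_per_group=50)
print(f"approximate power: {power:.2f}")
```

Under this assumed standard deviation, the approximation exceeds 0.80 for 50 participants per group; a larger assumed standard deviation would lower the estimate.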
The same procedure was used for analysis of change in hormones, except that for all hormone levels other than testosterone and free testosterone, only two measurements were available, at baseline and at week 8. We tested for an interaction between baseline free testosterone level and treatment by introducing an interaction term and a main effect of testosterone into this model. Interaction testing between menopausal status and treatment was performed using the same methods. Additionally, within-group analyses were performed using paired t tests. Response and remission rates were compared across the two groups using Fisher’s exact test. Data are presented as means and standard deviations.
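One way to read the model coding described above is sketched below in Python: the treatment indicator is zero for every participant at baseline and takes the assigned group only at follow-up visits, and the interaction test adds a main effect of baseline free testosterone plus its product with treatment. The function names, group labels, and the example hormone value are hypothetical, and this is only an illustration of the coding idea, not the software or parameterization actually used.

```python
def treatment_indicator(group: str, week: int) -> int:
    """Return 1 only for testosterone-group observations after baseline."""
    if group not in ("testosterone", "placebo"):
        raise ValueError(f"unknown group: {group}")
    return int(group == "testosterone" and week > 0)


def design_row(group: str, week: int, baseline_free_t: float) -> dict:
    """Covariates for one observation, including the interaction term."""
    treat = treatment_indicator(group, week)
    return {
        "week": week,
        "treatment": treat,
        "baseline_free_t": baseline_free_t,             # main effect
        "treatment_x_free_t": treat * baseline_free_t,  # interaction term
    }


# Hypothetical participant; the baseline free testosterone value is made up.
for week in (0, 2, 4, 6, 8):
    print(design_row("testosterone", week, baseline_free_t=1.2))
```

Coding both groups as placebo at week 0 forces a common baseline mean, so the treatment coefficient reflects the average between-group difference over weeks 2 through 8.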
The Systematic Assessment for Treatment Emergent Events–Systematic Inquiry (SAFTEE-SI) questionnaire categorizes adverse event severity as 0 (none), 1 (mild), 2 (moderate), or 3 (severe). Treatment-emergent side effects were defined as an increase of 2 or more levels of severity from the pretreatment baseline assessment. The proportions of participants in the two treatment groups who reported threshold side effects at any time during the treatment period were compared using Fisher’s exact test.
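The side-effect rule and the group comparison can be sketched in a few lines of Python. The first function applies the stated threshold (a rise of at least 2 severity levels on the 0 to 3 scale); the second is a plain standard-library implementation of a two-sided Fisher's exact test for a 2x2 table, using the common convention of summing all tables no more probable than the observed one. Function names and the example counts are hypothetical and are not trial results.

```python
from math import comb


def is_treatment_emergent(baseline_severity: int, on_treatment_severity: int) -> bool:
    """SAFTEE-SI severities are 0-3; the threshold is a rise of >= 2 levels."""
    return on_treatment_severity - baseline_severity >= 2


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def table_prob(x: int) -> float:
        # Hypergeometric probability of x events in row 1 with margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = table_prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables at least as unlikely as the observed one.
    p = sum(table_prob(x) for x in range(lo, hi + 1)
            if table_prob(x) <= observed * (1 + 1e-9))
    return min(1.0, p)


# Hypothetical counts for illustration only (not trial results):
# rows = treatment groups, columns = with / without a threshold side effect.
print(fisher_exact_two_sided(3, 1, 1, 3))
```

With small expected counts, as is typical for individual side effects, an exact test of this kind avoids the large-sample approximation behind the chi-square test.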
fMRI Substudy
Thirty-one participants at the Massachusetts General Hospital site were evaluated for the functional MRI (fMRI) substudy; 20 enrolled, and 11 either declined (N=8) or were not eligible (N=3). fMRI scans were performed at baseline and at week 8. In premenopausal women, all testing was performed during the follicular phase of the menstrual cycle.
fMRI data were acquired using a 3-T Siemens Skyra whole-body scanner equipped for echo planar imaging (Siemens Medical Systems, Iselin, N.J.), with a three-axis gradient head coil. Images were projected using a rear projection system and the E-Prime 2.0 stimulus presentation software program. After automated scout and shimming procedures and two high-resolution three-dimensional magnetization-prepared rapid gradient echo sequences, fMRI images (i.e., blood-oxygen-level-dependent signal) were acquired using a T2*-weighted sequence (39 horizontal slices aligned perpendicular, 3.1-mm thickness, TE=28 ms, TR=2.0 seconds, flip angle=90°).
Participants completed a rapid event-related emotional conflict paradigm (24, 25) in which faces with fearful and happy expressions were presented with the words “happy” or “fear” written across them; the participants’ task was to identify the emotional expression of the faces while ignoring the words, which were either congruent or incongruent with the facial expression.
fMRI data were processed using SPM8 (Wellcome Department of Cognitive Neurology, London; www.fil.ion.ucl.ac.uk/spm). fMRI images were motion corrected and normalized to Montreal Neurological Institute space. Condition effects were modeled with regressors representing the occurrence of each trial type (incongruent or congruent). For each participant (first-level analysis), condition effects were estimated at each voxel, and statistical parametric maps (SPMs; i.e., contrast images) were produced for each condition (incongruent or congruent). To estimate conditions at the group level (second-level analysis), individual participants’ SPM contrast images were entered into a second-level random-effects analysis, using a flexible factorial model with subject as the first factor and condition (incongruent or congruent) as the second factor. The a priori specified region of interest was the ACC, and the posterior cingulate cortex (PCC) was a secondary prespecified region of interest. For each, we adopted a statistical significance threshold of p<0.05 uncorrected. The ACC and PCC were defined with masks provided by the Anatomical Automatic Labeling tool (26) implemented in the WFU PickAtlas (http://www.ansir.wfubmc.edu) (27, 28).
Discussion
Augmentation with transdermal testosterone, administered at low dosages designed to raise free testosterone levels to the high-normal female reference range, did not improve depression symptom severity, fatigue, or sexual dysfunction to a greater degree than placebo. Additionally, fMRI data from a subset of trial participants showed no difference in activation of the dorsal and pregenual ACC after testosterone treatment compared with placebo. We found that low-dose adjunctive testosterone had an excellent safety profile in this 8-week trial. The placebo response rate in this trial was high (49%), which may have accounted in part for the lack of observed treatment effect.
Major depression disproportionately affects women, and inadequate treatment response in depression is highly prevalent. For example, the multicenter Sequenced Treatment Alternatives to Relieve Depression (STAR*D) trial demonstrated that approximately two-thirds of patients with major depression do not achieve remission after 8–12 weeks of adequate antidepressant therapy (29). Although there are effective augmentation therapies available, such as lithium and atypical antipsychotics, many patients do not respond to or are intolerant of these interventions. Even when adequate responses are obtained, these drugs may cause significant side effects, which limit their long-term use (30–36). Previous trials and our pilot data supported the evaluation of transdermal testosterone in a properly scaled prospective randomized trial. Testosterone was an excellent candidate for study in women with antidepressant-resistant depression because of its potential antidepressant effects, its ability to improve fatigue and sexual function, and its favorable side effect profile (13, 15, 37–42). Nevertheless, in this rigorously designed placebo-controlled study, we did not find a significant improvement in mood with testosterone compared with placebo, although, as noted above, the high placebo response rate may have accounted for the lack of significant differences. As these were antidepressant-resistant patients, who by definition are a more challenging population to treat, the results of this study may not be generalizable to other depressed patients.
It should also be noted that the response and remission rates we observed for testosterone administration in this study are in line with those that the literature typically suggests for antidepressants (43). However, the placebo response rate was also high, and it was similar to that for testosterone. Placebo response rates in depression studies are often high (44), and it has been shown that studies with greater than a 40% placebo response rate are unlikely to demonstrate a statistically significant effect of the antidepressant (45). Additionally, studies demonstrate a higher placebo response rate when patients with a lower severity of depression are included (46, 47), as was the case in this study. High placebo response rates may challenge underlying assumptions when designing trials using current methods, and several initiatives have focused on novel strategies to reduce placebo response in depression studies (48–51). Studies implementing these strategies may produce better quality data, with greater separation rates between active treatment and placebo in cases where the drug is truly effective. Therefore, further studies employing these strategies may be warranted.
To our knowledge, this is the first randomized placebo-controlled study of adjunctive low-dose testosterone in women with antidepressant-resistant major depression. Fooladi et al. (52) assessed effects on libido and sexual function in a randomized trial of low-dose testosterone in women (N=44) with treatment-emergent loss of libido who were on stable SSRI or SNRI therapy. The authors observed no improvements in their primary endpoint, reported level of libido, but they did find an increase in frequency of sexual activity. Consistent with our findings in women with antidepressant-resistant major depression, they reported no group difference in change in depression severity as a secondary endpoint. However, women with severe depression (Beck Depression Inventory–II scores >28) were excluded from that study, resulting in recruitment of a sample with mild depression (a mean baseline Beck Depression Inventory–II score of 8.0). Although these results may not be generalizable to a more severely ill or antidepressant-resistant population, they are consistent with the results of our study.
The anterior cingulate cortex (ACC) has been implicated in the pathophysiology of major depressive disorder (53, 54). In a previous fluorodeoxyglucose positron emission tomography study examining the effects of low-dose testosterone in women with anorexia nervosa and relative androgen deficiency (55), we found lower cerebral metabolism in women with anorexia nervosa than in control subjects, with increases in subgenual ACC activation after testosterone administration. Consistent with these prior findings, here we report an inverse association between baseline free testosterone levels and both ACC and PCC activity (as measured by fMRI), suggesting that further study of the possible role of gonadal steroids, including androgens, in the etiopathology and/or as treatment targets in antidepressant-resistant depression is warranted. However, we did not find a difference in activation of the dorsal and pregenual ACC or PCC after testosterone administration compared with placebo in this study.
Limitations of this study were those inherent in all blinded clinical trials of a disorder characterized by symptom heterogeneity and response vulnerable to placebo effect, in this case major depression. It is possible that a more homogeneous population of postmenopausal women with lower levels of testosterone at baseline would allow detection of differences in adjunctive testosterone compared with placebo. Moreover, we cannot rule out type II error in the context of a higher than expected placebo response.
Acknowledgments
Supported by NIMH grant R34 MH099315 (Drs. Miller, Fava, and Carpenter). Additionally, individual investigators were supported by NIH grants K24HL092902 (Dr. Miller), K23 DK113220 (Dr. Dichtel), K23 MH100623 (Dr. Brady), K23HD087464 (Dr. Fisher), K23AT008043 (Dr. Nyer), K23 DK097356-02 (Dr. Chang), and T32 DK007028 (Dr. Kimball); the Dupont-Warren Fellowship and Livingston Award through the Harvard Department of Psychiatry (Dr. Cassano); and the National Alliance for Research on Schizophrenia and Depression (NARSAD) Young Investigator Award from the Brain and Behavior Research Foundation (Dr. Cassano). Lawley Pharmaceuticals provided study medication and identical placebo at no cost. The Foundation for Women’s Wellness provided funding for the fMRI substudy (Drs. Dichtel and Miller).