Major depression is a common and often disabling disorder with high lifetime prevalence rates. While we have made considerable progress in understanding the biology of the disorder and in treatment development, there is much yet to be done. Many have lamented that the disorder appears to be too highly heterogeneous, and, understandably, genetic studies to date have yielded limited numbers of genetic variants that contribute to risk. One explanation for the relative lack of progress is that the diagnosis as it currently stands may be reliable (i.e., two clinicians may come to agreement regarding a given patient’s diagnosis) but not particularly valid (i.e., it may not reflect a disorder that connotes an underlying biology). Hence, we have seen efforts to concentrate research on core physiological constructs such as anhedonia rather than on major depression overall. Further, while we have treatments that provide reasonable efficacy, most patients do not remit with a single treatment (1, 2), instead needing either second- to fourth-phase monotherapy or combinations of treatments (3). Our hope is that outcomes will be improved with the development of new treatments based on alternative mechanisms of action and with the development of more personalized approaches in which clinical characteristics or biological markers are used to improve treatment selection, prediction of response, and overall response.
Efforts in depression research are highlighted in a series of articles published in this issue of the Journal (4–9), which serve in part as an impetus for this commentary. Taken together, these studies and other recent developments provide considerable hope that we can improve diagnosis and outcome. However, they also raise a number of questions and problems that the field still encounters in researching the biological basis and treatment of depression. Here, I address some of these issues and discuss possible solutions.
Diagnosis: Time to Think About a Change
Depression is a common disorder with a lifetime prevalence rate of about 17% in this country (10). The disorder imposes the highest morbidity in the world, with a great deal of the cost being the indirect expense of loss of productivity (11, 12). The DSM-5 diagnosis is based on patients’ fulfilling at least five of nine symptoms for at least 2 weeks (13). Historically, the diagnosis has been seen as reliable in assisting communication between providers about patients, determining health care benefits, planning for treatment, assessing outcome, and so on. However, the rubric has considerable limitations, such that many do not see it as particularly valid. This has stemmed largely from concerns about genetic studies that have not been particularly productive in determining specific risks and from failed attempts to develop novel treatments. In a report on a meta-analytic genome-wide association study in this issue that explores 16 lead candidate genes for risk for depression, none were seen as reaching statistical significance in the overall sample (4). Of potential promise was that the dopamine receptor 2 gene was found to be significant in the subsample from the Psychiatric Genomics Consortium. The studies used major depressive disorder criteria and explored a number of phenotypes based on characteristics such as number of recurrent episodes, severity, and so on.
But perhaps of even greater concern beyond failed candidate gene studies is that the field trials for DSM-5 (14, 15) pointed to unexpectedly low reliability for the major depression diagnosis. While there are methodological differences between this set of field trials and previous ones, it is disconcerting that the recent field trials did not show a high degree of reliability for the diagnosis of major depression.
One approach to dealing with this issue has been to develop alternative constructs of core symptoms that can be studied biologically and used to screen for possible efficacy of innovative compounds. This is most notable in the National Institute of Mental Health’s Research Domain Criteria (16, 17), which have been applied to studies using brain imaging and even, most recently, drug and device research. How well the criteria will perform for treatment development of putative antidepressants is not at all clear. Recently presented data (18) indicate that anhedonia as a construct could be used to explore effects of possible antidepressants. However, it is not clear whether improvement in anhedonia will be accompanied by improved sleep, less psychomotor retardation, and improvements in other symptoms, such as anxiety. Another possible approach would be to reconsider what goes into the diagnosis of major depression. Should the diagnosis be based on just five of nine symptoms? The current matrix of symptoms? Are there other core or common symptoms that have not been routinely included in the diagnostic criteria?
Key Clinical Features That Have Been “Lost in Translation” in DSM
Anxiety symptoms are not only common in major depression but also are an important predictor of clinical response to antidepressants. In both the large-scale Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study and the International Study to Predict Optimized Treatment in Depression (iSPOT-D), prominent anxiety was a predictor of poor response to treatment (2, 19). Interestingly, in iSPOT-D, comorbid anxiety syndromes were not a significant predictor of response (20). Rather, the degree of anxious symptoms in general augured poor response to traditional treatments (antidepressant treatment with selective serotonin reuptake inhibitors, serotonin-norepinephrine reuptake inhibitors, and bupropion). There are indications from other studies that addition of a benzodiazepine or the atypical antipsychotic quetiapine to monotherapy is effective in anxious depression and that monotherapy with mirtazapine is more effective than paroxetine in geriatric depression (21–23). The finding that anxious symptoms, but not comorbid anxiety disorders, predict poor response suggests that anxiety needs to be considered when assessing depression. Assessing anxiety in major depression has been recommended in DSM-5 using an anxiety dimensional measure. But if anxiety is common and it needs to be assessed for treatment decisions, should we not include it as a core symptom for diagnosis?
Similarly, our group has reported that in large-scale community samples, physical pain is seen in some 50% of individuals with major depression, in contrast to 15% in the general population (24, 25). Pain has been used to predict response to specific agents, such as duloxetine (26), and thus should we not include pain as a criterion too? Indeed, much of the opioid epidemic reflects depressed patients becoming addicted to these agents after they undergo surgery, such as knee replacement. A few months ago, the director of the National Institutes of Health tweeted (https://twitter.com/NIHDirector/status/1035238775394381824) that our group’s observation that ketamine’s antidepressant effects were blocked by the mu opioid receptor antagonist naltrexone (27; see below for elaboration) may help explain the triad of major public health problems: depression, pain, and opioid addiction. Interestingly, ketamine is effective in both major depression and chronic pain. Including pain as a criterion for major depression could help identify more efficiently those surgical patients who suffer from comorbid depression and who are at increased risk for developing opioid addiction. Such patients need better strategies for treating their comorbid pain and depression and for reducing the risk of developing addiction.
Historically, DSM has avoided building treatment response into the diagnostic criteria, and there are cogent reasons for that. However, here we see that key symptom dimensions that are commonly found in individuals with major depression (often more commonly than the current core criteria) are not included in the criteria. Using dimensional scales does not appear to be adequate. Clinicians often do not apply them, and unless the features are included in the diagnostic criteria, they may not be considered routinely. The failure to include them diminishes the usefulness of DSM. Add to this that candidate gene studies have not been particularly informative using major depression constructs (such as number of episodes, severity, etc.) and that the reliability of the major depressive disorder diagnosis was low in the DSM-5 field trials (14, 15). Thus, psychiatry and the American Psychiatric Association need to rethink the major depressive disorder syndrome and the criteria for the diagnosis. This could possibly be done by adding these commonly observed symptoms while retaining the current criteria to allow for continued analysis of already collected genetic samples in subjects who previously met criteria for major depressive disorder. Irritability, another feature that could be used in the diagnostic criteria, is discussed below.
Prediction of Response and Response Trajectory
An alternative approach to developing new agents to improve response is to better predict which patients will respond to specific treatments, perhaps on the basis of clinical characteristics, genetics (pharmacogenetics), brain imaging (e.g., functional MRI), and so on. In addition, potential outcomes could be judged earlier in trials by applying data from previous studies on response patterns. Such approaches could help determine likelihood of response or of failure earlier in the course of treatment, sparing patients needlessly prolonged exposure to a drug. Three of the articles in this issue present data on clinical feature approaches to predict response to medications or repetitive transcranial magnetic stimulation (rTMS).
The article in this issue by Zisook and colleagues (5), based on a large Veterans Affairs (VA) cohort with depression, attempted to look at several questions that arose from the STAR*D study. The VA study took patients who had a suboptimal response to one drug treatment and randomly assigned them to switch to bupropion, combination treatment with bupropion, or augmentation with the antipsychotic aripiprazole. In a sense, this study was retesting the augmentation or switch to bupropion strategies that were included in STAR*D as well as testing antipsychotic augmentation, which was not in STAR*D, largely because STAR*D was designed before augmentation with atypical antipsychotics became common. The study, an effectiveness trial, employed a single blind but no placebo arm. The three next-step approaches were largely equivalent in efficacy. General predictors of response included lower severity, lower chronicity, current employment, and several other measures. Of particular note was how little the treatments differed. However, patients over age 65 responded better to aripiprazole augmentation than to switching to bupropion, and in patients of all ages who presented with mixed hypomanic symptoms, aripiprazole augmentation and bupropion combination were more effective than bupropion alone. These data fill in important information regarding atypical antipsychotic augmentation in refractory depression as well as in patients with mixed features. The latter is consistent with others’ observations that the atypical antipsychotic lurasidone is effective in major depression with mixed features (28).
In a study on rTMS (6), data were pooled from trials of traditional left dorsolateral prefrontal cortical rTMS as well as of intermittent theta burst rTMS to explore patterns of response. Although independent raters were used, there was no sham treatment group. Four patterns were observed. One of them was a rapid response, in which efficacy can be discerned as early as week 1. Milder severity was seen as a significant predictor of such rapid response, as was older age. In contrast, high severity predicted nonresponse. The findings across the two studies that older age predicts response to aripiprazole augmentation (5) and to rTMS (6) are helpful for clinicians treating this age group. However, in the absence of a placebo or sham control in these two trials, we need to be careful about the conclusions we draw from the results. This highlights the need to have placebo controls if we are to answer important questions with confidence. These studies indicate that more severely depressed patients respond poorly to agents that have been shown previously to be effective and have been approved for clinical use. This is by no means new. Severity in depression augurs poorly for treatment response and remains a clinical challenge. However, what are we to make of the finding that mild severity predicted a positive response in these non-placebo-controlled trials? Regardless, we are still left with the issue of finding positive treatment predictors in more severely ill patients. This remains a major need and challenge for the field.
The study by Trivedi’s group (7) discusses an interesting clinical feature of depression that has not attracted much attention: irritability. This article points out that irritability is observed in some 40%–50% of patients with depression (29) but has been largely overlooked in clinical studies. The authors analyzed data from two major studies and found that reduction in irritability at week 4 was a significant predictor of final response. This finding offers another tool for evaluating early predictors of treatment response and could aid in determining early on in therapy when to switch from a given medication. The finding that irritability is common and is potentially informative about treatment is just one of several key commonly observed features that have not been incorporated into DSM-5 criteria for the disorder (see above for discussion of anxiety and pain).
Ketamine and Abandoning the Blind
Ketamine represents an exciting step forward in the drug treatment of depression. A number of controlled trials point to intravenous racemic ketamine being more effective than control (generally saline or midazolam). In a review of existing studies, Newport and colleagues (30) point to considerable efficacy of single ketamine infusions compared with control, evident by day 1 and apparently lasting for a few days. Similarly, Wilkinson et al. (31) reported that suicidal ideation was reduced rapidly by intravenous ketamine, and the effect often lasted for a full week. Two recent reports on intranasal esketamine compared with saline also point to significant acute effects in refractory depression and suicide prevention (32, 33), although a recent U.S. Food and Drug Administration advisory board also pointed to failed trials and small effect sizes in the positive trials. Moreover, in the suicide prevention study (33), intranasal esketamine separated from placebo at days 1 and 2 but not beyond. Initiation of other treatments, including hospitalization, was thought to possibly explain the later lack of differences between study groups.
One major issue with intravenous ketamine is that there has been little study of longer-term treatment (both on efficacy and potential side effects) to allow for recommendations as to whether and how to use it (34). The article from Blier’s group (8) attempts to provide those data; however, the design raises vexing questions for the field. In that study, the research group performed a double-blind crossover study that acutely compared intravenous midazolam with ketamine. Midazolam has been used previously as a control in parallel-design comparisons with ketamine (30, 35). It produces somewhat more sedation but less dissociation than ketamine. In the first phase of the study (8), a crossover design was employed. An extremely low rate of remission was seen with ketamine (12%), although it was significantly higher than with midazolam (0%). In the next phase, patients were treated with six open-label ketamine infusions over 2 weeks, and an increasing response was observed over the 2-week period. In phase 3, open-label once-weekly infusions were given for 4 weeks. The conclusion was that further ketamine treatment provides additive benefit beyond an initial infusion and that it can be maintained over several weeks. But in the absence of a blind for the longer-term treatment phase, what can one conclude regarding long-term efficacy? The authors note that it made no sense to blind that phase since patients had been exposed to ketamine and would know if they had not received it. However, the low response rate to the blinded ketamine in phase 1 of the study makes one wonder if that is the case. Indeed, with ketamine and other powerful mind-altering agents, we have seen the suspension of the usual principles of experimental design, namely, that we need to perform double-blind randomized controlled trials to assess efficacy. Similarly, in the industry trials of intranasal esketamine, there is a major question as to whether there is a true blind when one uses saline as the control against a drug that produces pronounced dissociation and nausea.
High response rates to an inactive agent or placebo in double-blind clinical trials have been a bane of psychiatric drug development over the past two decades. There are numerous examples of agents with a number of putatively effective mechanisms of action that have failed in phase 2 or 3 studies when compared with placebo. Several reasons are commonly proffered. We lament the looseness and nonvalidity of the diagnosis of major depressive disorder and the limitations of available rating instruments. We lament that we do not understand the true biology of major depression or what is the best approach to its treatment. We do this in the context of having a growing cadre of patients with refractory depression who continue to experience suboptimal response to standard and novel agents. To solve the problem, we have moved on to using powerful mind-altering agents such as ketamine and, more recently, the hallucinogen psilocybin. But these agents present major problems in assessing efficacy. Ketamine appears to have considerable potential, but should we not maintain our belief that randomized controlled trials are the gold standard for our field? We all know about treatments (both psychosocial and somatic) that have been adopted largely for off-label use in major depression that have little, if any, data to support their use. For psilocybin, we also see considerable difficulty in having truly blindable controls, since the dissociation is seen by some to be key for response, and very low doses of the drug (or niacin, which has been used as a comparator) do not produce similar effects (36, 37).
Science as a field requires not only the acknowledgment and application of laws of nature but also the agreed-upon principles and rules for the design and implementation of experiments. Psychiatry strives to be seen as a medical specialty based on the science of psychology, pharmacology, and neuroscience. Are we now moving away from those principles because we are desperate for new, more powerful treatments? Unfortunately, the longer-term ketamine trial gives us little data that are supported by controlled experimental design methods and that allow us to judge how we should use the drug long-term.
Ketamine and a Search for the New Ketamine
Since ketamine is novel in its acute and rapid antidepressant effects, it makes a great deal of sense to determine its key mechanism of action to be able to plan for follow-on compounds, particularly ones that can be administered more easily and with fewer side effects such as dissociation, nausea, and hypertension. The report from Duman’s group (9) presents elegant studies on whether vascular endothelial growth factor (VEGF) and its receptor mediate the response to ketamine in rodents and in other laboratory models. The data appear to support the notion that VEGF plays a role. The study uses the necessary control comparisons. The authors advocate exploring for agents that work via VEGF as well as brain-derived neurotrophic factor (BDNF), which has been implicated in mood disorders as well as response to ketamine (38), and raise the importance of exploring results on synaptogenesis and neurogenesis. These are not the only intracellular molecules to be implicated in ketamine response. Previously, mTOR (mammalian target of rapamycin) was also implicated to explain why ketamine, but not other NMDA antagonists, is effective in depression (39). But we need to pause and take stock of where we truly are. While these animal models provide methods for testing possible mechanisms of action underlying changes in behavioral properties, do they actually tell us much regarding clinical antidepressant effects or future drug development? Indeed, BDNF studies have yet to yield a compound in the clinic, and neurogenesis as a possible mechanism of antidepressant action in other compounds has also been studied without positive results (40). In ketamine we see effects almost immediately, well before we would expect neurogenic changes in the hippocampus, if they even occur. Moreover, blocking mTOR with rapamycin in patients treated with ketamine or placebo failed to decrease ketamine response (41). In fact, longer durations of antidepressant effects were seen in patients treated with ketamine plus rapamycin than in those treated with ketamine plus placebo (41).
Given these findings taken together, one wonders whether we are making progress in understanding ketamine’s mechanism of action. Our group has posited that ketamine acts via an opioid mechanism either by being a weak mu opioid receptor agonist or by releasing endogenous opioids. We tested this in a randomized controlled trial in which we gave patients two ketamine infusions 2–9 weeks apart and administered 50 mg of oral naltrexone or placebo 45 minutes before each infusion (27). The study was conducted under classic double-blind random-assignment conditions. In ketamine responders, naltrexone virtually eliminated the antidepressant effects. These data would explain why non-ketamine NMDA antagonists such as memantine may not have antidepressant effects. They also suggest that this mechanism of action should be a focus when screening for rapidly acting antidepressants. These data and those from the rapamycin study (41) are results from appropriately controlled studies aimed at understanding, in humans, mechanisms of action underlying antidepressant effects.
Conclusions
Recent research has provided incremental advancement in the treatment of depression. However, as scientific investigation moves forward, it opens up new questions regarding whether we are making true progress in the diagnosis and treatment of depression. Patients at the milder end of the spectrum appear to be adequately served by the current armamentarium of medication, but more severely ill patients are not. Our acute need for better treatments must be tempered by not abandoning the rigor needed for appropriate study designs that allow accurate and unbiased assessment of efficacy. Lastly, it is a conundrum for the field that commonly seen symptoms, which also may serve as predictors of treatment response, are not included in DSM criteria for major depressive disorder.