The efficacy and safety of antidepressants in children and adolescents have been a subject of great controversy in the scientific and clinical community, and they receive a great deal of attention in the media. When one grapples with the literature on antidepressants for children and adolescents with depression, there are a number of scientific and clinical complexities that are important to understand when reconciling the common use and perceived efficacy of antidepressants with meta-analyses that suggest that antidepressants are ineffective. There has been a spike in the controversy recently, following publication of a meta-analysis of antidepressants for children and adolescents with depression (1) and of interpretations of this and other meta-analyses suggesting that antidepressants are not effective (1, 2), are minimally effective, or are no more effective than placebo (3).
The premise of this review is that meta-analyses that include the large number of industry-sponsored antidepressant trials (N>16) distort the picture of antidepressant efficacy for child and adolescent depression. The industry-sponsored trials have consistently had high placebo response rates (>50%) and/or small differences between active drug and placebo (∼10%). The industry-sponsored trials are most often considered negative trials (i.e., trials that did not demonstrate efficacy), yet the methodological and implementation challenges inherent to these studies (described below) suggest that they should be considered failed trials, and thus largely uninformative regarding efficacy and ineligible for inclusion in meta-analyses. In contrast to the industry trials, the two studies funded by the National Institute of Mental Health (NIMH), characterized by many methodological strengths, had lower placebo response rates (33%−35%) and meaningful between-group differences (25%) that support antidepressant efficacy. Thus, the inclusion of failed trials in meta-analyses is a mistake, as their methodological deficits and sheer number overwhelm data from the few methodologically rigorous NIMH-funded studies. In this context, the relative valuation of industry- and NIMH-funded studies of antidepressants for teen depression is important for a clear and compelling public health message.
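The dilution argument above can be illustrated with a rough sketch. It assumes exactly 16 industry trials (the text says only that N>16) with a between-group difference of about 10 percentage points each, and two NIMH-funded trials at 25 points each, and it pools them with equal weights. Real meta-analyses weight trials by precision or sample size, so this simplification shows only the direction of the effect, not the size any published meta-analysis reported.

```python
def mean_difference(differences):
    """Unweighted mean of drug-minus-placebo response differences (percentage points)."""
    return sum(differences) / len(differences)

# Assumption: exactly 16 industry trials, each with a ~10-point difference.
industry = [10.0] * 16
# Assumption: the two NIMH-funded trials, each with a 25-point difference.
nimh = [25.0] * 2

pooled_all = mean_difference(industry + nimh)
pooled_nimh = mean_difference(nimh)

print(f"All 18 trials pooled: {pooled_all:.1f} points")
print(f"NIMH-funded trials only: {pooled_nimh:.1f} points")
```

Under these assumptions the pooled difference sits close to the industry figure, because 16 of the 18 equally weighted data points come from industry trials.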
In this review, I discuss the strengths and weaknesses of meta-analyses, describe the scientific and regulatory context for the large number of antidepressant trials in the late 1990s and early 2000s, and explain the distinction between a negative trial, where the treatment does not demonstrate efficacy, and a failed trial, where methodological problems are substantial and make it impossible to draw any conclusion about efficacy. Lastly, I discuss how the emphasis on the negative interpretation of the results of antidepressant efficacy studies for pediatric depression detracts from the positive evidence of antidepressant efficacy for pediatric obsessive-compulsive disorder and the pediatric anxiety disorders, which have a childhood onset and a higher combined prevalence than major depressive disorder in adolescence.
Meta-Analyses
Meta-analysis is a powerful tool in establishing the evidence base for the treatment of any condition. In order to capture as many high-quality studies as possible, meta-analyses rely largely on published clinical trials. Criteria are stipulated in advance to ensure that only high-quality studies are included (usually studies with large sample sizes and masked comparison treatments) to improve the chances that the results of the meta-analysis are meaningful. Because not all studies are published, specific analytic techniques are often used to account for unpublished studies. That said, it is important to note that meta-analyses can only be as informative as the trials included. Even when the studies are large and controlled, the strength of a meta-analysis’s conclusions can be limited by methodological deficits of the included trials.
Close examination of the child and adolescent depression literature reveals important methodological criteria that are not routinely used to exclude or limit the impact of studies in modern meta-analyses. For example, many meta-analyses focus on quality design criteria (i.e., large sample sizes and randomized placebo-controlled trials) but do not scrutinize trial implementation criteria. Implementation criteria include factors such as who the patients were, how they were recruited and maintained in the study, and how ill they were; how expert the investigators were in clinical trial methods and how thoroughly trained the investigators were in the disorder of interest; how many sites participated and whether those sites were academic or clinical sites or contract research firms; and how many subjects per site were enrolled. If implementation criteria are not scrutinized, the meta-analysis runs the risk of including well-designed but poorly implemented studies that do not offer meaningful information about a treatment’s efficacy.
Why We Needed Depression Clinical Trials in Children and Adolescents
It is important to understand the historical and regulatory context of antidepressant trials for depression in children and adolescents (for a review, see reference 4). Certain patient groups (children, women of childbearing age, and the elderly) were historically excluded from participation in clinical trials because of the perceived elevated risk of conducting research with pharmacological agents in these vulnerable populations. The premarketing studies of the newer antidepressants (mostly selective serotonin reuptake inhibitors [SSRIs]) focused largely on adults. After the newer antidepressants came to market in the late 1980s and early 1990s, children and adolescents increasingly received antidepressant treatment, as the newer agents had a substantially better side effect profile than did the existing tricyclic antidepressants. As antidepressant use in children and adolescents grew, concern shifted away from the risk of including children in research studies and instead focused on the lack of efficacy and safety data in children. Increasing use of psychotropics made the study of these medications in the pediatric population increasingly necessary and urgent.
The Food and Drug Administration Modernization Act (FDAMA)
The U.S. Food and Drug Administration (FDA) had a long-standing concern about the lack of efficacy and safety data on medications in the pediatric population, and it undertook a number of initiatives to motivate the pharmaceutical industry to study medications in children and adolescents (for a review, see reference 4). With the passage of the Food and Drug Administration Modernization Act (FDAMA) in 1997 (5), a very effective strategy for motivating industry support for studies in children and adolescents was launched. The FDAMA did not mandate that industry conduct studies in children and adolescents. Among already marketed medications, those targeted for study were medications with indications in adults that could be used in children and adolescents. Importantly, the FDAMA incentivized industry to conduct studies in children and adolescents by offering an additional 6 months of “pediatric exclusivity,” which is essentially an extension of existing patent life. While such an incentive may not appear to be a very big carrot, it was large enough. For example, Prozac (fluoxetine) was a $2.5-billion-a-year medication at the end of its patent life. FDAMA’s 6 additional months of exclusivity yielded approximately $500 million in additional profit (6), in exchange for a relatively modest investment in clinical studies in children and adolescents.
The FDAMA included a number of other stipulations. The trials had to use the same indication as was used in adults, so antidepressants with a single indication (i.e., depression) had to study that indication in children and adolescents. That is one reason why prepubertal children were included in some industry-sponsored depression trials, even though the prevalence of major depression in prepubertal children is very low. Also, the study plan had to be approved by the FDA, so large-sample randomized placebo-controlled trials or pharmacokinetic or safety studies as done in adults were the methods of choice. Lastly, the trials had to be completed before the end of a medication’s patent life. With the entry of fluoxetine and other SSRIs into the market in the late 1980s and early 1990s, and with the FDAMA coming online in 1997 with a sunset date of January 2002, industry studies of the newer antidepressants had to be completed within a short time frame in order for the pharmaceutical companies to reap the financial benefits of the FDAMA.
With the pharmaceutical industry appropriately motivated, many pediatric studies of all medication classes were completed in just a few years. By September 2000, pharmaceutical companies had submitted more than 191 proposed study requests, with 58 studies completed, a sharp increase from the pre-FDAMA era (7). The antidepressants in these studies included fluoxetine, sertraline, mirtazapine, paroxetine, citalopram, nefazodone, and extended-release venlafaxine. Fluvoxamine met the standard with a premarketing trial for childhood obsessive-compulsive disorder (OCD), but no depression trials, and there were no depression trials for bupropion.
By the time the Treatment for Adolescents With Depression Study, the definitive trial for teen depression, was published in 2004 (8), 14 industry-sponsored trials of the newer antidepressants for pediatric depression had been completed.
Industry-Sponsored Depression Studies
At the time the industry-sponsored depression trials were being conducted under the FDAMA, there were precious few psychopharmacological experts in child and adolescent psychiatry, and even fewer with expertise in implementation of clinical trials in children and adolescents. Those who did have expertise were usually located in academic departments, overcommitted, and subject to notoriously slow academic institutional review boards. A major implementation challenge for industry under the FDAMA was to identify a large number of investigators who could rapidly get human-subject approval and recruit a large number of participants within a very tight time frame. Putting a study team together was achieved through personal contacts between industry representatives and potential investigators but also through broadcast faxes and e-mails to clinicians inquiring whether the recipient would be interested in being an investigator in a clinical trial. The group included academicians, clinicians in private practice, and clinicians whose practices were focused on conducting clinical trials for industry. Participating investigators were drawn not only from child and adolescent psychiatry but also from adult psychiatry as well as pediatric and adult primary care. Human-subject review occurred at academic but also independent institutional review boards. Trial start-up meetings were often a single-day event, sited at a hotel or resort, and offered a chance to meet interesting and energetic colleagues. Training would usually consist of a review of the protocol, methods to complete data and regulatory forms, and typically a video of an academic expert conducting the outcome assessment. The study investigators would often complete in parallel a rating of the participant on video. 
Little additional training was provided for those who had limited experience with the disorder under study or in clinical trial implementation, such as recruitment of appropriate subjects, retention of both responders and nonresponders, and adverse event reporting, such as distinguishing new-onset from pre-existing adverse events.
Once the study site was up and running, quality-monitoring checks largely focused on the timeliness and completeness of the regulatory and clinical data forms and did not necessarily focus on the appropriateness of the subjects recruited and the quality of the diagnostic assessment or outcome ratings. Investigators would receive some start-up funding and then were paid either per visit or activity or upon completion of a participant’s tasks or assessments. At the end of the acute, placebo-controlled phase, most trials included a longer-term open trial with active medication. The longer-term open trial would serve as a recruitment incentive, as participants who were randomized to placebo and did not respond would receive the active medication. In addition, participants who responded to active medication would be allowed to continue on the medication. The results of the industry-sponsored studies were relatively consistent, with response rates of active drug in the 55%–65% range and placebo response rates in the 50%–60% range. The positive outcomes demonstrated an approximate 8% difference between groups, translating into a number needed to treat of 12 or higher (Table 1). Pediatric exclusivity was granted by the FDA for all medications listed in the table except selegiline and mirtazapine (25).
NIMH-Funded Trials
At the time that industry was ramping up under the FDAMA, there was a parallel initiative at NIMH to develop the psychopharmacology and psychosocial evidence base for the major childhood psychiatric disorders. The Research Units of Pediatric Psychopharmacology (RUPP) initiative occurred during this time. The Multimodal Treatment of Attention Deficit Hyperactivity Disorder study (MTA) was in process. The Treatment for Adolescents With Depression Study (TADS), the Child/Adolescent Anxiety Multimodal Study (CAMS), the Treatment of Resistant Depression in Adolescents (TORDIA) study, the Treatment of Early-Age Mania (TEAM) study, the Treatment of Adolescent Suicide Attempters (TASA) study, the Pediatric OCD Treatment Study I and II (POTS), and the pharmacological studies of the RUPP Autism group were all addressing a huge psychotropic efficacy and safety gap in the literature. The Adolescent Depression Antidepressants and Psychotherapy Trial (ADAPT) (the definitive U.K. teen depression trial) was also from this era. These studies differed from the industry studies in a number of important ways. First and foremost, they were government-funded studies conducted by investigators with bona fide expertise both in the particular condition and in clinical trial implementation. In the United States, the studies underwent competitive peer review as well as NIMH Council review, and they often included active involvement of the NIMH Interventions Research Branch leaders as well as independent scientific review boards.
The goal of the studies was not merely to establish whether there was a signal of efficacy but also to establish a public health effect size and to inform practicing clinicians of best practices. These studies employed state-of-the-art research methods (i.e., comparative treatment trials with ecologically valid comparison and control groups) with an eye to reducing placebo response rates. Interventions were manualized, including the psychopharmacology study arms. Study clinicians had to review study materials and demonstrate mastery of the protocol and the intervention under study. Independent evaluators were commonly used, even when a study used a masked comparison of active drug and placebo. Data gathering included not only safety and efficacy measures but also measures that could be used in mediator and moderator analyses. Statistical analyses were state-of-the-art (e.g., imputation approaches to missing data rather than last observation carried forward). It was not uncommon during study implementation to hold weekly conference calls to, for example, review recruitment and participant “caseness” to ensure that the right participants were included in the trial. Groups of principal investigators, study coordinators, evaluators, and clinicians each held frequent conference calls to maintain fidelity and quality. The list of quality indicators could go on.
The studies resulted in numerous published papers, and ultimately the data went into the public domain. The placebo response was uniformly low enough in the key antidepressant studies (TADS, CAMS, RUPP Anxiety, RUPP Autism Repetitive Behaviors) to identify differences between active medication and placebo of 25% (TADS), 30% (CAMS), and 50% (RUPP Anxiety). In the RUPP Autism Repetitive Behaviors study, the placebo response rate was low, but so was the medication response rate, suggesting limited benefit for SSRIs for this treatment target in this population. The low placebo response rate in these studies is the key to their success. It allows these studies to inform us about effect sizes, number needed to treat, and, from an adverse event point of view, number needed to harm.
Failed Versus Negative Treatment Trials
There are a number of reasons why clinical trials do not come out as expected. The intervention may not be effective or may be unsafe; there may be flaws in the fundamental nuts and bolts of the trial operations, such as misrandomization and mislabeling of compounds (e.g., drug as placebo and vice versa); or there may be problems with the data collection or entry or with the coding of the data analytic program. The concern with the industry-sponsored depression trials of the newer antidepressants is largely about implementation; the design and the quality monitoring of data collection are not in question. The confluence of pressure to recruit a large number of participants in a tight time frame, large numbers of sites with small Ns per site, site investigators with unknown pediatric depression or clinical trial experience (it is unusual to see a listing of all the sites and site investigators in published pediatric depression trials), and the inclusion of prepubertal children may have resulted in the inclusion of participants with all forms of unhappiness. Including participants from the large number of children and adolescents with substantial unhappiness attributable to life circumstances (school, peer, family, and community stressors) rather than major depression, together with financial incentives to retain participants in the trial and implicit pressures for participants to get better (i.e., observer bias or enhanced expectancy effects), could account for the high rates of response to the supportive care in the placebo treatment arms. In the context of such methodological challenges, diagnostic heterogeneity, and time pressures, high placebo response rates and even low active medication response rates could be considered a reasonable outcome.
Consistent with the premise of this review, the industry depression trials with high placebo response rates (i.e., 50%−60%) and small between-group differences (∼10%) are failed trials because of substantial methodological problems, rather than negative trials that failed to demonstrate efficacy. (See the text box “Extract From the FDA's Nefazodone Pediatric Exclusivity Supplement, 2002.”)
Why Is the Placebo Response Rate So Critical in a Clinical Trial?
In a clinical trial, the placebo response rate provides important information about the disorder under study and the study’s design and implementation. If the condition is a severely impairing one worthy of pharmacological intervention, such as adolescent depression, one might expect a placebo response rate of maybe 20%. In that context, an active medication response rate of 25%, 45%, or 60% with numbers needed to treat (NNTs) of 20, 4, or 3, respectively, would tell us a good bit about the treatment. If the placebo response rate climbs to the 50%–60% range in a population with a severe condition, one must begin to question whether the enrolled population was the right population, whether the participants really had the condition under study, whether there was a fundamental problem in the assessment and treatment protocols, or whether there was a lack of skill in trial implementation. A consistently high placebo response rate in industry trials sharing similar methodologies is especially important when contrasted with NIMH-funded trials, using rigorous methods, in which the placebo response rate was low enough to determine the value of a medication intervention. The point might be a bit easier to understand if one were to think of another severely impairing condition, such as epilepsy. One would not be concerned to see a study of a new epilepsy treatment with a placebo response rate under 15% (26). If the active medication had a response rate of 20%, 35%, or 50%, it would be informative, yielding NNTs of 20, 5, and 3, respectively. If, however, the placebo response rate in a trial of a new epilepsy drug was 50%−60% with a drug response rate of 60%−70%, one might have serious questions about the patients enrolled, their actual condition, or the design or implementation of the study.
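The NNT figures in this section follow from the standard relationship NNT = 1 / (drug response rate − placebo response rate), rounded up to a whole patient. A minimal sketch of that arithmetic follows; the function name is illustrative, rates are entered as whole percentages, and the epilepsy example assumes a placebo rate of exactly 15% (the text says "under 15%").

```python
import math

def number_needed_to_treat(drug_pct: float, placebo_pct: float) -> int:
    """NNT = 100 / (drug % - placebo %), rounded up to a whole patient."""
    difference = drug_pct - placebo_pct
    if difference <= 0:
        raise ValueError("drug response must exceed placebo response")
    return math.ceil(100 / difference)

# Adolescent depression example from the text: placebo response of 20%.
depression = [number_needed_to_treat(d, 20) for d in (25, 45, 60)]

# Epilepsy example from the text, assuming a placebo response of exactly 15%.
epilepsy = [number_needed_to_treat(d, 15) for d in (20, 35, 50)]

print(depression)  # [20, 4, 3]
print(epilepsy)    # [20, 5, 3]
```

With a 10-point difference, such as 60% on drug against 50% on placebo, the same function returns an NNT of 10. That shows how a high placebo response rate compresses the between-group difference and inflates the NNT.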
What We Do Know About Antidepressants and Pediatric Depression
So where does this leave us? The few NIMH-funded studies implemented by experts using state-of-the-art methods paint a very different picture of antidepressant efficacy for depression. The TADS study, the largest and arguably the highest-quality acute-phase randomized placebo-controlled trial of an antidepressant for teen depression, had a comparatively low placebo response rate (35%), with a positive effect size and a number needed to treat of ∼4 (8). Longer-term outcomes of the TADS study suggest that over a 6- to 9-month treatment period, 80% of participants experienced symptom improvement (27). The TORDIA study demonstrated that 50%−60% of teens who had not responded to one antidepressant responded to a second antidepressant (28). Longer-term outcomes from the TORDIA study showed that upwards of 60% of participants remitted (29). The ADAPT trial recruited a sample of severely impaired teens, pretreated them with an acute psychotherapeutic intervention, and then randomly assigned the nonresponders to either fluoxetine alone or fluoxetine combined with cognitive-behavioral therapy. Slightly less than 50% of participants responded acutely, but more than 80% demonstrated response after longer-term intervention (30). The TASA study, albeit an open trial, recruited the most severely affected teen cohort of all the NIMH-funded studies: patients with depression and a recent suicide attempt. Participants could choose from among three interventions: an antidepressant, cognitive-behavioral therapy, or their combination. The response rate at the end of the acute phase was over 70%, and the reattempt rate was less than that of community samples (31). This is all very good news, and it stands in stark contrast to meta-analyses that are dominated by the large number of industry-sponsored trials.
With respect to the industry-sponsored trials, some studies show a signal of efficacy (e.g., a between-group difference of ∼10%), but many do not. It is important to consider whether those medications that were studied using only industry methods would have demonstrated efficacy under different implementation efforts. For example, in Table 1, observe the difference in placebo response rate for the three fluoxetine trials (two of them NIMH funded [8, 9] and one industry funded [10]). The lack of precision in implementation of industry studies means that we have many agents available for use in the pediatric population, but without quality data to guide their use, the same position we were in before passage of the FDAMA. Also, prescribers who wish to utilize other antidepressant medications will have to prescribe them off-label even if those medications may be better suited to a particular patient, as they may have a shorter half-life, a better active metabolite profile, more predictable pharmacokinetics, fewer drug interactions, or a different side effect profile than the FDA-approved SSRIs (i.e., fluoxetine and escitalopram).
The Childhood Anxiety Disorders Also Benefit From Antidepressants
While focusing meta-analyses on antidepressants for adolescent depression is reasonable, patients, prescribers, and the general public may misinterpret depression meta-analytic results to mean that antidepressants do not work for any pediatric condition, which would be a distortion of the evidence base. Obsessive-compulsive disorder and separation, generalized, and social anxiety disorders are all responsive to antidepressants, and the NIMH-funded studies focusing on these disorders have low placebo response rates and NNTs of 3–5. Importantly, the anxiety clinical trials often included large numbers of prepubertal children (i.e., 6–12 years old), so there are good data on the efficacy and safety of antidepressants in the most vulnerable of young patients. Sadly, even though the evidence base supports the efficacy of antidepressant medications for the anxiety disorders (32), FDA labeling is lacking. A strong evidence base that is not reflected by FDA labeling forces prescribers who practice evidence-based medicine to prescribe off-label.
Conclusions
In comparing the industry-sponsored studies and the NIMH-funded studies, five conclusions can be drawn that should bear on the conduct of future meta-analyses and the interpretation of existing ones:
1. The NIMH-funded studies demonstrate good efficacy for antidepressant medications in pediatric depression and should be heavily weighted in any review of the literature.
2. Studies that use the gold-standard design (large-sample randomized placebo-controlled trials) but have substantial implementation limitations should be considered failed, not negative, studies and thus should not be included in meta-analyses of efficacy. They may provide valuable information about safety and tolerability, but not about efficacy.
3. Interpretations of existing meta-analyses of antidepressant efficacy that include failed trials should be considered highly suspect.
4. Large depression studies with multiple sites and low Ns per site, as well as studies that include prepubertal children, are vulnerable to high placebo response rates and therefore should not be included in meta-analyses. Ironically, the inclusion of young children in industry-sponsored depression trials was associated with increased placebo response rates (33), so good intentions to test a broad age range may have inadvertently muddied the waters regarding antidepressant efficacy.
5. When the authors of a meta-analysis acknowledge that many of the included studies have methodological limitations (1), there should be substantial reservation about making any interpretation about the treatment’s efficacy. This has been particularly problematic with respect to the newer antidepressants as commentators move to stronger conclusions than the meta-analytic data warrant: “Placebo is just as good as antidepressant medication,” or there is “no difference in efficacy between antidepressants and placebo,” or simply “antidepressants are not efficacious.” Such comments based on a large number of poor-quality studies do not do justice to the field or to the data that are available from high-quality NIMH studies.
We have come a long way in the treatment of pediatric depression and anxiety. The current antidepressants are not perfect, but they offer clear advantages for depression and anxiety over tricyclic antidepressants and monoamine oxidase inhibitors, as well as over the chronic use of benzodiazepines for anxiety or anxiety mixed with depression. It is worth remembering that the modern era of psychopharmacology in children and adolescents began with trials of prochlorperazine and meprobamate (34).
Let’s hope that drug development continues to improve the safety and efficacy of antidepressant medications. Let’s also hope that speed and efficiency do not dominate clinical trial methods for new psychotropic medications and inadvertently lead to failed studies, as likely occurred during the early years of the FDAMA. This is particularly important for industry, as bringing an agent to market is extremely expensive, and a failure to demonstrate benefit because of trial implementation challenges may keep new and potentially useful medications off the market.
Lastly, we need to acknowledge how the NIMH investment in high-quality clinical trials of psychopharmacology in childhood and adolescence actually paid off: the MTA (attention deficit hyperactivity disorder), TADS (teen depression), CAMS (childhood anxiety), TORDIA (resistant depression in teens), TASA (teens with depression and a suicidal event), RUPP Anxiety Group Studies, and RUPP Autism studies have essentially established the evidence base for the major psychiatric disorders affecting children and adolescents. These studies took a long time and cost a bit of money, but the knowledge they generated has transformed the treatment landscape for children and adolescents with the major psychiatric disorders.