
Published Online: 1 January 2013

Cochrane Systematic Reviews for the Mental Health Field: Is the Gold Standard Tarnished?

Sharon Green-Hennessy, Ph.D.

Publication: Psychiatric Services

Volume 64, Number 1

https://doi.org/10.1176/appi.ps.001682012

Abstract

Objective

This study assessed the appropriateness of the Cochrane review system for providing guidance to the mental health field regarding evidence-based practice.

Methods

Excluding entries for substance misuse and dementia, all entries from the three mental health review groups in the 2005–March 2012 Cochrane database were included (N=552). Entries were coded for review group, type (protocol versus review), design (randomized controlled trials or not), datedness, percentage of studies excluded, eventual withdrawal, and nature of the review’s conclusions.

Results

A majority (N=322, 58%) of entries focused on psychotic, mood, and anxiety disorders. The average Cochrane review was last considered up to date in 2006. The reviews excluded over twice as many studies as they included, and inclusion of “gray literature” was infrequent. A total of 159 (44%) reviews reported that there was insufficient evidence to form a conclusion. A finding of insufficient evidence to form a conclusion was not related to the entry’s design but was related to the ratio of excluded to total studies and to the use of the Grading of Recommendations Assessment, Development, and Evaluation system of evaluating the quality of evidence.

Conclusions

The Cochrane Collaboration, an established review system respected globally for its methodological rigor, is poised to enlarge its role in guiding the evidence-based mental health field. However, addressing issues regarding datedness, diagnostic breadth, limited types of evidence, and amount of guidance provided in the conclusions would help maximize its contribution to the mental health field.

In the United States, training, credentialing, and funding increasingly are expected to be evidence based (1–6). Evidence-based practice entails “the integration of the best available research with clinical expertise in the context of patient characteristics, culture, and preferences” (7). The definition requires that there be an accurate understanding of the current research literature. Given the limitations of many frontline clinicians in the time and the expertise necessary to evaluate research (8), individuals looking to be guided by evidence have increasingly relied on systematic reviews to cull the best available evidence and distill it into useful practice recommendations (7,9,10).

In the past 15 years numerous professional and governmental bodies have undertaken systematic reviews to identify evidence-based treatments (11); however, these bodies have not always agreed on what merits that label (12–14). For example, among interventions for treating trauma in children and adolescents, eye movement desensitization and reprocessing was awarded the top rating (well supported) by the California Evidence-Based Clearinghouse for Child Welfare (15) but the lowest rating (no support) by the Hawaii Blue Menu of Evidence-Based Child and Adolescent Psychosocial Interventions (16). Such inconsistency can leave clinicians uncertain about which treatments to use and perhaps foster clinician skepticism about the merits of evidence-based systems.

The mental health treatment field might benefit from having a single, unimpeachably strong systematic review system. Of the review systems currently available, the Cochrane Collaboration, containing more than 5,000 systematic health care reviews (17), rises as a potential candidate. Cochrane reviews are well respected internationally for their transparent, standardized methodology, and they have been found to be more methodologically rigorous than other systematic reviews (18–21). However, notwithstanding the importance of methodological rigor, to be optimally useful a review system must also address the pressing clinical practice issues, be current, accurately reflect the treatment literature, and provide practice guidance (8,9,22–24). Hence, when evaluating Cochrane’s appropriateness as a mental health review system, it is equally important to assess suitability on these dimensions as well.

Cochrane reviews are initiated and conducted by volunteers who align themselves with one of 53 separate collaboration review groups (CRGs), each with its own scope and editors. Cochrane’s stated aim is to assist providers, consumers, policy makers, and others to make health care decisions (25,26); however, historically, author interest and the CRG’s agenda have determined which areas of review are prioritized (27), raising questions about whether the reviews address the pressing needs of practicing clinicians (18,24).

After registering a review title with a CRG, authors publish a peer-reviewed protocol. It is Cochrane’s policy to convert protocols into full systematic reviews within two years and to require authors to update reviews every two years or to include a commentary explaining why the review has not been updated (26). Research has questioned Cochrane’s ability to adhere to this time line (28,29). This finding is noteworthy because recent data suggest that psychopharmacology interventions need updating at a faster rate than drug interventions in other medical specialty areas (30).

The Cochrane handbook states that randomized controlled trials are the preferred means for addressing the effectiveness of health care interventions and cautions authors that nonrandomized studies cannot “give anything close to a definitive answer about the likely effects of an intervention” (26). Concurrently, Cochrane advocates searching the “gray literature” to minimize publication bias (26). Given the competing interests to be both inclusive and exclusive, it is unclear how well Cochrane reviews reflect the psychopathology intervention literature, a notable portion of which does not utilize randomized controlled designs (31,32).

Each Cochrane review culminates in a section called “Implication for Practice.” The Cochrane Collaboration has been criticized for a tendency to produce inconclusive reviews (33–35). Historically, the Cochrane handbook has stated that reviews should summarize the evidence to assist health care decision makers rather than make direct practice recommendations. More recently, however, the Cochrane Collaboration has begun implementation of the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) system. To evaluate the quality of the evidence, GRADE integrates aspects of internal validity with factors related to applicability, such as burden or risk, cost, and values of the recipient, thereby influencing the strength of the recommendations (36,37).

Although Cochrane reviews are often recognized as the gold standard for general medical health care, their appropriateness for guiding evidence-based mental health care decisions merits examination. Cochrane’s coverage of current treatment literature and the ability of Cochrane reviews to provide guidance in areas needed by practicing clinicians are of particular interest.

Methods

The Cochrane Database of Systematic Reviews (2005–March 2012) includes all protocols and reviews that first appeared, were updated or amended, or were withdrawn between 2005 and March 2012. The sample of mental health reviews and protocols used for this study represents the database entries from the three CRGs that focus on mental health issues—depression, anxiety, and neurosis; schizophrenia; and developmental, psychosocial, and learning problems. Entries from CRGs that address dementia and substance use disorders were not included.

Each entry was coded for CRG group, type (protocol or review), design (exclusively randomized controlled trials or inclusion of nonrandomized studies), and topic—for example, mood or anxiety. Whether the authors used or planned to use the GRADE system was recorded. Datedness was calculated by using the year that the entry was last considered up to date according to the history section of each database entry. The withdrawal of a review or protocol, as well as the reason for the withdrawal, was also noted.

For reviews only, the number of studies included and excluded was derived from tables at the conclusion of each review. Included studies were also classified by whether any portion of the study had been published. Finally, by using both the Implications for Practice section in the review’s conclusion and the abstract’s conclusion, each review’s conclusions were categorized as primarily positive, primarily negative, mixed, or inconclusive. The conclusions of 36 (10%) reviews that were not withdrawn were evaluated by a secondary coder, and interrater reliability was high (κ=.88, p<.001).

Results

The three CRGs contributed 552 entries—188 (34%) from the depression, anxiety, and neurosis group; 205 (37%) from the schizophrenia CRG; and 159 (29%) from the developmental, psychosocial, and learning problems group. Nearly one-third (N=177, 32%) of entries were protocols. Each entry was coded as representing one of 23 topics. As shown in Table 1, 182 (33%) systematic reviews involved serious mental illness consisting primarily of psychotic disorders, such as schizophrenia. An additional 16 (3%) entries focused on the side effects of antipsychotic medication. Unipolar and bipolar mood disorder represented 97 (18%) entries, and 43 (8%) entries related to anxiety disorders. No other topic accounted for greater than 5% of the entries.

Table 1 Topic of 552 Cochrane mental health protocols and reviews, 2005–March 2012

Topic	N	%
Serious mental illness^a	182	33
Mood disorders	97	18
Anxiety disorders	43	8
General growth and development	27	5
Psychological adjustment to medical issues	18	3
Nonspecific emotional distress	17	3
Side effects of antipsychotic medication	16	3
Autism spectrum	16	3
Personality disorders	14	3
Externalizing disorders^b	14	3
Maltreatment or domestic violence	13	2
Sleep disorders	12	2
Attention-deficit hyperactivity disorder	12	2
Intellectual or developmental delay^c	11	2
Intellectual delay and psychiatric diagnosis (dual diagnosis)	11	2
Somatic disorder^d	10	2
Eating disorder	9	2
Comorbid diagnoses	8	1
Language or motor disorder	7	1
Sexual disorder	7	1
Impulse control disorder	4	1
Substance use disorder and psychiatric diagnosis (dual diagnosis)	2	—
Suicide or self-harm	2	—

^a Includes schizophrenia and other psychotic disorders

^b Includes conduct disorder; oppositional defiant disorder; and antisocial, delinquent, or violent behaviors in youth

^c Includes intellectual disability, learning disabilities, and significant developmental delay

^d Includes conversion disorder, body dysmorphic disorder, hypochondriasis, medically unexplained symptoms, and chronic fatigue syndrome
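The concentration of entries in three diagnostic areas can be checked directly from the counts in Table 1. The following is a minimal sketch: the dictionary and function names are illustrative assumptions, and the counts are transcribed from the table.

```python
# Topic counts transcribed from Table 1 (names of variables are illustrative).
TOPIC_COUNTS = {
    "Serious mental illness": 182,
    "Mood disorders": 97,
    "Anxiety disorders": 43,
    "General growth and development": 27,
    "Psychological adjustment to medical issues": 18,
    "Nonspecific emotional distress": 17,
    "Side effects of antipsychotic medication": 16,
    "Autism spectrum": 16,
    "Personality disorders": 14,
    "Externalizing disorders": 14,
    "Maltreatment or domestic violence": 13,
    "Sleep disorders": 12,
    "Attention-deficit hyperactivity disorder": 12,
    "Intellectual or developmental delay": 11,
    "Intellectual delay and psychiatric diagnosis": 11,
    "Somatic disorder": 10,
    "Eating disorder": 9,
    "Comorbid diagnoses": 8,
    "Language or motor disorder": 7,
    "Sexual disorder": 7,
    "Impulse control disorder": 4,
    "Substance use disorder and psychiatric diagnosis": 2,
    "Suicide or self-harm": 2,
}


def share(counts, topics):
    """Return the fraction of all entries that fall under the given topics."""
    total = sum(counts.values())
    return sum(counts[topic] for topic in topics) / total


total_entries = sum(TOPIC_COUNTS.values())
top_three = share(
    TOPIC_COUNTS,
    ["Serious mental illness", "Mood disorders", "Anxiety disorders"],
)
print(total_entries, round(top_three * 100))
```

The 23 topic counts sum to the 552 entries in the sample, and the three largest topics together account for the 58% share reported in the abstract.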

A total of 27 entries, 16 (60%) protocols and 11 (40%) reviews, representing 5% of the sample, were withdrawn from 2005 until March 2012. Withdrawn entries were more likely to be in the developmental, psychosocial, and learning problems group (N=13, 8%) and in the depression, anxiety, and neurosis group (N=11, 6%) than in the schizophrenia group (N=3, 2%) (χ²=9.24, df=2, p=.01, Φ=.129).
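The withdrawal comparison can be approximately reproduced from the group sizes and withdrawal counts given in the text. This is a sketch under stated assumptions: it assumes a Pearson chi-square test of independence on a 3×2 table (CRG by withdrawn or not withdrawn) and takes Φ as the square root of χ²/N; the function and variable names are illustrative and are not taken from the study.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square for an r x c table of observed counts.

    Returns (chi2, df).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Rows: CRG; columns: (withdrawn, not withdrawn).
# Group sizes (188, 205, 159) and withdrawal counts (11, 3, 13) come from the text.
withdrawal_table = [
    [11, 188 - 11],  # depression, anxiety, and neurosis
    [3, 205 - 3],    # schizophrenia
    [13, 159 - 13],  # developmental, psychosocial, and learning problems
]

chi2, df = chi_square_independence(withdrawal_table)
n_entries = sum(sum(row) for row in withdrawal_table)

# For df = 2 the chi-square upper-tail probability has the closed form exp(-x / 2).
p_value = math.exp(-chi2 / 2)

# Effect size computed as sqrt(chi2 / N), assumed to correspond to the reported phi.
effect_size = math.sqrt(chi2 / n_entries)

print(round(chi2, 2), df, round(p_value, 3), round(effect_size, 3))
```

Under these assumptions the computed values agree with the reported statistics to the precision given in the text.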

The process of converting mental health protocols into reviews occurred on average in the two-year time frame suggested by the Cochrane Collaboration (N=345; mean±SD=2.15±1.82 years). However, not all protocols followed this path, and a subset of protocols did not progress beyond the protocol stage. Specifically, 46 (26%) mental health protocols had not yet converted into reviews despite having been first published in 2007 or earlier.

Updating of completed reviews did not appear to follow the prescribed two-year timetable. After omitting the 11 reviews that were subsequently withdrawn, a total of 325 reviews listed the year that they were last considered up to date. [All 39 reviews that lacked such information had been published since 2009, so the date of publication was substituted for year last updated.] On average, reviews were last considered up to date in 2006 (2006.42±3.02; range 1998–2012). Three-quarters (N=272, 75%) of the reviews were based on information no more recent than 2008, and approximately one of nine (N=41, 11%) was last considered up to date a decade or more ago. Datedness was the reason for withdrawal for 19 (70%) of the 27 withdrawn studies.

Excluding withdrawn studies, Cochrane mental health reviews included 12.11±17.75 studies (range 0–194 studies). However, Cochrane mental health reviews typically excluded considerably more studies (33.03±58.60) than they included, with 287 reviews (79%) restricted to studies of randomized clinical trials. Studies of which no part had been previously published were unlikely to appear in a Cochrane review, given the ratio of unpublished to included studies (.019±.056).

Excluding withdrawn entries, the most common outcome (N=159, 44%) of Cochrane mental health reviews was insufficient evidence to form a conclusion. An additional 63 (17%) reviews concluded there was sufficient evidence, but the evidence was mixed. Thirty-seven (10%) reviews had a predominantly negative conclusion, and 105 (29%) reviews had a predominantly positive conclusion. There was no difference between reviews that used only randomized controlled trials (N=123) and those that included nonrandomized studies (N=36) in their likelihood to be inconclusive (43% and 47%, respectively; χ²=.38, df=1, p=.54). In contrast, use of the GRADE system, which had been employed in 47 (13%) of all nonwithdrawn reviews, was associated with whether a review reached a conclusion. Reviews that employed the GRADE system (N=14) were less likely to have an inconclusive outcome than those that did not use the GRADE system (N=145) (30% and 46%, respectively; χ²=4.24, df=1, p=.04, Φ=–.108).
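The two comparisons above can be approximately reconstructed as 2×2 tables. The sketch below makes several assumptions that are not stated in the text: the total of 364 nonwithdrawn reviews is taken as the sum of the four conclusion categories (159+63+37+105), the 317 reviews without GRADE and the 77 reviews with nonrandomized studies are obtained by subtraction, and the test is assumed to be an uncorrected Pearson chi-square. Function and variable names are illustrative.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the table [[a, b], [c, d]].

    Returns (chi2, p, phi), where phi keeps the sign of a*d - b*c.
    """
    n = a + b + c + d
    cross = a * d - b * c
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * cross ** 2 / margins
    # Upper-tail probability of a chi-square variable with df = 1.
    p = math.erfc(math.sqrt(chi2 / 2))
    phi = cross / math.sqrt(margins)
    return chi2, p, phi


TOTAL_REVIEWS = 159 + 63 + 37 + 105  # derived: sum of the conclusion categories

# GRADE use. Rows: GRADE, no GRADE; columns: inconclusive, conclusive.
grade_total = 47
no_grade_total = TOTAL_REVIEWS - grade_total  # derived, not reported directly
grade_chi2, grade_p, grade_phi = chi_square_2x2(
    14, grade_total - 14, 145, no_grade_total - 145
)

# Design. Rows: RCT only, nonrandomized included; columns: inconclusive, conclusive.
rct_total = 287
nonrandomized_total = TOTAL_REVIEWS - rct_total  # derived, not reported directly
design_chi2, design_p, design_phi = chi_square_2x2(
    123, rct_total - 123, 36, nonrandomized_total - 36
)

print(round(grade_chi2, 2), round(grade_p, 2), round(grade_phi, 3))
print(round(design_chi2, 2), round(design_p, 2))
```

With these reconstructed cell counts, the GRADE comparison agrees with the reported χ², p, and Φ. The design comparison yields a χ² of roughly .37 rather than the reported .38, a small difference that may reflect rounding or slightly different cell counts in the original analysis.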

Because the most common conclusion of a review was that there was insufficient evidence to form any determination on the merits of the intervention, a logistic regression was conducted to determine predictors of this outcome. Conclusions were collapsed into a dichotomous variable (insufficient evidence to form a conclusion versus all other outcomes) with the following variables entered as predictors: year review was last considered up to date, CRG group, design (exclusively randomized controlled trials versus inclusion of nonrandomized studies), use of the GRADE system, and ratio of studies excluded to total studies.

Use of the GRADE system and ratio of excluded to total studies predicted likelihood that the review would determine that there was too little evidence to form a conclusion (Table 2). The overall model was significant (χ²=54.71, df=6, p<.001), with approximately 19% (Nagelkerke R²=.19) of the variance in conclusions accounted for by these two predictor variables.

Table 2 Predictors of Cochrane review’s having insufficient evidence to form a conclusion (N=358)^a

Predictor	B	SE b	Exp (B)^b	95% CI
GRADE^c	1.10*	.41	3.00	1.33–6.74
Ratio of excluded studies	3.48**	.58	32.56	10.45–101.52
Design	.02	.33	1.02	.53–1.95
Year up to date	.06	.04	1.06	.97–1.15
Developmental, psychosocial, and learning problems versus depression, anxiety, and neurosis collaboration review group (CRG)	.20	.34	1.22	.62–2.40
Schizophrenia versus depression, anxiety, and neurosis CRG	–.02	.30	.98	.55–1.77

^a Includes only reviews without missing values for any predictor variable

^b Exponentiation of the B coefficient or odds ratio

^c GRADE, Grading of Recommendations Assessment, Development, and Evaluation

*p<.01, **p<.001
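The relationship between the B coefficients, their standard errors, and the odds ratios in Table 2 can be illustrated with a short calculation. This sketch assumes that the confidence intervals are Wald intervals, exp(B ± 1.96 × SE), which the article does not state explicitly; the function name is illustrative. Because the table reports B and SE rounded to two decimals, the recomputed values match the printed ones only approximately.

```python
import math


def odds_ratio_with_ci(b, se, z=1.96):
    """Convert a logistic regression coefficient and its standard error
    into an odds ratio with an assumed Wald 95% confidence interval.

    Returns (odds_ratio, lower, upper).
    """
    return math.exp(b), math.exp(b - z * se), math.exp(b + z * se)


# (B, SE) values as printed in Table 2, rounded to two decimals.
grade_or, grade_lo, grade_hi = odds_ratio_with_ci(1.10, 0.41)
ratio_or, ratio_lo, ratio_hi = odds_ratio_with_ci(3.48, 0.58)

print(round(grade_or, 2), round(grade_lo, 2), round(grade_hi, 2))
print(round(ratio_or, 2), round(ratio_lo, 2), round(ratio_hi, 2))
```

The recomputed odds ratios and interval bounds fall within about 2% of the tabled values (3.00, 1.33–6.74 for GRADE; 32.56, 10.45–101.52 for the ratio of excluded studies), which is consistent with rounding of the printed coefficients.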

Further examination found significant differences in the ratio of excluded studies to all studies between the reviews that found too little evidence to form a conclusion (.78±.23) and reviews that arrived at a conclusion, be it positive, negative, or mixed (.62±.23) (t=–6.89, df=356, p<.001, d=.78). Inconclusive reviews, on average, omitted 78% of all potential studies on the intervention.

Discussion

Although clinician adherence in daily practice to evidence-based guidelines remains limited (38,39), there is increasing focus on the use of evidence-based mental health treatments. Despite this emphasis within the field, there has been inconsistency in how that label is applied. The mental health field might well benefit from having a well-established systematic review system rather than competing, and at times inconsistent, efficacy assessments (12–14).

In a number of ways the Cochrane Collaboration appears well suited to be that system. It has a transparent, strong methodology that has been found to be more rigorous than the methodology of other review systems (18–21). It is a long-standing, well-established system that health care providers regard as highly credible (40). However, questions have arisen regarding its topic coverage, datedness, representativeness of the literature, and ability to provide clear guidance, and these questions merit evaluation.

In terms of clinical breadth, a majority (58%) of Cochrane’s mental health entries were concentrated in three areas (psychosis and mood and anxiety disorders). These mental health domains are known for their high disease burden, cost, and prevalence (41–44). Moreover, recently the Cochrane system appears to have focused on new diagnostic areas; for example, nine of the 14 entries for personality disorders are protocols that first appeared in 2011 or 2012. Nevertheless, review coverage outside these three major areas is limited at present. This finding is consistent with a recent report that 58% of all Cochrane systematic reviews of substance misuse focused solely on alcohol or opioids (45). In fact, recently the Cochrane Collaboration has identified the development of a transparent system for prioritizing topics that is responsive to user input as a strategic recommendation (MacLehose, et al., unpublished paper, 2012).

Datedness appears to be a greater area of concern. Although mental health protocols that progressed to reviews did so in a timely manner, 26% of protocols had not been converted to a review five or more years after publication. As a result, some review topics remain undeveloped because an author group has laid a claim to the topic in a prior protocol. Topics can become available again when entries are withdrawn, but withdrawal is a relatively rare occurrence with withdrawal rates differing among CRGs.

Although the Cochrane Collaboration’s policy states that reviews be updated every two years, the average Cochrane mental health review was last considered up to date in 2006. Although there is a lack of consensus regarding how often systematic reviews need to be updated (29,30), it is concerning that one in nine mental health reviews was last considered up to date a decade or more ago. The Cochrane Collaboration has acknowledged the need for a more flexible system and has begun to explore alternatives to a uniform two-year updating policy (MacLehose, et al., unpublished paper, 2012).

Another concern is the large portion of reviews (44%) that determined they had insufficient evidence to form any conclusion, even a mixed one. Contrary to the Cochrane Collaboration’s assertion that reviews that include nonrandomized studies are challenged to form definitive conclusions regarding efficacy, the design of the entry was not associated with the likelihood that a review would be inconclusive. The formation of a conclusion was predicted by the ratio of excluded to total studies and use of the GRADE system.

Cochrane reviews excluded from consideration noticeably more studies than they included. Despite a stated desire by the Cochrane Collaboration to minimize publication bias, relatively few reviews included studies with completely unpublished data. In fact, the more studies that the reviews excluded, the more likely the reviews were to determine there was insufficient evidence to form a conclusion regarding the efficacy of an intervention. Although inconclusiveness may accurately reflect the state of the literature, data suggest that reviews that included more varied designs have arrived at similar conclusions (23,46), and there is a call for mental health reviews to be based on a wider representation of the psychopathology intervention literature (34).

The other significant predictor of review conclusions was whether the GRADE system was used. Reviews that employed the GRADE system were less likely to determine that there was insufficient evidence to form a conclusion, perhaps reflecting the author’s process of having evaluated the strength of the evidence in an organized manner. As yet, though, only a small proportion of Cochrane reviews have used the GRADE system, with likelihood of use being CRG dependent.

In terms of limitations, this study examined only mental health entries from the three primary mental health CRGs and thus did not include all mental health reviews and protocols in the Cochrane library. Specifically, neither the dementia nor the substance use disorders CRGs were assessed; moreover, a number of mental health reviews are scattered among other CRGs, for example, reviews on postpartum depression in the pregnancy and childbirth CRG. Thus although the sample included all entries from the three mental health CRGs during this time frame, it did not include every mental health review or protocol in the Cochrane library.

In a related vein, the reviews’ diagnostic topic was assessed to measure whether reviews addressed clinical needs. However, other aspects besides diagnosis are important in determining if a review fits clinicians’ needs. Specifically, assessing characteristics of the interventions reviewed (for example, pharmacological versus psychotherapeutic or prevention versus tertiary) and of the populations studied (for example, pediatric or geriatric) would be needed to understand if the reviews targeted the topics of importance to individuals seeking empirically based guidance.

Conclusions

Although the Cochrane system is recognized as a premier review system for general medical health interventions, it is important to recognize the limitations of its use for the mental health profession. The system’s datedness, its limited coverage of some mental health domains, and the fact that reviews regularly are inconclusive and often do not reflect the full range of the psychopathology treatment literature are worth noting. The Cochrane Collaboration has already begun to discuss changes to address issues of topic prioritization and review updates (MacLehose, et al., unpublished paper, 2012). Continued exploration into the potential role of varied types of evidence, greater integration of the GRADE system into reviews, and consistency among CRGs in attending to protocol progress may assist in further addressing these issues.

Acknowledgments and disclosures

The author is indebted to Kevin Hennessy, Ph.D., for feedback and coding assistance.

The author reports no competing interests.

References

Cooper JL, Aratani Y: The status of states’ policies to support evidence-based practices in children’s mental health. Psychiatric Services 60:1672–1675, 2009
