This guideline was developed using a process intended to meet standards of the Institute of Medicine (2011a) (now known as the National Academy of Medicine). The process is fully described in a document available on the APA Web site at: www.psychiatry.org/psychiatrists/practice/clinical-practice-guidelines/guideline-development-process.
Guideline Writing Group Composition
The GWG was initially composed of four psychiatrists with general research and clinical expertise (R. B., J. E., M. J. T., A. S.). This non-topic-specific group was intended to provide diverse and balanced views on the guideline topic to minimize potential bias. Three psychiatrists (E. A., A. G., V. F.), one psychologist (N. Z.), one adolescent pediatrician (N. G.), one internist (M. M.), and one dietitian (L. M.) were added to provide subject matter expertise in eating disorders. Two fellows (B. U., M. R.) were involved in the guideline development process, and an additional member (K. P.) provided input on quality measure considerations. The vice-chair of the GWG (L. J. F.) provided methodological expertise on such topics as appraising the strength of research evidence. The GWG was also diverse and balanced with respect to other characteristics, such as geographical location and demographic background. F.E.A.S.T. and Mental Health America reviewed the draft and provided perspective from patients, families, and other care partners.
Systematic Review Methodology
This guideline is based on a systematic search of available research evidence conducted by APA staff, extraction of detailed information on included studies by Dr. Evidence (Santa Monica, CA) using the DOC Data 2.0 software platform, and network meta-analyses conducted by Heno Analytics (Vancouver, BC, Canada). The systematic search of available research evidence used MEDLINE (PubMed), Cochrane Library, and PsycINFO databases, with specific search terms and limits as described in Appendix B. Results covered the period from the start of each database to July 15, 2019, and were limited to English-language and human-only studies that were clinical trials, observational studies, systematic reviews, or meta-analyses. Case reports, comments, editorials, and letters were excluded. Updated searches were conducted using the same criteria for the period from January 1, 2019, to October 1, 2021, to ensure that more recent evidence was incorporated into the guideline.
Four reviewers (L. J. F., S.-H. H., J. Y., and T. C.) screened the results of the initial search, with each abstract and title screened by two reviewers according to APA’s general screening criteria: RCT, systematic review or meta-analysis, or observational study with a sample of at least 50 individuals; human; study of the effects of a specific intervention or psychiatric disorder or symptoms. If discrepancies were noted among reviewers’ ratings, an additional opinion was given by a third individual and consensus was achieved among the reviewers. Abstracts identified using this approach were then reviewed by one individual (S.-H. H.), with verification by a second reviewer (L. J. F.) to determine whether they met eligibility criteria as defined by the PICOT elements (see Appendix B). For the updated search, abstracts were screened in the same fashion by two reviewers (L. J. F. and S.-H. H.) with discrepancies resolved by discussion and consensus among the reviewers. If the publication characteristics were not clear from the initial title and abstract review, full text review occurred.
Studies were included if participants were ≥ 10 years of age and diagnosed with an eating disorder (AN, BN, BED, night eating syndrome, ARFID) with diagnosis as defined by DSM-III, DSM-III-R, DSM-IV, DSM-IV-TR, DSM-5 (Section II or Section III), or ICD-10, as applicable. Interventions of interest included psychotherapies, pharmacotherapies, and other interventions. For AN, approaches to refeeding and bone density preservation were also included. Comparator conditions included active interventions, placebo, treatment as usual, waiting list controls, or general psychiatric management. Multiple outcomes were included related to key eating disorder findings, functioning, quality of life, adverse effects, and study withdrawal rates, among others (see Appendix B). Studies were excluded if the eating disorder of interest did not account for at least 75% of the total sample. Other exclusion criteria included small sample size (N < 20 for randomized controlled trials [RCTs], N < 50 for non-randomized clinical trials or observational studies), lack of a comparator group, or short treatment duration (< 10 days for refeeding studies or < 8 weeks for other studies). Citations to registry links, abstracts, and proceedings were not included unless also published in a peer-reviewed journal because they did not include sufficient information to evaluate the risk of bias of the study.
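The numeric exclusion criteria above can be read as a simple checklist. The following Python sketch is illustrative only and is not part of the guideline's actual screening workflow; the `Study` fields, the design labels, and the function name are hypothetical, and the diagnostic-system requirement (DSM or ICD-10) is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Study:
    # All field names and design labels are hypothetical, chosen for illustration.
    design: str                      # "rct", "non_randomized_trial", or "observational"
    sample_size: int
    min_age_years: float             # youngest participant age permitted by the study
    pct_with_target_disorder: float  # percent of sample with the eating disorder of interest
    has_comparator: bool
    treatment_days: int
    is_refeeding_study: bool


def exclusion_reasons(study: Study) -> List[str]:
    """Return the exclusion criteria a study fails (empty list if none)."""
    reasons = []
    if study.min_age_years < 10:
        reasons.append("participants younger than 10 years")
    if study.pct_with_target_disorder < 75:
        reasons.append("disorder of interest under 75% of sample")
    # N < 20 for RCTs; N < 50 for non-randomized trials or observational studies
    min_n = 20 if study.design == "rct" else 50
    if study.sample_size < min_n:
        reasons.append("sample size below minimum")
    if not study.has_comparator:
        reasons.append("no comparator group")
    # < 10 days for refeeding studies; < 8 weeks (56 days) for other studies
    min_days = 10 if study.is_refeeding_study else 8 * 7
    if study.treatment_days < min_days:
        reasons.append("treatment duration too short")
    return reasons
```

For example, a 12-week RCT with 25 participants aged 12 years or older, all with the disorder of interest and with a comparator arm, would return an empty list under these assumptions.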
For each trial identified for inclusion from the search, detailed information was extracted by Dr. Evidence (Santa Monica, CA) using the DOC Data 2.0 software platform. Dr. Evidence processes included verifications and quality checks on data extraction. In addition to specific information about each reported outcome, extracted information included citation; study design; treatment arms (including doses, sample sizes); co-intervention, if applicable; trial duration and follow-up duration, if applicable; country; setting; funding source; sample characteristics (e.g., mean age, percent nonwhite, percent female, percent with co-occurring condition); and rates of attrition, among other data elements. Summary tables (see Appendices E and H) include specific details for each study identified for inclusion from the literature search. Factors relevant to risk of bias were also identified for each RCT that contributed to a guideline statement. Risk of bias was determined using the Cochrane Risk of Bias 2.0 tool (Sterne et al. 2019) by one reviewer (J. M.) and verified by an additional reviewer (S.-H. H. or L. J. F.). Risk of bias ratings are included in summary tables (see Appendix E), with specific factors contributing to the risk of bias for each study shown in Appendix F (McGuinness and Higgins 2020). Extracted data on outcomes were used in network meta-analyses (conducted by Heno Analytics; Vancouver, BC, Canada).
Available guidelines from other organizations were also reviewed (see Appendix G) (American College of Obstetricians and Gynecologists 2018; Catalan Agency for Health Technology Assessment and Research 2009; Couturier et al. 2020; Danish Health Authority 2016a, 2016b; Haute Autorité de Santé 2010; Golden et al. 2015a; Hackert et al. 2020; Hay et al. 2014; Herpertz et al. 2020; Hilbert et al. 2017; Hornberger et al. 2021; Lock et al. 2015a; National Guideline Alliance (UK) 2020; Ozier et al. 2011; Resmark et al. 2019; Royal Colleges of Psychiatrists 2014).
Rating the Strength of Supporting Research Evidence
Strength of supporting research evidence describes the level of confidence that findings from scientific observation and testing of an effect of an intervention reflect the true effect. Confidence is enhanced by such factors as rigorous study design and minimal potential for study bias.
Ratings were determined, in accordance with the Agency for Healthcare Research and Quality (AHRQ)’s Methods Guide for Effectiveness and Comparative Effectiveness Reviews (Agency for Healthcare Research and Quality 2014), by the methodologist (L. J. F.) and reviewed by members of the SRG and GWG. Available clinical trials were assessed across four primary domains: risk of bias, consistency of findings across studies, directness of the effect on a specific health outcome, and precision of the estimate of effect.
The ratings are defined as follows:
▫ High (denoted by the letter A) = High confidence that the evidence reflects the true effect. Further research is very unlikely to change our confidence in the estimate of effect.
▫ Moderate (denoted by the letter B) = Moderate confidence that the evidence reflects the true effect. Further research may change our confidence in the estimate of effect and may change the estimate.
▫ Low (denoted by the letter C) = Low confidence that the evidence reflects the true effect. Further research is likely to change our confidence in the estimate of effect and is likely to change the estimate.
The AHRQ has an additional category of insufficient for evidence that is unavailable or does not permit estimation of an effect. The APA uses the low rating when evidence is insufficient because there is low confidence in the conclusion and further research, if conducted, would likely change the estimated effect or confidence in the estimated effect.
Rating the Strength of Guideline Statements
Each guideline statement is separately rated to indicate strength of recommendation and strength of supporting research evidence. Strength of recommendation describes the level of confidence that potential benefits of an intervention outweigh potential harms. This level of confidence is informed by available evidence, which includes evidence from clinical trials as well as expert opinion and patient values and preferences. As described in the section “Rating the Strength of Supporting Research Evidence,” this rating is a consensus judgment of the authors of the guideline and is endorsed by the APA Board of Trustees.
There are two possible ratings: recommendation or suggestion. A recommendation (denoted by the numeral 1 after the guideline statement) indicates confidence that the benefits of the intervention clearly outweigh harms. A suggestion (denoted by the numeral 2 after the guideline statement) indicates greater uncertainty. Although the benefits of the statement are still viewed as outweighing the harms, the balance of benefits and harms is more difficult to judge or either the benefits or the harms may be less clear. With a suggestion, patient values and preferences may be more variable, and this can influence the clinical decision that is ultimately made. These strengths of recommendation correspond to ratings of strong or weak (also termed conditional) as defined under the GRADE method for rating recommendations in clinical practice guidelines (described in publications such as Guyatt et al. 2008 and others available on the Web site of the GRADE Working Group at www.gradeworkinggroup.org).
When a negative statement is made, ratings of strength of recommendation should be understood as meaning the inverse of the above (e.g., recommendation indicates confidence that harms clearly outweigh benefits).
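As an illustration of how the two rating scales fit together, the sketch below maps each rating to its numeral or letter, with the AHRQ "insufficient" category collapsed into the low (C) rating as APA does. The function name and the idea of joining numeral and letter into a single label such as "1B" are assumptions made for this example rather than a convention stated in this section.

```python
# Strength of recommendation: numeral placed after the guideline statement.
RECOMMENDATION_NUMERAL = {"recommendation": "1", "suggestion": "2"}

# Strength of supporting research evidence: letter rating.
# AHRQ's "insufficient" category is rated as low (C) by APA.
EVIDENCE_LETTER = {
    "high": "A",
    "moderate": "B",
    "low": "C",
    "insufficient": "C",
}


def rating_label(recommendation_strength: str, evidence_strength: str) -> str:
    """Combine the two ratings into one label (hypothetical presentation)."""
    return (RECOMMENDATION_NUMERAL[recommendation_strength]
            + EVIDENCE_LETTER[evidence_strength])
```

Under these assumptions, a recommendation supported by moderate-strength evidence would be labeled "1B", and a suggestion resting on insufficient evidence would be labeled "2C".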
The GWG determined ratings of strength of recommendation by a modified Delphi method using blind iterative voting and discussion. In order for the GWG members to be able to ask for clarifications about the evidence, the wording of statements, or the process, the vice-chair of the GWG served as a resource and did not vote on statements. The chair and other formally appointed GWG members were eligible to vote.
In weighing potential benefits and harms, GWG members considered the strength of supporting research evidence, their own clinical experiences and opinions, and patient preferences. For recommendations, at least 13 out of 14 members must have voted to recommend the intervention or assessment after four rounds of voting, and at most 1 member was allowed to vote other than “recommend” the intervention or assessment. On the basis of the discussion among the GWG members, adjustments to the wording of recommendations could be made between the voting rounds. If this level of consensus was not achieved, the GWG could have agreed to make a suggestion rather than a recommendation. No suggestion or statement could have been made if 3 or more members voted “no statement.” Differences of opinion within the GWG about ratings of strength of recommendation, if any, are described in the subsection of Appendix G, “Balancing of Potential Benefits and Harms in Rating the Strength of the Guideline Statements and Quality Measurement Considerations,” for each statement.
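The consensus thresholds described above can be expressed as a short decision rule. This Python sketch is a simplified reading of the final-round tally only: it does not model the iterative voting rounds, wording changes, or discussion, and the vote labels and function name are hypothetical. Because the text does not give a numeric threshold for agreeing on a suggestion, the sketch only reports that a suggestion remained possible.

```python
from collections import Counter
from typing import List


def statement_outcome(votes: List[str]) -> str:
    """Classify a final-round tally of votes.

    Each vote is one of "recommend", "suggest", or "no statement"
    (labels are hypothetical, chosen for illustration).
    """
    counts = Counter(votes)
    # No suggestion or statement if 3 or more members voted "no statement".
    if counts["no statement"] >= 3:
        return "no statement"
    # A recommendation allows at most 1 vote other than "recommend"
    # (13 of 14 members in the guideline's description).
    if len(votes) - counts["recommend"] <= 1:
        return "recommendation"
    # Otherwise the group could still agree to make a suggestion.
    return "suggestion possible"
```

For instance, 13 "recommend" votes and 1 "suggest" vote would yield a recommendation, whereas 11 "recommend" votes and 3 "no statement" votes would yield no statement.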
External Review
This guideline was made available for review from October 5, 2021, to November 12, 2021, by stakeholders, including the APA membership, scientific and clinical experts, allied organizations, and the public. In addition, a number of patient advocacy organizations were invited to provide input. Comments on the guideline were submitted by 108 individuals and 26 organizations (see the section “Individuals and Organizations That Submitted Comments” for a list of those who wished to be acknowledged in the guideline). The Chair and Co-chair of the GWG reviewed and addressed all comments received; substantive issues were reviewed by the GWG.