Capacity to Consent to Clinical Research
Two of the 10 instruments that focus on capacity to consent to a clinical research protocol—the MacArthur Competence Assessment Tool for Clinical Research and the Informed Consent Survey—are intended to measure all four capacity domains, although whether the latter instrument adequately assesses appreciation and reasoning is debatable. Measures of understanding are included in nine instruments (5, 7, 38, 40–45, 47), five of which assess only understanding (39, 40–44). The California Scale of Appreciation focuses solely on assessment of appreciation (although understanding is likely also tapped), and the Competency Assessment Interview focuses only on understanding and reasoning. The vignette method (47, 48) appears to cover understanding, reasoning, and choice, although appreciation may be tapped as well.
An important variation among the instruments is whether the disclosed information and query content are established by the instrument itself or must be tailored for the specific protocol. For instance, participants may receive standard disclosures and questions, and acceptable responses to the questions may be predetermined. The California Scale of Appreciation and the Competency Assessment Interview use hypothetical study protocols and standard questions (although the California scale could be tailored). Another approach is for the instrument to specify the text of the probes (e.g., “What is the purpose of this study?”) while allowing the disclosures and acceptable responses to be tailored; this approach is used in the Evaluation to Sign Consent, the Quality of Informed Consent questionnaire, the Deaconess Informed Consent Comprehension Test, the Informed Consent Survey, the MacArthur Competence Assessment Tool for Clinical Research, the Two-Part Consent Form, and the vignette method.
The instruments vary in the degree of skill and training required of interviewers for valid administration. The Quality of Informed Consent questionnaire and the Two-Part Consent Form are self-administered; a drawback of this format is the lack of a built-in opportunity to ask follow-up questions. The Evaluation to Sign Consent, the Brief Informed Consent Test, the Deaconess Informed Consent Comprehension Test, the Informed Consent Survey, and the vignette method use interviews, although they all appear to require minimal to moderate training of interviewers or scorers. Training is required for administering the MacArthur Competence Assessment Tool for Clinical Research—the only instrument for which a published manual provides scoring guidelines (7)—because the items must be scored during the interview so that appropriate follow-up questions can be asked or clarifications elicited. The California Scale of Appreciation and the Competency Assessment Interview also require moderate training. Most of the instruments take less than 10 minutes to administer, although the more comprehensive ones take longer.
Psychometricians generally suggest that instruments to be used for clinical decision making have reliability values of at least 0.80 (21). By this standard, most of the instruments we examined had acceptable interrater reliability, although no interrater reliability information was provided for the Brief Informed Consent Test, the Evaluation to Sign Consent, the Informed Consent Survey, or the vignette method. Test-retest reliability has been reported for four of the scales (the MacArthur Competence Assessment Tool for Clinical Research, the Quality of Informed Consent questionnaire, the Two-Part Consent Form, and the California Scale of Appreciation), with values ranging from –0.15 to 0.77; the one negative correlation was for a single subscale of the MacArthur Competence Assessment Tool for Clinical Research in one study of women with depression (35).
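To aid interpretation of these values: the cited reports do not uniformly state which coefficient was computed, but test-retest reliability is conventionally estimated as the Pearson product-moment correlation between scores obtained at two administrations of the same instrument,

\[ r = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}, \]

where x_i and y_i denote a subject's scores at the first and second administrations. On this metric, the reported –0.15 indicates a weak inverse relation between the two administrations, far below the 0.80 benchmark noted above.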
When item content varies with specific use, another potential source of variance may be introduced by the disclosures and acceptable responses that are specified for the different uses. The manual for the MacArthur Competence Assessment Tool for Clinical Research gives fairly detailed instructions on preparation of the items. However, no data are available on how consistent this and other modifiable instruments are, even with trained users—that is, on how “reliable” the item content preparation phase is, or the “inter-item-writer reliability.” In the absence of such data, it is not clear whether, or under what conditions, results from these instruments can be generalized across specific uses, even when referring to similar protocols. The reliability and validity data of one version may not generalize to other versions prepared by other users.
Information about concurrent, criterion, or predictive validity has been published for the MacArthur Competence Assessment Tool for Clinical Research (31, 37), the Brief Informed Consent Test (38), the Evaluation to Sign Consent (37), the Deaconess Informed Consent Comprehension Test (42), the Two-Part Consent Form (43), and the vignette method as described by Schmand et al. (47). The external criterion was generally capacity judgments made by physicians. However, interpreting lack of agreement between “expert” judgment and subjects’ performance on the instruments themselves is problematic. Convergence with opinions from other potential experts or stakeholders (e.g., patients, family members, and legal or regulatory authorities) was rare, although judgments of some nonphysician experts have been included in studies of the Quality of Informed Consent questionnaire (41) and the Two-Part Consent Form (43). Several reports attempted to establish concurrent validity by showing an association with general functional or cognitive measures (38, 42, 47), but because decisional capacity is context- and decision-specific, such correlations are not fully germane. Finally, Cronbach’s alpha, a measure of internal consistency, was 0.69 (fair) for the Schmand et al. vignette method (47) and ranged from 0.83 to 0.88 for the California Scale of Appreciation (22).
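Since alpha values are quoted here without definition, the standard formula (which is general, not specific to these studies) may be a useful reference: for a scale with k items,

\[ \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_i^2}{\sigma_T^2}\right), \]

where σ_i² is the variance of item i and σ_T² is the variance of the total score. By common convention, the 0.83–0.88 range reported for the California Scale of Appreciation is good, while 0.69 falls just short of the 0.70 level often treated as the minimum acceptable for research use.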
Capacity to Consent to Treatment
All 15 of the instruments that focus on capacity to consent to treatment (Table 2) measure understanding, but only nine of them appear to assess all four capacity dimensions. Two of the remaining six instruments assess only understanding, two assess understanding and appreciation, and two assess understanding and reasoning.
Preset vignettes or content are used as stimuli in eight of the 15 instruments: the Assessment of Consent Capacity for Treatment, the Hopemont Capacity Assessment Interview, Fitten et al.’s direct assessment of decision-making capacity (60), the original MacArthur instruments (63), the Hopkins Competency Assessment Test, the Competency to Consent to Treatment Instrument, and the vignette methods (although Vellinga et al. [19] presented the actual treatment decision to a subset of patients). In contrast, the patient’s actual treatment decision is used in the MacArthur Competence Assessment Tool for Treatment, the Competency Interview Schedule, the Ontario Competency Questionnaire, the Aid to Capacity Evaluation, the Two-Part Consent Form, the Structured Interview for Competency and Incompetency Assessment Testing and Ranking Inventory, and the Competency Assessment Interview; and it can form the basis for the Vellinga et al. vignette method.
All 15 instruments employ structured or semistructured interviews, although for the Two-Part Consent Form the interview consists of additional questions asked when a self-administered questionnaire is returned (43). The degree of training needed to administer these instruments ranges from minimal, as in the Hopkins Competency Assessment Test, to more substantial, as in the Competency Interview Schedule, the Assessment of Consent Capacity for Treatment, the Ontario Competency Questionnaire, the Hopemont Capacity Assessment Interview, the Aid to Capacity Evaluation, Fitten et al.’s direct assessment of decision-making capacity (59, 60), the original MacArthur instruments (3, 63), the MacArthur Competence Assessment Tool for Treatment, the Structured Interview for Competency and Incompetency Assessment Testing and Ranking Inventory, the Two-Part Consent Form, the Competency to Consent to Treatment Instrument, the Competency Assessment Interview, and the two vignette methods. Detailed manuals to guide administration, scoring, and interpretation are available only for the Hopemont Capacity Assessment Interview, the original MacArthur Competence Study instruments (Understanding of Treatment Disclosures, Perception of Disorder, Thinking Rationally About Treatment), and the MacArthur Competence Assessment Tool for Treatment; a training video is also available for the latter. Administration time was not widely reported for these instruments, but it varies with the comprehensiveness of the evaluation.
Information on reliability was reported for 12 of the instruments. Adequate interrater reliability (≥0.80) has been reported for the Competency Interview Schedule (51), the Assessment of Consent Capacity for Treatment (52), the Aid to Capacity Evaluation (58), the Hopemont Capacity Assessment Interview (unpublished 1995 paper of N. Staats and B. Edelstein), the Understanding of Treatment Disclosures, Perception of Disorder, and Thinking Rationally About Treatment scales (63), the MacArthur Competence Assessment Tool for Treatment (67, 69), the Hopkins Competency Assessment Test (73), the Structured Interview for Competency and Incompetency Assessment Testing and Ranking Inventory (78), the Competency to Consent to Treatment Instrument (28), and the Two-Part Consent Form (43). Data on internal consistency have been reported for the Competency Interview Schedule (51), the Hopemont Capacity Assessment Interview (56), and the original MacArthur instruments (59, 63, 88); for the latter, internal consistency seemed to vary with the study population, with higher consistency reported for hospitalized psychiatric patients than for cardiac patients and healthy community samples. The authors of the Competency Interview Schedule (51) and Schmand et al. (47) used interitem correlations to evaluate internal consistency. Test-retest reliability has been reported for only four of the scales—the Competency Interview Schedule (51), the Hopemont Capacity Assessment Interview (unpublished 1995 paper of N. Staats and B. Edelstein), the original MacArthur instruments (63), and the Two-Part Consent Form (43). For the seven instruments with variable item content, no data have been published on the reliability of item preparation or on associations between versions prepared by different users.
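The interrater figures summarized above are not necessarily all on the same metric. Where two raters make categorical competent/incompetent judgments, chance-corrected agreement is commonly expressed as Cohen’s kappa,

\[ \kappa = \frac{p_o - p_e}{1 - p_e}, \]

where p_o is the observed proportion of agreement and p_e the proportion expected by chance; agreement on continuous scores is typically indexed instead by an intraclass correlation coefficient. The original reports should be consulted to determine which coefficient each study used.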
Data related to concurrent, criterion, or predictive validity have been published for all of the treatment-consent capacity instruments except the Competency Assessment Interview. In most cases, the external criterion was judgments of decisional capacity made by physicians. Data on the various instruments’ ability to discriminate between patients who were judged by experts as competent and those who were judged incompetent were reported for the Competency Interview Schedule (51, 89), the Aid to Capacity Evaluation (58), the Hopkins Competency Assessment Test (73, 76), Fitten et al.’s direct assessment of decision-making capacity (59), the MacArthur Competence Assessment Tool for Treatment (69, 70), the Structured Interview for Competency and Incompetency Assessment Testing and Ranking Inventory (78), and the two vignette methods (19, 47). For Fitten et al.’s assessment instrument, the MacArthur Competence Assessment Tool for Treatment, and both vignette methods, performance on the instrument did not correspond to physicians’ judgments of older patients’ global competency; this lack of correspondence was interpreted as indicating that clinicians were relatively insensitive to the decisional impairment of these study subjects. Performance on cognitive tests was correlated with decisional capacity scores in some cases (47, 56, 64, 69, 76) but not in others (59, 60, 73, 74, 76). Such findings are consistent with the notion that decisional capacity is a construct distinct from cognitive domains, although cognitive factors are important in the measured abilities. The degree of convergence between the scale’s results and opinions from family members was evaluated in Vellinga et al.’s vignette method (19); family members’ judgments of subjects’ competency did not correspond well to results on the instrument.