Brief Report

Published Online: October 2004

Actors’ Portrayals of Depression to Test Interrater Reliability in Clinical Trials

Jules Rosen, M.D., Benoit H. Mulsant, M.D., Martha L. Bruce, Ph.D., Vikas Mittal, Ph.D., and Debra Fox, M.A.

Publication: American Journal of Psychiatry

Volume 161, Number 10

https://doi.org/10.1176/ajp.161.10.1909

Abstract

OBJECTIVE: This study determined if actors could portray depressed patients to establish the interrater reliability of raters using the Hamilton Depression Rating Scale (HDRS). METHOD: Actors portrayed depressed patients using scripts derived from HDRS assessments obtained at three points during treatment. Four experienced raters blindly viewed videotapes of two patients and two actors. They guessed if each interviewee was a patient or an actor and rated the certainty of their guesses. For each interview, they also rated the realism of the portrayal and completed the HDRS. RESULTS: Experienced raters could not distinguish actors and patients better than chance and were equally certain of their right and wrong guesses. Actors and patients received high scores on the realism of their portrayals. The HDRS scores of the actor-patient pairs were correlated. CONCLUSIONS: Actors can effectively portray depressed patients. Future studies will determine if actors can accurately portray patients with anxiety and psychosis.

In any controlled clinical trial, the validity of the study conclusion is limited by the reliability of the outcomes measured (1). In studies that use more than one rater, it is critically important that all raters use every rating instrument consistently and reliably throughout the study. For continuous measures such as clinical rating scales, interrater reliability can be measured by calculating intraclass correlation coefficients (2). Despite the importance of statistically establishing rater reliability, a review of recent literature suggests that this issue is most often ignored (3).
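As a minimal sketch of how such a coefficient can be computed, the Python function below implements one common form: the two-way random-effects, absolute-agreement, single-rater coefficient often written ICC(2,1). The choice of this form is an assumption made for illustration, since the report does not state which form was used. The function name `icc_2_1` and the example scores are hypothetical and are not data from the study.

```python
from statistics import mean


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `scores` is a list of rows, one row per rated interview, each row
    holding one score per rater (no missing values).
    """
    n = len(scores)        # number of rated interviews
    k = len(scores[0])     # number of raters
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares (interviews x raters, one score per cell).
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical HDRS totals: three interviews (rows) scored by four raters.
example_scores = [
    [6, 7, 6, 8],
    [15, 14, 16, 15],
    [24, 26, 25, 24],
]
print(f"ICC(2,1) = {icc_2_1(example_scores):.3f}")
```

Values near 1 indicate that most of the variance in scores comes from differences between interviews rather than from differences between raters.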

Problems establishing a pool of reliable raters include the logistical obstacles involved in multicenter trials, which may have as many as 40–50 raters at multiple sites. The initial investigators’ meetings typically do not test reliability among the raters present. To establish reliability, videotapes of actual patients can be mailed to each rater, who would then return both the videotapes and the rating scores to a central coordinator. This is a cumbersome, labor-intensive, and often expensive process. Furthermore, when patients decide that they no longer wish to have their videotaped interviews used for training purposes, it might be difficult to be sure that all videotapes have been returned to ensure privacy.

To address this problem, we have developed a web-based system to train raters in the use of psychiatric scales and to test for interrater reliability within a defined group of raters. The advantage of this system over traditional videotapes is that the video images can be transmitted anywhere electronically without the unwieldy process of sending and receiving videotapes. Furthermore, through interactive technology, the rating scores can be saved online in a centralized database, and interrater reliability can be calculated in real time. This process also minimizes errors due to coding and entry of data from paper-and-pencil entries.

A potential disadvantage of this proposed system is the cost of developing interactive videotapes with actual patients who at any time could withdraw permission to use their interviews. Therefore, we conducted a study to assess the validity of using videotaped interviews of actors portraying depressed patients. Actors simulating patients have been accepted for a variety of training purposes in medical education (4–9). However, this practice has not been widely used in psychiatry. Therefore, the actors’ ability to convey both the verbal and subtle nonverbal cues of a person with a psychiatric illness in the course of an interview needs to be demonstrated. This initial study compares the Hamilton Depression Rating Scale (HDRS) scores of experienced raters blindly rating videotapes of actors and actual patients with various levels of depression.

Method

Two patients with major depression (one older man and one younger woman) had been assessed by using an unpublished semistructured interview for the HDRS. These assessments had been videotaped at the initiation of treatment, during treatment, and after successful completion of treatment. Both patients had provided informed consent for the use of their videotapes for research purposes. The six selected interviews illustrated HDRS scores below 10 (an absence of depression), scores of 11–20 (mild to moderate depression), and scores above 21 (severe depression) (Figure 1). Scripts were generated from these videotapes. In order to create realistic portrayals of different stages of depression, a male actor and a female actor were recruited. These two actors were mental health professionals and had worked for several years with depressed patients. They were trained by viewing the videotaped interviews and by using scripts derived from the interviews. The actors then portrayed the three different HDRS interviews from the same-gender patients. The actors’ portrayals of the interviews were videotaped in the same room and with the same camera as the actual patients’ interviews.

The videotapes of both actors and patients were sent to a collaborating research site at Cornell University to ensure that none of the raters had any prior knowledge of either the actors or the patients. Four experienced HDRS raters at Cornell University with previously established reliability for the HDRS participated in six sessions over 3 weeks. During each session, a rater assessed three videotaped interviews of a single subject (either an actor or a patient) shown without depression, with mild to moderate depression, and with severe depression. The rater was told that the person on the videotape was either a patient or an actor.

To determine to what extent the raters could distinguish the actors and the patients, they were asked to guess whether they had rated a patient or an actor portraying a patient. They scored their certainty of that guess on a 10-point scale. In addition, the raters were asked their opinion of the quality of the subjects’ (actors’ or patients’) portrayals of depressive psychopathology by answering the following question, “Was this subject presenting in a way that was consistent with a depressed person during a course of depression?” on a scale of 0 (not at all) to 10 (very much). Finally, the raters were asked to complete the HDRS for each interview. Correlations were calculated for the HDRS scores of each actor-patient pair, and intraclass correlation coefficients were calculated for the ratings of both the actors and the patients.

Results

Experienced raters correctly identified the interviewee as an actor or a patient in seven (44%) of 16 guesses, no better than the 50% expected by chance. When raters guessed incorrectly, they were as certain of their guesses as when they guessed correctly. On the 10-point rating of “presenting in a way that was consistent with…depression,” the mean scores were 7.1 (SD=2.4) for the actors and 6.5 (SD=2.4) for the patients.
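The comparison with chance can be made concrete with an exact binomial calculation. The report does not describe a formal test, so the sketch below is only an illustrative check; it assumes the 16 guesses are independent and that chance corresponds to a probability of 0.5 per guess. The function name `binom_two_sided_p` is hypothetical.

```python
from math import comb


def binom_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely
    than the observed count under the null probability `p`.
    """
    pmf = [
        comb(trials, i) * p**i * (1 - p) ** (trials - i)
        for i in range(trials + 1)
    ]
    observed = pmf[successes]
    # Small relative tolerance so exact ties are counted despite rounding.
    return sum(q for q in pmf if q <= observed * (1 + 1e-9))


correct, guesses = 7, 16  # counts reported in the Results
p_value = binom_two_sided_p(correct, guesses)
print(f"two-sided p = {p_value:.3f}")  # prints "two-sided p = 0.804"
```

Under these assumptions, seven correct guesses out of 16 is entirely compatible with guessing at random.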

The scores generated for the actors’ interviews were highly correlated with the scores generated for the patients’ interviews (r=0.99, p<0.001). Intraclass correlation coefficients calculated with the ratings of the actors and the patients were 0.99 for both groups.
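For readers who want to reproduce this kind of analysis on their own ratings, the following sketch computes a Pearson correlation over paired scores. It assumes that each pair matches a rater's HDRS total for a patient interview with the same rater's total for the corresponding actor interview. The function name `pearson_r` and all score values are hypothetical and are not the study data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical paired HDRS totals (patient interview, matching actor interview).
patient_scores = [7, 16, 25, 8, 15, 27]
actor_scores = [6, 17, 24, 9, 14, 26]
print(f"r = {pearson_r(patient_scores, actor_scores):.3f}")
```

A high correlation shows that actor and patient scores rise and fall together across severity levels; it does not by itself show that the absolute scores agree, which is why the intraclass correlation coefficients are reported as well.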

Discussion

These results demonstrate the feasibility of using trained actors to portray depressive psychopathology to establish interrater reliability. Therefore, training and testing materials for raters participating in multicenter clinical trials could be developed by using trained actors without the risk of disseminating the clinical information of actual patients. In this study, the two actors had extensive experience in the field of mental health assessments. This experience probably contributed to a more realistic portrayal of depression. This highlights the importance of selecting and training actors appropriately to ensure an even quality in the portrayal of psychopathology, especially if this method becomes common for use in clinical trials.

Figure 1. Scatterplot of Hamilton Depression Rating Scale Scores of Actor-Patient Pairs^a
^aFour of the 24 pairs of scores are not visible because of overlapping data.

Footnote

Received Aug. 29, 2003; revision received Nov. 17, 2003; accepted March 9, 2004. From the Department of Psychiatry and the Katz Graduate School of Business, University of Pittsburgh, Pittsburgh; the Geriatric Research, Education, and Clinical Center, VA Pittsburgh Health Care System, Pittsburgh; Weill Medical College of Cornell University, White Plains, N.Y.; and Fox Learning Systems, Inc., Bridgeville, Pa. Address reprint requests to Dr. Mulsant, 3811 O’Hara St., Pittsburgh, PA 15213; [email protected] (e-mail). Supported in part by grants AG-19088, HS-11976, MH-01613, MH-01634, MH-52247, and MH-61639. Dr. Rosen and Ms. Fox have a commercial interest in Fox Learning Systems, Inc. Drs. Bruce and Mulsant have been consultants for Fox Learning Systems, Inc.

References

1. Kobak KA, Greist JJ, Jefferson JW, Katzelnick DJ: Computer-administered clinical rating scales: a review. Psychopharmacology (Berl) 1996; 127:291–301

2. Bartko JJ, Carpenter WT: On the methods and theory of reliability. J Nerv Ment Dis 1976; 163:307–317

3. Mulsant BH, Kastango KB, Rosen J, Stone RA, Mazumdar S, Pollock BG: Interrater reliability in clinical trials of depressive disorders. Am J Psychiatry 2002; 159:1598–1600

4. Syder D: The use of simulated clients to develop the clinical skills of speech and language therapy students. Eur J Disord Commun 1996; 31:181–192

5. Peabody JW, Luck J, Glassman P, Dresselhaus TR, Lee M: Comparison of vignettes, standardized patients, and chart abstraction: a prospective validation study of 3 methods for measuring quality. JAMA 2000; 283:1715–1722

6. Hazelkorn HM, Robins LS: Actors play patients: using surrogate patients to look into private practice. Public Health Rep 1996; 111:129–132

7. Nestel D, Muir E, Plant M, Kidd J, Thurlow S: Modelling the lay expert for first-year medical students: the actor-patient as teacher. Med Teach 2002; 24:562–564

8. Silvestre AJ, Gehl MB, Encandela J, Schelzel G: A participant observation study using actors at 30 publicly funded HIV counseling and testing sites in Pennsylvania. Am J Public Health 2000; 90:1096–1099

9. Loayssa JR, Garcia GM, Diez EJ: [Simulated consultation with actors for teaching clinical interviews.] Aten Primaria 1993; 11:320 (Spanish)

Published In

American Journal of Psychiatry

Volume 161 • Number 10 • October 2004

Pages: 1909 - 1911

PubMed: 15465990

History

Published in print: October 2004

Published online: 22 December 2014
