Hypotheses
Feasibility. We assessed feasibility by comparing the CAI format with interviewer administration on rates of participation, completion, and missing data. Only 21 (8%) of the 266 potential participants who were approached declined to participate. The 21 persons who refused were given a checklist to detail their reasons: six were unwilling to use a computer, nine were unwilling to see an interviewer, and 12 were unwilling to give blood (multiple reasons were allowed). The high acceptance rate and the checklist results suggest that the computer was not a barrier to participation. Of the 245 clients who began the study, 12 (5%) dropped out between the first and second sessions. One cited dislike of the computer as the reason for dropping out, and two cited the interviewer.
No difference was found between the formats in whether the interview was completed (100% completion in both groups). To examine missing data at the level of individual items, we focused on two measures with no branching options, the DALI and the PCL (a total of 35 questions per client). In the first DALI interview, the 124 clients responding to trained interviewers answered an average of 17.8 of the 18 items, and the 109 clients responding to the computer answered an average of 17.1, a significant difference (t=4.67, df=231, p<.01). On the PCL, clients responding to trained interviewers answered an average of 16.9 of the 17 items, and clients responding to the computer answered an average of 16.7 items, also a significant difference (t=2.26, df=231, p<.025).
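As a point of reference, the following is a minimal sketch of the item-level tally behind these figures. The response matrix is hypothetical (np.nan marks an unanswered item); the study then compared the per-format means of these counts with an independent-samples t test, as reported above.

import numpy as np

# Hypothetical response matrix: rows are clients, columns are the 18 DALI
# items, with np.nan marking an item left unanswered.
responses = np.array([
    [1, 0, 1, np.nan, 1, 1, 0, 1, 1, 1, 0, 1, 1, np.nan, 1, 0, 1, 1],
    [1, 1, 1, 1,      0, 1, 1, 1, 0, 1, 1, 1, 1, 1,      1, 1, 0, 1],
])

# Items answered per client; the study compared the per-format means of these
# counts (17.8 for interviewer administration versus 17.1 for the computer).
answered = (~np.isnan(responses)).sum(axis=1)
print(answered, answered.mean())   # [16 18] 17.0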
Reliability and validity. We assessed each client twice in order to examine the test-retest reliability of each format and the amount of agreement between the two formats (convergent validity). We calculated these scores by extracting one individual question (number of psychiatric hospitalizations) and six composite scores: total number of reported traumas, summed PTSD score from the PCL, alcohol abuse and other drug abuse scores from the DALI, and the drug risk score and the sex risk score from the ARI. The test-retest data on those seven measures are presented in Table 2, and the convergent validities are in Table 3.
Intraclass correlations were used in all cases. The average test-retest reliability was similar for the two formats. Reliability scores for the respondents were in the acceptable range, and the confidence limits for the two formats overlapped for every measure. For the PCL, the test-retest reliabilities were comparable to those obtained from younger individuals without a history of severe mental illness (37). The reliabilities on individual measures exhibited moderate variability. The average intraclass correlation between the two formats was .76. We would expect the correlation between formats, on average, to be lower than the test-retest reliability scores within format, because between-format agreement reflects any systematic format differences in addition to ordinary measurement error.
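For reference, here is a minimal sketch of how an intraclass correlation of this kind can be computed. The article does not specify which ICC variant was used, so the one-way random-effects model, ICC(1,1), is an assumption, and the scores below are hypothetical.

import numpy as np

def icc_oneway(scores):
    """One-way random-effects ICC(1,1); scores is n_subjects x k_ratings."""
    n, k = scores.shape
    grand = scores.mean()
    row_means = scores.mean(axis=1)
    # Between-subjects and within-subject mean squares from one-way ANOVA.
    ms_between = k * ((row_means - grand) ** 2).sum() / (n - 1)
    ms_within = ((scores - row_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical PCL total scores at test and retest for five clients.
pcl = np.array([[45.0, 47], [30, 28], [55, 54], [22, 25], [38, 36]])
print(f"test-retest ICC = {icc_oneway(pcl):.2f}")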
Criterion validity. For PTSD, number of hospitalizations, and risk for blood-borne infections, we used criterion measures to assess the effect of format on validity.
The CAPS, conducted by an expert clinician blind to client group assignment, served as the criterion measure for the PCL. We used the PCL from the first interview. We computed the correspondence between the PCL and the CAPS in two ways: as an intraclass correlation of the total scores derived from each instrument and as percentage agreement on the final diagnosis. CAPS total score was significantly correlated with both the score on the computer-administered PCL (N=23, intraclass r=.878; F=15.38, df=22 and 22, p<.001) and the score on the interviewer-administered PCL (N=35, intraclass r=.654; F=4.78, df=34 and 34, p<.001).
Forty-eight percent (28 of 58) of the clients in this subgroup were diagnosed as having PTSD on the CAPS. When the CAPS was used as the criterion, the computer-administered PCL had 92% sensitivity (11 of 12 clients) and 90% specificity (9 of 10 clients). The interviewer-administered PCL had 75% sensitivity (12 of 16 clients) and 79% specificity (15 of 19 clients).
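The sensitivity and specificity figures follow directly from the counts reported above; the short calculation below reproduces them (the helper function is ours, for illustration only).

def sens_spec(tp, fn, tn, fp):
    """Sensitivity and specificity from true/false positive and negative counts."""
    return tp / (tp + fn), tn / (tn + fp)

# Computer-administered PCL: 11 of 12 CAPS-positive clients detected,
# 9 of 10 CAPS-negative clients correctly ruled out.
print(sens_spec(tp=11, fn=1, tn=9, fp=1))    # 0.917, 0.900 -> 92%, 90%
# Interviewer-administered PCL: 12 of 16 positive, 15 of 19 negative.
print(sens_spec(tp=12, fn=4, tn=15, fp=4))   # 0.750, 0.789 -> 75%, 79%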
Hospital records served as the criterion measure for number of hospitalizations. Client reports in the computer format correlated .68 with hospital records, and reports in the trained-interviewer format correlated .80; the difference between formats did not approach significance. Number of hospitalizations was graded on a 4-point scale, and client report matched the records exactly on 67% of the computer responses and 72% of the face-to-face responses.
The ARI assesses the risk of acquiring a blood-borne infection. As a criterion measure, we collected a blood sample from each client and had it analyzed for HIV, hepatitis B, and hepatitis C. Thirty-six clients had one or more infections. To see how well the ARI predicted infection in each format, we computed the overall risk score and then carried out t tests, separately by format, comparing clients with and without an infection on that score. In the computer format, clients with an infection had higher risk scores than clients without (.20 versus .12), a significant difference (t=2.61, df=107, p=.01). In the trained-interviewer format, there was no significant difference between the two groups (.16 versus .15).
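The by-format check above amounts to an independent-samples t test on the ARI risk score with serology status as the grouping variable. A minimal sketch, with hypothetical risk scores and infection labels; only the group means and test statistics quoted above come from the study.

import numpy as np
from scipy import stats

# Hypothetical ARI risk scores and serology results (HIV, hepatitis B or C).
risk = np.array([0.25, 0.10, 0.30, 0.05, 0.22, 0.08])
infected = np.array([True, False, True, False, True, False])

# Compare mean risk scores for infected versus uninfected clients; in the
# study this test was run separately within each interview format.
t, p = stats.ttest_ind(risk[infected], risk[~infected])
print(f"infected mean={risk[infected].mean():.2f}, "
      f"uninfected mean={risk[~infected].mean():.2f}, t={t:.2f}, p={p:.3f}")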
Disclosure of sensitive information. We hypothesized that clients would disclose more stigmatized behavior in the computer format. To test this prediction we constructed two disclosure indices from items on the DALI and the ARI: one for drug items and one for sex items. Clients responding to an interviewer had higher scores on the drug index than clients responding to the computer (2.22 versus 1.58; t=2.01, df=228, p=.046). Clients responding to an interviewer also had higher scores on the sex index (2.21 versus 1.72; t=2.49, df=229, p=.014). Contrary to our hypothesis, clients reported more illegal and stigmatized events to trained interviewers than to the computer.
Cost and time. The estimated cost of trained-interviewer administration was approximately $90 per interview, including interviewer salaries, data editing, and data entry. At the time of the study, CAI costs included approximately $12,500 for computer hardware, $1,000 for software, and $3,000 for programmer time, an average of about $47 per interview. Computer interviewing also carried personnel costs: a staff person had to be available to log the client in and answer questions, and that person needed to be sufficiently computer literate to handle the occasional computer failure or network outage. This cost per interview was hard to estimate, because these staff members were engaged in other tasks during the interview. Because the initial capital costs of computing are independent of the number of interviews, the cost per interview drops as that number increases. Overall, the computer interviews entailed some cost savings compared with trained-interviewer administration, and those savings increase for larger projects.
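Treating the hardware, software, and programming figures as fixed costs, the amortization behind the roughly $47 estimate can be reconstructed. This is a sketch under those assumptions; the per-interview staff cost is left as a parameter because the article describes it as hard to estimate.

FIXED_COSTS = 12_500 + 1_000 + 3_000    # hardware + software + programming = $16,500
INTERVIEWER_COST = 90                   # trained-interviewer cost per interview

def cai_cost_per_interview(n_interviews, staff_cost_per_interview=0.0):
    # Capital costs are independent of volume, so they amortize across interviews.
    return FIXED_COSTS / n_interviews + staff_cost_per_interview

for n in (100, 350, 1_000):
    print(n, round(cai_cost_per_interview(n), 2))
# At roughly 350 interviews the amortized cost is about $47, consistent with
# the article's estimate; savings relative to $90 grow as volume increases.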
We measured the time from when the client began the interview until he or she completed it. Clients responding to the computer spent slightly more time on the first interview than on the second (22.31 versus 21.53 minutes), but the difference did not approach significance. There was no significant difference in the number or duration of breaks as a function of format. The CAI data were available more quickly for analysis: face-to-face interviews often required days or even weeks between completion and delivery for data editing and entry, whereas the computer interviews were available in the database as soon as the interview ended.
Satisfaction and preference. Of the 53 clients who experienced only the computer format, 51 (96%) responded that they "liked it." Of the 65 clients who experienced only the face-to-face interview, 61 of the 64 who responded (95%) liked it, no significant difference. We asked the 115 clients who had experienced both formats which format they would prefer in the future: 49 (43%) said the computer, 43 (37%) said a person, and 23 (20%) had no preference. Age was the only client variable significantly related to preference; clients who preferred the computer averaged 40.3 years of age, and clients who preferred a face-to-face interview averaged 48.2 years (t=3.33, df=84, p=.001). Variables we did not collect, such as education level, might also predict satisfaction.