Health care workplace violence is a significant public health problem, yet for many reasons, eradicating it remains elusive. According to a 2016 review in the
New England Journal of Medicine, “Healthcare workplace violence is an underreported, ubiquitous, and persistent problem that has been tolerated and largely ignored” (
The statistics are startling: 75% of the 24,000 workplace assaults reported annually in the United States between 2011 and 2013 occurred in health care settings (
2), and health care workers are 20% more likely than workers in other industries to become victims of violence (
3). Among health care workers, the risk for being a target of violence is highest for inpatient mental health care workers, including nurses; 75% of these incidents were violent assaults by patients (
4,
5). These numbers do not capture the true incidence of violence, however, largely because of underreporting by as much as 70% (
1). Further, there are no national data on verbal aggression such as threats, verbal abuse, hostility, and harassment toward staff by patients (
1). Experienced by many health care workers on a daily basis, verbal aggression is the most common form of violence in health care (
5,
6), yet it is the least likely to be reported or addressed in the workplace because it is seen as “part of the job” (
Health care workplace violence has major consequences, including staff burnout, posttraumatic stress disorder, job turnover, anxiety, and depression (
8–
10), even when the violence does not result in physical injury.
In an effort to address this serious public health issue, the U.S. Joint Commission required the use of standardized instruments to “measure outcomes for behavioral health organizations” as of January 2018 (
11). The commission recommended that health care organizations “clearly define workplace violence and put systems into place . . . that enable staff to report workplace violence instances” as well as to “capture, track and trend all reports of workplace violence—including verbal abuse and attempted assaults when no harm occurred” (
2). However, as Odes et al. (
12) pointed out in a recent systematic review of the frequency of violence in inpatient psychiatric settings in the United States, different measurement approaches can lead to markedly divergent estimates of violence prevalence. The authors also identified the lack of consensus on the best tracking instruments as a critical gap in the workplace violence field. Without accurate data collected via standardized and validated measures, it is difficult to assess the impact of implementing violence prevention and risk assessment efforts (
12).
We undertook this review as part of a collaborative interdisciplinary effort at our acute psychiatric hospital to adhere to the Joint Commission’s requirements. Our goal was to review the recent literature to identify scoring instruments that document inpatient psychiatric violence and to assess the utility, reliability, and validity of the most commonly used measures. In the Discussion section, we describe our preliminary application of the review’s results.
Methods
We comprehensively searched the literature in MEDLINE, the Cochrane Library, CINAHL, and PsycINFO databases with the assistance of experienced librarians and Covidence systematic review software (
13). Keywords for the searches included “inpatient psychiatric units,” “aggression,” “violence,” and “measurement.” (The full search details are included in an
online supplement to this article.) The search was limited to articles published in English between June 2008 and June 2018 to capture the most current use of measures. One of the authors (L.A.M.) screened for potential studies by inspecting the titles and abstracts of retrieved articles and then reviewed the full text of the studies identified in this initial screen. In addition to articles obtained with this search strategy, we hand-searched the references of included articles for additional relevant studies to incorporate into the review.
Included studies were those that measured observable, quantifiable episodes of violent behavior at acute psychiatric hospitals by using scales, instruments, or other tools. We did not limit studies on the basis of their definitions of violence or aggression, because no universally accepted definition of mental health–related aggression exists (
14) and because the literature lacks consensus on defining the difference between aggression and violence (
15). We defined routinely collected data, specially designed forms, and occurrence reports as data sources used for clinical purposes and for quality improvement or research studies, without reference to specific instruments that have been identified in the literature. In addition to counting frequency of use of measures in articles addressing inpatient psychiatric unit violence, we also noted and reported the characteristics of the more common measures (type of questions, mode of administration, length, and reliability).
Because the focus of this review was on scoring instruments for measuring violent incidents during relatively short psychiatric hospitalizations of acutely ill adult patients, exclusion criteria included studies of forensic and nonacute inpatient psychiatric settings, which tend to have longer lengths of stay, and of geriatric and child or adolescent populations. Studies of lateral violence (i.e., staff to staff), staff-to-patient and visitor-to-patient violence, and intimate partner violence were also excluded.
Results
The literature search yielded 294 independent studies; review of their abstracts against the exclusion criteria eliminated 189 studies, and full-text reviews and bibliography searches identified 85 studies meeting the inclusion criteria (see
online supplement). We eliminated studies that used the same data for different analyses; details regarding removal of duplicates can be found in the
online supplement. One study, Carr et al. (
16), used three measures, and we counted each scoring instrument used in the study as a separate use of the measure: progress notes, the Overt Aggression Scale (OAS) (
17), and the Violent Incident Form (
18). The final total was 74;
Table 1 summarizes the review results.
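As a minor illustration of this counting approach, the following sketch (in Python, with entirely hypothetical study records apart from the Carr et al. example described above) shows how instrument uses, rather than studies, can be tallied when a single study reports more than one measure.

```python
from collections import Counter

# One entry per (study, instrument) pair, so a study that used several
# measures (as Carr et al. did) contributes one count per measure.
# All entries other than Carr et al. are hypothetical placeholders.
uses = [
    ("Study A", "SOAS-R"),
    ("Study B", "MOAS"),
    ("Carr et al.", "progress notes"),
    ("Carr et al.", "OAS"),
    ("Carr et al.", "Violent Incident Form"),
]

counts = Counter(instrument for _, instrument in uses)
print(counts)  # each of Carr et al.'s three instruments counted as a separate use
```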
Most Commonly Used Measures
Of the 15 measures identified through our review, the instruments most frequently used to measure violence on acute inpatient psychiatric units during the 10 years covered by our search were variants of the OAS, including the Retrospective Overt Aggression Scale (ROAS, also known as the OAS-M) (
19), Aggression Observation Short Form (AOS) (
20), Modified Overt Aggression Scale (MOAS) (
21), and Staff Observation Aggression Scale (SOAS) and its revised version (SOAS-R) (
22,
23), together accounting for 55 (74%) of the 74 studies using instruments to measure violence. Routinely collected data, including unstructured clinical notes, researcher questionnaires, special forms, and occurrence forms, were used in 19 (26%) of the 74 studies. The remaining measures, except for the Patient-Staff Conflict Checklist (PCC) (N=5, 7%), were each cited only once or twice: Social Dysfunction and Aggression Scale (
24), Violence Scale (
25), Violent Incident Form (
18), Report Form for Aggressive Behavior (
26), Aggression Scale (
27), Crisis Intervention Questionnaire (
28), Aggression Questionnaire (
29), and Positive and Negative Syndrome Scale–EC (
30). Because we were interested in measures that were most frequently used, we focused our subsequent analyses on routinely collected data, special forms, and occurrence reports and on OAS, ROAS, MOAS, AOS, SOAS/SOAS-R, and the PCC instruments.
Table 2 shows a comparison of the OAS variants and PCC in terms of domains measured, scoring, and use.
The OAS, ROAS, MOAS, AOS, SOAS/SOAS-R, and PCC instruments were created originally for use on inpatient psychiatric units (
17,
21–
23,
31–
33). The OAS, ROAS, and MOAS use Likert subscales to measure the severity of verbal aggression, object aggression, and aggression against self and against others. The SOAS and SOAS-R attempt to capture the frequency, nature, and severity of aggressive incidents. Data collected for SOAS/SOAS-R fall into five categories: what started or provoked the aggressive act, specific details about it, the aim or target of the act, the consequences for the target or victim, and measures that were used to stop the aggression (
34). Some studies using the SOAS/SOAS-R also used a visual analog scale, a subjective rating of violence severity as perceived by the person completing the instrument (
34). The PCC was developed initially for a comprehensive study of psychiatric inpatient violence in the United Kingdom, part of which examined types of conflict and consequent containment measures used by staff (
35). The PCC focuses more on staff-patient relationships than on collecting prevalence data. The PCC consists of 21 conflict behavior items (e.g., physical aggression or refusing medication) and eight containment measures (e.g., medication or restraints) (
36).
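To make the structure of these per-incident instruments concrete, the following minimal sketch (in Python) shows one way the five SOAS-R data categories described above could be captured electronically; the field names and the example entry are our illustrative assumptions, not the published form.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

# Hypothetical record mirroring the five SOAS-R data categories described above;
# field names are illustrative, not the wording of the published instrument.
@dataclass
class AggressiveIncident:
    timestamp: datetime
    provocation: str            # what started or provoked the aggressive act
    means_used: List[str]       # specific details about the act
    target: str                 # aim or target of the act
    consequences: str           # consequences for the target or victim
    containment: List[str]      # measures used to stop the aggression
    severity_vas: float = 0.0   # optional visual analog rating of perceived severity

# Hypothetical example entry
incident = AggressiveIncident(
    timestamp=datetime(2023, 5, 1, 14, 30),
    provocation="denied request to leave the unit",
    means_used=["verbal threat", "threw a chair"],
    target="staff member",
    consequences="staff member felt threatened; no physical injury",
    containment=["verbal de-escalation", "PRN medication"],
    severity_vas=4.5,
)
```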
The instruments differ chiefly in the clarity of their definitions and anchors, how frequently they are completed, the time frame they cover, the domains they measure, and the number of items to note and score. Each instrument provides at least a rudimentary definition of aggression and violence, with the MOAS, SOAS-R, and PCC having the most detailed anchors. Three measures, the OAS, SOAS, and SOAS-R, are meant to be completed for each incident, which in theory should lead to more accurate measures of violence prevalence. However, several studies mentioned that per-incident completion was burdensome for staff, particularly for the SOAS/SOAS-R, which includes >20 items, and that these measures often were not completed because of lack of time or inability to fit completion into the workflow (
34). Although the MOAS and ROAS take less staff time because they require assessment of only the past week, they are subject to inaccuracy due to recall bias. The much simpler AOS, in which staff note at the end of each shift whether each of three domains is absent or present for each patient, was developed to address both of these issues (
20). The paucity of studies using the AOS may reflect its having been developed much later than the other instruments.
The differences among instruments in the domains they measure are also related to the number of items that must be noted and scored. The OAS, ROAS, MOAS, and AOS each measure verbal aggression, the most common form of violence, whereas the SOAS/SOAS-R and PCC do not. The latter instruments document several potentially useful details about violent episodes; however, the time required to fill out these forms likely precludes their consistent use on a busy inpatient psychiatric unit.
How Measures Were Used
Our results showed that the uses of routine notes and occurrence reports, OAS variants, and the PCC (N=61, 82%) clustered into the following categories: assessing the effect of implementing violence risk assessment tools, assessing outcomes of interventions meant to reduce violence, measuring the incidence or prevalence of violence, and examining causes of or clinical correlates of violence. In most of these studies (N=38 of 61, 62%), the instruments were used to determine causes of or clinical correlates of aggression, and in 46% (N=28 of 61) of the studies, instruments were used to measure aggression prevalence or incidence. Overall, OAS variants were used more frequently than unstructured notes and the PCC to quantify prevalence or incidence (N=18 of 28, 64%).
Reliability and Validity
Although we found no reliability or validity scores for the unstructured tools, each of the four most commonly cited instruments had good interrater reliability: the OAS had an intraclass correlation coefficient (ICC) between 0.70 and 0.87; the MOAS, a Pearson’s r between 0.85 and 0.94; the SOAS-R, an ICC of 0.96 and an interrater reliability between 0.61 and 0.74; and the PCC, a κ of 0.69 and a Cronbach’s α of 0.73. We found no description of validity for the unstructured measures or the OAS; we noted multiple examples of discriminant validity in the literature for the MOAS and SOAS-R and a few for the PCC.
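For readers less familiar with these indices, the following minimal sketch (in Python, using entirely hypothetical ratings) illustrates how two of the agreement statistics cited above, Cohen’s κ and Pearson’s r, and an internal-consistency coefficient (Cronbach’s α) are commonly computed; it does not reproduce any of the cited studies’ analyses. ICCs require a variance-components model and are typically obtained from a dedicated routine (e.g., pingouin.intraclass_corr).

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import cohen_kappa_score

# Hypothetical severity scores assigned by two raters to the same 10 incidents.
rater_a = np.array([0, 2, 3, 1, 4, 2, 0, 3, 1, 2])
rater_b = np.array([0, 2, 2, 1, 4, 3, 0, 3, 1, 2])

# Cohen's kappa: chance-corrected agreement on the categorical ratings.
kappa = cohen_kappa_score(rater_a, rater_b)

# Pearson's r: linear association between the two raters' scores.
r, _ = pearsonr(rater_a, rater_b)

def cronbach_alpha(items: np.ndarray) -> float:
    """Internal consistency of a respondent-by-item score matrix
    (rows = rated incidents, columns = instrument items)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

# Hypothetical 4-item instrument scored for 6 incidents.
scores = np.array([
    [1, 0, 2, 1],
    [3, 2, 3, 2],
    [0, 0, 1, 0],
    [2, 1, 2, 2],
    [4, 3, 4, 3],
    [1, 1, 0, 1],
])

print(f"kappa={kappa:.2f}, r={r:.2f}, alpha={cronbach_alpha(scores):.2f}")
```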
Discussion
Our main finding was that the most commonly used tools to measure acute violence on inpatient psychiatry units between 2008 and 2018 were structured scoring instruments, specifically variants of the OAS. The field has long called for adoption of a gold standard and commonly used instruments, and our results support a trend toward using a few structured, well-researched instruments. Gothelf et al. (
37) reported that 44.7% of empirical studies of aggression among psychiatric patients published in seven major American and European peer-reviewed journals between 1985 and 1994 used unstructured instruments such as nursing notes and specially designed questionnaires. This observation is unsurprising given that many of the structured instruments were developed in the late 1980s and early 1990s. Our findings are consistent with Iozzino et al.’s meta-analysis (
38) spanning studies published in 1995–2014, showing that of 35 studies of violence incidence, seven used unstructured instruments to collect violence data and that OAS variants were used in 20 of 22 studies that used structured instruments. Campbell et al. (
39) concluded that the most frequently cited and researched instrument of psychiatric inpatient violence was the SOAS. In a 2020 systematic review by Li et al. (
40) of prevalence of aggression among persons diagnosed as having schizophrenia, their inclusion criteria included only studies that used the MOAS.
However, our findings must be interpreted with caution; for instance, a 2021 review by Odes et al. (
12) of frequency of inpatient violence against health care workers included 14 studies, all but one of which (
41) used unstructured instruments or nursing notes. A comprehensive 2011 review by Bowers et al. (
35) found that of 122 identified studies measuring incidence of psychiatric patient violence in various settings, 55% were “retrospective analyses of official incident records and/or patient notes,” and 45% were based on data from “surveys, interviews and observation recording instruments that had been designed for each study.” Our finding of a trend toward increased use of structured instruments may be explained by differences in scope and search criteria: the Odes et al. (
12) review spanned a much longer time frame and used more stringent search criteria, looking only at quantitative studies that reported baseline measures, whereas Bowers et al.’s review (
35) included studies conducted between 1960 and 2009, mostly before our selected time frame, and used much broader search criteria.
Although the observed trend toward increased use of standardized, well-researched instruments for measuring violence against psychiatric staff is promising, we note that the different instruments are not directly comparable, making it difficult to compare data among states or countries, or within or among institutions, to determine which instruments work well. One potential obstacle to adopting standard measures is that the instruments used to measure violence are cumbersome or time consuming for an already overly busy inpatient staff (
34,
42,
43). Nursing staff may find that adding one more rating scale would interfere with their usual workflow and that the effort of entering data may not be worthwhile (
18,
44–
47). Staff are willing to make temporary changes in their schedules to accommodate specific research projects, but for longer-term quality improvement purposes, workflow changes must be sustainable and must not place too much additional burden on already overtaxed staff (
42,
48,
49). The burden of filling out forms is one of the hypothesized reasons for the significant underreporting of inpatient violence (
42,
50). It is therefore important to ensure that the staff filling out the instruments find the data they collect valuable. For example, staff would find it useful to track violence data for individual patients over time to assess the effects of psychopharmacological or other therapeutic interventions or to highlight changes in behavior during team meetings. In addition, hospital administrators might use the data to redirect staffing to reduce the likelihood of assault and injury on units experiencing an increase in violence.
Our own experience highlights the importance of addressing these obstacles to wider implementation. After an initial scoping review of instruments for recording and scoring violence and aggression incidents, we piloted the SOAS-R instrument at our hospital because it was one of the most commonly used instruments, had a good evidence base in the literature, and collected more comprehensive data about each incident. After receiving only two completed SOAS-R forms in 2 months, we concluded that the SOAS-R’s length and complexity, coupled with the high frequency of violent events on the units, made it too burdensome for already overtaxed nurses to complete. This observation was not surprising, given Gifford and Anderson’s (
51) findings regarding the obstacles to and facilitators of nurses’ reporting of assault. Impediments to reporting included lack of time to complete the forms, lack of clarity about the reporting process itself, and forms that were not well designed or easily accessible.
We were surprised that the nurses at our hospital asked us to implement a simpler, more general instrument. This request prompted us to pilot an adaptation of the MOAS instrument. The MOAS is short, requires responses in four categories, and includes verbal aggression. Because of its brevity, it fits more easily into the staff workflow at the end of each shift or day. The MOAS is also relatively simple to incorporate into an electronic health record, which allows for graphical analysis to assess trends and to provide immediate feedback to teams working with patients. The data could also be analyzed to assess how the MOAS performs with respect to risk assessment, staffing patterns, and response to violence interventions such as deescalation, seclusion, restraint, and medications. Moreover, the MOAS has established validity and reliability (
15,
21,
52). There is precedent for such use, because the MOAS has been used immediately after an incident rather than retrospectively (
41,
53) and is currently being used as part of a strategy to reduce inpatient violence at Stony Brook University Hospital, an acute care inpatient psychiatric hospital (
54).
The MOAS was designed as a retrospective instrument that asks staff members to recall episodes of violence from the past 7 days, an approach that is subject to recall bias and can lead to inaccurate measurement. Therefore, a major change we made to the MOAS was to use it for real-time scoring at each work shift or day, which also partially addressed the limitation that the MOAS does not tally incident frequency. Although it would be ideal to use an instrument that also captures exact incident frequency, keeping track of the number of episodes of verbal violence, in particular, on acute inpatient units would place too much burden on staff. With the adoption of portable technology, recording of incident frequency might eventually be incorporated into a more comprehensive measure that fits into the daily workflow of staff. In our hospital, the MOAS is now embedded in the nurses’ shift progress note in the electronic record. Because changing the MOAS from retrospective to real-time incident recording is likely to affect its reliability and validity, we devised an easily accessible, brief online training module based on vignettes to help standardize scoring, and we will study the modified instrument’s reliability and validity as we gather more data. We plan to report on the usefulness, reliability, and validity of this modified MOAS in a future article.
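To make the per-shift workflow concrete, the sketch below (in Python) shows one possible way to represent real-time scoring of the four MOAS domains at each shift and to aggregate a weighted total for trend display in the record. The 1–4 domain weights reflect the commonly published MOAS scoring, and the field names are illustrative assumptions, not the exact form embedded in our electronic record.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List

# The 1-4 domain weights follow the commonly published MOAS scoring;
# they and the field names below are assumptions for this sketch, not
# the exact configuration used in our electronic record.
WEIGHTS = {"verbal": 1, "against_objects": 2, "against_self": 3, "against_others": 4}

@dataclass
class ShiftMOAS:
    patient_id: str
    shift_date: date
    shift: str                  # e.g., "day", "evening", "night"
    verbal: int = 0             # each domain rated 0-4 at the end of the shift
    against_objects: int = 0
    against_self: int = 0
    against_others: int = 0

    def weighted_total(self) -> int:
        return sum(weight * getattr(self, domain) for domain, weight in WEIGHTS.items())

def daily_totals(entries: List[ShiftMOAS]) -> Dict[date, int]:
    """Aggregate weighted totals by date for a simple trend display."""
    totals: Dict[date, int] = {}
    for entry in entries:
        totals[entry.shift_date] = totals.get(entry.shift_date, 0) + entry.weighted_total()
    return totals

# Hypothetical example: two shifts for one patient on the same day.
entries = [
    ShiftMOAS("pt-001", date(2023, 5, 1), "day", verbal=2, against_objects=1),
    ShiftMOAS("pt-001", date(2023, 5, 1), "night", verbal=1),
]
print(daily_totals(entries))  # weighted totals of 4 and 1 sum to 5 for the day
```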
This review had several limitations. The search strategy was probably not fully inclusive; we may have missed studies, including conference proceedings, abstracts, books, and studies in languages other than English, and we did not make inquiries to specific experts. To minimize omission of studies, we conducted the review with the aid of university library services. The included articles were diverse in design and intent, with differences in methodological quality, interventions, and types of patients studied, and some of the instruments may be more useful in particular types of studies. For example, we included research studies that used measures collecting more details about specific episodes for a short period to answer specific research questions, along with longer-term studies that measured episodes in general to capture overall prevalence. We may also have inadvertently misclassified some studies as independent when they were duplicates; we did not contact authors to verify our assumptions. We did not assess bias in study inclusion or methodological quality, because these considerations are less important in a study investigating the frequency of use of an instrument than in a study that, for example, examines the effectiveness of an intervention. Finally, we note potential subjectivity bias because only one author reviewed the database citation results, selected articles for review, and reviewed the articles.