TO THE EDITOR: Simon et al.’s report (1) concerning the ability of item 9 of the nine-item Patient Health Questionnaire (PHQ-9) to predict suicidal behavior of patients with psychotic disorders asserts that “response to item 9 of the PHQ-9 was a strong predictor of suicide attempt” and “endorsement of frequent thoughts of death or self-harm [i.e., on item 9] certainly indicates an increased risk of self-harm.” Data presented in the report provide grounds for disputing such conclusions. Given that the authors stated that their findings “should have practical utility” and “have implications for clinicians,” I comment from the standpoint of a practitioner.
If I make a prediction about a patient’s short-term risk for suicidal behavior, I want my prediction at the very least to be more likely correct than incorrect. But on the basis of Simon et al.’s results, were I to predict that a patient who endorsed thoughts of death or self-harm on item 9 would make a suicide attempt in the 90 days following that endorsement, I would have to expect my prediction to be incorrect most of the time.
Item 9 asks, “Over the past 2 weeks, how often have you been bothered by thoughts that you would be better off dead or of hurting yourself in some way?” Responses include “not at all,” “several days,” “more than half the days,” and “nearly every day.”
Simon et al.’s data show that 98% of the time a patient who endorsed any thoughts of self-harm on item 9 would be free of suicide attempts for the subsequent 90 days. Similarly, 96.5% of the time a patient would be free of suicide attempts for the next 90 days even after responding “nearly every day” to item 9. Hence, to maximize the chance that I would be correct, I would want to predict that a patient would be safe from suicide attempts even when that patient endorsed thoughts of self-harm on item 9.
In addition, I note that in Simon et al.’s sample more than half of the 90-day periods found to include a suicide attempt were associated with “not at all” responses to item 9.
What makes item 9 frankly worthless for predicting suicidal behavior is the low prevalence of suicide attempts and deaths by suicide (in Simon et al.’s study, .024 and .002, respectively). When events we wish to predict occur so rarely, it is impossible to construct a test that will provide a useful level of positive predictive power. Although the authors state that larger samples are needed to address the relationship between item 9 response and suicide mortality, with similar prevalence the same poor predictive power would be obtained regardless of how large a sample was studied.
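The arithmetic behind this point can be sketched with Bayes' rule. The short Python example below is illustrative only: the two prevalence figures are the ones quoted above, but the sensitivity and specificity values are hypothetical round numbers chosen for illustration and are not taken from Simon et al.'s report; the function and variable names are likewise my own.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability that a positive screen is followed by the event (Bayes' rule)."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Prevalence figures quoted in the letter (per 90-day period).
PREVALENCE_ATTEMPT = 0.024
PREVALENCE_DEATH = 0.002

# ASSUMPTION: hypothetical test characteristics, not values from the study.
HYPOTHETICAL_SENSITIVITY = 0.90
HYPOTHETICAL_SPECIFICITY = 0.90

ppv_attempt = positive_predictive_value(
    PREVALENCE_ATTEMPT, HYPOTHETICAL_SENSITIVITY, HYPOTHETICAL_SPECIFICITY
)
ppv_death = positive_predictive_value(
    PREVALENCE_DEATH, HYPOTHETICAL_SENSITIVITY, HYPOTHETICAL_SPECIFICITY
)

print(f"PPV for suicide attempt: {ppv_attempt:.3f}")
print(f"PPV for death by suicide: {ppv_death:.3f}")
```

Under these assumed test characteristics, roughly 18% of positive screens would be followed by a suicide attempt and fewer than 2% by a death by suicide. Sample size does not appear anywhere in the calculation, which is why a larger study with similar prevalence would not improve the figure.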
Researchers are in a position to query their data after the statuses of their observations have been established. Practitioners are in a very different position and need a way to confidently judge a patient’s risk of suicide attempt. The results found by Simon et al. and by other researchers (2, 3) clearly demonstrate that item 9 of the PHQ-9 is in itself useless with respect to satisfying that need.