To the Editor: We read with interest the article by Karen Dineen Wagner, M.D., Ph.D., et al.
(1) in the June issue. In their study comparing citalopram with placebo, we were surprised to find the authors reporting an overall effect size of 2.9. By the commonly cited criteria set forth by Cohen, effect sizes can be considered trivial (<0.2), small (0.2 to <0.5), moderate (0.5 to <0.8), or large (≥0.8).
By these metrics, the reported effect size can be characterized as gargantuan, big bang-worthy. The value does not appear to be a benign typographical error for “0.29,” given that “2.9” appears twice. An accurate effect size cannot be calculated by hand from the information provided in the article. However, to arrive at an effect size of 2.9, a pooled standard deviation of the change score of roughly 2.1 would have been required. So narrow a standard deviation of the change score seems improbable (a manual calculation with the Ns and standard deviations in the article yields a value of 15.6, for an effect size of 0.4). Moreover, such a low standard deviation of the change score would imply a uniformity of response that is inconsistent with comparable studies.
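For readers wishing to retrace the arithmetic, a minimal sketch follows, assuming the conventional standardized mean difference of change scores (Cohen's d); the implied between-group difference of roughly six points is inferred from the figures above rather than taken directly from the article:

\[
d \;=\; \frac{\Delta \bar{X}_{\text{citalopram}} - \Delta \bar{X}_{\text{placebo}}}{SD_{\text{pooled, change}}},
\qquad
SD_{\text{pooled, change}} \;=\; \frac{\Delta \bar{X}_{\text{citalopram}} - \Delta \bar{X}_{\text{placebo}}}{d}
\;\approx\; \frac{0.4 \times 15.6}{2.9} \;\approx\; 2.2,
\]

which is close to the 2.1 estimated above; the same numerator divided by the hand-calculated pooled standard deviation of 15.6 yields an effect size of approximately 0.4.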
We surmise one of two possibilities. The first is that a simple arithmetic mistake occurred and was not picked up, despite otherwise meticulous attention to detail. A trickster decimal point may be to blame, and a demoted effect size of 0.29 may gain in honesty what it loses in the sex appeal of an inflated 2.9 status. A smaller effect size seems more plausible, and not only because a meta-analysis of 33 trials of selective serotonin reuptake inhibitors (SSRIs) for the treatment of adult depression
(2) arrived at a pooled effect size of 0.4, but also because the current study, although statistically significant, was not
that clinically impressive. Only 36% of the patients treated with citalopram responded, compared with 24% of those receiving placebo (for a lukewarm number needed to treat of 8). These results, while modest, are respectable in their own right and nothing to sneeze at in a clinical area that has been short on proven therapeutic options. But a majestic sequoia of 2.9 they are not.
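As a sketch of the standard calculation, using only the response rates quoted above, the number needed to treat is the reciprocal of the absolute difference in response rates:

\[
\mathrm{NNT} \;=\; \frac{1}{0.36 - 0.24} \;=\; \frac{1}{0.12} \;\approx\; 8.
\]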
Alternatively, the authors may have used a different definition or formula to calculate the effect size. That would be unfortunate, because the basic job description of an effect size is to facilitate communication among investigators and across measures. The gargantuan 2.9 becomes a jarring screech of nails on a chalkboard: it robs from the melody of welcome that this timely contribution otherwise merits.