Patient Accountability
In the first part of the trial, 169 patients were randomly assigned to double-blind treatment (WS 5570: N=84; placebo: N=85), and 206 patients were randomly assigned in the second part after the interim analysis (WS 5570: N=102; placebo: N=104). Therefore, totals of 186 and 189 patients were randomly assigned to treatment with WS 5570 and placebo, respectively, and were included in the intention-to-treat analysis. After random assignment, 18 patients in the WS 5570 group (9.7%) and 25 in the placebo group (13.2%) terminated treatment prematurely. The primary reasons for early withdrawal were lack of efficacy (WS 5570: N=10, 5.4%; placebo: N=14, 7.4%), revocation of informed consent (WS 5570: N=4, 2.2%; placebo: N=7, 3.7%), and adverse events (WS 5570: N=2, 1.1%; placebo: N=2, 1.1%). The per-protocol analysis of all patients without major protocol violations included 164 patients in the WS 5570 group and 157 patients in the placebo group. The decisions with respect to the relevance of the protocol deviations were made before the code was broken.
Efficacy
Between the start of randomized treatment and day 42, both groups’ average total scores on the Hamilton depression scale decreased monotonically (Figure 1). Starting from baseline mean values of 21.9 points (SD=1.7) in both treatment groups, the Hamilton depression total score decreased during the treatment phase by a mean of 9.9 points (SD=6.8) in the WS 5570 group and by 8.1 points (SD=7.1) in the placebo group (pooled data from both study stages; last observation carried forward).
In the confirmatory hypothesis testing for the primary outcome measure for the first study stage, i.e., the interim analysis for the intention-to-treat data set, the null hypothesis relating to the difference between treatment groups in the decrease in total score on the Hamilton depression scale between baseline and day 42 was associated with a one-sided p value of p1=0.037 (t=1.80, df=167). Since this p value lay between the boundaries for early rejection and acceptance, the trial was continued with a second stage. The required number of subjects was reestimated on the basis of the results of the interim analysis. The patients in the second stage showed a one-sided p value of p2=0.038 (t=1.78, df=204). Therefore, the product of p values for the final combination test fell below the critical limit (0.037·0.038=0.0014<0.0038), and so the null hypothesis was rejected, and the superiority of extract WS 5570 over placebo was demonstrated for a treatment duration of 6 weeks. For the pooled data from both study stages the two-sided p value for the change in Hamilton depression score between baseline and day 42 was p=0.01 (t=2.50, df=373). For the comparisons of the two treatment groups in terms of change from baseline in Hamilton scale total score at days 7, 14, and 28, the t test results were nonsignificant. A repeated measures ANOVA with independent variables of treatment and time and an interaction term was used to compare the postbaseline Hamilton depression scores of the two treatment groups and demonstrated a significant time-by-treatment interaction (F=3.41, df=4, 1492; Greenhouse-Geisser epsilon=0.58, two-sided Greenhouse-Geisser-corrected p=0.03).
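The final combination test can be illustrated with a minimal sketch of Fisher's product criterion. It assumes an overall one-sided significance level of 0.025, which this section does not state, and it uses the fact that the product of two independent uniform p values satisfies P(p1·p2 ≤ c) = c(1 − ln c). The function names are illustrative, and the early-stopping boundaries of the interim analysis are not modeled here.

```python
import math


def product_cdf(c: float) -> float:
    """P(U1 * U2 <= c) for two independent Uniform(0, 1) variables, 0 < c <= 1."""
    return c * (1.0 - math.log(c))


def fisher_critical_value(alpha: float, tol: float = 1e-12) -> float:
    """Solve c * (1 - ln c) = alpha for c by bisection (the CDF is increasing in c)."""
    lo, hi = 1e-300, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if product_cdf(mid) < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Stage-wise one-sided p values as reported in the text.
p1, p2 = 0.037, 0.038

# Assumed overall one-sided level; not stated in this section.
c_alpha = fisher_critical_value(0.025)
product = p1 * p2
reject = product < c_alpha

print(f"critical value = {c_alpha:.4f}")
print(f"p1 * p2 = {product:.4f}, reject H0: {reject}")
```

Under this assumed level, the computed critical value rounds to the limit of 0.0038 quoted in the text, and the product of the two stage-wise p values falls below it.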
These results were confirmed in the per-protocol analysis, in which both treatment groups had the same mean decreases in Hamilton depression scale total score between baseline and day 42 as in the intention-to-treat analysis (t=2.31, df=319, p=0.02, two-sided t test).
In the intention-to-treat study group, the percentage of responders (those with at least 50% decreases in Hamilton score between baseline and treatment end) was significantly higher for WS 5570 (52.7%, 98 of 186) than for placebo (42.3%, 80 of 189) (χ2=4.04, df=1, p<0.05, two-sided). Furthermore, the percentage of patients with remission (score of 6 or less on Hamilton scale at treatment end) was significantly higher for the active treatment group (24.7%, 46 of 186) than for placebo (15.9%, 30 of 189) (χ2=4.55, df=1, p=0.03, two-sided).
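The responder and remission comparisons can be checked from the reported counts with a short sketch. It assumes a Pearson χ2 test on the 2×2 table without continuity correction, which is an assumption here (the text does not specify the variant), and the function name is illustrative. For df=1, the two-sided p value equals erfc(√(χ2/2)).

```python
import math


def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Rows: WS 5570, placebo. Columns: outcome reached, outcome not reached.
chi2_resp, p_resp = pearson_chi2_2x2(98, 186 - 98, 80, 189 - 80)
chi2_rem, p_rem = pearson_chi2_2x2(46, 186 - 46, 30, 189 - 30)

print(f"response:  chi2 = {chi2_resp:.2f}, p = {p_resp:.3f}")
print(f"remission: chi2 = {chi2_rem:.2f}, p = {p_rem:.3f}")
```

Under this assumption, the statistics agree with the reported values of 4.04 and 4.55 to two decimal places.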
A secondary outcome measure was the change in total score on the Montgomery-Åsberg Depression Rating Scale between baseline and treatment end. The mean decrease was 11.7 points (SD=9.0) for the WS 5570 group and 9.9 points (SD=9.2) for the placebo group (intention-to-treat analysis: t=1.90, df=373, p=0.06, two-sided t test). The depression subscore of the SCL-58 (11 items) showed a mean reduction of 7.9 points (SD=8.7) for the WS 5570 group and 6.5 points (SD=8.4) for the placebo group (intention-to-treat analysis: t=1.57, df=366, p=0.12, two-sided t test).
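As a rough plausibility check, the t statistic for the Montgomery-Åsberg comparison can be approximated from the reported means and standard deviations. The sketch below assumes a pooled-variance two-sample t test and the intention-to-treat group sizes of 186 and 189. Because the inputs are rounded summary values, the result only approximates the reported t=1.90. The function name is illustrative.

```python
import math


def pooled_t(mean1: float, sd1: float, n1: int,
             mean2: float, sd2: float, n2: int) -> tuple[float, int]:
    """Two-sample t statistic with pooled variance, computed from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / se, df


# Mean decrease in Montgomery-Asberg total score (WS 5570 vs. placebo).
t_madrs, df_madrs = pooled_t(11.7, 9.0, 186, 9.9, 9.2, 189)

print(f"t = {t_madrs:.2f}, df = {df_madrs}")
```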
Table 2 indicates the relationship between the initial severity of depression and the magnitude of the treatment effect. The difference between WS 5570 and placebo in the decrease in Hamilton depression scale total score between baseline and the final visit was larger in the subgroup of patients with initial scores equal to or above the median value of 22 points. In this subgroup, the decrease for the patients receiving WS 5570 was significantly greater than the decrease for the patients receiving placebo (t=2.08, df=225, p=0.04, two-sided t test), but among the patients with initial Hamilton depression scores between 18 and 21, the decrease with WS 5570 did not differ significantly from that for the placebo group (t=1.50, df=146, p=0.14, two-sided t test).
The score on the Bech melancholia subscale decreased by a mean of 5.5 points (SD=4.2) in the WS 5570 group and by 4.4 points (SD=4.1) in the placebo group (intention-to-treat analysis: t=2.60, df=373, p=0.01, two-sided t test).