Researchers have suggested that psychotic depression is a distinct subtype of depression that is semi-independent of severity (1–3). Recognizing this, DSM-5 separated psychosis and severity in the diagnostic coding. Psychotic depression is defined by the presence of delusions or hallucinations in an episode of major depression and is associated with various distinctive features. Psychotic depression is less responsive to tricyclic antidepressants alone (4, 5) and has a long-term mortality rate twice that of nonpsychotic major depression, even among hospitalized patients (6). Although most studies have not found greater rates of a family history of major depression in patients with psychotic major depression, two studies in which family members were interviewed found greater rates of psychotic major depression among the relatives (7, 8). The frequency of hypercortisolemia, defined by a positive dexamethasone suppression test, has been found to be much higher in inpatients with psychotic depression than in those with nonpsychotic major depression (64% compared with 41%) (9). In fact, Schatzberg et al. (10) have suggested that elevated cortisol levels in psychotic major depression play a role in producing psychosis.
We performed a systematic review of the literature examining the course of illness in psychotic depression, performed a meta-analytic summary of the frequency of psychosis in prior or subsequent episodes in study subjects with psychotic or nonpsychotic major depression, and determined the frequency of psychosis across all episodes of depression in the two subgroups. The hypothesis was that the prevalence of psychosis in prior or subsequent episodes and the overall frequency of psychosis among all episodes would be higher in patients with psychotic depression than in those with nonpsychotic depression.
Method
Searches of PubMed, Embase, and PsycINFO were conducted from database inception to June 11, 2015. Two searches were performed; in the first, the terms “psychotic or delusional depression” were combined with “course of illness,” and in the second, with “clinical characteristics.” Although the term “clinical characteristics” is broad, this was necessary because “course of illness” may have been one of several characteristics examined and not necessarily been highlighted. Citations were merged and duplicates removed. Abstracts were reviewed to exclude unrelated articles. Potentially relevant articles were reviewed in full. The bibliographies of relevant articles were reviewed for other potential studies. The search was repeated on May 10, 2017, before we submitted this article for publication, and no relevant new studies were identified.
Studies were selected if they 1) identified patients with unipolar major depression using DSM-III, DSM-III-R, or DSM-IV criteria, Research Diagnostic Criteria (RDC) (14), or the Washington University criteria (Feighner criteria) (15); 2) identified psychotic and nonpsychotic patients based on the presence of delusions or hallucinations; 3) examined past or subsequent episodes and characterized them as psychotic or nonpsychotic; and 4) were published in English. Studies that included bipolar depression or schizoaffective disorder and did not report the findings for these patients separately were excluded. For studies that reported on the course of illness in psychotic and nonpsychotic depressed patients but did not include the necessary data, we attempted to obtain data from the authors.
Episodes were judged to be psychotic or nonpsychotic based on the presence of delusions or hallucinations. Studies prior to DSM-III-R that classified cases as psychotic based on the presence of stupor were not excluded, but cases with stupor were documented as such. Studies could include patients with only mood-congruent symptoms or both mood-congruent and mood-incongruent features. Studies used either RDC or DSM criteria to exclude patients with schizoaffective disorder. Studies could be either retrospective or prospective. The episode at the time of study entry was considered the index episode and was used to define the psychotic and nonpsychotic groups. For the analysis of prior and subsequent episodes, the studies had to determine that episodes were distinct (as opposed to a continuation of a prior episode).
Statistical Analysis
Two meta-analyses were performed. In the first, the risk of any prior or subsequent psychotic major depressive episode was compared in patients whose index episode was psychotic or nonpsychotic. The number at risk was limited to those with recurrent episodes, with the exception of one study in which that number was not available, so the total number of patients was applied for that study. In the second analysis, the risk of psychosis in all episodes (prior or subsequent) was compared in the patients whose index depressive episodes were psychotic and those whose index episodes were nonpsychotic. The meta-analyses were performed using a random-effects model, and results were expressed as risk ratios with their 95% confidence intervals, a test of significance (Wald z), the number of contrasts (N), and p values. Chi-square tests and the I² statistic derived from the chi-square values were used to test heterogeneity among the contrasts. A chi-square p value below 0.20 and an I² of at least 50% were taken as indicators of heterogeneity of outcomes.
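The pooling described above can be sketched as follows. This is a minimal illustration, not the software or estimator actually used for this review: it assumes the DerSimonian-Laird estimator of between-study variance (the text specifies only "a random-effects model"), a 0.5 continuity correction for zero cells, and a normal approximation for the confidence interval and Wald z. The function names and the event counts in the example are hypothetical.

```python
import math

def log_rr_and_var(a, n1, c, n2):
    """Log risk ratio and its large-sample variance for a/n1 versus c/n2."""
    if a == 0 or c == 0:
        # Continuity correction for zero cells (an assumed convention).
        a, c, n1, n2 = a + 0.5, c + 0.5, n1 + 1, n2 + 1
    log_rr = math.log((a / n1) / (c / n2))
    var = 1 / a - 1 / n1 + 1 / c - 1 / n2
    if var <= 0:
        raise ValueError("variance is not positive; study cannot be weighted")
    return log_rr, var

def pool_random_effects(studies):
    """Pool (events_1, n_1, events_2, n_2) tuples with DerSimonian-Laird weights."""
    if len(studies) < 2:
        raise ValueError("need at least two studies")
    ys, vs = zip(*(log_rr_and_var(*s) for s in studies))
    w = [1 / v for v in vs]                      # fixed-effect (inverse-variance) weights
    y_fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, ys))  # Cochran's Q
    df = len(ys) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                # between-study variance
    w_re = [1 / (v + tau2) for v in vs]          # random-effects weights
    y_re = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    z = y_re / se                                # Wald z
    p = math.erfc(abs(z) / math.sqrt(2))         # two-sided p, normal approximation
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "rr": math.exp(y_re),
        "ci": (math.exp(y_re - 1.96 * se), math.exp(y_re + 1.96 * se)),
        "z": z, "p": p, "q": q, "i2": i2, "tau2": tau2,
    }

if __name__ == "__main__":
    # Hypothetical counts: (psychotic events, n, nonpsychotic events, n) per study.
    hypothetical = [(30, 40, 3, 100), (10, 20, 8, 40), (12, 30, 4, 60)]
    print(pool_random_effects(hypothetical))
```

Pooling is done on the log scale because the sampling distribution of the log risk ratio is closer to normal than that of the risk ratio itself; the pooled estimate and its interval are exponentiated only at the end.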
Planned secondary analyses included a comparison of retrospective and prospective studies. Correlations of risk ratios with mean age of the sample and year of publication, weighted for sample size, were computed. Exploratory analyses were undertaken to determine whether inclusion of patients with mood-incongruent psychotic symptoms, studies with older patients, or early studies versus later studies influenced risk ratios or absolute risk of psychosis. A funnel plot was examined, and an Egger test was performed (16).
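The Egger test can be illustrated with a short sketch. In its usual form, each study's standardized effect (effect divided by its standard error) is regressed on its precision (the reciprocal of the standard error), and an intercept far from 0 suggests funnel plot asymmetry. The code below is an assumption about how such a test could be computed, not a description of the procedure used here; the function name and inputs are hypothetical, and the p value (from a t distribution with k−2 degrees of freedom) is left to a statistics library.

```python
import math

def egger_intercept(effects, ses):
    """Ordinary least squares of effect/se on 1/se; returns the intercept test."""
    k = len(effects)
    if k < 3:
        raise ValueError("need at least three studies")
    x = [1 / s for s in ses]                       # precision
    y = [e / s for e, s in zip(effects, ses)]      # standardized effect
    mx, my = sum(x) / k, sum(y) / k
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in resid) / (k - 2)      # residual variance
    se_int = math.sqrt(s2 * (1 / k + mx ** 2 / sxx))
    t = intercept / se_int if se_int > 0 else float("nan")
    return {"intercept": intercept, "se": se_int, "t": t, "df": k - 2}
```

With log risk ratios as the effects, an intercept near 0 is what would be expected when smaller studies do not report systematically larger effects than larger ones.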
Results
The search (charted in Figure 1) identified 4,064 nonduplicate articles. Of these, 3,920 were excluded because they did not include patients with psychotic depression or did not examine course of illness. The remaining 144 articles were reviewed in full. (Reasons for exclusion of nine studies that examined course of illness [17–25] are detailed in Table S1 in the online supplement.) The data from two Coryell et al. studies (13, 26) were included after patients with bipolar disorder were excluded.
Twelve studies met the selection criteria and were included in our analyses (1, 13, 26–35) (Table 1). The studies, which were conducted over 35 years (1981–2015), included 546 patients with unipolar psychotic depression and 1,583 patients with nonpsychotic depression. Approximately two-thirds of the patients were female, and the mean age of the pooled samples was 49 years. Two studies were limited to older patients. Six studies limited patient selection to patients with mood-congruent psychotic symptoms, and six studies included patients with both mood-congruent and mood-incongruent features. Half the studies were retrospective and half were prospective.
Seven studies assessed the risk of a prior or subsequent psychotic episode in patients with psychotic and nonpsychotic index depressive episodes (1, 26, 27, 29, 32, 34, 35) (Figure 2). The studies included 273 patients with psychotic index episodes and 1,020 with nonpsychotic index episodes. The pooled risk ratio was 9.98 (95% CI=4.75, 20.94; z=6.08, p<0.001). There was significant heterogeneity (I²=80%, χ²=30.32, p<0.001). The simple pooled risk of a patient having at least one prior or subsequent psychotic episode was 65.3% for the patients with index psychotic episodes and 5.1% for those with nonpsychotic index episodes. When the single study with incomplete data for recurrence was excluded (35), the simple pooled risks of prior and subsequent psychotic episodes were 75.3% and 7.1% for patients with psychotic and nonpsychotic index depressive episodes, respectively.
Eight studies assessed the risk of psychosis among all episodes of depression in the patients with psychotic and nonpsychotic index episodes (1, 13, 27–31, 33) (Figure 3). The patients with psychotic index episodes had a total of 500 depressive episodes, and the patients with nonpsychotic index episodes had 1,172 depressive episodes. The pooled risk ratio was 7.24 (95% CI=5.03, 10.43; z=10.63, p<0.001). Heterogeneity was moderate (I²=54%) and significant (χ²=15.32, p=0.03). The pooled percentage of episodes that were psychotic was 64% in the patients with psychotic index depressive episodes and 6.1% in those with nonpsychotic index episodes.
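The heterogeneity and significance statistics reported for the two meta-analyses can be cross-checked from the published figures alone. The sketch below assumes the standard definition I² = (Q − df)/Q with df equal to the number of studies minus 1, and it recovers the standard error of the log risk ratio from the width of the 95% confidence interval; both function names are hypothetical, and small discrepancies are expected because the inputs are rounded.

```python
import math

def i_squared(q, k):
    """I² (percent) from a heterogeneity chi-square q across k studies."""
    df = k - 1
    return max(0.0, (q - df) / q) * 100

def wald_z_from_ci(rr, lo, hi, crit=1.959964):
    """Wald z for a risk ratio, with the SE recovered from its 95% CI."""
    se = (math.log(hi) - math.log(lo)) / (2 * crit)
    return math.log(rr) / se

# First analysis: seven studies, chi-square 30.32, RR 9.98 (4.75, 20.94).
print(i_squared(30.32, 7), wald_z_from_ci(9.98, 4.75, 20.94))
# Second analysis: eight studies, chi-square 15.32, RR 7.24 (5.03, 10.43).
print(i_squared(15.32, 8), wald_z_from_ci(7.24, 5.03, 10.43))
```

Under these assumptions the recomputed I² values round to the reported 80% and 54%, and the recomputed z values fall close to the reported 6.08 and 10.63.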
Secondary analyses were performed for studies that examined the frequency of psychosis in all episodes in the patients with psychotic and nonpsychotic index depressive episodes. These analyses revealed that the distinction between retrospective and prospective study designs did not significantly affect the risk ratio and had only a modest effect on the percentage of episodes that were psychotic (see Figure S1 in the online supplement). Neither mean age nor year of publication was significantly correlated with the risk ratio (r=0.43, p=0.34 and r=0.47, p=0.24, respectively).
Exploratory analyses revealed that the risk ratio for psychosis in all episodes did not differ significantly between the early studies and the later studies; however, the absolute risk of psychosis in the psychotic patients was higher in the early compared with the later studies (83.2% and 52.1%, respectively) (see Figure S2 in the online supplement). Heterogeneity in the early studies was very low. In the comparison of the two studies of older patients with the six mixed-age studies, the older samples had an apparently higher risk ratio (13.51 compared with 6.56), but the difference was not significant (χ²=2.90, df=1, p=0.09) (see Figure S3 in the online supplement). The absolute risk of psychosis among all episodes in patients with a psychotic index episode was lower in the studies of older patients than in the mixed-age studies (44.4% and 71.8%, respectively), and rates of psychosis in the patients with nonpsychotic index episodes were low (3.0% and 7.0%, respectively). Among studies examining risk of psychosis in all episodes, five limited subject selection to psychotic patients with mood-congruent features and three included patients with both mood-congruent and mood-incongruent features. The risk ratio in the mood-congruent subgroup was lower than that in the three studies that included patients with both mood-congruent and mood-incongruent features (5.62 compared with 13.87; χ²=10.46, df=1, p=0.001) (see Figure S4 in the online supplement), but the pooled percentage of all episodes that were psychotic was higher in the studies of patients with only mood-congruent features than in those of patients with both mood-congruent and mood-incongruent features (79.6% compared with 43.3%). The relative difference in rates of psychosis in the nonpsychotic patients was even greater (13.6% in the studies of patients with mood-congruent features compared with 2.9% in the studies of patients with both mood-congruent and mood-incongruent features). The higher risk ratio in the latter subgroup therefore did not reflect a higher rate of psychosis in the patients with psychotic index episodes, but rather a much lower rate of psychosis in the nonpsychotic patients in that subgroup. In this analysis, heterogeneity was low in both subgroups. Finally, in all these exploratory analyses, the difference in risk of psychosis among all episodes in the subgroup with psychotic index episodes compared with the subgroup with nonpsychotic index episodes was always robust and statistically significant regardless of which factors defined the subgroups.
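Each subgroup comparison above is a chi-square test with one degree of freedom, so its p value can be recovered from the statistic without special software. The sketch below uses the identity that a chi-square variable with 1 degree of freedom is the square of a standard normal variable; the function name is hypothetical.

```python
import math

def chi2_p_df1(x):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    # P(chi-square_1 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(x / 2))

print(chi2_p_df1(2.90))   # older versus mixed-age studies
print(chi2_p_df1(10.46))  # mood-congruent versus mixed-congruence studies
```

The recomputed values round to the reported p=0.09 and p=0.001.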
A funnel plot (Figure 4) did not appear asymmetric on inspection, and the Egger analysis revealed that the intercept did not deviate significantly from 0.
Discussion
Both meta-analyses were consistent with the hypothesis that psychotic depression runs true to form. The risk ratios of 7 and 10 are large. With the exception of one small, underpowered study (N=24), each individual study found a significantly higher risk of psychosis among patients whose index depressive episode was psychotic compared with those whose index episode was nonpsychotic. The funnel plot and the Egger analysis did not suggest reporting bias.
These findings suggest stability of the diagnosis of psychotic major depressive disorder. Coryell et al. (13) reported greater stability of the psychotic subtype of major depression than other suggested subtypes (endogenous depression or agitated/retarded depression). Stability of the symptoms has been considered central to defining a valid disorder. The Washington University group cited five features that were characteristic of a disorder (15). The findings of the present study bear on two of those features: clinical description and clinical course. The clinical features of delusions and hallucinations distinguish the subtype, and the stability of the presentation over time suggests consistency in the presentation of the disorder.
Heterogeneity was found in both meta-analyses. This means that other factors not accounted for may contribute to the variability among studies. Use of retrospective versus prospective methods did not affect the risk ratio or explain heterogeneity. The date of publication was not associated with the risk ratio, but risk of psychosis was lower in later studies. Risk ratios were higher in studies with older patient populations and in studies that included patients with both mood-congruent and mood-incongruent symptoms; however, the higher risk ratios appeared to be the result of very low rates of psychosis in the nonpsychotic index patients rather than an elevated risk of psychosis in the psychotic index patients. There was a suggestion of an interaction among these factors; studies with later publication dates were more likely to be prospective and were more likely to include both patients with mood-congruent symptoms and patients with mood-incongruent symptoms. Unfortunately, the number of studies examining risk of psychosis in all episodes (N=8) or risk of any psychosis in prior or subsequent episodes (N=7) was too small for multivariate analysis to be performed. Nevertheless, these exploratory analyses indicated that risk of psychosis in patients whose index depressive episode was psychotic remained high regardless of the factors examined.
Another potential source of heterogeneity is depression severity. The question is whether psychotic depression is merely a more severe form of depression. A full discussion of this issue extends well beyond the scope of this report. A more pertinent question is whether illness severity could explain the findings of this meta-analysis. Nine of the 12 studies compared psychotic and nonpsychotic inpatients, which should reduce differences in illness severity (see Table 1). Five of the 12 studies assessed severity, and all found that the psychotic subgroup had more severe illness (26, 29, 30, 33, 34). Two studies that used the Hamilton Depression Rating Scale (HAM-D) (36) reported mean scores of 29.9 compared with 25.7 and of 30.5 compared with 24.2, respectively, for the psychotic and nonpsychotic patients (29, 33). On the other hand, although Maj et al. (34) found that illness in psychotic depressed patients was more likely to be rated as severe, it was rated as mild or moderate in 23% of the psychotic subgroup. The issue is further complicated by the direct effect of delusions on the rating of severity on three HAM-D items: guilt, hypochondriasis, and insight. A study-level meta-analysis such as this cannot disentangle the severity-psychosis interaction. However, the small to medium difference in severity seems unlikely to be sufficient to explain the 10-fold difference in the frequency of psychosis between patients with psychotic and nonpsychotic index depressive episodes. Finally, if psychotic symptoms were limited to severe episodes, this would have the effect of reducing the likelihood of recurrence of psychosis rather than the outcome reported.
It might be questioned why some episodes in patients with psychotic depression are not psychotic. In the present data, 64% of all episodes in patients with psychotic index depressive episodes were psychotic and 36% were not. There are various possibilities. Treatment was naturalistic, and psychotic depressed patients were more likely to receive antipsychotic treatment that may have reduced psychotic symptoms. Maj et al. (34) also noted the difficulty in confirming the presence of delusions in some patients with sustained preoccupations. Nevertheless, the finding of an elevated risk of psychosis in future depressive episodes should alert clinicians to the need to look carefully for psychosis in patients with this past history. The importance of this clinical point is underscored by a study finding that 27% of patients who met DSM-IV criteria for psychotic depression on structured interview did not receive a clinical diagnosis of psychotic depression (37).
A related question is what characterizes nonpsychotic episodes in patients whose primary diagnosis appears to be psychotic depression. Few studies have examined this question. Maj et al. (34) noted that some nonpsychotic patients who developed delusions during a follow-up period had sustained preoccupations at baseline. In our 1981 study (1), four patients with prior psychotic episodes were not psychotic during the index admission but were agitated and ruminative. Ruminative thinking or sustained preoccupations may be subsyndromal symptoms of delusional depression. We previously found that the frequency of ruminative thinking was high in delusional patients (87%), but it can occur in patients without delusions (1, 38). Some ruminative patients show high levels of conviction and evidence of the impact of their beliefs in the absence of clear implausibility of beliefs that would define delusional thinking. These two factors, conviction and impact, have been discussed by Kendler et al. (39) and Meyers et al. (40) as dimensions of delusional thinking that may reflect severity. These dimensions may help to define subsyndromal delusional depression.
Given the relative stability of this disorder, one might wonder whether these subsyndromal ruminative episodes in patients with a history of psychotic depression might best be treated as psychotic depression. Two studies (41, 42) found that “near delusional” depressed patients, as defined by persistent ruminations, were less likely to respond to antidepressant monotherapy and might benefit from antipsychotic treatment. To our knowledge, the treatment for this patient population has not been systematically studied.
Stability of diagnosis has also been studied in first-episode psychosis patients. In one such study, the number of patients with unipolar psychotic depression was small (43). The Ruggero et al. 10-year follow-up study (44) is most pertinent. In that study, of 628 patients presenting with first-episode psychosis, 80 were diagnosed as having psychotic major depression at baseline. At 10 years, 36 patients (45%) retained the diagnosis, 11 had switched to bipolar disorder, and a larger percentage of patients had switched to a schizophrenia spectrum diagnosis. The switch to bipolar disorder is not unexpected. In the Maj et al. 10-year follow-up study (34), 10% of psychotic depressed patients had a subsequent manic or hypomanic episode. There may be several reasons for the differences in the findings of the Ruggero et al. study (44) and this meta-analysis. First, initial psychotic episodes may be less well differentiated, especially in young patients, and difficult to diagnose. Second, in the Ruggero et al. study, the mood-congruent/mood-incongruent distinction does not appear to have been made, whereas in the data we analyzed, six of the studies limited the sample to patients with mood-congruent symptoms. Coryell et al. (45) found that patients diagnosed with DSM-III mood-incongruent delusions were often diagnosed as having schizoaffective disorder on the basis of Research Diagnostic Criteria. Two other studies reported that psychotic depressed patients with mood-incongruent symptoms were more likely than patients with mood-congruent symptoms to receive a final diagnosis of schizophrenia after a follow-up period (46, 47). Third, the mean age in the Ruggero study was 31 years, whereas the mean pooled age in the studies we reviewed was 49 years. At this older age, the diagnosis may have become more stable. Consistent with that, the prospective follow-up studies in this review did not report significant conversion to schizoaffective disorder.
The finding that patients with psychotic major depression are likely to suffer subsequent psychotic depression raises the question of what the appropriate treatment is, especially for prevention of future psychotic depressive episodes. Yet, few studies have examined this question (48–50), and they have been limited to samples of fewer than 35 subjects. The large Study of the Pharmacotherapy of Psychotic Depression II plans to compare the combination of sertraline and olanzapine or sertraline and placebo over a 36-week period and hopefully will address these questions (51).
This meta-analysis has both strengths and limitations. All of the studies used a similar definition of delusions. Only two studies included patients with stupor in the psychotic group, but in one study (26) only a single patient had stupor, and in the other (27) the five patients with stupor also had delusions. All studies excluded patients with schizoaffective disorder, although definitions of schizoaffective disorder have changed during the period the studies were conducted. Although the index episode was diagnosed as mood congruent or incongruent, in none of the retrospective studies were prior psychotic episodes diagnosed as mood congruent or incongruent. Furthermore, only one of the 12 studies examined the consistency of the congruence of delusions between episodes; yet, in that 7-year prospective study, the type of delusion was similar in 29 (88%) of the 33 new psychotic depressive episodes that occurred (30). This finding appears similar to observations in three of the studies that the type of delusion tended to be similar from episode to episode (1, 27, 28). Although the number of studies in the meta-analysis was not large, the findings were consistent. While the retrospective studies might be expected to be less rigorous than the prospective studies, the risk ratios did not differ significantly or substantively in these two types of studies. The use of hospital records to document presence of psychotic features in some retrospective studies was more rigorous than relying on patient memory, and one study demonstrated a high level of agreement for duplicate ratings of the psychotic distinction using records (1), but reliance on hospital admissions may lead to the underreporting of milder depressive episodes.
Finally, the literature search may have failed to uncover other relevant articles. The description of episodes in psychotic depression may have been only one of several features examined in a study, and findings pertinent to this review may not have been highlighted in the title, abstract, or keywords, in which case our search would not have identified the study. This might be especially true for negative findings. Reassuringly, the funnel plot and the Egger analysis do not suggest reporting bias.