Although bipolar disorder has traditionally been thought to have a lifetime prevalence of about 1% of the population (1, 2), a substantial upward revision of this estimate is occurring based on mounting evidence for a broad bipolar spectrum that includes people with a history of hypomania, subthreshold manic symptoms, and medication-induced manic symptoms (3–5). The bipolar spectrum is now thought to characterize as much as 5% of the general population (6). People with bipolar spectrum disorder spend a considerably higher proportion of time with depressive than manic symptoms (7, 8), resulting in frequent confusion between major depressive episodes due to major depressive disorder and those due to bipolar disorder (9). Failure to make this distinction can have dire clinical implications.
The same distinction between major depressive disorder and bipolar disorder would be useful to make in depression cost-of-illness studies. However, with rare exceptions, these studies failed to distinguish between major depressive episodes associated with major depressive disorder and those associated with bipolar disorder (10–13). Furthermore, although several recent cost-of-illness studies (14–16) and reviews (17, 18) have focused on the costs of bipolar disorder, none has presented comparative information on the workplace costs of major depressive disorder and bipolar disorder. The current report does this using data from the recently completed National Comorbidity Survey Replication (NCS-R) (19), a nationally representative survey of the prevalence and costs of mental disorders in the U.S. household population.
Method
Sample
The NCS-R is a nationally representative survey of mental disorders among English-speaking household residents ages 18 and older in the continental United States. Interviews were carried out with 9,282 respondents between February 2001 and April 2003. Verbal informed consent was obtained before data collection. Consent was verbal rather than written to maintain consistency with the baseline NCS. The rate of response was 70.9%. Respondents were given a $50 incentive for participation. In addition, a probability subsample of hard-to-recruit predesignated respondents was selected for a brief telephone nonrespondent survey, the results of which were used to weight the main sample for nonresponse bias. Nonrespondent survey participants were given a $100 incentive. The Human Subjects Committees of Harvard Medical School and the University of Michigan both approved these recruitment and consent procedures.
The NCS-R interview was administered in two parts. Part 1 included a core diagnostic assessment of all 9,282 respondents. Part 2 included questions about correlates and additional disorders administered to all part 1 respondents who met lifetime criteria for any core disorder plus a roughly 1-in-3 probability subsample of other respondents, for a total of 5,692 part 2 respondents. The Health and Work Performance Questionnaire assessment of work performance was included in part 2. A subsample of 3,378 part 2 respondents was either employed or self-employed 20 hours or more per week in the month before the interview and had valid data on all measures used in the following analyses. This is the sample used here. The records for these respondents were weighted to adjust for differential probability of selection into part 2 of the interview and for differential nonresponse. A more detailed discussion of NCS-R sampling and weighting is presented elsewhere (20).
Mood Disorders
NCS-R diagnoses were based on version 3.0 of the World Health Organization’s Composite International Diagnostic Interview (CIDI) (21), a fully structured lay-administered measure. DSM-IV criteria were used to define major depressive episodes, dysthymic disorder, bipolar I disorder, and bipolar II disorder. Because of the small sample size, bipolar I disorder and bipolar II disorder were combined into a single category of bipolar disorder for the current analysis. All diagnoses excluded patients with plausible organic causes for their illness. Blind clinical reappraisal interviews with the lifetime nonpatient version of the Structured Clinical Interview for DSM-IV (SCID) (22) were administered to a probability subsample of 325 NCS-R respondents to assess concordance with CIDI hierarchy-free diagnoses. CIDI-SCID concordance was excellent for bipolar disorder, with an area under the receiver-operator-characteristic curve of 0.93, an odds ratio of 582.6, and a nonsignificant McNemar test (χ²=0.6, df=1, p=0.45). The McNemar test evaluated whether the CIDI prevalence estimate differed significantly from the SCID prevalence estimate. Concordance between the CIDI and the SCID was also good for major depression, with an area under the curve of 0.75, an odds ratio of 18.4, and a McNemar test (χ²=7.2, df=1, p=0.006). The McNemar test was significant because the CIDI prevalence estimate was conservative relative to the SCID estimate. Concordance between the CIDI and the SCID was not assessed for dysthymia because the number of respondents with dysthymia in the clinical reappraisal sample was too small for reliable analysis.
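To illustrate how a McNemar statistic of the kind reported above is formed, the following is a minimal sketch in Python. It compares the two instruments using only the discordant pairs (respondents classified as cases by one instrument but not the other). The function name and the counts in the usage note are hypothetical and are not NCS-R data, and the uncorrected form of the statistic is assumed because the text does not say whether a continuity correction was applied.

```python
import math


def mcnemar_chi2(cidi_only, scid_only):
    """Uncorrected McNemar chi-square (df=1) and its two-sided p-value.

    cidi_only: respondents who are cases on the CIDI but not the SCID.
    scid_only: respondents who are cases on the SCID but not the CIDI.
    Both counts are illustrative inputs, not values from the study.
    """
    discordant = cidi_only + scid_only
    if discordant == 0:
        raise ValueError("the test is undefined without discordant pairs")
    chi2 = (cidi_only - scid_only) ** 2 / discordant
    # With 1 degree of freedom, the chi-square upper tail probability
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For example, `mcnemar_chi2(10, 20)` uses hypothetical counts in which the SCID identifies more cases than the CIDI, which is the pattern a conservative CIDI prevalence estimate would produce.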
Once the mood disorders were operationalized, the respondents who were classified as having lifetime bipolar disorder were defined as 12-month cases if they had experienced a major depressive, a manic, or a hypomanic episode at any time in the 12 months before the interview. The respondents classified as having lifetime major depressive disorder were defined as 12-month cases if they had experienced a major depressive episode at any time in the 12 months before the interview. The vast majority of respondents with a hierarchy-free diagnosis of 12-month dysthymia also met criteria for 12-month major depressive disorder. These “double depressives” (23) were subsequently compared with other respondents with major depressive disorder in the prediction of work performance. The handful of respondents with 12-month dysthymia who failed to meet criteria for major depressive disorder were excluded from the analysis because the group was too small to provide adequate statistical power.
The persistence and severity of 12-month major depressive episodes were compared for respondents with bipolar disorder who had 12-month major depressive episodes and for respondents with 12-month major depressive disorder to determine whether more severe or persistent depression could account for observed differences in work performance between the two subsamples. Persistence was assessed by asking respondents with a 12-month major depressive episode to estimate how many days out of 365 in the past year they had experienced a depressive episode. Severity was assessed with the self-report version of the Quick Inventory of Depressive Symptomatology (24), referring to the 1 month in the past year when the respondents reported their depression as most severe.
Work Performance
Work performance was assessed with the WHO Health and Work Performance Questionnaire (25, 26). This measure uses self-reports about absenteeism (missed days of work) and “presenteeism” (low performance while at work transformed to lost workday equivalents) to generate a summary measure of overall lost workdays in the month before the interview. Absenteeism was defined on a 0–100 scale for the percentage of work days the respondent missed in the past 30 days, while presenteeism was defined on a separate 0–100 scale in which 0 meant doing no work at all on days spent at work and 100 meant performing at the level of a top worker. Absenteeism and presenteeism were combined into a measure of total lost work performance by adding absenteeism to the value ([100–absenteeism] × [100–presenteeism]). Information about salary was used to transform the measures of lost work performance from a time metric to a salary metric for the purposes of estimating human capital loss associated with mood disorders. Salary was incremented by 25% to estimate fringe benefits.
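The combination rule described above can be sketched in Python as follows. This is an illustrative reading rather than the questionnaire's scoring code: it assumes that the product term is rescaled by 100 so that the total stays on the same 0–100 scale as its components, and that salary is expressed per month to match the 1-month recall period. The function names are hypothetical.

```python
def total_lost_performance(absenteeism_pct, presenteeism_pct):
    """Percentage (0-100) of expected work performance lost in the month.

    absenteeism_pct: percentage of workdays missed (0-100).
    presenteeism_pct: on-the-job performance rating (0 = no work done,
        100 = performance of a top worker).
    """
    for value in (absenteeism_pct, presenteeism_pct):
        if not 0.0 <= value <= 100.0:
            raise ValueError("scores must lie between 0 and 100")
    share_missed = absenteeism_pct / 100.0
    share_lost_on_job = (100.0 - presenteeism_pct) / 100.0
    # Loss on the job applies only to the days actually worked.
    # Dividing by 100 above is an assumed rescaling, not stated in the text.
    return 100.0 * (share_missed + (1.0 - share_missed) * share_lost_on_job)


def lost_salary(lost_pct, monthly_salary, fringe_rate=0.25):
    """Convert lost performance to a salary metric, adding 25% for fringe benefits."""
    return (lost_pct / 100.0) * monthly_salary * (1.0 + fringe_rate)
```

Under these assumptions, a respondent who missed 10% of workdays and rated on-the-job performance at 80 would lose 10 + 90 × 20/100 = 28% of expected performance for the month.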
Control Variables
All analyses included controls for sex, age (18–29, 30–44, 45–59, and 60 and over), race/ethnicity (non-Hispanic white, non-Hispanic black, Hispanic, other), education (less than high school, completed high school, some college, completed college), and occupation (professional, technical, service-clerical, laborer), as well as for average expected hours of work per week (20–34, 35–44, 45 or more).
Analysis
Subgroup comparison of prevalence estimates was used to study the sociodemographic correlates of mood disorders, while linear regression analysis was used to estimate associations of mood disorders with work performance. Mood disorders were coded as “yes/no” dummy predictor variables in linear regression equations that included sociodemographic variables (age, sex, race/ethnicity, education, occupation) as controls. The dependent variables in these equations were measures of lost work performance in the metrics of day equivalents and salary equivalents that distinguished absenteeism and presenteeism as well as combined absenteeism and presenteeism into a summary measure of overall lost work performance.
These basic equations were elaborated in three ways. The first distinguished among respondents with bipolar disorder who reported 12-month major depressive episodes only, mania or hypomania only, or both. The second controlled for the severity and persistence of major depressive episodes. The third evaluated interactions between mood disorders and sociodemographic variables.
The key predictors in the regression equations were measures of the prevalence of 12-month disorders, whereas the outcomes were measures of 1-month (not 12-month) decrements in work performance. The coefficients were multiplied by 12 to estimate decrements in work performance over the past 12 months because of 12-month mood disorders. These individual-level estimates were then projected to the total U.S. civilian labor force by adjusting for 12-month disorder prevalence and for the fact that the seasonally adjusted number of workers in the U.S. civilian labor force ages 18 and over at the time of the NCS-R was 130 million.
A question might be raised as to why the time frame of the measures was not made consistent by using either 12-month decrements in work performance as outcomes or 1-month prevalence of mood disorders as predictors. The former was not possible because methodological research has shown that retrospective self-reports about health-related decrements in work performance are inaccurate beyond a 1-month recall period (27). The latter (i.e., using measures of 1-month mood disorders as predictors) would have been possible but would have left unresolved the possibility that remitted mood disorders continue to have residual adverse effects on work performance after episode resolution. The use of 12-month disorders to predict 1-month work performance resolves this problem by generating an averaged estimate of the effects on 1-month work performance of both active episodes and remitted episodes that were active in the past 12 months. The multiplication of this estimate by 12 then produces an unbiased estimate of the effect of mood disorders active in the past 12 months on decrements in work performance in the same time period.
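The annualization and projection steps can be sketched in Python as follows. This is a minimal illustration that assumes the projection is the simple product of the annualized per-worker decrement, the 12-month prevalence, and the labor force size. The function name is hypothetical, and any coefficient or prevalence passed to it is a placeholder rather than an NCS-R estimate. Only the 130 million labor force figure comes from the text.

```python
US_CIVILIAN_LABOR_FORCE = 130_000_000  # workers ages 18 and over, as stated in the text


def annual_population_loss(monthly_coefficient, prevalence_12mo,
                           labor_force=US_CIVILIAN_LABOR_FORCE):
    """Project a 1-month regression coefficient to an annual population total.

    monthly_coefficient: estimated 1-month decrement per worker with the
        disorder (in lost days or lost salary).
    prevalence_12mo: 12-month prevalence of the disorder as a proportion.
    """
    if not 0.0 <= prevalence_12mo <= 1.0:
        raise ValueError("prevalence must be a proportion between 0 and 1")
    annual_per_worker = 12 * monthly_coefficient      # 1-month to 12-month decrement
    workers_with_disorder = prevalence_12mo * labor_force
    return annual_per_worker * workers_with_disorder
```

With a hypothetical monthly decrement of $100 per affected worker and a hypothetical prevalence of 5%, the function projects 12 × 100 × 0.05 × 130 million, or $7.8 billion per year.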
Because the NCS-R data are weighted and clustered, the Taylor series linearization method (28) implemented in the SUDAAN software system (29) was used to obtain design-based estimates of statistical significance. Significance tests of sets of coefficients in the regression equations were made using Wald χ² tests based on design-corrected coefficient variance-covariance matrices. Statistical significance was consistently evaluated at the 0.05 level, two-tailed.
Discussion
Two potential limitations of this study are the possible existence of inaccuracy in the key measures and the possible existence of unmeasured common causes of the disorders and outcomes. With regard to the first of these two, the accuracy of diagnostic assessment was documented in the SCID reappraisal interviews mentioned in the section on measures. However, fully structured instruments, such as the CIDI, are less able to distinguish mixed episodes than are semistructured clinical interviews, leading to the imposition of a more rigid distinction between major depressive episodes and manic/hypomanic episodes in individuals with bipolar disorder than would have been ideal (30). The accuracy of the Health and Work Performance Questionnaire work performance assessment was evaluated in a series of workplace validity studies (25, 26) that documented strong relationships of questionnaire measures with independent payroll records and supervisor evaluations of job performance.
The possibility of unmeasured common causes is much more difficult to evaluate. To the extent that common causes exist, the estimated effects of bipolar disorder and major depressive disorder on lost work performance will be biased. No definitive way exists to evaluate this possibility other than by experimentally changing the prevalence of these disorders, presumably in a treatment effectiveness trial, and evaluating the effects on work performance. The results of such experiments in representative workplace samples have not been reported either for bipolar disorder or major depressive disorder, although such an experiment is currently underway to evaluate the workplace effects of treating major depressive disorder (31). Despite the absence of experimental evidence, simulations of likely effects have been carried out using parameter estimates gleaned from clinical trials (32, 33). The estimated decrements in work performance associated with major depressive disorder in these simulations are broadly consistent with the NCS-R estimates. In addition, the results of a recently reported experimental effectiveness trial aimed at increasing work performance by improving the quality of major depressive disorder treatment yielded estimates of effects on work performance broadly consistent with the NCS-R estimates (34).
Within the context of these limitations, the results reported here show that bipolar disorder and major depressive disorder are both common disorders in the U.S. civilian labor force associated with substantial lost work performance. Our prevalence estimates of bipolar disorder and major depressive disorder are consistent with those in other national surveys (35, 36). As noted in the introduction, though, bipolar spectrum disorders could be defined more broadly than in the current report (3–5). The same is true for subthreshold depression (37, 38). Future research should investigate the effects of these subthreshold disorders on work performance (5, 39).
Our finding that both bipolar disorder and major depressive disorder are associated with substantial losses in work performance is consistent with other estimates of workplace costs (12, 17, 18, 40, 41). The estimated annual population-level workplace cost of major depressive disorder, $36.6 billion, is similar in magnitude to the $31.0 billion estimate reported in another recent study (31). In addition, the workplace cost of major depressive disorder plus bipolar disorder, $50.7 billion, is very similar to the $51.5 billion estimate reported elsewhere (12), although the distribution of workplace cost components is quite different across studies. Whereas presenteeism is estimated here to account for about two-thirds of the total workplace costs of illness, the earlier findings were skewed in the opposite direction, with more than two-thirds of workplace costs estimated to arise from absenteeism (12). The current results are likely to be more accurate than the earlier ones because the Health and Work Performance Questionnaire produces a better measure of the on-the-job component of work performance than the measure used in the previous study. Finally, the only previous estimate of the population-level workplace cost of bipolar disorder, $2.3 billion in 1990 dollars or in the range of $3 billion to $4 billion today (11), is much lower than our $14.1 billion estimate, presumably reflecting the fact that this earlier report, which was based on synthetic estimation rather than primary data collection, assumed a much lower prevalence than we found to be the case in our nationally representative survey.
By considering bipolar disorder and major depressive disorder simultaneously, we documented that bipolar disorder is associated with substantially more lost work performance than major depressive disorder at the individual level, although aggregate impairment is greater for major depressive disorder than for bipolar disorder because major depressive disorder is the more prevalent of the two. Decomposition showed that the higher individual-level impairment of bipolar disorder than major depressive disorder was due largely to major depressive episodes being more impairing in the context of bipolar disorder than in major depressive disorder rather than to mania/hypomania being more impairing than major depressive episodes. The finding that mania/hypomania in the absence of major depressive episodes is associated with significantly less work impairment than bipolar disorder with major depressive episodes is consistent with the observation in a prospective patient study that functional impairment was associated with variation in depressive symptoms but not manic symptoms (42). More detailed analysis of the NCS-R data showed that the higher individual-level work impairment of major depressive episodes in bipolar disorder than in major depressive disorder is due partly to the greater persistence and severity of major depressive episodes in bipolar disorder than in major depressive disorder. However, the persistence/severity of major depressive episodes explained only part of the association between bipolar disorder and work impairment. The remaining part of this association could be due to either imprecision in our measures or the effects of unmeasured correlates of bipolar disorder and work impairment.
An important practical problem related to the finding that most workers with bipolar disorder had major depressive episodes is that major depressive episodes due to bipolar disorder are sometimes incorrectly treated as if they were due to major depressive disorder (43, 44). This problem is exacerbated by people with bipolar disorder reporting more distress because of their depressive than their manic symptoms (40). Because antidepressant medications can trigger the onset of mania, it is important to screen for a history of bipolar disorder at the initiation of depression treatment. A short and valid screen for manic/hypomanic symptoms has recently been developed that could be used for this purpose (45). It is important for the same reason to include a screen for bipolar disorder in workplace depression screening programs. The prevalence and impairments of subthreshold cases should also be examined. Effectiveness trials are needed to calculate the return on investment from the employer’s perspective of coordinated workplace bipolar disorder-major depressive disorder screening and treatment (34, 41).
Acknowledgments
Collaborating NCS-R investigators include Ronald C. Kessler (principal investigator, Harvard Medical School), Kathleen Merikangas (co-principal investigator, NIMH), James Anthony (Michigan State University), William Eaton (Johns Hopkins University), Meyer Glantz (NIDA), Doreen Koretz (Harvard University), Jane McLeod (Indiana University), Mark Olfson (Columbia University College of Physicians and Surgeons), Harold Pincus (University of Pittsburgh), Greg Simon (Group Health Cooperative), T. Bedirhan Ustun (WHO), Michael Von Korff (Group Health Cooperative), Philip Wang (Harvard Medical School), Kenneth Wells (UCLA), Elaine Wethington (Cornell University), and Hans-Ulrich Wittchen (Max Planck Institute of Psychiatry). A complete list of NCS publications and the full text of all NCS-R instruments can be found at http://www.hcp.med.harvard.edu/ncs. Send correspondence to [email protected]. The NCS-R is carried out in conjunction with the WHO World Mental Health (WMH) Survey Initiative. A complete list of WMH publications and instruments can be found at http://www.hcp.med.harvard.edu/wmhcidi.