Mental and substance use disorders (henceforth “mental disorders”) are the largest contributor to global burden of disease (1), yet most people living with mental disorders in low- and middle-income countries (LMICs) do not have access to care because of scarce funding and human resources (2). A stepped-care approach, in which nonspecialists manage detection of mental disorders and provide treatment or referral to specialists, is an efficient method to close the treatment gap in LMICs (3–5).
In stepped care, only severe mental disorders require consultation with a mental health specialist, whereas nonsevere mental disorders can be managed by primary care providers and lay workers (6). Thus, a rapid, reliable method for minimally trained providers to identify the presence and type of mental disorder is key to the implementation and sustainability of comprehensive stepped-care management. Many screening tools for mental disorder detection have been validated in high-income countries and LMICs (7). However, these tools, which have five to more than 20 items and sometimes have a cost per use, are designed to detect one disorder at a time (e.g., depression [8]) or symptoms common to some disorders (e.g., psychological distress [9]). Using a combination of these screens to cover all mental disorders is not feasible in low-resource health systems.
We aimed to develop and validate a brief questionnaire, the Mental Wellness Tool (mwTool), to screen for mental disorders and classify individuals into disorder categories that facilitate comprehensive stepped-care management of mental disorders. Through novel application of a variable selection technique (least absolute shrinkage and selection operator [LASSO]) (10), we sought to identify a small set of items, selected from widely validated screening measures for individual mental disorders, with high sensitivity for identification of any mental disorder and high specificity for classification of severe mental disorders, common mental disorders, substance use disorders, and suicide risk. In a separate validation sample, we assessed performance of the mwTool in identifying and classifying mental disorders. Specifically, we examined performance by using index responses as well as proxy responses, which facilitate epidemiological research and community-based care wherein interviewing all index cases is not feasible.
Methods
Participants provided written informed consent as approved by the New York State Psychiatric Institute Institutional Review Board (no. 7479) and the Eduardo Mondlane University Institutional Health Bioethics Council (CIBS FM and HCM/54/2017). Study analyses and reporting follow the Standards for Reporting of Diagnostic Accuracy Studies (11).
Study Setting
Development data were collected at two primary care clinics and one hospital in Maputo City, Mozambique, from May 16 to June 8, 2018. These facilities provide primary care, emergency, and outpatient mental health services. The hospital also provides services for victims of interpersonal violence and inpatient general medical and psychiatric services. Validation data were collected from December 5 to 12, 2018, at three primary care clinics in Nampula, Mozambique. These facilities provide primary care and emergency services. People with mental disorders are referred to Nampula’s provincial psychiatric hospital.
Study Population
Adults (patients and accompaniers) in health facility waiting rooms were invited to participate. All volunteers were taken to a private area for eligibility assessments and informed consent. Potential participants were excluded if they were under age 18 or unable to sufficiently communicate in Portuguese, which was determined by asking potential participants to repeat study objectives in their own words. For the development sample, we planned to enroll ≥400 people with at least one psychiatric diagnosis and ≥400 without any psychiatric diagnosis to ensure ±5% margins of error for sensitivity and specificity estimates. For diversity of psychiatric diagnoses, we aimed to obtain ≥40 participants with balanced representation of males and females with each specific diagnosis (detailed below). For the validation sample, we aimed to obtain ≥40 gender-balanced index participants (those providing responses about their own mental health) with each specific diagnosis (described below), of whom ≥200 would provide proxy responses (regarding the mental health of another index participant with whom they were attending the health facility), allowing enough precision for ±7% margins of error for sensitivity and specificity.
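The precision targets above can be checked with a short calculation. The sketch below is one reading that is consistent with the stated targets, not a description of the original sample size calculation: it assumes a normal-approximation (Wald) 95% confidence interval, a worst-case proportion of 0.5, and that the stated sample sizes are the denominators for sensitivity or specificity. The function name `margin_of_error` is ours.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a Wald 95% confidence interval for a proportion.

    n: number of participants in the denominator (assumed here to be the
       number with, or without, a reference diagnosis).
    p: anticipated sensitivity or specificity; 0.5 is the worst case.
    z: normal quantile (1.96 for a 95% interval).
    """
    return z * math.sqrt(p * (1 - p) / n)


# Development target (n = 400) and proxy-response target (n = 200)
for n in (400, 200):
    print(n, round(margin_of_error(n), 3))
# 400 0.049
# 200 0.069
```

Under these assumptions, 400 participants per group give a margin of about ±5%, and 200 proxy responses give about ±7%.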
Measures
For all instruments except the Psychosis Screening Questionnaire (PSQ) and Primary Care PTSD Screen (PC-PTSD), we used existing Brazilian or Portuguese translations, and local research team members made minor adjustments for the Mozambican context (e.g., local terms for substances). The PSQ and PC-PTSD were translated from English to Portuguese by the local research team, back-translated by a native English speaker fluent in Portuguese, and reviewed for translation accuracy by a measurement specialist at Columbia University (unassociated with this study). All measures were pretested while training interviewers (Mozambican mental health specialists) and underwent final review by means of cognitive interviews with 10 Mozambican adults attending primary care.
Mental disorder diagnosis and classification.
Mental disorder diagnoses were made with the Brazilian version of the Mini International Neuropsychiatric Interview (MINI) Plus (12, 13), a structured diagnostic interview that has been widely used as a reference standard across many contexts (7). Based on MINI diagnoses, we classified participants into the following four categories corresponding to different stepped-care pathways: severe mental disorder, for diagnoses of mania, psychosis, or the presence of psychotic symptoms associated with another disorder (e.g., depression); common mental disorder, for major depressive episode, panic disorder, posttraumatic stress disorder (PTSD), anxiety, or somatization; substance use disorder, for alcohol abuse or dependence or substance abuse or dependence; and suicide risk, for respondents scored as having moderate or high suicide risk (i.e., 6 points or higher, indicative of past-month active ideation, planning, and/or attempt). All diagnoses were for current disorders, except for psychosis, for which lifetime diagnoses were also considered, because patients with a history of psychosis require referral to specialists.
Mental health screening battery.
We administered nine structured instruments commonly used to screen for specific mental disorders and to assess functioning (Table 1) (8, 14–21). (Details about the screening battery are provided in an online supplement to this article.)
Demographic and general health measures.
We collected self-reported sociodemographic information (age, gender, marital status, living situation, education, religion, monthly household income, occupation, and race-ethnicity) and health history (chronic diseases, pregnancy, and parity). For the validation sample, we recorded the relationship between index and proxy respondents.
Responses to all measures were recorded via tablet with the REDCap platform (22).
Procedures
In the development sample, research assistants administered the sociodemographic questionnaire and then the MINI and mental health screening battery in a randomized order. In the validation phase, a research assistant first administered the sociodemographic questionnaire. Then, for participants who were alone or with someone who was not eligible to participate in the study, the research assistant administered the mwTool followed by the MINI to assess the participant’s mental health (index response). For participants attending the facility with a companion who was also a participant, research assistants privately administered the items from the mwTool for identification of any mental disorder in regard to the companion’s mental health (proxy response) and then administered the complete mwTool and MINI to assess the participant’s own mental health (index response).
Statistical Analysis
We excluded from analysis participants with incomplete responses to the MINI, screening battery, or mwTool. Analyses were performed with R, version 3.6.1 (23); LASSO models were fit with the glmnet package (24).
mwTool development.
First, we sought to identify three to five items from the screening battery with high sensitivity for the presence of any mental disorder. Second, we sought to identify an additional six to 12 items that provided high specificity for classification into the four disorder categories, in order to minimize false positives for potentially stigmatizing disorders and undue burden on low-resource systems while maintaining adequate sensitivity. (All items considered for the mwTool are identified in the online supplement.) Ordinal responses were dichotomized for analyses; responses indicating moderate-to-high symptom strength and frequency were considered positive.
A series of LASSO logistic regressions was used to determine the best subset of battery items for presence of any mental disorder and subsequently for each disorder category. LASSO regression incorporates a penalty term based on the sum of the absolute values of all model coefficients. The effect of this term is that, at high levels of the shrinkage parameter (λ), coefficient estimates for less important variables shrink to zero and are removed from the model. This allows for variable selection at high levels of λ, which we varied to select the best three to 10 items for predicting each outcome. We confirmed that LASSO results included only items with positive coefficients (item presence associated with higher disorder risk). Area under the receiver operating characteristic curve described the accuracy of different best-item subsets. We then selected mwTool items, balancing statistical validity (i.e., empirically best combination of items based on LASSO), feasibility (i.e., fewer total items), and face validity (i.e., items reflecting diagnostic criteria for disorder categories).
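For reference, the penalized objective minimized by a LASSO logistic regression can be written as below. This is the standard textbook form (the one glmnet uses for a binomial outcome with a pure L1 penalty), and the notation is ours rather than taken from the study: y_i is the binary indicator of the outcome (any mental disorder, or a disorder category) for participant i, x_ij is the dichotomized response of participant i to candidate item j, n is the number of participants, and p is the number of candidate items.

```latex
(\hat{\beta}_0, \hat{\beta}) =
  \arg\min_{\beta_0,\, \beta}
  \left\{
    -\frac{1}{n} \sum_{i=1}^{n}
      \left[ y_i \eta_i - \log\left( 1 + e^{\eta_i} \right) \right]
    + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
  \right\},
\qquad
\eta_i = \beta_0 + \sum_{j=1}^{p} x_{ij} \beta_j
```

The first term is the average negative log-likelihood of the logistic model, and the second is the penalty on the sum of absolute coefficient values. The intercept is not penalized. As λ increases, more of the β_j are set exactly to zero, so stepping λ along a grid yields nested candidate subsets of a chosen size (for example, the best three, five, eight, or 10 items).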
Sensitivity and specificity of the mwTool for any mental disorder and for the four disorder categories were assessed in the development sample. Because treatment of severe mental disorders takes priority in stepped care when a person has both severe and common mental disorders, participants positive for both by the mwTool were classified as having a severe mental disorder and not a common mental disorder.
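The two steps in this paragraph, applying the severe-over-common priority rule to screen results and then computing sensitivity and specificity against the reference diagnosis, can be sketched as follows. This is a minimal illustration with made-up data. The function names and the category keys are ours, and the sketch applies the priority rule to the screen result only, which is an assumption.

```python
def apply_priority(flags):
    """Return a copy of one participant's screen flags in which a screen
    positive for both severe and common mental disorder counts as severe only.

    flags: dict mapping a category name to True/False (hypothetical keys).
    """
    out = dict(flags)
    if out.get("severe") and out.get("common"):
        out["common"] = False
    return out


def sensitivity_specificity(screen, reference):
    """Compute (sensitivity, specificity) from two equal-length lists of
    booleans: the screen result and the reference-standard diagnosis."""
    pairs = list(zip(screen, reference))
    tp = sum(1 for s, r in pairs if s and r)          # true positives
    fn = sum(1 for s, r in pairs if not s and r)      # false negatives
    tn = sum(1 for s, r in pairs if not s and not r)  # true negatives
    fp = sum(1 for s, r in pairs if s and not r)      # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Made-up example: eight participants, four with the reference diagnosis
screen = [True, True, True, False, True, False, False, False]
reference = [True, True, True, True, False, False, False, False]
print(sensitivity_specificity(screen, reference))  # (0.75, 0.75)

print(apply_priority({"severe": True, "common": True}))
# {'severe': True, 'common': False}
```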
mwTool validation.
We calculated sensitivity and specificity of the mwTool for any mental disorder and for the four disorder categories by using index case responses. We also assessed sensitivity and specificity of mwTool questions for any mental disorder by using proxy responses. We excluded proxy responses when proxy and index respondents provided discordant information about their relationship.
Results
mwTool Development
Across the three Maputo sites, 1,033 people were screened for eligibility; seven (1%) were under age 18, and eight (1%) were not fluent in Portuguese. Twenty-nine (3%) of the 1,018 eligible persons did not provide informed consent. We excluded from analysis 78 (8%) participants who did not complete the MINI or screening battery (see online supplement). Among the 911 included participants, 570 (63%) were female, and the mean±SD age of the sample was 32.0±11.3 years. As shown in Table 2, over half (52%) had one or more disorders on the basis of MINI diagnoses: severe mental disorder, 29%; common mental disorder, 36%; substance use disorder, 14%; and suicide risk, 9%.
In total, 13 screening battery items were selected for inclusion in the mwTool (see diagram in online supplement). (The three, five, eight, and 10 screening items that best classified any mental disorder and each of the disorder categories are detailed in the online supplement.) Twelve items were chosen from LASSO results. In consultation with expert clinicians and clinical researchers, one additional suicide item was added to capture individuals with recent attempts, a high-risk group that may lack ideation and thus would not be detected by the LASSO models’ best items (25).
A positive response to any of the first three mwTool items signals the interviewer to ask the next 10 items; a negative response to all three indicates absence of any mental disorder and signals screening completion. When the 10 additional items are asked, a positive response to any item associated with a disorder category indicates presence of that disorder category. Negative responses to all 10 additional items indicate that the person should be classified either as having a common mental disorder, if the person gave a positive response to item 2 of the Patient Health Questionnaire or item 5 of the Generalized Anxiety Disorder (GAD) scale, or as having no disorder, if the person gave a positive response only to item 1 of the GAD.
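The branching logic described above can be sketched in code. This is an illustrative sketch only. The identity of the three initial items is inferred from the fallback rule in the text. The names of the classification items and their mapping to disorder categories appear only in the online supplement, so the mapping below uses placeholder names, and the function name `mwtool_screen` is ours.

```python
# Initial items inferred from the text: PHQ item 2, GAD item 5, GAD item 1.
INITIAL_ITEMS = ("PHQ_2", "GAD_5", "GAD_1")


def mwtool_screen(initial, classification, item_categories):
    """Two-step screen for the 13-item version described in the text.

    initial: dict of initial item -> bool (positive response).
    classification: dict of additional item -> bool; asked only if step 1
        is positive.
    item_categories: dict of additional item -> set of category names
        (placeholder mapping; the real one is in the online supplement).
    Returns a sorted list of categories; an empty list means no disorder.
    """
    # Step 1: negative on all three initial items ends the screen.
    if not any(initial[item] for item in INITIAL_ITEMS):
        return []

    # Step 2: any positive classification item flags its categories.
    categories = set()
    for item, positive in classification.items():
        if positive:
            categories |= item_categories[item]

    if not categories:
        # All additional items negative: fall back on the initial items.
        if initial["PHQ_2"] or initial["GAD_5"]:
            return ["common"]
        return []  # positive only on GAD item 1
    return sorted(categories)


# Placeholder mapping with three hypothetical classification items
mapping = {"item_a": {"severe"}, "item_b": {"substance"}, "item_c": {"suicide"}}
all_negative = {"item_a": False, "item_b": False, "item_c": False}

only_gad1 = {"PHQ_2": False, "GAD_5": False, "GAD_1": True}
phq2 = {"PHQ_2": True, "GAD_5": False, "GAD_1": False}

print(mwtool_screen(only_gad1, all_negative, mapping))  # []
print(mwtool_screen(phq2, all_negative, mapping))       # ['common']
print(mwtool_screen(phq2, {**all_negative, "item_b": True}, mapping))
# ['substance']
```

The sketch returns every flagged category and leaves out the rule that a screen positive for both severe and common mental disorder is counted as severe only.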
The first three mwTool items identified any mental disorder with 0.83 sensitivity and 0.49 specificity (Table 2), and performed similarly by gender, age, and HIV status (see online supplement). The 10 additional mwTool items classified severe mental disorder with 0.72 specificity, common mental disorder with 0.79 specificity, substance use disorder with 0.82 specificity, and suicide risk with 0.90 specificity (Table 2), with little variation across subpopulations (see online supplement). Sensitivity for the specific disorder categories was highest for common mental disorder (0.83) and lowest for severe mental disorder (0.62).
mwTool Validation and Final Item Selection
At Nampula sites, 482 people were screened for eligibility; two (<1%) were not fluent in Portuguese. Of the 480 participants, 243 (51%) provided proxy responses to the three initial mwTool items regarding another participant with whom they were attending the health facility. We excluded from analysis 27 (6%) participants who did not complete the MINI or all mwTool items (see online supplement). Among the 453 included participants, 296 (65%) were female, and the mean±SD age of the sample was 31.1±10.7 years. MINI diagnoses indicated the presence of one or more disorders for 39% of participants (Table 3): severe mental disorder, 18%; common mental disorder, 30%; substance use disorder, 6%; and suicide risk, 8%.
Based on index responses, the first three items of the mwTool had 0.94 sensitivity for identification of any mental disorder (Table 3). The 10 classification items had a specificity of 0.47 for severe mental disorder, 0.83 for common mental disorder, 0.82 for substance use disorder, and 0.93 for suicide risk. Sensitivity for disorder categories ranged from 0.77 (suicide risk) to 0.96 (common mental disorder). Family proxy responses had higher sensitivity than did nonfamily responses (0.73 versus 0.62), although the difference was not significant (Table 3). For both family and nonfamily proxy responses that led to a positive screen on the mwTool, 83% of corresponding index responses also led to a positive screen.
Although the 13-item mwTool generally showed similar or higher sensitivity and specificity in the validation sample compared with the development sample, the specificity for severe mental disorder was considerably lower in the validation sample. Many participants whose screen was false positive for a severe mental disorder (N=58 of 195) had endorsed only the first of the additional classification items (item 7 of the GAD). Because this item assesses a symptom of anxiety, a common mental disorder, we next evaluated performance of the mwTool when this item was excluded. When this 12-item mwTool (see diagram in online supplement) was used, specificity of index responses for classification of severe mental disorder increased to an acceptable level (0.63), and specificity for common mental disorder (which the excluded item also was used to classify in the 13-item mwTool) was reduced but remained good (0.72) (Table 4). The 12-item mwTool performed similarly in subpopulations of the validation sample (see online supplement).
Discussion
We employed a novel application of LASSO regression modeling, along with expert consultation, to select items from mental health screens that can identify and classify mental disorders. We designed a two-step instrument, the mwTool, in which the first three items are asked of all respondents, and only those who are identified by these items as positive for any mental disorder are asked the additional items for classification into disorder categories.
Brevity of screening instruments reduces provider burden and, in turn, promotes adoption in primary care and community settings (26). Additionally, previous research has shown that screens with fewer items are as accurate as those with more items for detection of individual disorders in both high-income countries and LMICs (27–29). Other analytical techniques, such as item response theory, have been successfully used to shorten screens for common mental and substance use disorders in LMICs (30–32), although no brief instrument exists that provides transdiagnostic mental health assessment. With the LASSO variable selection technique, we reduced 99 items from nine instruments for eight different mental disorders to 12 items, or just 12% of the combined screens’ length, that had acceptable-to-excellent sensitivity and specificity for all disorder categories.
Proxy respondents are common in clinical care and epidemiological research when the index individual is unable to self-report because the person is unavailable, incapable of providing responses, or underage. However, evaluation of proxy responses on other assessment tools has shown them to be less reliable for questions about subjective experiences, such as emotions and psychological distress, than for objective experiences (33). Our results showed that the first three items of the mwTool had good sensitivity for identification of any mental disorder when proxy responses from family members were used. Proxy responses from nonfamily did not perform as well as family proxy responses, in line with previous research showing that friends and health care proxy respondents have lower agreement and reliability, compared with family proxies (33). Future studies are needed to determine in more detail what specific characteristics of family members (such as cohabitation or relation to the index individual) promote reliability of proxy responses on the mwTool.
Throughout mwTool development and validation, severe mental disorder was the lowest-performing disorder category. In tool development, five more questions had to be included for adequate classification of severe mental disorder, and sensitivity of the mwTool for severe mental disorder was lower than for any other disorder category. This is unsurprising, because measures for severe mental disorders have routinely been found to have lower performance than for other mental disorders (34). However, in the validation sample, by removing one question we were able to increase specificity to an acceptable level. Therefore, we recommend use of the 12-item mwTool in future assessments.
This study had several limitations. Participants in both the development and the validation samples were recruited from health clinics, and our findings may not be generalizable to other settings. Additionally, because no other published data exist on the prevalence of mental disorders in Mozambican health care settings and we cannot be certain that our data are representative, we did not calculate the positive predictive value or negative predictive value of the mwTool. However, the mwTool did not perform differently by age, gender, and HIV status subgroups. Because we assessed the mwTool in one low-income country and one language, its validity in other settings should be assessed. Moreover, owing to post hoc elimination of GAD item 7 from the 13-item mwTool assessed in the validation sample, the 12-item mwTool requires further validation in an independent sample. Finally, the population in our setting had low rates of substance use, and most substance users also used alcohol; therefore, the mwTool includes questions related only to alcohol use. In other settings, it may be necessary to add items for substance use and calibrate dichotomization of the measure according to contextual substance use patterns.
Conclusions
To our knowledge, the mwTool is the first brief screen for nonspecialist assessment of common mental disorders, severe mental disorders, substance use disorders, and suicide risk. The mwTool performed well for identification of any mental disorder when index and proxy responses to three items were used and for classification of positives into treatment categories when index responses to an additional nine items were used. Although developed in LMIC primary care, the mwTool may have applicability in multiple settings, such as community-based care, emergency situations, and population-based assessments, but further research is required to assess its performance in these settings.
Acknowledgments
The authors thank the Mozambican Ministry of Health, National Directorate of Public Health, Maputo City Health Department, Nampula Province Health Department, and research assistants for facilitating data collection and all study participants for their contribution to this work.