The incidence of opioid use disorder (OUD) has risen steeply in recent decades, with devastating consequences for patients, families, and communities. In 2020, an estimated 3 million Americans had OUD (1), an increase of about 50% in 10 years (2). U.S. emergency department visits related to the nonmedical use of opioids reached 285,000 in 2020 (3), and 68,630 deaths resulted from opioid overdose (4). The Centers for Disease Control and Prevention estimates that in 2021 the number of opioid-related deaths surpassed 80,000, an increase of 17% in a single year (5).
For individuals with OUD, access to treatment is a challenge, and only about 10% of patients who need treatment receive it (6). Worse, treatment often fails. Relapse rates following detoxification from opioid dependence are high (7), and even after inpatient treatment, a majority of patients relapse within a year (8). In the United States, three medications for opioid use disorder (MOUD) are approved for treatment: methadone, buprenorphine, and naltrexone (9). Along with other supportive services, such as peer support, MOUD has been shown to increase rates of recovery from OUD (9). However, most patients who receive MOUD treatment do not achieve long-term, stable abstinence (10). While reductions in use and mortality risk are desirable real-world patient outcomes, abstinence is a clinical outcome recommended by the U.S. Food and Drug Administration (FDA) to evaluate treatments for substance use disorders, including OUD (11, 12). It is a stable indicator of longer-term outcomes (13), it can be biologically confirmed via urine drug screening, and it facilitates the use of intent-to-treat analyses that include all participants randomized to treatment.
The randomized clinical trial described here assessed the extent to which MOUD effectiveness might be improved by A-CHESS, the addiction treatment version of the Comprehensive Health Enhancement Support System. A-CHESS is an evidence-based smartphone intervention designed to assist recovery from substance use disorders with a suite of motivational, social support, and coping tools. A large randomized controlled trial (N=349) previously found that use of A-CHESS was associated with a decrease in risky drinking days and enhanced long-term abstinence among people with alcohol use disorder leaving residential treatment, one-third of whom reported illicit opioid use (14). Related field tests in drug courts (15), in Federally Qualified Health Centers (16), and among women in Appalachia (17) also showed positive outcomes for alcohol and opioid use.
In the present trial, we assessed the potential of A-CHESS to improve long-term outcomes of MOUD among participants with OUD. The primary hypothesis was that participants receiving MOUD plus A-CHESS would have a higher probability of abstinence from illicit opioid use (i.e., no days of illicit use) than participants receiving MOUD alone. Our secondary hypotheses were that participants assigned to MOUD+A-CHESS would show less use of other illicit substances, higher quality of life, greater retention in opioid treatment, and lower health services use compared with those assigned to MOUD alone. We tested MOUD type, gender, withdrawal symptom severity, pain severity, and loneliness as moderators of the impact of MOUD+A-CHESS compared with MOUD alone. The study variables were prespecified in the protocol (18). (Although the protocol described the primary outcome as days of illicit opioid use, this outcome was changed prior to data collection to any days of illicit use, that is, abstinence versus nonabstinence, given the aforementioned FDA recommendations on evaluating treatments for substance use disorders.)
Additional secondary hypotheses specified in the protocol regarding HIV and hepatitis C virus are addressed elsewhere (19). We also tested A-CHESS use and communication style patterns as predictors of outcomes, and will report these findings separately.
Methods
Trial Design
In this nonblinded parallel-group randomized controlled trial, 414 participants with OUD were assigned in a 1:1 ratio to receive either MOUD+A-CHESS or MOUD alone for 16 months and were followed for an additional 8 months after the intervention. The study was approved by the University of Wisconsin–Madison Health Sciences Institutional Review Board and the Western Institutional Review Board, and is registered at ClinicalTrials.gov (NCT02712034).
Participants
Participants were eligible if they were currently on MOUD; were age 18 or older; met DSM-5 criteria for OUD of at least moderate severity (four or more symptoms) in the past 12 months; had no acute medical problems requiring immediate inpatient treatment; had no history of psychotic disorders; were willing to participate in a randomized trial; could provide two verified contacts as locators, if necessary; could read and write in English; agreed to share health-related data with primary care clinicians; and had been abstinent from illicit opioids at study intake for at least 1 week and no longer than 4 months.
Patients were recruited from outpatient detoxification and treatment programs at two sites in Massachusetts and one in Wisconsin. Potential participants were identified by a site staff person and asked if they were interested in learning about the study. If yes, the University of Wisconsin study coordinator or site coordinator provided a detailed overview, including participant responsibilities and confidentiality protections. Interested participants then gave written consent and completed a baseline survey. Information on race/ethnicity was collected via self-identification. Participants also self-identified gender as male or female or could decline to respond; we did not ask for participants’ biological sex.
Interventions
MOUD alone.
Participants in the control arm received methadone, buprenorphine, or injectable naltrexone and treatment as usual at each site. This could include a recovery plan, behavioral interventions such as group counseling, and sessions with a substance use counselor. Sequence and duration of medication and behavioral interventions varied by patient.
MOUD+A-CHESS.
Participants in the experimental arm received A-CHESS for 16 months along with their MOUD. As described previously (18), A-CHESS services are based on self-determination theory constructs of intrinsic motivation, social support, and coping competence (20) to address numerous determinants and antecedents of relapse. For a complete description of app features, see the online supplement.
Participants assigned to the MOUD+A-CHESS arm who did not have an Android smartphone were given one loaded with the app, along with a data plan for the 16-month intervention period. Participants who already had a compatible Android smartphone had A-CHESS installed. We provided up to one replacement phone, if needed. If participants lost a second phone, we offered to load A-CHESS onto an appropriate smartphone they obtained. Data plans were terminated after 16 months, but participants could continue to access A-CHESS via other connectivity.
The University of Wisconsin or site coordinator trained participants to use A-CHESS and customize it with, for example, sources of support, high-risk locations to avoid, and recovery motivations. App content was refreshed monthly with healthy activities, local AA or NA meetings, and clinic schedules for group sessions. Participants demonstrated that they could use A-CHESS before leaving training.
Study Variables and Measures
For assessing outcomes and other variables, participants were asked to complete telephone surveys (∼30 minutes) with the University of Wisconsin study coordinator at baseline, at months 4, 8, 12, and 16, and after the intervention at months 20 and 24. All months in the study are 30-day periods. Participants in the MOUD-alone group received $30 per completed survey, participants in the MOUD+A-CHESS group who used their own phone received $15 per survey, and participants in the MOUD+A-CHESS group who were given a study phone received $10 per survey. Survey measures used for quantitative data collection are described below. In addition, results from urine drug screens were recorded at baseline and all subsequent surveys, if possible. An overview of the analytic strategy is presented in Figure 1.
Primary outcome.
Self-reported abstinence from illicit opioid use was documented for the 30-day period preceding each survey with a variant of the timeline followback (21), with questions on illicit opioid use separated from those on other substances. Urine drug screens were used only if administered between 30 days before and 1 day after the survey to align with a survey question about past-30-day illicit opioid use. Results from urine screens were used to validate self-reported information. If participants reported abstinence (i.e., 0 days of use) while a screen was positive, their status was changed to nonabstinent. Inconsistency between screen results and self-reports did not affect participants’ ability to continue in the study.
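The coding rule described above can be sketched in a few lines. This is an illustrative sketch only, not the study's actual data-processing code: the names `UrineScreen` and `classify_abstinence` are hypothetical, and returning `None` when no self-report is available is an assumption made here for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class UrineScreen:
    """One urine drug screen result (hypothetical structure)."""
    collected_on: date
    opioid_positive: bool


def classify_abstinence(
    survey_date: date,
    self_reported_use_days: Optional[int],
    screens: Sequence[UrineScreen],
) -> Optional[bool]:
    """Return True (abstinent), False (nonabstinent), or None (no self-report).

    A screen counts only if it was collected between 30 days before and
    1 day after the survey. A positive in-window screen overrides a
    self-report of 0 days of use.
    """
    if self_reported_use_days is None:
        return None  # assumption: a missed survey stays missing
    if self_reported_use_days > 0:
        return False
    for screen in screens:
        offset_days = (screen.collected_on - survey_date).days
        if -30 <= offset_days <= 1 and screen.opioid_positive:
            return False  # biologically contradicted self-report
    return True
```

For example, a participant who reports 0 days of use on a survey dated June 30 but has a positive screen from June 15 would be coded nonabstinent, whereas a positive screen from May 1 falls outside the window and would not change the self-reported status.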
Secondary outcomes.
For retention in treatment, participants reported whether they were staying on MOUD at each time point over the 24 months. In addition, they reported engagement with other forms of treatment outside their clinic facility: meeting attendance (e.g., NA, AA, 12 step, Smart Recovery), outpatient treatment, residential treatment, and therapy/counseling. Each of these variables was analyzed separately. Patients completed a 30-day timeline followback at each survey to document other nonprescribed drug and alcohol use as well as health services use during the past 4 months (overnight hospitalizations, emergency department and urgent care visits, visits with any other providers; all variables analyzed separately). The Satisfaction With Life Scale (22) was used to assess quality of life.
Moderation.
Analyses examined whether effects differed by MOUD type, gender, withdrawal symptom severity, pain severity, and loneliness (Brief UCLA Loneliness Scale [23]). Severity variables were self-reported on a scale of 1 (“not at all severe”) to 10 (“very severe”). Substance use disorder severity, a planned moderator, was dropped from the model because DSM-5 values were not consistently documented during the clinic intake process.
Mediation.
Self-determination theory constructs were assessed as follows: for motivation, the Treatment Self-Regulation Questionnaire (24); for coping competence, the revised Drug-Taking Confidence Questionnaire (25); and for relatedness, the McTavish Bonding Scale (26). The Positive and Negative Affect Scale (27, 28) was used to measure negative affect, and the self-devaluation subscale of the Substance Abuse Self-Stigma Scale (29) was used to measure self-stigma.
Covariates.
Potential covariates tested included sociodemographic variables (gender, age, race/ethnicity, education, housing status, employment status, and marital status), historical factors (age at start of regular use of opioids, past OUD treatment, and psychiatric diagnoses), and pain severity (30).
Sample Size Determination and Power
We proposed recruiting 440 patients, anticipating 35% attrition over time, to produce a final N of 286. The final N was calculated to provide approximately 82% power to detect a standardized mean difference of 0.35 between study arms (a small to medium-sized effect) in a linear model with up to six covariates, using an alpha of 0.05. Power was calculated using formulas from Cohen (31) that are implemented in the pwr package in R (32). Assumed attrition rates were calculated from past data from our recruitment sites.
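The arithmetic behind these figures can be roughly reproduced. The sketch below is not the authors' calculation, which used Cohen's formulas in the pwr package for a linear model with covariates. It uses a simpler normal approximation for a two-arm comparison of means with equal allocation, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def expected_completers(recruited: int, attrition_pct: int) -> int:
    """Participants expected to remain after the assumed attrition."""
    return recruited * (100 - attrition_pct) // 100


def two_arm_power(d: float, n_total: int, alpha: float = 0.05) -> float:
    """Approximate two-sided power for a standardized mean difference d.

    Normal approximation with equal allocation to two arms. It ignores
    covariates and the t distribution, so it is only a rough check.
    """
    z = NormalDist()
    n_per_arm = n_total / 2
    ncp = d * sqrt(n_per_arm / 2)       # noncentrality of the test statistic
    z_crit = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    return (1 - z.cdf(z_crit - ncp)) + z.cdf(-z_crit - ncp)


n_final = expected_completers(440, 35)  # 440 recruited, 35% attrition
power = two_arm_power(0.35, n_final)
print(n_final, round(power, 2))
```

Under these simplifying assumptions the expected final N is 286 and the approximate power is about 0.84, in the same range as the reported value of approximately 82%. The difference is plausibly due to the covariate-adjusted linear model used in the actual calculation.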
Randomization
The project director used a computer-generated allocation sequence to randomize participants to MOUD+A-CHESS or MOUD alone in a 1:1 ratio using a block design stratified by gender, site, and MOUD type. Block size was 16. The project director informed the site coordinator of group assignment by e-mail, identifying participants by identification number only. The site coordinator enrolled participants into their study arm and provided training in use of the app to participants who were assigned to MOUD+A-CHESS. Staff were blinded at baseline, before randomization, but as is generally the case with trials of mHealth for substance use disorders (33), blinding was not possible once participants received or did not receive the A-CHESS app.
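A stratified permuted-block scheme of the kind described above might look like the following sketch. The study's actual allocation software is not described, so the class name, the seeding, and the stratum labels are all hypothetical. Only the 1:1 ratio, the block size of 16, and the stratification factors come from the text.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

ARMS = ("MOUD+A-CHESS", "MOUD alone")
BLOCK_SIZE = 16

Stratum = Tuple[str, str, str]  # (gender, site, MOUD type)


class StratifiedBlockRandomizer:
    """Permuted-block 1:1 allocation, with separate blocks per stratum."""

    def __init__(self, seed: int, block_size: int = BLOCK_SIZE) -> None:
        if block_size % len(ARMS) != 0:
            raise ValueError("block size must be a multiple of the number of arms")
        self._rng = random.Random(seed)
        self._block_size = block_size
        self._pending: Dict[Stratum, List[str]] = defaultdict(list)

    def assign(self, stratum: Stratum) -> str:
        """Return the next allocation for a participant in this stratum."""
        queue = self._pending[stratum]
        if not queue:
            # Start a new block: equal numbers of each arm, randomly ordered.
            block = list(ARMS) * (self._block_size // len(ARMS))
            self._rng.shuffle(block)
            queue.extend(block)
        return queue.pop()
```

Within each gender-by-site-by-MOUD-type stratum, every completed block of 16 contains exactly 8 assignments to each arm, which keeps the arms balanced on the stratification factors even if recruitment is uneven across strata.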
Statistical Analysis
Outcomes were analyzed with mixed-effects models, using the glmmTMB() function from the glmmTMB package implemented in R. These models account for correlated measurements within participants, use all available data (allowing for intent-to-treat rather than only complete-case analysis), and provide unbiased estimates when data are missing at random (34). Each model included a random effect for participant and study time point, as well as fixed effects for time point, arm, and arm-by-time point interaction. Time point was treated as a continuous variable. Models predicting illicit opioid abstinence used a binomial distribution with a logit link. We also included covariates that were marginally related (p<0.2) to illicit opioid use. For our primary outcome, effects are described as significant if p<0.05. Secondary analyses (e.g., tests of moderation, alternative outcomes) should be considered exploratory, with their unadjusted p values interpreted in that context.
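Written out, the primary-outcome model described above might take the following form. This is a sketch under stated assumptions: the symbols are introduced here for illustration, and reading "a random effect for participant and study time point" as a random intercept plus a random slope for time is an interpretation, not something the text specifies.

```latex
\[
\operatorname{logit}\,\Pr(Y_{it}=1)
  = \beta_0 + \beta_1 A_i + \beta_2 t + \beta_3 (A_i \times t)
    + \boldsymbol{\gamma}^{\top}\mathbf{x}_{it} + u_{0i} + u_{1i}\,t,
\qquad
(u_{0i}, u_{1i})^{\top} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma})
\]
```

Here $Y_{it}$ indicates abstinence for participant $i$ at time point $t$, $A_i$ is the study arm, $\mathbf{x}_{it}$ holds the covariates, and $u_{0i}$, $u_{1i}$ are participant-level random effects. Under this parameterization, the arm-by-time point odds ratios reported in the Results correspond to $\exp(\beta_3)$.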
Each moderator was examined in separate models. Models assessing moderation by type of MOUD allowed type to vary across time based on participant self-report. Moderation was tested in two ways: methadone versus buprenorphine (participants receiving naltrexone or no MOUD were set to “missing” in this model) and methadone versus all other MOUD types (including no MOUD). This was done because of the small sample sizes of participants on MOUDs other than methadone and buprenorphine (see Table S1 in the online supplement). In models with withdrawal symptom severity as a continuous moderator, withdrawal and illicit opioid use were assessed concurrently, but a causal relationship could not be established because withdrawal questions referred to symptoms experienced over the preceding 4 months while questions about illicit opioid use referred to the past 30 days only. All follow-up analyses were drawn from the fitted model. A region-of-significance analysis was performed by adjusting the centering of the variables to find the values of withdrawal for which a significant interaction between arm and time point was observed. Both withdrawal severity and MOUD type were time-varying moderators. Simple slopes analyses were conducted by applying the emtrends() function from the emmeans package in R to the fitted model.
Results
Participants
A total of 414 participants received one of the two interventions and were included in the analyses (see Figure S2 in the online supplement for the CONSORT flow diagram). Recruitment began in April 2016 and ended in May 2018; the 16-month intervention period ended in September 2019, and data collection continued through May 2020.
Table 1 presents participant characteristics at baseline. Most participants identified as white (94.0%) and male (54.8%) and began regular use of illicit opioids at age 24. Employment status (yes/no), treatment history (number of times in treatment to stop using opioids), and baseline pain severity rating were associated with illicit opioid use and were included as covariates in adjusted models.
Time-stamped A-CHESS usage data (e.g., services selected, pages viewed, message text) were captured in our database. Participants used A-CHESS an average of 32.3% of days during the first 12 months and 18.3% of days during the second 12 months (see Table S2 in the online supplement). Of the original 208 MOUD+A-CHESS participants, 191 (91.8%) were using the app after the first month (30 days), 153 (73.6%) after 6 months (180 days), and 123 (59.1%) after 1 year (360 days).
Across all participants (N=414), 64.5% completed the 24-month survey. Rates of missed surveys differed statistically between arms at the 4-, 20-, and 24-month surveys. At 4 months, the missing survey rate for the MOUD-alone group was 7.4 percentage points higher than for the MOUD+A-CHESS group (χ2=4.00, p=0.045). At 20 and 24 months, the missing survey rate for the MOUD+A-CHESS group was 12.2 and 10.8 percentage points higher, respectively, than for the MOUD-alone group (χ2 values >5.24, p values <0.022; see Table S3 in the online supplement). The difference may be attributed in part to the fact that phone service was no longer provided to MOUD+A-CHESS participants after 16 months, affecting our ability to track and communicate with participants and possibly reducing their motivation to complete surveys.
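The reported p values follow from the χ2 statistics if each test has 1 degree of freedom, as would be the case for a 2×2 comparison of arm by survey completion. The degrees of freedom are not stated, so this is an assumption, and the helper name below is hypothetical.

```python
from math import erfc, sqrt


def chi2_pvalue_df1(chi2: float) -> float:
    """Upper-tail p value for a chi-square statistic with 1 degree of freedom.

    With 1 df, the statistic is the square of a standard normal deviate,
    so P(X > chi2) = P(|Z| > sqrt(chi2)) = erfc(sqrt(chi2 / 2)).
    """
    return erfc(sqrt(chi2 / 2.0))


p_month_4 = chi2_pvalue_df1(4.00)   # 4-month comparison
p_bound = chi2_pvalue_df1(5.24)     # lower bound quoted for months 20 and 24
```

Under this assumption, a χ2 of 4.00 gives a p value close to the reported 0.045, and a χ2 of 5.24 gives a value close to the reported bound of 0.022.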
Because we used mixed-effects models, which can handle missing data, all 414 participants (206 in the MOUD-alone arm and 208 in the MOUD+A-CHESS arm) who completed baseline surveys were included in the final analyses. A total of 267 participants (144 in the MOUD-alone arm and 123 in the MOUD+A-CHESS arm) completed the 24-month survey, which was 19 fewer than expected after attrition. All participants were analyzed according to original study arm assignment.
Outcomes and Estimation
Primary outcome.
There was no difference in illicit opioid abstinence between participants in the MOUD+A-CHESS and MOUD-alone arms across time (i.e., arm by time point; odds ratio=1.10, 95% CI=0.90–1.33, p=0.35; see Table S4 and Figure S3 in the online supplement for estimates by arm across time). An intent-to-treat analysis in which all missing outcomes were recoded as using illicit opioids also did not yield a significant difference between study arms in abstinence from illicit opioids across time (odds ratio=0.89, 95% CI=0.74–1.07, p=0.22). We did not test mediators because the primary outcome was not significant.
Type of MOUD (methadone vs. buprenorphine, all other MOUD options set to “missing”) moderated the effect of arm across time for abstinence from illicit opioids (i.e., MOUD type by arm by time point, odds ratio=0.57, 95% CI=0.34–0.97, p=0.039) (Figure 2). Simple slopes analysis for the time-point effect showed that for participants on methadone, the probability of abstinence significantly increased over time for those in the MOUD+A-CHESS arm (b=0.28, SE=0.09, p=0.003); the probability increased but not significantly for participants in the MOUD-alone arm (b=0.06, SE=0.08, p=0.48). These time-point slopes were not significantly different from each other (∆b=0.22, SE=0.11, p=0.053). For participants on buprenorphine, those in the MOUD-alone arm showed a significant increase over time in the probability of abstinence (b=0.68, SE=0.19, p<0.001); the probability also increased for those in the MOUD+A-CHESS arm, but not significantly (b=0.34, SE=0.17, p=0.053). These time-point slopes, too, were not significantly different from each other (∆b=−0.34, SE=0.25, p=0.166).
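The simple slopes and the three-way interaction are arithmetically linked. As a rough check, assuming that the reported slopes are on the log-odds scale and that the interaction term contrasts buprenorphine with methadone (a coding the text does not state), the three-way odds ratio is the exponentiated difference between the two arm differences in slope:

```latex
\[
\mathrm{OR}_{\text{type}\times\text{arm}\times\text{time}}
  = \exp\!\left(\Delta b_{\text{buprenorphine}} - \Delta b_{\text{methadone}}\right)
  = \exp(-0.34 - 0.22)
  = \exp(-0.56) \approx 0.57
\]
```

This agrees with the reported odds ratio of 0.57. It also illustrates why the three-way interaction can reach significance even though neither within-medication slope difference does: the two differences point in opposite directions.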
We also observed this moderation effect for abstinence from illicit opioids when comparing methadone with all other MOUD types (buprenorphine, naltrexone, and no MOUD) in a more conservative model including the full sample (odds ratio=0.65, 95% CI=0.43–0.99, p=0.044).
Withdrawal symptom severity also moderated the effect of arm across time for abstinence from illicit opioids (i.e., withdrawal by arm by time point; odds ratio=0.95, 95% CI=0.91–1.00, p=0.047). A region-of-significance analysis showed that the moderation effect of arm by time point for abstinence emerged only for participants who reported no withdrawal symptoms (scored zero) (odds ratio=1.30, 95% CI=1.01–1.67, p=0.039); this effect was nonsignificant for any rating of withdrawal severity (range, 1–10) when symptoms were present (Table 2). However, it should be noted that the majority of participants (more than 60%) reported withdrawal scores of zero, which is where we observed the significant arm-by-time point interaction.
Figure 3 displays the moderation effect of withdrawal symptom severity. Simple slopes analyses are also provided in the figure for the effect of arm across time for the 25th, 50th, and 75th percentiles of severity. As Table 2 and Figure S4 in the online supplement show, withdrawal symptoms were positively skewed (skew=0.96); thus, both the 25th and 50th percentiles were equal to the score of zero.
Gender, pain severity, and loneliness did not moderate the difference between participants in the MOUD+A-CHESS and MOUD-alone arms across time for the primary outcome (see Table S5 in the online supplement for inferential statistics).
Secondary outcomes.
There was no difference in use of illicit marijuana, sedatives, stimulants, or alcohol between participants in the MOUD+A-CHESS and MOUD-alone arms across time (see Table S6 in the online supplement for inferential statistics), nor did we find significant differences in quality of life between the MOUD+A-CHESS and MOUD-alone arms across time.
However, as shown in Figure 4, we found significant arm-by-time point effects for meeting attendance, one of our measures of retention in treatment (odds ratio=1.25, 95% CI=1.05–1.49, p=0.014), and for emergency department or urgent care visits, a measure of health services use (odds ratio=0.88, 95% CI=0.78–0.99, p=0.034). Simple slopes analysis for the time-point effect showed that participants in the MOUD+A-CHESS arm had slower declines in meeting attendance (b=−0.21, SE=0.07, p=0.001) than those in the MOUD-alone group (b=−0.44, SE=0.07, p<0.001). Those in the MOUD+A-CHESS arm had fewer emergency department or urgent care visits across time (b=−0.20, SE=0.05, p<0.001) compared with those in the MOUD-alone arm (b=−0.07, SE=0.04, p=0.136).
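The arm-by-time point odds ratios above can be approximately recovered from the simple slopes. The sketch below assumes that the slopes are on the log-odds scale and that the interaction is the MOUD+A-CHESS slope minus the MOUD-alone slope. Neither assumption is stated for these secondary outcomes, and the function name is hypothetical.

```python
from math import exp


def interaction_odds_ratio(slope_intervention: float, slope_control: float) -> float:
    """Arm-by-time odds ratio implied by two per-arm time slopes (log-odds scale)."""
    return exp(slope_intervention - slope_control)


# Meeting attendance: slower decline in the MOUD+A-CHESS arm.
meeting_or = interaction_odds_ratio(-0.21, -0.44)

# Emergency department or urgent care visits: steeper decline in the MOUD+A-CHESS arm.
ed_visits_or = interaction_odds_ratio(-0.20, -0.07)
```

Under these assumptions the implied values are close to the reported odds ratios of 1.25 and 0.88, with small differences attributable to rounding of the published slopes.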
With regard to our other retention-in-treatment variables, we did not find significant differences between the MOUD+A-CHESS and MOUD-alone arms across time for staying on MOUD (odds ratio=0.90, 95% CI=0.75–1.07, p=0.22). We also did not find differences in outpatient visits or therapy/counseling, and we were unable to test residential treatment center attendance because of a lack of variability in the data. Among our other measures of health services use (hospitalizations, other provider visits), we also found no significant differences. Each variable was analyzed separately.
Discussion
In our primary analysis including all participants, we did not find that A-CHESS increased abstinence for those who used illicit opioids or other substances. This null finding indicates that A-CHESS did not benefit the average study participant with regard to our primary outcome. However, other planned analyses suggested possible differences between subsets of participants in the MOUD+A-CHESS arm and between arms on certain secondary outcomes: Among participants not experiencing withdrawal symptoms, those in the MOUD+A-CHESS arm were more likely than those in the MOUD-alone arm to remain abstinent from illicit opioids; MOUD type moderated A-CHESS effects such that MOUD+A-CHESS appeared to be more effective for participants on methadone compared with those on buprenorphine; and relative to participants in the MOUD-alone arm, those in the MOUD+A-CHESS arm had greater meeting attendance and fewer emergency department or urgent care visits. Although these tests of moderation and effects on secondary outcomes were specified prior to data analysis, they should be interpreted cautiously and replicated in future studies, given that we conducted a large number of tests.
While there have been promising pilot studies in the past decade (35, 36), to our knowledge this is the first large, long-term (24 months) randomized clinical trial to test effects of a smartphone intervention in combination with MOUD. mHealth in general is a rapidly expanding field, with benefits of accessibility, cost, versatility, and fidelity and with potential to augment treatment and extend the reach of evidence-based interventions (37). For substance use disorders in particular, mHealth may reduce stigma as well as provide “just-in-time” intervention because of the portability of smartphones. There are, in fact, countless apps claiming to facilitate recovery available for download, but almost none are regulated or proven (37, 38). In a recent evaluation of 904 free or low-cost apps, only seven offered evidence-based content (38). Hence, mHealth for illicit substance use is in a “formative stage” (39), with substantially more clinical research and dissemination effort needed to realize its potential (37–39). The present study did not find between-group differences for our primary outcome, but it suggests questions to pursue regarding the potential contribution of mHealth for the average patient receiving MOUD.
Relative to mobile apps and eHealth in general (40, 41), A-CHESS usage data indicated high use of the app (91.8% of participants at 1 month, 73.6% at 6 months, 59.1% at 12 months), and yet MOUD+A-CHESS did not increase abstinence relative to MOUD alone. Sustaining engagement is a good start, but research is needed to understand what specific content, services, or design variables are effective in reducing substance use or sustaining abstinence (39). As described in the online supplement, A-CHESS offers features intended to reduce and distract from cravings, provide peer support, remind patients of reasons to abstain, connect them with clinic support, alert them to real-time risks, provide relevant health news and information, locate support meetings, and more. Future studies should focus on the effectiveness of individual features and on identifying, developing, and testing features and aspects most likely to assist and sustain recovery. It is possible that a future, optimized version of A-CHESS or similar mHealth tools could produce benefits on average not seen in the present trial.
The study had several limitations worth noting. Participants in the MOUD+A-CHESS arm were provided with smartphones and Internet service; hence, there were incentives to join and continue participation that may limit the degree to which the results generalize to real-world implementation involving individuals’ own cellphones and data plans. Further affecting generalizability is that participants were drawn from treatment centers in areas with little racial or ethnic diversity.
The study also had limitations with regard to examining the moderating effects of MOUD type. We did not have equivalent numbers of participants for each MOUD. At baseline, 300 participants were receiving methadone, 90 were receiving buprenorphine, and 24 were receiving injectable naltrexone. Moreover, a patient’s treatment medication could vary during the study. In addition, few participants receiving buprenorphine had used opioids in the past 30 days at baseline and, by chance, the majority of them were assigned to the MOUD-alone arm (14/44, compared with 8/46 in the MOUD+A-CHESS arm). Without a larger and more balanced sample, we cannot determine whether A-CHESS would have had similar effects across all MOUD types.
Finally, participants’ MOUD dosage information was not available. Region-of-significance analyses conducted to clarify the moderating effect of withdrawal symptom severity suggested that the benefits of A-CHESS were limited to periods when participants reported no withdrawal symptoms. Withdrawal scores were highly positively skewed such that participants spent much of their 24 months in the study free from withdrawal symptoms. A-CHESS appeared to be beneficial during those withdrawal-free periods in which medication dosing was adequate to relieve symptoms and/or acute withdrawal symptoms had subsided. However, detailed data on medication dosing and use of a thorough withdrawal symptom assessment tool (e.g., that used by Amass et al. [42]) could help clarify how and when A-CHESS might maximally benefit patients. Future app development and research could involve testing new content within A-CHESS focused on coping with withdrawal symptoms.
Mobile health systems have the potential to be as present in patients’ lives as the symptoms of addiction, offering the promise of help anytime and anywhere. This study aimed to understand whether bundling MOUD with a mobile relapse-prevention system could improve long-term recovery from opioid use disorder. Our results indicate that, on average, adding A-CHESS does not improve abstinence from illicit opioid use. However, the app may help certain patients under certain conditions. In particular, patients appeared to be more likely to benefit during periods when they were not experiencing withdrawal symptoms, and patients receiving methadone with A-CHESS appeared to benefit more than those receiving other types of MOUD with A-CHESS. Finally, the app appeared to have a positive impact on the use of certain health services. More research to identify effective adjuncts to support patients using any MOUD is needed.
Acknowledgments
The authors thank Laura Reuter, at SSTAR, for her extraordinary efforts with recruitment, data collection, and participant support, and Lori L. Gustafson, Ph.D., for her invaluable review of the analysis and manuscript.