The past decade has seen the development of promising treatments for the care of individuals experiencing the onset of schizophrenia. International research as well as the National Institute of Mental Health (NIMH)–sponsored Recovery After an Initial Schizophrenia Episode project have provided the foundational evidence for the effectiveness of coordinated specialty care (CSC). CSC includes evidence-based psychopharmacological management and general health care, cognitive and behaviorally oriented individual or group psychotherapy, family support and education, supported education and employment, case management, and, more recently, peer support. The promise of CSC for promoting the recovery of individuals developing schizophrenia-related disorders has led to the expansion of federal funding for CSC programs through the federal community mental health block grant program (1).
NIMH leadership recognized the research opportunities afforded by this expansion in CSC programs, but it also understood that the number of new programs with connections to academic centers was limited. Novel strategies were needed to increase the number of CSC sites that are capable of science-to-service or service-to-science translation. Developing a common framework for data collection and reporting was seen as a way to strengthen connections within and between scientific and clinical service agencies.
Faced with similar challenges in genomics research, the National Human Genome Research Institute established the PhenX (consensus measures for phenotypes and exposures) toolkit to provide standard measures for use in genomewide association studies. PhenX measures are selected by working groups of domain experts by using an established consensus process (2). Working groups select potential measures for the PhenX toolkit on the basis of criteria established by the PhenX steering committee. The criteria stipulate that selected measures be valid and reliable, have demonstrated utility, impose a low burden on participants and investigators, and be broadly applicable and generally acceptable. Outreach to the scientific community provides feedback to inform final deliberations and selection of measures and protocols. Measures selected by working groups are freely available to the biomedical research and clinical community through the PhenX toolkit (https://www.phenxtoolkit.org).
Driven by awareness of a similar need, NIMH funded a supplement to the PhenX toolkit for use in early psychosis. PhenX measures for early psychosis included clinical service delivery and translational research panels. The clinical panel was charged with identifying measures with utility in routine clinical settings. It included a diverse mix of members, including individuals with extensive community-based implementation experience as well as personal and/or family experience of psychosis. Recommendations from both panels were sent to the scientific and early psychosis clinical communities for review; stakeholder comments informed the final selection of measures.
The PhenX early psychosis clinical services panel identified 19 measures for inclusion in the toolkit (see online supplement) (3). These covered a range of domains, including premorbid functioning, self-reported and rater-assessed symptoms, social and occupational functioning, well-being, personal recovery, medication adherence and side effects, physical activity, and experience of treatment with respect to shared decision making and recovery orientation. Family measures assessed family history of mental illness, family functioning, family burden, and expressed emotion. The selected measures complement measures already in the PhenX toolkit, such as use of alcohol, tobacco, and other substances; exposure to trauma and adversity; crime and delinquency; and anthropometrics.
The clinical services and translational research panels also collectively recommended six supplemental measures that did not reach the threshold of adequate psychometric properties or ease of use required of measures included in the collection (3; unpublished data, Ongur D, Carter CS, Gur RE, et al., 2019).
Challenges in Use of PhenX Measures for Early Psychosis
Several issues arose in the development of the clinical services collection that pose challenges in facilitating the service-to-science bridge sought by NIMH. Unless resolved, these issues will impede the creation of a learning health care system that allows data- and outcome-based quality improvement.
Availability and Selection of Measures
In the past, multiple PhenX work groups have selected common data elements for psychiatric disorders, but the early psychosis clinical services panel was the first with the explicit charge of identifying measures with utility in routine clinical care. The group’s initial work schematic identified a series of potential domains for exploration, including both individual-level and program-level constructs of interest. Unsurprisingly, the availability of measures meeting criteria established by the PhenX steering committee varied substantially across domains, with far more measures available in more classical clinical domains (for instance, psychopathology) than in broader psychosocial domains, such as vocational achievement and social functioning. Listed below are key measurement concepts that are not included in the toolkit.
Duration of untreated psychosis (DUP).
DUP has emerged as a critical and potentially tractable influence on both short- and long-term functioning (4). Objective, reliable measurement of DUP poses multiple challenges, most stemming from the inherent difficulties involved in anchoring and dating initial psychosis and a lack of consensus concerning definitions of entry into treatment (5). Debates with respect to the latter include whether initiation of antipsychotic pharmacotherapy is sufficient. Best practices developed to date include administration of lengthy semistructured questionnaires with collateral validation, typically supervised by experienced senior clinicians or researchers.
Program-level measures.
Although the clinical services panel selected one service satisfaction measure and one measure of the perceived recovery orientation of the treating program, the limitations of these measures (and others reviewed) include lack of specificity for the range of components in best practice of CSC (6), for team-based care models, and for services designed to meet the sometimes unique needs of transition-age youths and emerging adults. The work group was also unable to identify a sufficiently validated and cross-cutting fidelity metric or set of objective programmatic or structural quality indicators. Key domains not adequately covered by the two measures selected include availability and quality of trauma-focused services; cultural competency; and treatment for substance use disorders, suicidality, and other health conditions.
Vocational outcomes.
Helping clients return to full lives in the community is one of the express goals of CSC, elevating the importance of metrics focused on functioning at school or work. In spite of this goal, the clinical services panel was unable to locate a psychometrically validated measure of the quality of involvement with school or work or any related measures—for example, client satisfaction with work/school involvement, underemployment, or fit between current involvement and future vocational goals.
Clinician Report Versus Client or Family Self-Report
Research has consistently found meaningful differences between client and clinician ratings or reports for the same constructs (7–10). These differences do not necessarily mean that one or both forms of measurement are invalid or less reliable. Rather, client and clinician ratings often capture different latent constructs, reflecting the perspective of the individual completing them and correlating with different variables of interest (11, 12). Some decisions as to which perspective to include hinged solely on availability (e.g., global clinician-rated role functioning versus client perception), whereas in other cases the panel included both (e.g., symptoms) or opted for one or the other based on panel consensus.
Measures Across the Age Range
Many of the dimensions of importance in early psychosis care may have differential validity across the age range of individuals receiving services (13). Some of the selected measures have different versions for adults and children. For example, the Personal Well-Being Scale has different versions for children/adolescents and adults, as does the incarceration measure. In contrast, the Family Functioning Scale was validated for use with individuals ages 12 and older. In addition, specific domains that have been developed in the context of adult services, including standard operationalizations of recovery, may not neatly translate to early intervention service settings. Finally, multiple toolkit measures have not specifically been tested with adolescents.
Challenges Related to Implementation in Nonacademic Clinics
Clinical services panel and community feedback underscored the extent to which integrating even modest standard measurement practices would be challenging not only for research but also for local quality improvement activities. These challenges are related to time, training, and the administrative structure needed to track and monitor data input and analyses. Clinicians and frontline staff in the United States, for example, often receive only minimal training in research-oriented clinical measurement (14). Without appropriately trained staff to ensure the integrity of client-reported measures, according to panel members, program clinicians could introduce bias, for example, by directly administering (and collecting) client self-report measures for potentially sensitive constructs, such as medication adherence or shared decision making. Notably, the ability of treating clinicians to administer the measures without intensive training was an important selection criterion. Additional concerns centered on data quality assurance, including the potential for a high volume of missing data, a common challenge in community-based performance monitoring. The time and infrastructure necessary to complete assessments are often unavailable.
Even when the infrastructure is available, existing data collection/performance monitoring requirements also vary across states. Some programs are mandated to utilize particular metrics or data collection systems—dictated by the state’s central health authority, county or city governments, or other funding entities. An individual CSC program may have little control over these mandates. In such contexts, the addition of PhenX measures might be perceived as adding little value or as overburdening programs.
The clinical utility of the measures hinges on timely reporting about individual patients or clinical programs to the clinical providers, patients, and patients’ family members. Most clinical programs lack the infrastructure to develop reporting methods. A critical area for future research involves development of standard reporting and data collection procedures that will increase the impact of data in clinical decision-making.
Conclusions
Despite the challenges, the early psychosis clinical services panel identified standard measures suitable for academic centers and routine clinical care. This panel sets an example for future science-to-service and service-to-science initiatives as well as quality improvement efforts. The participation of end users and individuals affected by illness likely influenced the inclusion of measures that are more person and clinician centered. Further, by highlighting significant psychometric gaps, the effort suggests priorities for future measure development efforts. The process has also connected with the ongoing imperative to embrace measurement-based care as a foundation for achieving the triple aim of better health, better health care, and improved efficiency.