We make many decisions every day, based on factors such as our current goals and the value we perceive in particular actions or choices (i.e., “good for me” or “bad for me”). With many of these decisions, we have the opportunity to experience their outcomes. At times an outcome is expected; at others it can be surprising. By comparing what we expected to happen with what actually did, we learn how to make better choices next time. This process is so prevalent, and so often automatic, that many individuals may not realize how frequently it occurs. Yet a great many neural resources are dedicated to continually making predictions, experiencing the consequences of choices, and adjusting behavior to optimize outcomes. Our brains have been described as Bayesian machines that optimize behavior on the basis of predictions and outcomes (1). It is therefore not surprising that this adaptive process is perturbed in various ways across psychopathology.
Patients with generalized anxiety disorder, the focus of a study by White et al. (2) reported in this issue of the Journal, seem at least phenomenologically to be impaired in this process of prediction, feedback, and learning (3). The degree to which these patients seem maladaptively “stuck in their ways” can frustrate even the most experienced clinician. Despite what might appear to be persistent negative feedback on this style of decision making, often little seems to change for these patients. How do we go about understanding the processes that have gone awry in their brains? And what does investigating biological perturbations in decision making teach us about the pathophysiology and treatment of generalized anxiety disorder? Answers may be found by applying computational models to an experimental probe of decision making: operant reinforcement learning.
A method for understanding a process such as learning and decision making can be considered computational when it uses mathematics to model the mechanistic components through which specific mental functions interact to drive behavior. Such models can be very powerful in that they can predict future behavior and explain which components of information processing may be altered under certain conditions or in certain populations. Over the past decade, the study of operant learning, in which decisions are reinforced through rewards or punishments, has produced successive iterations of these models and shown that they track and predict choice behavior fairly accurately (4). Central to these models are the calculation of the expected value of a stimulus or action, the encoding of the reward or punishment received after a choice is made, and the calculation of the discrepancy between what was received and what was expected (called a prediction error). A positive prediction error indicates a better outcome than expected, while a negative prediction error indicates a worse one. The signaling of a prediction error, which a large body of work in experimental animals and humans has attributed to dopaminergic neurons, results in updating of the expected value signal, moderated by a learning rate that dictates the degree to which information from the previous trial is carried into the next one. If this process works well, the individual quickly minimizes prediction errors by adjusting the expected value of a choice, and hence its selection, to its optimal level.
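To make these components concrete, consider the minimal delta-rule learner sketched below in Python, of the sort these models build on. The variable names and parameter values are illustrative assumptions, not those of any particular published model: on each trial the learner computes a prediction error (the outcome r minus the expected value v) and nudges v toward the outcome in proportion to a learning rate alpha.

```python
import random

def simulate_delta_rule_learner(n_trials=200, alpha=0.3, p_reward=0.8, seed=0):
    """Minimal delta-rule learner: a single option whose reward probability
    must be learned from trial-by-trial feedback."""
    rng = random.Random(seed)
    v = 0.0                       # expected value of the option
    history = []
    for _ in range(n_trials):
        r = 1.0 if rng.random() < p_reward else 0.0  # outcome on this trial
        delta = r - v             # prediction error: outcome minus expectation
        v = v + alpha * delta     # value update, scaled by the learning rate
        history.append((r, delta, v))
    return history

history = simulate_delta_rule_learner()
# As learning proceeds, v converges toward the true reward probability (0.8)
# and the prediction errors shrink in magnitude.
print(f"final expected value: {history[-1][2]:.2f}")
```

A higher learning rate makes the value estimate track recent outcomes more closely, at the cost of greater trial-to-trial volatility.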
Brain imaging studies have picked up on this modeling work over the past decade (4) and found that trial-to-trial variations in signals such as value and prediction error are mirrored by trial-to-trial variations in brain activity. Value signaling has often been linked to activity in the ventromedial prefrontal cortex and ventral striatum (5, 6), while prediction error signaling has been linked to activity in the ventral tegmental area, the striatum, the dorsal cingulate and medial prefrontal cortices, the anterior insula, and other regions (7). While the terminology associated with computational neuroscience (in this context often called “computational psychiatry”) can seem foreign to the general clinical reader, it is meant to describe a learning process that is inherently intuitive to all.
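In practice, these links are typically made by regressing the measured neural signal on the model's trial-by-trial estimates. The sketch below shows that logic in its simplest form, with synthetic data standing in for a region-of-interest time series; the analysis pipelines of the studies cited above are far more elaborate (hemodynamic convolution, nuisance regressors, group-level statistics), so treat this only as a schematic.

```python
import numpy as np

# Synthetic stand-ins: trial-by-trial prediction errors from a fitted learning
# model, and the trial-by-trial response of a hypothetical region of interest
# that partially tracks them.
rng = np.random.default_rng(0)
n_trials = 200
prediction_errors = rng.normal(size=n_trials)            # from the model fit
roi_response = 0.5 * prediction_errors + rng.normal(size=n_trials)

# Regress the neural response on the model-derived prediction errors; the
# slope estimates how strongly the region's activity tracks prediction error.
X = np.column_stack([np.ones(n_trials), prediction_errors])
beta, *_ = np.linalg.lstsq(X, roi_response, rcond=None)
print(f"intercept: {beta[0]:.2f}, prediction-error slope: {beta[1]:.2f}")
```

A reliably positive slope across participants is what is meant by a region “signaling” prediction error; a reduced slope in patients is the kind of blunting described next.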
In their experiment, White et al. gave patients with generalized anxiety disorder and healthy comparison participants a decision-making task in which responses were associated with different probabilities of reward and punishment, thus presenting an opportunity to learn from both. Patients failed to learn from rewards and punishments as well as healthy individuals did, as reflected in a persistently high error rate among patients, while healthy subjects progressively decreased their errors. Turning to the brain, the authors’ primary finding was a profound and widespread reduction in prediction error signaling in patients. That is, activity in regions such as the cingulate and medial prefrontal cortex, the striatum, and the insula was blunted in its normal signaling that a prediction error had occurred, in response to both reward and punishment. By contrast, the neural representation of expected value was somewhat diminished, but the reduction failed to reach the authors’ statistical threshold for significance. The authors therefore suggest that the impairments in learning were due to inadequate experience of prediction errors and, consequently, inefficient updating of value. Decision making about which stimulus to select in order to receive reward or avoid punishment thereby becomes less accurate and thus suboptimal. Taken by itself, this work provides critical new insight into generalized anxiety disorder. It provides a computational vantage point on patients who are rarely studied computationally (unlike patients with depression, for example). It also narrows down the range of potential brain dysfunctions that can account for impaired reinforcement learning, which has been taken to represent a canonical context for understanding decision making.
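Why would blunted prediction error signaling, with value representation relatively spared, produce a persistently high error rate? A toy simulation makes the logic visible. The scaling factor below (pe_gain) is purely an illustrative assumption, not a quantity estimated by White et al.: shrinking the prediction error before each value update slows the separation of the learned values, so choices stay close to chance for much longer.

```python
import math
import random

def run_agent(pe_gain, n_trials=300, alpha=0.3, beta=5.0, seed=1):
    """Two-option probabilistic learning task: option 0 is rewarded 80% of
    the time, option 1 only 20%. pe_gain scales the prediction error before
    the value update (1.0 = intact signaling; <1.0 = blunted). Returns the
    fraction of trials on which the worse option was chosen."""
    rng = random.Random(seed)
    p_reward = [0.8, 0.2]
    v = [0.0, 0.0]
    errors = 0
    for _ in range(n_trials):
        # Softmax choice: the larger the learned value difference, the more
        # reliably the better option is selected.
        p_choose_0 = 1.0 / (1.0 + math.exp(-beta * (v[0] - v[1])))
        choice = 0 if rng.random() < p_choose_0 else 1
        if choice == 1:
            errors += 1
        r = 1.0 if rng.random() < p_reward[choice] else 0.0
        delta = r - v[choice]                 # prediction error
        v[choice] += alpha * pe_gain * delta  # blunting shrinks the update
    return errors / n_trials

print(f"intact PE signaling:  error rate = {run_agent(pe_gain=1.0):.2f}")
print(f"blunted PE signaling: error rate = {run_agent(pe_gain=0.1):.2f}")
```

With intact signaling the simulated error rate falls quickly; with the prediction error scaled down, the average error rate over the session remains markedly higher, qualitatively resembling the patients' behavior.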
This work also provides a more mechanistic framework for asking clinically relevant questions. For example, what is the cause of impaired prediction error signaling? It has been argued that one function of chronic worry is to control the intensity of emotional experience: a chronic, mild state of negative emotion is persistently maintained in order to attenuate the magnitude of any affective shift that unexpected outcomes might produce (8). Perhaps one consequence of this process is that the emotional salience associated with prediction errors during learning is blunted. Unfortunately, extrapolating from the results of this study, one consequence of persistently blunted prediction errors would be exactly the “stuck in their ways” phenomenology described clinically. That is, patients are seemingly unable to learn from their experience and adapt their thinking and behavior in ways that optimize outcomes for them.
What other processes might be affected by blunted prediction error signaling? We have argued that emotion regulation can fundamentally be understood as a value-based decision-making process akin to reinforcement learning (9). Blunted prediction error signaling, especially in the dorsal anterior cingulate, may therefore help account for impairments in emotion regulation in these patients.
Finally, how can this information be used therapeutically? Since prediction error signaling has been tied to the actions of dopamine, medications that specifically enhance phasic dopamine release or action may remediate abnormalities in learning and decision making. This might even be accomplished through neurofeedback (10). Alternatively, therapeutic procedures that improve sensitivity to and monitoring of emotional salience (a core component of psychotherapies for this condition [8]) may in turn improve prediction error signaling. While answers to these important questions are not yet known, it is clear that the use of a computational neuropsychiatry approach in this study and others like it advances our mechanistic understanding of psychopathology and our sophistication in explaining it.