Individuals with autism spectrum disorders display a unique, pervasive pattern of learning abilities and disabilities. Their learning of simple associations, facts, details, and habits is intact (1), as demonstrated by their responsiveness to operant conditioning-based interventions, ability to memorize extensive information about special interests, and insistence on sameness (2). However, they are impaired in goal-directed learning (3), rapid processing of ambiguous stimulus feedback contingencies (4, 5), and generalization of learning to new contexts (6). This impairment greatly interferes with everyday social, language, and classroom learning, which typically is goal-oriented and involves quick, subtle, and inconsistent stimulus/reward sequences (7–9). A mechanistic explanation for this pattern has remained elusive.
Animal, computational modeling, and human neuroimaging studies suggest that learning simple associations and facts typically recruits the hippocampus, whereas the processing of the quick, inconsistent stimuli encountered in everyday life recruits the striatum (10). The striatum receives projections from brainstem nuclei of the dopamine system and generates prediction error signals (dopamine bursting and dips) that train the organism about the reward value of different actions (11). Over time, this process leads to the development of reward-based working memory mediated by the orbitofrontal cortex (12).
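The prediction error idea described above is commonly formalized as a delta-rule (Rescorla-Wagner style) update, in which the estimated value of an action moves toward the obtained reward in proportion to the error. The following is a minimal illustrative sketch, not the model used in this study; the function names, the learning rate `alpha`, and the reward probability `p_reward` are assumptions introduced only for illustration.

```python
import random


def update_value(value, reward, alpha=0.1):
    """Apply one delta-rule update and return (new_value, prediction_error).

    A positive error (reward better than expected) loosely corresponds to a
    dopamine burst, a negative error to a dip. `alpha` is an assumed learning rate.
    """
    delta = reward - value          # prediction error
    return value + alpha * delta, delta


def simulate(p_reward=0.8, n_trials=200, alpha=0.1, seed=0):
    """Learn the value of one action that is rewarded with probability p_reward."""
    rng = random.Random(seed)       # seeded for reproducibility
    value = 0.0
    for _ in range(n_trials):
        reward = 1.0 if rng.random() < p_reward else 0.0
        value, _ = update_value(value, reward, alpha)
    return value
```

With repeated trials the learned value fluctuates around the true reward probability, and a smaller `alpha` gives a smoother but slower estimate.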
As learning becomes more goal-directed and driven by unpredictable feedback, it requires increasing recruitment of cognitive control-related brain regions, including lateral (for information maintenance and manipulation) and anterior (for higher-order rule representation) regions of the prefrontal cortex. This prefrontal involvement permits the online maintenance of the task-related working memories (13) that make possible more complex forms of goal and unpredictable feedback-driven learning. In this more complex learning, prefrontal dysfunction results in a paradoxical pattern of performance. It is relatively harder to learn more predictable pairings of information where using reward-based working memory would be helpful and relatively easier to learn less predictably reinforced pairings of information where the use of reward-based working memory is actually misleading (5).
During learning, organisms must predict future rewards, evaluate errors in those predictions, respond to volatility in signals about the probability with which rewards are received, and recruit cognitive systems when additional effort is required. These processes engage the anterior cingulate cortex, which then signals the orbitofrontal cortex and other prefrontal regions to increase available working memory-related resources (14). To the extent the prefrontal cortex cannot be brought online “proactively” to sustain task- or reward-based working memories, the learner may engage in a less efficient, slower strategy whereby rules and task memories are retrieved from the hippocampus and, to the extent possible, the prefrontal cortex “reactively” on a trial-by-trial basis (15).
In the present study, we investigated the neural mechanisms of learning in young adults with an autism spectrum disorder and with typical development using a probabilistic reinforcement task with both a stimulus and a feedback epoch. First, given their impairments in prefrontal functioning (16, 17), we predicted that the autism spectrum disorders group would be less able to accurately encode task- and reward-based working memories to guide responding, as demonstrated by greater error rates on high-probability AB trials early in learning, reduced error rates on low-probability EF trials, and reduced sustained recruitment of cognitive control-related regions of the prefrontal cortex during all trials. Second, as found previously (5), we hypothesized that the autism spectrum disorders group would show attenuated reward anticipation during stimulus presentation, as reflected in poorer win-stay performance (the selection of the previously rewarded stimulus on the next trial of that type). Finally, given their impairments in reward-based working memory and dysregulated positive feedback processing, we predicted that the autism spectrum disorders group, compared with the typically developing group, would exhibit a more reactive pattern of task performance involving less proactive cognitive control and neural recruitment during the stimulus epoch of the task and greater reactive recruitment of neural circuits of the anterior cingulate cortex during feedback processing, a pattern demonstrated recently in adolescents with autism spectrum disorders (18).
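The win-stay measure defined above (selecting the previously rewarded stimulus on the next trial of the same type) can be computed from a trial log as sketched below. This is an illustrative sketch only, not the scoring code used in this study; the `(pair, choice, rewarded)` tuple layout and the function name `win_stay_rate` are assumptions made for the example.

```python
def win_stay_rate(trials):
    """Return the proportion of rewarded trials followed by the same choice
    on the next trial of the same pair type, or None if there are no such trials.

    `trials` is an assumed format: a list of (pair, choice, rewarded) tuples
    in presentation order, e.g. ("AB", "A", True).
    """
    last = {}           # pair -> (choice, rewarded) on the most recent trial of that pair
    wins = stays = 0
    for pair, choice, rewarded in trials:
        if pair in last:
            prev_choice, prev_rewarded = last[pair]
            if prev_rewarded:               # previous trial of this type was a "win"
                wins += 1
                if choice == prev_choice:   # participant "stayed" with that stimulus
                    stays += 1
        last[pair] = (choice, rewarded)
    return stays / wins if wins else None


# Hypothetical trial log, made up for illustration
example = [
    ("AB", "A", True), ("EF", "E", False), ("AB", "A", True),
    ("EF", "F", True), ("AB", "B", False), ("EF", "F", False),
]
rate = win_stay_rate(example)
```

Tracking the most recent trial separately for each pair type matters because pairs are interleaved, so the "next trial of that type" is usually not the next trial in the sequence.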
Discussion
We examined the performance of young adults with autism spectrum disorders on a task simulating real-life feedback processing. Individuals with autism spectrum disorders showed learning deficits related to impairment in the ability to build a reward-based working memory representation. They exhibited poorer performance on the simplest AB pair. During the stimulus epoch, they showed reduced medial prefrontal activation associated with the probability of having learned the AB pair and less superior and anterior prefrontal cortex recruitment early and late in the task for all trial types. Additionally, they exhibited dysregulated feedback processing and made fewer previously rewarded choices. This reduced use of prior rewards is likely the cause of their poor performance on the AB pairs, which involved receiving the most positive feedback. During the feedback phase of the task, in contrast to the typically developing group, which developed a reward-based working memory, the autism group persisted in recruiting brain regions associated with feedback-based trial-by-trial learning.
To our knowledge, this study is the first fMRI investigation of reinforcement learning with probabilistic feedback in young adults with autism spectrum disorders. Most other studies of learning and reward have examined aspects of the social motivation theory (28), which proposes that individuals with autism spectrum disorders exhibit early impairments in learning from social rewards provided by caregivers, leading to a cascade of atypical cognitive, social, and language development. However, the social motivation theory does not 1) provide a model for the full range of autism-related learning and reward-processing impairments, 2) address when problems in the learning process occur, or 3) explain learning deficits in nonsocial contexts.
Two perspectives from the literature may help interpret our findings. First, we examined the results in light of the widely held neurocognitive view that autism spectrum disorders involve intact simple but impaired complex information processing (1). Indeed, an fMRI study applying this view to learning suggested that the autism group exhibited reductions in the recruitment of prefrontal brain regions and reduced prefrontal functional connectivity compared with individuals with typical development (3). However, given that in the present study participants with autism spectrum disorders performed significantly worse on the simplest stimulus pair and equivalently on more difficult ones, a deficit in complex rather than simple information processing does not provide a satisfactory explanation of our results.
Second, studies of social motivation theory demonstrate that persons with autism anticipate and seek social (29, 30), and even nonsocial (5, 31, 32), rewards less than persons with typical development and show decreased recruitment of striatal brain regions. The first study of reward processing in adults with autism spectrum disorders provides further support for the idea of dysregulated feedback processing. In that study, the autism group exhibited increased ventromedial prefrontal and anterior cingulate recruitment (33) in response to feedback compared with typically developing adults, suggesting that they may have experienced greater arousal, conflict, and signaling for the increased allocation of attention when evaluated (34). Results of our present study are consistent with both these points of view.
Findings provide several insights that may inform intervention. First, psychosocial and neural retraining, as well as psychopharmacological interventions that enhance dopaminergic or noradrenergic function in the prefrontal cortex and improve the ability to maintain information online, may secondarily improve learning. Second, the ability to experience positive feedback and/or rewards and to orient behavior toward them may be compromised in individuals with autism spectrum disorders (35). Consequently, it may be especially important to make positive feedback explicit and salient in academic and intervention settings. Indeed, this strategy already is used in autism treatments for young children, including in early intensive-behavioral intervention, in which praise is delivered using exaggerated positive affect (e.g., reference 36), in social skills groups for high-functioning school-age and adolescent children, which use positive behavioral supports in an enjoyable setting (37), and in coaching models that teach parents to deliver feedback clearly and positively (38). Future studies can help refine our understanding of the mechanisms by which positive feedback is effective.
It remains unclear whether the elevated cingulate activation constitutes an intervention target. In the absence of a fully functional prefrontal cortex, enhanced cingulate recruitment may be a marker of a successful compensatory strategy. However, sustained anterior cingulate activation has been associated with anxiety disorders (39), poorer emotion regulation (40), and developmental immaturity (41–43). Future investigations of the effectiveness of the anterior cingulate in error monitoring and in signaling the need for increased working memory-related resources in individuals with autism spectrum disorders are needed.
There are several limitations of the present study. First, although the sample comprised close to 60 participants, 20% of the recruited autism spectrum disorders group could not perform the task, which limits the generalizability of our findings. Future studies should include simpler tasks. Second, sample size calculations were based on power analysis techniques available at the inception of the study. More sophisticated methods, including region-of-interest-based analyses (44) and correction for multiple comparisons (45), are currently available. These methods may lead to larger, more replicable studies (20) and to sufficiently large samples to permit subgroup analyses of the heterogeneous autism spectrum disorders phenotype (46). Third, the age range of 18 to 40 years is wide. However, we examined age effects in the imaging analyses and found none.
In summary, we provide evidence that learning impairments in young adults with autism spectrum disorders reflect a weakness in the ability to integrate positive reward-related information into working memory and a tendency to rely more heavily on trial-by-trial feedback-based learning. Future studies are needed that use more standardized paradigms and a wider range of both common and idiosyncratic rewards, that examine other forms of learning, and that investigate the effect of learning impairments on daily functioning and mental health.