Regular Article

Published Online: 1 December 1997

Synaptic Elimination, Neurodevelopment, and the Mechanism of Hallucinated “Voices” in Schizophrenia

Ralph E. Hoffman, M.D., and Thomas H. McGlashan, M.D.

Publication: American Journal of Psychiatry

Volume 154, Number 12

https://doi.org/10.1176/ajp.154.12.1683

Abstract

OBJECTIVE: After peaking during childhood, synaptic density in the human frontal cortex declines by 30%–40% during adolescence because of progressive elimination of synaptic connections. The characteristic age at onset of schizophrenia—late adolescence and early adulthood—suggests that the disorder could arise from irregularities involving this neurodevelopmental process. METHOD: A computer simulation of a speech perception neural network was developed. Connections within the working memory component of the network were eliminated on the basis of a “Darwinian rule” in order to model loss of synapses. As a comparison, neuronal cell death, also postulated as being linked to both neurodevelopment and schizophrenia, was simulated. The authors determined whether these alterations at low levels could enhance perceptual capacity and at high levels produce spontaneous speech percepts that simulate hallucinated speech or “voices.” RESULTS: Eliminating up to 65% of working memory connections improved perceptual ability; beyond that point, network performance declined and speech hallucinations emerged. Simulating excitotoxic neuronal loss at low levels also improved network performance, but in excess it did not produce hallucinations. CONCLUSIONS: The model demonstrates perceptual advantages of selective synaptic elimination as well as selective neuronal loss, suggesting a functional explanation for these aspects of neurodevelopment. The model predicts that psychosis arises from a pathological extension of one of these neurodevelopmental trends, namely, synaptic elimination. (Am J Psychiatry 1997; 154:1683–1689)

Computer models are central to scientific disciplines ranging from meteorology to physical chemistry. Their usefulness lies in simulating complex, interactive systems. A good model does not recreate “reality” in its entirety—if that were the case, the best model would be the real-life system itself. Instead, model construction proceeds by incorporating certain properties of the system in a much simplified form which, when simulated by computer, exhibits characteristic properties or behaviors that have been previously unexplained. For instance, a computer simulation of Jupiter's atmosphere produced a stable “storm” resembling the planet's Red Spot after its rapid spin rate and liquid composition had been accounted for (1). The simulation was useful insofar as essential physical dynamics of this previously unexplained phenomenon were captured.

Brain systems composed of large numbers of interactive neural elements have been usefully explored by means of computer modeling (2, 3). In this spirit, we describe a neural network computer simulation of narrative speech perception. Although this simulation represents a vast simplification of actual cortical networks, it produced behaviors that may illuminate both normal brain development and induction of psychosis.

OBSERVATIONS USED IN CONSTRUCTING THE MODEL

Key observations used in constructing the simulation of a speech perception neural network were as follows.

1. Cortical development during adolescence is characterized by substantial reductions of synapses. Studying normal postmortem tissue obtained from the middle frontal cortex, Huttenlocher (4) found that synaptic density peaked during childhood, with a subsequent decline of 30%–40% during adolescence to reach adult levels, which remained relatively stable. Since large-scale neuronal loss does not occur during this developmental period, synaptic elimination must reflect curtailed connectivity between neurons. Less than 1% of afferents to any cortical area derive from the thalamus, the primary source of noncortical input (5). Large-scale reductions of cortical synapses must therefore reflect reductions in corticocortical connections rather than thalamocortical afferents. Investigating consequences of reduced connectivity within neural systems was therefore a primary goal of our model.

2. Recent studies suggest that excessive synaptic pruning is associated with schizophrenia. Many workers have hypothesized that schizophrenia is a neurodevelopmental disorder (6–11). The characteristic age at onset of this disorder—late adolescence and young adulthood—and the prominence of synaptic pruning during normal adolescence suggest that schizophrenia could arise from a pathological extension of this “late” developmental process (6, 8). This hypothesis has been supported by phosphorus-31 magnetic resonance spectroscopy studies of neural membrane phospholipid turnover (12–14) and postmortem studies of neuropil volume (15) and dendritic spine numbers (16, 17) that compared the frontal cortex in schizophrenic brains and normal control brains. In addition, reductions in synapse-associated phosphoproteins (synapsin and synaptophysin) in the medial temporal cortex of schizophrenic patients have been reported (18, 19).

3. Hallucinated speech or “voices” commonly occur in schizophrenia. Our strategy was not to simulate the entire syndrome of schizophrenia but to explore a single characteristic symptom, auditory hallucinations, which are reported to occur in approximately 50%–80% of patients (20). One clue to their origin is that these hallucinations most commonly consist of spoken speech or “voices” (21), a phenomenological feature suggesting that hallucinated speech involves neural systems dedicated to auditory speech perception. This view is reinforced by positron emission tomography and functional magnetic resonance imaging evidence of activation of the auditory/linguistic association cortex when voices occur (22, 23).

Certain aspects of the speech perception system were therefore simulated to determine whether pruning “corticocortical connections” could simulate voices. Our criterion for identifying this “symptom” was production of “percepts” by the speech perception network in the absence of any phonetic input, thereby simulating hallucination.

4. Working memory underlies normal speech perception. The neural network simulation was guided by the observation that ordinary speech, when produced at normal rates, has substantial acoustic ambiguity because of blurring of phonetic information and background sounds (24–27). Consequently, perception of a word embedded in narrative speech depends not only on acoustic input corresponding to the word itself but also on previously perceived words and intrinsic knowledge of how words are sequenced into larger message units (28, 29). The use of linguistic expectations to “disambiguate” ongoing speech inputs reflects a specialized working memory capacity that was incorporated into the neural network.

Many studies have demonstrated working memory impairments in schizophrenia (30–32). Weinberger et al. (30) have implicated pathology involving interactions between frontal and medial temporal areas that are known to underlie human working memory (33). We therefore targeted the working memory component of our neural network to explore effects of reduced corticocortical connectivity.

A COMPARISON NEUROANATOMIC “PATHOLOGY”: SELECTIVE CELL DEATH

As a comparison “pathology,” consequences of neuronal cell loss were also studied. Animal studies indicate that cell loss accompanies normal neurodevelopment (7). Along these lines, Huttenlocher (4) described evidence of frontal neuron cell loss in humans during childhood that preceded, at least in part, large-scale synaptic pruning. Postmortem studies have also reported neuronal loss in cortical and medial temporal areas in the brains of schizophrenic patients (9). Mechanisms of neuronal elimination in normal development and schizophrenia are unknown but may reflect overactivation (“excitotoxicity”) or underutilization (9, 10). Our neural network model therefore simulated effects of working memory neuron elimination according to different selection mechanisms.

METHOD

Our simulation of sequential word perception was based on models developed by Elman (29, 34). We have described a preliminary study that used this model to explore hallucinated speech (35). Compared to our first study, the simulation we report here made use of a more complex learning paradigm and a smaller input layer designed to force the network to rely further on working memory.

Network Architecture and Language Training

The network, which consisted of 148 “neuronal elements” divided into a four-layered system (figure 1), was designed to translate “phonetic” inputs into words. Actual acoustic data were not used. Instead, our simplifying assumption was that the phonetic representation of each word corresponded to a unique pattern of activation where roughly 25% of the neurons in the initial or input layer were “turned on.” The vocabulary of the network consisted of 30 words, including 14 nouns (woman, Jane, boy, girl, Bill, man, cop, Sam, omen, warning, story, dog, God, ball), 11 verbs (chase, kiss, love, fear, tell, run, kick, give, frightens, think, miss), four adjectives (young, old, large, small), and one other word (won't). A large array of grammatical structures was permitted by this word set.

Each of 40 hidden-layer neurons received a weighted sum of inputs from each of the 25 input neurons and 40 temporary storage neurons:

\[
\mathrm{input}(x) = \sum_{y} w_{yx}\, I(y) + \sum_{y} w_{yx}\, S(y)
\]

where input(x) is the summed input received by neuron x in the hidden layer, I(y) is the activation of neuron y in the input layer, S(y) is the activation of neuron y in the temporary storage layer, and w_yx are corresponding projection weights (which can be positive or negative). The activation of each hidden-layer neuron was then computed by using a sigmoidal function ranging from 0 to 1, which acted on the summed input. The output layer consisted of 43 neurons. Output-layer neurons received inputs exclusively from the hidden layer (figure 1) and had the same sigmoidal activation function as hidden-layer neurons. Besides being assigned a phonetic code, each word was also assigned a pattern within the output layer, where between three and six of these neurons were turned on. These neurons coded for semantic and syntactic features. For instance, the word “cop” was represented by activation of output neurons that individually coded for NOUN, ANIMATE, and HUMAN, as well as a particular neuron that referred to “cop” itself. Examples of output codes for individual words are provided in figure 2.
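The weighted sum and sigmoidal activation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the weights are random placeholders rather than trained values, the names `step`, `w_in`, `w_store`, and `w_out` are assumptions introduced here, and bias terms are omitted because the text does not mention them.

```python
import math
import random

# Layer sizes from the text; the temporary storage layer matches the hidden layer.
N_INPUT, N_HIDDEN, N_OUTPUT = 25, 40, 43

def sigmoid(z):
    """Logistic function mapping any real input into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))

def random_weights(n_from, n_to, rng):
    """Placeholder weights w[y][x] from neuron y to neuron x (assumed range)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_to)] for _ in range(n_from)]

def step(inputs, storage, w_in, w_store, w_out):
    """One processing step: returns (hidden, output) activations."""
    hidden = []
    for x in range(N_HIDDEN):
        total = sum(w_in[y][x] * inputs[y] for y in range(N_INPUT))
        total += sum(w_store[y][x] * storage[y] for y in range(N_HIDDEN))
        hidden.append(sigmoid(total))
    output = [
        sigmoid(sum(w_out[y][x] * hidden[y] for y in range(N_HIDDEN)))
        for x in range(N_OUTPUT)
    ]
    return hidden, output

rng = random.Random(0)
w_in = random_weights(N_INPUT, N_HIDDEN, rng)
w_store = random_weights(N_HIDDEN, N_HIDDEN, rng)
w_out = random_weights(N_HIDDEN, N_OUTPUT, rng)

storage = [0.0] * N_HIDDEN  # empty working memory before the first word
phonetic = [1.0 if i < 6 else 0.0 for i in range(N_INPUT)]  # roughly 25% of inputs on
hidden, output = step(phonetic, storage, w_in, w_store, w_out)
storage = hidden  # the storage layer keeps a copy of the hidden layer for the next step
```

Copying `hidden` into `storage` after each step is what lets earlier words influence the perception of later ones.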

When the network produced an output layer activation pattern, an algorithm decided which word was the best fit for that particular pattern; the best fit became the “detected word.” When the output activation pattern demonstrated no clear-cut best fit, the network was assessed as not perceiving any word.
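The text does not specify the best-fit algorithm, so the following is only one plausible sketch: it picks the word whose target code is nearest to the output pattern in Euclidean distance and reports no percept when the fit is poor or ambiguous. The function name `detect_word`, the thresholds `max_distance` and `margin`, and the five-unit toy codes (including the code for "dog") are all hypothetical.

```python
def detect_word(output, codes, max_distance=0.5, margin=0.1):
    """Return the best-fitting word, or None when there is no clear-cut best fit."""
    scored = sorted(
        (sum((o - t) ** 2 for o, t in zip(output, target)) ** 0.5, word)
        for word, target in codes.items()
    )
    best_dist, best_word = scored[0]
    runner_up = scored[1][0] if len(scored) > 1 else float("inf")
    # Reject patterns that are far from every code or nearly tied between two codes.
    if best_dist > max_distance or runner_up - best_dist < margin:
        return None
    return best_word

# Toy output layer with units: NOUN, ANIMATE, HUMAN, "cop" unit, "dog" unit.
codes = {
    "cop": [1, 1, 1, 1, 0],
    "dog": [1, 1, 0, 0, 1],
}
clear = detect_word([0.9, 0.95, 0.9, 0.8, 0.1], codes)      # close to the "cop" code
ambiguous = detect_word([0.5, 0.5, 0.5, 0.5, 0.5], codes)   # far from both codes
```

Under these assumed thresholds, the first pattern is detected as "cop" and the second yields no percept.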

Network training utilized 60 repetitions of a set of 256 different grammatical sentences. Connection weights between neuronal layers were adjusted by using an “on-line” variant of back-propagation learning (36), which progressively minimized the error of activation patterns produced by the output layer in response to inputs whose phonetic information was partially degraded. During the course of training, the network acquired the ability to use linguistic expectations (stored as activation patterns resonating between the hidden and temporary storage layers) to guide detection of words.

Assessment of Network Performance

After the network was trained, it was retested with a set of 23 sentences not used in training but incorporating the same vocabulary. During testing, each test sentence was separated from the next by a pause consisting of five null inputs (all input neurons set to 0). The percentage of words successfully detected by the network was counted, as well as the total number of misidentifications (when the network confused one word for another). Hallucinations were scored when output-layer activation patterns yielded word percepts during pauses when phonetic inputs were absent. Assessment of network performance was undertaken with full phonetic information for each word and then repeated with degraded phonetic information. The latter condition was created by randomly selecting two input neurons ordinarily turned on for each word and resetting them to 0. This manipulation forced the network to rely more on working memory and linguistic expectations based on previous inputs to “fill in the blanks” and produce the correct word percept.
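The degradation procedure and the inter-sentence pause can be illustrated as follows. This is a sketch under stated assumptions: the helper names `make_phonetic_code` and `degrade` are introduced here, six active neurons stands in for "roughly 25%" of the 25 input neurons, and the uniqueness of each word's code is not enforced.

```python
import random

N_INPUT = 25  # input-layer size from the text

def make_phonetic_code(rng, n_on=6):
    """Random binary pattern with n_on of the input neurons turned on."""
    on = set(rng.sample(range(N_INPUT), n_on))
    return [1 if i in on else 0 for i in range(N_INPUT)]

def degrade(code, rng, n_drop=2):
    """Reset n_drop randomly chosen active input neurons to 0."""
    on_indices = [i for i, v in enumerate(code) if v == 1]
    dropped = set(rng.sample(on_indices, n_drop))
    return [0 if i in dropped else v for i, v in enumerate(code)]

# A pause between sentences: five null inputs with every input neuron at 0.
PAUSE = [[0] * N_INPUT for _ in range(5)]

rng = random.Random(1)
clean = make_phonetic_code(rng)
noisy = degrade(clean, rng)
```

The degraded pattern is always a subset of the clean one, so the missing information can only be recovered from context held in working memory.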

As an example of network performance, suppose that the input consisted of “phonemes” presented in a sequence corresponding to the following words: cop-chase-old-man-#-#-#-#-#-Jane-kiss-girl, where # denotes null inputs corresponding to pauses. Assume that the output of the network was cop-chase-█-dog-█-█-█-fear-█-Jane-kiss-girl, where █ denotes the absence of any output produced by the network. The number of words correctly identified would be five of seven; the word “man” would be scored as misidentified, and “fear” would be scored as a hallucination.
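The scoring rules in this example can be written out directly. The sketch below is illustrative; the function name `score` and the use of `None` for an absent output are assumptions made here.

```python
def score(inputs, outputs, pause="#"):
    """Count detections, misidentifications, and hallucinations position by position."""
    words = detected = misidentified = hallucinated = 0
    for heard, perceived in zip(inputs, outputs):
        if heard == pause:
            # Any percept during a null input counts as a hallucination.
            if perceived is not None:
                hallucinated += 1
        else:
            words += 1
            if perceived == heard:
                detected += 1
            elif perceived is not None:
                misidentified += 1
    return {
        "words": words,
        "detected": detected,
        "misidentified": misidentified,
        "hallucinated": hallucinated,
    }

inputs = ["cop", "chase", "old", "man", "#", "#", "#", "#", "#", "Jane", "kiss", "girl"]
outputs = ["cop", "chase", None, "dog", None, None, None, "fear", None, "Jane", "kiss", "girl"]
result = score(inputs, outputs)
```

For the sequences above, `result` reports five of seven words detected, one misidentification ("man" perceived as "dog"), and one hallucination ("fear").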

Neuroanatomic Manipulations

Two neuroanatomic manipulations of working memory networks were simulated. The pruning procedure was guided by the concept of neurodevelopmental “Darwinism,” where neurons compete with each other for anatomic access to other neurons, with elimination of less robust interneuron connections (37–39). In mathematical terms, if the absolute value of a connection weight linking the temporary storage and hidden layers was below a certain threshold, it was “clamped” at 0. Excitotoxic cell death was simulated by presenting the network with the standard set of test sentences. Hidden-layer neurons were ranked according to the summed activation that they received. Neurons with the highest “rank” were then “eliminated” by clamping their activation levels at 0. Two other simulations of cell loss were also explored: 1) random elimination of neurons and 2) elimination of neurons that were the least activated during testing. Each of these three methods of cell elimination was applied separately to the hidden and temporary storage layers.
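Both manipulations reduce to simple operations on the weight matrix and on per-neuron activation totals. The following sketch shows one way to express them; the function names and the toy numbers are hypothetical, and the thresholds used in the study are not given in the text.

```python
def prune(weights, threshold):
    """Clamp to 0 every connection whose absolute weight is below the threshold."""
    return [[0.0 if abs(w) < threshold else w for w in row] for row in weights]

def pruned_fraction(weights):
    """Fraction of connections that are currently clamped at 0."""
    flat = [w for row in weights for w in row]
    return sum(1 for w in flat if w == 0.0) / len(flat)

def excitotoxic_targets(summed_activation, fraction):
    """Indices of the most strongly activated neurons, to be clamped at 0."""
    n_kill = round(fraction * len(summed_activation))
    ranked = sorted(
        range(len(summed_activation)),
        key=lambda i: summed_activation[i],
        reverse=True,
    )
    return set(ranked[:n_kill])

# Toy storage-to-hidden weights (placeholder values).
weights = [[0.05, -0.8, 0.3], [-0.02, 0.6, -0.1]]
pruned = prune(weights, threshold=0.2)
fraction_removed = pruned_fraction(pruned)

# Toy summed activations accumulated over a test set for four neurons.
targets = excitotoxic_targets([3.0, 9.5, 1.2, 7.1], fraction=0.5)
```

Raising the threshold removes progressively stronger connections, which is how a single parameter can sweep the network from light pruning to the heavy pruning at which hallucinations were reported. Sorting in ascending order instead would give the least-activated variant of cell loss.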

RESULTS

Performance of the Standard Network

Figure 3 shows the performance of the standard network before neuroanatomic manipulations. When phonetic information was reduced, fewer words were detected and more word misidentifications were produced. No “hallucinations” were elicited. To demonstrate the specific contribution of verbal working memory and grammatical word order, the network was retested with the same set of words but with word sequencing randomly selected (e.g., man-old-cop-chase instead of cop-chase-old-man). Dramatic reductions in word detection rate and increased word misidentifications demonstrated dependence of the network on word order in decoding input information.

Neuroanatomic Alterations

The effects of pruning working memory connections on network performance are illustrated in figure 4, part A. The word detection rate increased from 67% to 91% when up to 64% of connections were eliminated. This manipulation also eliminated misidentifications (not shown). Additional pruning then reduced perceptual abilities while producing simulated hallucinations, which were noted at pruning levels ranging from 80% to 95%. Hallucinations consisted of the single word “won't.”

At low levels, excitotoxic elimination of temporary storage-layer neurons also increased word detection rates (figure 4, part B) and decreased misidentification rates (not shown). Beyond an optimum level (with 5%–7% of cells eliminated), network performance gradually and irregularly declined. No hallucinations were noted following these alterations. Excitotoxic cell death of hidden-layer neurons impaired perceptual function even at low levels, and again no hallucinations were produced by this manipulation. Eliminating neurons by random selection or by disabling the neurons that were minimally activated neither enhanced network performance nor produced hallucinations. These observations held true regardless of whether such lesions were imposed on the hidden layer or on the temporary storage layer.

Hallucinations at Predictable Times

To investigate further the mechanism of hallucinations, the pattern of their occurrence was examined. The hallucinated word “won't” immediately followed input sentences ending with specific nouns. At lower pruning levels, these nouns were limited to “woman,” “man,” and “God.” At higher pruning levels, “won't” appeared as a hallucination after a broader range of nouns, but never after a word that it did not follow during training (e.g., run-won't; warning-won't). Thus, hallucinations were not random events but emerged from specific linguistic expectations, which themselves were “normal” (i.e., “won't” only follows certain nouns) but were exaggerated to such a degree that output percepts emerged in the absence of phonetic input.

DISCUSSION

Random selection of the word order of input sentences dramatically disrupted performance of the standard network, demonstrating clearly the network's reliance on linguistic expectations generated by a specialized working memory. This property of the network provided the basis for investigating the impact of neuroanatomic alterations of working memory on narrative speech perception. These efforts yielded two results.

First, eliminating working memory connections within a certain range improved the network's ability to perform the perceptual task. Functional advantages could also be obtained from excitotoxic elimination of temporary storage neurons. The model thus provides a functional accounting for these two neurodevelopmental trends. Although their relationship is poorly understood, a clue is provided by the Huttenlocher study (4), which suggests that cell death occurs somewhat before synaptic elimination.

Our study examined the relation between one aspect of language capacity and cortical connectivity. There are no neurobiological studies to date that provide a direct comparison. However, a study of birdsong acquisition demonstrated an associated reduction of synapses in brain areas responsible for this communication function (40). Birdsong is not language, but it is a highly structured system involving sound sequences. Perhaps a parallel developmental process occurs in humans, where cortical pruning of synapses results in enhanced efficiency in processing sequential linguistic behavior. In bird studies, loss of synapses was accompanied by inability to learn new birdsong sequences. Reduced aptitude for learning a second language that is associated with the end of childhood may also be due to a cortical pruning process (41).

Second, synaptic pruning, when applied excessively, simulated hallucinated speech. This concept is of interest given the high prevalence of this symptom in schizophrenia, the characteristic age at onset of this disorder, the dramatic loss of frontal synapses occurring in adolescence normally, and empirical evidence suggesting further reductions in cortical synapses in schizophrenia. The characteristic age at onset of psychosis, combined with our observation that synaptic elimination rather than cell elimination underlies this phenomenon, fits well with the view that the former occurs developmentally later than the latter (4). The fact that hallucinations were not generated by cell death models suggests that a relatively full array of neurons is necessary in order to produce hallucinated percepts.

Our results estimated reductions in connectivity associated with normal neurodevelopment and induction of psychosis. The model can also be used to estimate corresponding reductions in synapses. These estimates can be generated if one assumes that the number of synapses mediating a projection from one neuron to another is linearly correlated with the strength or weight of the projection. In other words, higher numbers of synapses mediating a projection would increase the functional weight of the connection. Our model also assumes that pruning is “Darwinian” (where weaker interneuron connections are preferentially eliminated). Consequently, connections mediated by a smaller number of synapses would tend to be pruned away. A 64% reduction in connections optimized the perceptual functioning of the model and thus estimates stable, normal adult levels of connectivity. On the basis of the distribution of connection weights of the unpruned network, a reduction in connectivity of this magnitude would correspond to a synaptic reduction of 29%. This figure approximates the 30%–40% reduction of synapses from peak childhood levels to adult levels indicated by postmortem study of frontal areas of normal brains (4). “Hallucinosis” was observed for connectivity reductions greater than 77%. The model predicts that connectivity reductions of this magnitude correspond to an additional 20% loss of synapses compared to optimized adult levels of (reduced) connectivity. This estimate approximates the findings of Selemon et al. (15), who found that the mean neuropil volume of the frontal cortex of schizophrenic brains is reduced by 17% relative to that of normal adult brains (16). Neuropil consists of the dense intertwining of axons and dendritic arbors surrounding neuronal cell bodies and is likely to correlate with overall synaptic density. Thus, the model provides reasonable estimates of synaptic elimination, which, in a preliminary fashion, converge with the findings of these postmortem studies of normal brain development and schizophrenia (4, 15).
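The conversion from a reduction in connections to a reduction in synapses can be sketched as follows, assuming (as the text does) that synapse count is proportional to the absolute connection weight and that the weakest connections are pruned first. The function name `synapse_loss` and the toy weight list are hypothetical; the sketch does not reproduce the study's weight distribution or its 29% figure.

```python
def synapse_loss(weights, connection_fraction):
    """Estimated fraction of synapses lost when the weakest connections are pruned.

    Assumes each connection carries a number of synapses proportional to |weight|.
    """
    magnitudes = sorted(abs(w) for w in weights)
    n_pruned = round(connection_fraction * len(magnitudes))
    return sum(magnitudes[:n_pruned]) / sum(magnitudes)

# Toy unpruned weight distribution (placeholder values).
toy_weights = [0.1, -0.1, 0.2, -0.2, 0.4, 1.0, -1.0, 2.0]
loss = synapse_loss(toy_weights, connection_fraction=0.5)
```

In this toy case, pruning half of the connections removes only about 12% of the estimated synapses, because the pruned connections are the weakest ones. The same logic explains how a large reduction in connections can correspond to a much smaller reduction in synapses.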

Another study of pruning in artificial networks indicates that this alteration can promote better generalization (42). In our simulation, pruning at low levels helped the network to fill in the gaps during perceptual processing. It is therefore not surprising that additional pruning could push the network to produce spontaneous hallucinations. The nonrandom pattern of occurrence of hallucinated percepts (i.e., “won't” following sequentially appropriate nouns) in the model indicates that this “pathology” arose from misapplied sequential expectations derived from working memory. This finding is of interest given the large number of studies indicating working memory impairments in schizophrenia (30–32). Reduced working memory capacity may derail thought processes, thereby suggesting a mechanism of thought disorder (43). Our model suggests that this functional system can also produce spurious outputs that give rise to other positive symptoms such as hallucinations.

Physiological studies indicate that synaptic elimination reduces local metabolic requirements (44, 45). Along these lines, imaging studies of human brain development have shown downward shifts in local cerebral metabolic rates that parallel developmental shifts in synaptic density (46). Our synaptic elimination model therefore provides an accounting for the many studies demonstrating reduced cerebral metabolism in schizophrenic patients (47).

The model may address an intriguing question raised by Crow (48): why has the genetic predisposition to schizophrenia remained robust in diverse human populations in spite of obvious fertility disadvantages? Our results suggest that genes that lead to postnatal reductions in corticocortical connectivity might be advantageous cognitively up to a certain point (and hence selected for) but in certain combinations could produce too much pruning—with psychotic symptoms resulting.

The model also predicts that when phonetic clarity is curtailed, the narrative speech perception abilities of schizophrenic patients reporting voices are reduced compared to those of nonhallucinating schizophrenic patients. These differences were demonstrated in a recent study of schizophrenic patients that used a speech tracking task (35).

One limitation of the model is that the simulated hallucinations consisted of a single word following “external speech.” The content of hallucinated speech has been shown to be highly constrained (49). However, actual hallucinations in most cases consist of whole phrases or sentences and are often not in response to external speech. With much more complex vocabularies and linguistic knowledge, simulated speech perception networks could, at least in theory, produce whole phrases or sentences as hallucinations. Moreover, thoughts or even affects could trigger spurious working memory outputs that are experienced as hallucinations. Therefore, we propose that the simulation does provide a useful, albeit simplified, model of this phenomenon.

Many issues are not addressed by our simulations. First, the model as reported here does not provide an explanation for the mechanism of action of neuroleptic drugs. We have, however, simulated a hypodopaminergic state and demonstrated that hallucinations can be eliminated; these findings will be reported separately. Second, our simulation addresses only a single psychotic phenomenon, namely, hallucinations. Other psychotic symptoms may have different mechanisms. In addition, the model is limited in terms of its fidelity in simulating human neurobiology, including the simplicity of neuronal types and architecture and the learning paradigm used. However, as we stated earlier, a model should not be judged on the basis of its complexity but on its ability to extrapolate from current observations and data to account for unexplained phenomena. Many facets of schizophrenia remain mysterious and even paradoxical. We predict that neural models will be needed to assemble a comprehensive picture of this disorder.

FIGURE 1. Simulation of a Speech Perception Neural Network^a
^aThe input layer receives “phonetic” information, while the output layer codes for “words” in the system's vocabulary through activation of neurons coding for specific syntactic and semantic information. Projections are unidirectional and flow from the input layer to the hidden layer and from the hidden layer to the output layer. The temporary storage layer retains a copy of the hidden layer from the prior information-processing step.

FIGURE 2. Examples of the Output Neural Codes for Particular Words Belonging to the Network's Vocabulary

FIGURE 3. Word Detection and Misidentification Rates for the Standard Network Before Synaptic or Neuronal Cell Elimination^a
^aEffects of reducing phonetic information and randomization of input word sequences are represented.

FIGURE 4. Effects of Pruning Working Memory Connections (A) and Excitotoxic Elimination of Temporary Storage Neurons (B) on Perceptual Abilities in Response to Degraded Inputs and Production of Spontaneous, “Hallucinated” Percepts

Footnote

Received March 27, 1997; revision received Aug. 5, 1997; accepted Aug. 11, 1997. From the Department of Psychiatry, Yale University School of Medicine. Address reprint requests to Dr. Hoffman, Yale Psychiatric Institute, P.O. Box 208038, New Haven, CT 06520-8038. Supported in part by NIMH grant MH-50557.

References

1. Gleick J: Chaos. New York, Viking, 1987
