Semantic and Procedural Memory
Goals
- To discuss semantic memory.
- To discuss the Complementary Learning Systems theory.
- To discuss neural networks.
- To discuss non-declarative memory.
- To discuss neurogenesis and memory.
Topic Slide

Jay McClelland (b. 1948) is a Professor of Psychology at Stanford University who is a principal architect of the Complementary Learning Systems model of human memory. He is a pioneer in the application of neural networks to human cognition.
May-Britt Moser (b. 1963) won the Nobel Prize in 2014 for her work on discovering grid cells in the entorhinal cortex.
Reading
Semantic memory
Semantic memory is the second major component of Declarative Memory. It includes memories for facts, but not the episodic context in which they were acquired (like where you learned the fact). Semantic memory is severely disrupted in fronto-temporal lobar degeneration (FTLD). Semantic dementia (SD) is a component of FTLD, and appears to depend particularly on the anterior temporal pole.
- The SD patient loses information about what words mean and what objects are used for; i.e., the patient loses semantic memory.
- In early stages, the demented individual may start using category labels instead of specific category members – e.g., ‘bird’ instead of ‘robin’.
- Cuing does NOT help. This is not simply word loss, like anomia, it is a loss of the concept.
I discussed two case studies – the second of which was impaired with words, pictures of objects, and the sounds that familiar objects make.
Here is a partially paraphrased description of an individual with SD from the 2007 paper by Patterson and colleagues. This group emphasizes the role of the Anterior Temporal Lobe (ATL) in semantic dementia.
Mr M, a patient with semantic dementia — a neurodegenerative disease that is characterized by the gradual deterioration of semantic memory — was being driven through the countryside to visit a friend and was able to remind his wife where to turn along the not-recently-travelled route. Then, pointing at the sheep in the field, he asked her “What are those things?” Prior to the onset of symptoms in his late 40s, this man had normal semantic memory. What has gone wrong in his brain to produce this dramatic and selective erosion of conceptual knowledge?
The sheep are a puzzle to Mr M: not only does he not know what to call them, he no longer knows what they are. He wears a wool jacket when it’s cold and eats roast lamb for Sunday lunch, but would not be able to say that “those things” out there are the source of these products. He would succeed in matching a photograph of a sheep taken from the side to one taken from the front, because this task — which people can perform on meaningless objects that they have never seen before — relies on visual perceptual abilities rather than semantic ones. If asked whether the photograph of a sheep is an animal, he would probably say yes, but if asked what other animal is similar to it, he would look blank.
This striking combination of preserved and disrupted cognition has now been documented in hundreds of patients with SD in dozens of neurology clinics in many different languages and countries. Not every such patient has had detailed structural and/or functional brain imaging but, for those who have, the resulting neuroanatomical profile is as consistent as the cognitive profile: the selective but generalized semantic degradation that occurs in SD goes hand-in-hand with focal degeneration of the bilateral ATL. Specific features of conceptual knowledge are almost certainly represented elsewhere and in a widely distributed network; but people’s ability to receive information in one modality and express it in another, to generalize across conceptually similar entities that differ in almost every specific modality, and to differentiate between entities that resemble each other in many modalities — all quintessentially semantic abilities — seem to depend on the ATL.
Patterson and colleagues argue for a model of semantic memory in which memories are distributed throughout the brain, but the anterior temporal lobe acts as a multimodal, conceptual hub. The 'hub model' of semantic memory includes a special role for the anterior temporal lobe (ATL) for storing conceptual and task-based information about related items.
I also showed that SD patients make generic drawings of animals when drawing a sample after a 10-sec delay. The sample animal's distinguishing features are not included in the drawing, and more generic features (e.g., four legs) are included.
I showed examples from Hannah Damasio’s work, in which she characterized a large sample of lesion patients and demonstrated category-specific semantic memory deficits arrayed along the anterior-posterior axis of the temporal lobe. Memory for persons was represented more anteriorly than memory for animals and tools.
Memory recall activates the same areas as initial sensory input
I showed that the same sensory regions activated during initial presentation of items – pictures and sounds – were activated when those items were retrieved. This supports our earlier discussion of consolidation and cortical representations. That is, when a cue is presented, the memory reactivates the same cortical regions as were active during initial encoding. This process is dependent upon the hippocampus during the period of consolidation, but eventually does not require the hippocampus to sustain the memory.
Semantic priming
I introduced the idea of connected concepts in a semantic network. Stimulating one node leads to stimulation of related nodes (spreading activation). I illustrated this with three examples drawn from physiological data:
- Work by Nobre on N400 semantic priming effects using scalp-recorded ERPs. Primed words had a smaller N400 than unprimed words.
- Work by Nobre using intracranial recording directly from the brain showed semantic priming effects in anterior temporal lobe. Primed words had a much smaller ERP than unprimed words. This suggests that 'less neural work' was required to access words that were already partially activated through priming.
- fMRI study using a semantic priming approach shows anterior temporal lobe differences related to priming. This occurred in the same region as Nobre showed her ERP effects. This area is also implicated in semantic memory by the studies I reviewed in last lecture by Hannah Damasio.
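The spreading-activation idea can be sketched with a toy semantic network. The nodes, link weights, decay parameter, and helper function below are all invented for illustration; they are not from the lecture or from any specific model in the literature.

```python
# Toy spreading-activation sketch over a small semantic network.
# Node names and link weights are made up for illustration.

network = {
    "robin":  {"bird": 0.9, "red": 0.4},
    "bird":   {"robin": 0.9, "canary": 0.8, "wings": 0.7},
    "canary": {"bird": 0.8, "yellow": 0.5},
    "red": {}, "wings": {}, "yellow": {},
}

def spread(source, decay=0.5, steps=2):
    """Activate `source`, then pass a decaying fraction of each node's
    activation along its links for a few steps."""
    act = {node: 0.0 for node in network}
    act[source] = 1.0
    for _ in range(steps):
        new = dict(act)
        for node, neighbours in network.items():
            for nb, w in neighbours.items():
                new[nb] += decay * w * act[node]
        act = new
    return act

act = spread("robin")
# Directly linked concepts ("bird") end up more active than concepts
# two links away ("canary"), which are still primed above zero.
print(act["bird"] > act["canary"] > 0)  # → True
```

The graded activations mirror the priming result above: a partially activated ("primed") node needs less additional input to reach threshold, so less neural work is required to access it.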
We will return to the concept of priming later in the lecture, when we contrast semantic priming with perceptual priming.
Complementary Learning Systems
Our discussion of declarative memory has raised several questions:
- What is the special role of the hippocampus with regard to memory?
- Why is there a consolidation period when the hippocampus is necessary to recall a memory?
- Why are some remote memories beyond the consolidation period accessible without the need for the hippocampus?
- Why do individuals without a functioning hippocampal system not learn new semantic information?
These questions have been addressed by the 1995 theory advanced by McClelland and colleagues called Complementary Learning Systems. In a recent 2016 review article that updated the CLS theory in light of new findings, Kumaran, McClelland and their colleagues summed up the need for the CLS as follows:
“Effective learning requires two complementary systems: one, located in the neocortex, serves as the basis for the gradual acquisition of structured knowledge about the environment, while the other, centered on the hippocampus, allows rapid learning of the specifics of individual items and experiences.” (Kumaran et al. 2016).
While this quote nicely specifies the two components of the CLS theory – a rapid hippocampal learning system and a neocortical system for gradual structured knowledge learning, it does not state why that is necessary. We will consider the need for these two complementary systems below.
Role of the hippocampus in CLS
The hippocampus is proposed as a system for rapid learning of specific information. Three processes are specified for the hippocampus:
- Pattern separation – many objects look similar and have similar functions (for example, my phone and your phone), and many experiences are similar (eating in different restaurants with the same friends). However, despite the commonalities, it is often necessary to remember the specifics. You don't, for example, want to take my phone and leave your phone in its place. Pattern separation is a process by which very similar events are given unique codes that minimize their overlap and similarities.
- Pattern completion – an event has many different sensory/motor/emotional/motivational components that are bound together into an episodic memory. Pattern completion refers to the ability of a fragment of the original experience (a cue) to evoke the entire memory.
- Replay – the hippocampus is proposed to train neural networks in neocortex through a process called replay. I will unpack this further below, but the basic idea is that memory traces are replayed from the hippocampus to neocortex during sleep and during non-active periods, and that these traces 'train' the neural networks in neocortex so that they can extract structured knowledge.
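Pattern completion can be illustrated with a Hopfield-style attractor network, a classic toy model of associative recall. This sketch is offered as an illustration of the principle, not as the lecture's model of the actual hippocampal circuit; storing one binary pattern and cueing with only half of it recovers the whole.

```python
# Minimal Hopfield-style pattern completion sketch (illustrative only).
# Patterns are vectors of +1/-1; a cue of 0 means "that part is missing".

def train(patterns):
    n = len(patterns[0])
    W = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += p[i] * p[j] / n  # Hebbian weight storage
    return W

def recall(W, cue, steps=5):
    x = list(cue)
    n = len(x)
    for _ in range(steps):
        # Each unit takes the sign of its summed weighted input.
        x = [1 if sum(W[i][j] * x[j] for j in range(n)) >= 0 else -1
             for i in range(n)]
    return x

stored = [1, 1, 1, 1, -1, -1, -1, -1]
W = train([stored])
cue = [1, 1, 1, 1, 0, 0, 0, 0]   # only a fragment of the memory is cued
print(recall(W, cue) == stored)  # → True: the fragment evokes the whole
```

The network settles into the stored pattern because the stored memory is an attractor state: any cue with sufficient overlap falls into it, just as a fragment of an experience can evoke the whole episodic memory.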
Role of neocortex in CLS
As we discussed, cognitive theorists have argued that semantic memories are organized in a semantic network of the type I described above when discussing semantic priming. McClelland and many others have argued that artificial 'neural' networks (I put the quotes around the neural in neural networks because they were developed by Artificial Intelligence researchers and were proposed as models for the way the brain works, not the other way around) are a good model for how neocortex learns and extracts regularities from its sensory input.
To understand the CLS theory and its need for two memory systems, we need to have a basic familiarity with neural networks.
Neural network basics
Perceptron
In the 1950s and early 1960s, Frank Rosenblatt proposed the concept of the perceptron, a simple but powerful idea that incorporates many properties of a neuron.
A perceptron accepts inputs that vary in their strengths (or weights). The perceptron sums the strengths of its inputs, and either provides an output (output equal to '1') or does not (output equal to '0'). The perceptron thus discriminates among patterns of inputs, making a binary (yes/no) decision about whether or not the pattern was present. In statistical terms, we could call this a linear classifier.
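The sum-and-threshold operation can be written in a few lines. In this sketch the weights and threshold are hand-chosen for illustration rather than learned:

```python
# A single perceptron as a linear classifier (weights hand-chosen, not learned).

def perceptron(inputs, weights, threshold):
    # Sum the weighted inputs; output 1 if the total reaches threshold.
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# With these weights and threshold, the unit fires only when both
# inputs are active (a logical AND over its input pattern).
for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", perceptron([a, b], [1.0, 1.0], threshold=1.5))
```

Changing the weights or threshold changes which input patterns make the unit fire, which is the sense in which altering synaptic strength changes what a neuron "recognizes".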
I pointed out the rather obvious similarities of the perceptron to the neuron, and recalled how synaptic strength at individual synapses can be altered, and how that might change the likelihood that a neuron would provide an output. In some sense, then, we can conceptualize the neuron as a biological classifier, whose action potential output occurs when a pattern of input is recognized.
By aggregating many perceptrons into a network, many different patterns could be identified. In 1960, Rosenblatt created a specialized computer for the Navy called the Mark I Perceptron (now in the Smithsonian Museum) to recognize and classify input patterns.
For a number of reasons beyond the scope of this course (mostly concerned with difficulties in non-linear operations), the perceptron was limited in its applicability. However, networks of perceptrons that included 'hidden layers' between the input and output 'layers' were not so encumbered. This is the basis for neural networks as we know them now.
In lecture, I presented an example of a neural network for digit identification. I presented a snippet of a very neat 3-D animation of a neural network as it processes digits. The snippet shown in class illustrated a multi-layer neural network. The full video also shows other types of networks, and can be found here.
I also introduced the concept of supervised learning whereby, during numerous exposures to the input patterns, the output of the neural network is compared to ground truth, and an error signal is back-propagated through the network layers. The back-propagation signal changes the weights of the connections to minimize the error between the output and ground truth (for you mavens of neural networks, this usually involves a minimizing procedure called gradient descent). Over many instances of training, the output becomes more accurate. Most interesting, the neural network can correctly recognize input patterns that it has never seen before – that is, it generalizes. This is an important feature of neural networks that make them interesting to cognitive scientists.
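A minimal sketch of this error-driven weight adjustment is shown below for a single linear unit (the delta rule); backpropagation extends the same update through hidden layers. The target function, data, and learning rate here are made up for illustration.

```python
import random

# Supervised learning by gradient descent on a single linear unit.
# The "teacher" supplies ground truth: y = 2*x1 - 3*x2 (illustrative).
random.seed(0)

data = [((x1, x2), 2 * x1 - 3 * x2)
        for x1 in (-1, 0, 1) for x2 in (-1, 0, 1)]

w = [random.uniform(-1, 1), random.uniform(-1, 1)]  # random initial weights
lr = 0.1
for epoch in range(200):
    for (x1, x2), target in data:
        out = w[0] * x1 + w[1] * x2
        err = target - out           # error signal: ground truth minus output
        w[0] += lr * err * x1        # nudge each weight downhill
        w[1] += lr * err * x2        # on the squared error

print(round(w[0], 3), round(w[1], 3))  # weights approach the true 2 and -3
```

Over many passes through the training data, the error signal drives the weights toward values that reproduce the teacher's outputs, and the trained unit then generalizes to inputs it has never seen.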
Neural networks (also known as deep learning networks) are becoming very prevalent in modern life. Neural networks developed at Target examined purchase histories to predict (quite accurately) whether or not a consumer was pregnant, so that the consumer could receive targeted ads. Neural networks in our iPhones automatically recognize faces and objects in our photo streams. In any situation where a pattern of sensory information can be classified and identified, neural and deep learning networks have been used.
Catastrophic Forgetting
Many theorists believe that the brain learns like a multilayer neural network. However, when trained sequentially on new information, neural networks can abruptly forget previously learned information. This problem of catastrophic forgetting was first pointed out by McCloskey and Cohen in 1989. Humans do show interference in sequential learning tasks (called proactive interference), but not catastrophic forgetting.
Catastrophic forgetting can be avoided by interleaving the training of new information with existing information. This is the presumed role of the hippocampus. Over time, the hippocampus slowly 'trains' the neocortical neural network, and thus avoids catastrophic forgetting. Thus, the hippocampal and neocortical learning systems are complementary.
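The contrast between sequential and interleaved training can be demonstrated with a toy linear unit. The two "tasks", inputs, and parameters below are invented for illustration; the point is only that new, overlapping knowledge overwrites old knowledge unless the two are trained together.

```python
# Toy demonstration of catastrophic forgetting and its interleaved-training
# fix, using a single linear unit (tasks and parameters are illustrative).

def train(w, samples, lr=0.2, epochs=300):
    for _ in range(epochs):
        for x, t in samples:
            out = w[0] * x[0] + w[1] * x[1]
            err = t - out
            w[0] += lr * err * x[0]
            w[1] += lr * err * x[1]
    return w

def output(w, x):
    return round(w[0] * x[0] + w[1] * x[1], 2)

task_a = [((1, 1), 1)]    # "old" knowledge
task_b = [((1, 0), -1)]   # "new" knowledge sharing an input feature with A

# Sequential training: learn A fully, then train on B alone.
w_seq = train(train([0.0, 0.0], task_a), task_b)
print("sequential, recall of A:", output(w_seq, (1, 1)))   # far from 1: forgotten

# Interleaved training: old and new items mixed in every epoch.
w_mix = train([0.0, 0.0], task_a + task_b)
print("interleaved, recall of A:", output(w_mix, (1, 1)))  # close to 1: retained
```

In the CLS view, the hippocampus plays the interleaver's role: by replaying old memories alongside new experience, it lets the slow neocortical network absorb new information without overwriting the old.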
Hippocampal replay
Is there any evidence for hippocampal replay? The answer is 'yes', and it comes from a very interesting direction. In two previous lectures, I have briefly discussed place cells in the rodent hippocampus. You can learn more about such cells here and here. Place cells fire when the rodent is in a particular location (the 'place field' of the 'place cell', roughly analogous to the receptive field of a neuron). When a rodent is learning a maze, one can record from a number of place cells and see them fire in sequence as the rat traverses the maze.
One amazing finding is that the hippocampus replayed this sequence of place cell firing when the rats were asleep or resting. The sequences were often replayed at 20 times speed, and occasionally were replayed backwards. These replay events were accompanied by changes in firing in neocortex. When the replay events were detected and interrupted experimentally, the animal did not learn the maze. Thus, consolidation of the maze memory depended upon the hippocampus instructing, or supervising, learning in neocortex. This was support for the CLS theory.
Non-declarative (procedural or implicit) memory
Returning to our memory outline, I discussed different aspects of non-declarative memory.
Dissociation of procedural and episodic memory
I first reviewed the dissociation of procedural and episodic memory in H.M. using the mirror writing example of intact procedural memory despite severe episodic memory deficits. H.M. got better on the task, but never remembered doing the task.
I then provided evidence of a double dissociation of Skill learning and Declarative Memory.
- Mirror reading and verbal recognition:
- Korsakoff patients (mammillary body damage) can learn mirror reading, but not remember the words.
- Huntington’s disease patients (basal ganglia dysfunction) are not good at learning mirror reading, but do remember the words.
- Weather prediction task and episodic memory dissociation
- The Weather Prediction task is an example of implicit learning. One ‘gets’ it over time, but can’t typically say how. It is sometimes referred to as ‘statistical learning’.
- Parkinson’s patients (basal ganglia dysfunction) can remember episodically, but cannot learn the probabilistic weather prediction task
- Amnesics (presumably with MTL damage) show the reverse pattern.
Priming
There are different forms of priming:
- Perceptual
- Conceptual
- Semantic
I presented two patient examples that dissociate different aspects of priming and episodic memory:
- Patient M.S. who has poor perceptual priming but good explicit memory
- I decided not to show these data, but let's stipulate that there are patients who have occipital lobe lesions and poor perceptual priming, who nevertheless have good episodic memory.
- Patient K.C. shows dissociation of conceptual priming and episodic memory.
- K.C. shows word priming one-year after exposure, with no episodic memory.
Repetition suppression
I presented the general idea behind repetition suppression: the first presentation of an item excites neurons (reflected in fMRI activity in voxels). Upon a second presentation, some of those neurons are refractory (cannot be fully activated), so the response is suppressed. This suppression does not occur for novel items, making repetition suppression an implicit test of recognition.
- I provided an example of repetition suppression used to study perceptual priming.
- Perceptual priming and conceptual priming are anatomically dissociated using masked words in same (perceptual) or different (conceptual) fonts.
Reconsolidation
Fear conditioning
Many memory theorists believe that memories become labile and susceptible to modification whenever they are recalled. That is, when we reactivate a memory, it can be reconsolidated with new information. Thus, memories are constructive, and subject to manipulation.
Reconsolidation is often discussed in the context of fear conditioning, where an aversive stimulus (e.g., an electric shock, or an air puff to the eye) is preceded by an innocuous stimulus (e.g., a tone, or a colored light).
Fear conditioning is just a variant of classical conditioning. In the classical conditioning field, the shock would be called the unconditioned stimulus (UCS) and the innocuous stimulus would be called the conditioned stimulus (CS). Normally, an animal would respond to the UCS with an unconditioned response (UCR), such as jumping in pain. After many pairings of the CS-UCS, the CS evokes a conditioned response (CR) prior to the appearance of the UCS. This CR might be a freezing response. The appearance of the CR in response to the CS is evidence of learning, i.e., the animal predicts the appearance of the UCS based upon the appearance of the CS.
Extinction is a type of new learning, not ‘forgetting’. In extinction, the CS appears without the UCS; over several such CS-alone presentations, the CR diminishes. The diminishing CR is evidence that the animal has learned that the CS no longer predicts the (usually aversive) UCS.
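These acquisition and extinction dynamics are often captured with the classic Rescorla-Wagner learning rule, in which a prediction error drives changes in associative strength. The sketch below is offered as an illustration (this particular model was not part of the lecture, and the learning rate is arbitrary):

```python
# Rescorla-Wagner sketch of conditioning: the associative strength V of
# the CS is updated by the prediction error on each trial (illustrative
# parameters; this model is a classic from the conditioning literature).

def rescorla_wagner(trials, lr=0.3):
    """`trials` is a list of 1/0 flags marking whether the UCS
    followed the CS on that trial. Returns V after each trial."""
    V = 0.0
    history = []
    for ucs in trials:
        V += lr * (ucs - V)   # prediction error (outcome minus expectation)
        history.append(V)
    return history

# 10 CS-UCS pairings (acquisition), then 10 CS-alone trials (extinction).
h = rescorla_wagner([1] * 10 + [0] * 10)
print(round(h[9], 2), round(h[19], 2))  # high after pairing, low after extinction
```

Note that extinction in this model is not erasure of the trace but new learning driven by the now-negative prediction error, consistent with the point above that extinction is learning, not forgetting.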
After CS-UCS training, the presentation of a solitary CS (without UCS) is a cue for the CS-UCS memory. The memory is thought to be labile, and susceptible to reconsolidation. In experimental animals, giving a drug that inhibits protein synthesis (anisomycin) after a solitary reminder CS interferes with the reconsolidation of the old memory. Thus, the animals forget that the CS predicted a shock. When tested later with a CS, they show diminished CRs, indicative of poor memory for the CS-UCS pairing during the initial learning. Studies in animals show that the temporal window for reconsolidation is brief. If the solitary CS is given to reactivate an old memory, and the protein inhibitor is given 24 hours later, it has no effect on reconsolidation.
If, in humans, you recall a memory with a solitary CS and then immediately begin extinction training (while the memory is labile for reconsolidation), the person quickly learns the new information and extinguishes well. If you recall the memory with a solitary CS but wait until the temporal window for reconsolidation has closed before commencing extinction, extinction is less effective.
Reconsolidation and pathological memories
We saw earlier in the semester that an NMDA antagonist can inhibit learning and memory in rats swimming in a Morris water maze. Can an NMDA agonist improve learning/memory? Can this help clinically in patients with traumatic memories?
Reconsolidation may provide a way. D-cycloserine is an antibiotic and NMDA agonist. It is being tested with virtual reality exposure therapy. If you recall a fearful memory (by exposing the patient to a cue) in a safe environment (a form of extinction learning), perhaps providing an NMDA agonist will speed extinction learning.
This is a relatively new area of research, and the findings thus far must be viewed as preliminary, but also provocative.
- Acrophobia – helps patients unlearn fear of heights
- SCR measures and fear ratings
- PTSD and exposure therapy
- Although studies just underway, there may be support for the idea that VR exposure therapy plus NMDA agonists can alter traumatic memories.
Do you think this approach of manipulating memories has negative ethical implications?
Neurogenesis and memory
I mentioned that in this course, I will attempt to refute some common myths about brain function. One is that we never create new brain cells. There is now very convincing evidence that we create new neurons (even as adults) in our hippocampus (and olfactory bulbs). Much of this work has been done in rodents using simple conditioning memory tasks.
Conditioning
Conditioning is a form of non-declarative or implicit memory. I briefly explained the basics of classical conditioning:
- UCS – unconditioned stimulus – e.g., ‘shock’
- CS – conditioned stimulus – e.g., ‘tone’ that precedes shock.
- CR – conditioned response – the response that, after conditioning, is evoked by the CS (and resembles the response to the UCS) – e.g., eye blink
The difference between delay and trace conditioning.
- Delay – the UCS comes on at the end of the CS
- Delay conditioning depends upon the cerebellum and is not influenced by hippocampal lesions.
- Trace – there is a short stimulus-free delay between CS and UCS
- Trace conditioning depends upon the hippocampus.
- Trace conditioning depends upon neurogenesis in the hippocampus – inhibiting neurogenesis inhibits trace conditioning
- Trace conditioning increases survival of new neurons in the hippocampus