Differential diagnoses involving "Classical" vs. "Operant" class-conditioned learning


Classical conditioning


This field was founded by Ivan Pavlov in 1897. Here is how I usually retell Pavlov's story:


Pavlov was the first scientist to directly study the behavioural and learning psychology of dogs. He had a whole kennel of dogs that he experimented on, and he wanted to understand how the timing of stimuli affected how the dogs learned. Every day at noon the dogs got lunch, which means they were already conditioned to salivate at lunchtime.


For the first week of the experiment, Pavlov rang a bell every day at 10am to see how the dogs reacted to a novel stimulus. For the first day or two they reacted to the bell, even startled a little, and maybe walked around the kennel for a minute or two, and that was it. After a few days they hardly seemed to notice the bell, and after a week they barely reacted to it at all. They had become "desensitized" to the bell.


In the second week, instead of ringing the bell at 10am, he rang it at 11:59:57, just three seconds before lunch arrived. For the first day or two the dogs didn't seem to notice the bell much, but after a few days they seemed to find it quite exciting. After hearing it, they began to move around and head towards where lunch arrived, salivating like crazy.


In the last week, he moved the bell back to 10am. At first the dogs were excited, going to the lunch area and salivating: the bell had apparently taken on the same value as lunch simply by virtue of when the dogs were exposed to it. But after a few days of the bell at 10am and lunch at noon, the frequency of their reactions to the bell dropped steadily to zero. All of these patterns hold for all classically conditioned phenomena.


If we take dogs as an example, food can be considered an "unconditioned stimulus" as far as our story is concerned. That means that any time a dog is presented with food, the dog salivates, whether or not it has ever been presented with food before. In other words, a dog's response to food is innate and independent of experience. The bell, by contrast, starts out as a "neutral stimulus" and becomes a "conditioned stimulus" once the dogs have learned that it predicts lunch; salivating at the bell is the "conditioned response".



The following graphic highlights the change in response rate throughout the extinction process. On the vertical axis is the frequency of the response to the stimulus. On the horizontal axis is time in minutes. Once the conditioned stimulus is no longer followed by the unconditioned stimulus (the bell no longer predicts lunch), the dog quickly stops responding to the conditioned stimulus. The conditioned response should not re-occur unless a new association between the neutral stimulus and the unconditioned stimulus is made.
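
For readers who like to see the numbers, here is a minimal sketch in Python of the kind of curve described above. It uses the Rescorla-Wagner model, a standard textbook model of classical conditioning that this article does not otherwise rely on. The learning rate of 0.3 and the seven-trial schedules are assumptions chosen only for illustration, not measurements from any real dog.

```python
def rescorla_wagner(trials, learning_rate=0.3, start=0.0):
    """Return the associative strength of the bell after each trial.

    trials: list of booleans, True if food followed the bell on that trial.
    learning_rate: hypothetical value (stands in for the model's alpha * beta).
    start: strength of the bell-food association before the first trial.
    """
    strength = start
    history = []
    for food_followed in trials:
        target = 1.0 if food_followed else 0.0  # what the bell "should" predict
        surprise = target - strength            # how wrong the prediction was
        strength += learning_rate * surprise    # move part of the way there
        history.append(strength)
    return history


# Bell followed by lunch for seven trials (the association builds up) ...
acquisition = rescorla_wagner([True] * 7)
# ... then bell with no lunch for seven trials (extinction).
extinction = rescorla_wagner([False] * 7, start=acquisition[-1])

for trial, value in enumerate(acquisition + extinction, start=1):
    print(f"trial {trial:2d}: bell strength {value:.2f}")
```

In this sketch the strength of the bell climbs while food follows it and then falls steadily toward zero once food stops following it, with no increase at the start of extinction.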



Comparing classically conditioned and operant-conditioned behaviour depends on understanding how conditioning works differently in each case, especially how the two kinds of conditioned response are extinguished, that is, how they undergo the process of "extinction". In this article the conditioning examples use a bell and a button, whereas Wikipedia's example uses a metronome.

Extinction definition from Wikipedia (retrieved Jan 6 2021):


Extinction is a behavioral phenomenon observed in both operant conditioned and classically conditioned behavior, which manifests itself by fading of non-reinforced conditioned response over time. In classical conditioning, when a conditioned stimulus is presented alone, so that it no longer predicts the arrival of the unconditioned stimulus, conditioned responding gradually stops. For example, after Pavlov's dog was conditioned to salivate at the sound of a metronome, it eventually stopped salivating to the metronome after the metronome had been sounded repeatedly but no food came. Many anxiety disorders such as post traumatic stress disorder are believed to reflect, at least in part, a failure to extinguish conditioned fear.



Operant conditioning


Operant conditioning tends to work differently from classical conditioning. In the classical conditioning explained above, a neutral stimulus becomes conditioned and then extinguished by changing its association with unconditioned stimuli through repeated exposures. Operant conditioning is always about how a dog interacts with its environment (including the humans and other dogs within it). In operant conditioning the dog learns the reinforcing and punishing consequences its actions have on the environment and adjusts its behaviour as a result. Actions that are primarily punished tend to go away, while those that are primarily reinforced become more common over time. As experience accumulates, the dog learns what to expect and how to act. The brain's predictions are tested against the environment from day to day. The degree to which the brain's predictions differ from the environment is known as the "prediction error". Prediction error can be relatively reinforcing or relatively punishing. Dogs perceive and remember not only the mean reinforcement or punishment, but also the standard deviation about the mean and the direction of change over time of either of these, among many other aspects of the dog's narrative of experiences. How much of this matters depends on the complexity of the environment, context, or framework within which the dog is "operating".
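
To make "prediction error", "mean", and "standard deviation" concrete, here is a small sketch in Python. It is only an illustration under stated assumptions: the `OutcomeTracker` class, the scoring of outcomes as numbers (positive for reinforcing, zero for being ignored), and the example history are all hypothetical, and nobody is suggesting a dog's brain runs this exact arithmetic.

```python
import math


class OutcomeTracker:
    """Tracks the running mean and standard deviation of outcomes.

    Outcomes are hypothetical scores: positive for reinforcing,
    negative for punishing. Uses Welford's online algorithm.
    """

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def observe(self, outcome):
        """Record one outcome and return the prediction error for it."""
        prediction_error = outcome - self.mean  # actual minus expected
        self.count += 1
        self.mean += prediction_error / self.count
        self._m2 += prediction_error * (outcome - self.mean)
        return prediction_error

    @property
    def std_dev(self):
        """Sample standard deviation of the outcomes seen so far."""
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.count - 1))


tracker = OutcomeTracker()
# Hypothetical history: barking got attention (+1) four times,
# then was ignored (0) once.
errors = [tracker.observe(outcome) for outcome in [1, 1, 1, 1, 0]]
print("last prediction error:", errors[-1])  # negative: worse than expected
print("mean:", tracker.mean, "std dev:", tracker.std_dev)
```

The last prediction error is negative because the dog had come to expect attention and got none. That kind of "worse than expected" outcome is what the dog experiences when a previously reinforced behaviour is ignored.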


From the point of view of a dog operating within a particular environment, context, or framework, consider how the behaviour of other dogs or people could reinforce or punish your dog's behaviour. This is the point of view I try to narrate when I am explaining to clients how undesired behaviours become inadvertently reinforced over repeated exposures to the same stimulus.


The scenario most people associate with operant conditioning is one where a mouse or a bird in a cage learns to push a certain button at a particular time to receive a food reward. At first, the experimenter must manipulate the environment using classical conditioning to create the associations necessary for operant conditioning. For example, an experimenter might trigger a food-treat reinforcement whenever the animal approaches the area of the cage closest to the button. This would trigger a perception of "prediction error" whenever the animal moves into the button area, and that prediction error would prompt the animal to explore the area adjacent to the button. While the animal is there, the experimenter might trigger reinforcement whenever the animal touches the button with any part of its body, further increasing interest in the button, and then repeat steps like these until the animal is rapidly pushing the button whenever reinforcement is available.
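
The step-by-step procedure above (reward getting near the button, then touching it, then pushing it) is commonly called shaping. Here is a minimal sketch of that logic in Python. The step names, the `shape` function, and the rule of raising the bar after three treats are all assumptions made up for this illustration; a real trainer or experimenter adjusts the criterion by judgment, not by a fixed count.

```python
# Hypothetical shaping steps, easiest first, taken from the paragraph above.
STEPS = ["near_button", "touch_button", "push_button"]


def shape(observed_behaviours, successes_to_advance=3):
    """Return (treats, final_criterion) for a sequence of observed behaviours.

    A behaviour earns a treat if it meets or exceeds the current criterion.
    After `successes_to_advance` treats at one step, the criterion is raised.
    """
    step = 0
    successes = 0
    treats = []
    for behaviour in observed_behaviours:
        earned = behaviour in STEPS and STEPS.index(behaviour) >= step
        treats.append(earned)
        if earned:
            successes += 1
            if successes == successes_to_advance and step < len(STEPS) - 1:
                step += 1       # raise the bar
                successes = 0   # start counting again at the new step
    return treats, STEPS[step]


session = [
    "elsewhere",
    "near_button", "near_button", "near_button", "near_button",
    "touch_button", "touch_button", "touch_button", "touch_button",
    "push_button",
]
treats, criterion = shape(session)
print(treats)
print("current criterion:", criterion)
```

Notice that the fourth "near_button" and the fourth "touch_button" earn nothing, because by then the bar has been raised. Those unrewarded attempts are the moments of prediction error that push the animal to try something more.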


In many classic experiments, an additional light is lit inside the cage to signal to the animal during which parts of the day pushing the button results in a food treat and during which parts it results in no reaction whatsoever. To teach the animal what the light "means", the experimenter must adjust the environment to trigger the appropriate consequences when the light is on (food appears) and when the light is off (no reaction). Let's assume the animal has been conditioned to push the button to get food before the additional "food availability indicator light" has been conditioned. This means that at the beginning of the experiment the animal pushes the button to get food at any time. On day one of the experiment, at 8am, the lights come on in the cages for the first time, the animals wake up, and there is a new light shining above the button. The animal wakes up hungry, so it goes over and pushes the button and gets a food reward, then pushes the button and gets another food reward. It continues doing so until either the animal is satiated and stops triggering food because it is no longer hungry, or the light goes off and food stops being delivered before the animal is satiated. The second case is the one of most interest to us in understanding the main differences between classical and operant conditioning, because of what happens next, after the animal has experienced the novel light going out for the first time.
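
The light rule described above is simple enough to write down directly. This is a small Python sketch, with a hypothetical `count_pushes` function and a made-up session, that just counts how many button pushes were reinforced (light on) and how many were not (light off).

```python
def count_pushes(events):
    """Count reinforced and unreinforced button pushes.

    events: list of (light_on, button_pushed) pairs, one per observation.
    Food is delivered only when the button is pushed while the light is on.
    """
    reinforced = 0
    unreinforced = 0
    for light_on, button_pushed in events:
        if not button_pushed:
            continue  # nothing happens if the button is not pushed
        if light_on:
            reinforced += 1    # push with light on: food appears
        else:
            unreinforced += 1  # push with light off: no reaction
    return reinforced, unreinforced


# Hypothetical session: three pushes with the light on, then the light
# goes out and the animal pushes four more times with no result.
session = [(True, True)] * 3 + [(False, True)] * 4 + [(False, False)]
print(count_pushes(session))
```

The four unreinforced pushes at the end are the ones that matter for the rest of this article, because they are what the animal does right after reinforcement is removed.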


I have prepared graphics similar to those I used for classical conditioning. This one shows how the frequency of button-pushing changes after the light goes out for the first time, that is, after "reinforcement for the operation is removed". On the vertical axis is the frequency of the operation, in this case button pushes per minute. On the horizontal axis is time in minutes:



The first new concept we have here in operant conditioning versus classical conditioning is an immediate and steady increase in the frequency of button (or lever) pushes after reinforcement is removed. This is known as an "extinction burst". The animal is pushing the button and expecting food, but no food is coming out. This increase represents "perseverance" in pushing that button despite relevant feedback to the contrary.
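
To give a feel for the shape of that curve, here is a toy sketch in Python. Every number in it (the baseline rate, the height and length of the burst, how fast the behaviour fades afterwards) is invented for illustration, and the `response_rate` function is hypothetical. It is a picture of the pattern, not data from a real animal.

```python
import math


def response_rate(minute, baseline=10.0, burst_peak=18.0,
                  burst_minutes=5, decay_per_minute=0.25):
    """Toy button-push rate (pushes per minute) after reinforcement stops.

    All parameter values are invented for illustration.
    """
    if minute <= burst_minutes:
        # Extinction burst: the rate climbs steadily from baseline to a peak.
        return baseline + (burst_peak - baseline) * minute / burst_minutes
    # After the burst: the rate decays exponentially toward zero.
    return burst_peak * math.exp(-decay_per_minute * (minute - burst_minutes))


for minute in range(0, 31, 5):
    print(f"minute {minute:2d}: {response_rate(minute):5.1f} pushes/min")
```

The part to notice is the climb at the start: the behaviour gets more frequent before it fades, which is the opposite of what you might expect when the reward has just stopped.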


Please note that this extinction burst applies not only to a button and food, but to any other "actions", "reinforcements", or "punishments" that are operant-conditioned. Particular contexts of note include:

  • barking while confined or at a window

  • cruising for attention

  • wanting to stay outside or in a strange place like in the car (I'm thinking of you Heather! ;) )

  • getting reinforced or punished in a context where ignoring is better

  • any fear- or anxiety-related behaviour (these tend to occur alongside possession, destruction, play, and who leads)

  • any inadvertent reinforcement of undesirable behaviour

If any of these might even slightly apply to you, take care to account for this confounding variable in your training, because an extinction burst looks like severe anxiety and it usually occurs when your resolve to ignore is at its lowest.

That's right. You heard me. ANY operant-conditioned behaviour is subject to possible spontaneous recovery (a sudden return of the behaviour) at ANY time in the future, regardless of how long it's been since the last instance of the behaviour. AND ANY reinforcement of an operant-conditioned behaviour that occurs during the extinction burst or during ANY possible future spontaneous recovery might result in an IMMEDIATE resumption of the behaviour at its ORIGINAL FREQUENCY. So if you are looking to extinguish an operant-conditioned behaviour, or any behaviour that has ever been inadvertently reinforced, please do your best to NEVER reinforce that behaviour again.


Obviously you can't reliably do that, and that's why I've asked you to do your best. We always try for 100%, but we almost never get it, and that's usually ok. But this does represent a categorical difference between how classically conditioned and operant-conditioned behaviours extinguish, so you must be prepared for what to expect: whatever operant behaviour you are targeting for extinction MUST first go through an extinction burst that is NOT reinforced before extinction can occur, whereas a classically conditioned behaviour does not. This pattern always holds and the behaviour does not lie. The only time you should doubt these patterns is when there is conditioning on conditioning (and again Heather I'm thinking of you here). "Conditioning on conditioning" doesn't usually happen unless you have an older dog or a dog with more serious behaviour problems.


I hope that helps! Please let me know if this article is helpful or if it raises questions. Please contact us!







Serving Hamilton, Ancaster, Burlington, Milton, Oakville and Grimsby. Call 647-530-3738.


2013-2020 by Pooch Perfect. Pooch Perfect is owned and operated by ihackdogtraining inc.