
The Rate of Reinforcement - The simplest and most applicable training tool you've never heard of

  • Writer: Mark McCormack
  • Jun 11, 2021
  • 9 min read

Updated: Oct 1

The 1973 oil crisis, brought on by the Arab-Israeli war that October, led, at least indirectly, to the development of a series of important concepts within Behavioural Ecology. These concepts are directly applicable to several aspects of dog training and behaviour modification, yet most trainers don’t know them. They are the subject of this blog article.


At first, this whole thing may seem a bit weird and far-fetched, but I have seen several independent behaviour professors and other researchers making the same connections as I will in this article. So these ideas aren’t mine, but I do consider their direct application to dog training and behaviour modification to be my original work. That includes 3-5 new treat-handling, treat-placement, and treat-delivery techniques that I think training geeks like me would find useful or interesting, both for bolstering the degrees of freedom available within existing techniques and for possibly offering novel approaches during assessment, conditioning, and proofing.


The 1973 oil crisis began in October, when the Arab members of OPEC stopped at least 75% of crude oil shipments to nations that supported Israel during or after the Yom Kippur War, which had begun earlier that month. This led directly to a sudden and severe drop in the supply of oil compared to a “business-as-usual” supply. Many gas stations were without gas for weeks at a time, and many governments were forced into petroleum rationing “for essential services/uses only” to avoid price-gouging, huge lineups, rushes, hoarding, and escalating frustration and violence among those waiting for a fill-up. These situations may have predisposed the academics of the time to think and theorize about the best distribution of limited resources, and about the implications for the behaviour of animals exposed to similar kinds of stresses, which just so happen to occur regularly in the daily challenges most wildlife face.


Optimal foraging theory (OFT) is a field within Behavioural Ecology that describes how natural selection is likely to favour, over time via mutation and “survival of the fittest”, animals who waste as little energy as possible, or who make the smartest decisions about when, and for how long, to hide from predators, search for food, eat, store food for later, socialize, or defend their stored food. OFT is both a theoretical and experimental field, with a rich research canon. Eric Charnov’s Marginal Value Theorem and John Krebs’ books, Foraging Theory and Behavioural Ecology, are all worth looking up. These, plus Central Place Foraging, all fall under OFT, which assumes that evolution would select for behaviour that tends to maximize the rate of prey capture over generations.


The theory goes that animals who are able to extract the most survival and reproduction value from each hour of each day, while trying to minimize predation risk, will be the ones most likely to survive and reproduce in the long run. Naturally, there is also a good deal of individual variability in behavioural and cognitive biases in most animal populations, but in general, when we go out to measure animal behaviour in the wild, we should expect to see most animals doing a “pretty good job” of balancing risks vs. rewards: taking enough chances to make sure they eat sufficiently, but not over-feeding to the point that it over-exposes them to predation risk or reduces their ability to escape if a predator appears.


My master’s research was on Eastern chipmunks, Tamias striatus. My research advisor, Dr. Giraldeau at UQAM, found good evidence that chipmunks base many of their foraging decisions on the rate of prey capture, usually favouring itineraries that serve to maximize that rate. My research showed that the presence of other chipmunks foraging nearby prompts most chipmunks to sacrifice their rate of prey capture in order to directly antagonize or compete with those chipmunks. This means not only that they each know when and where one another is finding food, but also that interacting with and antagonizing one another in these situations is quite important to this species.


All of these concepts are based on a rather surprising assumption: that food resources are not distributed uniformly in the environment, but are present in patches of high resource density. This is, in fact, what I saw during my chipmunk behaviour research. The Canadian forest where I did that research has perhaps 25 species of trees, three of which (maple, oak, and beech) are considered "masting tree species". That means that they produce seeds pretty much every year, but every few years, each tree has a "mast" year where it produces several times the normal number of seeds. Seeds are produced in summer and drop in the autumn, just as the chipmunks are trying to cache seeds in their burrow/larder to survive on during the winter. It is to each chipmunk's advantage to collect seeds under a tree that has masted in that particular year until the number of seeds under that tree (i.e., in this tree's patch) declines to the number of seeds the average tree produces in a "non-mast" year. At that time it is to that chipmunk's advantage to find another tree that has masted this year and collect seeds there, all in order to try to maximize its rate of prey capture. This attracts chipmunks from neighbouring burrow/larders to forage together under the same trees, encouraging them to socialize together, usually in an "agonistic" manner (which basically means they wrestle). It turns out that, before moving on, chipmunks continue to capture seeds from a patch for at least two or three minutes beyond the time when the average rate of prey capture has fallen to or below the average of a tree that has not masted. In most cases the chipmunks are accurate to within about 10%.
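The patch-leaving rule described above can be sketched in a few lines of Python. This is only an illustration, assuming a hypothetical patch whose capture rate decays exponentially as seeds are removed. The starting rate, decay constant, and background rate are made-up numbers, not measurements from the chipmunk research, and the function names are invented for this example.

```python
import math

def capture_rate(t_min, start_rate=12.0, decay=0.25):
    """Seeds captured per minute after t_min minutes in the patch.

    Assumes (for illustration only) that the rate decays exponentially.
    """
    return start_rate * math.exp(-decay * t_min)

def leaving_time(background_rate, start_rate=12.0, decay=0.25):
    """Minutes until the patch's rate falls to the background (non-mast) rate."""
    if background_rate >= start_rate:
        return 0.0  # the patch is never better than the background
    return math.log(start_rate / background_rate) / decay

# Hypothetical numbers: a masted tree starts at 12 seeds/min,
# while an average non-mast tree yields 3 seeds/min.
t_leave = leaving_time(background_rate=3.0)
print(f"Predicted leaving time: {t_leave:.2f} min")
```

With these made-up numbers the predicted leaving time is a little over five and a half minutes. The chipmunks described above would stay two or three minutes longer than a prediction of this kind.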


In fact, these types of behaviours are what many researchers have found in a wide variety of species; many different types of animals, including fish, octopus, cuttlefish, eels, arthropods, reptiles, birds, and mammals, can all detect about a 10% change in the “rate of prey capture” within a few minutes, wherever and whenever it occurs. It appears that all of these diverse animal groups have developed neural nets capable of making the integral calculations required to determine the marginal rate of prey capture from the perceived time between prey captures, in effect plotting a logarithmic depletion curve. This would mean dogs can detect a 10% change in “the rate of reinforcement” between two different vocal cues, hand signals, dog handlers, or training situations. I don’t have a reference ready to support this broad claim because I have not had recent access to the scientific literature, but it’s something I believe has been firmly established within the behaviour research canon.


Before we talk about the particulars of what those “rates of reinforcement” are, and are good for, I want to clarify what I mean by “a 10% change in rate of reinforcement”. There are all kinds of ideas about what reinforcement is and how often you should do it, so I want to clarify some of the things I mean, and some of the things I DO NOT mean by this usage of “reinforcement”.


I’ve heard Cesar Millan say something like, “[Reinforcement is not a thing for ‘the alpha dog’. You will never see ‘the alpha dog’ go up to another dog and say, ‘wow, thank you for walking ten miles’]”. And I’ve heard “The Wolf Man” say he reinforces his dog every day at supper time, and if the dog ‘has been bad’ he doesn’t eat. Neither of these opinions makes much sense to me. This isn’t what reinforcement looks like in my book.


I also see a lot of folks who work on getting sits and downs and other tricks twice a minute for twenty minutes, then make and serve tea for 15 minutes, and then take a biscuit to the dog and tell him he was a good boy when he was doing his sits and downs half an hour ago. This isn’t reinforcement either; or rather, the biscuit most likely reinforces being on the bed rather than the earlier training. I’m not talking about this either.


I’m talking about an operant-conditioning scenario where the dog is familiar with the operating framework: for example, a dog that knows that if it comes to you and makes a nice sit, you are much more likely to respond with attention than you would if the dog had not made the sit and had only approached standing. The dog has to believe it has the power to "make" you reinforce it, as long as it behaves according to certain expectations. One of the best ways to go about this is to demonstrate to the dog that it can trigger you to reinforce it whenever it wants, at least temporarily. I use the principle of interruptability, as I described in my blog article about the Premack Principle, to communicate my interest in reinforcing any particular behaviour.

The way I demonstrate to a dog that it can cue me to reinforce it is simple:

I set a criterion in my head, an idea of what I’m looking for. Let’s say I want to emphasize a sit: I would make sitting the criterion. If the dog sits, I will reinforce the dog for doing so. Further, if I can detect that the dog sat, I will interrupt whatever else I am doing in that moment to tell the dog it has done well, and promptly deliver a bit of kibble. You really do have to make an effort to keep a keen eye out, to make sure you reinforce the sit reliably enough to make a lasting impression. This is what I mean by interruptability. There’s one trick in particular I love to use in connection with this that makes it even more effective as a conditioning tool, but you’ll have to contact me to learn about it.


To be clear, I am using food, voice, touch, and eye-contact as the primary means of reinforcement in this article.


At the beginning of the conditioning, your goal rate of reinforcement for the sit in this case is 100%: you want to try to “capture” as many sits with a reinforcement as you can. A finished sit is usually reinforced with food about 5% of the time, but with voice, touch and eye-contact about 90% of the time. This is one of the things I mean by rate of reinforcement: a measurement of how many times you reinforce the behaviour out of the total number of times the dog actually does the behaviour (measured as a percentage). I will refer to this kind of “rate of reinforcement” as the “operant rate of reinforcement”. In this case, you try to make it clear to the dog that (s)he is operating you.
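As a quick illustration, the operant rate of reinforcement is just a percentage. The Python sketch below computes it from a made-up session log; the function name and the log itself are hypothetical and only there to show the arithmetic.

```python
def operant_rate(reinforced, performed):
    """Percentage of performed behaviours that were reinforced."""
    if performed == 0:
        raise ValueError("the dog has not performed the behaviour yet")
    return 100.0 * reinforced / performed

# Hypothetical early-conditioning log for "sit":
# True = the sit was reinforced, False = the sit was missed.
early_session = [True, True, True, False, True, True, True, True, True, True]

rate = operant_rate(sum(early_session), len(early_session))
print(f"Operant rate of reinforcement: {rate:.0f}%")
```

Here nine of ten sits were captured, so the rate is 90%, short of the 100% goal for early conditioning. The same function could be run separately for food and for voice, touch, and eye-contact.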


There’s something else I mean by “rate of reinforcement”. Imagine there are two trainers standing in the park, 25m apart. The trainer in the East can give a dog a treat two times each minute, or every thirty seconds. The trainer in the West can give a dog a treat twenty times each minute, or every three seconds. In this case the trainer in the East is giving reinforcements at one tenth the rate of the trainer in the West. If there are twenty-two dogs in the park, two should be in the East, and twenty should be in the West (in Behavioural Ecology, this is known as the “Ideal free distribution”). This is the other thing I mean by rate of reinforcement: a measurement of how many reinforcements are provided every second or every minute (measured in treats per second or treats per minute). I will refer to this kind of “rate of reinforcement” as the “immediate rate of reinforcement”.
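The two-trainer example can be worked through in Python. This is a minimal sketch that assumes dogs split up in direct proportion to each trainer's treat rate; the function name is made up for this illustration.

```python
def ideal_free_distribution(n_dogs, rates):
    """Split n_dogs across trainers in proportion to each trainer's treat rate."""
    total = sum(rates.values())
    return {name: n_dogs * rate / total for name, rate in rates.items()}

# Treats per minute offered by each trainer in the park example.
treats_per_minute = {"East": 2, "West": 20}

dogs = ideal_free_distribution(22, treats_per_minute)
print(dogs)

# Under this split every dog gets the same payoff, wherever it stands.
for name, count in dogs.items():
    print(name, treats_per_minute[name] / count, "treats per dog per minute")
```

With twenty-two dogs the split is two in the East and twenty in the West, and each dog receives one treat per minute in either spot, so no dog can do better by switching trainers.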


I believe dogs are capable of detecting a 10% difference in both the operant rate of reinforcement, and in the immediate rate of reinforcement -- and I am working on an experiment to provide evidence for this claim.


I am currently able to reinforce a dog as fast as 75 kibbles per minute, or 1.25 kibbles per second. This is effectively the maximum rate of reinforcement available during a typical training session with me. I usually don’t reinforce at this rate for longer than 10 seconds at a time. The minimum rate of reinforcement I would normally offer during a typical training session is one kibble per twenty minutes (usually offered during advanced confinement or "stay" training). That represents a fifteen-hundred-fold increase in the immediate rate of reinforcement between the minimum and maximum rate I offer during a typical session (I define “a fold” in this case as a 100% increase). I feel this is far beyond what a typical trainer might offer. If I’m correct and dogs are capable of detecting a 10% change in the immediate rate of reinforcement, that means I have about 15,000 different rates to offer any particular dog (1,500 folds, at ten 10% steps per fold). If I pull out, say, twenty different rates in a one-hour session, I’m able to communicate a lot more information to the dog about its performance on each individual repetition of the exercise, when compared to a session where I only offer two different rates of reinforcement. When many different immediate rates of reinforcement are available, it becomes easier for the dog to understand not only when they have done well, but exactly how well they have done.
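The arithmetic behind these figures can be checked with a short Python sketch. The first count follows the scheme in the text (ten 10% steps per fold). The second is an alternative assumption, not from this article, in which each detectable step must be 10% larger than the rate before it. Variable names are illustrative only.

```python
import math

max_rate = 75.0        # kibbles per minute, the fastest delivery quoted above
min_rate = 1.0 / 20.0  # kibbles per minute, one kibble per twenty minutes

per_second = max_rate / 60.0
fold_range = max_rate / min_rate

# Counting scheme used in the text: ten 10% steps for every fold.
linear_steps = fold_range * 10

# Alternative assumption: each step is 10% larger than the previous rate.
geometric_steps = math.floor(math.log(fold_range) / math.log(1.10))

print(f"{per_second:.2f} kibbles per second")
print(f"{fold_range:.0f}-fold range")
print(f"{linear_steps:.0f} steps (linear), {geometric_steps} steps (geometric)")
```

Under the alternative assumption the count drops to 76 distinguishable rates, which is still far more than the twenty used in the one-hour example. Which way of counting better matches what dogs actually perceive is an open question that this sketch does not settle.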


Reinforcement Type


Offering different types of reinforcement simultaneously, in different numbers, in different situations, is another critical component of teaching lessons that last. In this case I’m referring not only to reinforcement by voice, eye contact, and touch: each dog also has its own preferences among the times, places, activities, objects, people and styles of reinforcement. Knowing your dog’s preferences intimately only adds to the different rates and types of reinforcement you can use simultaneously during any given session to communicate with your dog(s).

The more different types of reinforcement you use at the same time, the more reinforcing the event will be. The more closely the simultaneous reinforcements match the preferences of the dog, the more reinforcing the event will be.


Using generalization and proofing techniques in conjunction with the rate of reinforcement and reinforcement type is one of the most effective ways of teaching a dog a framework of routines and expectations in a way that helps them understand their roles and responsibilities more and more over time.


See my blog article on the Premack Principle, particularly the “Techniques” section, as many of those techniques are also associated with the rate of reinforcement and its conditioning strategies. Contact Us with any questions or comments.



Serving Hamilton, Ancaster, Burlington, Stoney Creek, Grimsby, and area

Call

647-530-3738

2013-2022 by

Pooch Perfect.

Pooch Perfect is owned and operated by ihackdogtraining inc.
