The Premack Principle and the spectrum of reinforcement by preference
The Premack Principle is one of the keystones of my perspectives on dog training, one of the deepest and most interesting concepts, and the one I'm most afraid of not doing justice to in this blog. I have been working on this post in the back of my mind since early 2017 (it is currently March 2021). So please bear with me and certainly send questions, criticisms and comments my way. Please be aware that I am writing this blog from the perspective of “the average dog”. If your dog has apparent behaviour issues, some parts of the article might not apply to your dog in the same way as written here because your dog’s experiences change which behaviours are expressed and in which ways.
The Premack Principle is to dog training as the "theory of relativity" is to physics. The Principle is exceptional in that it is equally applicable to training, management, and assessment, and it also speaks indirectly to cognitive biases inherent in canine behaviour. According to Premack, each dog has its own unique preferences among all activities, environments, individuals, and objects, which could, in theory, be arranged into a ranked list unique to that dog. Premack also asserts:
that any relatively more-preferred activity, environment, individual, or object can be used as a reinforcer for a relatively less-preferred activity, environment, individual, or object;
that any relatively less-preferred activity, environment, individual, or object can be used as a punisher for a more-preferred activity, environment, individual, or object.
It also follows that a trainer or owner can change a dog’s relative preference rankings by using proofing and generalization techniques. For details on how to do this, please contact us -- as this can be a little complicated…
The concept of interruptability
More-preferred activities, environments, individuals, or objects tend to be able to interrupt less-preferred ones. The greater the difference in preference between the two elements, the more frequent and longer-lasting these interruptions tend to be, and the more profoundly they displace the less-preferred element.
The fastest and easiest way to assess the relative preference of two elements is to set up experiments to measure the interruptability of one element by the other.
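For readers who like to keep notes, here is a minimal sketch of how such an interruptability experiment could be tallied. Everything in it is an assumption for illustration: the `Trial` record, the two function names, and the idea of scoring each direction by the fraction of trials in which the dog switched. It is one possible way to keep score, not a standard protocol.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    """One observation (hypothetical record format)."""
    ongoing: str    # element the dog was already engaged with
    offered: str    # element presented as a possible interruption
    switched: bool  # did the dog leave the ongoing element for the offered one?

def interruption_score(trials, ongoing, offered):
    """Fraction of matching trials in which `offered` interrupted `ongoing`."""
    relevant = [t for t in trials if t.ongoing == ongoing and t.offered == offered]
    if not relevant:
        return None  # no data for this direction
    return sum(t.switched for t in relevant) / len(relevant)

def more_preferred(trials, a, b):
    """Return whichever of `a` or `b` interrupts the other more often.

    Returns None when a direction has no data or the scores are tied.
    """
    a_interrupts_b = interruption_score(trials, ongoing=b, offered=a)
    b_interrupts_a = interruption_score(trials, ongoing=a, offered=b)
    if a_interrupts_b is None or b_interrupts_a is None:
        return None
    if a_interrupts_b == b_interrupts_a:
        return None
    return a if a_interrupts_b > b_interrupts_a else b

# Example with made-up observations: the ball interrupts chewing in 3 of 4
# trials, while the chew toy interrupts ball play in only 1 of 4.
trials = (
    [Trial("chew toy", "ball", True)] * 3 + [Trial("chew toy", "ball", False)]
    + [Trial("ball", "chew toy", True)] + [Trial("ball", "chew toy", False)] * 3
)
winner = more_preferred(trials, "ball", "chew toy")
```

With these made-up numbers, `winner` comes out as the ball. A handful of trials in each direction is enough for a rough read, and the same tally can be extended with duration if you also time how long each interruption lasts.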
Limits to the applicability of the Premack Principle
A young puppy coming into the world cannot be considered “a blank slate” for our purposes. There are certain “things” (activities, environments, individuals, or objects) which all dogs are biased to perceive, act on, and remember. In this case, I'm referring to the "genetic inheritance" dogs have from the African Wild dog, Lycaon pictus, which recent genetic evidence has shown is the likely equal progenitor of both Grey wolves (Canis lupus), and Domestic dogs (Canis familiaris).
Some common examples include:
- Alarm barking, "mailman syndrome", and barrier frustration
- Tug play, possession, wrestling, chase, fetch and other initiative games
- Patrolling, sniffing and scent marking
- Confusing correlation with causation.
Beyond these common "ancestral" canine behaviours and preferences (from the African Wild dog), there is a great deal of variation in these and other aspects of a dog's personality, namely which environments, activities, individuals, and objects an individual dog prefers. A proportion of these preferences, tendencies and biases are due to genetic and epigenetic variations between individual dogs, but the most significant source of variability in behaviour is due to variability in experience. If you aren’t training core skills, routines and expectations deliberately, then it’s tough to guess exactly what your dog is learning until “it’s too late” and you realize your mistake due to the advent of undesirable behaviour, anxiety, insecurity, or defensiveness. Please do your best to train core skills deliberately, and if you’re not sure how to do this, please contact us.
While the Premack Principle is independent of Thorndike’s Principles of Learning, these two concepts complement each other among a trainer or owner’s perspectives on their dog(s). Further, the entirety of the development of canine behaviour during the first few years of life is based upon the ecological requirements of domestic dogs (Canis familiaris), and their immediate genetic ancestors, African Wild dogs (Lycaon pictus). Many of the adaptations necessary for the survival and reproduction of African Wild dogs are apparent in the behaviour of domestic dogs and I would like to illustrate this further below.
Experiences and learning that occur earlier in a dog’s life (under and about a year old) tend to have a greater long-term impact on temperament, personality, and behaviour than those that come later (2-3 or more years old, depending on the breed and size at maturity). The behavioural ecology of the African Wild dog presumes intensive parental care, nurturing, protection and supervision. Therefore, in order to take the best advantage of the wealth of knowledge available from parents and siblings from previous years, the hormonal environment in younger dogs encourages more rapid and lasting formation of redundant neural connections in the brain than the hormonal environment in older dogs. This is one of the main reasons it's so important to train your dog as a puppy -- particularly on the key lessons: to listen to the humans, and to behave according to house rules. When there is a great number of redundant neural connections, as there is in growing dogs, learning occurs faster and more easily, and sticks around longer. That’s the kind of thing we want to take advantage of as conscientious dog owners and trainers, because a dog with a great number of neural connections for listening, delaying gratification, and staying calm is one that will lead an easier and happier life in the human world.
Primary, Secondary, and Tertiary reinforcers:
Primary reinforcers are phenomena that are routinely satisfying or pleasurable to at least a large majority of dogs (about 66% of the total population). Primary punishers are phenomena that are routinely dissatisfying and/or unpleasant.
Secondary reinforcers are phenomena that reliably predict the arrival of a primary reinforcer. Secondary punishers are phenomena that reliably predict the arrival of a primary punisher.
Tertiary reinforcers are phenomena that reliably predict the opportunity to receive primary reinforcement. Tertiary punishers are phenomena that reliably predict the opportunity to receive primary punishment.
The “normal” set of primary reinforcers is usually determined by the species of the animal in question -- each species valuing the various activities, environments, individuals and objects differently, but secondary and tertiary reinforcers are most definitely learned during the “sensitive period” where biology determines a rapid learning pace, best suited for long-lasting or highly-generalizable conditioning. See a detailed description of these biases earlier in this document.
One of the most interesting insights the Premack Principle highlights is the fact that in any mammal brain, there is only one physiological chemical signal associated with all instances of reinforcement, and that is dopamine. Dopamine is what all mammal brains crave and its signal is essential for everything we crave and work for. In a certain sense, as far as the brain is concerned, all activities in life are just different pathways to different amounts of dopamine. What determines our preferences is our varying experiences as we work towards different kinds of plans to different kinds of dopamine rush. Narcotic addiction, and addictive behaviour itself, is dopamine-seeking at its “most dysfunctional”, but any repeatable behaviour can become an addiction, depending on the unique experiences of a particular individual.
This is why it really matters how you feed, how you play, how you walk, how you exercise, how you relax and how you groom/manage -- and why it’s possible to be “too permissive”, or “too strict” regardless of what “training method” you prefer. See our blog articles on “What are reinforcement and punishment”, and “Training versus management” for further information.
Some folks call the Premack Principle, “the broccoli principle” because it talks about relative preferences, but I meet a lot of growing dogs who have in contrast, been trained on “the dessert principle” instead, which states, “dessert, whatever type, however much, and whenever the dog requests it”. The only problem with that is that many dogs quickly generalize this “dessert principle” and become fussy brats simply because they have never been asked to practice gratification delay or distress tolerance -- while the human world requires that we all exercise a degree of patience, discretion, and “biddability” whether we are human or not -- and dogs frequently face destruction orders and rehomings if they violate these unspoken human rules.
Skills are also important to learn young. Sit, down, stay, come, and confinement training are all usually considered foundational, but I would add that the dog's name, a command for eye contact, learning to respect the leash, and staying calm while confined (i.e.: no unnecessary barking) are just as critical as teaching gratification delay and “broccoli (behaviour) is the best way to dessert (behaviour)”. Dogs that don’t get these types of training before they are fully adult will have a harder time retaining and applying those skills appropriately over their lifetime as compared to dogs who learn early and learn well. Part of the reasoning behind emphasizing these skills is that they aren’t necessarily ones all dogs inherit as part of their genetic legacy from the African Wild dog. If we wish for these skills to be readily available “on demand” throughout our dog’s life, it’s wise to train intentionally, while the dog is young, to ensure effective and long-lasting conditioning.
One of the most important concepts to understand when confronting the Premack Principle is the concept of (degrees of) freedom. If, for example, we watch a dog’s behaviour for a period of time, we can make a rough relative measurement of the dog’s “preferences” for its various activities, environments, individuals, and/or objects, if we start with the assumption that this dog “is free” to engage in any behaviours it wishes during that period. If we make that assumption, then all we have to do is record the number of instances and the amount of time the dog remains engaged with each of its daily activities, environments, individuals, and/or objects, and place them in a ranked list from lowest to highest. The only problem is that most dogs “are not free” to choose among all possible behaviours during an average day, because of the conditioning their prior experiences have laid over the “non-blank slate” inherited from the African Wild dog.
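As a rough illustration of that record-keeping, here is a small sketch that turns a day's observations into a ranked list, lowest to highest. The function name, the `(activity, seconds)` log format, and the choice to rank by total time with the number of instances as a tie-breaker are my own assumptions; the result is only a "preference" in the statistical sense, and only as good as the assumption that the dog was free to choose.

```python
from collections import defaultdict

def rank_preferences(observations):
    """Rank activities from least to most engaged-with.

    `observations` is an iterable of (activity, seconds) pairs, one pair per
    bout (hypothetical log format). Activities are ordered by total time, and
    ties are broken by the number of separate bouts.
    """
    total_seconds = defaultdict(float)
    bout_counts = defaultdict(int)
    for activity, seconds in observations:
        total_seconds[activity] += seconds
        bout_counts[activity] += 1
    return sorted(total_seconds, key=lambda a: (total_seconds[a], bout_counts[a]))

# Made-up log for one afternoon.
log = [
    ("sniffing the yard", 300),
    ("napping", 1200),
    ("sniffing the yard", 200),
    ("chewing a toy", 100),
]
ranking = rank_preferences(log)
```

The last item in `ranking` is the activity the dog spent the most time on. Keeping the same log over several days makes it easier to see whether the ordering is stable or just reflects one unusual afternoon.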
If, for example, a dog has been freaking out on every walk, every day of its life, to a certain extent, that dog is conditioned to freak out on every walk every day simply by virtue of repetitive conditioning -- and as long as this dog’s days continue to resemble one another to a significant extent, there is no reason for this conditioned behaviour to change. It is critical to recognize and accept the details of your dog’s personality/conditioning/shortcomings and start by managing the dog you have, as you train to increase your dog’s skills so your dog requires less behavioural management over time, and can gradually take over more responsibility for managing their own behaviour. In this case, we would consider the dog to “prefer” this type of behaviour, even though the behaviour is likely to represent a significant degree of emotional distress -- it’s “prefer” in a statistical sense -- and that is a key distinction to make when communicating about a dog’s behaviour using the Premack Principle. In most cases, responsible trainers should try to develop “positive outlets” for dogs who have “problematic conditioning” around basic functions like the daily walk.
Many dogs with anxiety issues need a good deal of arm-twisting before they break down their barriers and try relaxation for the first time. Unfortunately, the greater the proportion of a dog’s life that is spent in “defensiveness” or “anxiety” or “insecurity”, the greater the chances the dog will not be able to leave these feelings behind in the future, particularly when that time falls during the first years of its life.
One simple fact may be sufficient to suggest the origin of the canine obsession with walks and sniffs: African Wild dogs must routinely survey and mark vast territories of up to 1,000 sq km in order to find enough game. Another is that African Wild dog adults leave their pups in a burrow while they hunt until they are full, and then return to the burrow. The pup that sees the adults first is most likely to be the first to be fed by the regurgitations of an adult. Adult regurgitations are usually much too large for a pup to swallow, so a tug play follows in which siblings approach the first pup just as it approached the adult. These might be the reasons dogs tend to bark and tug.
While it is important to review the biases (“personalities”) common among dogs, there are very few other limitations to the management, training and assessment applications of the Premack Principle, and the individual variations among unique sets of biases is mind-boggling. Every dog truly is different, so it’s important that we have a way to determine what kinds of behaviour are functional and what kinds of behaviour are dysfunctional.
Earlier, there was an example of a dog that freaked out every time it went for a walk. This is a classic example of where you can sometimes easily tell whether a dog is listening or whether its behaviour has become dysfunctional. For example, if the dog pulls equally hard on the leash no matter which direction you’re going or for how long, or if the tension the dog pulls with increases steadily from one walk to the next, these are pretty clear signs something is going wrong. Let’s take one more example: a dog that likes to bark and react at other dogs during its walk. If the dog’s reactions to other dogs decreased steadily during the walk, I would be less worried than if the frequency and intensity of its reactions stayed the same throughout, or increased throughout.
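If you want to put numbers to that judgment, one option is to count reactions in equal blocks of the walk (say, every five minutes) and look at the direction of the trend. The sketch below is only an illustration under my own assumptions: the function name, the block counts, and the `tolerance` cut-off are hypothetical, and the slope is an ordinary least-squares fit, which is just one reasonable way to summarize a trend.

```python
def reaction_trend(counts_per_block, tolerance=0.1):
    """Classify how reactions change across equal-length blocks of one walk.

    Fits a least-squares line through the counts and returns "decreasing",
    "increasing", or "steady". `tolerance` (an assumed cut-off) is the slope,
    in reactions per block, below which the trend is treated as flat.
    """
    n = len(counts_per_block)
    if n < 2:
        raise ValueError("need at least two blocks to see a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(counts_per_block) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(counts_per_block))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    slope = numerator / denominator
    if slope < -tolerance:
        return "decreasing"
    if slope > tolerance:
        return "increasing"
    return "steady"

# Made-up counts of reactions per five-minute block on three different walks.
settling_walk = reaction_trend([5, 4, 2, 1])
flat_walk = reaction_trend([3, 3, 3, 3])
escalating_walk = reaction_trend([1, 2, 4, 5])
```

In the terms used above, a "decreasing" walk is the less worrying pattern, while "steady" or "increasing" are the ones to pay attention to. The same function can be fed one number per walk (for example, a rating of leash tension) to watch the trend from one walk to the next.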
A walk is important for physical and mental stimulation. The idea is that every dog has a certain number of “jellybeans” every day that it is wise to walk out. A dog that’s done this is supposed to earn a benefit from taking that walk, and if walked sufficiently (i.e.: all the jellybeans came out), your average dog will happily nap away most of the rest of the day. On the other hand, if your dog comes home from the walk with more excitement than it started with, this is not usually a good sign. A walk is supposed to calm a dog’s anxieties, not provoke them. When a walk (or any other activity) no longer serves its intended function, it becomes dysfunctional by definition. Another good definition for dysfunction is the same as the definition of insanity: “trying the same thing over and over again, desperately hoping for a different result”.
“The three strands”
Dr. Ed Bailey, Professor Emeritus of animal and canine behaviour at the University of Guelph, has a regular column in “Gun Dog” magazine. Dr. Bailey is a very experienced and influential trainer and behaviourist in our area. One of the “Gun Dog” articles talks about “the three strands” inherent in the training and conditioning of all dogs. These three strands are wound together and depend on each other to strengthen “the behaviour and training rope” as a whole.
The first strand is the intended outcome of skills and behaviours which are inherent in any responsible training program, but this strand emphasizes the responsibilities owners and trainers must have to practice repetitions and insist on higher standards for emphasized skills to a greater degree than less-emphasized skills, to consistently reinforce desired behaviours and consistently withhold reinforcement for undesired behaviours.
The second strand is fallback and escalation. What key lessons does the owner/trainer fall back on if/when things go wrong? How does the owner/trainer follow through to ensure obedience occurs appropriately? Is there an “escalation of seriousness” cue -- and what might it look like?
The third strand is what parts of the desired training and behaviour outcomes does this particular dog like the best, and how can we make “work” as attractive a proposition as we can for this particular dog? What types of activity, reinforcement and cooperation does this dog enjoy the most?
Using the “rate of reinforcement”:
Many trainers are quite conservative when it comes to adjusting the rate of reinforcement, especially when it comes to the maximum rate of reinforcement available at any moment. Sometimes bringing a dog’s attention to new behaviour possibilities requires a VERY high rate of reinforcement. Dogs are capable of detecting very slight changes (10% or less) in the rate of reinforcement. Make sure a rapidly-changing reinforcement rate occurs whenever it is needed.
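To make "rate of reinforcement" concrete, here is a small arithmetic sketch. The function names and the example numbers are assumptions for illustration; the only figure taken from the paragraph above is the roughly 10% change that dogs are said to be able to detect.

```python
def reinforcement_rate(reinforcers, seconds):
    """Reinforcers delivered per minute over a stretch of training."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return reinforcers * 60 / seconds

def relative_change(old_rate, new_rate):
    """Change in rate as a fraction of the old rate (0.10 means +10%)."""
    if old_rate <= 0:
        raise ValueError("old_rate must be positive")
    return (new_rate - old_rate) / old_rate

# Made-up example: 10 treats in two minutes, then 15 treats in two minutes.
before = reinforcement_rate(10, 120)
after = reinforcement_rate(15, 120)
change = relative_change(before, after)

# The 10% figure comes from the text above; treating it as a simple
# cut-off is my simplification.
likely_noticeable = abs(change) >= 0.10
```

Here the rate goes from 5 to 7.5 reinforcers per minute, a 50% increase, which is well past the point where a dog would be expected to notice.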
Starting and ending a training session with a pattern of behaviours of a known value that signal the use of specific skill-sets - for example, a professional working dog wearing a special harness or other equipment whenever they are working (some dogs also have a special bumper) -- donning and doffing the harness conditions the dog to prepare for work.
Eye contact, or “response to name” is an excellent behaviour to use as a bookend for a new dog you are working with.
Consistently and intentionally starting and ending a training session with a haphazardly or eclectically chosen activity -- to emphasize that no particular action is expected in this situation.
Walking into an environment and sensitizing the dog to the places and activities you wish to emphasize. One typical puppy training recommendation is that whenever you and your dog walk into the family room, you immediately guide your dog towards their bed, ask them to lie on it, and then reinforce your dog before you do anything else. Repetitive conditioning like this would make lying on the bed a much higher priority for the dog than doing nothing. The key here is to guide the dog along the steps you want the dog to eventually take over. This practice intentionally "interferes" with a dog’s normal preference-assignment process.
If you want a dog to be relaxed, sometimes the easiest and best thing to do is to over-emphasize a “down” for a few minutes -- see Premack technique “emotional loading”.
Deliberately ignoring the dog in an environment, during an activity, or with an object, in order to intentionally allow the dog to form its own preferences associated with the environment, activity or object.
Deliberately using signs, signals and learned associations with known emotional values in association with an environment, activity or object which we wish to take on the known emotional values.
Deliberately using incidental occurrences, cues, and routine procedures of known emotional value in association with environments, activities and objects you wish to take on the known emotional values. This is particularly useful when teaching complex skills like loose-leash walking or Service skills, as well as in preventative safety training.
Artificially introducing interruptions of a less-preferred activity/environment/individual/object into a bout of a more-preferred activity/environment/individual/object, repeated over time, should serve to reinforce the less-preferred activity/environment/individual/object.
Cooperation and “cause and effect”:
Oftentimes, the only thing you have to do here is allow your dog to cue you, or make yourself hyper-sensitive to your dog’s cues in certain situations in order to ensure the dog knows you are paying attention. I cannot overemphasize the importance of making sure your dog knows you are cooperating with them. That’s why, whenever your dog behaves desirably of its own accord, you need to be sure it is properly reinforced. Ensure key skills remain emphasized by disciplining yourself to proactively condition voluntary desirable behaviour as frequently as possible.
Degrees of freedom:
You are responsible for ensuring you get proof that your dog can and will behave desirably in any given situation. You are also responsible for making sure your dog does not “abuse its freedom” by behaving undesirably. You must ensure that your dog does not have more freedom than it can responsibly handle, under its excitement threshold, and still respond to requests from you. Cooperation requires that the dog be able to cue you for some things.
Safety supervision and Behaviour management:
You are responsible for your dog’s safety, including the consequences of the conditioning you’ve given them (or perhaps neglected). You must see proof that your dog can behave in a safe and responsible manner without your help before you can “trust” your dog will behave well when unsupervised. Do not consistently leave a dog with unearned freedom; bad conditioning is the least of what you will experience.
Intentionally adding “unnecessary” repetitions of training exercises, or training to a standard greater than the minimum required in order to contribute to one of the “three strands”, and overcome inherited and learned behavioural biases.
These are just a few of the examples of how I use the Premack Principle on a daily basis. Being aware of the spectrum of reinforcement available to you at all times is critical for people hoping to teach trust, security and patience. I hope you find this article useful, please don’t hesitate to ask any questions, or offer critiques and suggestions.