The Wheel of Operant Conditioning

I have to start this post, and should probably start every post, by saying that Linda Kaim told me this before I heard it from anyone else. But like many things I learned during my time as her apprentice, I didn’t get it for a long time. We ran into each other for the first time in a long time at a West-Gibbons method seminar last weekend at the York Pointer Setter Club near my house. We had great weather for the weekend, and during a conversation evaluating the work we were seeing done with a young dog, Linda reminded me again of something she told me a long time ago: “It’s not quadrants. It’s a wheel.”

This idea was also presented at my NePoPo® New Silver School by Bart and Michael and reiterated at the New Gold school I attended. I found this part of the school frustrating because the way the terms “positive” and “negative” are used was different. But, to be fair, they are used in the more commonly accepted and understood manner.

The difficulty that people have understanding operant conditioning seems to me to come down to two things:

  1. They try to use it in a declarative fashion (“I’m going to positively reinforce this behavior by giving the dog a food reward after he does it”) instead of as a way to describe the observed effect a consequence had on a target behavior (“I gave the dog a food reward after he did the behavior, and I observed an increase in frequency or response rate of the behavior afterward”). To be clear, the latter is correct. Whether or not some consequence will have a reinforcing or punishing effect on a behavior depends on a variety of things, such as the dog’s inherent desire to acquire or avoid the thing and the thing’s relative value compared to whatever else is going on in the current situation (distractions, competing motivators, saturation point for a reinforcer, etc.).

  2. They think of reps of a behavior as distinct events that can only result in one outcome.

    1. I tell my dog to sit, they sit, I give them a treat (positive reinforcement).

    2. I pull up on my dog’s collar, they sit, I stop pulling on their collar (negative reinforcement).

 

Representation as quadrants

In the classic presentation of operant conditioning, we have quadrants: positive and negative (additive and subtractive) are crossed with reinforcement (strengthens behavior) and punishment (weakens behavior), and the four combinations are presented in a matrix.
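
If it helps to see the matrix written out, here is a minimal Python sketch. The names `Stimulus`, `Effect`, and `quadrant` are made up for illustration and are not part of any formal system. Notice that the function takes the observed effect as an input: the quadrant is named after the fact, from what the behavior actually did, not declared ahead of time.

```python
from enum import Enum


class Stimulus(Enum):
    ADDED = "positive"    # something was added after the behavior
    REMOVED = "negative"  # something was taken away after the behavior


class Effect(Enum):
    INCREASED = "reinforcement"  # the behavior was observed to strengthen
    DECREASED = "punishment"     # the behavior was observed to weaken


def quadrant(stimulus, effect):
    """Name the quadrant from what was done and what was observed afterward."""
    return f"{stimulus.value} {effect.value}"


# Print the full 2x2 matrix, one row per stimulus direction.
for stimulus in Stimulus:
    print([quadrant(stimulus, effect) for effect in Effect])
```

The two sit examples above land in the first column: the treat is `quadrant(Stimulus.ADDED, Effect.INCREASED)` and the collar release is `quadrant(Stimulus.REMOVED, Effect.INCREASED)`, assuming the sit really did become more frequent afterward.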

 

Representation in NePoPo®

A more useful way to think about operant conditioning is to think of two training systems: one where the dog is seeking comfort and another where the dog is seeking reward. In the NePoPo® system we are taught that in one system the dog gains or loses reward as a consequence of their behavior, and in the other they encounter discomfort or restore comfort as a consequence of their behavior. Presenting these as quadrants does not allow for the same understanding because it couples the consequences by their effect on response rate instead of by the experience of the dog. NePoPo® frames operant conditioning from the dog’s point of view instead of the observer’s or trainer’s.

If we split the four quadrants up based on the type of consequences the dog experiences, it looks like this:
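
As a rough sketch, that regrouping can also be written out in Python. The dictionary name `SYSTEMS` and the helper `system_of` are hypothetical, and the quadrant labels are the classic textbook names, which is my own mapping and not necessarily the wording NePoPo® uses.

```python
# Group the four classic quadrants by what the dog experiences,
# rather than by the effect on response rate.
# Assumption: the pairing of each experience with a classic quadrant name
# is the standard textbook mapping, added here for illustration.
SYSTEMS = {
    "reward": {
        "gains reward": "positive reinforcement",
        "loses reward": "negative punishment",
    },
    "comfort": {
        "encounters discomfort": "positive punishment",
        "restores comfort": "negative reinforcement",
    },
}


def system_of(quadrant_name):
    """Return which system (reward or comfort) a classic quadrant belongs to."""
    for system, experiences in SYSTEMS.items():
        if quadrant_name in experiences.values():
            return system
    raise ValueError(f"unknown quadrant: {quadrant_name}")


for system, experiences in SYSTEMS.items():
    print(system, "->", ", ".join(experiences))
```

Grouped this way, each system contains one reinforcing and one punishing consequence, which is exactly the pairing the quadrant matrix splits apart.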

 

Representation as a wheel/circle/cycle

But if we take it a step further and consider that the behavior of the dog can fluctuate over a period of time, we can understand that:

  • We push the dog into the behavior with some form of pressure, making them uncomfortable, and they begin seeking a solution to restore comfort.

  • When the dog enters the behavior, we remove pressure, restoring comfort.

  • While the dog is still engaging in the behavior, they receive a reward, which adds to the positive experience of the dog.

  • Something else motivates them into non-compliance, so they stop doing the behavior we want and lose access to the reward.

  • We reapply pressure, which stops the non-compliant behavior as it pushes the dog back into compliance with the behavior we want.

  • This cycle can continue until some terminal signal is provided that lets the dog know this rep/event is over.

This is the most complete and accurate representation of operant conditioning because it relies on external factors to maintain motivation for behavior and accounts for the possibility that the dog may be experiencing a cycle of consequences that occur in a specific sequence, each exerting a different influence to keep the dog in the target behavior and away from non-target behavior.
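
For readers who think in code, the wheel can be sketched as a small loop in Python. This is only an illustration under my own assumptions: the names `WHEEL` and `run_rep` are hypothetical, the step descriptions paraphrase the list above, and a real dog will not march through the steps in lockstep the way this loop does.

```python
from itertools import cycle, islice

# One lap of the wheel, in the order described in the list above.
# Each entry pairs what happens with what the dog experiences.
WHEEL = (
    ("pressure applied", "encounters discomfort"),
    ("dog complies, pressure removed", "restores comfort"),
    ("dog holds behavior, reward given", "gains reward"),
    ("dog breaks behavior, reward withdrawn", "loses reward"),
)


def run_rep(steps_until_release):
    """Walk around the wheel until the terminal signal ends the rep."""
    experienced = [
        experience
        for _, experience in islice(cycle(WHEEL), steps_until_release)
    ]
    return experienced + ["terminal signal: rep over"]


# Seven steps: one full lap, then back through pressure to reward
# before the release.
for step in run_rep(7):
    print(step)
```

The point of the loop is that a single rep is not one consequence. The dog can pass through all four experiences, more than once, before the terminal signal ends it.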

 

