Skinner’s Operant Conditioning

Operant Conditioning

Operant conditioning is the procedure by which a change in the consequences of a response affects the rate at which the response occurs. The operant-conditioning process can be illustrated by the progress of a rat in Skinner’s operant-conditioning apparatus, also known as the Skinner box.

When a food-deprived rat is placed in the box, its behavior at first is spontaneous and random. The rat is active, sniffing, poking, and exploring its environment. These behaviors are emitted, not elicited; in other words, the rat is not responding to any specific stimulus in its environment.

At some time during this activity, the rat will depress a lever or bar located on one wall of the Skinner box, causing a food pellet to drop into a trough. The rat’s behavior (pressing the lever) has operated on the environment and, as a result, has changed it.

The food is a reinforcer for the behavior of depressing the bar, and the rat begins to press the bar more often. What happens? It receives more food (more reinforcement) and so presses the bar even more frequently. The rat’s behavior is now under the control of the reinforcers: its actions in the box are less random and spontaneous because it spends most of its time pressing the bar and eating.

If we put the rat back in the box the next day, we can predict its behavior, and we can control its bar-pressing actions by presenting or withholding the reinforcers or by presenting them at a different rate. Withholding the food extinguishes operant behavior in the same way that it extinguishes respondent behavior: if the behavior no longer works, in that it no longer brings a reward, after a while it will stop. Thus, the person who controls the reinforcers controls the subject’s behavior.
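The conditioning-then-extinction cycle described above can be sketched as a toy simulation. All the numbers here (the starting press probability, the 0.05 learning step, the trial counts) are illustrative assumptions, not Skinner’s data; the point is only the qualitative pattern that rewarded pressing strengthens and unrewarded pressing fades.

```python
import random

random.seed(42)

def simulate_rat(trials, reinforce, p_press=0.1, step=0.05):
    """Simulate a rat's bar-pressing probability across trials.

    reinforce: function trial -> bool, whether a press is rewarded.
    A rewarded press raises the press probability (conditioning);
    an unrewarded press lowers it (extinction).
    """
    presses = 0
    for t in range(trials):
        if random.random() < p_press:          # the rat presses the bar
            presses += 1
            if reinforce(t):                   # food pellet delivered
                p_press = min(1.0, p_press + step)
            else:                              # reward withheld
                p_press = max(0.01, p_press - step)
    return presses, p_press

# Conditioning: every press is rewarded (continuous reinforcement).
_, p_after_reward = simulate_rat(500, lambda t: True)

# Extinction: start from the conditioned rat and withhold all food.
_, p_after_extinction = simulate_rat(500, lambda t: False,
                                     p_press=p_after_reward)

print(p_after_reward, p_after_extinction)
```

Running this shows the press probability climbing toward its ceiling under continuous reinforcement and collapsing back toward its floor once the food is withheld, mirroring the conditioning and extinction described above.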

Operant Conditioning and the Skinner Box

Skinner believed that we can learn most human and animal behavior through operant conditioning. Consider how babies learn. An infant initially displays random, spontaneous behaviors, only some of which are reinforced (rewarded with food or hugs or toys, for example) by parents, siblings, or caregivers. As the infant grows, the positively reinforced behaviors, those of which the parents approve, will persist, whereas those of which the parents disapprove will extinguish or be discontinued.

The concept is the same with the rat in the Skinner box. Behaviors that work (pressing the bar to obtain food) are displayed frequently, and behaviors that do not work are not repeated. Thus, the organism’s behavior operates on the environment. And in turn, the environment, in the form of reinforcement, operates on the organism’s behavior.

Reinforcement can be powerful in determining and controlling behavior. In Skinner’s words, operant conditioning “shapes behavior as a sculptor shapes a lump of clay.” If that lump of clay, that organism, needs the reinforcer badly enough, there is virtually no limit to how its behavior can be shaped: by an experimenter with a food pellet, a puppy owner with a dog biscuit, a mother with a smile, a boss with a pat on the back, or a government with a promise.

From infancy on, we display many behaviors, and those that are reinforced will strengthen and form patterns. This is how Skinner conceived of personality, as a pattern or collection of operant behaviors.

Skinner began by demonstrating how behavior could be modified by continuous reinforcement, that is, by presenting a reinforcer after every response. Later, he decided to consider how behavior would change if he varied the rate at which the behavior was reinforced.

Schedules of Reinforcement

Skinner pointed out that in everyday life outside the psychology laboratory, our behavior is rarely reinforced every time it occurs. We do not pick up and cuddle a baby every time he or she cries. Baseball superstars do not hit a home run every time at bat. The bagger in the supermarket does not receive a tip for each bag packed. And your favorite singing group doesn’t win a Grammy for every new song.

Skinner investigated different reinforcement schedules (patterns or rates of providing or withholding reinforcers).

Among the reinforcement schedules he tested are the following:

  1. Fixed interval 
  2. Fixed ratio 
  3. Variable interval
  4. Variable ratio

Fixed interval

A fixed-interval schedule of reinforcement means that the reinforcer is presented following the first response that occurs after a fixed time interval has elapsed. That interval might be 1 minute, 3 minutes, or any other fixed period of time. Skinner’s research showed that the shorter the interval between presentations of the reinforcer, the greater the frequency of response; conversely, the response rate declined as the interval between reinforcements lengthened.

Fixed ratio

In the fixed-ratio schedule of reinforcement, reinforcers are given only after the organism has made a specified number of responses. For example, the experimenter could reinforce after every 10th or 20th response. In this schedule, unlike the fixed-interval schedule, the presentation of reinforcers depends on how often the subject responds. For example, in a job in which your pay is determined on a piece-rate basis, how much you earn depends on how much you produce. The more items you produce, the higher your pay. Your reward is based directly on your response rate. This reinforcement schedule brings about a faster rate of responding than does the fixed-interval schedule.

Variable interval

In the variable-interval schedule of reinforcement, the reinforcer is presented after varying periods of time: it might appear after 2 hours in the first instance, after 1 hour and 30 minutes the next time, and after 2 hours and 15 minutes the third time. A person who spends the day fishing, for example, is rewarded, if at all, on a variable-interval basis; the schedule of reinforcement is determined by the random appearance of fish nibbling at the bait.

Variable ratio

A variable-ratio schedule of reinforcement is based on an average number of responses between reinforcers, but there is great variability around that average. Skinner found that the variable-ratio schedule is effective in bringing about high and stable response rates, as the people who operate gambling casinos can happily attest. Slot machines, roulette wheels, horse races, and state lottery games pay off on a variable-ratio reinforcement schedule, an extremely effective means of controlling behavior.

Variable reinforcement schedules result in enduring response behaviors that tend to resist extinction. Most everyday learning occurs as a result of variable-interval or variable-ratio reinforcement schedules.
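The schedules described above differ only in the rule that decides when a response earns a reinforcer. The contrast can be sketched in a small simulation; the schedule parameters (ratio of 10, interval of 10 time units) and the assumption that the subject responds once per time step are illustrative choices, not values from Skinner’s experiments.

```python
import random

random.seed(0)

def fixed_ratio(n):
    """Reinforce every n-th response, regardless of elapsed time."""
    count = 0
    def schedule(response, time):
        nonlocal count
        if response:
            count += 1
            if count == n:
                count = 0
                return True
        return False
    return schedule

def fixed_interval(interval):
    """Reinforce the first response after `interval` time units elapse."""
    last = 0
    def schedule(response, time):
        nonlocal last
        if response and time - last >= interval:
            last = time
            return True
        return False
    return schedule

def variable_ratio(mean_n):
    """Reinforce after a random number of responses averaging mean_n."""
    count, target = 0, random.randint(1, 2 * mean_n - 1)
    def schedule(response, time):
        nonlocal count, target
        if response:
            count += 1
            if count >= target:
                count, target = 0, random.randint(1, 2 * mean_n - 1)
                return True
        return False
    return schedule

def variable_interval(mean_t):
    """Reinforce the first response after a random interval averaging mean_t."""
    last, wait = 0, random.randint(1, 2 * mean_t - 1)
    def schedule(response, time):
        nonlocal last, wait
        if response and time - last >= wait:
            last, wait = time, random.randint(1, 2 * mean_t - 1)
            return True
        return False
    return schedule

# One response per time step for 1000 steps: count the reinforcers
# each schedule delivers for the same stream of responses.
results = {}
for name, sched in [("FR-10", fixed_ratio(10)),
                    ("FI-10", fixed_interval(10)),
                    ("VR-10", variable_ratio(10)),
                    ("VI-10", variable_interval(10))]:
    results[name] = sum(sched(True, t) for t in range(1000))
print(results)
```

With a steady responder, all four schedules pay off at roughly the same overall rate; what differs, as the passage notes, is predictability. The ratio schedules tie reward directly to how often the subject responds, while the interval schedules cap the payoff rate no matter how fast the subject works, and the variable versions make the next reward unpredictable, which is why they produce the persistent, extinction-resistant responding described above.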


