Reinforcement: Positive, negative, social, and punishment


Historical background

Classical conditioning: Pavlov claimed that learning occurs when an event (stimulus) evokes a response. In his famous experiment with dog food, a bell, and a slobbering dog, he showed how a stimulus that did not originally cause a response (bell ringing) can be paired with a stimulus that does (dog food) to create a new association in the subject's (dog's) brain. The new stimulus (bell ringing) then causes the same response (dog slobbering) as the original stimulus (dog food).

John B. Watson (1920s) demonstrated how a similar procedure could cause learning in people. His famous experiment took an 11-month-old boy, Little Albert, and paired the sight of a rat with a loud, startling noise to teach Albert to fear rats.

E. L. Thorndike (1913) described the law of effect, which claims that behavior is influenced by the effects that follow it.

Skinner (1940s) later refined these ideas and described learning as operant conditioning. A person operates in the environment with a certain behavior, which results in a consequence: either reinforcement, which will increase the behavior, or punishment, which will decrease it.

Premack Principle (1959): Using a preferred activity as a reward for a less preferred activity (Grandma’s rule). For example: eat your beans before you can have dessert.

Current reinforcement related ideas

We think of the events that precede a behavior as antecedents, and the results that follow the behavior as consequences (ABC: antecedent, behavior, consequence). Consequences can be thought of as reinforcers, which maintain or increase behavior, or punishers, which decrease or eliminate behavior. They occur after a behavior and can either add something new to the environment (party, recess, free time, ...) or remove something present from the environment (alarm squeal, person's presence, swarming bees, ...).

A positive reinforcer is usually perceived as something pleasant (attention, privileges, honors, social approval, free-time, freedom, grades, praise, tokens, trophy, food, candy, sticker, smile, star, getting to be with friends, being in class, socializing, learning, understanding ... ).

A negative reinforcer is usually perceived as something unpleasant or something desired to be removed or taken away (pain from a thorn, seat belt alarm, fire alarm, failing, fear, unhappiness, being under pressure, stress, nagging parent or teacher, attention, teasing, being bullied, ...).

However, by definition, for something to be a reinforcer it needs to increase the frequency of a behavior. A positive reinforcer increases behavior when it is present. Likewise, by definition something is a negative reinforcer only if it increases the frequency of a behavior when it is removed.

A child may misbehave to get increased attention - positive reinforcement, or may misbehave to get his parents to stop arguing - negative reinforcement.

A reward is a gift, recognition, or something given for service, effort, or achievement. Rewards are often confused with positive reinforcers, which must increase behavior to actually be reinforcers. A reward may cause students to participate or make an effort to achieve something, but it does not require that any behavior be reinforced positively. In fact, many rewards actually decrease the likelihood of a behavior being repeated. For example, the offer of a reward can increase effort to achieve the reward, but after the reward is attained, there is less desire to repeat the behavior that won the reward. It is also possible that a person is already reinforced by the personal pursuit of a goal for which a reward is being offered, and competes to win the reward. In that case, it can't be known whether it was the medal or the personal pursuit of achievement that was the reinforcer.

Punishment is a consequence of a behavior that weakens or decreases the behavior when present (actual or imagined). It is often retribution for a behavior.

Because reinforcement always increases behavior, negative reinforcement is not the same as punishment. For example, a parent who spanks a child to make him stop misbehaving, and actually decreases the child's misbehavior, is using punishment, while a parent who stops nagging once a child begins studying, and the child actually studies more, is using negative reinforcement.

Shaping is the gradual application of operant conditioning, reinforcing closer and closer approximations of a behavior. For example, an infant who learns that smiling elicits positive parental attention will smile at its parents more. Babies generally respond well to operant conditioning.

Behavioral systems are physiological processes in the brain that trigger behavior. Release of dopamine, adrenaline, ... will increase activity in the behavioral activation system (BAS) and the behavioral inhibition system (BIS). Dopamine causes feelings of joy, hope, and interest, and an optimistic urge to approach an object or event for personal gain.

Verbal prompts with reinforcement: When reinforcing an event, be careful about giving verbal or nonverbal prompts. The prompt should be removed as soon as possible so the person will associate something other than the prompt with the behavior. For example, a child enters a room and does not hang up her coat. If you prompt her, then the prompt may become associated with hanging up the coat, rather than coming into the room or taking the coat off. Use a firm and direct prompt and do NOT add OK, because that implies a choice when none is intended. If there is resistance, you might repeat the directive once and, if compliance is not achieved, apply a brief time out and return to the practice.

Tangible Rewards - Recess, money, coupons to exchange for gifts, free time, video, stickers, trophy, certificates, medals, token economy...

The first public school in New York City used a token economy of coupons for toys. They abandoned it because the school’s trustees felt it fostered a mercenary spirit.

Alfie Kohn (1993) in his book Punished by Rewards cites thirty years of research showing the more people are rewarded for completing a task, the more they lose interest in that task in the long run.

See also the passage in Stargirl where one character asks the other why one would do a random act of kindness without ever being acknowledged.



          Behavior increases - usually positive     Behavior decreases - usually negative

Add       Positive reinforcer                       Positive punisher
          (+ reinforcer + behavior)                 (+ punishment - behavior)

Remove    Negative reinforcer                       Negative punisher
          (e.g., teacher stares till work)          (- punishment - behavior)
          (- reinforcer + behavior)
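
The four combinations above can be sketched as a small lookup. This is only an illustration, not a measurement procedure from these notes: the function name, the Operation labels, and the rate_before and rate_after numbers are hypothetical, and treating an unchanged rate as "no observed effect" is a simplifying assumption (a reinforcer can also maintain a behavior).

```python
from enum import Enum


class Operation(Enum):
    ADD = "add"        # something is added to the environment
    REMOVE = "remove"  # something is removed from the environment


def classify_consequence(operation: Operation, rate_before: float, rate_after: float) -> str:
    """Label a consequence by its observed effect on behavior.

    rate_before and rate_after are hypothetical measurements of how often
    the behavior occurred (for example, times per day) before and after
    the consequence was introduced.
    """
    if rate_after > rate_before:
        kind = "reinforcer"   # behavior increased
    elif rate_after < rate_before:
        kind = "punisher"     # behavior decreased
    else:
        # Assumption for this sketch: an unchanged rate is left unlabeled.
        return "no observed effect"
    sign = "positive" if operation is Operation.ADD else "negative"
    return f"{sign} {kind}"


print(classify_consequence(Operation.ADD, 2, 5))     # positive reinforcer
print(classify_consequence(Operation.REMOVE, 2, 5))  # negative reinforcer
print(classify_consequence(Operation.ADD, 5, 2))     # positive punisher
print(classify_consequence(Operation.REMOVE, 5, 2))  # negative punisher
```

Note that the label depends only on what happened to the behavior, not on whether the consequence seemed pleasant or unpleasant.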

When two behaviors oppose each other, a change in one direction for one produces a change in the opposite direction for the other. Distracting behaviors decrease or stop as on-task behaviors increase.

Reinforcement schedules

Continuous - all responses deemed satisfactory are reinforced.

Intermittent - some responses deemed satisfactory are reinforced.
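
A minimal sketch of the two schedules, for illustration only. The notes say only that "some" satisfactory responses are reinforced under an intermittent schedule; reinforcing every third one (the every_n parameter) is an assumption made here, and the function name is hypothetical.

```python
def should_reinforce(satisfactory_count: int, schedule: str, every_n: int = 3) -> bool:
    """Decide whether a satisfactory response is reinforced.

    satisfactory_count is the running count of satisfactory responses,
    starting at 1. every_n is a hypothetical parameter: under the
    intermittent schedule, every every_n-th satisfactory response is
    reinforced.
    """
    if schedule == "continuous":
        return True  # all satisfactory responses are reinforced
    if schedule == "intermittent":
        return satisfactory_count % every_n == 0  # only some are reinforced
    raise ValueError(f"unknown schedule: {schedule}")


responses = range(1, 7)
print([should_reinforce(i, "continuous") for i in responses])
# [True, True, True, True, True, True]
print([should_reinforce(i, "intermittent") for i in responses])
# [False, False, True, False, False, True]
```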

Social reinforcement

  1. Ignore inappropriate behavior. (extinction or time out)
  2. Reinforcement should immediately follow the appropriate behavior. (reward)
  3. Reinforcement must be contingent on the specific behavior. (reward)
  4. Reinforcement should be individual. (A reinforcer for one may not be for another)
  5. Reinforce continually at first. (to associate behavior with consequences)
  6. Reinforce approximations of behavior. (shaping)
  7. Reinforce intermittently after behavior is established. (move from extrinsic to intrinsic)


When people use the phrase:
I’ve told you a hundred times.
They need to realize - it is not the child who is dense.

Misbehavior is generally discouraged with punishment:

The behaviors in the chart are punishable misbehaviors identified in United States schools by Hyman. (Hyman, pp. 13-14)

Excessive talking in the classroom, hallway, lunchroom...    Indecent language or gestures
Insolence toward school staff                                Stealing
Smoking                                                      Drug use
Fighting or attacking school personnel                       Defacing and vandalizing school property
Gambling                                                     Throwing objects in class or around school grounds
Loitering in unauthorized places                             Dishonesty
Petting                                                      Tardiness
Rudeness                                                     Not bringing required instructional materials to class
Absenteeism from class or school                             Leaving class or school without permission
Disobeying requests of school staff                          Not completing assignments
Inattention to classroom activities                          Possession of weapons
Habitually breaking the dress code                           Body odor
Cheating                                                     Extortion of other students
Organized protests


Type 1 punishment is application of an aversive event after a behavior.

Type 2 punishment is removal of a positive event after a behavior.

Technically, punishment is identified by a decrease in the rate of a behavior.

For example:

In the classroom, if a child completes an assignment, the teacher says "very good," and the frequency of completion decreases because of the teacher’s praise, then the student has been punished.

Again, punishment is technically defined by its effect on behavior.

Punishment can include sounds, smells, tastes, visual images, or physical sensation.

Research findings are mixed: both types of punishment have been shown to work in some cases and not to work in others.

Research also shows that punishment decreased the misbehavior of people who were not punished themselves, but who observed or heard about the punishment of others (Foxx, 1982; Axelrod, 1983; Van Houten, 1983). By definition it is punishment, since it reduces the future probability of behavior.

Baer (1971) argues that punishment is legitimate, commendable, and justifiable when it relieves persons of the even greater punishments that could result.

Ethical Considerations

  1. Identify the rationale for the treatment.
  2. Identify techniques to use.
  3. Use the doctrine of the least restrictive alternative. This means that other, less intrusive procedures must be considered and/or tried before punishment is presented. It is based on the premise that the individual has the right to basic human freedoms. The intervention should not have pain, tissue damage, humiliation, discomfort, or stigma as expected side effects of the behavior change. Carr and Lovaas (1983) state that punishment by contingent presentation of a stimulus should not be the method of first choice, even when trying to reduce self-injurious behavior. First try 1) differential reinforcement of other behavior (DRO), 2) DRO with extinction, 3) time out from positive reinforcement, so that all environmental reinforcement is reduced, and 4) DRO combined with positive practice overcorrection, in which the individual practices appropriate, alternative responses. There may be times when none of these is appropriate, but you should have considered them and be able to say why they are not appropriate.
  4. Know whether the issue involves cruel and unusual punishment. According to Longo (1981), cruel and unusual punishment serves no more effective purpose than a lesser punishment and is inflicted arbitrarily.
    • The 8th Amendment provides protection from this and the 14th protects individuals from harm.
    • This protection has been upheld by the courts in several cases: Wheeler v. Glass, 1973; NY Association for Retarded Children v. Carey, 1975.
    • However, in Ingraham v. Wright (1977) the Supreme Court held that paddling (swatting a student on the buttocks) in the presence of witnesses does not violate constitutional protection against cruel and unusual punishment. To lessen the risk, three controls should be set up: 1) a review mechanism should be followed before, during, and after the punishment is administered; 2) staff should be properly trained and supervised; and 3) informed consent should be obtained from parents or legal guardians. Informed consent should include a review of the materials used to deliver the stimulus and a discussion of the nature and side effects of the program; all involved should experience the aversive stimulus themselves, and the public should be made aware of the proposed treatment. The person asked for consent should be capable of understanding the program (appropriate language, mental competence, no jargon).
  5. Have consent prior to implementation of a punishment procedure.

Decision Model before Initiation of Punishment

If you decide punishment is the right procedure, then decide which procedure will be most effective.

Examine medical records and previous intervention records for the types of interventions already tried and their success.

If a least restrictive model is used, then the following should be considered, in this order: response cost, time out from positive reinforcement, and overcorrection.

The final decision should be based on:

  1. the individual characteristics of the child and the behavior problem,
  2. the likelihood of the program being implemented and carried out in a consistent manner,
  3. the probability of successfully eliminating the behavior, and
  4. the ethical and legal legitimacy of using the procedure.

Then implement and evaluate.

Other Considerations

Negative Consequences from Punishment

Source: Cooper, J. O., Heron, T. E., & Heward, W. L. (1987). Applied behavior analysis. Columbus, OH: Merrill Publishing Co.


Dr. Robert Sweetland's notes