Explain the difference between reinforcement and punishment
Distinguish between reinforcement schedules
The previous section of this chapter focused on the type of associative
learning known as classical conditioning. Remember that in classical
conditioning, something in the environment triggers a reflex
automatically, and researchers train the organism to react to a
different stimulus. Now we turn to the second type of associative
learning, operant conditioning. In operant
conditioning, organisms learn to associate a behaviour and its
consequence ([link]). A pleasant consequence makes
that behaviour more likely to be repeated in the future. For example,
Spirit, a dolphin at the National Aquarium in Baltimore, does a flip in
the air when her trainer blows a whistle. The consequence is that she
gets a fish.
The target behaviour is followed by reinforcement or punishment to either
strengthen or weaken it, so that the learner is more likely to exhibit
the desired behaviour in the future. The behaviour is operant in this case, because
it operates on the environment.
Clinical Application
Before planning behavioural modification, it is important to delineate whether the
problem behaviour is respondent or operant. This can be done with the ABC approach,
in which a diary of the antecedent, the behaviour, and the consequence is kept. Respondent
behaviour is treated by eliminating or replacing the antecedent stimulus, while
operant behaviour is treated by addressing the consequences.
Table 2 Classical and Operant Conditioning Compared
Conditioning approach
Classical conditioning: An unconditioned stimulus (such as food) is paired with a neutral stimulus (such as a bell). The neutral stimulus eventually becomes the conditioned stimulus, which brings about the conditioned response (salivation).
Operant conditioning: The target behaviour is followed by reinforcement or punishment to either strengthen or weaken it, so that the learner is more likely to exhibit the desired behaviour in the future.
Stimulus timing
Classical conditioning: The stimulus occurs immediately before the response.
Operant conditioning: The stimulus (either reinforcement or punishment) occurs soon after the response.
Psychologist B. F. Skinner saw
that classical conditioning is limited to existing behaviours that are
reflexively elicited, and it doesn’t account for new behaviours such as
riding a bike. He proposed a theory about how such behaviours come about.
Skinner believed that behaviour is motivated by the consequences we
receive for the behaviour: the reinforcements and punishments. His idea
that learning is the result of consequences is based on the law of
effect, which was first proposed by psychologist Edward
Thorndike.
According to the law of effect, behaviours that are followed by
consequences that are satisfying to the person are more likely to be
repeated, and behaviours that are followed by unpleasant consequences are
less likely to be repeated (Thorndike, 1911). Essentially, if an
individual does something that brings about a desired result, the person
is more likely to do it again. If someone does something that does
not bring about a desired result, the person is less likely to do it
again. An example of the law of effect is in employment. One of the
reasons (and often the main reason) we show up for work is because we
get paid to do so. If we stop getting paid, we will likely stop showing
up—even if we love our job.
Law of Effect
Any response followed by a “satisfying state of affairs” (possibly through drive reduction) is likely to be repeated.
Behaviours resulting in an annoying situation are less likely to occur.
Working with Thorndike’s law of effect as his foundation, Skinner began
conducting scientific experiments on animals (mainly rats and pigeons)
to determine how organisms learn through operant conditioning (Skinner,
1938). He placed these animals inside an operant conditioning chamber,
which has come to be known as a Skinner box
([Skinner-Box]). A Skinner box contains
a lever
(for rats) or disk (for pigeons) that the animal can press or peck for a
food reward via the dispenser. Speakers and lights can be associated
with certain behaviours. A recorder counts the number of responses made
by the animal.
Fig. 17 An illustration shows a rat in a Skinner box:
a chamber with a speaker, lights, a lever, and a food dispenser.
See also
Watch this brief video clip to
learn more about operant conditioning: Skinner is interviewed, and
operant conditioning of pigeons is demonstrated.
In discussing operant conditioning, we use several everyday
words—positive, negative, reinforcement, and punishment—in a specialized
manner. In operant conditioning, positive and negative do not mean good
and bad. Instead, positive means adding something, and
negative means taking something away. Reinforcement means
increasing a behaviour, and punishment means decreasing
a behaviour. Reinforcement can be positive or negative, and punishment
can also be positive or negative. All reinforcers (positive or negative)
increase the likelihood of a behavioural response. All punishers
(positive or negative) decrease the likelihood of a behavioural
response. Now let us combine these four terms: positive reinforcement,
negative reinforcement, positive punishment, and negative punishment
([link]).
Table 3 Positive and Negative Reinforcement and Punishment
Positive reinforcement: Something is added to increase the likelihood of a behaviour.
Positive punishment: Something is added to decrease the likelihood of a behaviour.
Negative reinforcement: Something is removed to increase the likelihood of a behaviour.
Negative punishment: Something is removed to decrease the likelihood of a behaviour.
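The four combinations in Table 3 can be expressed as a small lookup. The Python sketch below is only illustrative: the function name `classify_consequence` and its string arguments are assumptions made for this example, not standard terminology.

```python
def classify_consequence(stimulus_change: str, behaviour_change: str) -> str:
    """Name the operant procedure for one cell of Table 3.

    stimulus_change: "added" or "removed" (what happens after the behaviour)
    behaviour_change: "increases" or "decreases" (the effect on the behaviour)
    """
    # Positive/negative describes the stimulus, not whether the outcome is good.
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    # Reinforcement/punishment describes the effect on the behaviour.
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behaviour_change]
    return f"{sign} {kind}"


print(classify_consequence("added", "increases"))    # a fish after a flip
print(classify_consequence("removed", "increases"))  # the seatbelt beep stops
print(classify_consequence("added", "decreases"))    # a reprimand for texting
print(classify_consequence("removed", "decreases"))  # a favourite toy is taken away
```

The two dimensions are independent, which is why a "negative" procedure can still be a form of reinforcement.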
An event that follows a response and increases the strength of the response, the likelihood that it will be repeated, or both is known as a reinforcer.
A reinforcer always increases the probability or intensity of a response.
Reinforcement is the process by which a consequence (a stimulus or an event that follows a behaviour) increases the likelihood
that the response will occur again. Reinforcement may be positive or negative, depending on whether the outcome of the behaviour is a true reward (a positive stimulus
or reinforcer, e.g. praise) or the removal of an aversive stimulus.
Positive reinforcement is the most effective way to teach a person or animal a new behaviour.
In positive reinforcement, a desirable stimulus is added to increase a behaviour.
For example, a parent tells his five-year-old son, Saad, that if he cleans
his room, he will get a toy. Saad quickly cleans his room because he
wants a new art set. Let us pause for a moment. Some people might say,
“Why should I reward my child for doing what is expected?” But in fact,
we are constantly and consistently rewarded in our lives. Our paychecks
are rewards, as are high grades and acceptance into our preferred
school. Being praised for doing a good job and passing a driver’s
test are also rewards. Positive reinforcement as a learning tool is
highly effective. It has been found that one of the most effective
ways to increase achievement in school districts with below-average
reading scores was to pay the children to read. Specifically,
second-grade students in Dallas were paid $2 each time they read a book
and passed a short quiz about it. The result was a significant
increase in reading comprehension (Fryer, 2010). What do you think about
this program? If Skinner were alive today, he would probably think this
was a great idea. He was a strong proponent of using operant
conditioning principles to influence students’ behaviour at school. In
fact, in addition to the Skinner box, he also invented what he called
the teaching machine that was designed to reward small steps in learning
(Skinner, 1961)—an early forerunner of computer-assisted learning. His
teaching machine tested students’ knowledge as they worked through
various school subjects. If students answered questions correctly, they
received immediate positive reinforcement and could continue; if they
answered incorrectly, they did not receive any reinforcement. The idea
was that students would spend additional time studying the material to
increase their chance of being reinforced the next time (Skinner, 1961).
In negative reinforcement, an undesirable
stimulus is removed to increase a behaviour. For example, car
manufacturers use the principles of negative reinforcement in their
seatbelt systems, which go “beep, beep, beep” until you fasten your
seatbelt. The annoying sound stops when you exhibit the desired
behaviour, increasing the likelihood that you will buckle up in the
future. Negative reinforcement is also used frequently in horse
training. Riders apply pressure—by pulling the reins or squeezing their
legs—and then remove the pressure when the horse performs the desired
behaviour, such as turning or speeding up. The pressure is the negative
stimulus that the horse wants to remove.
Among people who use substances, taking a drug to relieve withdrawal symptoms is also an example
of negative reinforcement.
When two different behaviours are reinforced and then the reinforcement of
one behaviour is withdrawn to extinguish it, the other behaviour is likely to increase.
Differential reinforcement is defined as reinforcing a specific type of behaviour
while withholding reinforcement for another behaviour. It may be useful for a child
who exhibits an unwanted behaviour: attention to the unwanted behaviour is reduced,
while the child is reinforced for an alternative, though not necessarily
incompatible, behaviour.
Clinical Correlate
For instance, Amina pulls hair out of her head while finishing her personal tasks.
Her therapist chooses differential reinforcement to reinforce the absence of hair pulling.
The therapist places a three-minute timer on Amina’s desk.
Amina is reinforced if she refrains from pulling her hair for the full three minutes.
If Amina does pull her hair, the timer is restarted and she is not reinforced.
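The timer procedure in this case can be sketched in Python. This is a minimal illustration, assuming that each instance of the behaviour is logged as a time in seconds from the start of the session; the function name `reinforcement_times`, the session length, and the sample data are hypothetical.

```python
def reinforcement_times(pull_times, session_length, interval=180):
    """Return the times (in seconds) at which reinforcement is delivered.

    pull_times: seconds from session start at which hair pulling occurred
    session_length: total session duration in seconds
    interval: behaviour-free time required for reinforcement (three minutes)
    """
    pulls = sorted(pull_times)
    reinforced_at = []
    timer_start = 0
    next_pull = 0  # index of the next hair-pulling event not yet handled
    while timer_start + interval <= session_length:
        deadline = timer_start + interval
        if next_pull < len(pulls) and pulls[next_pull] < deadline:
            # The behaviour occurred before the timer ran out:
            # restart the timer and give no reinforcement.
            timer_start = pulls[next_pull]
            next_pull += 1
        else:
            # A full interval passed without the behaviour: reinforce.
            reinforced_at.append(deadline)
            timer_start = deadline
    return reinforced_at


# Hypothetical 15-minute session with hair pulling at 100 s and 400 s.
print(reinforcement_times([100, 400], session_length=900))
```

Each hair-pulling event delays the next reinforcement by restarting the three-minute count, so reinforcement depends only on the absence of the behaviour.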
A high-probability behaviour can be used to reinforce a low-probability behaviour.
Examples include attending online classes on a favourite device
or listening to music while doing a heavy workout.
Many people confuse negative reinforcement with punishment in operant
conditioning, but they are two very different mechanisms. Remember that
reinforcement, even when it is negative, always increases a behaviour. In
contrast, punishment always decreases a
behaviour. In positive punishment, you add an
undesirable stimulus to decrease a behaviour.
An example of positive
punishment is scolding a student to get the student to stop texting in
class. In this case, a stimulus (the reprimand) is added in order to
decrease the behaviour (texting in class).
In negative punishment, you remove a pleasant stimulus to decrease behaviour.
For example, when a child misbehaves, a parent can take away a favourite
toy. In this case, a stimulus (the toy) is removed in order to decrease
the behaviour.
Attention
Negative reinforcement and punishment are not the same. See text for more details.
Punishment, especially when it is immediate, is one way to decrease
undesirable behaviour. For example, imagine your four-year-old son,
Brandon, hit his younger brother. You have Brandon write 100 times “I
will not hit my brother” (positive punishment). Chances are he won’t
repeat this behaviour.
While strategies like this are common today, in
the past, children were often subject to physical punishment, such as
spanking. It’s important to be aware of some of the drawbacks in using
physical punishment on children.
First, punishment may teach fear.
Brandon may learn not to hit his brother, but he also may become fearful
of the person who delivered the punishment: you, his parent. Similarly,
children whom teachers punish may come to fear the teacher and
try to avoid school (Gershoff et al., 2010). Consequently, most schools
in the United States have banned corporal punishment. Second, punishment
may cause children to become more aggressive and prone to antisocial
behaviour and delinquency (Gershoff, 2002). They see their parents resort
to spanking when they become angry and frustrated, so, in turn, they may
act out this same behaviour when they become angry and frustrated. For
example, because you spank Brenda when you are angry with her for her
misbehaviour, she might start hitting her friends when they won’t share
their toys.
While positive punishment can be effective in some cases, Skinner
suggested that the use of punishment should be weighed against the
possible negative effects. Today’s psychologists and parenting experts
favour reinforcement over punishment—they recommend that you catch your
child doing something good and reward her for it.
Studies typically show that children who receive corporal punishment have higher levels of aggression,
delinquency, and behavioural problems (Gershoff, 2002).
Corporal punishment is also linked over time to a variety of mental health disorders,
increased criminal behaviour, and slower cognitive development.
Some have argued that the evidence connecting spanking to harmful outcomes is correlational,
and correlation does not imply causation.
It is possible that spanking makes children more aggressive, but it is also conceivable that aggressive children lead their parents to use physical punishment more frequently.
Nevertheless, the American Psychological Association strongly advises against the use of corporal punishment of children.
Warning
Corporal punishment of children must be avoided. Children exposed to corporal punishment
are more likely to develop aggressive and violent traits as adults,
and are predisposed to a variety of psychiatric disorders.
Delinquency and slower cognitive development are also associated with corporal punishment.
In his operant conditioning experiments, Skinner often used an approach
called shaping. Instead of rewarding only the target behaviour, in
shaping, we reward successive approximations
of a target behaviour.
Remember that the organism must first display the behaviour for reinforcement to work.
Shaping is needed because it is improbable that an organism will
spontaneously display anything but the simplest of behaviours. In
shaping, behaviours are broken down into many small, achievable steps.
Steps of Shaping
In shaping, the desired behaviour is achieved in several
small, achievable steps, much as in systematic desensitization.
Breaking the behaviour down in this way is key to the success of behaviour modification with shaping.
The specific steps used in the process are the following:
Reinforce any response that resembles the desired behaviour.
Then reinforce the response that more closely resembles the desired behaviour. You will no longer reinforce the previously reinforced
response.
Next, begin to reinforce the response that even more closely resembles the desired behaviour.
Successive approximations: Continue to reinforce closer and closer approximations of the desired
behaviour.
Finally, only reinforce the desired behaviour.
Shaping thus proceeds through a series of gradual steps, each of which is more like the final desired response.
Behaviours that approximate the target behaviour are rewarded,
so behaviour comes closer and closer to the target.
Each of these simple steps is reinforced
on the way to a desired, more complex behaviour.
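The steps of shaping can be sketched as a rising criterion for reinforcement. The Python example below is a simplified illustration, not a model Skinner proposed: scoring each response from 0.0 to 1.0 for its closeness to the target, the particular thresholds, and the rule of raising the criterion after two reinforced responses are all assumptions made for the example.

```python
def shape(responses, criteria, successes_needed=2):
    """Apply shaping to a sequence of responses.

    responses: closeness of each response to the target, from 0.0 to 1.0
    criteria: increasing thresholds, one per successive approximation;
              the last one stands for the target behaviour itself
    successes_needed: reinforced responses required before the criterion
                      is raised (an assumed value)
    Returns a list of (response, reinforced, criterion_in_force) tuples.
    """
    step = 0
    successes = 0
    log = []
    for response in responses:
        criterion = criteria[step]
        reinforced = response >= criterion
        log.append((response, reinforced, criterion))
        if reinforced:
            successes += 1
            # Move on to a closer approximation; responses that only meet
            # the old criterion are no longer reinforced from here on.
            if successes == successes_needed and step < len(criteria) - 1:
                step += 1
                successes = 0
    return log


for entry in shape([0.2, 0.3, 0.3, 0.5, 0.6, 0.5, 0.9], criteria=[0.2, 0.5, 0.9]):
    print(entry)
```

In the sample run, a response scored 0.3 is reinforced early on but not once the criterion has moved to 0.5, which mirrors the rule that a previously reinforced response is no longer reinforced.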
Shaping is often used in teaching a complex behaviour or chain of
behaviours. Skinner used shaping to teach pigeons not only such
relatively simple behaviours as pecking a disk in a Skinner box, but also
many unusual and entertaining behaviours, such as turning in circles,
walking in figure eights, and even playing ping pong; the technique is
commonly used by animal trainers today. An essential part of shaping is
stimulus discrimination. Recall Pavlov’s dogs—he trained them to respond
to the tone of a bell, and not to similar tones or sounds. This
discrimination is also important in operant conditioning and in shaping
behaviour.
See also
Here is a brief video of
Skinner’s pigeons playing ping pong.
It’s easy to see how shaping is effective in teaching behaviours to
animals, but how does shaping work with humans? Let’s consider parents
whose goal is to have their child learn to clean his room. They use
shaping to help him master steps toward the goal. Instead of performing
the entire task, they set up these steps and reinforce each step. First,
he cleans up one toy. Second, he cleans up five toys. Third, he chooses
whether to pick up ten toys or put his books and clothes away. Fourth,
he cleans up everything except two toys. Finally, he cleans his entire
room.
Rewards such as stickers, praise, money, toys, and more can be used to
reinforce learning. Let’s go back to Skinner’s rats again. How did the
rats learn to press the lever in the Skinner box? They were rewarded
with food each time they pressed the lever. For animals, food would be
an obvious reinforcer.
What would be a good reinforcer for humans? For five-year-old Saad, it
was the promise of a toy if he cleaned his room. How about Joaquin, the
soccer player? If you gave Joaquin a piece of candy every time he made a
goal, you would be using a primary reinforcer.
Primary reinforcers are reinforcers that have innate reinforcing
qualities. These kinds of reinforcers are not learned. Water, food,
sleep, shelter, sex, and touch, among others, are primary reinforcers.
Pleasure is also a primary reinforcer. Organisms do not lose their drive
for these things. For most people, jumping in a cool lake on a very hot
day would be reinforcing and the cool lake would be innately
reinforcing—the water would cool the person off (a physical need), as
well as provide pleasure.
A secondary reinforcer has no inherent value
and only has reinforcing qualities when linked with a primary
reinforcer. Praise, linked to affection, is one example of a secondary
reinforcer, as when you called out “Great shot!” every time Joaquin made
a goal. Another example, money, is only worth something when you can use
it to buy other things—either things that satisfy basic needs (food,
water, shelter—all primary reinforcers) or other secondary reinforcers.
If you were on a remote island in the middle of the Pacific Ocean and
you had stacks of money, the money would not be useful if you could not
spend it. What about the stickers on the behaviour chart? They also are
secondary reinforcers.
Important
Star charts use secondary reinforcers. Tokens used in token economies (see below)
are also secondary reinforcers.
Sometimes, instead of stickers on a sticker chart, a token is used.
Tokens, which are also secondary reinforcers, can then be traded in for
rewards and prizes. Entire behaviour management systems, known as token
economies, are built around the use of these kinds of token reinforcers.
Token economies have been found to be very effective at modifying
behaviour in a variety of settings such as schools, prisons, and mental
hospitals.
For example, a study by Cangi and Daly (2013) found that use
of a token economy increased appropriate social behaviours and reduced
inappropriate behaviours in a group of autistic school children. Autistic
children tend to exhibit disruptive behaviours such as pinching and
hitting. When the children in the study exhibited appropriate behaviour
(not hitting or pinching), they received a “quiet hands” token. When
they hit or pinched, they lost a token. The children could then exchange
specified amounts of tokens for minutes of playtime.
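The earn, lose, and exchange rules of such a token economy can be sketched as a small ledger. This Python example is only an illustration: the class name `TokenEconomy`, the exchange rate of two tokens per minute of playtime, and the rule that the balance never drops below zero are assumptions, not details reported from the study.

```python
class TokenEconomy:
    """A minimal token ledger for one child."""

    def __init__(self, tokens_per_minute=2):
        self.tokens = 0
        # Assumed exchange rate: tokens needed for one minute of playtime.
        self.tokens_per_minute = tokens_per_minute

    def record(self, appropriate: bool) -> None:
        """Record one observed behaviour."""
        if appropriate:
            self.tokens += 1   # a token is earned (secondary reinforcer)
        elif self.tokens > 0:
            self.tokens -= 1   # a token is lost; balance stays at or above zero

    def exchange_for_playtime(self) -> int:
        """Trade as many tokens as possible for whole minutes of playtime."""
        minutes = self.tokens // self.tokens_per_minute
        self.tokens -= minutes * self.tokens_per_minute
        return minutes


ledger = TokenEconomy()
for appropriate in [True, True, True, False, True, True]:
    ledger.record(appropriate)
print(ledger.tokens)
print(ledger.exchange_for_playtime())
```

The tokens have no value in themselves; they work as reinforcers only because they can be exchanged for playtime.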
Clinical Correlate
Behaviour Modification in Children
Parents and teachers often use behaviour modification to change a
child’s behaviour. Behaviour modification uses the principles of
operant conditioning to accomplish behaviour change so that
undesirable behaviours are switched for more socially acceptable ones.
Fig. 18 A photograph shows a child placing stickers on a chart hanging on the wall.
Some teachers and parents create a sticker chart, in which several
behaviours are listed ([Stickers]). Sticker
charts are a form of token economy, as described in the text. Each
time children perform the behaviour, they get a sticker, and after a
certain number of stickers, they get a prize, or reinforcer. The goal
is to increase acceptable behaviours and decrease misbehaviour.
Remember, it is best to reinforce desired behaviours, rather than to
use punishment.
In the classroom, the teacher can reinforce a wide range of behaviours, from students raising their hands, to walking quietly in the hall, to turning in their homework.
At home, parents might create a behaviour chart that rewards children for things such as putting away toys, brushing their teeth, and helping with dinner.
In order for behaviour modification to be effective, the reinforcement
needs to be connected with the behaviour; the reinforcement must
matter to the child and be done consistently.
Time-out is another popular technique used in behaviour modification
with children, especially those with intellectual disability.
It works according to the principle of negative punishment.
When a child demonstrates an undesirable behaviour, she is removed
from the desirable activity at hand ([link]). For example, say that Sadia and
her brother Salman are playing with building blocks. Sadia throws
some blocks at her brother, so her parent gives her a warning that she will
go to time-out if she does it again. A few minutes later, she throws
more blocks at Salman. Her caregiver removes Sadia from the room for a few
minutes. When she comes back, she doesn’t throw blocks.
This technique is especially useful for managing aggressive behaviour.
There are several important points that a parent should know before
implementing time-out as a behaviour modification technique.
First, make sure the child is being removed from a desirable activity and placed in a less desirable location. If the activity is something undesirable for the child, this technique will backfire because it is more enjoyable for the child to be removed from the activity.
Second, the length of the time-out is important. The general rule of thumb is one minute for each year of the child’s age. Sadia is five; therefore, she sits in a time-out for five minutes. Setting a timer helps children know how long they have to sit in time-out.
Finally, as a caregiver, keep several guidelines in mind over the course of a time-out: remain calm when directing your child to time-out; ignore your child during time-out (because caregiver attention may reinforce misbehaviour); and give the child a hug or a kind word when time-out is over.
Fig. 19 Photograph A shows several children climbing on playground
equipment. Photograph B shows a child sitting alone at a table
looking at the playground.
Remember, the best way to teach a person or animal a behaviour is to use
positive reinforcement. For example, Skinner used positive reinforcement
to teach rats to press a lever in a Skinner box. At first, the rat might
randomly hit the lever while exploring the box, and out would come a
pellet of food. After eating the pellet, what do you think the hungry
rat did next? It hit the lever again, and received another pellet of
food. Each time the rat hit the lever, a pellet of food came out. When
an organism receives a reinforcer each time it displays a behaviour, it
is called continuous reinforcement. This
reinforcement schedule is the quickest way to teach someone a behaviour,
and it is especially effective in training a new behaviour. Let’s look
back at the dog that was learning to sit earlier in the chapter. Now,
each time he sits, you give him a treat. Timing is important here: you
will be most successful if you present the reinforcer immediately after
he sits, so that he can make an association between the target behaviour
(sitting) and the consequence (getting a treat).
See also
Watch this video clip, in which veterinarian Dr. Sophia Yin shapes a dog’s behaviour using the
steps outlined above.
Once a behaviour is trained, researchers and trainers often turn to
another type of reinforcement schedule—partial reinforcement. In
partial reinforcement, also referred to as
intermittent reinforcement, the person or animal does not get reinforced
every time they perform the desired behaviour. There are several
different types of partial reinforcement schedules
([link]). These schedules are described as either
fixed or variable, and as either interval or ratio. Fixed refers to
the number of responses between reinforcements, or the amount of time
between reinforcements, which is set and unchanging. Variable refers
to the number of responses or amount of time between reinforcements,
which varies or changes. Interval means the schedule is based on the
time between reinforcements, and ratio means the schedule is based on
the number of responses between reinforcements.
Now let’s combine these four terms. A fixed interval reinforcement schedule
is when behaviour is rewarded after a
set amount of time. For example, June undergoes major surgery in a
hospital. During recovery, she is expected to experience pain and will
require prescription medications for pain relief. June is given an IV
drip with a patient-controlled painkiller. Her doctor sets a limit: one
dose per hour. June pushes a button when pain becomes difficult, and she
receives a dose of medication. Since the reward (pain relief) only
occurs on a fixed interval, there is no point in exhibiting the behaviour
when it will not be rewarded.
With a variable interval reinforcement schedule, the person or animal gets the reinforcement based on
varying amounts of time, which are unpredictable. Say that Manuel is the
manager at a fast-food restaurant. Every once in a while someone from
the quality control division comes to Manuel’s restaurant. If the
restaurant is clean and the service is fast, everyone on that shift
earns a $20 bonus. Manuel never knows when the quality control person
will show up, so he always tries to keep the restaurant clean and
ensures that his employees provide prompt and courteous service. His
productivity in providing prompt service and keeping the restaurant clean is
steady because he wants his crew to earn the bonus.
With a fixed ratio reinforcement schedule,
there are a set number of responses that must occur before the behaviour
is rewarded. Carla sells glasses at an eyeglass store, and she earns a
commission every time she sells a pair of glasses. She always tries to
sell people more pairs of glasses, including prescription sunglasses or
a backup pair, so she can increase her commission. She does not care if
the person really needs the prescription sunglasses; Carla just wants
her bonus. The quality of what Carla sells does not matter because her
commission is not based on quality; it’s only based on the number of
pairs sold. This distinction in the quality of performance can help
determine which reinforcement method is most appropriate for a
particular situation. Fixed ratios are better suited to optimize the
quantity of output, whereas a fixed interval, in which the reward is not
quantity based, can lead to a higher quality of output.
In a variable ratio reinforcement schedule,
the number of responses needed for a reward varies. This is the most
powerful partial reinforcement schedule. An example of the variable
ratio reinforcement schedule is gambling.
Imagine that Sarah—generally an intelligent, thrifty woman—visits Las Vegas for the first time. She is not a
gambler, but out of curiosity, she puts a quarter into the slot machine,
and then another, and another. Nothing happens. Two dollars in quarters
later, her curiosity is fading, and she is just about to quit. But then,
the machine lights up, bells go off, and Sarah gets 50 quarters back.
That is more like it! Sarah gets back to inserting quarters with renewed
interest, and a few minutes later, she has used up all her gains and is
$10 in the hole. Now might be a reasonable time to quit. Nevertheless, she
keeps putting money into the slot machine because she never knows when
the next reinforcement is coming. She keeps thinking that she could win
$50, or $100, or even more in the next quarter. Because the
reinforcement schedule in most types of gambling has a variable ratio
schedule, people keep trying and hoping that the next time they will win
big. This is one of the reasons that gambling is so addictive and so
resistant to extinction.
Clinical Correlate
Gambling employs a variable ratio schedule.
Behaviours reinforced on a variable ratio schedule are the most resistant to extinction.
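The four partial reinforcement schedules described above differ only in what is counted (responses or time) and in whether the requirement is set or varies. The Python sketch below illustrates this under stated assumptions: the function names, the uniform random draws used for the variable schedules, and the sample response times are choices made for this example, not part of Skinner's procedures.

```python
import random


def ratio_schedule(n_responses, mean_ratio, variable=False, rng=None):
    """Return the 1-based indices of the responses that are reinforced."""
    if rng is None:
        rng = random.Random(0)

    def next_requirement():
        if variable:
            # Assumed distribution: any count from 1 to 2*mean - 1,
            # which averages out to mean_ratio.
            return rng.randint(1, 2 * mean_ratio - 1)
        return mean_ratio

    reinforced = []
    required = next_requirement()
    count = 0
    for i in range(1, n_responses + 1):
        count += 1
        if count >= required:
            reinforced.append(i)
            count = 0
            required = next_requirement()
    return reinforced


def interval_schedule(response_times, mean_interval, variable=False, rng=None):
    """Return the response times that are reinforced."""
    if rng is None:
        rng = random.Random(0)

    def next_wait():
        if variable:
            # Assumed distribution: uniform, averaging mean_interval.
            return rng.uniform(0, 2 * mean_interval)
        return mean_interval

    reinforced = []
    available_at = next_wait()
    for t in sorted(response_times):
        # Only the first response after the wait has elapsed is reinforced;
        # earlier responses earn nothing.
        if t >= available_at:
            reinforced.append(t)
            available_at = t + next_wait()
    return reinforced


print(ratio_schedule(10, 5))                                   # fixed ratio
print(ratio_schedule(10, 5, variable=True))                    # variable ratio
print(interval_schedule([1, 5, 10, 11, 19, 21, 35], 10))       # fixed interval
print(interval_schedule([1, 5, 10, 11, 19, 21, 35], 10, True)) # variable interval
```

On the ratio schedules, responding faster brings reinforcement sooner. On the interval schedules, extra responses before the wait has elapsed earn nothing.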
In operant conditioning, the extinction of a reinforced behaviour occurs at
some point after reinforcement stops, and the speed at which this
happens depends on the reinforcement schedule. In a variable ratio
schedule, the point of extinction comes very slowly, as described above.
But in the other reinforcement schedules, extinction may come quickly.
For example, if June presses the button for the pain relief medication
before the allotted time her doctor has approved, no medication is
administered. She is on a fixed interval reinforcement schedule (dosed
hourly), so extinction occurs quickly when reinforcement doesn’t come at
the expected time. Among the reinforcement schedules, variable ratio is
the most productive and the most resistant to extinction. Fixed interval
is the least productive and the easiest to extinguish
([link]).
Fig. 20 A graph has an x-axis labeled “Time” and a y-axis labeled “Cumulative
number of responses.”
Two lines labeled “Variable Ratio” and “Fixed
Ratio” have similar, steep slopes. The variable ratio line remains
straight and is marked in random points where reinforcement occurs. The
fixed ratio line has consistently spaced marks indicating where
reinforcement has occurred, but after each reinforcement, there is a
small drop in the line before it resumes its overall slope. Two lines
labeled “Variable Interval” and “Fixed Interval” have similar slopes at
roughly a 45-degree angle. The variable interval line remains straight
and is marked in random points where reinforcement occurs. The fixed
interval line has consistently spaced marks indicating where
reinforcement has occurred, but after each reinforcement, there is a
drop in the line.
Skinner (1953) stated, “If the gambling establishment cannot persuade
a patron to turn over money with no return, it may achieve the same
effect by returning part of the patron’s money on a variable-ratio
schedule” (p. 397).
Skinner uses gambling as an example of the power and effectiveness of
conditioning behaviour based on a variable ratio reinforcement
schedule. Skinner was so confident in his knowledge of
gambling addiction that he even claimed he could turn a pigeon into a
pathological gambler (“Skinner’s Utopia,” 1971). Beyond the power of
variable ratio reinforcement, gambling seems to work on the brain in
the same way as some substances of abuse. The Illinois Institute for
Addiction Recovery (n.d.) reports evidence suggesting that
pathological gambling is an addiction similar to a chemical addiction
([link]). Specifically, gambling may
activate the brain’s reward centers, much as substances of abuse do.
Research has shown that some pathological gamblers have lower levels
of norepinephrine than do normal gamblers (Roy et al., 1988).
According to a study conducted by Alec Roy and colleagues,
norepinephrine is secreted during
stress, arousal, or thrill; pathological gamblers may use
gambling to increase their levels of this neurotransmitter.
Another researcher, neuroscientist Hans Breiter, has done extensive research
on gambling and its effects on the brain. Breiter (as cited in
Franzen, 2001) reports that “Monetary reward in a gambling-like
experiment produces brain activation very similar to that observed in
a cocaine addict receiving an infusion of cocaine” (para. 1).
Deficiencies in serotonin might also
contribute to compulsive behaviour, including a gambling addiction.
It may be that pathological gamblers’ brains are different from those
of other people, and this difference may somehow have led to
their gambling addiction, as these studies seem to suggest. However,
it is very difficult to ascertain the cause because it is impossible
to conduct a true experiment (it would be unethical to try to turn
randomly assigned participants into problem gamblers). Therefore, it
may be that causation actually moves in the opposite
direction—perhaps the act of gambling somehow changes
neurotransmitter levels in some gamblers’ brains. It also is possible
that some overlooked factor, or confounding variable, played a role
in both the gambling addiction and the differences in brain
chemistry.
Fig. 21 A photograph shows four digital gambling machines.
Gaming disorder is a new diagnostic category in the ICD-11. Game studios intentionally
build various forms of reinforcement into their games. Enough evidence has now emerged to convince the
World Health Organization to officially recognize gaming addiction as a health condition.
Like gambling disorder, it is classified in the ICD-11 under
disorders due to addictive behaviours.
Cognition refers to mental processes such as thinking, knowing, problem-solving, and remembering.
According to cognitive theorists, these processes are critically important to a more complete, more comprehensive view of learning.
Insight learning was proposed by Wolfgang Köhler. Insight is the sudden realization of the
relationship between elements in a problem situation, which makes the solution apparent.
It is a method of learning in which a problem is solved by reasoning, particularly by drawing conclusions,
inferences, or judgments. In contrast to trial-and-error learning, insight learning involves working through
a problem mentally rather than through repeated physical attempts.
It often occurs when someone has been stuck on a problem for a while and suddenly realises how to solve it.
This was seen in Köhler’s experiments with chimpanzees in the 1910s.
Köhler found that chimpanzees could solve problems through insight rather than trial and error.
In one instance, a banana was placed out of the chimpanzees’ reach, but they still managed to get it:
they stacked boxes on top of one another to climb closer, then used sticks to knock the banana down.
Although strict behaviourists such as Skinner and Watson refused to
believe that cognition (such as thoughts and expectations) plays a role
in learning, another behaviourist, Edward C. Tolman, had a different opinion.
Tolman’s experiments with rats demonstrated that organisms can learn even if they
do not receive immediate
reinforcement. [2][3]
This finding was in conflict with the
prevailing idea at the time that reinforcement must be immediate in
order for learning to occur, thus suggesting a cognitive aspect to
learning.
In the experiments, Tolman placed hungry rats in a maze with no reward
for finding their way through it. He also studied a comparison group
that was rewarded with food at the end of the maze. As the unreinforced
rats explored the maze, they developed a cognitive map: a mental picture of the layout of the maze
(Fig. 22). After 10 sessions in the maze
without reinforcement, food was placed in a goal box at the end of the
maze. As soon as the rats became aware of the food, they were able to
find their way through the maze quickly, just as quickly as the
comparison group, which had been rewarded with food all along. This is
known as latent learning: learning that occurs
but is not observable in behaviour until there is a reason to demonstrate
it.
Fig. 22 An illustration shows three rats in a maze, with a starting point and food at the end.
Latent learning also occurs in humans. Children may learn by watching
the actions of their parents but only demonstrate it at a later date,
when the learned material is needed. For example, suppose that Ravi’s
dad drives him to school every day. In this way, Ravi learns the route
from his house to his school, but he’s never driven there himself, so he
has not had a chance to demonstrate that he’s learned the way. One
morning Ravi’s dad has to leave early for a meeting, so he can’t drive
Ravi to school. Instead, Ravi follows the same route on his bike that
his dad would have taken in the car. This demonstrates latent learning.
Ravi had learned the route to school, but had no need to demonstrate
this knowledge earlier.
Tip
This Place Is Like a Maze
Have you ever gotten lost in a building and been unable to find your way
back out? While that can be frustrating, you’re not alone. At one
time or another we’ve all gotten lost in places like a museum,
hospital, or university library. Whenever we go someplace new, we
build a mental representation—or cognitive map—of the location, as
Tolman’s rats built a cognitive map of their maze. However, some
buildings are confusing because they include many areas that look
alike or have short lines of sight. Because of this, it’s often
difficult to predict what’s around a corner or decide whether to turn
left or right to get out of a building. Psychologist Laura Carlson
(2010) suggests that
what we place in our cognitive map can impact
our success in navigating through the environment. She suggests that
paying attention to specific features upon entering a building, such
as a picture on the wall, a fountain, a statue, or an escalator, adds
information to our cognitive map that can be used later to help find
our way out of the building.
See also
Watch this to learn more about Carlson’s studies on cognitive maps and navigation in
buildings.
Escape behaviour is maintained by negative reinforcement: performing the behaviour allows the organism to escape an undesirable stimulus.
Avoidance builds on escape and is a two-factor form of learning: the organism first learns to identify a cue that signals the onset of an aversive stimulus.
If the organism performs the target behaviour in the presence of that cue, it can avoid or escape the aversive stimulus (the negative reinforcer).
Two factors
Discrimination learning (recognising the cue) and
Avoidance or escape learning.
Clinical Correlate
Substance users begin to experience withdrawal symptoms in the course of development of drug dependence.
At some point, the user learns that the use of the substance can help reduce—escape—this unpleasant state.
This is an example of escape learning.
Gradually, substance users learn to prevent withdrawal from happening by using the substance at the earliest cues of withdrawal.
This would be avoidance learning.
Operant conditioning is based on the work of B. F. Skinner. Operant
conditioning is a form of learning in which the motivation for a
behaviour happens after the behaviour is demonstrated. An animal or a
human receives a consequence after performing a specific behaviour. The
consequence is either a reinforcer or a punisher. All reinforcement
(positive or negative) increases the likelihood of a behavioural
response. All punishment (positive or negative) decreases the
likelihood of a behavioural response. Several types of reinforcement
schedules are used to reward behaviour, depending on either a set or
variable number of responses or a set or variable period of time.
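The contrast between a fixed and a variable ratio schedule can be made concrete with a short simulation. The sketch below is only an illustration, not a model taken from Skinner's work: the function names (`fixed_ratio`, `variable_ratio`, `count_rewards`) are hypothetical, and the variable ratio schedule is approximated by rewarding each response with a fixed probability, which is a simplifying assumption.

```python
import random


def fixed_ratio(n):
    """Return a schedule that reinforces every n-th response."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0  # start counting toward the next reward
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Return a schedule that reinforces each response with probability 1/mean_n.

    Rewards then arrive after an unpredictable number of responses that
    averages mean_n (a simplifying assumption made for this sketch).
    """

    def respond():
        return rng.random() < 1.0 / mean_n

    return respond


def count_rewards(schedule, responses):
    """Count how many of the given number of responses are reinforced."""
    return sum(schedule() for _ in range(responses))


fixed_total = count_rewards(fixed_ratio(5), 100)
variable_total = count_rewards(variable_ratio(5, random.Random(0)), 100)
print(fixed_total, variable_total)
```

Under the fixed schedule, 100 responses always earn exactly 20 rewards, so the learner can predict when the next one is due. Under the variable schedule the total is similar on average, but no single response reveals whether the next one will pay off, which is the property the slot machine example relies on.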
Question
________ is when you take away a pleasant stimulus to stop a
behaviour.
A. positive reinforcement
B. negative reinforcement
C. positive punishment
D. negative punishment
Answer: D
Question
Which of the following is not an example of a primary
reinforcer?
A. food
B. money
C. water
D. sex
Answer: B
Question
Rewarding successive approximations toward a target behaviour is
A. shaping
B. extinction
C. positive reinforcement
D. negative reinforcement
Answer: A
Question
Slot machines reward gamblers with money according to which
reinforcement schedule?
Answer: variable ratio
What is a Skinner box and what is its purpose?
A Skinner box is an operant conditioning chamber used to train
animals such as rats and pigeons to perform certain behaviours,
like pressing a lever. When the animals perform the desired
behaviour, they receive a reward: food or water.
What is the difference between negative reinforcement and punishment?
In negative reinforcement you are taking away an undesirable
stimulus in order to increase the frequency of a certain behaviour
(e.g., buckling your seat belt stops the annoying beeping sound in
your car and increases the likelihood that you will wear your
seat belt). Punishment is designed to reduce a behaviour (e.g., you
scold your child for running into the street in order to decrease
the unsafe behaviour).
What is shaping and how would you use shaping to teach a dog to
roll over?
Shaping is an operant conditioning method in which you reward
closer and closer approximations of the desired behaviour. If you
want to teach your dog to roll over, you might reward him first
when he sits, then when he lies down, and then when he lies down
and rolls onto his back. Finally, you would reward him only when
he completes the entire sequence: lying down, rolling onto his
back, and then continuing to roll over to his other side.
What type of reinforcement schedule is involved in the development of
superstitious behaviour?
Superstitious behaviour results when a reinforcer or punisher happens
to be delivered shortly after an unrelated behaviour
(temporal contiguity).
The behaviour is thereby reinforced or punished by accident,
which changes the likelihood that it will happen again.
For instance, suppose you walk under a ladder, misstep, and fall a moment later.
It is easy to blame the mishap on bad luck and the unrelated ladder. Because the fall came shortly after you walked beneath the ladder, it strengthens the culturally held belief that doing so brings bad luck; this closeness in time is what makes such associations easy to form.
Explain the difference between negative reinforcement and punishment, and provide several examples of each based on your own experiences.
Think of a behaviour that you have that you would like to change. How could you use behaviour modification, specifically positive reinforcement, to change your behaviour? What is your positive reinforcer?
A young house officer usually attends educational seminars only if there is a post-seminar lunch or if he knows that there will be a photo session with the chief guest. Otherwise, he either gets himself posted to the ER on that day or reports sick. Explain the behaviour of the house officer according to B. F. Skinner’s theory.
What are the schedules of reinforcement?
What reinforcers can you use in clinical settings?
Behaviour that is followed by consequences satisfying to the
organism will be repeated, and behaviours that are followed by
unpleasant consequences will be discouraged.
This work is (being) adapted from OpenStax Psychology 2e, which is licensed under a Creative Commons Attribution 4.0 license. We license our work under a similar license.
If you copy, adapt, remix, or build upon this work, you must give appropriate credit, provide a link to the license, and indicate if changes were made.
You may do so in any reasonable manner,
but not in any way that suggests the licensor endorses you or your use.