PSYU2236 Week 3 Notes: The Use Of Rewards To Shape Behaviour

Summary

Difficulty: ★★☆☆☆

Covers: classical vs operant conditioning, types of reinforcers, reinforcement schedules, the Premack principle, chaining, differential reinforcement techniques

Quizlet flashcards:

https://quizlet.com/au/1115146035/the-use-of-rewards-to-shape-behaviour-flash-cards/?i=6xlcf8&x=1jqt

What is Operant Conditioning?

Operant conditioning examines how voluntary behaviour is shaped by consequences. This week focuses on reinforcement, schedules of reinforcement, shaping, and differential reinforcement techniques.

Classical vs Operant Conditioning

Feature | Classical Conditioning | Operant Conditioning
Type of behaviour | Involuntary, reflexive | Voluntary, intentional
Learns what? | Association between two stimuli | Association between behaviour and consequence
Response type | Automatic (UCR/CR) | Goal-directed behaviour
Example | Dog salivates to bell | Rat presses lever for food

The Skinner Box

Used to measure operant conditioning in controlled environments.

Component | Description
Response | Defined behaviour (lever press, peck) required to obtain reward
Reinforcer | Food or liquid released as reward
Cues | Signals indicating whether reinforcement is available
Measurement | Frequency and timing of responses

Links closely to Thorndike’s Law of Effect: behaviours followed by satisfying outcomes are strengthened.

Shaping

Shaping = reinforcing successive approximations of the target behaviour.

Process:

  1. Reinforce any behaviour close to the desired behaviour.
  2. Gradually require behaviours closer to the target.
  3. Withhold reinforcement for earlier approximations and for actions unrelated to the target.
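
The three steps above can be sketched as a loop that raises the criterion each time a response meets it. This is only an illustration: the 0 to 10 "closeness" score, the starting criterion, and the step size are assumptions made for the example, not values from the lecture.

```python
def shape(responses, target, start_criterion, step):
    """Return True/False per response: was it reinforced under shaping?

    responses: hypothetical closeness scores (0 = unrelated, 10 = target behaviour)
    """
    criterion = start_criterion
    reinforced = []
    for score in responses:
        if score >= criterion:
            # Close enough for now: reinforce, then demand a closer approximation.
            reinforced.append(True)
            criterion = min(target, criterion + step)
        else:
            # Below the current criterion: withhold reinforcement.
            reinforced.append(False)
    return reinforced


print(shape([2, 2, 5, 4, 8, 10], target=10, start_criterion=2, step=3))
# [True, False, True, False, True, True]
```

Note that the second response (score 2) is no longer reinforced even though the same score was reinforced a moment earlier, because the criterion has already moved on.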

Used to teach new or complex behaviours.

Types of Reinforcers

Definition Table

Category | Definition | Effect on Behaviour
Positive reinforcement | Adding a desirable stimulus | Increases behaviour
Negative reinforcement | Removing an aversive stimulus | Increases behaviour
Positive punishment | Adding an aversive stimulus | Decreases behaviour
Negative punishment | Removing a desirable stimulus | Decreases behaviour

Examples

Type | Example
Positive reinforcement | Giving stickers for good behaviour
Negative reinforcement | Removing a painful rock from your shoe → the pain stops, so stopping to remove rocks becomes more likely
Positive punishment | Scolding or smacking for misbehaviour
Negative punishment | Taking away screen time after failing an exam

How to Identify Them (Two-Step Rule)

  1. Is behaviour increasing or decreasing?
    • Increasing → reinforcement
    • Decreasing → punishment
  2. Is something added or removed?
    • Added → positive
    • Removed → negative
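
The two-step rule is effectively a lookup, so it can be written as a tiny function. A minimal sketch; the function name and the input strings are made up for illustration.

```python
def classify_consequence(behaviour_change, stimulus_change):
    """Apply the two-step rule.

    behaviour_change: "increasing" or "decreasing"  (step 1)
    stimulus_change:  "added" or "removed"          (step 2)
    """
    effects = {"increasing": "reinforcement", "decreasing": "punishment"}
    signs = {"added": "positive", "removed": "negative"}
    if behaviour_change not in effects or stimulus_change not in signs:
        raise ValueError("expected increasing/decreasing and added/removed")
    return f"{signs[stimulus_change]} {effects[behaviour_change]}"


print(classify_consequence("increasing", "removed"))  # negative reinforcement
print(classify_consequence("decreasing", "added"))    # positive punishment
```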

Emotional Effects of Each Consequence Type

Type | Emotional Impact | Notes
Positive reinforcement | Satisfaction, motivation | Most effective for long-term change
Negative reinforcement | Relief | Can maintain avoidance behaviours
Positive punishment | Fear, anxiety | Weak for long-term change; ethical concerns
Negative punishment | Disappointment | Effective for reducing specific behaviours

Schedules of Reinforcement

Continuous Reinforcement

  • Behaviour reinforced every time.
  • Useful for establishing new behaviour.
  • Behaviour extinguishes quickly once reinforcement stops.

Partial (Intermittent) Reinforcement

Schedule | Definition | Response Pattern | Resistance to Extinction | Examples
Fixed Ratio (FR) | Reinforcement after fixed number of responses | High rate, pauses after reward (post-reinforcement pause) | Low to moderate | Piecework pay
Variable Ratio (VR) | Reinforcement after unpredictable number of responses | Very high rate, steady | Very high | Gambling, slot machines
Fixed Interval (FI) | Reinforcement available after fixed time | Scalloped pattern; increased responding near time limit | Low | Studying before exams
Variable Interval (VI) | Reinforcement after unpredictable intervals | Steady, moderate rate | High | Fishing
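
To make the four definitions concrete, here is a rough sketch of how each schedule decides which responses earn a reinforcer. It only models the delivery rule, not how an animal actually responds. The uniform random draws used for the "variable" schedules are an assumption chosen for simplicity, and all function names are made up.

```python
import random


def fixed_ratio(n_responses, ratio):
    """FR: every `ratio`-th response is reinforced."""
    return [i % ratio == 0 for i in range(1, n_responses + 1)]


def variable_ratio(n_responses, mean_ratio, rng):
    """VR: the required number of responses changes unpredictably after each reinforcer."""
    outcomes, count = [], 0
    required = rng.randint(1, 2 * mean_ratio - 1)  # uniform draw averaging mean_ratio
    for _ in range(n_responses):
        count += 1
        if count >= required:
            outcomes.append(True)
            count = 0
            required = rng.randint(1, 2 * mean_ratio - 1)
        else:
            outcomes.append(False)
    return outcomes


def fixed_interval(response_times, interval):
    """FI: the first response made at least `interval` after the last reinforcer is reinforced."""
    outcomes, last = [], 0.0
    for t in response_times:
        if t - last >= interval:
            outcomes.append(True)
            last = t
        else:
            outcomes.append(False)
    return outcomes


def variable_interval(response_times, mean_interval, rng):
    """VI: like FI, but the waiting time is redrawn unpredictably after each reinforcer."""
    outcomes, last = [], 0.0
    wait = rng.uniform(0, 2 * mean_interval)  # uniform draw averaging mean_interval
    for t in response_times:
        if t - last >= wait:
            outcomes.append(True)
            last = t
            wait = rng.uniform(0, 2 * mean_interval)
        else:
            outcomes.append(False)
    return outcomes


print(fixed_ratio(6, ratio=3))
# [False, False, True, False, False, True]
print(fixed_interval([1, 5, 10, 12, 21], interval=10))
# [False, False, True, False, True]
```

On the ratio schedules, responding faster earns reinforcers sooner; on the interval schedules, extra responses before the time is up earn nothing.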

Key terms:

  • Post-reinforcement pause (PRP): brief pause in responding right after a reward; seen in fixed ratio schedules.
  • Ratio strain: break in responding when ratio requirements become too high.
  • Ratio run: rapid responding leading up to reinforcement.

Differential Reinforcement Techniques

Used in applied behaviour analysis to increase desired behaviours and reduce unwanted ones.

Type | Definition | Example
DRO (Other Behaviour) | Reinforce any behaviour except the problem behaviour (i.e. reinforce when the problem behaviour is absent) | Reward students for staying silent rather than calling out
DRL (Low Rates) | Reinforce behaviour only when it occurs at low rates | Reinforce washing hands once instead of repeatedly
DRI (Incompatible Behaviour) | Reinforce behaviour incompatible with the undesired behaviour | Reward sitting in seat to prevent running
DRA (Alternative Behaviour) | Reinforce a more appropriate alternative behaviour | Reward saying “please” instead of whining
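
DRO and DRL can both be read as counting rules over an observation period. The sketch below assumes that interval-based reading (count the behaviour during a set period, then decide); the function name and the `limit` parameter are made up for illustration. DRI and DRA depend on which behaviour occurred rather than how often, so they are left out.

```python
def earns_reinforcer(technique, count, limit=0):
    """Decide whether a reinforcer is delivered at the end of an observation period.

    count: how many times the behaviour occurred during the period
    limit: highest count still reinforced under DRL (hypothetical parameter)
    """
    if technique == "DRO":
        # Reinforce only if the problem behaviour did not occur at all.
        return count == 0
    if technique == "DRL":
        # Reinforce only if the behaviour stayed at or below the low-rate limit.
        return count <= limit
    raise ValueError("this sketch only covers DRO and DRL")


print(earns_reinforcer("DRO", count=0))           # True
print(earns_reinforcer("DRL", count=3, limit=1))  # False
```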

What Makes Reinforcement Effective?

Factor | Description
Reinforcer magnitude | Larger rewards produce faster learning
Contrast effects | Changing reward value in one context affects behaviour in another
Delay of reinforcement | Longer delays weaken learning; may accidentally reinforce other behaviours (superstitious behaviour)
Speed of reward | Faster reinforcement increases dopamine release
Primary reinforcers | Reinforcers with intrinsic biological value (food, water, warmth)
Secondary reinforcers | Reinforcers that gain value through association with primary reinforcers (money, tokens, praise)

Strength of Secondary Reinforcers Depends On:

  • Size/magnitude of primary reinforcer
  • Number of pairings
  • Timing between secondary and primary reinforcer

Premack Principle

A high-probability behaviour can reinforce a low-probability behaviour.
Example: “You can play outside after you finish homework.”
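
One way to apply the principle is to compare how likely each activity is when the person is free to choose. A minimal sketch: the baseline proportions below are invented numbers, and `can_reinforce` is a hypothetical helper name.

```python
def can_reinforce(reinforcer, target, baseline):
    """Premack check: is the proposed reinforcer more probable at baseline than the target?"""
    return baseline[reinforcer] > baseline[target]


# Hypothetical share of free time spent on each activity.
baseline = {"playing outside": 0.60, "homework": 0.10}

print(can_reinforce("playing outside", "homework", baseline))  # True
print(can_reinforce("homework", "playing outside", baseline))  # False
```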

Chaining

Teaching complex behaviours by breaking them into small steps and reinforcing each link.

Type | Description
Forward chaining | Teach steps in order from first to last
Backward chaining | Teach final step first; work backwards
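
The difference between the two orders is easiest to see by listing which links the learner performs at each training stage. A rough sketch, assuming each stage adds one new link to those already learned; the hand-washing steps are an invented example.

```python
def training_stages(steps, method):
    """Return the links practised at each stage of forward or backward chaining."""
    n = len(steps)
    if method == "forward":
        # Stage k: the first k links, starting from the beginning of the chain.
        return [steps[:k] for k in range(1, n + 1)]
    if method == "backward":
        # Stage k: the last k links, so every stage ends with the final step.
        return [steps[-k:] for k in range(1, n + 1)]
    raise ValueError("method must be 'forward' or 'backward'")


steps = ["wet hands", "apply soap", "rinse"]  # hypothetical task analysis

print(training_stages(steps, "forward"))
# [['wet hands'], ['wet hands', 'apply soap'], ['wet hands', 'apply soap', 'rinse']]
print(training_stages(steps, "backward"))
# [['rinse'], ['apply soap', 'rinse'], ['wet hands', 'apply soap', 'rinse']]
```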
