{"id":786,"date":"2018-08-20T10:52:22","date_gmt":"2018-08-20T14:52:22","guid":{"rendered":"http:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/?post_type=chapter&#038;p=786"},"modified":"2020-08-17T14:10:59","modified_gmt":"2020-08-17T18:10:59","slug":"chapter-10-useful-things-to-know-about-instrumental-conditioning","status":"publish","type":"chapter","link":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/chapter\/chapter-10-useful-things-to-know-about-instrumental-conditioning\/","title":{"raw":"Useful Things to Know about Instrumental Conditioning","rendered":"Useful Things to Know about Instrumental Conditioning"},"content":{"raw":"<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 6pt;margin-right: 5.85pt\">Most of the things that affect the strength of classical conditioning also affect the strength of instrumental learning\u2014whereby we learn to associate our actions with their outcomes. As noted earlier, the \u201cbigger\u201d the reinforcer (or punisher), the stronger the learning. And, if an instrumental behavior is no longer reinforced, it will also be extinguished. Most of the rules of associative learning that apply to classical conditioning also apply to instrumental learning, but other facts about instrumental learning are also worth knowing.<\/p>\r\n\r\n<h2>Instrumental Responses Come Under Stimulus Control<\/h2>\r\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 5.95pt;margin-right: 5.85pt\">As you know, the classic operant response in the laboratory is lever-pressing in rats, reinforced by food. However, things can be arranged so that lever-pressing only produces pellets when a particular stimulus is present. For example, lever-pressing can be reinforced only when a light in the Skinner box is turned on; when the light is off, no food is released from lever- pressing. 
The rat soon learns to discriminate between the light-on and light-off conditions, and presses the lever only in the presence of the light (responses in light-off are extinguished). In everyday life, think about waiting in the turn lane at a traffic light. Although you know that green means go, only when you have the green <em>arrow<\/em> do you turn. In this regard, the operant behavior is now said to be under <a href=\"#_bookmark93\"><strong>stimulus<\/strong> <strong>control<\/strong><\/a>. And, as is the case with the traffic light, in the real world, stimulus control is probably the rule.<\/p>\r\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 5.95pt;margin-right: 5.1pt\">The stimulus controlling the operant response is called a <a href=\"#_bookmark91\"><strong>discriminative<\/strong> <strong>stimulus<\/strong><\/a>. It can be associated directly with the response or the reinforcer (see below). However, it usually does not elicit the response the way a classical CS does. Instead, it is said to \u201cset the occasion for\u201d the operant response. For example, a canvas put in front of an artist does not elicit painting behavior or compel her to paint. It allows, or sets the occasion for, painting to occur.<\/p>\r\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 5.95pt;margin-right: 5.85pt\">Stimulus-control techniques are widely used in the laboratory to study perception and other psychological processes in animals. For example, the rat would not be able to respond appropriately\u00a0to light-on and light-off conditions if it could not see the light. Following this logic, experiments using stimulus-control methods have tested how well animals see colors, hear ultrasounds, and detect magnetic fields. That is, researchers pair these stimuli with a response they know the animals can already perform (such as pressing the lever). 
In this way, the researchers can test whether the animals can learn to press the lever only when an ultrasound is played, for example.<\/p>\r\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 5pt;margin-right: 5.1pt\">These methods can also be used to study \u201chigher\u201d cognitive processes. For example, pigeons can learn to peck at different buttons in a Skinner box when pictures of flowers, cars, chairs, or people are shown on a miniature TV screen (see <a href=\"#_bookmark6\">Wasserman, 1995<\/a>). Pecking button 1 (and no other) is reinforced in the presence of a flower image, button 2 in the presence of a chair image, and so on. Pigeons can learn the discrimination readily, and, under the right conditions, will even peck the correct buttons associated with pictures of <em>new<\/em> flowers, cars, chairs, and people they have never seen before. The birds have learned to <a href=\"#_bookmark90\"><strong>categorize<\/strong> <\/a>the sets of stimuli. Stimulus-control methods can be used to study how such categorization is learned.<\/p>\r\n\r\n<h2>Operant Conditioning Involves Choice<\/h2>\r\n<img src=\"http:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-content\/uploads\/sites\/17\/2018\/08\/image5-5.jpeg\" width=\"303.866666666667px\" height=\"321.533333333333px\" alt=\"image\" class=\"aligncenter\" \/>\r\n\r\nAnother thing to know about operant conditioning is that the response always requires choosing one behavior over others. The student who goes to the bar on Thursday night chooses to drink instead of staying at home and studying. The rat chooses to press the lever instead of sleeping or scratching its ear in the back of the box. The alternative behaviors are each associated with their own reinforcers. 
And the tendency to perform a particular action depends on both the reinforcers earned for it and the reinforcers earned for its alternatives.\r\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 5pt\">To investigate this idea, choice has been\u00a0studied in the Skinner box by making two levers available for the rat (or two buttons available for the pigeon), each of which has its own reinforcement or payoff rate. A thorough study of choice in situations like this has led to a rule called the <a href=\"#_bookmark92\"><strong>quantitative<\/strong> <strong>law<\/strong> <strong>of<\/strong> <strong>effect<\/strong> <\/a>(see <a href=\"#_bookmark47\">Herrnstein, 1970<\/a>), which can be understood without going into quantitative detail: the effects of reinforcing one behavior depend crucially on how much reinforcement is earned for the behavior\u2019s alternatives. For example, if a pigeon learns that pecking one light yields two food pellets, whereas the other light yields only one, the pigeon will come to prefer the first light. However, what happens if the first light is more strenuous to reach than the second one? Will the cost of energy outweigh the bonus of food? Or will the extra food be worth the work? In general, a given reinforcer will be less reinforcing if there are many alternative reinforcers in the environment. For this reason, alcohol, sex, or drugs may be less powerful reinforcers if the person\u2019s environment is full of other sources of reinforcement, such as achievement at work or love from family members.<\/p>\r\n\r\n<h2>Cognition in Instrumental Learning<\/h2>\r\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 6pt;margin-right: 5.85pt\">Modern research also indicates that reinforcers do more than merely strengthen or \u201cstamp in\u201d the behaviors they are a consequence of, as was Thorndike\u2019s original view. 
Instead, animals learn about the specific consequences of each behavior, and will perform a behavior depending on how much they currently want\u2014or \u201cvalue\u201d\u2014its consequence.<\/p>\r\n<p class=\"import-Normal\" style=\"margin-left: 6pt\"><img src=\"http:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-content\/uploads\/sites\/17\/2018\/08\/image6-6.jpeg\" width=\"624px\" height=\"223.859947506562px\" alt=\"image\" \/><\/p>\r\n<p class=\"import-Normal\" style=\"margin-left: 6pt\">[Image courtesy of Bernard W. Balleine]<\/p>\r\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 5.95pt;margin-right: 5.85pt\">This idea is best illustrated by a phenomenon called the <a href=\"#_bookmark93\"><strong>reinforcer<\/strong> <strong>devaluation<\/strong> <strong>effect<\/strong> <\/a>(see <a href=\"#_bookmark15\">Colwill &amp; Rescorla, 1986<\/a>). A rat is first trained to perform two instrumental actions (e.g., pressing a lever on the left, and on the right), each paired with a different reinforcer (e.g., a sweet sucrose solution, and a food pellet). At the end of this training, the rat tends to press both levers, alternating between the sucrose solution and the food pellet. In a second phase, one of the reinforcers (e.g., the sucrose) is then separately paired with illness. This conditions a taste aversion to the sucrose. In a final test, the rat is returned to the Skinner box and allowed to press either lever freely. No reinforcers are presented during this test (i.e., no sucrose or food\u00a0comes from pressing the levers), so behavior during testing can only result from the rat\u2019s memory of what it has learned earlier. Importantly here, the rat chooses <em>not<\/em> to perform the response that once produced the reinforcer that it now has an aversion to (e.g., it won\u2019t press the sucrose lever). 
This means that the rat has learned and remembered the reinforcer associated with each response, and can combine that knowledge with the knowledge that the reinforcer is now \u201cbad.\u201d Reinforcers do not merely stamp in responses; the animal learns much more than that. The behavior is said to be \u201c<strong>goal-directed<\/strong>\u201d (see <a href=\"#_bookmark15\">Dickinson &amp; Balleine,<\/a> <a href=\"#_bookmark15\">1994<\/a>), because it is influenced by the current value of its associated goal (i.e., how much the rat wants\/doesn\u2019t want the reinforcer).<\/p>\r\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 5pt;margin-right: 5.1pt\">Things can get more complicated, however, if the rat performs the instrumental actions frequently and repeatedly. That is, if the rat has spent many months learning the value of pressing each of the levers, the act of pressing them becomes automatic and routine. And here, this once goal-directed action (i.e., the rat pressing the lever for the goal of getting sucrose\/food) can become a <a href=\"#_bookmark91\"><strong>habit<\/strong><\/a>. Thus, if a rat spends many months performing the lever-pressing behavior (turning such behavior into a habit), even when sucrose is again paired with illness, the rat will continue to press that lever (see <a href=\"#_bookmark47\">Holland, 2004<\/a>). After all the practice, the instrumental response (pressing the lever) is no longer sensitive to reinforcer devaluation. The rat continues to respond automatically, even though the sucrose from this lever makes it sick.<\/p>\r\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 5pt;margin-right: 5.85pt\">Habits are very common in human experience and can be useful. You do not need to relearn each day how to make your coffee in the morning or how to brush your teeth. 
Instrumental behaviors can eventually become habitual, letting us get the job done while being free to think about other things.<\/p>","rendered":"<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 6pt;margin-right: 5.85pt\">Most of the things that affect the strength of classical conditioning also affect the strength of instrumental learning\u2014whereby we learn to associate our actions with their outcomes. As noted earlier, the \u201cbigger\u201d the reinforcer (or punisher), the stronger the learning. And, if an instrumental behavior is no longer reinforced, it will be extinguished. Most of the rules of associative learning that apply to classical conditioning also apply to instrumental learning, but other facts about instrumental learning are also worth knowing.<\/p>\n<h2>Instrumental Responses Come Under Stimulus Control<\/h2>\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 5.95pt;margin-right: 5.85pt\">As you know, the classic operant response in the laboratory is lever-pressing in rats, reinforced by food. However, things can be arranged so that lever-pressing only produces pellets when a particular stimulus is present. For example, lever-pressing can be reinforced only when a light in the Skinner box is turned on; when the light is off, no food is released from lever-pressing. The rat soon learns to discriminate between the light-on and light-off conditions, and presses the lever only in the presence of the light (responses in light-off are extinguished). In everyday life, think about waiting in the turn lane at a traffic light. Although you know that green means go, only when you have the green <em>arrow<\/em> do you turn. In this regard, the operant behavior is now said to be under <a href=\"#_bookmark93\"><strong>stimulus<\/strong> <strong>control<\/strong><\/a>. 
And, as is the case with the traffic light, in the real world, stimulus control is probably the rule.<\/p>\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 5.95pt;margin-right: 5.1pt\">The stimulus controlling the operant response is called a <a href=\"#_bookmark91\"><strong>discriminative<\/strong> <strong>stimulus<\/strong><\/a>. It can be associated directly with the response or the reinforcer (see below). However, it usually does not elicit the response the way a classical CS does. Instead, it is said to \u201cset the occasion for\u201d the operant response. For example, a canvas put in front of an artist does not elicit painting behavior or compel her to paint. It allows, or sets the occasion for, painting to occur.<\/p>\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 5.95pt;margin-right: 5.85pt\">Stimulus-control techniques are widely used in the laboratory to study perception and other psychological processes in animals. For example, the rat would not be able to respond appropriately\u00a0to light-on and light-off conditions if it could not see the light. Following this logic, experiments using stimulus-control methods have tested how well animals see colors, hear ultrasounds, and detect magnetic fields. That is, researchers pair these stimuli with a response they know the animals can already perform (such as pressing the lever). In this way, the researchers can test whether the animals can learn to press the lever only when an ultrasound is played, for example.<\/p>\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 5pt;margin-right: 5.1pt\">These methods can also be used to study \u201chigher\u201d cognitive processes. For example, pigeons can learn to peck at different buttons in a Skinner box when pictures of flowers, cars, chairs, or people are shown on a miniature TV screen (see <a href=\"#_bookmark6\">Wasserman, 1995<\/a>). 
Pecking button 1 (and no other) is reinforced in the presence of a flower image, button 2 in the presence of a chair image, and so on. Pigeons can learn the discrimination readily, and, under the right conditions, will even peck the correct buttons associated with pictures of <em>new<\/em> flowers, cars, chairs, and people they have never seen before. The birds have learned to <a href=\"#_bookmark90\"><strong>categorize<\/strong> <\/a>the sets of stimuli. Stimulus-control methods can be used to study how such categorization is learned.<\/p>\n<h2>Operant Conditioning Involves Choice<\/h2>\n<p><img decoding=\"async\" src=\"http:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-content\/uploads\/sites\/17\/2018\/08\/image5-5.jpeg\" width=\"303.866666666667px\" height=\"321.533333333333px\" alt=\"image\" class=\"aligncenter\" \/><\/p>\n<p>Another thing to know about operant conditioning is that the response always requires choosing one behavior over others. The student who goes to the bar on Thursday night chooses to drink instead of staying at home and studying. The rat chooses to press the lever instead of sleeping or scratching its ear in the back of the box. The alternative behaviors are each associated with their own reinforcers. And the tendency to perform a particular action depends on both the reinforcers earned for it and the reinforcers earned for its alternatives.<\/p>\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 5pt\">To investigate this idea, choice has been\u00a0studied in the Skinner box by making two levers available for the rat (or two buttons available for the pigeon), each of which has its own reinforcement or payoff rate. 
A thorough study of choice in situations like this has led to a rule called the <a href=\"#_bookmark92\"><strong>quantitative<\/strong> <strong>law<\/strong> <strong>of<\/strong> <strong>effect<\/strong> <\/a>(see <a href=\"#_bookmark47\">Herrnstein, 1970<\/a>), which can be understood without going into quantitative detail: the effects of reinforcing one behavior depend crucially on how much reinforcement is earned for the behavior\u2019s alternatives. For example, if a pigeon learns that pecking one light yields two food pellets, whereas the other light yields only one, the pigeon will come to prefer the first light. However, what happens if the first light is more strenuous to reach than the second one? Will the cost of energy outweigh the bonus of food? Or will the extra food be worth the work? In general, a given reinforcer will be less reinforcing if there are many alternative reinforcers in the environment. For this reason, alcohol, sex, or drugs may be less powerful reinforcers if the person\u2019s environment is full of other sources of reinforcement, such as achievement at work or love from family members.<\/p>\n<h2>Cognition in Instrumental Learning<\/h2>\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 6pt;margin-right: 5.85pt\">Modern research also indicates that reinforcers do more than merely strengthen or \u201cstamp in\u201d the behaviors they are a consequence of, as was Thorndike\u2019s original view. 
Instead, animals learn about the specific consequences of each behavior, and will perform a behavior depending on how much they currently want\u2014or \u201cvalue\u201d\u2014its consequence.<\/p>\n<p class=\"import-Normal\" style=\"margin-left: 6pt\"><img decoding=\"async\" src=\"http:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-content\/uploads\/sites\/17\/2018\/08\/image6-6.jpeg\" width=\"624px\" height=\"223.859947506562px\" alt=\"image\" \/><\/p>\n<p class=\"import-Normal\" style=\"margin-left: 6pt\">[Image courtesy of Bernard W. Balleine]<\/p>\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 5.95pt;margin-right: 5.85pt\">This idea is best illustrated by a phenomenon called the <a href=\"#_bookmark93\"><strong>reinforcer<\/strong> <strong>devaluation<\/strong> <strong>effect<\/strong> <\/a>(see <a href=\"#_bookmark15\">Colwill &amp; Rescorla, 1986<\/a>). A rat is first trained to perform two instrumental actions (e.g., pressing a lever on the left, and on the right), each paired with a different reinforcer (e.g., a sweet sucrose solution, and a food pellet). At the end of this training, the rat tends to press both levers, alternating between the sucrose solution and the food pellet. In a second phase, one of the reinforcers (e.g., the sucrose) is then separately paired with illness. This conditions a taste aversion to the sucrose. In a final test, the rat is returned to the Skinner box and allowed to press either lever freely. No reinforcers are presented during this test (i.e., no sucrose or food\u00a0comes from pressing the levers), so behavior during testing can only result from the rat\u2019s memory of what it has learned earlier. Importantly here, the rat chooses <em>not<\/em> to perform the response that once produced the reinforcer that it now has an aversion to (e.g., it won\u2019t press the sucrose lever). 
This means that the rat has learned and remembered the reinforcer associated with each response, and can combine that knowledge with the knowledge that the reinforcer is now \u201cbad.\u201d Reinforcers do not merely stamp in responses; the animal learns much more than that. The behavior is said to be \u201c<strong>goal-directed<\/strong>\u201d (see <a href=\"#_bookmark15\">Dickinson &amp; Balleine,<\/a> <a href=\"#_bookmark15\">1994<\/a>), because it is influenced by the current value of its associated goal (i.e., how much the rat wants\/doesn\u2019t want the reinforcer).<\/p>\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 5pt;margin-right: 5.1pt\">Things can get more complicated, however, if the rat performs the instrumental actions frequently and repeatedly. That is, if the rat has spent many months learning the value of pressing each of the levers, the act of pressing them becomes automatic and routine. And here, this once goal-directed action (i.e., the rat pressing the lever for the goal of getting sucrose\/food) can become a <a href=\"#_bookmark91\"><strong>habit<\/strong><\/a>. Thus, if a rat spends many months performing the lever-pressing behavior (turning such behavior into a habit), even when sucrose is again paired with illness, the rat will continue to press that lever (see <a href=\"#_bookmark47\">Holland, 2004<\/a>). After all the practice, the instrumental response (pressing the lever) is no longer sensitive to reinforcer devaluation. The rat continues to respond automatically, even though the sucrose from this lever makes it sick.<\/p>\n<p class=\"import-BodyText\" style=\"text-align: justify;margin-left: 5pt;margin-right: 5.85pt\">Habits are very common in human experience and can be useful. You do not need to relearn each day how to make your coffee in the morning or how to brush your teeth. 
Instrumental behaviors can eventually become habitual, letting us get the job done while being free to think about other things.<\/p>\n","protected":false},"author":23,"menu_order":9,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":"cc-by-nc-sa"},"chapter-type":[48],"contributor":[],"license":[54],"class_list":["post-786","chapter","type-chapter","status-publish","hentry","chapter-type-numberless","license-cc-by-nc-sa"],"part":233,"_links":{"self":[{"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/pressbooks\/v2\/chapters\/786","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/wp\/v2\/users\/23"}],"version-history":[{"count":3,"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/pressbooks\/v2\/chapters\/786\/revisions"}],"predecessor-version":[{"id":1612,"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/pressbooks\/v2\/chapters\/786\/revisions\/1612"}],"part":[{"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/pressbooks\/v2\/parts\/233"}],"metadata":[{"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/pressbooks\/v2\/chapters\/786\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/wp\/v2\/media?parent=786"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/pressbooks\/v2\/chapter-type?post=786"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/wp\/v2\
/contributor?post=786"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/wp\/v2\/license?post=786"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}