{"id":479,"date":"2018-08-14T18:12:26","date_gmt":"2018-08-14T22:12:26","guid":{"rendered":"http:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/?post_type=chapter&#038;p=479"},"modified":"2020-08-26T08:02:57","modified_gmt":"2020-08-26T12:02:57","slug":"why-should-i-trust-science-if-it-cant-prove-anything","status":"publish","type":"chapter","link":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/chapter\/why-should-i-trust-science-if-it-cant-prove-anything\/","title":{"raw":"Why Should I Trust Science If It Can\u2019t Prove Anything?","rendered":"Why Should I Trust Science If It Can\u2019t Prove Anything?"},"content":{"raw":"<p class=\"import-BodyText\">It\u2019s worth delving a bit deeper into why we ought to trust the scientific inductive process, even when it relies on limited samples that don\u2019t offer absolute \u201cproof.\u201d To do this, let\u2019s examine a widespread practice in psychological science: null-hypothesis significance testing.<\/p>\r\n<p class=\"import-BodyText\">To understand this concept, let\u2019s begin with another research example. Imagine, for instance, a researcher is curious about the ways maturity affects academic performance. She might have a hypothesis that mature students are more likely to be responsible about studying and completing homework and, therefore, will do better in their courses. To test this hypothesis, the researcher needs a measure of maturity and a measure of course performance. She might calculate the <a href=\"#_bookmark14\"><strong>correlation<\/strong><\/a>\u2014or relationship\u2014between student age (her measure of maturity) and points earned in a course (her measure of academic performance). Ultimately, the researcher is interested in the likelihood\u2014or probability\u2014 that these two variables closely relate to one another. 
<strong>Null-hypothesis<\/strong> <strong>significance<\/strong> <strong>testing<\/strong> <a href=\"#_bookmark29\">(NHST)<\/a> assesses the probability that the collected data (the observations) would be the same if there were no relationship between the variables in the study. Using our example, NHST would test the probability that the researcher would find a link between age and class performance if there were, in reality, no such link.<\/p>\r\n<p class=\"import-Normal\"><img src=\"http:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-content\/uploads\/sites\/17\/2018\/08\/image11-1.jpeg\" alt=\"image\" width=\"299.133333333333px\" height=\"299.133333333333px\" \/><\/p>\r\n<p class=\"import-Normal\">Is there a relationship between student age and academic performance? How could we research this question? How confident can we be that our observations reflect reality? [Image: Jeremy Wilburn, https:\/\/goo.gl\/i9MoJb, CC BY-NC-ND 2.0, https:\/\/goo.gl\/SjTsDg]<\/p>\r\n<p class=\"import-BodyText\">Now, here\u2019s where it gets a little complicated. NHST involves a <em>null hypothesis,<\/em> a statement that two variables are <em>not<\/em> related (in this case, that student maturity and academic performance are <em>not<\/em> related in any meaningful way). NHST also involves an <em>alternative hypothesis,<\/em> a statement that two variables <em>are<\/em> related (in this case, that student maturity and academic performance go together). To evaluate these two hypotheses, the researcher collects data. The researcher then compares what she expects to find (probability) with what she actually finds (the collected data) to determine whether she can falsify, or reject, the null hypothesis in favor of the alternative hypothesis.<\/p>\r\n<p class=\"import-BodyText\">How does she do this? By looking at the <a href=\"#_bookmark28\"><strong>distribution<\/strong><\/a> of the data. 
The distribution is the spread of values\u2014in our example, the numeric values of students\u2019 scores in the course. The researcher will test her hypothesis by comparing the observed distribution of grades earned by older students to those earned by younger students, recognizing that some distributions are more likely than others. Your intuition tells you, for example, that the chances of every single person in the course getting a perfect score are lower than the chances of scores being spread across all levels of performance.<\/p>\r\n<p class=\"import-BodyText\">The researcher can use a probability table to assess the likelihood of any distribution she finds in her class. These tables reflect the work, over the past 200 years, of mathematicians and scientists from a variety of fields. You can see, in Table 2a, an example of an expected distribution if the grades were normally distributed (most are average, and relatively few are amazing or terrible). In Table 2b, you can see possible results of this imaginary study and how they differ from the expected distribution.<\/p>\r\n<p class=\"import-BodyText\">In the process of testing these hypotheses, there are four possible outcomes. These are determined by two factors: 1) reality, and 2) what the researcher finds (see Table 3). The best possible outcome is <em>accurate detection<\/em>. This means that the researcher\u2019s conclusion mirrors reality. In our example, let\u2019s pretend the more mature students do perform slightly better. If this is what the researcher finds in her data, her analysis qualifies as an accurate detection of reality. Another form of accurate detection occurs when a researcher finds no evidence for a phenomenon, and that phenomenon truly doesn\u2019t exist. Using this same example, let\u2019s now pretend that maturity has <em>nothing<\/em> to do with academic performance. Perhaps academic performance is instead related to intelligence or study habits. 
If the researcher finds no evidence for a link between maturity and grades and none actually exists, she will have also\u00a0achieved accurate detection.<\/p>\r\n<p class=\"import-Normal\"><img src=\"http:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-content\/uploads\/sites\/17\/2018\/08\/image12-1.jpeg\" alt=\"image\" width=\"509.99937007874px\" height=\"600px\" \/><\/p>\r\n<p class=\"import-Normal\">Table 2a (Above): Expected grades if there were no difference\u00a0between the two groups. Table 2b (Below): Course grades by age<\/p>\r\n<p class=\"import-BodyText\">There are a couple of ways that research conclusions might be wrong. One is referred to as a <a href=\"#_bookmark30\"><strong>type I error<\/strong><\/a>\u2014when the researcher concludes there <em>is<\/em> a relationship between two variables but, in reality, there is <em>not<\/em>. Back to our example: Let\u2019s now pretend there\u2019s no relationship between maturity and grades, but the researcher still finds one. Why does this happen? It may be that her sample, by chance, includes older students who <em>also<\/em> have better study habits and perform better: The researcher has \u201cfound\u201d a relationship (the data appear to show age as significantly correlated with academic performance), but the apparent link is purely coincidental\u2014the real cause is that these particular older students happen to have better-than-average study habits. They may have always had superior study habits, even when they were young.<\/p>\r\n<p class=\"import-BodyText\">Another possible outcome of NHST is a <a href=\"#_bookmark30\"><strong>type II error<\/strong><\/a>, when the data fail to show a relationship between variables even though one actually exists. 
In our example, this time pretend that maturity <em>is<\/em>\u2014in reality\u2014associated with academic performance, but the researcher <em>doesn\u2019t<\/em> find it in her sample. Perhaps it was simply bad luck: her older students may have had an off day, suffered from test anxiety, or been uncharacteristically careless with their homework. The peculiarities of her particular sample, by chance, prevent the researcher from identifying the real relationship between maturity and academic performance.<\/p>\r\n<p class=\"import-BodyText\">These types of errors might worry you: perhaps there is just no way to tell whether data are any good. Researchers share your concerns, and address them by using <a href=\"#_bookmark29\"><strong>probability values<\/strong><\/a> (<em>p<\/em>-values) to set a threshold for type I or type II errors. When researchers write that a particular finding is \u201csignificant at a <em>p<\/em> &lt; .05 level,\u201d they\u2019re saying that if there were truly no relationship and the same study were repeated 100 times, we should expect a result this strong to occur\u2014by chance\u2014fewer than five times. That is, in this case, a type I error is unlikely. Scholars sometimes argue over the exact threshold that should be used for probability. The most common thresholds in psychological science are .05 (5% chance), .01 (1% chance), and .001 (1\/10th of 1% chance). Remember, psychological science doesn\u2019t rely on definitive proof; it\u2019s about the probability of seeing a specific result. 
This is also why it\u2019s so important that scientific findings be replicated in additional studies.<\/p>\r\n<p class=\"import-Normal\"><img src=\"http:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-content\/uploads\/sites\/17\/2018\/08\/image13.png\" alt=\"image\" width=\"624px\" height=\"209.04px\" \/><\/p>\r\n<p class=\"import-Normal\">Table 3: Accurate detection and errors in research<\/p>\r\n<p class=\"import-BodyText\">It\u2019s because of such methodologies that science is generally trustworthy. Not all claims and explanations are equal; some conclusions are better bets, so to speak. Scientific claims are more\u00a0likely to be correct and predict real outcomes than \u201ccommon sense\u201d opinions and\u00a0personal anecdotes. This is because researchers consider how to best prepare and measure their subjects, systematically collect data from large and\u2014ideally\u2014representative samples, and test their findings against probability.<\/p>","rendered":"","protected":false},"author":23,"menu_order":4,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[48],"contributor":[],"license":[],"class_list":["post-479","chapter","type-chapter","status-publish","hentry","chapter-type-numberless"],"part":107,"_links":{"self":[{"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/pressbooks\/v2\/chapters\/479","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/wp\/v2\/users\/23"}],"version-history":[{"count":7,"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/pressbooks\/v2\/chapters\/479\/revisions"}],"predecessor-version":[{"id":1840,"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/pressbooks\/v2\/chapters\/479\/revisions\/1840"}],"part":[{"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/pressbooks\/v2\/parts\/107"}],"metadata":[{"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/pressbooks\/v2\/chapters\/479\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/wp\/v2\/media?parent=479"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/pressbooks\/v2\/chapter-type?post=479"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.lib
rary.upei.ca\/upeiintropsychology\/wp-json\/wp\/v2\/contributor?post=479"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/upeiintropsychology\/wp-json\/wp\/v2\/license?post=479"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}