{"id":382,"date":"2020-04-10T07:58:47","date_gmt":"2020-04-10T11:58:47","guid":{"rendered":"http:\/\/pressbooks.library.upei.ca\/montelpare\/?post_type=chapter&#038;p=382"},"modified":"2020-08-24T14:16:22","modified_gmt":"2020-08-24T18:16:22","slug":"the-two-sample-goodness-of-fit-chi-square","status":"publish","type":"chapter","link":"https:\/\/pressbooks.library.upei.ca\/montelpare\/chapter\/the-two-sample-goodness-of-fit-chi-square\/","title":{"raw":"Goodness of Fit Chi-Square for k=5","rendered":"Goodness of Fit Chi-Square for k=5"},"content":{"raw":"<div>\r\n\r\nIn the following example, we consider the goodness of fit chi-square test with <strong>five<\/strong> response categories.\r\n\r\nThe biweekly lottery \u2013 Lotto 649 provides players with an opportunity to win millions of dollars if they can select the set of six numbers that are randomly drawn from the set of numbers from 1 to 49.\u00a0 Since the lottery is purported to be random, the chance associated with a player\u2019s single ticket matching the six numbers drawn at random is based on the combinatorial formula for 49 choose 6 and has a probability of 1 \u00f7 <sup>49<\/sup>C<sub>6<\/sub>.\r\n\r\nThe probability associated with every single ticket is the same and is 1 in 13,983,816 possible combinations of 6 numbers. 
So then, what if we wanted to test the randomness of this lottery?\u00a0 In the following example, we will use the chi-square goodness of fit test to determine whether the numbers are drawn at random, that is, whether there is no a priori pattern in which numbers from one range within the set of 49 occur more frequently than numbers from another range.\r\n\r\nTo begin we need to organize the range of possible outcomes into manageable categories that can be processed with the chi-square goodness of fit test.\u00a0 Given that the range of all possible outcomes for the lotto is from 1 to 49, we can organize the potential sampling space (1 to 49) into 5 categories as follows: 1-9, 10-19, 20-29, 30-39, 40-49.\r\n\r\nFurther, if we wanted to test randomness, then we would need to sample more than a single week of numbers. Considering that there are 104 draws per year, we could use an entire year\u2019s worth of data to establish the frequency distribution of numbers drawn and, after organizing the outcomes into the 5 categories, compute the chi-square goodness of fit statistic.\r\n\r\nStep 1: after establishing that there are 5 categories for the outcome frequency distribution, we would expect the responses to be distributed equally across all of the possible response categories, as follows:\r\n\r\nData represent the actual numbers that are drawn in a single year for the lotto 649.\u00a0 That is, in any given year there are 104 draws, which is based on 2 draws per week for 52 weeks.\u00a0 Therefore, in the lotto 649 example we have 624 numbers drawn (2 draws per week x 52 weeks = 104 draws, and 6 numbers x 104 draws = 624). 
These data can then be organized into the following 5 categories to represent the set of all possible numbers drawn in the one year so that a frequency distribution chart of the responses might look like this:\r\n<div align=\"center\">\r\n<table class=\"aligncenter\" style=\"height: 75px\">\r\n<tbody>\r\n<tr style=\"height: 15px\">\r\n<td style=\"height: 15px;width: 139.817px;text-align: center\">1 - 9<\/td>\r\n<td style=\"height: 15px;width: 229.283px;text-align: center\">124 numbers<\/td>\r\n<\/tr>\r\n<tr style=\"height: 15px\">\r\n<td style=\"height: 15px;width: 139.817px;text-align: center\">10 - 19<\/td>\r\n<td style=\"height: 15px;width: 229.283px;text-align: center\">125 numbers<\/td>\r\n<\/tr>\r\n<tr style=\"height: 15px\">\r\n<td style=\"height: 15px;width: 139.817px;text-align: center\">20 - 29<\/td>\r\n<td style=\"height: 15px;width: 229.283px;text-align: center\">125 numbers<\/td>\r\n<\/tr>\r\n<tr style=\"height: 15px\">\r\n<td style=\"height: 15px;width: 139.817px;text-align: center\">30 - 39<\/td>\r\n<td style=\"height: 15px;width: 229.283px;text-align: center\">125 numbers<\/td>\r\n<\/tr>\r\n<tr style=\"height: 15px\">\r\n<td style=\"height: 15px;width: 139.817px;text-align: center\">40 - 49<\/td>\r\n<td style=\"height: 15px;width: 229.283px;text-align: center\">125 numbers<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\nTherefore, we can say from this chart that our responses to the research question should be evenly distributed across all of the possible responses.\r\n\r\nSuch a response pattern is consistent with our expected distribution. In other words, in an unbiased research study, we should expect that all possible responses are equally as likely to occur. 
We call this the unbiased null hypothesis, and state it in terms of the frequency of responses in each category k, represented as f<sub>k<\/sub>, as follows:\r\n<h3 style=\"text-align: center\">H<sub>0<\/sub>: f<sub>1<\/sub>\u00a0= f<sub>2<\/sub>\u00a0= f<sub>3<\/sub>\u00a0= f<sub>4<\/sub>\u00a0= f<sub>5<\/sub><\/h3>\r\n<p style=\"text-align: center\">Therefore, based on the null hypothesis, considering that each response category should have an equal number of responses, the formula to compute the expected responses might be as follows:<\/p>\r\n<p class=\"import-NormalWeb\"><span lang=\"en-US\"><img class=\"aligncenter\" alt=\"image\" src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/02\/image1-1.jpeg\" width=\"326\" height=\"120\" \/><\/span><\/p>\r\nNow then let's consider the following example. We asked students to generate 52 weeks of twice-weekly draws of the lotto 649 and then to sort the data so that we could simulate a test of the outcome distribution to determine how random the simulated lottery is at selecting numbers. The observed frequencies for one set of 104 simulated draws are given in the following table.\r\n<table class=\"aligncenter\">\r\n<tbody>\r\n<tr>\r\n<td style=\"text-align: center\">1-9<\/td>\r\n<td style=\"text-align: center\">146<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center\">10-19<\/td>\r\n<td style=\"text-align: center\">155<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center\">20-29<\/td>\r\n<td style=\"text-align: center\">282<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center\">30-39<\/td>\r\n<td style=\"text-align: center\">12<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center\">40-49<\/td>\r\n<td style=\"text-align: center\">29<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nFrom this data, it would appear that a large proportion of the scores were found in the 20-29 range (282 of 624, or 45.2%) and only a few scores were found in the 30-39 range. 
In fact, the 30-39 range held the lowest proportion of the scores (choices drawn): 12 of 624 = 1.9%.\r\n\r\nThe chi-square test is, therefore, a useful statistical test to determine if the overall distribution of the responses in the observed sample is similar to or matches the expected distribution of responses in the target population (the \u201ctarget population\u201d being defined as all scores drawn in this one year of simulated data). The equation below is the basic equation for the goodness of fit chi-square test.\r\n\r\n<img src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/chisquFRM.png\" alt=\"\" class=\"size-full wp-image-432 aligncenter\" width=\"181\" height=\"104\" \/>\r\n\r\nThe equation shown here measures how closely an observed set of responses (the <em>\u201co\u201d<\/em>\u00a0for\u00a0<em>\u201cobserved\u201d<\/em>) matches an expected set of responses (the\u00a0<em>\u201ce\u201d<\/em>\u00a0for\u00a0<em>\u201cexpected\u201d<\/em>).\r\n\r\nSo then how do we calculate the items that we use in the chi-square equation?\r\n\r\nThe observed frequencies are simply taken from the data recording sheet, but the expected frequencies are computed from the following formula:\r\n\r\n<img src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/expfrq-300x110.png\" alt=\"\" class=\"size-medium wp-image-433 aligncenter\" width=\"300\" height=\"110\" \/>\r\n\r\nAnother way to view the computation of the expected frequencies is to consider the null hypothesis, which stated that:\r\n<p style=\"text-align: center\"><strong>H<sub>0<\/sub>: f<sub>1<\/sub>\u00a0= f<sub>2<\/sub>\u00a0= f<sub>3<\/sub>\u00a0= f<sub>4<\/sub>\u00a0= f<sub>5<\/sub><\/strong><\/p>\r\nand multiply the total frequency by the probability associated with each category, as in the following computation.\r\n<p style=\"text-align: center\">624 x 0.20 = 124.8<\/p>\r\nThe chi-square is then used to compute whether or not the 
observed distribution fits a hypothetical or expected distribution. This can be accomplished by applying the formula to each row of the response table. The computation of the first row is shown here:\r\n\r\n<img src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/chi_eqn.png\" alt=\"\" class=\"aligncenter wp-image-437 size-full\" width=\"600\" height=\"50\" \/>\r\n<div align=\"center\">\r\n<table style=\"height: 90px\">\r\n<tbody>\r\n<tr style=\"height: 15px\">\r\n<td style=\"text-align: center;height: 15px;width: 128.567px\">Response Category<\/td>\r\n<td style=\"text-align: center;height: 15px;width: 138.183px\">Observed Frequency<\/td>\r\n<td style=\"text-align: center;height: 15px;width: 136.433px\">Expected Frequency<\/td>\r\n<td style=\"text-align: center;height: 15px;width: 121.967px\">(Obs - Exp)2 \u00f7 Exp<\/td>\r\n<\/tr>\r\n<tr style=\"height: 15px\">\r\n<td style=\"text-align: center;height: 15px;width: 128.567px\"><span style=\"background-color: #ffff00\">1 - 9<\/span><\/td>\r\n<td style=\"text-align: center;height: 15px;width: 138.183px\"><span style=\"background-color: #ffff00\">146<\/span><\/td>\r\n<td style=\"text-align: center;height: 15px;width: 136.433px\"><span style=\"background-color: #ffff00\">124.8<\/span><\/td>\r\n<td style=\"text-align: center;height: 15px;width: 121.967px\"><span style=\"background-color: #ffff00\">3.60<\/span><\/td>\r\n<\/tr>\r\n<tr style=\"height: 15px\">\r\n<td style=\"text-align: center;height: 15px;width: 128.567px\">10 - 19<\/td>\r\n<td style=\"text-align: center;height: 15px;width: 138.183px\">155<\/td>\r\n<td style=\"text-align: center;height: 15px;width: 136.433px\">124.8<\/td>\r\n<td style=\"text-align: center;height: 15px;width: 121.967px\">7.31<\/td>\r\n<\/tr>\r\n<tr style=\"height: 15px\">\r\n<td style=\"text-align: center;height: 15px;width: 128.567px\">20 - 29<\/td>\r\n<td style=\"text-align: center;height: 15px;width: 138.183px\">282<\/td>\r\n<td 
style=\"text-align: center;height: 15px;width: 136.433px\">124.8<\/td>\r\n<td style=\"text-align: center;height: 15px;width: 121.967px\">198.01<\/td>\r\n<\/tr>\r\n<tr style=\"height: 15px\">\r\n<td style=\"text-align: center;height: 15px;width: 128.567px\">30 - 39<\/td>\r\n<td style=\"text-align: center;height: 15px;width: 138.183px\">12<\/td>\r\n<td style=\"text-align: center;height: 15px;width: 136.433px\">124.8<\/td>\r\n<td style=\"text-align: center;height: 15px;width: 121.967px\">101.95<\/td>\r\n<\/tr>\r\n<tr style=\"height: 15px\">\r\n<td style=\"text-align: center;height: 15px;width: 128.567px\">40 - 49<\/td>\r\n<td style=\"text-align: center;height: 15px;width: 138.183px\">29<\/td>\r\n<td style=\"text-align: center;height: 15px;width: 136.433px\">124.8<\/td>\r\n<td style=\"text-align: center;height: 15px;width: 121.967px\">73.54<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\nIn the calculation of the chi-square we see that in each row of the table, the expected score, which represents the scores of the population, is subtracted from the observed score from the sample. For example, in row 1 of the table the expected score of 124.8 is subtracted from the observed score of 146. 
The difference of 21.2 is squared and the outcome is divided by 124.8, and the resulting value is 3.6.\u00a0 The calculation is repeated for each row of the table and the outcomes are added together to produce the chi-square value as shown below.\r\n<div align=\"center\">\r\n<table style=\"width: 504px\">\r\n<tbody>\r\n<tr>\r\n<td style=\"text-align: center;width: 111.567px\">Response Category<\/td>\r\n<td style=\"text-align: center;width: 86.4667px\">Observed Frequency<\/td>\r\n<td style=\"text-align: center;width: 75.4167px\">Expected Frequency<\/td>\r\n<td style=\"text-align: center;width: 173.85px\">(Obs - Exp)2 \u00f7 Exp<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"text-align: center;width: 301.317px\" colspan=\"3\"><img src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/chisquFRM.png\" alt=\"\" class=\"size-full wp-image-432 aligncenter\" width=\"181\" height=\"104\" \/><\/td>\r\n<td style=\"text-align: center;width: 173.85px\">\u00a0\u00a0 3.60\r\n\r\n+ 7.31\r\n\r\n+ 198.01\r\n\r\n+ 101.95\r\n\r\n+ <span style=\"text-decoration: underline\">73.54<\/span>\r\n\r\n<strong>384.41<\/strong><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\nOur next step is then to determine if the chi-square observed value is greater than the chi-square critical value, so that we can make a decision about the significance of the observed distribution.\r\n<div class=\"textbox textbox--key-takeaways\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\"><em>Chi-Square decision rule for the one sample chi-square test. <\/em><\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n<em>The computed score is referred to as the chi-square observed. After computing the chi-square observed value, determine the chi-square critical score from a table of chi square values. The chi-square critical score represents what we should expect to observe for a distribution with five responses. 
The critical value is determined by computing the degrees of freedom for our response set.\u00a0<\/em>\r\n\r\n<em>The computation of the degrees of freedom is:<\/em>\r\n\r\n<em>degrees of freedom = k possible responses -1<\/em>\r\n\r\n<em>degrees of freedom = 5-1<\/em>\r\n\r\n<em>degrees of freedom = 4<\/em>\r\n\r\n<em>and the chi-square critical value for degrees of freedom of 4 at p&lt;0.05 = 9.49<\/em>\r\n\r\n<em>If the chi-square observed value is\u00a0GREATER THAN\u00a0the chi-square critical value of\u00a09.49, we must reject the null hypothesis and state that the distribution of responses across the five categories IS NOT EQUAL. <\/em>A large chi-square value, that is, a value which exceeds the chi-square critical value, indicates that the observed distribution would be unlikely to occur by chance alone if the null hypothesis were true.\r\n\r\n<\/div>\r\n<\/div>\r\nUsing the degrees of freedom formula for a one-sample chi-square, our degrees of freedom are:\r\n\r\ndegrees of freedom = \u201ck\u201d\u00a0possible responses\u00a0-1\r\n\r\ndegrees of freedom = 5-1\r\n\r\ndegrees of freedom = 4\r\n\r\nand the \u201cchi-square critical value\u201d for degrees of freedom of \u201c4\u201d is 9.49\r\n\r\nTherefore, because our chi-square observed value of 384.41 is &gt; the chi-square critical value of 9.49, we must reject the null hypothesis and state that the distribution of responses across the five categories IS NOT EQUAL.\r\n\r\nWe can check our calculations with the following SAS Program. 
This program produces a frequency distribution with chi-square analysis to evaluate the null hypothesis (see above), as well as a pie chart to show the proportion of times a number from each category was drawn in the lotto.\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">ONE SAMPLE GOODNESS OF FIT CHI-SQUARE FOR K=5<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\nPROC FORMAT;\r\nVALUE SLICE 1='#1 to #9' 2='#10 to #19' 3='#20 to #29'\r\n4='#30 to #39' 5='#40 to #49';\r\nDATA GFIT_2;\r\nINPUT LOTTOGRP N_DRAWS;\r\n\/* DEFINE THE AXIS CHARACTERISTICS *\/\r\nAXIS1 LABEL=(\"LOTTO CATEGORIES\")\r\nVALUE=(JUSTIFY=CENTER);\r\nAXIS2 LABEL=(ANGLE=90 \"N TIMES CATEGORY VALUE DRAWN\")\r\nORDER=(0 TO 1000 BY 100)\r\nMINOR=(N=3);\r\nAXIS3 LABEL=(ANGLE=90 \"LOTTO CATEGORIES\");\r\n\r\nAXIS4 LABEL=(\"N TIMES CATEGORY VALUE DRAWN\") ;\r\nDATALINES;\r\n1 146\r\n2 155\r\n3 282\r\n4 12\r\n5 29\r\n;\r\n\/* WEIGHT TREATS N_DRAWS AS THE CELL COUNT FOR EACH CATEGORY *\/\r\nPROC FREQ DATA=GFIT_2 ORDER=DATA; TABLES LOTTOGRP\/CHISQ CL CELLCHI2;\r\nWEIGHT N_DRAWS;\r\nFORMAT LOTTOGRP SLICE. ;\r\nTITLE 'FREQUENCY DISTRIBUTION FOR N TIMES CATEGORY VALUE WAS DRAWN';\r\nTITLE2 'ONE SAMPLE GOODNESS OF FIT EXAMPLE K=5';\r\nRUN;\r\n\/* HERE WE USE THE OPTION SUMVAR TO GRAPH THE SUM OF THE FREQ *\/\r\nPROC GCHART DATA=GFIT_2;\r\nPIE3D LOTTOGRP\/SUMVAR=N_DRAWS TYPE=SUM DISCRETE PERCENT=ARROW\r\nCOUTLINE=RED WOUTLINE=1 FILL=solid SLICE = arrow clockwise\r\nnoLEGEND noheading value=none;\r\nFORMAT LOTTOGRP SLICE. 
;\r\nTITLE1 'PIE CHART FOR N TIMES CATEGORY VALUE WAS DRAWN';\r\nPATTERN1 COLOR = LIGHTBLUE;\r\nRun;\r\n\r\n<\/div>\r\n<\/div>\r\nThe output for the frequency distribution with corresponding chi-square is shown here:\r\n\r\n<section><article aria-label=\"One-Way Frequencies\">\r\n<div class=\"proc_title_group\">\r\n<p class=\"c proctitle\">The FREQ Procedure<\/p>\r\n\r\n<\/div>\r\n<section><article aria-label=\"One-Way Frequencies\">\r\n<table class=\"table\" aria-label=\"One-Way Frequencies\"><caption aria-label=\"One-Way Frequencies\">\u00a0<\/caption><colgroup> <col \/><\/colgroup> <colgroup> <col \/> <col \/> <col \/> <col \/><\/colgroup>\r\n<thead>\r\n<tr>\r\n<th class=\"r b header\" scope=\"col\">LOTTOGRP<\/th>\r\n<th class=\"r b header\" scope=\"col\">Frequency<\/th>\r\n<th class=\"r b header\" scope=\"col\">Percent<\/th>\r\n<th class=\"r b header\" scope=\"col\">Cumulative\r\nFrequency<\/th>\r\n<th class=\"r b header\" scope=\"col\">Cumulative\r\nPercent<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<th class=\"r rowheader\" scope=\"row\">#1 to #9<\/th>\r\n<td class=\"r data\">146<\/td>\r\n<td class=\"r data\">23.40<\/td>\r\n<td class=\"r data\">146<\/td>\r\n<td class=\"r data\">23.40<\/td>\r\n<\/tr>\r\n<tr>\r\n<th class=\"r rowheader\" scope=\"row\">#10 to #19<\/th>\r\n<td class=\"r data\">155<\/td>\r\n<td class=\"r data\">24.84<\/td>\r\n<td class=\"r data\">301<\/td>\r\n<td class=\"r data\">48.24<\/td>\r\n<\/tr>\r\n<tr>\r\n<th class=\"r rowheader\" scope=\"row\">#20 to #29<\/th>\r\n<td class=\"r data\">282<\/td>\r\n<td class=\"r data\">45.19<\/td>\r\n<td class=\"r data\">583<\/td>\r\n<td class=\"r data\">93.43<\/td>\r\n<\/tr>\r\n<tr>\r\n<th class=\"r rowheader\" scope=\"row\">#30 to #39<\/th>\r\n<td class=\"r data\">12<\/td>\r\n<td class=\"r data\">1.92<\/td>\r\n<td class=\"r data\">595<\/td>\r\n<td class=\"r data\">95.35<\/td>\r\n<\/tr>\r\n<tr>\r\n<th class=\"r rowheader\" scope=\"row\">#40 to #49<\/th>\r\n<td class=\"r data\">29<\/td>\r\n<td 
class=\"r data\">4.65<\/td>\r\n<td class=\"r data\">624<\/td>\r\n<td class=\"r data\">100.00<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/article><article id=\"IDX1\" aria-label=\"One-Way Chi-Square Test\">\r\n<table class=\"table\" aria-label=\"One-Way Chi-Square Test\"><caption aria-label=\"One-Way Chi-Square Test\">\u00a0<\/caption><colgroup> <col \/> <col \/><\/colgroup>\r\n<thead>\r\n<tr>\r\n<th class=\"c b header\" colspan=\"2\" scope=\"colgroup\">Chi-Square Test\r\nfor Equal Proportions<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<th class=\"rowheader\" scope=\"row\">Chi-Square<\/th>\r\n<td class=\"r data\">384.4135<\/td>\r\n<\/tr>\r\n<tr>\r\n<th class=\"rowheader\" scope=\"row\">DF<\/th>\r\n<td class=\"r data\">4<\/td>\r\n<\/tr>\r\n<tr>\r\n<th class=\"rowheader\" scope=\"row\">Pr &gt; ChiSq<\/th>\r\n<td class=\"r data\">&lt;.0001<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/article><\/section><\/article><article id=\"IDX1\" aria-label=\"One-Way Chi-Square Test\"><\/article><\/section>\r\n<h5>The pie chart for the number of times a value was drawn within each category, expressed as a percent is shown here.<\/h5>\r\n<h5><img src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/pie3.png\" alt=\"\" class=\"aligncenter wp-image-441 size-full\" width=\"756\" height=\"299\" \/><\/h5>\r\n\r\n<hr \/>\r\n\r\n<h5>Webulator Form 1:<\/h5>\r\n<span>The following is a Goodness of Fit Webulator for k= 5 responses In the example above our raw data values for the cumulative times that a number was drawn from each category of the Lotto is shown here:<\/span>\r\n<table class=\"aligncenter\" style=\"width: 340px\" aria-label=\"One-Way Frequencies\"><caption aria-label=\"One-Way Frequencies\"><span>Distribution of Draws per Category<\/span><\/caption><colgroup> <col \/><\/colgroup> <colgroup> <col \/> <col \/> <col \/> <col \/><\/colgroup>\r\n<thead>\r\n<tr>\r\n<th class=\"r rowheader\" style=\"width: 223.4px;text-align: 
center\" scope=\"row\"><span>#1 to #9\r\n<\/span><\/th>\r\n<td class=\"r data\" style=\"width: 145.7px\"><span>146<\/span><\/td>\r\n<\/tr>\r\n<tr>\r\n<th class=\"r rowheader\" style=\"width: 223.4px;text-align: center\" scope=\"row\"><span>#10 to #19\r\n<\/span><\/th>\r\n<td class=\"r data\" style=\"width: 145.7px\"><span>155<\/span><\/td>\r\n<\/tr>\r\n<tr>\r\n<th class=\"r rowheader\" style=\"width: 223.4px;text-align: center\" scope=\"row\"><span>#20 to #29<\/span><\/th>\r\n<td class=\"r data\" style=\"width: 145.7px\"><span>282<\/span><\/td>\r\n<\/tr>\r\n<tr>\r\n<th class=\"r rowheader\" style=\"width: 223.4px;text-align: center\" scope=\"row\"><span>#30 to #39<\/span><\/th>\r\n<td class=\"r data\" style=\"width: 145.7px\"><span>12<\/span><\/td>\r\n<\/tr>\r\n<tr>\r\n<th style=\"width: 223.4px;text-align: center\"><span>#40 to #49<\/span><\/th>\r\n<td style=\"width: 145.7px\"><span>29\r\n<\/span><\/td>\r\n<\/tr>\r\n<\/thead>\r\n<\/table>\r\n<p align=\"left\"><span>Enter these data into the webulator below for each of your category options and then click the button labeled <strong><em>CLICK ME<\/em><\/strong>. This will produce the sum of the five values that you entered and compute the expected frequency for the values in the table.<\/span><\/p>\r\n<p align=\"left\"><span><code>[h5p id=\"5\"]<\/code><\/span><\/p>\r\nThe important value from this <i>Webulator<\/i> is the computed chi-square score. The computed score is referred to as the chi-square observed. After computing the chi-square observed value, determine the chi-square critical score from a table of chi-square values. The chi-square critical score represents what we should expect to observe for the distribution with \"k\" responses. 
The critical value is determined by computing the \u201cdegrees of freedom\u201d for our response set.\r\n\r\nThe computation of the degrees of freedom is: degrees of freedom = \u201ck\u201d possible responses -1\r\n\r\ndegrees of freedom = 5-1 --&gt; degrees of freedom = 4\r\n<p align=\"center\">and the \u201cchi-square critical value\u201d for degrees of freedom of \u201c4\u201d at p&lt;0.05 = 9.488<\/p>\r\nIf the \u201cchi-square observed value\u201d is<strong>\u00a0 &gt;<\/strong>\u00a0 the \u201cchi-square critical value of <b>9.488<\/b>\u201d, we must reject the null hypothesis and state that the distribution of responses across the response categories IS NOT EQUAL.\r\n<h4>If you would like to use the Webulators for your own applications, without this text, visit: https:\/\/health.ahs.upei.ca\/webulators\/w_menu.php<\/h4>\r\n<h4>This Webulator application to compute the one sample goodness of fit with k=5 is available at\u00a0 <a href=\"https:\/\/health.ahs.upei.ca\/webulators\/goodfit2.php\"><code>https:\/\/health.ahs.upei.ca\/webulators\/goodfit2.php<\/code><\/a><\/h4>\r\n<\/div>","rendered":"<div>\n<p>In the following example, we consider the goodness of fit chi-square test with <strong>five<\/strong> response categories.<\/p>\n<p>The biweekly lottery \u2013 Lotto 649 provides players with an opportunity to win millions of dollars if they can select the set of six numbers that are randomly drawn from the set of numbers from 1 to 49.\u00a0 Since the lottery is purported to be random, the chance associated with a player\u2019s single ticket matching the six numbers drawn at random is based on the combinatorial formula for 49 choose 6 and has a probability of 1 \u00f7 <sup>49<\/sup>C<sub>6<\/sub>.<\/p>\n<p>The probability associated with every single ticket is the same and is 1 in 13,983,816 possible combinations of 6 numbers. 
So then, what if we wanted to test the randomness of this lottery?\u00a0 In the following example, we will use the chi-square goodness of fit test to determine whether the numbers are drawn at random, that is, whether there is no a priori pattern in which numbers from one range within the set of 49 occur more frequently than numbers from another range.<\/p>\n<p>To begin we need to organize the range of possible outcomes into manageable categories that can be processed with the chi-square goodness of fit test.\u00a0 Given that the range of all possible outcomes for the lotto is from 1 to 49, we can organize the potential sampling space (1 to 49) into 5 categories as follows: 1-9, 10-19, 20-29, 30-39, 40-49.<\/p>\n<p>Further, if we wanted to test randomness, then we would need to sample more than a single week of numbers. Considering that there are 104 draws per year, we could use an entire year\u2019s worth of data to establish the frequency distribution of numbers drawn and, after organizing the outcomes into the 5 categories, compute the chi-square goodness of fit statistic.<\/p>\n<p>Step 1: after establishing that there are 5 categories for the outcome frequency distribution, we would expect the responses to be distributed equally across all of the possible response categories, as follows:<\/p>\n<p>Data represent the actual numbers that are drawn in a single year for the lotto 649.\u00a0 That is, in any given year there are 104 draws, which is based on 2 draws per week for 52 weeks.\u00a0 Therefore, in the lotto 649 example we have 624 numbers drawn (2 draws per week x 52 weeks = 104 draws, and 6 numbers x 104 draws = 624). 
These data can then be organized into the following 5 categories to represent the set of all possible numbers drawn in the one year so that a frequency distribution chart of the responses might look like this:<\/p>\n<div style=\"margin: auto;\">\n<table class=\"aligncenter\" style=\"height: 75px\">\n<tbody>\n<tr style=\"height: 15px\">\n<td style=\"height: 15px;width: 139.817px;text-align: center\">1 &#8211; 9<\/td>\n<td style=\"height: 15px;width: 229.283px;text-align: center\">124 numbers<\/td>\n<\/tr>\n<tr style=\"height: 15px\">\n<td style=\"height: 15px;width: 139.817px;text-align: center\">10 &#8211; 19<\/td>\n<td style=\"height: 15px;width: 229.283px;text-align: center\">125 numbers<\/td>\n<\/tr>\n<tr style=\"height: 15px\">\n<td style=\"height: 15px;width: 139.817px;text-align: center\">20 &#8211; 29<\/td>\n<td style=\"height: 15px;width: 229.283px;text-align: center\">125 numbers<\/td>\n<\/tr>\n<tr style=\"height: 15px\">\n<td style=\"height: 15px;width: 139.817px;text-align: center\">30 &#8211; 39<\/td>\n<td style=\"height: 15px;width: 229.283px;text-align: center\">125 numbers<\/td>\n<\/tr>\n<tr style=\"height: 15px\">\n<td style=\"height: 15px;width: 139.817px;text-align: center\">40 &#8211; 49<\/td>\n<td style=\"height: 15px;width: 229.283px;text-align: center\">125 numbers<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>Therefore, we can say from this chart that our responses to the research question should be evenly distributed across all of the possible responses.<\/p>\n<p>Such a response pattern is consistent with our expected distribution. In other words, in an unbiased research study, we should expect that all possible responses are equally as likely to occur. 
We call this the unbiased null hypothesis, and state it in terms of the frequency of responses in each category k, represented as f<sub>k<\/sub>, as follows:<\/p>\n<h3 style=\"text-align: center\">H<sub>0<\/sub>: f<sub>1<\/sub>\u00a0= f<sub>2<\/sub>\u00a0= f<sub>3<\/sub>\u00a0= f<sub>4<\/sub>\u00a0= f<sub>5<\/sub><\/h3>\n<p style=\"text-align: center\">Therefore, based on the null hypothesis, considering that each response category should have an equal number of responses, the formula to compute the expected responses might be as follows:<\/p>\n<p class=\"import-NormalWeb\"><span lang=\"en-US\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" alt=\"image\" src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/02\/image1-1.jpeg\" width=\"326\" height=\"120\" \/><\/span><\/p>\n<p>Now then let&#8217;s consider the following example. We asked students to generate 52 weeks of twice-weekly draws of the lotto 649 and then to sort the data so that we could simulate a test of the outcome distribution to determine how random the simulated lottery is at selecting numbers. The observed frequencies for one set of 104 simulated draws are given in the following table.<\/p>\n<table class=\"aligncenter\">\n<tbody>\n<tr>\n<td style=\"text-align: center\">1-9<\/td>\n<td style=\"text-align: center\">146<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center\">10-19<\/td>\n<td style=\"text-align: center\">155<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center\">20-29<\/td>\n<td style=\"text-align: center\">282<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center\">30-39<\/td>\n<td style=\"text-align: center\">12<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center\">40-49<\/td>\n<td style=\"text-align: center\">29<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>From this data, it would appear that a large proportion of the scores were found in the 20-29 range (282 of 624, or 45.2%) and only a few scores were found in the 30-39 range. 
In fact, the 30-39 range held the lowest proportion of the scores (choices drawn): 12 of 624 = 1.9%.<\/p>\n<p>The chi-square test is, therefore, a useful statistical test to determine if the overall distribution of the responses in the observed sample is similar to or matches the expected distribution of responses in the target population (the \u201ctarget population\u201d being defined as all scores drawn in this one year of simulated data). The equation below is the basic equation for the goodness of fit chi-square test.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/chisquFRM.png\" alt=\"\" class=\"size-full wp-image-432 aligncenter\" width=\"181\" height=\"104\" srcset=\"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/chisquFRM.png 181w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/chisquFRM-65x37.png 65w\" sizes=\"auto, (max-width: 181px) 100vw, 181px\" \/><\/p>\n<p>The equation shown here measures how closely an observed set of responses (the <em>\u201co\u201d<\/em>\u00a0for\u00a0<em>\u201cobserved\u201d<\/em>) matches an expected set of responses (the\u00a0<em>\u201ce\u201d<\/em>\u00a0for\u00a0<em>\u201cexpected\u201d<\/em>).<\/p>\n<p>So then how do we calculate the items that we use in the chi-square equation?<\/p>\n<p>The observed frequencies are simply taken from the data recording sheet, but the expected frequencies are computed from the following formula:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/expfrq-300x110.png\" alt=\"\" class=\"size-medium wp-image-433 aligncenter\" width=\"300\" height=\"110\" srcset=\"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/expfrq-300x110.png 300w, 
https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/expfrq-65x24.png 65w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/expfrq-225x83.png 225w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/expfrq-350x128.png 350w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/expfrq.png 379w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/p>\n<p>Another way to view the computation of the expected frequencies is to consider the null hypothesis which stated that:<\/p>\n<p style=\"text-align: center\"><strong>H<sub>0<\/sub>: f<sub>1<\/sub>\u00a0= f<sub>2<\/sub>\u00a0= f<sub>3<\/sub>\u00a0= f<sub>4<\/sub>\u00a0= f<sub>5<\/sub><\/strong><\/p>\n<p>and multiply the total frequency by the probability associated with each category, as in the following computations.<\/p>\n<p style=\"text-align: center\">624 x 0.20 = 124.8<\/p>\n<p>The chi-square is then used to compute whether or not the observed distribution fits a hypothetical or expected distribution. This can be accomplished by applying the formula to each row of the response table. 
The computation of the first row is shown here:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/chi_eqn.png\" alt=\"\" class=\"aligncenter wp-image-437 size-full\" width=\"600\" height=\"50\" srcset=\"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/chi_eqn.png 600w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/chi_eqn-300x25.png 300w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/chi_eqn-65x5.png 65w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/chi_eqn-225x19.png 225w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/chi_eqn-350x29.png 350w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/p>\n<div style=\"margin: auto;\">\n<table style=\"height: 90px\">\n<tbody>\n<tr style=\"height: 15px\">\n<td style=\"text-align: center;height: 15px;width: 128.567px\">Response Category<\/td>\n<td style=\"text-align: center;height: 15px;width: 138.183px\">Observed Frequency<\/td>\n<td style=\"text-align: center;height: 15px;width: 136.433px\">Expected Frequency<\/td>\n<td style=\"text-align: center;height: 15px;width: 121.967px\">(Obs &#8211; Exp)<sup>2<\/sup> \u00f7 Exp<\/td>\n<\/tr>\n<tr style=\"height: 15px\">\n<td style=\"text-align: center;height: 15px;width: 128.567px\"><span style=\"background-color: #ffff00\">1 &#8211; 9<\/span><\/td>\n<td style=\"text-align: center;height: 15px;width: 138.183px\"><span style=\"background-color: #ffff00\">146<\/span><\/td>\n<td style=\"text-align: center;height: 15px;width: 136.433px\"><span style=\"background-color: #ffff00\">124.8<\/span><\/td>\n<td style=\"text-align: center;height: 15px;width: 121.967px\"><span style=\"background-color: #ffff00\">3.60<\/span><\/td>\n<\/tr>\n<tr style=\"height: 15px\">\n<td 
style=\"text-align: center;height: 15px;width: 128.567px\">10 &#8211; 19<\/td>\n<td style=\"text-align: center;height: 15px;width: 138.183px\">155<\/td>\n<td style=\"text-align: center;height: 15px;width: 136.433px\">124.8<\/td>\n<td style=\"text-align: center;height: 15px;width: 121.967px\">7.31<\/td>\n<\/tr>\n<tr style=\"height: 15px\">\n<td style=\"text-align: center;height: 15px;width: 128.567px\">20 &#8211; 29<\/td>\n<td style=\"text-align: center;height: 15px;width: 138.183px\">282<\/td>\n<td style=\"text-align: center;height: 15px;width: 136.433px\">124.8<\/td>\n<td style=\"text-align: center;height: 15px;width: 121.967px\">198.01<\/td>\n<\/tr>\n<tr style=\"height: 15px\">\n<td style=\"text-align: center;height: 15px;width: 128.567px\">30 &#8211; 39<\/td>\n<td style=\"text-align: center;height: 15px;width: 138.183px\">12<\/td>\n<td style=\"text-align: center;height: 15px;width: 136.433px\">124.8<\/td>\n<td style=\"text-align: center;height: 15px;width: 121.967px\">101.95<\/td>\n<\/tr>\n<tr style=\"height: 15px\">\n<td style=\"text-align: center;height: 15px;width: 128.567px\">40 &#8211; 49<\/td>\n<td style=\"text-align: center;height: 15px;width: 138.183px\">29<\/td>\n<td style=\"text-align: center;height: 15px;width: 136.433px\">124.8<\/td>\n<td style=\"text-align: center;height: 15px;width: 121.967px\">73.54<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>In the calculation of the chi-square we see that in each row of the table, the expected score that represents the scores of the population is subtracted from the observed score from the sample. For example, in row 1 of the table the expected score of 124.8 is subtracted from the observed score of\u00a0146. 
The difference of 21.2 is squared and the outcome is divided by 124.8, and the resulting value is 3.60.\u00a0 The calculation is repeated for each row of the table and the outcomes are added together to produce the chi-square value as shown below.<\/p>\n<div style=\"margin: auto;\">\n<table style=\"width: 504px\">\n<tbody>\n<tr>\n<td style=\"text-align: center;width: 111.567px\">Response Category<\/td>\n<td style=\"text-align: center;width: 86.4667px\">Observed Frequency<\/td>\n<td style=\"text-align: center;width: 75.4167px\">Expected Frequency<\/td>\n<td style=\"text-align: center;width: 173.85px\">(Obs &#8211; Exp)<sup>2<\/sup> \u00f7 Exp<\/td>\n<\/tr>\n<tr>\n<td style=\"text-align: center;width: 301.317px\" colspan=\"3\"><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/chisquFRM.png\" alt=\"\" class=\"size-full wp-image-432 aligncenter\" width=\"181\" height=\"104\" srcset=\"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/chisquFRM.png 181w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/chisquFRM-65x37.png 65w\" sizes=\"auto, (max-width: 181px) 100vw, 181px\" \/><\/td>\n<td style=\"text-align: center;width: 173.85px\">\u00a0\u00a0 3.60<\/p>\n<p>+ 7.31<\/p>\n<p>+ 198.01<\/p>\n<p>+ 101.95<\/p>\n<p>+ <span style=\"text-decoration: underline\">73.54<\/span><\/p>\n<p><strong>384.41<\/strong><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>Our next step is then to determine if the chi-square observed value is greater than the chi-square critical value, so that we can make a decision about the significance of the observed distribution.<\/p>\n<div class=\"textbox textbox--key-takeaways\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\"><em>Chi-Square decision rule for the one sample chi-square test. 
<\/em><\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p><em>The computed score is referred to as the chi-square observed. After computing the chi-square observed value, determine the chi-square critical score from a table of chi-square values. The chi-square critical score is the value that the chi-square observed would exceed only 5% of the time if the null hypothesis were true for a distribution with five responses. The critical value is determined by computing the degrees of freedom for our response set.\u00a0<\/em><\/p>\n<p><em>The computation of the degrees of freedom is:<\/em><\/p>\n<p><em>degrees of freedom = k possible responses -1<\/em><\/p>\n<p><em>degrees of freedom = 5-1<\/em><\/p>\n<p><em>degrees of freedom = 4<\/em><\/p>\n<p><em>and the chi-square critical value for degrees of freedom of 4 at p&lt;0.05 = 9.49<\/em><\/p>\n<p><em>If the chi-square observed value is\u00a0GREATER THAN\u00a0the chi-square critical value of\u00a09.49, we must reject the null hypothesis and state that the distribution of responses across the five categories IS NOT EQUAL. <\/em>A large chi-square value, that is, a value which exceeds the chi-square critical value, indicates that the observed distribution is unlikely to have occurred by chance if the null hypothesis were true.<\/p>\n<\/div>\n<\/div>\n<p>Using the degrees of freedom formula for a one-sample chi-square, our degrees of freedom are:<\/p>\n<p>degrees of freedom = \u201ck\u201d\u00a0possible responses\u00a0-1<\/p>\n<p>degrees of freedom = 5-1<\/p>\n<p>degrees of freedom = 4<\/p>\n<p>and the \u201cchi-square critical value\u201d for degrees of freedom of \u201c4\u201d at p&lt;0.05 is 9.49<\/p>\n<p>Therefore, because our chi-square observed value of 384.41 is greater than the chi-square critical value of 9.49, we must reject the null hypothesis and state that the distribution of responses across the five categories IS NOT EQUAL.<\/p>\n<p>We can check our calculations with the following SAS Program. 
This program produces a frequency distribution with chi-square analysis to evaluate the null hypothesis (see above), as well as a pie chart to show the proportion of times a number from each category was drawn in the lotto.<\/p>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">ONE SAMPLE GOODNESS OF FIT CHI-SQUARE FOR K=5<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>PROC FORMAT;<br \/>\nVALUE SLICE 1='#1 to #9' 2='#10 to #19' 3='#20 to #29'<br \/>\n4='#30 to #39' 5='#40 to #49';<br \/>\nDATA GFIT_2;<br \/>\nINPUT LOTTOGRP N_DRAWS;<br \/>\n\/* DEFINE THE AXIS CHARACTERISTICS *\/<br \/>\nAXIS1 LABEL=(&quot;LOTTO CATEGORIES&quot;)<br \/>\nVALUE=(JUSTIFY=CENTER);<br \/>\nAXIS2 LABEL=(ANGLE=90 &quot;N TIMES CATEGORY VALUE DRAWN&quot;)<br \/>\nORDER=(0 TO 1000 BY 100)<br \/>\nMINOR=(N=3);<br \/>\nAXIS3 LABEL=(ANGLE=90 &quot;LOTTO CATEGORIES&quot;);<\/p>\n<p>AXIS4 LABEL=(&quot;N TIMES CATEGORY VALUE DRAWN&quot;) ;<br \/>\nDATALINES;<br \/>\n1 146<br \/>\n2 155<br \/>\n3 282<br \/>\n4 12<br \/>\n5 29<br \/>\n;<br \/>\n\/* THE WEIGHT STATEMENT TREATS N_DRAWS AS THE FREQUENCY COUNT FOR EACH CATEGORY *\/<br \/>\nPROC FREQ ORDER=DATA; TABLES LOTTOGRP\/CHISQ CL CELLCHI2;<br \/>\nWEIGHT N_DRAWS;<br \/>\nFORMAT LOTTOGRP SLICE. ;<br \/>\nTITLE 'FREQUENCY DISTRIBUTION FOR N TIMES CATEGORY VALUE WAS DRAWN';<br \/>\nTITLE2 'ONE SAMPLE GOODNESS OF FIT EXAMPLE K=5';<br \/>\nRUN;<br \/>\n\/* HERE WE USE THE OPTION SUMVAR TO GRAPH THE SUM OF THE FREQ *\/<br \/>\nPROC GCHART DATA=GFIT_2;<br \/>\nPIE3D LOTTOGRP\/SUMVAR=N_DRAWS TYPE=SUM DISCRETE PERCENT=ARROW<br \/>\nCOUTLINE=RED WOUTLINE=1 FILL=solid SLICE = arrow clockwise<br \/>\nnoLEGEND noheading value=none;<br \/>\nFORMAT LOTTOGRP SLICE. 
;<br \/>\nTITLE1 &#8216;PIE CHART FOR N TIMES CATEGORY VALUE WAS DRAWN&#8217;;<br \/>\nPATTERN1 COLOR = LIGHTBLUE;<br \/>\nRun;<\/p>\n<\/div>\n<\/div>\n<p>The output for the frequency distribution with corresponding chi-square is shown here:<\/p>\n<section>\n<article aria-label=\"One-Way Frequencies\">\n<div class=\"proc_title_group\">\n<p class=\"c proctitle\">The FREQ Procedure<\/p>\n<\/div>\n<section>\n<article aria-label=\"One-Way Frequencies\">\n<table class=\"table\" aria-label=\"One-Way Frequencies\">\n<caption aria-label=\"One-Way Frequencies\">\u00a0<\/caption>\n<colgroup>\n<col \/><\/colgroup>\n<colgroup>\n<col \/>\n<col \/>\n<col \/>\n<col \/><\/colgroup>\n<thead>\n<tr>\n<th class=\"r b header\" scope=\"col\">LOTTOGRP<\/th>\n<th class=\"r b header\" scope=\"col\">Frequency<\/th>\n<th class=\"r b header\" scope=\"col\">Percent<\/th>\n<th class=\"r b header\" scope=\"col\">Cumulative<br \/>\nFrequency<\/th>\n<th class=\"r b header\" scope=\"col\">Cumulative<br \/>\nPercent<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th class=\"r rowheader\" scope=\"row\">#1 to #9<\/th>\n<td class=\"r data\">146<\/td>\n<td class=\"r data\">23.40<\/td>\n<td class=\"r data\">146<\/td>\n<td class=\"r data\">23.40<\/td>\n<\/tr>\n<tr>\n<th class=\"r rowheader\" scope=\"row\">#10 to #19<\/th>\n<td class=\"r data\">155<\/td>\n<td class=\"r data\">24.84<\/td>\n<td class=\"r data\">301<\/td>\n<td class=\"r data\">48.24<\/td>\n<\/tr>\n<tr>\n<th class=\"r rowheader\" scope=\"row\">#20 to #29<\/th>\n<td class=\"r data\">282<\/td>\n<td class=\"r data\">45.19<\/td>\n<td class=\"r data\">583<\/td>\n<td class=\"r data\">93.43<\/td>\n<\/tr>\n<tr>\n<th class=\"r rowheader\" scope=\"row\">#30 to #39<\/th>\n<td class=\"r data\">12<\/td>\n<td class=\"r data\">1.92<\/td>\n<td class=\"r data\">595<\/td>\n<td class=\"r data\">95.35<\/td>\n<\/tr>\n<tr>\n<th class=\"r rowheader\" scope=\"row\">#40 to #49<\/th>\n<td class=\"r data\">29<\/td>\n<td class=\"r data\">4.65<\/td>\n<td class=\"r 
data\">624<\/td>\n<td class=\"r data\">100.00<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/article>\n<article id=\"IDX1\" aria-label=\"One-Way Chi-Square Test\">\n<table class=\"table\" aria-label=\"One-Way Chi-Square Test\">\n<caption aria-label=\"One-Way Chi-Square Test\">\u00a0<\/caption>\n<colgroup>\n<col \/>\n<col \/><\/colgroup>\n<thead>\n<tr>\n<th class=\"c b header\" colspan=\"2\" scope=\"colgroup\">Chi-Square Test<br \/>\nfor Equal Proportions<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th class=\"rowheader\" scope=\"row\">Chi-Square<\/th>\n<td class=\"r data\">384.4135<\/td>\n<\/tr>\n<tr>\n<th class=\"rowheader\" scope=\"row\">DF<\/th>\n<td class=\"r data\">4<\/td>\n<\/tr>\n<tr>\n<th class=\"rowheader\" scope=\"row\">Pr &gt; ChiSq<\/th>\n<td class=\"r data\">&lt;.0001<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/article>\n<\/section>\n<\/article>\n<article aria-label=\"One-Way Chi-Square Test\"><\/article>\n<\/section>\n<h5>The pie chart for the number of times a value was drawn within each category, expressed as a percent is shown here.<\/h5>\n<h5><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/pie3.png\" alt=\"\" class=\"aligncenter wp-image-441 size-full\" width=\"756\" height=\"299\" srcset=\"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/pie3.png 756w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/pie3-300x119.png 300w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/pie3-65x26.png 65w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/pie3-225x89.png 225w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/pie3-350x138.png 350w\" sizes=\"auto, (max-width: 756px) 100vw, 756px\" \/><\/h5>\n<hr \/>\n<h5>Webulator Form 1:<\/h5>\n<p><span>The following is a 
Goodness of Fit Webulator for k = 5 responses. In the example above, the raw data values for the cumulative number of times that a number was drawn from each category of the Lotto are shown here:<\/span><\/p>\n<table class=\"aligncenter\" style=\"width: 340px\" aria-label=\"One-Way Frequencies\">\n<caption aria-label=\"One-Way Frequencies\"><span>Distribution of Draws per Category<\/span><\/caption>\n<colgroup>\n<col \/><\/colgroup>\n<colgroup>\n<col \/>\n<col \/>\n<col \/>\n<col \/><\/colgroup>\n<thead>\n<tr>\n<th class=\"r rowheader\" style=\"width: 223.4px;text-align: center\" scope=\"row\"><span>#1 to #9<br \/>\n<\/span><\/th>\n<td class=\"r data\" style=\"width: 145.7px\"><span>146<\/span><\/td>\n<\/tr>\n<tr>\n<th class=\"r rowheader\" style=\"width: 223.4px;text-align: center\" scope=\"row\"><span>#10 to #19<br \/>\n<\/span><\/th>\n<td class=\"r data\" style=\"width: 145.7px\"><span>155<\/span><\/td>\n<\/tr>\n<tr>\n<th class=\"r rowheader\" style=\"width: 223.4px;text-align: center\" scope=\"row\"><span>#20 to #29<\/span><\/th>\n<td class=\"r data\" style=\"width: 145.7px\"><span>282<\/span><\/td>\n<\/tr>\n<tr>\n<th class=\"r rowheader\" style=\"width: 223.4px;text-align: center\" scope=\"row\"><span>#30 to #39<\/span><\/th>\n<td class=\"r data\" style=\"width: 145.7px\"><span>12<\/span><\/td>\n<\/tr>\n<tr>\n<th style=\"width: 223.4px;text-align: center\"><span>#40 to #49<\/span><\/th>\n<td style=\"width: 145.7px\"><span>29<br \/>\n<\/span><\/td>\n<\/tr>\n<\/thead>\n<\/table>\n<p style=\"text-align: left;\"><span>Enter these data into the webulator below for each of your category options and then click the button labeled <strong><em>CLICK ME<\/em><\/strong>. 
This will produce the sum of the five values that you entered and compute the expected frequency for the values in the table.<\/span><\/p>\n<p style=\"text-align: left;\"><span><code><\/p>\n<div id=\"h5p-5\">\n<div class=\"h5p-content\" data-content-id=\"5\"><\/div>\n<\/div>\n<p><\/code><\/span><\/p>\n<p>The important value from this <i>Webulator<\/i> is the computed chi-square score. The computed score is referred to as the chi-square observed. After computing the chi-square observed value, determine the chi-square critical score from a table of chi-square values. The chi-square critical score is the value that the chi-square observed would exceed only 5% of the time if the null hypothesis were true for a distribution with &#8220;k&#8221; responses. The critical value is determined by computing the \u201cdegrees of freedom\u201d for our response set.<\/p>\n<p>The computation of the degrees of freedom is: degrees of freedom = \u201ck\u201d possible responses -1<\/p>\n<p>degrees of freedom = 5-1 &#8211;&gt; degrees of freedom = 4<\/p>\n<p style=\"text-align: center;\">and the \u201cchi-square critical value\u201d for degrees of freedom of \u201c4\u201d at p&lt;0.05 = 9.49<\/p>\n<p>If the \u201cchi-square observed value\u201d is<strong>\u00a0 &gt;<\/strong>\u00a0 the \u201cchi-square critical value of <b>9.49<\/b>\u201d, we must reject the null hypothesis and state that the distribution of responses across the response categories IS NOT EQUAL.<\/p>\n<h4>If you would like to use the Webulators for your own applications, without this text, visit: https:\/\/health.ahs.upei.ca\/webulators\/w_menu.php<\/h4>\n<h4>This Webulator application to compute the one sample goodness of fit with k=5 is available at\u00a0 <a 
href=\"https:\/\/health.ahs.upei.ca\/webulators\/goodfit2.php\"><code>https:\/\/health.ahs.upei.ca\/webulators\/goodfit2.php<\/code><\/a><\/h4>\n<\/div>\n","protected":false},"author":56,"menu_order":3,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-382","chapter","type-chapter","status-publish","hentry"],"part":34,"_links":{"self":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapters\/382","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/wp\/v2\/users\/56"}],"version-history":[{"count":16,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapters\/382\/revisions"}],"predecessor-version":[{"id":446,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapters\/382\/revisions\/446"}],"part":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/parts\/34"}],"metadata":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapters\/382\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/wp\/v2\/media?parent=382"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapter-type?post=382"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/wp\/v2\/contributor?post=382"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/wp\/v2\/license
?post=382"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}