{"id":375,"date":"2020-04-10T07:42:35","date_gmt":"2020-04-10T11:42:35","guid":{"rendered":"http:\/\/pressbooks.library.upei.ca\/montelpare\/?post_type=chapter&#038;p=375"},"modified":"2020-08-24T14:16:21","modified_gmt":"2020-08-24T18:16:21","slug":"statistical-analysis-when-data-are-counts","status":"publish","type":"chapter","link":"https:\/\/pressbooks.library.upei.ca\/montelpare\/chapter\/statistical-analysis-when-data-are-counts\/","title":{"raw":"Introducing the Goodness of Fit Chi-Square","rendered":"Introducing the Goodness of Fit Chi-Square"},"content":{"raw":"<div class=\"introducing-the-goodness-of-fit-chi-square\">\r\n<p class=\"import-ABodyCopy\">So you are asking yourself, \u201cgoodness of fitting what to what?\u201d<\/p>\r\n<p class=\"import-ABodyCopy\">The chi-square (pronounced \u201ckie\u201d square) is an extremely useful, non-parametric statistical technique, that allows a researcher to compare responses from a sample to expected responses in a hypothetical distribution of responses for a population. Hence the name goodness of fit test.<\/p>\r\n<p class=\"import-ABodyCopy\">The chi-square goodness of fit test can be used to evaluate data at all variable levels, but because the currency of this test is count data, the goodness of fit test is especially suited to nominal and ordinal data.<\/p>\r\n<p class=\"import-ABodyCopy\">The chi-square test evaluates data in the form of counts or frequencies, as in the number of responses within a given category, or the number of people who responded a given way to a specific question, or the number of cases across outcome categories.<\/p>\r\n\r\n<\/div>\r\n<h1>The goodness of fit chi-square for one sample with four categories<\/h1>\r\n<p class=\"import-ABodyCopy\">In the following example, we consider the goodness of fit chi-square with four response categories. 
In this problem, we are studying a cohort of cancer patients to determine if cancer was more likely to be diagnosed in patients who are in a low-income category, based on socio-economic status (SES) quartiles. We begin by establishing that the expected distribution of cancer patients within the community is equally distributed across the four income categories so that in any community 25% of our population are in the highest SES category, 25% are in the moderate SES category, 25% are in the low SES income category, and 25% are in the very low SES category.<\/p>\r\nProportional Distribution of Sample Across Socioeconomic Categories\r\n\r\n<img class=\"aligncenter wp-image-210 size-full\" alt=\"\" src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/02\/catgrp.jpg\" width=\"760\" height=\"247\" \/>\r\n<p class=\"import-ABodyCopy\">However, in the observed data set for our sample of cancer patients, we recorded the following distribution of patients.<\/p>\r\n\r\n<div>\r\n<table>\r\n<tbody>\r\n<tr class=\"TableGrid-R\">\r\n<td class=\"TableGrid-C\">\r\n<p class=\"import-ATableBody\">Highest SES 25%<\/p>\r\n<\/td>\r\n<td class=\"TableGrid-C\">\r\n<p class=\"import-ATableBody\">Moderate SES 25%<\/p>\r\n<\/td>\r\n<td class=\"TableGrid-C\">\r\n<p class=\"import-ATableBody\">Lower SES 25%<\/p>\r\n<\/td>\r\n<td class=\"TableGrid-C\">\r\n<p class=\"import-ATableBody\">Very Low SES 25%<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"TableGrid-R\">\r\n<td class=\"TableGrid-C\" colspan=\"4\">\r\n<p class=\"import-ATableBody\">Data from the community sample of cancer patients collected over a 10 year period in a community with an average population of greater than 1 million households<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"TableGrid-R\">\r\n<td class=\"TableGrid-C\">\r\n<p class=\"import-ATableBody\">165 patients<\/p>\r\n<\/td>\r\n<td class=\"TableGrid-C\">\r\n<p class=\"import-ATableBody\">283 patients<\/p>\r\n<\/td>\r\n<td 
class=\"TableGrid-C\">\r\n<p class=\"import-ATableBody\">622 patients<\/p>\r\n<\/td>\r\n<td class=\"TableGrid-C\">\r\n<p class=\"import-ATableBody\">980 patients<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n<p class=\"import-ABodyCopy\">The null hypothesis for this study is stated in an unbiased way so that each SES quartile is expected to have an equal percentage of households with cancer patients. Here, the term f<sub>k<\/sub> refers to the frequency, or number of patients, within the quartile indicated by the subscript k. Since we have four groups representing four quartiles, k ranges from 1 to 4.<\/p>\r\n<p class=\"import-NormalWeb\">H<sub>0<\/sub>: f<sub>1<\/sub><span class=\"import-apple-converted-space\">\u00a0<\/span>= f<sub>2<\/sub><span class=\"import-apple-converted-space\">\u00a0<\/span>= f<sub>3<\/sub><span class=\"import-apple-converted-space\">\u00a0<\/span>= f<sub>4<\/sub><\/p>\r\n<p class=\"import-ABodyCopy\">Since we have a total sample size of N = 2050, each cell of the SES quartiles is expected to have a frequency (an expected number of patients) equal to 2050 \u00f7 4 = 512.5 individuals.<\/p>\r\n<p class=\"import-ABodyCopy\">The chi-square formula to test the null hypothesis is:<\/p>\r\n<p class=\"import-AFigure\"><img class=\"aligncenter\" alt=\"image\" src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/02\/image3.png\" width=\"124\" height=\"71\" \/><\/p>\r\n<p class=\"import-NormalWeb\">The equation measures how closely an observed set of responses (the <em>\u201c<\/em><em>o<\/em><em>\u201d <\/em>for <em>\u201c<\/em><em>observed<\/em><em>\u201d<\/em>) matches an expected set of responses (the <em>\u201c<\/em><em>e<\/em><em>\u201d<\/em> for <em>\u201c<\/em><em>expected<\/em><em>\u201d<\/em>).<\/p>\r\n<p class=\"import-ABodyCopy\">So then how do we calculate the items that we use in the chi-square equation?<\/p>\r\n<p 
class=\"import-ABodyCopy\">The observed frequencies are simply taken from the data recording sheet, but the expected frequencies are computed from the following formula:<\/p>\r\n<p class=\"import-NormalWeb\"><span lang=\"en-US\"><img class=\"aligncenter\" alt=\"image\" src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/02\/image1-1.jpeg\" width=\"326\" height=\"120\" \/><\/span><\/p>\r\n<p class=\"import-ABodyCopy\">Another way to view the computation of the expected frequencies is to consider the null hypothesis, which states that:<\/p>\r\n<p class=\"import-NormalWeb\">H<sub>0<\/sub>: f<sub>1<\/sub>= f<sub>2<\/sub>= f<sub>3<\/sub>= f<sub>4<\/sub><\/p>\r\n<p class=\"import-ABodyCopy\">and multiply the total frequency by the probability associated with each category, as in the following computation:<\/p>\r\n<p class=\"import-Normal\">2050 \u00d7 0.25 = 512.5<\/p>\r\n<p class=\"import-ABodyCopy\">The chi-square is then used to test whether or not the observed distribution fits a hypothetical or expected distribution. 
This can be accomplished by setting up the following table below:<\/p>\r\n\r\n<div>\r\n<table class=\"aligncenter\" cellspacing=\"2\" cellpadding=\"2\">\r\n<tfoot>\r\n<tr class=\"border\">\r\n<td>\r\n<p class=\"import-Normal\"><\/p>\r\n<\/td>\r\n<td>\r\n<p class=\"import-Normal\"><\/p>\r\n<\/td>\r\n<td>\r\n<p class=\"import-Normal\"><img alt=\"image\" src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/02\/image2-1.png\" width=\"136.595590551181px\" height=\"78.5975853018373px\" \/><\/p>\r\n<\/td>\r\n<td>\r\n<p class=\"import-Normal\"><strong>= 788.24<\/strong><\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tfoot>\r\n<tbody>\r\n<tr class=\"TableNormal-R\">\r\n<td class=\"TableNormal-C\">\r\n<p class=\"import-Normal\">Response Category<\/p>\r\n<\/td>\r\n<td class=\"TableNormal-C\">\r\n<p class=\"import-Normal\">Observed Frequency<\/p>\r\n<\/td>\r\n<td class=\"TableNormal-C\">\r\n<p class=\"import-Normal\">Expected Frequency<\/p>\r\n<\/td>\r\n<td class=\"TableNormal-C\">\r\n<p class=\"import-Normal\">(Obs - Exp)<sup>2 <\/sup>\u00f7 Exp<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td class=\"TableNormal-C\" style=\"text-align: center\">\r\n<p class=\"import-Normal\">1: High SES<\/p>\r\n<\/td>\r\n<td class=\"TableNormal-C\" style=\"text-align: center\">\r\n<p class=\"import-Normal\">165<\/p>\r\n<\/td>\r\n<td class=\"TableNormal-C\" style=\"text-align: center\">\r\n<p class=\"import-Normal\">512.5<\/p>\r\n<\/td>\r\n<td class=\"TableNormal-C\" style=\"text-align: center\">\r\n<p class=\"import-Normal\">235.62<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"TableNormal-R\">\r\n<td class=\"TableNormal-C\" style=\"text-align: center\">\r\n<p class=\"import-Normal\">2: Moderate SES<\/p>\r\n<\/td>\r\n<td class=\"TableNormal-C\" style=\"text-align: center\">\r\n<p class=\"import-Normal\">283<\/p>\r\n<\/td>\r\n<td class=\"TableNormal-C\" style=\"text-align: center\">\r\n<p class=\"import-Normal\">512.5<\/p>\r\n<\/td>\r\n<td class=\"TableNormal-C\" style=\"text-align: 
center\">\r\n<p class=\"import-Normal\">102.77<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"TableNormal-R\">\r\n<td class=\"TableNormal-C\" style=\"text-align: center\">\r\n<p class=\"import-Normal\">3: Low SES<\/p>\r\n<\/td>\r\n<td class=\"TableNormal-C\" style=\"text-align: center\">\r\n<p class=\"import-Normal\">622<\/p>\r\n<\/td>\r\n<td class=\"TableNormal-C\" style=\"text-align: center\">\r\n<p class=\"import-Normal\">512.5<\/p>\r\n<\/td>\r\n<td class=\"TableNormal-C\" style=\"text-align: center\">\r\n<p class=\"import-Normal\">23.40<\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr class=\"TableNormal-R\">\r\n<td class=\"TableNormal-C\" style=\"text-align: center\">\r\n<p class=\"import-Normal\">4: Very Low SES<\/p>\r\n<\/td>\r\n<td class=\"TableNormal-C\" style=\"text-align: center\">\r\n<p class=\"import-Normal\">980<\/p>\r\n<\/td>\r\n<td class=\"TableNormal-C\" style=\"text-align: center\">\r\n<p class=\"import-Normal\">512.5<\/p>\r\n<\/td>\r\n<td class=\"TableNormal-C\" style=\"text-align: center\">\r\n<p class=\"import-Normal\">426.45<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n<p class=\"import-ABodyCopy\">In this calculation for a one-sample scenario with 4 outcome categories, the chi-square statistic is 788.24. So what does this mean?<\/p>\r\nTo evaluate the meaning of the value we calculated for the chi-square, we need to review the decision rule for the chi-square statistic, shown here.\r\n<table>\r\n<tbody>\r\n<tr class=\"TableGrid-R\">\r\n<td class=\"TableGrid-C\">\r\n<p class=\"import-Normal\"><em>Chi-Square decision rule (one-sample chi-square test): <\/em><\/p>\r\n<p class=\"import-Normal\"><em>The computed score is referred to as the chi-square observed. After computing the chi-square observed value, determine the chi-square critical score from a table of chi-square values. The chi-square critical score is the threshold that the chi-square observed value must exceed before we reject the null hypothesis, here for a distribution with four responses. 
The critical value is determined by computing the degrees of freedom for our response set.\u00a0<\/em><\/p>\r\n<p class=\"import-Normal\"><em>The computation of the degrees of freedom is:<\/em><\/p>\r\n<p class=\"import-Normal\"><em>degrees of freedom = k possible responses - 1<\/em><\/p>\r\n<p class=\"import-Normal\"><em>degrees of freedom = 4 - 1<\/em><\/p>\r\n<p class=\"import-Normal\"><em>degrees of freedom = 3<\/em><\/p>\r\n<p class=\"import-Normal\"><em>and the chi-square critical value for 3 degrees of freedom at p&lt;0.05 = 7.82<\/em><\/p>\r\n<p class=\"import-Normal\"><em>If the chi-square observed value is\u00a0<\/em><em>GREATER THAN<\/em><em>\u00a0the chi-square critical value of\u00a0<\/em><em>7.82<\/em><em>, we must reject the null hypothesis and state that the distribution of responses across the four categories IS NOT EQUAL.<\/em> A large chi-square value, that is, a value that exceeds the chi-square critical value, indicates that the observed distribution is unlikely to have occurred by chance if the null hypothesis were true.<\/p>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<p class=\"import-ABodyCopy\">The chi-square statistic is computed as 788.24.<\/p>\r\n<p class=\"import-ABodyCopy\">We, therefore, compare the chi-square observed value of 788.24 against the chi-square critical value, based on the chosen probability level and the degrees of freedom. In the k=4 chi-square, the degrees of freedom are: degrees of freedom = \u201ck\u201d possible responses - 1, so that given k=4, the degrees of freedom are 4 - 1 = 3, and at p&lt;0.05 the chi-square critical value is 7.82. 
Therefore, since our chi-square observed value of 788.24 exceeds the chi-square critical value (7.82), we reject the null hypothesis and state that cancer patients are not equally distributed across the SES categories. Given the counts we observed, the number of cancer patients in this sample was far greater in the very low SES group (980) than in the high SES group (165).<\/p>\r\n\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">The following is the SAS code used to analyze the data in the scenario above.<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n<code>PROC FORMAT;\r\nVALUE SLICE 1='HIGH SES' 2='MODERATE SES' 3='LOW SES' 4='VERY LOW  SES';\r\nDATA GFIT_1;<\/code>\r\n<code>INPUT SESGRP N_PATNTS;<\/code>\r\n<code>\/* DEFINE THE AXIS CHARACTERISTICS *\/<\/code>\r\n<code>AXIS1 LABEL=(\"SES CATEGORIES\")<\/code>\r\n<code>VALUE=(JUSTIFY=CENTER);<\/code>\r\n<code>AXIS2 LABEL=(ANGLE=90 \"ACTUAL NUMBER OF PATIENTS\")<\/code>\r\n<code>ORDER=(0 TO 1000 BY 100)<\/code>\r\n<code>MINOR=(N=3);<\/code>\r\n<code>AXIS3 LABEL=(ANGLE=90 \"SES CATEGORIES\");\r\nAXIS4 LABEL=(\"ACTUAL NUMBER OF PATIENTS\") ;<\/code>\r\n<code>DATALINES;<\/code>\r\n1 165\r\n2 283\r\n3 622\r\n4 980\r\n;\r\n<code>\/* HERE WE USE THE WEIGHT STATEMENT SO THAT N_PATNTS IS READ AS THE COUNT FOR EACH SES GROUP *\/<\/code>\r\n<code>PROC FREQ ORDER=DATA; TABLES SESGRP\/CHISQ CL CELLCHI2;<\/code>\r\n<code>WEIGHT N_PATNTS;<\/code>\r\n<code>FORMAT SESGRP SLICE. 
;<\/code>\r\n<code>TITLE 'FREQUENCY DISTRIBUTION FOR PROPORTION OF PATIENTS IN EACH SES GROUP';<\/code>\r\n<code>TITLE2 'ONE SAMPLE GOODNESS OF FIT EXAMPLE FOR K=4';<\/code>\r\n<code>RUN;<\/code>\r\n\r\n<\/div>\r\n<\/div>\r\nThe output for Chi-square computation is shown here:\r\n<div class=\"systitleandfootercontainer\" id=\"IDX\">\r\n\r\n<span class=\"c systemtitle\">FREQUENCY DISTRIBUTION FOR PROPORTION OF PATIENTS IN EACH SES GROUP -- <\/span><span class=\"c systemtitle2\">ONE SAMPLE GOODNESS OF FIT EXAMPLE<\/span>\r\n\r\n<\/div>\r\n<div class=\"proc_title_group\">\r\n<p class=\"c proctitle\">The FREQUENCY procedure including the chi-square statistic to evaluate the null hypothesis H<sub>0<\/sub>: f<sub>1<\/sub><span class=\"import-apple-converted-space\">\u00a0<\/span>= f<sub>2<\/sub><span class=\"import-apple-converted-space\">\u00a0<\/span>= f<sub>3<\/sub><span class=\"import-apple-converted-space\">\u00a0<\/span>= f<sub>4<\/sub>.<\/p>\r\n\r\n<\/div>\r\n<section><article aria-label=\"One-Way Frequencies\">\r\n<table class=\"shaded landscape aligncenter\" style=\"height: 287px\" width=\"649\" aria-label=\"One-Way Frequencies\"><caption aria-label=\"One-Way Frequencies\">\u00a0<\/caption>\r\n<tbody>\r\n<tr>\r\n<th class=\"r b header\" style=\"width: 161.133px;text-align: center\" scope=\"col\">SES GRPS<\/th>\r\n<th class=\"r b header\" style=\"width: 106px;text-align: center\" scope=\"col\">Frequency<\/th>\r\n<th class=\"r b header\" style=\"width: 80.4667px;text-align: center\" scope=\"col\">Percent<\/th>\r\n<th class=\"r b header\" style=\"width: 115.383px;text-align: center\" scope=\"col\">Cumulative\r\nFrequency<\/th>\r\n<th class=\"r b header\" style=\"width: 115.383px;text-align: center\" scope=\"col\">Cumulative\r\nPercent<\/th>\r\n<\/tr>\r\n<tr>\r\n<th class=\"r rowheader\" style=\"width: 161.133px;text-align: center\" scope=\"row\">HIGH SES<\/th>\r\n<td class=\"r data\" style=\"width: 106px;text-align: center\">165<\/td>\r\n<td class=\"r data\" 
style=\"width: 80.4667px;text-align: center\">8.05<\/td>\r\n<td class=\"r data\" style=\"width: 115.383px;text-align: center\">165<\/td>\r\n<td class=\"r data\" style=\"width: 115.383px;text-align: center\">8.05<\/td>\r\n<\/tr>\r\n<tr>\r\n<th class=\"r rowheader\" style=\"width: 161.133px;text-align: center\" scope=\"row\">MODERATE SES<\/th>\r\n<td class=\"r data\" style=\"width: 106px;text-align: center\">283<\/td>\r\n<td class=\"r data\" style=\"width: 80.4667px;text-align: center\">13.80<\/td>\r\n<td class=\"r data\" style=\"width: 115.383px;text-align: center\">448<\/td>\r\n<td class=\"r data\" style=\"width: 115.383px;text-align: center\">21.85<\/td>\r\n<\/tr>\r\n<tr>\r\n<th class=\"r rowheader\" style=\"width: 161.133px;text-align: center\" scope=\"row\">LOW SES<\/th>\r\n<td class=\"r data\" style=\"width: 106px;text-align: center\">622<\/td>\r\n<td class=\"r data\" style=\"width: 80.4667px;text-align: center\">30.34<\/td>\r\n<td class=\"r data\" style=\"width: 115.383px;text-align: center\">1070<\/td>\r\n<td class=\"r data\" style=\"width: 115.383px;text-align: center\">52.20<\/td>\r\n<\/tr>\r\n<tr>\r\n<th class=\"r rowheader\" style=\"width: 161.133px;text-align: center\" scope=\"row\">VERY LOW SES<\/th>\r\n<td class=\"r data\" style=\"width: 106px;text-align: center\">980<\/td>\r\n<td class=\"r data\" style=\"width: 80.4667px;text-align: center\">47.80<\/td>\r\n<td class=\"r data\" style=\"width: 115.383px;text-align: center\">2050<\/td>\r\n<td class=\"r data\" style=\"width: 115.383px;text-align: center\">100.00<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<colgroup> <col \/><\/colgroup> <colgroup> <col \/> <col \/> <col \/> <col \/><\/colgroup><\/table>\r\n<\/article><article id=\"IDX1\" aria-label=\"One-Way Chi-Square Test\">\r\n<table class=\"shaded aligncenter\" aria-label=\"One-Way Chi-Square Test\"><caption aria-label=\"One-Way Chi-Square Test\">\u00a0<\/caption><colgroup> <col \/> <col \/><\/colgroup>\r\n<thead>\r\n<tr>\r\n<th class=\"c b header\" colspan=\"2\" 
scope=\"colgroup\">Chi-Square Test\r\nfor Equal Proportions<\/th>\r\n<\/tr>\r\n<tr>\r\n<th class=\"rowheader\" scope=\"row\">Chi-Square<\/th>\r\n<td class=\"r data\">788.2400<\/td>\r\n<\/tr>\r\n<tr>\r\n<th class=\"rowheader\" scope=\"row\">DF<\/th>\r\n<td class=\"r data\">3<\/td>\r\n<\/tr>\r\n<tr>\r\n<th class=\"rowheader\" scope=\"row\">Pr &gt; ChiSq<\/th>\r\n<td class=\"r data\">&lt;.0001<\/td>\r\n<\/tr>\r\n<\/thead>\r\n<\/table>\r\n<\/article><\/section>\r\n<div class=\"proc_title_group\">\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">The SAS code to produce the pie chart is as follows:<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\"><code>PROC FORMAT;<\/code>\r\n<code>VALUE SLICE 1='HIGH SES' 2='MODERATE SES' 3='LOW SES' 4='VERY LOW  SES';<\/code>\r\n<code>PROC GCHART DATA=GFIT_1;<\/code>\r\n<code>PIE3D SESGRP\/SUMVAR=N_PATNTS TYPE=SUM DISCRETE PERCENT=inside<\/code>\r\n<code>COUTLINE=RED WOUTLINE=1 FILL=SOLID SLICE =ARROW CLOCKWISE<\/code>\r\n<code>NOLEGEND NOHEADING VALUE=NONE;<\/code>\r\n<code>FORMAT SESGRP SLICE. 
;<\/code>\r\n<code>TITLE1 'PIE CHART FOR PROPORTION OF PATIENTS IN EACH SES GROUP';<\/code>\r\n<code>PATTERN1 COLOR = LIGHTBLUE;<\/code>\r\n<code>RUN;<\/code><\/div>\r\n<\/div>\r\n<\/div>\r\n<h3>PIE CHART FOR PROPORTION OF PATIENTS IN EACH SES GROUP<\/h3>\r\n<img src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/PIE1-1.png\" alt=\"\" class=\"aligncenter wp-image-429 size-full\" width=\"998\" height=\"350\" \/>\r\n\r\n<hr \/>\r\n\r\n<h5>Webulator Form 1:<\/h5>\r\n<span>The following is a Goodness of Fit Webulator for k= 4 responses In the table above we used the values for socioeconomic status:<\/span>\r\n<table class=\"aligncenter\" aria-label=\"One-Way Frequencies\"><caption aria-label=\"One-Way Frequencies\"><span>Distribution of individuals across SES<\/span><\/caption><colgroup> <col \/><\/colgroup> <colgroup> <col \/> <col \/> <col \/> <col \/><\/colgroup>\r\n<thead>\r\n<tr>\r\n<th class=\"r rowheader\" style=\"text-align: center\" scope=\"row\"><span>HIGH SES<\/span><\/th>\r\n<td class=\"r data\" style=\"text-align: center\"><span>165<\/span><\/td>\r\n<\/tr>\r\n<tr>\r\n<th class=\"r rowheader\" style=\"text-align: center\" scope=\"row\"><span>MODERATE SES<\/span><\/th>\r\n<td class=\"r data\" style=\"text-align: center\"><span>283<\/span><\/td>\r\n<\/tr>\r\n<tr>\r\n<th class=\"r rowheader\" style=\"text-align: center\" scope=\"row\"><span>LOW SES<\/span><\/th>\r\n<td class=\"r data\" style=\"text-align: center\"><span>622<\/span><\/td>\r\n<\/tr>\r\n<tr>\r\n<th class=\"r rowheader\" style=\"text-align: center\" scope=\"row\"><span>VERY LOW SES<\/span><\/th>\r\n<td class=\"r data\" style=\"text-align: center\"><span>980<\/span><\/td>\r\n<\/tr>\r\n<\/thead>\r\n<\/table>\r\n<p align=\"left\"><span>Enter these data into the webulator below for each of your four options and then click the button labelled <strong><em>compute expected frequencies<\/em><\/strong>. 
This will produce the sum of the four values that you entered and compute the expected frequency for the values in the table.<\/span><\/p>\r\n<p align=\"left\"><span><code>[h5p id=\"4\"]<\/code><\/span><\/p>\r\nThe important value from this <i>Webulator<\/i> is the computed chi-square score. The computed score is referred to as the chi-square observed. After computing the chi-square observed value, determine the chi-square critical score from a table of chi-square values. The chi-square critical score represents what we should expect to observe for the distribution with \"k\" responses. The critical value is determined by computing the \u201cdegrees of freedom\u201d for our response set.\r\n\r\nThe computation of the degrees of freedom is: degrees of freedom = \u201ck\u201d possible responses -1\r\n\r\ndegrees of freedom = 4-1 --&gt; degrees of freedom = 3\r\n<p align=\"center\">and the \u201cchi-square critical value\u201d for degrees of freedom of \u201c3\u201d at p&lt;0.05 = 7.815<\/p>\r\nIf the \u201cchi-square observed value \u201d is<strong>\u00a0 \u203a<\/strong>\u00a0 the \u201cchi-square critical value of <b>7.815<\/b>\u201d, we must reject the null hypothesis and state that the distribution of responses across the response categories IS NOT EQUAL.\r\n<h4>If you would you like to use the Webulators for your own applications, without this text visit: https:\/\/health.ahs.upei.ca\/webulators\/w_menu.php<\/h4>\r\n<h4>This Webulator application to compute the one sample goodness of fit with k=4 is available at\u00a0 <code>https:\/\/health.ahs.upei.ca\/webulators\/4k_Gf.php<\/code><\/h4>","rendered":"<div class=\"introducing-the-goodness-of-fit-chi-square\">\n<p class=\"import-ABodyCopy\">So you are asking yourself, \u201cgoodness of fitting what to what?\u201d<\/p>\n<p class=\"import-ABodyCopy\">The chi-square (pronounced \u201ckie\u201d square) is an extremely useful, non-parametric statistical technique, that allows a researcher to compare responses from a 
sample to expected responses in a hypothetical distribution of responses for a population. Hence the name goodness of fit test.<\/p>\n<p class=\"import-ABodyCopy\">The chi-square goodness of fit test can be used to evaluate data at all variable levels, but because the currency of this test is count data, the goodness of fit test is especially suited to nominal and ordinal data.<\/p>\n<p class=\"import-ABodyCopy\">The chi-square test evaluates data in the form of counts or frequencies, as in the number of responses within a given category, or the number of people who responded a given way to a specific question, or the number of cases across outcome categories.<\/p>\n<\/div>\n<h1>The goodness of fit chi-square for one sample with four categories<\/h1>\n<p class=\"import-ABodyCopy\">In the following example, we consider the goodness of fit chi-square with four response categories. In this problem, we are studying a cohort of cancer patients to determine if cancer was more likely to be diagnosed in patients who are in a low-income category, based on socio-economic status (SES) quartiles. 
We begin by establishing that the expected distribution of cancer patients within the community is equally distributed across the four income categories so that in any community 25% of our population are in the highest SES category, 25% are in the moderate SES category, 25% are in the low SES income category, and 25% are in the very low SES category.<\/p>\n<p>Proportional Distribution of Sample Across Socioeconomic Categories<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-210 size-full\" alt=\"\" src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/02\/catgrp.jpg\" width=\"760\" height=\"247\" srcset=\"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/02\/catgrp.jpg 760w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/02\/catgrp-300x98.jpg 300w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/02\/catgrp-65x21.jpg 65w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/02\/catgrp-225x73.jpg 225w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/02\/catgrp-350x114.jpg 350w\" sizes=\"auto, (max-width: 760px) 100vw, 760px\" \/><\/p>\n<p class=\"import-ABodyCopy\">However, in the observed data set for our sample of cancer patients, we recorded the following distribution of patients.<\/p>\n<div>\n<table>\n<tbody>\n<tr class=\"TableGrid-R\">\n<td class=\"TableGrid-C\">\n<p class=\"import-ATableBody\">Highest SES 25%<\/p>\n<\/td>\n<td class=\"TableGrid-C\">\n<p class=\"import-ATableBody\">Moderate SES 25%<\/p>\n<\/td>\n<td class=\"TableGrid-C\">\n<p class=\"import-ATableBody\">Lower SES 25%<\/p>\n<\/td>\n<td class=\"TableGrid-C\">\n<p class=\"import-ATableBody\">Very Low SES 25%<\/p>\n<\/td>\n<\/tr>\n<tr class=\"TableGrid-R\">\n<td class=\"TableGrid-C\" colspan=\"4\">\n<p class=\"import-ATableBody\">Data from 
the community sample of cancer patients collected over a 10 year period in a community with an average population of greater than 1 million households<\/p>\n<\/td>\n<\/tr>\n<tr class=\"TableGrid-R\">\n<td class=\"TableGrid-C\">\n<p class=\"import-ATableBody\">165 patients<\/p>\n<\/td>\n<td class=\"TableGrid-C\">\n<p class=\"import-ATableBody\">283 patients<\/p>\n<\/td>\n<td class=\"TableGrid-C\">\n<p class=\"import-ATableBody\">622 patients<\/p>\n<\/td>\n<td class=\"TableGrid-C\">\n<p class=\"import-ATableBody\">980 patients<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p class=\"import-ABodyCopy\">The null hypothesis for this study is stated in an unbiased way so that each SES quartile is expected to have an equal percentage of households with cancer patients. Here, the term f<sub>k<\/sub> refers to the frequency, or number of patients, within the quartile indicated by the subscript k. Since we have four groups representing four quartiles, k ranges from 1 to 4.<\/p>\n<p class=\"import-NormalWeb\">H<sub>0<\/sub>: f<sub>1<\/sub><span class=\"import-apple-converted-space\">\u00a0<\/span>= f<sub>2<\/sub><span class=\"import-apple-converted-space\">\u00a0<\/span>= f<sub>3<\/sub><span class=\"import-apple-converted-space\">\u00a0<\/span>= f<sub>4<\/sub><\/p>\n<p class=\"import-ABodyCopy\">Since we have a total sample size of N = 2050, each cell of the SES quartiles is expected to have a frequency (an expected number of patients) equal to 2050 \u00f7 4 = 512.5 individuals.<\/p>\n<p class=\"import-ABodyCopy\">The chi-square formula to test the null hypothesis is:<\/p>\n<p class=\"import-AFigure\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" alt=\"image\" src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/02\/image3.png\" width=\"124\" height=\"71\" \/><\/p>\n<p class=\"import-NormalWeb\">The equation measures how closely an observed set of 
responses (the <em>\u201c<\/em><em>o<\/em><em>\u201d <\/em>for <em>\u201c<\/em><em>observed<\/em><em>\u201d<\/em>) matches an expected set of responses (the <em>\u201c<\/em><em>e<\/em><em>\u201d<\/em> for <em>\u201c<\/em><em>expected<\/em><em>\u201d<\/em>).<\/p>\n<p class=\"import-ABodyCopy\">So then how do we calculate the items that we use in the chi-square equation?<\/p>\n<p class=\"import-ABodyCopy\">The observed frequencies are simply taken from the data recording sheet, but the expected frequencies are computed from the following formula:<\/p>\n<p class=\"import-NormalWeb\"><span lang=\"en-US\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" alt=\"image\" src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/02\/image1-1.jpeg\" width=\"326\" height=\"120\" \/><\/span><\/p>\n<p class=\"import-ABodyCopy\">Another way to view the computation of the expected frequencies is to consider the null hypothesis, which states that:<\/p>\n<p class=\"import-NormalWeb\">H<sub>0<\/sub>: f<sub>1<\/sub>= f<sub>2<\/sub>= f<sub>3<\/sub>= f<sub>4<\/sub><\/p>\n<p class=\"import-ABodyCopy\">and multiply the total frequency by the probability associated with each category, as in the following computation:<\/p>\n<p class=\"import-Normal\">2050 \u00d7 0.25 = 512.5<\/p>\n<p class=\"import-ABodyCopy\">The chi-square is then used to test whether or not the observed distribution fits a hypothetical or expected distribution. 
This can be accomplished by setting up the following table below:<\/p>\n<div>\n<table class=\"aligncenter\" cellpadding=\"2\" style=\"border-spacing: 2px;\">\n<tfoot>\n<tr class=\"border\">\n<td>\n<p class=\"import-Normal\">\n<\/td>\n<td>\n<p class=\"import-Normal\">\n<\/td>\n<td>\n<p class=\"import-Normal\"><img decoding=\"async\" alt=\"image\" src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/02\/image2-1.png\" width=\"136.595590551181px\" height=\"78.5975853018373px\" \/><\/p>\n<\/td>\n<td>\n<p class=\"import-Normal\"><strong>= 788.24<\/strong><\/p>\n<\/td>\n<\/tr>\n<\/tfoot>\n<tbody>\n<tr class=\"TableNormal-R\">\n<td class=\"TableNormal-C\">\n<p class=\"import-Normal\">Response Category<\/p>\n<\/td>\n<td class=\"TableNormal-C\">\n<p class=\"import-Normal\">Observed Frequency<\/p>\n<\/td>\n<td class=\"TableNormal-C\">\n<p class=\"import-Normal\">Expected Frequency<\/p>\n<\/td>\n<td class=\"TableNormal-C\">\n<p class=\"import-Normal\">(Obs &#8211; Exp)<sup>2 <\/sup>\u00f7 Exp<\/p>\n<\/td>\n<\/tr>\n<tr>\n<td class=\"TableNormal-C\" style=\"text-align: center\">\n<p class=\"import-Normal\">1: High SES<\/p>\n<\/td>\n<td class=\"TableNormal-C\" style=\"text-align: center\">\n<p class=\"import-Normal\">165<\/p>\n<\/td>\n<td class=\"TableNormal-C\" style=\"text-align: center\">\n<p class=\"import-Normal\">512.5<\/p>\n<\/td>\n<td class=\"TableNormal-C\" style=\"text-align: center\">\n<p class=\"import-Normal\">235.62<\/p>\n<\/td>\n<\/tr>\n<tr class=\"TableNormal-R\">\n<td class=\"TableNormal-C\" style=\"text-align: center\">\n<p class=\"import-Normal\">2: Moderate SES<\/p>\n<\/td>\n<td class=\"TableNormal-C\" style=\"text-align: center\">\n<p class=\"import-Normal\">283<\/p>\n<\/td>\n<td class=\"TableNormal-C\" style=\"text-align: center\">\n<p class=\"import-Normal\">512.5<\/p>\n<\/td>\n<td class=\"TableNormal-C\" style=\"text-align: center\">\n<p class=\"import-Normal\">102.77<\/p>\n<\/td>\n<\/tr>\n<tr 
class=\"TableNormal-R\">\n<td class=\"TableNormal-C\" style=\"text-align: center\">\n<p class=\"import-Normal\">3: Low SES<\/p>\n<\/td>\n<td class=\"TableNormal-C\" style=\"text-align: center\">\n<p class=\"import-Normal\">622<\/p>\n<\/td>\n<td class=\"TableNormal-C\" style=\"text-align: center\">\n<p class=\"import-Normal\">512.5<\/p>\n<\/td>\n<td class=\"TableNormal-C\" style=\"text-align: center\">\n<p class=\"import-Normal\">23.40<\/p>\n<\/td>\n<\/tr>\n<tr class=\"TableNormal-R\">\n<td class=\"TableNormal-C\" style=\"text-align: center\">\n<p class=\"import-Normal\">4: Very Low SES<\/p>\n<\/td>\n<td class=\"TableNormal-C\" style=\"text-align: center\">\n<p class=\"import-Normal\">980<\/p>\n<\/td>\n<td class=\"TableNormal-C\" style=\"text-align: center\">\n<p class=\"import-Normal\">512.5<\/p>\n<\/td>\n<td class=\"TableNormal-C\" style=\"text-align: center\">\n<p class=\"import-Normal\">426.45<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p class=\"import-ABodyCopy\">In this calculation for a one-sample scenario with 4 outcome categories, the chi-square statistic is 788.24. So what does this mean?<\/p>\n<p>To evaluate the meaning of the value we calculated for the chi-square, we need to review the decision rule for the chi-square statistic, shown here.<\/p>\n<table>\n<tbody>\n<tr class=\"TableGrid-R\">\n<td class=\"TableGrid-C\">\n<p class=\"import-Normal\"><em>Chi-Square decision rule (one-sample chi-square test): <\/em><\/p>\n<p class=\"import-Normal\"><em>The computed score is referred to as the chi-square observed. After computing the chi-square observed value, determine the chi-square critical score from a table of chi-square values. The chi-square critical score is the threshold that the chi-square observed value must exceed before we reject the null hypothesis, here for a distribution with four responses. 
The critical value is determined by computing the degrees of freedom for our response set.\u00a0<\/em><\/p>\n<p class=\"import-Normal\"><em>The computation of the degrees of freedom is:<\/em><\/p>\n<p class=\"import-Normal\"><em>degrees of freedom = k possible responses -1<\/em><\/p>\n<p class=\"import-Normal\"><em>degrees of freedom = 5-1<\/em><\/p>\n<p class=\"import-Normal\"><em>degrees of freedom = 4<\/em><\/p>\n<p class=\"import-Normal\"><em>and the chi-square critical value for degrees of freedom of 4 at p&lt;0.05 = 9.49<\/em><\/p>\n<p class=\"import-Normal\"><em>If the chi-square observed value is\u00a0<\/em><em>GREATER THAN<\/em><em>\u00a0the chi-square critical value of\u00a0<\/em><em>9.49<\/em><em>, we must reject the null hypothesis and state that the distribution of responses across the four categories IS NOT EQUAL.<\/em> A large chi-square value, that is a value that exceeds the chi-square critical value demonstrates that the outcome is less likely to occur by chance.<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p class=\"import-ABodyCopy\">The chi-square statistic is computed as 788.24.<\/p>\n<p class=\"import-ABodyCopy\">We, therefore, compare the chi-square observed value of 788.24 against a chi-square expected, based on the expected probability level and the degrees of freedom. In the k=4 chi-square, the degrees of freedom are: degrees of freedom = \u201ck\u201d possible responses -1, so that given k=4, then the degrees of freedom is 4-1 = 3 and at p&lt;0.05 the chi-square critical value is 7.82. 
Therefore, since our chi-square observed value of 788.24 exceeds the chi-square critical value (7.82), we reject the null hypothesis and state that cancer patients are not equally distributed across the SES categories. Given the numbers we observed, we can also state that in this sample the number of cancer patients in the very low SES group (980) was substantially greater than the number in the high SES group (165).<\/p>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">The following is the SAS code used to analyze the data in the scenario above.<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p><code>PROC FORMAT;<br \/>\nVALUE SLICE 1='HIGH SES' 2='MODERATE SES' 3='LOW SES' 4='VERY LOW SES';<br \/>\nDATA GFIT_1;<\/code><br \/>\n<code>INPUT SESGRP N_PATNTS;<\/code><br \/>\n<code>\/* DEFINE THE AXIS CHARACTERISTICS *\/<\/code><br \/>\n<code>AXIS1 LABEL=(\"SES CATEGORIES\")<\/code><br \/>\n<code>VALUE=(JUSTIFY=CENTER);<\/code><br \/>\n<code>AXIS2 LABEL=(ANGLE=90 \"ACTUAL NUMBER OF PATIENTS\")<\/code><br \/>\n<code>ORDER=(0 TO 1000 BY 100)<\/code><br \/>\n<code>MINOR=(N=3);<\/code><br \/>\n<code>AXIS3 LABEL=(ANGLE=90 \"SES CATEGORIES\");<br \/>\nAXIS4 LABEL=(\"ACTUAL NUMBER OF PATIENTS\") ;<\/code><br \/>\n<code>DATALINES;<\/code><br \/>\n<code>1 165<\/code><br \/>\n<code>2 283<\/code><br \/>\n<code>3 622<\/code><br \/>\n<code>4 980<\/code><br \/>\n<code>;<\/code><br \/>\n<code>\/* HERE WE USE THE WEIGHT STATEMENT SO THAT EACH ROW COUNTS AS N_PATNTS OBSERVATIONS *\/<\/code><br \/>\n<code>PROC FREQ ORDER=DATA; TABLES SESGRP\/CHISQ CL CELLCHI2;<\/code><br \/>\n<code>WEIGHT N_PATNTS;<\/code><br \/>\n<code>FORMAT SESGRP SLICE. 
;<\/code><br \/>\n<code>TITLE 'FREQUENCY DISTRIBUTION FOR PROPORTION OF PATIENTS IN EACH SES GROUP';<\/code><br \/>\n<code>TITLE2 'ONE SAMPLE GOODNESS OF FIT EXAMPLE FOR K=4';<\/code><br \/>\n<code>RUN;<\/code><\/p>\n<\/div>\n<\/div>\n<p>The output for Chi-square computation is shown here:<\/p>\n<div class=\"systitleandfootercontainer\" id=\"IDX\">\n<p><span class=\"c systemtitle\">FREQUENCY DISTRIBUTION FOR PROPORTION OF PATIENTS IN EACH SES GROUP &#8212; <\/span><span class=\"c systemtitle2\">ONE SAMPLE GOODNESS OF FIT EXAMPLE<\/span><\/p>\n<\/div>\n<div class=\"proc_title_group\">\n<p class=\"c proctitle\">The FREQUENCY procedure including the chi-square statistic to evaluate the null hypothesis H<sub>0<\/sub>: f<sub>1<\/sub><span class=\"import-apple-converted-space\">\u00a0<\/span>= f<sub>2<\/sub><span class=\"import-apple-converted-space\">\u00a0<\/span>= f<sub>3<\/sub><span class=\"import-apple-converted-space\">\u00a0<\/span>= f<sub>4<\/sub>.<\/p>\n<\/div>\n<section>\n<article aria-label=\"One-Way Frequencies\">\n<table class=\"shaded landscape aligncenter\" style=\"height: 287px; width: 649px;\" aria-label=\"One-Way Frequencies\">\n<caption aria-label=\"One-Way Frequencies\">\u00a0<\/caption>\n<tbody>\n<tr>\n<th class=\"r b header\" style=\"width: 161.133px;text-align: center\" scope=\"col\">SES GRPS<\/th>\n<th class=\"r b header\" style=\"width: 106px;text-align: center\" scope=\"col\">Frequency<\/th>\n<th class=\"r b header\" style=\"width: 80.4667px;text-align: center\" scope=\"col\">Percent<\/th>\n<th class=\"r b header\" style=\"width: 115.383px;text-align: center\" scope=\"col\">Cumulative<br \/>\nFrequency<\/th>\n<th class=\"r b header\" style=\"width: 115.383px;text-align: center\" scope=\"col\">Cumulative<br \/>\nPercent<\/th>\n<\/tr>\n<tr>\n<th class=\"r rowheader\" style=\"width: 161.133px;text-align: center\" scope=\"row\">HIGH SES<\/th>\n<td class=\"r data\" style=\"width: 106px;text-align: center\">165<\/td>\n<td class=\"r data\" 
style=\"width: 80.4667px;text-align: center\">8.05<\/td>\n<td class=\"r data\" style=\"width: 115.383px;text-align: center\">165<\/td>\n<td class=\"r data\" style=\"width: 115.383px;text-align: center\">8.05<\/td>\n<\/tr>\n<tr>\n<th class=\"r rowheader\" style=\"width: 161.133px;text-align: center\" scope=\"row\">MODERATE SES<\/th>\n<td class=\"r data\" style=\"width: 106px;text-align: center\">283<\/td>\n<td class=\"r data\" style=\"width: 80.4667px;text-align: center\">13.80<\/td>\n<td class=\"r data\" style=\"width: 115.383px;text-align: center\">448<\/td>\n<td class=\"r data\" style=\"width: 115.383px;text-align: center\">21.85<\/td>\n<\/tr>\n<tr>\n<th class=\"r rowheader\" style=\"width: 161.133px;text-align: center\" scope=\"row\">LOW SES<\/th>\n<td class=\"r data\" style=\"width: 106px;text-align: center\">622<\/td>\n<td class=\"r data\" style=\"width: 80.4667px;text-align: center\">30.34<\/td>\n<td class=\"r data\" style=\"width: 115.383px;text-align: center\">1070<\/td>\n<td class=\"r data\" style=\"width: 115.383px;text-align: center\">52.20<\/td>\n<\/tr>\n<tr>\n<th class=\"r rowheader\" style=\"width: 161.133px;text-align: center\" scope=\"row\">VERY LOW SES<\/th>\n<td class=\"r data\" style=\"width: 106px;text-align: center\">980<\/td>\n<td class=\"r data\" style=\"width: 80.4667px;text-align: center\">47.80<\/td>\n<td class=\"r data\" style=\"width: 115.383px;text-align: center\">2050<\/td>\n<td class=\"r data\" style=\"width: 115.383px;text-align: center\">100.00<\/td>\n<\/tr>\n<\/tbody>\n<colgroup>\n<col \/><\/colgroup>\n<colgroup>\n<col \/>\n<col \/>\n<col \/>\n<col \/><\/colgroup>\n<\/table>\n<\/article>\n<article id=\"IDX1\" aria-label=\"One-Way Chi-Square Test\">\n<table class=\"shaded aligncenter\" aria-label=\"One-Way Chi-Square Test\">\n<caption aria-label=\"One-Way Chi-Square Test\">\u00a0<\/caption>\n<colgroup>\n<col \/>\n<col \/><\/colgroup>\n<thead>\n<tr>\n<th class=\"c b header\" colspan=\"2\" scope=\"colgroup\">Chi-Square Test<br 
\/>\nfor Equal Proportions<\/th>\n<\/tr>\n<tr>\n<th class=\"rowheader\" scope=\"row\">Chi-Square<\/th>\n<td class=\"r data\">788.2400<\/td>\n<\/tr>\n<tr>\n<th class=\"rowheader\" scope=\"row\">DF<\/th>\n<td class=\"r data\">3<\/td>\n<\/tr>\n<tr>\n<th class=\"rowheader\" scope=\"row\">Pr &gt; ChiSq<\/th>\n<td class=\"r data\">&lt;.0001<\/td>\n<\/tr>\n<\/thead>\n<\/table>\n<\/article>\n<\/section>\n<div class=\"proc_title_group\">\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">The SAS code to produce the pie chart is as follows:<\/p>\n<\/header>\n<div class=\"textbox__content\"><code>PROC FORMAT;<\/code><br \/>\n<code>VALUE SLICE 1='HIGH SES' 2='MODERATE SES' 3='LOW SES' 4='VERY LOW  SES';<\/code><br \/>\n<code>PROC GCHART DATA=GFIT_1;<\/code><br \/>\n<code>PIE3D SESGRP\/SUMVAR=N_PATNTS TYPE=SUM DISCRETE PERCENT=inside<\/code><br \/>\n<code>COUTLINE=RED WOUTLINE=1 FILL=SOLID SLICE =ARROW CLOCKWISE<\/code><br \/>\n<code>NOLEGEND NOHEADING VALUE=NONE;<\/code><br \/>\n<code>FORMAT SESGRP SLICE. 
;<\/code><br \/>\n<code>TITLE1 'PIE CHART FOR PROPORTION OF PATIENTS IN EACH SES GROUP';<\/code><br \/>\n<code>PATTERN1 COLOR = LIGHTBLUE;<\/code><br \/>\n<code>RUN;<\/code><\/div>\n<\/div>\n<\/div>\n<h3>PIE CHART FOR PROPORTION OF PATIENTS IN EACH SES GROUP<\/h3>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/PIE1-1.png\" alt=\"\" class=\"aligncenter wp-image-429 size-full\" width=\"998\" height=\"350\" srcset=\"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/PIE1-1.png 998w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/PIE1-1-300x105.png 300w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/PIE1-1-768x269.png 768w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/PIE1-1-65x23.png 65w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/PIE1-1-225x79.png 225w, https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-content\/uploads\/sites\/49\/2020\/04\/PIE1-1-350x123.png 350w\" sizes=\"auto, (max-width: 998px) 100vw, 998px\" \/><\/p>\n<hr \/>\n<h5>Webulator Form 1:<\/h5>\n<p><span>The following is a Goodness of Fit Webulator for k=4 responses. 
In the example above we used these values for socioeconomic status:<\/span><\/p>\n<table class=\"aligncenter\" aria-label=\"One-Way Frequencies\">\n<caption aria-label=\"One-Way Frequencies\"><span>Distribution of individuals across SES<\/span><\/caption>\n<colgroup>\n<col \/><\/colgroup>\n<colgroup>\n<col \/>\n<col \/>\n<col \/>\n<col \/><\/colgroup>\n<thead>\n<tr>\n<th class=\"r rowheader\" style=\"text-align: center\" scope=\"row\"><span>HIGH SES<\/span><\/th>\n<td class=\"r data\" style=\"text-align: center\"><span>165<\/span><\/td>\n<\/tr>\n<tr>\n<th class=\"r rowheader\" style=\"text-align: center\" 
scope=\"row\"><span>MODERATE SES<\/span><\/th>\n<td class=\"r data\" style=\"text-align: center\"><span>283<\/span><\/td>\n<\/tr>\n<tr>\n<th class=\"r rowheader\" style=\"text-align: center\" scope=\"row\"><span>LOW SES<\/span><\/th>\n<td class=\"r data\" style=\"text-align: center\"><span>622<\/span><\/td>\n<\/tr>\n<tr>\n<th class=\"r rowheader\" style=\"text-align: center\" scope=\"row\"><span>VERY LOW SES<\/span><\/th>\n<td class=\"r data\" style=\"text-align: center\"><span>980<\/span><\/td>\n<\/tr>\n<\/thead>\n<\/table>\n<p style=\"text-align: left;\"><span>Enter these data into the webulator below for each of your four options and then click the button labelled <strong><em>compute expected frequencies<\/em><\/strong>. This will produce the sum of the four values that you entered and compute the expected frequency for the values in the table.<\/span><\/p>\n<p style=\"text-align: left;\"><span><code><\/p>\n<div id=\"h5p-4\">\n<div class=\"h5p-content\" data-content-id=\"4\"><\/div>\n<\/div>\n<p><\/code><\/span><\/p>\n<p>The important value from this <i>Webulator<\/i> is the computed chi-square score. The computed score is referred to as the chi-square observed. After computing the chi-square observed value, determine the chi-square critical score from a table of chi-square values. The chi-square critical score represents what we should expect to observe for the distribution with &#8220;k&#8221; responses. 
The critical value is determined by computing the \u201cdegrees of freedom\u201d for our response set.<\/p>\n<p>The computation of the degrees of freedom is: degrees of freedom = \u201ck\u201d possible responses - 1<\/p>\n<p>degrees of freedom = 4 - 1, so degrees of freedom = 3<\/p>\n<p style=\"text-align: center;\">and the \u201cchi-square critical value\u201d for degrees of freedom of \u201c3\u201d at p&lt;0.05 = 7.815<\/p>\n<p>If the \u201cchi-square observed value\u201d is <strong>greater than<\/strong> the \u201cchi-square critical value\u201d of <b>7.815<\/b>, we must reject the null hypothesis and state that the distribution of responses across the response categories IS NOT EQUAL.<\/p>\n<h4>If you would like to use the Webulators for your own applications, without this text, visit: https:\/\/health.ahs.upei.ca\/webulators\/w_menu.php<\/h4>\n<h4>This Webulator application to compute the one sample goodness of fit with k=4 is available at\u00a0 <code>https:\/\/health.ahs.upei.ca\/webulators\/4k_Gf.php<\/code><\/h4>\n","protected":false},"author":56,"menu_order":2,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-375","chapter","type-chapter","status-publish","hentry"],"part":34,"_links":{"self":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapters\/375","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/wp\/v2\/users\/56"}],"version-history":[{"count":12,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapters\/375\/revisions"}],"predecessor-version
":[{"id":1526,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapters\/375\/revisions\/1526"}],"part":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/parts\/34"}],"metadata":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapters\/375\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/wp\/v2\/media?parent=375"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapter-type?post=375"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/wp\/v2\/contributor?post=375"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/wp\/v2\/license?post=375"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}