{"id":991,"date":"2020-06-02T10:51:25","date_gmt":"2020-06-02T14:51:25","guid":{"rendered":"http:\/\/pressbooks.library.upei.ca\/montelpare\/?post_type=chapter&#038;p=991"},"modified":"2022-05-06T05:15:46","modified_gmt":"2022-05-06T09:15:46","slug":"computer-simulation-and-random-number-generators","status":"publish","type":"chapter","link":"https:\/\/pressbooks.library.upei.ca\/montelpare\/chapter\/computer-simulation-and-random-number-generators\/","title":{"raw":"Computer Simulation and Random Number Generators","rendered":"Computer Simulation and Random Number Generators"},"content":{"raw":"Research using simulated data is often done to predict future events based on real-world data. Once accepted as a reliable method to contribute to decision making, there are several applications in which computer simulation could be used to provide plausible outcome scenarios prior to actually making a decision to advance in a specific direction.\u00a0 For example, using computer simulation tools, administrators can create financial forecasting models based on selected expenditure statements and health sector administrative data to estimate future costs and establish budgetary guidelines that are within the appropriate tolerances for a given fiduciary system. Similarly, health researchers can use demographic information about a cohort within the population or about the health care workforce to predict how many new health care professionals we will need in the years to come. This information can then be used to make decisions about the number of students that universities and colleges should accept into their programs in order to meet the predicted needs.\r\n\r\nCreating and using a simulated dataset is also an excellent way to practice the application of statistical methods without having to collect real-world data. 
That is, we can create a simulated dataset by first establishing the set of independent and dependent variables that are of interest to us in our research project, and then establishing the range for each response within the variables of interest.\r\n\r\nFor example, if in our research study we were interested in measuring the effect of a drug versus placebo on reaction time, then our study could be as simple as having three variables: ID, DRUG, and REACTION_TIME. Given that the ID is simply a counter which SAS will assign as an observation number, we need not be concerned with the scoring of ID. Likewise, given that DRUG can be either DRUG or PLACEBO, we know that this is a grouping variable (in some fields referred to as a DUMMY variable), and the range for this variable will be DRUG=1 and PLACEBO=2. Further, from the literature we understand that the reaction time will have an upper threshold value, which we use to establish the range of outcomes for our response in our computer simulation data set.\r\n\r\nWith all of this information known before we begin, we can use a random number generator with SAS to create a dataset that is estimated from the set of values we provide to the computer.\r\n\r\nHere is our first example of creating random numbers with the SAS random number generator functions. In this first example, we are simply testing the code for the random number function. 
Here we used 3 lines of SAS code.\r\n<code>\r\n<span style=\"color: blue\">DATA SASRNG;<\/span>\r\n<span style=\"color: purple\">\/* THE SEED FOR THE RANDOM NUMBER *\/<\/span>\r\n<span style=\"color: blue\">call streaminit(999); <\/span>\r\n<span style=\"color: purple\"> \/* CREATE THE VARIABLE GROUP *\/<\/span>\r\n<span style=\"color: blue\">group=RAND(\"normal\")*1000000000000;<\/span>\r\n<span style=\"color: purple\"> \/* RUN THE RNG AND PRINT OUTPUT *\/<\/span>\r\n<span style=\"color: blue\">run;\r\nproc print; var group;\r\nrun;<\/span>\r\n<\/code>\r\n\r\nThe result of this code is a random number but the number has no real meaning to us except that it shows us SAS generated a value for the variable GROUP.\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td>Obs<\/td>\r\n<td>group<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>1<\/td>\r\n<td>-4.8095E1<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nNow we want to modify that output so that it has meaning. In the following SAS code we will first create the random number, establish the sign of the output to be positive by using the <span style=\"color: purple\">ABS <i>absolute number<\/i> function <\/span>and then set the range of the output by using the MOD functions (<span style=\"color: purple\"><i>MODULO MATH<\/i> function <\/span>). 
In the application of modulo math here we are setting the lowest value to 1 and the ceiling value to 2.\r\n\r\nNotice here that we are generating a set of 20 values, and just to be sure that we restrict the output to 1 and 2 we add the logic statement <span style=\"color: purple\"> if group=3 then group=1; <\/span>\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td><code>\r\n<span style=\"color: blue\">DATA sasrng;\r\nDO K=1 TO 20;\r\ncall streaminit(999);\r\ngroup=RAND(\"normal\")*1000000000000;\r\ngroup=1+ABS((mod(group,2)));\r\ngroup=ROUND(group);\r\nif group=3 then group=1;\r\noutput;\r\nend;\r\nrun;\r\nproc print; var group;\r\nrun;<\/span><\/code><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nThe output from this SAS code is presented here:\r\n\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td bgcolor=\"skyblue\">Obs<\/td><td>  group<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">1 <\/td><td> 2<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">2 <\/td><td> 2<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">3<\/td><td>  2<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">4<\/td><td>  1<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">5<\/td><td>  2<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">6<\/td><td>  1<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">7<\/td><td>  2<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">8<\/td><td>  2<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">9<\/td><td>  1<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">10<\/td><td>  1<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">11<\/td><td>  2<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">12<\/td><td>  1<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">13<\/td><td>  2<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">14<\/td><td>  2<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">15<\/td><td>  2<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">16<\/td><td>  1<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">17<\/td><td>  1<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">18<\/td><td>  2<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">19<\/td><td>  1<\/td><\/tr><tr>\r\n<td bgcolor=\"skyblue\">20<\/td><td> 
2<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nThis is an amazing learning opportunity as it enables you to create, albeit artificially, a complete dataset with the variables in which you are interested.\u00a0 The experience is invaluable as it provides you with the opportunity to critically evaluate both the strategies for input as well as the interpretation for output. Although not required, working through a computer-simulated dataset during the development of your research proposal can help you develop your data analysis plan, and enable you to become familiar with the ranges and nuances of the important variables.\r\n\r\nIn the following program, we will generate a set of values based on the application of SAS random number generator functions. \r\nFor the variable age, which is a discrete random variable, we will set a minimum age of 18 and a maximum age of 72. The data will be generated from the random normal distribution to produce a variable for group and a variable for reaction time. Each of these variables represents a different type of important variable that you might encounter in your data. \r\nFor example, the variable age is discrete and in general can range from 1 to 120; sex will be alphanumeric and can be of four types (M, F, O, U) where O is other and U is undisclosed; the group will be limited to a binary output (1,2), and reaction time will represent a continuous random variable. Here we will control the random number generator by constraining the values it returns so that our output falls within a specific range.\r\n\r\n.. 
Chapter 42 RNG_PRG01\r\n\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td><code>\r\n<span style=\"color: blue\">DATA sasrng;<br>\r\nDO K=1 TO 20;<br>\r\ncall streaminit(999);       <\/span>\r\n<span style=\"color: purple\">\/* set random number seed *\/<br><\/span>\r\n<span style=\"color: blue\">group=RAND(\"normal\")*1000000000000;<br>\r\ngroup=1+ABS((mod(group,2)));<br>\r\ngroup=ROUND(group);<br>\r\nif group=3 then group=1;<\/span>\r\n<span style=\"color: purple\">\/* here we use the UNIFORM distribution as the source \r\nfor the random number function *\/<\/span>\r\n<span style=\"color: blue\">\r\n   U = RAND(\"Uniform\"); <\/span>\r\n<span style=\"color: purple\">\/* u ~ U(0,1) *\/<br> \r\n\/* Next we reassign the randomly generated scores \r\n   and establish the groups using if-then logic  *\/ <\/span>\r\n<span style=\"color: blue\"> LENGTH sex $12;<br>\r\n if U LE 0.25 then sex = 'other';<br>\r\n     if U GT 0.25 and U LE 0.5 then sex = 'undisclosed';<BR>\r\n    if U GT 0.5 and U LE 0.75 then sex = 'female';\r\n    if U GT 0.75 then sex = 'male';<\/span>\r\n<span style=\"color: purple\">     \r\n\/* Discrete variable age *\/<\/span>\r\n<span style=\"color: blue\"> call streaminit(13);\r\n<BR> age=RAND(\"normal\")*1000000000000;<BR>\r\nage=18+ABS((mod(age,50))); <BR>\r\nage=ROUND(age);\r\nif age GT 72 then age=15+ABS((mod(age,50)));<\/span>\r\n<span style=\"color: PURPLE\"> \/* Continuous decimal variable REACTION TIME *\/<\/span>\r\n<span style=\"color: blue\"> call streaminit(99);<BR>\r\n react=RAND(\"normal\")*1000000000000;<BR>\r\nreact=1+ABS((mod(react,3)));\r\nreact=ROUND(react,0.01);\r\n<BR>output; end;\r\nrun; \r\nproc print; var group sex age react;\r\nrun; <\/span><\/code>\r\n<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n\r\n\r\n  <strong>The program above produced the following output. 
<\/strong>\r\n\r\n<div align=\"center\">\r\n<table>\r\n<thead>\r\n<tr>\r\n<td><strong>OBS<\/strong><\/td>\r\n<td><strong>Group<\/strong><\/td>\r\n<td><strong>Sex<\/strong><\/td>\r\n<td><strong>Age<\/strong><\/td>\r\n<td><strong>React<\/strong><\/td>\r\n\r\n<td><strong>|<\/strong><\/td>\r\n<td><strong>OBS<\/strong><\/td>\r\n<td><strong>Group<\/strong><\/td>\r\n<td><strong>Sex<\/strong><\/td>\r\n<td><strong>Age<\/strong><\/td>\r\n<td><strong>React<\/strong><\/td>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td><strong>1<\/strong><\/td>\r\n<td>2<\/td>\r\n<td>male<\/td>\r\n<td>17<\/td>\r\n<td>2.55<\/td>\r\n<td><strong>|<\/strong><\/td>\r\n<td><strong>11<\/strong><\/td>\r\n<td>2<\/td>\r\n<td>male<\/td>\r\n<td>21<\/td>\r\n<td>3.97<\/td>\r\n<\/tr><tr>\r\n<td><strong>2<\/strong><\/td>\r\n<td>2<\/td>\r\n<td>male<\/td>\r\n<td>41 <\/td>\r\n<td>1.85<\/td>\r\n<td><strong>|<\/strong><\/td>\r\n<td><strong>12<\/strong><\/td>\r\n<td>2<\/td>\r\n<td>male<\/td>\r\n<td>32<\/td>\r\n<td>3.66<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>3<\/strong><\/td>\r\n<td>1<\/td>\r\n<td>female<\/td>\r\n<td>16<\/td>\r\n<td>1.00<\/td>\r\n<td><strong>|<\/strong><\/td>\r\n<td><strong>13<\/strong><\/td>\r\n<td>1<\/td>\r\n<td>female<\/td>\r\n<td>51<\/td>\r\n<td>2.15<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>4<\/strong><\/td>\r\n<td>1<\/td>\r\n<td>female<\/td>\r\n<td>59<\/td>\r\n<td>1.34<\/td>\r\n<td><strong>|<\/strong><\/td>\r\n<td><strong>14<\/strong><\/td>\r\n<td>2<\/td>\r\n<td>female<\/td>\r\n<td>29<\/td>\r\n<td>2.17<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>5<\/strong><\/td>\r\n<td>1<\/td>\r\n<td>male<\/td>\r\n<td>59<\/td>\r\n<td>1.15<\/td>\r\n<td><strong>|<\/strong><\/td>\r\n<td><strong>15<\/strong><\/td>\r\n<td>1<\/td>\r\n<td>undisclosed<\/td>\r\n<td>59<\/td>\r\n<td>1.30<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>6<\/strong><\/td>\r\n<td>2<\/td>\r\n<td>other<\/td>\r\n<td>56<\/td>\r\n<td>2.15<\/td>\r\n<td><strong>|<\/strong><\/td>\r\n<td><strong>16<\/strong><\/td>\r\n<td>2<\/td>\r\n<td>other<\/td>\r\n<td>32<\/td>\r\
n<td>2.13<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>7<\/strong><\/td>\r\n<td>2<\/td>\r\n<td>other<\/td>\r\n<td>45<\/td>\r\n<td>3.09<\/td>\r\n<td><strong>|<\/strong><\/td>\r\n<td><strong>17<\/strong><\/td>\r\n<td>2<\/td>\r\n<td>male<\/td>\r\n<td>23<\/td>\r\n<td>2.45<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>8<\/strong><\/td>\r\n<td>2<\/td>\r\n<td>other<\/td>\r\n<td>30<\/td>\r\n<td>1.61<\/td>\r\n<td><strong>|<\/strong><\/td>\r\n<td><strong>18<\/strong><\/td>\r\n<td>2<\/td>\r\n<td>female<\/td>\r\n<td>24<\/td>\r\n<td>2.88<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>9<\/strong><\/td>\r\n<td>1<\/td>\r\n<td>male<\/td>\r\n<td>30<\/td>\r\n<td>2.04<\/td>\r\n<td><strong>|<\/strong><\/td>\r\n<td><strong>19<\/strong><\/td>\r\n<td>2<\/td>\r\n<td>female<\/td>\r\n<td>19<\/td>\r\n<td>1.51<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>10<\/strong><\/td>\r\n<td>2<\/td>\r\n<td>undisclosed<\/td>\r\n<td>33<\/td>\r\n<td>3.44<\/td>\r\n<td><strong>|<\/strong><\/td>\r\n<td><strong>20<\/strong><\/td>\r\n<td>2<\/td>\r\n<td>male<\/td>\r\n<td>18<\/td>\r\n<td>1.00<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n\r\n<hr \/>These data can be used in later statistical analyses.\r\n\r\n<h2 style=\"text-align: left\">Using a random number generator to produce ICD-9 codes<\/h2>\r\nIn this next example, we will produce a randomly generated dataset consisting of ICD-9 codes. In this SAS program we first create the data, then we organize the output into categories, and finally we produce a horizontal bar chart of the relative percentage of cases in each category of ICD-9 codes.\r\n\r\nConsider the following program to evaluate the primary diagnosis for a group of patients visiting a healthcare clinic. \u00a0The data are generated using a customized random number generator that generates data in the form of ICD-9<a href=\"#_ftn1\">[1]<\/a> codes. 
Since the codes are based on a continuous number line, several unique values can be generated to represent the various sub-conditions with which a patient may present to a healthcare provider. Here we simplify the organization of the codes by creating categories and using the SAS PROC FORMAT command to assign the categories to the output.\r\n\r\nSAS Code To Organize Categories Of ICD-9 Codes\r\n<div>\r\nPROC FORMAT; VALUE CATFMT 1='Infectious\/parasitic'\r\n2='Neoplasms'\r\n3='Endo\/nutri\/metabolic'\r\n4='Blood\/blood-forming organs'\r\n5='Mental disorders'\r\n6='Nervous system'\r\n7='Sense organs'\r\n8='Circulatory system'\r\n9='Respiratory system'\r\n10='Digestive system'\r\n11='Genitourinary system'\r\n12='Pregnancy\/childbirth'\r\n13='Skin &amp; subcutaneous tissue'\r\n14='MSK &amp; connective tissue'\r\n15='Congenital anomalies'\r\n16='Perinatal period conditions'\r\n17='Injury and poisoning'\r\n18='Supplementary classification'\r\n20='Diagnosis not reported';\r\n<\/div> \r\n\r\nIn this example, the random number generator produces ICD-9 scores as the dependent variable, which we label PRDIAG. The SAS code uses a DO loop to create a set of 500 scores, one ICD-9 score for each patient. The data are drawn at random from a normal distribution using the command: PRDIAG=RAND(\"NORMAL\")*10000; The CALL STREAMINIT(K); command appears inside the loop, but SAS acts only on the first STREAMINIT call in a DATA step (here K=1) and ignores later calls, so the generator is seeded once rather than 500 times. 
We also set a maximum absolute value for the dependent variable using the modulus math function MOD().\r\n<div>\r\n\r\nDO K=1 TO 500;\r\nCALL STREAMINIT(K); \/* ONLY THE FIRST CALL (K=1) SEEDS THE RNG; LATER CALLS ARE IGNORED *\/\r\nPRDIAG=RAND(\"NORMAL\")*10000;\r\n\/* KEEP THE RANDOM NUMBER BELOW 1500 *\/\r\nPRDIAG=0+ABS((MOD(PRDIAG,1500)));\r\n\/* ROUND THE RANDOM NUMBERS TO 2 DECIMAL PLACES *\/\r\nPRDIAG=ROUND(PRDIAG,.01);\r\n<\/div>\r\n&nbsp;\r\n\r\nNext, we use if-then logic statements to organize the randomly generated numbers into specific categories based on specific cutpoints. Notice these commands are included within the DO loop. The loop is closed with the commands OUTPUT; followed by END;\r\n<div>\r\n\r\nIF PRDIAG = 95 OR PRDIAG = 99 THEN CATEGORY=20;\r\nIF PRDIAG &gt;=001 AND PRDIAG&lt;94 THEN CATEGORY=1;\r\nIF PRDIAG &gt;=96 AND PRDIAG&lt;99 THEN CATEGORY=1;\r\nIF PRDIAG &gt;99 AND PRDIAG&lt;140 THEN CATEGORY=1;\r\nIF PRDIAG &gt;=140 AND PRDIAG&lt;240 THEN CATEGORY=2;\r\nIF PRDIAG &gt;=240 AND PRDIAG&lt;280 THEN CATEGORY=3;\r\nIF PRDIAG &gt;=280 AND PRDIAG&lt;290 THEN CATEGORY=4;\r\nIF PRDIAG &gt;=290 AND PRDIAG&lt;320 THEN CATEGORY=5;\r\nIF PRDIAG &gt;=320 AND PRDIAG&lt;390 THEN CATEGORY=6;\r\nIF PRDIAG &gt;=390 AND PRDIAG&lt;460 THEN CATEGORY=7;\r\nIF PRDIAG &gt;=460 AND PRDIAG&lt;520 THEN CATEGORY=8;\r\nIF PRDIAG &gt;=520 AND PRDIAG&lt;580 THEN CATEGORY=9;\r\nIF PRDIAG &gt;=580 AND PRDIAG&lt;630 THEN CATEGORY=10;\r\nIF PRDIAG &gt;=630 AND PRDIAG&lt;677 THEN CATEGORY=11;\r\nIF PRDIAG &gt;=680 AND PRDIAG&lt;710 THEN CATEGORY=12;\r\nIF PRDIAG &gt;=710 AND PRDIAG&lt;740 THEN CATEGORY=13;\r\nIF PRDIAG &gt;=740 AND PRDIAG&lt;760 THEN CATEGORY=14;\r\nIF PRDIAG &gt;=760 AND PRDIAG&lt;780 THEN CATEGORY=15;\r\nIF PRDIAG &gt;=780 AND PRDIAG&lt;800 THEN CATEGORY=16;\r\nIF PRDIAG &gt;=800 AND PRDIAG&lt;1000 THEN CATEGORY=17;\r\nIF PRDIAG &gt;=1000 THEN CATEGORY=18;\r\nOUTPUT;\r\nEND;\r\n<\/div>\r\n&nbsp;\r\n\r\nThe SAS commands to create a frequency distribution table are shown below. 
By using a frequency distribution table the author can provide a standard presentation of important summary statistics within the data set. For example, here we show the organization of the randomly generated numbers within each of the designated categories while also presenting the relative percentages that the categories represent within this data set (see the Percent column). The frequency distribution table is followed by the horizontal bar chart of the percentage of diagnoses within each category. In this figure, we included the data values at the end of each horizontal bar.\r\n<div>\r\n\r\nPROC FREQ; TABLES CATEGORY;\r\n\r\nTITLE1 'FREQUENCY DISTRIBUTION FOR RNG ICD-9 CODES'; RUN;\r\n\r\n&nbsp;\r\n\r\nPROC SGPLOT DATA=PRDIAG; HBAR CATEGORY\/ GROUPDISPLAY = CLUSTER\r\n\r\nSTAT=PERCENT DATALABELFITPOLICY=NONE DATALABEL;\r\n\r\nXAXIS LABEL=\"PERCENT OF CASES\";\r\n\r\nYAXIS LABEL=\"DISEASE\/DIAGNOSIS CATEGORIES\";\r\n\r\nFORMAT CATEGORY CATFMT. ;\r\n\r\nTITLE1 'PERCENT OF REPORTED DIAGNOSIS CATEGORY'; RUN;\r\n\r\n<\/div>\r\n&nbsp;\r\n\r\n<strong>Frequency distribution for RNG ICD-9 codes<\/strong>\r\n\r\n<strong>The FREQ Procedure<\/strong>\r\n\r\n&nbsp;\r\n<div 
align=\"center\">\r\n<table>\r\n<thead>\r\n<tr>\r\n<td>CATEGORY<\/td>\r\n<td>FREQUENCY<\/td>\r\n<td>PERCENT<\/td>\r\n<td>CUMULATIVE\r\nFREQUENCY<\/td>\r\n<td>CUMULATIVE\r\nPERCENT<\/td>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td><strong>1<\/strong><\/td>\r\n<td>61<\/td>\r\n<td>12.20<\/td>\r\n<td>61<\/td>\r\n<td>12.20<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>2<\/strong><\/td>\r\n<td>41<\/td>\r\n<td>8.20<\/td>\r\n<td>102<\/td>\r\n<td>20.40<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>3<\/strong><\/td>\r\n<td>10<\/td>\r\n<td>2.00<\/td>\r\n<td>112<\/td>\r\n<td>22.40<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>4<\/strong><\/td>\r\n<td>2<\/td>\r\n<td>0.40<\/td>\r\n<td>114<\/td>\r\n<td>22.80<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>5<\/strong><\/td>\r\n<td>7<\/td>\r\n<td>1.40<\/td>\r\n<td>121<\/td>\r\n<td>24.20<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>6<\/strong><\/td>\r\n<td>24<\/td>\r\n<td>4.80<\/td>\r\n<td>145<\/td>\r\n<td>29.00<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>7<\/strong><\/td>\r\n<td>18<\/td>\r\n<td>3.60<\/td>\r\n<td>163<\/td>\r\n<td>32.60<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>8<\/strong><\/td>\r\n<td>22<\/td>\r\n<td>4.40<\/td>\r\n<td>185<\/td>\r\n<td>37.00<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>9<\/strong><\/td>\r\n<td>18<\/td>\r\n<td>3.60<\/td>\r\n<td>203<\/td>\r\n<td>40.60<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>10<\/strong><\/td>\r\n<td>11<\/td>\r\n<td>2.20<\/td>\r\n<td>214<\/td>\r\n<td>42.80<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>11<\/strong><\/td>\r\n<td>20<\/td>\r\n<td>4.00<\/td>\r\n<td>234<\/td>\r\n<td>46.80<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>12<\/strong><\/td>\r\n<td>16<\/td>\r\n<td>3.20<\/td>\r\n<td>250<\/td>\r\n<td>50.00<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>13<\/strong><\/td>\r\n<td>8<\/td>\r\n<td>1.60<\/td>\r\n<td>258<\/td>\r\n<td>51.60<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>14<\/strong><\/td>\r\n<td>13<\/td>\r\n<td>2.60<\/td>\r\n<td>271<\/td>\r\n<td>54.20<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>15<\/strong><\/td>\r\n<td>7<\/td>\r\n<td>1.40<\/td>\r\n<td>27
8<\/td>\r\n<td>55.60<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>16<\/strong><\/td>\r\n<td>6<\/td>\r\n<td>1.20<\/td>\r\n<td>284<\/td>\r\n<td>56.80<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>17<\/strong><\/td>\r\n<td>57<\/td>\r\n<td>11.40<\/td>\r\n<td>341<\/td>\r\n<td>68.20<\/td>\r\n<\/tr>\r\n<tr>\r\n<td><strong>18<\/strong><\/td>\r\n<td>159<\/td>\r\n<td>31.80<\/td>\r\n<td>500<\/td>\r\n<td>100.00<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n&nbsp;\r\n\r\n&nbsp;\r\n\r\n<div>\r\n<div>\r\n\r\n<a href=\"#_ftnref1\">[1]<\/a> ICD-9 codes refer to the International Classification of Diseases, Ninth Revision.\r\n\r\n<\/div>\r\n<\/div>\r\n\r\n<hr \/>\r\n\r\n<h1>Consider an example using the Lotto 649<\/h1>\r\n<div>\r\n\r\nThe probability of choosing the winning combination of six numbers from 1 to 49 is extremely low:\r\n\r\n1\/(49C6)\r\n\r\nwhich expands to 1 chance in 13,983,816 combinations.\r\n\r\n<\/div>\r\n&nbsp;\r\n\r\nConsidering the low probability of winning the grand prize (i.e., all 6 numbers the player chooses will be selected), it is expected that the Lotto 649 lottery should be strategy free. If, however, the selection process is not random, but rather follows a specific pattern, then the chance of winning will not remain constant and a strategy to predict the outcome could be developed.\r\n\r\nHere we will generate data for one draw. That is, using SAS code we will create a random number generator to produce a unique set of 6 numbers that simulates the data that could be generated by the Lotto 649.\r\n\r\nThe program to generate 6 numbers at random from a set of 49 numbers is shown here. In this instance we have a few constraints. First, we need to be sure that once the first number is drawn, it is not placed back into the set of 49 to be redrawn on a subsequent step. This is because the lotto uses a strategy of <strong>sampling without replacement<\/strong> and therefore each draw selects only 6 unique numbers. 
Likewise, in presenting the output from the random number generators we need to be sure that the data are reported as discrete scores and not as decimal-based continuous scores; and finally, in filtering the numbers produced we need to be sure that the numbers range from 1 to 49 inclusive.\r\n\r\nCopy the following program to your SAS workspace and run the program to see which lucky lottery numbers you can produce. This program has several important features that are noted by the comments\u00a0 \/* comment *\/\u00a0 within the code.\r\n\r\n&nbsp;\r\n<div>\r\n\r\n\/* NOTE THE CALL STREAMINIT(13); Command\r\n\r\n&nbsp;\r\n\r\nTo create reproducible random numbers, seed the system with the STREAMINIT command. If RAND() is used without an initial STREAMINIT call, the program will use the value of the system clock and the random numbers will change each time the program is run.\r\n\r\n*\/\r\n\r\n&nbsp;\r\n\r\nDATA LOTTO1;\r\n\r\n&nbsp;\r\n\r\n* CALL STREAMINIT(13); \/* REMOVE THE LEADING * TO CREATE REPRODUCIBLE NUMBERS *\/\r\n\r\nDO UNTIL (CHOICE1 NE 0);\r\n\r\nCHOICE1 = RAND(\"NORMAL\")*1000000000000;\r\n\r\nCHOICE1 = ROUND(CHOICE1);\r\n\r\nCHOICE1 = 1+(MOD(CHOICE1,49));\r\n\r\nCHOICE1 = ABS(CHOICE1);\r\n\r\nEND;\r\n\r\n* CALL STREAMINIT(999);\r\n\r\nDO UNTIL (CHOICE2 NE CHOICE1 AND CHOICE2 NE 0);\r\n\r\nCHOICE2 = RAND(\"NORMAL\")*1000000000000;\r\n\r\nCHOICE2 = ROUND(CHOICE2);\r\n\r\nCHOICE2 = 1+(MOD(CHOICE2,49));\r\n\r\nCHOICE2 = ABS(CHOICE2);\r\n\r\nEND;\r\n\r\n* CALL STREAMINIT(28);\r\n\r\nDO UNTIL (CHOICE3 NE CHOICE2 AND CHOICE3 NE CHOICE1 AND CHOICE3 NE 0);\r\n\r\nCHOICE3 = RAND(\"NORMAL\")*1000000000000;\r\n\r\nCHOICE3 = ROUND(CHOICE3);\r\n\r\nCHOICE3 = 1+(MOD(CHOICE3,49));\r\n\r\nCHOICE3 = ABS(CHOICE3);\r\n\r\nEND;\r\n\r\n* CALL STREAMINIT(218);\r\n\r\nDO UNTIL (CHOICE4 NE CHOICE3 AND CHOICE4 NE CHOICE2 AND CHOICE4 NE CHOICE1 AND CHOICE4 NE 0);\r\n\r\nCHOICE4 = RAND(\"NORMAL\")*1000000000000;\r\n\r\nCHOICE4 = ROUND(CHOICE4);\r\n\r\nCHOICE4 = 1+(MOD(CHOICE4,49));\r\n\r\nCHOICE4 = 
ABS(CHOICE4);\r\n\r\nEND;\r\n\r\n&nbsp;\r\n\r\n* CALL STREAMINIT(28);\r\n\r\nDO UNTIL (CHOICE5 NE CHOICE4 AND CHOICE5 NE CHOICE3 AND CHOICE5 NE CHOICE2 AND CHOICE5 NE CHOICE1 AND CHOICE5 NE 0);\r\n\r\nCHOICE5 = RAND(\"NORMAL\")*1000000000000;\r\n\r\nCHOICE5 = ROUND(CHOICE5);\r\n\r\nCHOICE5 = 1+(MOD(CHOICE5,49));\r\n\r\nCHOICE5 = ABS(CHOICE5);\r\n\r\nEND;\r\n\r\n&nbsp;\r\n\r\n* CALL STREAMINIT(68);\r\n\r\nDO UNTIL (CHOICE6 NE CHOICE5 AND CHOICE6 NE CHOICE4 AND CHOICE6 NE CHOICE3 AND CHOICE6 NE CHOICE2 AND CHOICE6 NE CHOICE1 AND CHOICE6 NE 0);\r\n\r\nCHOICE6 = RAND(\"NORMAL\")*1000000000000;\r\n\r\nCHOICE6 = ROUND(CHOICE6);\r\n\r\nCHOICE6 = 1+(MOD(CHOICE6,49));\r\n\r\nCHOICE6 = ABS(CHOICE6);\r\n\r\nEND;\r\n\r\nRUN;\r\n\r\nPROC PRINT; VAR CHOICE1 CHOICE2 CHOICE3 CHOICE4 CHOICE5 CHOICE6;\r\n\r\nRUN;\r\n\r\n<\/div>\r\n&nbsp;\r\n<div align=\"center\">\r\n<table>\r\n<thead>\r\n<tr>\r\n<td><strong>Obs<\/strong><\/td>\r\n<td><strong>CHOICE1<\/strong><\/td>\r\n<td><strong>CHOICE2<\/strong><\/td>\r\n<td><strong>CHOICE3<\/strong><\/td>\r\n<td><strong>CHOICE4<\/strong><\/td>\r\n<td><strong>CHOICE5<\/strong><\/td>\r\n<td><strong>CHOICE6<\/strong><\/td>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td><strong>1<\/strong><\/td>\r\n<td>37<\/td>\r\n<td>32<\/td>\r\n<td>48<\/td>\r\n<td>11<\/td>\r\n<td>26<\/td>\r\n<td>30<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<\/div>\r\n&nbsp;\r\n<div>\r\n\r\n<strong>So then how many combinations of six numbers are we really talking about?<\/strong>\r\n\r\n<\/div>\r\nTo compute the number of possible combinations of 6 numbers from the 49 numbers, we need to use the following combinatorial (or factorial) formula. 
We have 49 numbers choose 6.\u00a0 The number 49 represents the population from which the sample of 6 numbers will be chosen.\u00a0 We write the formula for determining the combinations using the following combinatorial equation:\r\n\r\n[latex]\\binom{49}{6}[\/latex]\r\n\r\nor we may wish to write the formula using a factorial format as:\r\n\r\n[latex]\\binom{49}{6} = \\frac{49!}{6!\\,(49-6)!}[\/latex]\r\n\r\nTherefore the number of all possible combinations of 6 numbers from a set of 49 consecutive numbers is:\r\n\r\n[latex]\\frac{49!}{6!\\,43!} = 13{,}983{,}816[\/latex]\r\n\r\nYet you won't be happy unless all of your numbers are chosen, so what REALLY is the chance that all six of your numbers will be selected by the lottery machine?\u00a0 Since you bought only one ticket, your chance of winning the lottery is 1 in 13,983,816, or\r\n\r\n[latex]\\frac{1}{13{,}983{,}816} \\approx 0.0000000715[\/latex]\r\n\r\nThe value 0.0000000715 represents the probability associated with your set of numbers.\r\n\r\nWhile this example is fairly straightforward, it is somewhat abstract and is not guaranteed to make you a winner. It does, however, present the basic concepts involved in producing a value for a variable that is generated randomly from the set of all possible outcomes. Let\u2019s now turn our attention to an applied health example and see how we can use random number generators and computer simulation to create a dataset that exemplifies a real-world example.\r\n\r\n<hr \/>\r\n\r\n<h1>An Applied Health Example using Simulated Data<\/h1>\r\n<\/div>\r\nConsider for example that you are asked to assess the benefits of a 12-week pulmonary rehabilitation program, consisting of exercise and education, for a cohort of individuals with varying classifications of chronic obstructive pulmonary disease (COPD). 
The intake data include demographic variables such as the individual\u2019s age, sex, height, and weight; and performance data such as the distance walked in 6 minutes, a physician based rating of COPD, the program participant\u2019s self reported smoking status, years smoked; and physiological measures such as forced expiratory volume in 1 second, and resting heart rate.\r\n\r\nIn the following example we will generate data artificially using random number generators written with SAS code.\u00a0 In this way we can produce a simulated dataset that we can then use to observe what might happen if we were to actually conduct a research study with the same parameters and considerations.\r\n\r\nUsing random number generators we create the data set to produce a set of values representing 20 individuals (a random selection of males and females). The variables used in the table along with the variable types and the possible minimum and maximum range for each variable are presented in Table 6.1 below.\r\n\r\nTable 6.1 Variables Used To Produce A Sample Of Raw Data For The COPD Clinic\r\n\r\n&nbsp;\r\n<table>\r\n<tbody>\r\n<tr>\r\n<td><strong>Variable Name &amp; Variable label<\/strong><\/td>\r\n<td><strong>Variable Type<\/strong><\/td>\r\n<td><strong>Range of Values<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Patient identification\u00a0 -- Px id<\/td>\r\n<td>discrete<\/td>\r\n<td>1 to 20<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Age in years. 
-- <em>age<\/em><\/td>\r\n<td>discrete<\/td>\r\n<td>45 to 75<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Sex\u00a0 -- <em>sex<\/em><\/td>\r\n<td>discrete<\/td>\r\n<td>m: male; f: female<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Height\u00a0 -- <em>ht<\/em><\/td>\r\n<td>continuous<\/td>\r\n<td>1.5 m to 2.0 m<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Weight -- <em>wt<\/em><\/td>\r\n<td>continuous<\/td>\r\n<td>50 kg to 150 kg<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Distance walked in 6 minutes -- <em>walkdist<\/em><\/td>\r\n<td>continuous<\/td>\r\n<td>54 m to 150 m<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Rating of COPD severity -- <em>severity<\/em><\/td>\r\n<td>discrete<\/td>\r\n<td>MI: mild; MO: moderate;\r\n\r\nS: severe<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Smoking status -- <em>smoke<\/em><\/td>\r\n<td>discrete<\/td>\r\n<td>S: smoker; EX: ex-smoker;\r\n\r\nNON: never smoked<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Years as a smoker -- <em>yrsmoke<\/em><\/td>\r\n<td>continuous<\/td>\r\n<td>&lt;1 to max years smoked<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Forced expiratory vol in 1 sec -- <em>FEV1<\/em><\/td>\r\n<td>continuous<\/td>\r\n<td>1.5 to 4.0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>Resting heart rate -- <em>rhr<\/em><\/td>\r\n<td>continuous<\/td>\r\n<td>50 to 100<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<div>6.3.1 Creating your dataset with a random number generator<\/div>\r\nHere we will use SAS code to produce the table of random numbers for each of the variables listed above. 
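\r\n\r\nAs a minimal sketch of how ranges like those in Table 6.1 can be imposed, the following step draws a height and an age for 20 cases by rescaling RAND(\"UNIFORM\"). This is an illustrative alternative and not the approach developed in the rest of this section, which draws from the normal distribution; the data set name RANGEDEMO and the choice of the uniform distribution are assumptions made only for this example.\r\n<div>\r\n\r\nDATA RANGEDEMO;\r\nCALL STREAMINIT(13); \/* SEED ONCE, BEFORE THE FIRST RAND CALL *\/\r\nDO ID=1 TO 20;\r\n\/* CONTINUOUS: HEIGHT FROM 1.5 M TO 2.0 M, ROUNDED TO 2 DECIMALS *\/\r\nHT=ROUND(1.5+(2.0-1.5)*RAND(\"UNIFORM\"),0.01);\r\n\/* DISCRETE: WHOLE-NUMBER AGE FROM 45 TO 75 INCLUSIVE *\/\r\nAGE=45+FLOOR((75-45+1)*RAND(\"UNIFORM\"));\r\nOUTPUT;\r\nEND;\r\nRUN;\r\nPROC PRINT DATA=RANGEDEMO; VAR ID AGE HT;\r\nRUN;\r\n\r\n<\/div>\r\nRescaling a value drawn on the interval (0,1) keeps every case inside the stated minimum and maximum, so no additional filter is needed for these two variables.\r\n\r\n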
Recent developments in high-speed computing and the creation of the Mersenne-Twister Random Number Generator, which is now used by SAS, have led to the creation of the RAND() function.\u00a0 As stated in the SAS Knowledge Base (SAS(R) 9.3 Functions and CALL Routines), the RAND function can generate random numbers for a distribution specified by the user.\r\n\r\nIn the following example the random number generator was seeded with the statement:\u00a0 call streaminit(n);\r\n<div>\r\n\r\n\/* where n refers to any number you wish to use *\/\r\n\r\n<\/div>\r\n&nbsp;\r\n\r\nHere we specify that the data we generate will be drawn from the normal distribution.\r\n<div>\r\n\r\n\u2026 RAND(\"normal\")\u2026\r\n\r\n<\/div>\r\n&nbsp;\r\n<div>\r\n\r\ncode snippet:\r\n\r\ndata sasrng;\r\n\r\ncall streaminit(13);\r\n\r\n\/* here we use n=13 to seed the RNG *\/\r\n\r\n<\/div>\r\n&nbsp;\r\n\r\nSAS User Notes provide an explanation of the RAND() function as follows:\r\n<div>\r\n\r\nx = RAND(\"NORMAL\", \u03b8, \u03bb)\r\n\r\nwhere x is an observation from the normal distribution with a mean of \u03b8 and a standard deviation of \u03bb that has the following probability density function:\r\n\r\n[latex]f(x) = \\frac{1}{\\lambda\\sqrt{2\\pi}} \\exp\\left(-\\frac{(x-\\theta)^2}{2\\lambda^2}\\right)[\/latex]\r\n\r\nRange: \u2212\u221e &lt; x &lt; \u221e\r\n\r\n\u03b8: is the mean parameter; default: 0\r\n\r\n\u03bb: is the standard deviation parameter; default: 1\r\n\r\nRange: \u03bb &gt; 0.\r\n\r\n&nbsp;\r\n\r\n<\/div>\r\n\r\nOnce we have established the parameters for random number selection we begin writing the SAS program to create random number generators as we would for any SAS program.\u00a0 Start by stating the options that you would like included in the output and then name the workspace using normal SAS code.\r\n\r\n&nbsp;\r\n<div>\r\n\r\nOPTIONS PAGESIZE=63 LINESIZE=90 DATE;\r\n\r\nDATA SASRNG;\r\n\r\n<\/div>\r\n&nbsp;\r\n\r\nOur next statement is to create an array. An array is a set of variables that generally have some commonality and that you wish to process together. 
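\r\n\r\nAs a generic illustration, and assuming hypothetical variables Q1, Q2, and Q3 that are not part of the COPD example, the sketch below groups three variables into an array named ITEMS and fills them inside a loop.\r\n<div>\r\n\r\nDATA ARRAYDEMO;\r\nARRAY ITEMS(3) Q1 Q2 Q3; \/* HYPOTHETICAL ARRAY OF THREE VARIABLES *\/\r\nDO I=1 TO 3;\r\nITEMS(I)=I*10; \/* ITEMS(1) IS Q1, ITEMS(2) IS Q2, ITEMS(3) IS Q3 *\/\r\nEND;\r\nDROP I;\r\nRUN;\r\nPROC PRINT DATA=ARRAYDEMO;\r\nRUN;\r\n\r\n<\/div>\r\nAfter this step Q1=10, Q2=20, and Q3=30, because each array reference points to the variable in that position.\r\n\r\n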
In our example, we will start by creating an array that we name\u00a0 <strong>SCORES<\/strong>, and which has three elements or variables.\r\n<div>\r\n\r\nARRAY SCORES SEX SEVERITY SMOKING;\r\n\r\n<\/div>\r\n&nbsp;\r\n\r\nBy naming the array, as we have here (<strong>SCORES<\/strong>) we can refer to the array <strong>SCORES<\/strong> later to reference the specific elements that are contained within. For example, since the array has three elements, then <strong>SCORES<\/strong>(1) refers to the first element\u2014the participant\u2019s <em>SEX<\/em>, while <strong>SCORES<\/strong>(2) refers to the second element\u2014the <em>SEVERITY<\/em> of the COPD condition, and <strong>SCORES<\/strong>(3) refers to the third element\u2014the patient\u2019s <em>SMOKING<\/em> status.\r\n\r\nOnce we create the workspace in SAS, we next use the do-loop statements to generate a data set consisting of 20 cases. The first do-loop (<strong>DO<\/strong> K=<strong>1<\/strong> TO <strong>20<\/strong>) tells SAS to execute the statements within the loop 20 times.\r\n\r\nThe second do-loop (<strong>DO<\/strong> I=<strong>1<\/strong> TO <strong>3<\/strong>) is contained within the first loop and is designed to provide data specifically for the variables <em>SEX, SEVERITY, and SMOKING<\/em>.\r\n<div>\r\n\r\nDO K=1 TO 20;\r\n\r\nDO I=1 TO 3;\r\n\r\n<\/div>\r\nFigure 6.1 Functions of The Do-Loop To Generate Random Numbers For The Array SCORES\r\n\r\nFinally, we end the do loops with the following statement sequence.\r\n\r\nEND;\r\n\r\nOUTPUT;\r\n\r\nEND;\r\n\r\n&nbsp;\r\n\r\nThe first END; statement closes the inside loop that begins with DO I=1 TO 3; the\u00a0 OUTPUT; \u00a0statement then writes the generated values for each participant to the data set as one observation; and the outside loop (DO K=<strong>1<\/strong> TO <strong>20<\/strong>;) is closed with the second\u00a0 END; statement.\r\n\r\n&nbsp;\r\n\r\nFigure 6.2\u00a0 Closing the do-loops and producing 
output\r\n\r\n&nbsp;\r\n\r\n&nbsp;\r\n\r\nThe actual statement sequence to generate a random number for each of the variables in the array is shown here as a three-step process beginning by seeding the Random Number Generator (RNG) with CALL STREAMINIT(N); where the (N) can be any number you wish to use.\u00a0 In this first example here we used the number 13 (only because 13 is MY lucky number!).\r\n<div>\r\n\r\nCALL STREAMINIT(13);\r\n\r\n<\/div>\r\n&nbsp;\r\n\r\nThe call statement initiates or seeds the random number generator.\r\n<div>\r\n\r\nOPTIONS PAGESIZE=63 LINESIZE=90 DATE;\r\n\r\nDATA SASRNG;\r\n\r\n&nbsp;\r\n\r\nARRAY SCORES SEX SEVERITY SMOKING;\r\n\r\nDO K=1 TO 20;\r\n\r\n&nbsp;\r\n\r\nDO I=1 TO 3;\r\n\r\n&nbsp;\r\n\r\nCALL STREAMINIT(13);\r\n\r\nSCORES(I)=RAND(\"NORMAL\")*100000000;\r\n\r\nSCORES(I)=ROUND(SCORES(I));\r\n\r\nSCORES(I)=1+ABS((MOD(SCORES(I),333)));\r\n\r\n<\/div>\r\n&nbsp;\r\n\r\nThis sequence of statements invokes the random number generator and places a value in each element of the array (i.e., the list of variables).\r\n\r\nAfter generating the random numbers for each variable in the array \u00a0<strong>SCORES<\/strong> (SEX, SEVERITY, SMOKING) we then process the number with a logic filter so that it makes sense in relation to the range of scores that we would expect to see for each given variable.\r\n\r\nFor example, if the RNG produces a value of 75 for the variable sex, then what does that mean?\r\n\r\nWell actually it is meaningless until you assign the meaning.\r\n\r\nWe assign meaning to the values within a variable using logic statements. For each of the elements (variables) within the array we process the RNG value with logic statements that will make the data relevant to our variables.\r\n\r\nFor example, the logic statements to convert the RNG values for sex are shown here. In this situation we convert the numeric variable for sex to a text variable that we call sex. 
Since we have text labels that extend beyond 8 characters we use the length statement with the $ to ensure that the full length of the text label is used.\r\n<div>\r\n\r\n\/* LOGIC STATEMENTS FOR THE VARIABLE: SEX *\/\r\n\r\n&nbsp;\r\n\r\nLENGTH SEX $12;\r\n\r\nIF SEX &gt; 175 THEN SEX = 'NOT STATED';\r\n\r\nIF SEX &gt;54 AND SEX&lt;175 THEN SEX = 'FEMALE';\r\n\r\nIF SEX &gt;0 AND SEX&lt;55 THEN SEX = 'MALE';\r\n\r\n<\/div>\r\n&nbsp;\r\n\r\nIn SAS, the logic statements use the if-then conventional approach. That is, for every IF statement we use a corresponding THEN statement. In this way we process the RNG values to be within the range of logical outcomes for the variable that we are creating.\r\n<div>\r\n\r\n\/* LOGIC STATEMENTS TO CREATE CATEGORIES FOR THE VARIABLE: COPD SEVERITY TYPE *\/\r\n\r\nIF SEVERITY &gt;55\u00a0 THEN SEVERITY = 3;\r\n\r\nIF SEVERITY &gt;27 AND SEVERITY&lt;56 THEN SEVERITY = 2;\r\n\r\nIF SEVERITY &gt;4 AND SEVERITY&lt;28 THEN SEVERITY = 1;\r\n\r\n&nbsp;\r\n\r\n\/* LOGIC STATEMENTS TO CREATE CATEGORIES FOR THE VARIABLE: SMOKING STATUS *\/\r\n\r\nIF SMOKING &gt;55\u00a0 THEN SMOKING = 3;\r\n\r\nIF SMOKING &gt;27 AND SMOKING&lt;56 THEN SMOKING = 2;\r\n\r\nIF SMOKING &gt;4 AND SMOKING&lt;28 THEN SMOKING = 1;\r\n\r\n<\/div>\r\n&nbsp;\r\n\r\nNext we create RNGs for the remaining seven variables that we plan to include in the analysis. We do not need to include these in the array and can simply generate the values when SAS walks through the outer do loop. 
The independent execution of the rand(\u201cnormal\u201d) function can run with a new seed and a new maximum score.\u00a0 Notice that these follow the array processing statements.\r\n\r\nThe continuous discrete variables were age, years smoked, resting heart rate, weight, and distance walked in 6 minutes (measured in metres).\r\n\r\n\/* CONTINUOUS DISCRETE VARIABLE AGE *\/\r\n\r\nCALL STREAMINIT(13); AGE=RAND(\"NORMAL\")*1000000000000;\r\n\r\nAGE=40+ABS((MOD(AGE,62))); AGE=ROUND(AGE);\r\n\r\nIF AGE&gt;72 THEN AGE=35+ABS((MOD(AGE,50)));\r\n\r\n&nbsp;\r\n\r\n\/* CONTINUOUS DISCRETE VARIABLE YRSMOKE *\/\r\n\r\nCALL STREAMINIT(13); YRSMOKE=RAND(\"NORMAL\")*1000000000000;\r\n\r\nYRSMOKE=1+ABS((MOD(YRSMOKE,12))); YRSMOKE=ROUND(YRSMOKE);\r\n\r\n&nbsp;\r\n\r\n\/* CONTINUOUS DISCRETE VARIABLE RHR *\/\r\n\r\nCALL STREAMINIT(69); RHR=RAND(\"NORMAL\")*1000000000000;\r\n\r\nRHR=54+ABS((MOD(RHR,80))); RHR=ROUND(RHR);\r\n\r\n&nbsp;\r\n\r\n\/* CONTINUOUS DISCRETE VARIABLE WT *\/\r\n\r\nCALL STREAMINIT(45); WT=RAND(\"NORMAL\")*1000000000000;\r\n\r\nWT=45+ABS((MOD(WT,65)));WT=ROUND(WT,0.01);\r\n\r\nIF WT&gt;85 THEN WT=55+ABS((MOD(WT,12)));\r\n\r\n&nbsp;\r\n\r\n\/* CONTINUOUS DISCRETE VARIABLE WALKDIST *\/\r\n\r\nCALL STREAMINIT(69);WALKDIST=RAND(\"NORMAL\")*1000000000000;\r\n\r\nWALKDIST=54+ABS((MOD(WALKDIST,80))); WALKDIST=ROUND(WALKDIST);\r\n\r\n&nbsp;\r\n\r\nNext we created the continuous decimal variables. 
Again these statements are placed within the do loops to produce a full set of 20 outputs.\r\n\r\n\/* CONTINUOUS DECIMAL VARIABLE FEV1 *\/\r\n\r\nCALL STREAMINIT(99); FEV1=RAND(\"NORMAL\")*1000000000000;\r\n\r\nFEV1=1+ABS((MOD(FEV1,3)));FEV1=ROUND(FEV1,0.01);\r\n\r\n&nbsp;\r\n\r\n\/* CONTINUOUS DECIMAL VARIABLE HT *\/\r\n\r\nCALL STREAMINIT(21); HT=RAND(\"NORMAL\")*1000000000000;\r\n\r\nHT=1+ABS((MOD(HT,1.1)));HT=ROUND(HT,0.01);\r\n\r\nIF HT&lt;1.5 THEN HT=1.2+ABS((MOD(HT,1.1)));","rendered":"<p>Research using simulated data is often done to predict future events based on real-world data. Once accepted as a reliable method to contribute to decision making, there are several applications in which computer simulation could be used to provide plausible outcome scenarios prior to actually making a decision to advance in a specific direction.\u00a0 For example, using computer simulation tools, administrators can create financial forecasting models based on selected expenditure statements and health sector administrative data to estimate future costs and establish budgetary guidelines that are within the appropriate tolerances for a given fiduciary system. Similarly, health researchers can use demographic information about a cohort within the population or about the health care workforce to predict how many new health care professionals we will need in the years to come. This information can then be used to make decisions about the number of students that universities and colleges should accept into their programs in order to meet the predicted needs.<\/p>\n<p>Creating and using a simulated dataset is also an excellent way to practice the application of statistical methods without having to collect real-world data. 
That is, we can create a simulated dataset by first establishing the set of independent and dependent variables that are of interest to us in our research project, and then establishing the range for each response within the variables of interest.<\/p>\n<p>For example, if in our research study we were interested in measuring the effect of a drug versus placebo on reaction time, then our study could be as simple as having three variables: ID, DRUG, and REACTION_TIME. Given that the ID is simply a counter which SAS will assign as an observation number, we need not be concerned with the scoring of ID. Likewise, given that DRUG can either be DRUG or PLACEBO, we know that this is what we refer to as a grouping variable (in some fields this is referred to as a DUMMY variable), and the range for this variable will be DRUG=1\u00a0 and PLACEBO=2. Further, from the literature we understand that the reaction time will have an upper threshold value, which we use to establish the range of outcomes for our response in our computer simulation data set.<\/p>\n<p>With all of this information known before we begin, we can use a random number generator with SAS to create a dataset that is estimated from the set of values we provide to the computer.<\/p>\n<p>Here is our first example of creating random numbers with the SAS random number generator functions. In this first example, we are simply testing the code for the random number function. 
Here we used 3 lines of SAS code.<br \/>\n<code><br \/>\n<span style=\"color: blue\">DATA SASRNG;<\/span><br \/>\n<span style=\"color: purple\">\/* THE SEED FOR THE RANDOM NUMBER *\/<\/span><br \/>\n<span style=\"color: blue\">call streaminit(999); <\/span><br \/>\n<span style=\"color: purple\"> \/* CREATE THE VARIABLE GROUP *\/<\/span><br \/>\n<span style=\"color: blue\">group=RAND(\"normal\")*1000000000000;<\/span><br \/>\n<span style=\"color: purple\"> \/* RUN THE RNG AND PRINT OUTPUT *\/<\/span><br \/>\n<span style=\"color: blue\">run;<br \/>\nproc print; var group;<br \/>\nrun;<\/span><br \/>\n<\/code><\/p>\n<p>The result of this code is a random number but the number has no real meaning to us except that it shows us SAS generated a value for the variable GROUP.<\/p>\n<table>\n<tbody>\n<tr>\n<td>Obs<\/td>\n<td>group<\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td>-4.8095E1<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Now we want to modify that output so that it has meaning. In the following SAS code we will first create the random number, establish the sign of the output to be positive by using the <span style=\"color: purple\">ABS <i>absolute number<\/i> function <\/span>and then set the range of the output by using the MOD functions (<span style=\"color: purple\"><i>MODULO MATH<\/i> function <\/span>). 
In the application of modulo math here we are setting the lowest value to 1 and the ceiling value to 2.<\/p>\n<p>Notice here that we are generating a set of 20 values, and just to be sure that we restrict the output to 1 and 2 we add the logic statement <span style=\"color: purple\"> if group=3 then group=1; <\/span><\/p>\n<table>\n<tbody>\n<tr>\n<td><code><br \/>\n<span style=\"color: blue\">DATA sasrng;<br \/>\nDO K=1 TO 20;<br \/>\ncall streaminit(999);<br \/>\ngroup=RAND(\"normal\")*1000000000000;<br \/>\ngroup=1+ABS((mod(group,2)));<br \/>\ngroup=ROUND(group);<br \/>\nif group=3 then group=1;<br \/>\noutput;<br \/>\nend;<br \/>\nrun;<br \/>\nproc print; var group;<br \/>\nrun;<\/span><\/code><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The output from this SAS code is presented here:<\/p>\n<table>\n<tbody>\n<tr>\n<td style=\"background-color: skyblue;\">Obs<\/td>\n<td>  group<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">1 <\/td>\n<td> 2<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">2 <\/td>\n<td> 2<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">3<\/td>\n<td>  2<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">4<\/td>\n<td>  1<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">5<\/td>\n<td>  2<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">6<\/td>\n<td>  1<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">7<\/td>\n<td>  2<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">8<\/td>\n<td>  2<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">9<\/td>\n<td>  1<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">10<\/td>\n<td>  1<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">11<\/td>\n<td>  2<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">12<\/td>\n<td>  1<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">13<\/td>\n<td>  2<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">14<\/td>\n<td>  
2<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">15<\/td>\n<td>  2<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">16<\/td>\n<td>  1<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">17<\/td>\n<td>  1<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">18<\/td>\n<td>  2<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">19<\/td>\n<td>  1<\/td>\n<\/tr>\n<tr>\n<td style=\"background-color: skyblue;\">20<\/td>\n<td> 2<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>This is an amazing learning opportunity as it enables you to create, albeit artificially, a complete dataset with the variables in which you are interested.\u00a0 The experience is invaluable as it provides you with the opportunity to critically evaluate both the strategies for input as well as the interpretation for output. Although not required, working through a computer-simulated dataset during the development of your research proposal can help you develop your data analysis plan, and enable you to become familiar with the ranges and nuances of the important variables.<\/p>\n<p>In the following program, we will generate a set of values based on the application of SAS random number generator functions.<br \/>\nFor the variable age, which is a discrete random variable, we will set a minimum age of 18 and a maximum age of 72. The data will be generated from the normal distribution to produce a variable for group and a variable for reaction time. Each of these variables represents a different type of important variable that you might encounter in your data.<br \/>\nFor example, the variable age is discrete and can have a range from 1 to 120; sex will be alphanumeric and can be of four types (M, F, O, U) where O is other and U is undisclosed; the group will be limited to a binary output (1,2), and reaction time will represent a continuous random variable. 
Here we will control the behaviour of the random number generator by setting its parameters so that our output falls within a specific range.<\/p>\n<p>.. Chapter 42 RNG_PRG01<\/p>\n<table>\n<tbody>\n<tr>\n<td><code><br \/>\n<span style=\"color: blue\">DATA sasrng;<br \/>\nDO K=1 TO 20;<br \/>\ncall streaminit(999);       <\/span><br \/>\n<span style=\"color: purple\">\/* set random number seed *\/<br \/><\/span><br \/>\n<span style=\"color: blue\">group=RAND(\"normal\")*1000000000000;<br \/>\ngroup=1+ABS((mod(group,2)));<br \/>\ngroup=ROUND(group);<br \/>\nif group=3 then group=1;<\/span><br \/>\n<span style=\"color: purple\">\/* here we use the UNIFORM distribution as the source<br \/>\nfor the random number function *\/<\/span><br \/>\n<span style=\"color: blue\"><br \/>\n   U = RAND(\"Uniform\"); <\/span><br \/>\n<span style=\"color: purple\">\/* u ~ U(0,1) *\/<br \/> <br \/>\n\/* Next we reassign the random generated scores<br \/>\n   and establish the groups using if then logic  *\/ <\/span><br \/>\n<span style=\"color: blue\"> LENGTH sex $12;<br \/>\n if U LE 0.25 then sex = 'other';<br \/>\n     if U GT 0.25 and U LE 0.5 then sex = 'undisclosed';<br \/>\n    if U GT 0.5 and U LE 0.75 then sex = 'female';<br \/>\n    if U GT 0.75 then sex = 'male';<\/span><br \/>\n<span style=\"color: purple\"><br \/>\n\/* Continuous discrete variable age *\/<\/span><br \/>\n<span style=\"color: blue\"> call streaminit(13);<br \/>\n<br \/> age=RAND(\"normal\")*1000000000000;<br \/>\nage=18+ABS((mod(age,50))); <br \/>\nage=ROUND(age);<br \/>\nif age GT 72 then age=15+ABS((mod(age,50)));<\/span><br \/>\n<span style=\"color: PURPLE\"> \/* Continuous decimal variable REACTION TIME *\/<\/span><br \/>\n<span style=\"color: blue\"> call streaminit(99);<br \/>\n react=RAND(\"normal\")*1000000000000;<br \/>\nreact=1+ABS((mod(react,3)));<br \/>\nreact=ROUND(react,0.01);<br \/>\n<br \/>output; end;<br \/>\nrun;<br \/>\nproc print; var group sex age react;<br 
\/>\nrun; <\/span><\/code>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>  <strong>The program above produced the following output. <\/strong><\/p>\n<div style=\"margin: auto;\">\n<table>\n<thead>\n<tr>\n<td><strong>OBS<\/strong><\/td>\n<td><strong>Group<\/strong><\/td>\n<td><strong>Sex<\/strong><\/td>\n<td><strong>Age<\/strong><\/td>\n<td><strong>React<\/strong><\/td>\n<td><strong>|<\/strong><\/td>\n<td><strong>OBS<\/strong><\/td>\n<td><strong>Group<\/strong><\/td>\n<td><strong>Sex<\/strong><\/td>\n<td><strong>Age<\/strong><\/td>\n<td><strong>React<\/strong><\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><strong>1<\/strong><\/td>\n<td>2<\/td>\n<td>male<\/td>\n<td>17<\/td>\n<td>2.55<\/td>\n<td><strong>|<\/strong><\/td>\n<td><strong>11<\/strong><\/td>\n<td>2<\/td>\n<td>male<\/td>\n<td>21<\/td>\n<td>3.97<\/td>\n<\/tr>\n<tr>\n<td><strong>2<\/strong><\/td>\n<td>2<\/td>\n<td>male<\/td>\n<td>41 <\/td>\n<td>1.85<\/td>\n<td><strong>|<\/strong><\/td>\n<td><strong>12<\/strong><\/td>\n<td>2<\/td>\n<td>male<\/td>\n<td>32<\/td>\n<td>3.66<\/td>\n<\/tr>\n<tr>\n<td><strong>3<\/strong><\/td>\n<td>1<\/td>\n<td>female<\/td>\n<td>16<\/td>\n<td>1.00<\/td>\n<td><strong>|<\/strong><\/td>\n<td><strong>13<\/strong><\/td>\n<td>1<\/td>\n<td>female<\/td>\n<td>51<\/td>\n<td>2.15<\/td>\n<\/tr>\n<tr>\n<td><strong>4<\/strong><\/td>\n<td>1<\/td>\n<td>female<\/td>\n<td>59<\/td>\n<td>1.34<\/td>\n<td><strong>|<\/strong><\/td>\n<td><strong>14<\/strong><\/td>\n<td>2<\/td>\n<td>female<\/td>\n<td>29<\/td>\n<td>2.17<\/td>\n<\/tr>\n<tr>\n<td><strong>5<\/strong><\/td>\n<td>1<\/td>\n<td>male<\/td>\n<td>59<\/td>\n<td>1.15<\/td>\n<td><strong>|<\/strong><\/td>\n<td><strong>15<\/strong><\/td>\n<td>1<\/td>\n<td>undisclosed<\/td>\n<td>59<\/td>\n<td>1.30<\/td>\n<\/tr>\n<tr>\n<td><strong>6<\/strong><\/td>\n<td>2<\/td>\n<td>other<\/td>\n<td>56<\/td>\n<td>2.15<\/td>\n<td><strong>|<\/strong><\/td>\n<td><strong>16<\/strong><\/td>\n<td>2<\/td>\n<td>other<\/td>\n<td>32<\/td>\n<td>2.13<\/td>\n<\/tr>\n<tr>\n<td><strong>7<\/strong>
<\/td>\n<td>2<\/td>\n<td>other<\/td>\n<td>45<\/td>\n<td>3.09<\/td>\n<td><strong>|<\/strong><\/td>\n<td><strong>17<\/strong><\/td>\n<td>2<\/td>\n<td>male<\/td>\n<td>23<\/td>\n<td>2.45<\/td>\n<\/tr>\n<tr>\n<td><strong>8<\/strong><\/td>\n<td>2<\/td>\n<td>other<\/td>\n<td>30<\/td>\n<td>1.61<\/td>\n<td><strong>|<\/strong><\/td>\n<td><strong>18<\/strong><\/td>\n<td>2<\/td>\n<td>female<\/td>\n<td>24<\/td>\n<td>2.88<\/td>\n<\/tr>\n<tr>\n<td><strong>9<\/strong><\/td>\n<td>1<\/td>\n<td>male<\/td>\n<td>30<\/td>\n<td>2.04<\/td>\n<td><strong>|<\/strong><\/td>\n<td><strong>19<\/strong><\/td>\n<td>2<\/td>\n<td>female<\/td>\n<td>19<\/td>\n<td>1.51<\/td>\n<\/tr>\n<tr>\n<td><strong>10<\/strong><\/td>\n<td>2<\/td>\n<td>undisclosed<\/td>\n<td>33<\/td>\n<td>3.44<\/td>\n<td><strong>|<\/strong><\/td>\n<td><strong>20<\/strong><\/td>\n<td>2<\/td>\n<td>male<\/td>\n<td>18<\/td>\n<td>1.00<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<hr \/>\n<p>These data can be used in later statistical analyses.<\/p>\n<h2 style=\"text-align: left\">Using a random number generator to produce ICD-9 codes<\/h2>\n<p>In this next example, we will produce a randomly generated dataset consisting of ICD-9 codes. In this SAS program we first create the data, then we organize the output into categories, and finally we produce a horizontal bar chart of the relative percentile values for each category of \u00a0ICD-9 codes.<\/p>\n<p>Consider the following program to evaluate the primary diagnosis for a group of patients visiting a healthcare clinic. \u00a0The data are generated using a customized random number generator that generates data in the form of ICD-9<a href=\"#_ftn1\">[1]<\/a> codes. Since the codes are based on a continuous number line several unique values can be generated to represent the various sub-conditions of that which a patient may present to a healthcare provider. 
Here we simplify the organization of the codes by creating categories and using the SAS PROC FORMAT command to assign the categories to the output.<\/p>\n<p>SAS Code To Organize Categories Of ICD-9 Codes<\/p>\n<div>\nPROC FORMAT; VALUE CATFMT 1=&#8217;Infectious\/parasitic&#8217;<br \/>\n2=&#8217;Neoplasms&#8217;<br \/>\n3=&#8217; Endo\/nutri\/metabolic&#8217;<br \/>\n4=&#8217; Blood\/blood-forming organs&#8217;<br \/>\n5=&#8217; Mental disorders&#8217;<br \/>\n6=&#8217; Nervous system&#8217;<br \/>\n7=&#8217; Sense organs&#8217;<br \/>\n8=&#8217; Circulatory system&#8217;<br \/>\n9=&#8217; Respiratory system&#8217;<br \/>\n10=&#8217; Digestive system&#8217;<br \/>\n11=&#8217; Genitourinary system&#8217;<br \/>\n12=&#8217; Pregnancy\/childbirth&#8217;<br \/>\n13=&#8217; Skin &amp; subcutaneous tissue&#8217;<br \/>\n14=&#8217; MSK &amp; connective tissue&#8217;<br \/>\n15=&#8217; Congenital anomalies&#8217;<br \/>\n16=&#8217; Perinatal period Conditions&#8217;<br \/>\n17=&#8217; Injury and poisoning&#8217;<br \/>\n18=&#8217; Supplementary classification&#8217;<br \/>\n20=&#8217; Diagnosis not reported&#8217;;\n<\/div>\n<p>In this example, the random number generator produces ICD-9 scores as the dependent variable which we assign with the label (PRDIAG). The SAS code uses a DO loop to create a set of 500 scores, representing\u00a0 ICD-9 score for each patient. The data are drawn from a normal distribution at random using the command: PRDIAG=RAND(&#8220;NORMAL&#8221;)*10000; We seed the random number generator for k=500 times with the CALL STREAMINIT(K); command. 
We also set a maximum absolute value for the dependent variable using the modulus math function MOD().<\/p>\n<div>\n<p>DO K=1 TO 500;<br \/>\nCALL STREAMINIT(K); \/* SEED THE RNG ON EACH LOOP FOR K TIMES *\/<br \/>\nPRDIAG=RAND(&#8220;NORMAL&#8221;)*10000;<br \/>\n\/* SET MAX RANDOM NUMBER TO 1500 *\/<br \/>\nPRDIAG=0+ABS((MOD(PRDIAG,1500)));<br \/>\n\/* ROUND THE RANDOM NUMBERS TO 2 DECIMAL PLACES *\/<br \/>\nPRDIAG=ROUND(PRDIAG,.01);\n<\/p><\/div>\n<p>&nbsp;<\/p>\n<p>Next, we use if-then logic statements to organize the randomly generated numbers into specific categories based on specific cutpoints. Notice these commands are included within the DO loop. The loop is closed with the commands OUTPUT; followed by END;<\/p>\n<div>\n<p>IF PRDIAG = 95 OR PRDIAG = 99 THEN CATEGORY=20;<br \/>\nIF PRDIAG &gt;=001 AND PRDIAG&lt;94 THEN CATEGORY=1;<br \/>\nIF PRDIAG &gt;=96 AND PRDIAG&lt;99 THEN CATEGORY=1;<br \/>\nIF PRDIAG &gt;99 AND PRDIAG&lt;140 THEN CATEGORY=1;<br \/>\nIF PRDIAG &gt;=140 AND PRDIAG&lt;240 THEN CATEGORY=2;<br \/>\nIF PRDIAG &gt;=240 AND PRDIAG&lt;280 THEN CATEGORY=3;<br \/>\nIF PRDIAG &gt;=280 AND PRDIAG&lt;290 THEN CATEGORY=4;<br \/>\nIF PRDIAG &gt;=290 AND PRDIAG&lt;320 THEN CATEGORY=5;<br \/>\nIF PRDIAG &gt;=320 AND PRDIAG&lt;390 THEN CATEGORY=6;<br \/>\nIF PRDIAG &gt;=390 AND PRDIAG&lt;460 THEN CATEGORY=7;<br \/>\nIF PRDIAG &gt;=460 AND PRDIAG&lt;520 THEN CATEGORY=8;<br \/>\nIF PRDIAG &gt;=520 AND PRDIAG&lt;580 THEN CATEGORY=9;<br \/>\nIF PRDIAG &gt;=580 AND PRDIAG&lt;630 THEN CATEGORY=10;<br \/>\nIF PRDIAG &gt;=630 AND PRDIAG&lt;677 THEN CATEGORY=11;<br \/>\nIF PRDIAG &gt;=680 AND PRDIAG&lt;710 THEN CATEGORY=12;<br \/>\nIF PRDIAG &gt;=710 AND PRDIAG&lt;740 THEN CATEGORY=13;<br \/>\nIF PRDIAG &gt;=740 AND PRDIAG&lt;760 THEN CATEGORY=14;<br \/>\nIF PRDIAG &gt;=760 AND PRDIAG&lt;780 THEN CATEGORY=15;<br \/>\nIF PRDIAG &gt;=780 AND PRDIAG&lt;800 THEN CATEGORY=16;<br \/>\nIF PRDIAG &gt;=800 AND PRDIAG&lt;1000 THEN CATEGORY=17;<br \/>\nIF PRDIAG &gt;=1000 
THEN CATEGORY=18;<br \/>\nOUTPUT;<br \/>\nEND;\n<\/p><\/div>\n<p>&nbsp;<\/p>\n<p>The SAS commands to create a frequency distribution table are shown below. By using a frequency distribution table the author can provide a standard presentation of important summary statistics within the data set. For example, here we show the organization of the randomly generated numbers within each of the designated categories while also presenting the relative percentages that the categories represent within this data set (see Cumulative Percent column). The frequency distribution table is followed by the horizontal bar chart of the percentage of diagnoses within each category. In this figure, we included the data values at the end of each horizontal bar.<\/p>\n<div>\n<p>PROC FREQ; TABLES CATEGORY;<\/p>\n<p>TITLE1 &#8216;FREQUENCY DISTRIBUTION FOR RNG ICD-9 CODES&#8217;;<\/p>\n<p>&nbsp;<\/p>\n<p>PROC SGPLOT DATA=PRDIAG; HBAR CATEGORY\/ GROUPDISPLAY = CLUSTER<\/p>\n<p>STAT=PERCENT DATALABELFITPOLICY=NONE DATALABEL;<\/p>\n<p>XAXIS LABEL=&#8221;PERCENT OF CASES&#8221;;<\/p>\n<p>YAXIS LABEL=&#8221;DISEASE\/DIAGNOSIS CATEGORIES&#8221;;<\/p>\n<p>FORMAT CATEGORY CATFMT. 
;<\/p>\n<p>TITLE1 &#8216;PERCENT OF REPORTED DIAGNOSIS CATEGORY&#8217;; RUN;<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p><strong>Frequency distribution for RNG ICD-9 codes<\/strong><\/p>\n<p><strong>The FREQ Procedure<\/strong><\/p>\n<p>&nbsp;<\/p>\n<div style=\"margin: auto;\">\n<table>\n<thead>\n<tr>\n<td>CATEGORY<\/td>\n<td>FREQUENCY<\/td>\n<td>PERCENT<\/td>\n<td>CUMULATIVE<br \/>\nFREQUENCY<\/td>\n<td>CUMULATIVE<br \/>\nPERCENT<\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><strong>1<\/strong><\/td>\n<td>61<\/td>\n<td>12.20<\/td>\n<td>61<\/td>\n<td>12.20<\/td>\n<\/tr>\n<tr>\n<td><strong>2<\/strong><\/td>\n<td>41<\/td>\n<td>8.20<\/td>\n<td>102<\/td>\n<td>20.40<\/td>\n<\/tr>\n<tr>\n<td><strong>3<\/strong><\/td>\n<td>10<\/td>\n<td>2.00<\/td>\n<td>112<\/td>\n<td>22.40<\/td>\n<\/tr>\n<tr>\n<td><strong>4<\/strong><\/td>\n<td>2<\/td>\n<td>0.40<\/td>\n<td>114<\/td>\n<td>22.80<\/td>\n<\/tr>\n<tr>\n<td><strong>5<\/strong><\/td>\n<td>7<\/td>\n<td>1.40<\/td>\n<td>121<\/td>\n<td>24.20<\/td>\n<\/tr>\n<tr>\n<td><strong>6<\/strong><\/td>\n<td>24<\/td>\n<td>4.80<\/td>\n<td>145<\/td>\n<td>29.00<\/td>\n<\/tr>\n<tr>\n<td><strong>7<\/strong><\/td>\n<td>18<\/td>\n<td>3.60<\/td>\n<td>163<\/td>\n<td>32.60<\/td>\n<\/tr>\n<tr>\n<td><strong>8<\/strong><\/td>\n<td>22<\/td>\n<td>4.40<\/td>\n<td>185<\/td>\n<td>37.00<\/td>\n<\/tr>\n<tr>\n<td><strong>9<\/strong><\/td>\n<td>18<\/td>\n<td>3.60<\/td>\n<td>203<\/td>\n<td>40.60<\/td>\n<\/tr>\n<tr>\n<td><strong>10<\/strong><\/td>\n<td>11<\/td>\n<td>2.20<\/td>\n<td>214<\/td>\n<td>42.80<\/td>\n<\/tr>\n<tr>\n<td><strong>11<\/strong><\/td>\n<td>20<\/td>\n<td>4.00<\/td>\n<td>234<\/td>\n<td>46.80<\/td>\n<\/tr>\n<tr>\n<td><strong>12<\/strong><\/td>\n<td>16<\/td>\n<td>3.20<\/td>\n<td>250<\/td>\n<td>50.00<\/td>\n<\/tr>\n<tr>\n<td><strong>13<\/strong><\/td>\n<td>8<\/td>\n<td>1.60<\/td>\n<td>258<\/td>\n<td>51.60<\/td>\n<\/tr>\n<tr>\n<td><strong>14<\/strong><\/td>\n<td>13<\/td>\n<td>2.60<\/td>\n<td>271<\/td>\n<td>54.20<\/td>\n<\/tr>\n<tr>\n<td><strong>15<\/strong><\/td>\n<
td>7<\/td>\n<td>1.40<\/td>\n<td>278<\/td>\n<td>55.60<\/td>\n<\/tr>\n<tr>\n<td><strong>16<\/strong><\/td>\n<td>6<\/td>\n<td>1.20<\/td>\n<td>284<\/td>\n<td>56.80<\/td>\n<\/tr>\n<tr>\n<td><strong>17<\/strong><\/td>\n<td>57<\/td>\n<td>11.40<\/td>\n<td>341<\/td>\n<td>68.20<\/td>\n<\/tr>\n<tr>\n<td><strong>18<\/strong><\/td>\n<td>159<\/td>\n<td>31.80<\/td>\n<td>500<\/td>\n<td>100.00<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<div>\n<div>\n<p><a href=\"#_ftnref1\">[1]<\/a> ICD-9 refers to the International Classification of Diseases, Ninth Revision.<\/p>\n<\/div>\n<\/div>\n<hr \/>\n<h1>Consider an example using the Lotto 649<\/h1>\n<div>\n<p>The probability of correctly choosing the winning combination of six numbers from 1 to 49 is extremely low:<\/p>\n<p>1\/(49C6)<\/p>\n<p>which works out to 1 chance in 13,983,816 combinations.<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p>Considering the low probability of winning the grand prize (i.e. all 6 numbers the player chooses will be selected), it is expected that the Lotto 649 lottery should be strategy free. If, however, the selection process is not random, but rather follows a specific pattern, then the chance of winning will not remain constant and a strategy to predict the outcome could be developed.<\/p>\n<p>Here we will generate data for one draw. That is, using SAS code we will create a random number generator to produce a unique set of 6 numbers that simulates the data that could be generated by the Lotto 649.<\/p>\n<p>The program to generate 6 numbers at random from a set of 49 numbers is shown here. In this instance we have a few constraints. First, we need to be sure that once the first number is drawn, it is not placed back into the set of 49 to be redrawn on a subsequent step. This is because the lotto uses a strategy of <strong>sampling without replacement<\/strong> and therefore each draw selects only 6 unique numbers. 
Likewise, in presenting the output from the random number generators we need to be sure that the data are reported as discrete scores and not as decimal-based continuous scores; and finally, in filtering the numbers produced we need to be sure that the numbers range from 1 to 49 inclusive.<\/p>\n<p>Copy the following program to your SAS workspace and run the program to see which lucky lottery numbers you can produce. This program has several important features that are noted by the comments\u00a0 \/* comment *\/\u00a0 within the code.<\/p>\n<p>&nbsp;<\/p>\n<div>\n<p>\/* NOTE THE CALL STREAMINIT(13); Command<\/p>\n<p>&nbsp;<\/p>\n<p>To create reproducible random numbers, seed the system with the streaminit command. If RAND() is used without an initial streaminit the program will use the value of the system clock and the random numbers will change each time the program is run.<\/p>\n<p>*\/<\/p>\n<p>&nbsp;<\/p>\n<p>DATA LOTTO1;<\/p>\n<p>&nbsp;<\/p>\n<p>* CALL STREAMINIT(13); \/* CREATES REPRODUCIBLE NUMBERS *\/<\/p>\n<p>DO UNTIL (CHOICE1 NE 0);<\/p>\n<p>CHOICE1 = RAND(&#8220;NORMAL&#8221;)*1000000000000;<\/p>\n<p>CHOICE1 = ROUND(CHOICE1);<\/p>\n<p>CHOICE1 = 1+(MOD(CHOICE1,49));<\/p>\n<p>CHOICE1 = ABS(CHOICE1);<\/p>\n<p>END;<\/p>\n<p>* CALL STREAMINIT(999);<\/p>\n<p>DO UNTIL (CHOICE2 NE CHOICE1 AND CHOICE2 NE 0);<\/p>\n<p>CHOICE2 = RAND(&#8220;NORMAL&#8221;)*1000000000000;<\/p>\n<p>CHOICE2 = ROUND(CHOICE2);<\/p>\n<p>CHOICE2 = 1+(MOD(CHOICE2,49));<\/p>\n<p>CHOICE2 = ABS(CHOICE2);<\/p>\n<p>END;<\/p>\n<p>* CALL STREAMINIT(28);<\/p>\n<p>DO UNTIL (CHOICE3 NE CHOICE2 AND CHOICE3 NE CHOICE1 AND CHOICE3 NE 0);<\/p>\n<p>CHOICE3 = RAND(&#8220;NORMAL&#8221;)*1000000000000;<\/p>\n<p>CHOICE3 = ROUND(CHOICE3);<\/p>\n<p>CHOICE3 = 1+(MOD(CHOICE3,49));<\/p>\n<p>CHOICE3 = ABS(CHOICE3);<\/p>\n<p>END;<\/p>\n<p>* CALL STREAMINIT(218);<\/p>\n<p>DO UNTIL (CHOICE4 NE CHOICE3 AND CHOICE4 NE CHOICE2 AND CHOICE4 NE CHOICE1 AND CHOICE4 NE 0);<\/p>\n<p>CHOICE4 = 
RAND(&#8220;NORMAL&#8221;)*1000000000000;<\/p>\n<p>CHOICE4 = ROUND(CHOICE4);<\/p>\n<p>CHOICE4 = 1+(MOD(CHOICE4,49));<\/p>\n<p>CHOICE4 = ABS(CHOICE4);<\/p>\n<p>END;<\/p>\n<p>&nbsp;<\/p>\n<p>* CALL STREAMINIT(28);<\/p>\n<p>DO UNTIL (CHOICE5 NE CHOICE4 AND CHOICE5 NE CHOICE3 AND CHOICE5 NE CHOICE2 AND CHOICE5 NE CHOICE1 AND CHOICE5 NE 0);<\/p>\n<p>CHOICE5 = RAND(&#8220;NORMAL&#8221;)*1000000000000;<\/p>\n<p>CHOICE5 = ROUND(CHOICE5);<\/p>\n<p>CHOICE5 = 1+(MOD(CHOICE5,49));<\/p>\n<p>CHOICE5 = ABS(CHOICE5);<\/p>\n<p>END;<\/p>\n<p>&nbsp;<\/p>\n<p>* CALL STREAMINIT(68);<\/p>\n<p>DO UNTIL (CHOICE6 NE CHOICE5 AND CHOICE6 NE CHOICE4 AND CHOICE6 NE CHOICE3 AND CHOICE6 NE CHOICE2 AND CHOICE6 NE CHOICE1 AND CHOICE6 NE 0);<\/p>\n<p>CHOICE6 = RAND(&#8220;NORMAL&#8221;)*1000000000000;<\/p>\n<p>CHOICE6 = ROUND(CHOICE6);<\/p>\n<p>CHOICE6 = 1+(MOD(CHOICE6,49));<\/p>\n<p>CHOICE6 = ABS(CHOICE6);<\/p>\n<p>END;<\/p>\n<p>RUN;<\/p>\n<p>PROC PRINT; VAR CHOICE1 CHOICE2 CHOICE3 CHOICE4 CHOICE5 CHOICE6;<\/p>\n<p>RUN;<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<div style=\"margin: auto;\">\n<table>\n<thead>\n<tr>\n<td><strong>Obs<\/strong><\/td>\n<td><strong>CHOICE1<\/strong><\/td>\n<td><strong>CHOICE2<\/strong><\/td>\n<td><strong>CHOICE3<\/strong><\/td>\n<td><strong>CHOICE4<\/strong><\/td>\n<td><strong>CHOICE5<\/strong><\/td>\n<td><strong>CHOICE6<\/strong><\/td>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><strong>1<\/strong><\/td>\n<td>37<\/td>\n<td>32<\/td>\n<td>48<\/td>\n<td>11<\/td>\n<td>26<\/td>\n<td>30<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>&nbsp;<\/p>\n<div>\n<p><strong>So then how many combinations of six numbers are we really talking about?<\/strong><\/p>\n<\/div>\n<p>To compute the number of possible combinations of 6 numbers from the 49 numbers, we need to use the following combinatorial (or factorial) formula. 
We have 49 numbers and choose 6.\u00a0 The number 49 represents the population from which the sample of 6 numbers will be chosen.\u00a0 We write the formula for determining the combinations using the following combinatorial equation:<\/p>\n<p>\\[ \\binom{n}{k}, \\quad n = 49, \\; k = 6 \\]<\/p>\n<p>or we may wish to write the formula using a factorial format as:<\/p>\n<p>\\[ \\binom{n}{k} = \\frac{n!}{k!\\,(n-k)!} \\]<\/p>\n<p>Therefore the number of all possible combinations of 6 numbers from a set of 49 consecutive numbers is:<\/p>\n<p>\\[ \\binom{49}{6} = \\frac{49!}{6!\\,43!} = 13{,}983{,}816 \\]<\/p>\n<p>Yet you won&#8217;t be happy unless all of your numbers are chosen, so REALLY, what is the chance that all six of your numbers will be selected by the lottery machine?\u00a0 Since you only bought one ticket, your chance of winning the lottery is 1 in 13,983,816, or<\/p>\n<p>\\[ P = \\frac{1}{13{,}983{,}816} \\approx 0.0000000715 \\]<\/p>\n<p>The value 0.0000000715 represents the probability that your one set of six numbers is drawn.<\/p>\n<p>While this example is fairly straightforward, it is somewhat abstract and is not guaranteed to make you a winner. It does, however, present the basic concepts of generating a value for a variable at random from the set of all possible outcomes. Let\u2019s now turn our attention to an applied health example and see how we can use random number generators and computer simulation to create a dataset that exemplifies a real-world example.<\/p>\n<hr \/>\n<h1>An Applied Health Example using Simulated Data<\/h1>\n<\/div>\n<p>Consider, for example, that you are asked to assess the benefits of a 12-week pulmonary rehabilitation program, consisting of exercise and education, for a cohort of individuals with varying classifications of chronic obstructive pulmonary disease (COPD). 
The intake data include demographic variables such as the individual\u2019s age, sex, height, and weight; performance data such as the distance walked in 6 minutes, a physician-based rating of COPD, the program participant\u2019s self-reported smoking status, and years smoked; and physiological measures such as forced expiratory volume in 1 second and resting heart rate.<\/p>\n<p>In the following example we will generate data artificially using random number generators written with SAS code.\u00a0 In this way we can produce a simulated dataset that we can then use to observe what might happen if we were to actually conduct a research study with the same parameters and considerations.<\/p>\n<p>Using random number generators, we create a data set representing 20 individuals (a random selection of males and females). The variables used in the table, along with the variable types and the possible minimum and maximum range for each variable, are presented in Table 6.1 below.<\/p>\n<p>Table 6.1 Variables Used To Produce A Sample Of Raw Data For The COPD Clinic<\/p>\n<p>&nbsp;<\/p>\n<table>\n<tbody>\n<tr>\n<td><strong>Variable Name &amp; Variable label<\/strong><\/td>\n<td><strong>Variable Type<\/strong><\/td>\n<td><strong>Range of Values<\/strong><\/td>\n<\/tr>\n<tr>\n<td>Patient identification\u00a0 &#8212; Px id<\/td>\n<td>discrete<\/td>\n<td>1 to 20<\/td>\n<\/tr>\n<tr>\n<td>Age in years 
&#8212; <em>age<\/em><\/td>\n<td>discrete<\/td>\n<td>45 to 75<\/td>\n<\/tr>\n<tr>\n<td>Sex &#8212; <em>sex<\/em><\/td>\n<td>discrete<\/td>\n<td>m: male; f: female<\/td>\n<\/tr>\n<tr>\n<td>Height &#8212; <em>ht<\/em><\/td>\n<td>continuous<\/td>\n<td>1.5 m to 2.0 m<\/td>\n<\/tr>\n<tr>\n<td>Weight &#8212; <em>wt<\/em><\/td>\n<td>continuous<\/td>\n<td>50 kg to 150 kg<\/td>\n<\/tr>\n<tr>\n<td>Distance walked in 6 minutes &#8212; <em>walkdist<\/em><\/td>\n<td>continuous<\/td>\n<td>54 metres to 150 metres<\/td>\n<\/tr>\n<tr>\n<td>Rating of COPD severity &#8212; <em>severity<\/em><\/td>\n<td>discrete<\/td>\n<td>MI: mild; MO: moderate; S: severe<\/td>\n<\/tr>\n<tr>\n<td>Smoking status &#8212; <em>smoke<\/em><\/td>\n<td>discrete<\/td>\n<td>S: smoker; EX: ex-smoker; NON: never smoked<\/td>\n<\/tr>\n<tr>\n<td>Years as a smoker &#8212; <em>yrsmoke<\/em><\/td>\n<td>continuous<\/td>\n<td>&lt;1 to max years smoked<\/td>\n<\/tr>\n<tr>\n<td>Forced expiratory vol in 1 sec &#8212; <em>FEV1<\/em><\/td>\n<td>continuous<\/td>\n<td>1.5 L to 4.0 L<\/td>\n<\/tr>\n<tr>\n<td>Resting heart rate &#8212; <em>rhr<\/em><\/td>\n<td>continuous<\/td>\n<td>50 to 100 beats per minute<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div>6.3.1 Creating your dataset with a random number generator<\/div>\n<p>Here we will use SAS code to produce the table of random numbers for each of the variables listed above. 
Recent developments in high-speed computing and the creation of the Mersenne Twister random number generator, which is now used by SAS, have led to the creation of the RAND() function.\u00a0 As stated in the SAS Knowledge Base (SAS(R) 9.3 Functions and CALL Routines), the RAND function can generate random numbers for a distribution specified by the user.<\/p>\n<p>In the following example the random number generator was seeded with the statement:\u00a0 call streaminit(n);<\/p>\n<div>\n<p>\/* where n refers to any positive integer you wish to use *\/<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p>Here we specify that the data we generate will be drawn from the normal distribution.<\/p>\n<div>\n<p>\u2026 RAND(&#8220;normal&#8221;)\u2026<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<div>\n<p>code snippet:<\/p>\n<p>data sasrng;<\/p>\n<p>call streaminit(13);<\/p>\n<p>\/* here we use n=13 to seed the RNG *\/<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p>SAS User Notes provide an explanation of the RAND() function as follows:<\/p>\n<div>\n<p>x = RAND(&quot;NORMAL&quot;, \u03b8, \u03bb)<\/p>\n<p>where x is an observation from the normal distribution with a mean of \u03b8 and a standard deviation of \u03bb that has the following probability density function:<\/p>\n<p>\\[ f(x) = \\frac{1}{\\lambda\\sqrt{2\\pi}} \\exp\\left(-\\frac{(x-\\theta)^2}{2\\lambda^2}\\right) \\]<\/p>\n<p>Range: \\( -\\infty &lt; x &lt; \\infty \\)<\/p>\n<p>\u03b8: is the mean parameter (default: 0)<\/p>\n<p>\u03bb: is the standard deviation parameter (default: 1)<\/p>\n<p>Range: \u03bb &gt; 0.<\/p>\n<\/div>\n<p>Once we have established the parameters for random number selection we begin writing the SAS program to create random number generators as we would for any SAS program.\u00a0 Start by stating the options that you would like included in the output and then name the workspace using normal SAS code.<\/p>\n<p>&nbsp;<\/p>\n<div>\n<p>OPTIONS PAGESIZE=63 LINESIZE=90 DATE;<\/p>\n<p>DATA SASRNG;<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p>Our next statement creates an array. An array is a set of variables that generally have some commonality and that you wish to process together. 
In our example, we will start by creating an array that we name\u00a0 <strong>SCORES<\/strong>, and which has three elements or variables.<\/p>\n<div>\n<p>ARRAY SCORES SEX SEVERITY SMOKING;<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p>By naming the array, as we have here (<strong>SCORES<\/strong>), we can refer to the array <strong>SCORES<\/strong> later to reference the specific elements that are contained within. For example, since the array has three elements, <strong>SCORES<\/strong>(1) refers to the first element, the participant\u2019s <em>SEX<\/em>; <strong>SCORES<\/strong>(2) refers to the second element, the <em>SEVERITY<\/em> of the COPD condition; and <strong>SCORES<\/strong>(3) refers to the third element, the patient\u2019s <em>SMOKING<\/em> status.<\/p>\n<p>Once we create the workspace in SAS, we next use the do-loop statements to generate a data set consisting of 20 cases. The first do-loop (<strong>DO<\/strong> K=<strong>1<\/strong> TO <strong>20<\/strong>) tells SAS to execute the statements within the loop 20 times.<\/p>\n<p>The second do-loop (<strong>DO<\/strong> I=<strong>1<\/strong> TO <strong>3<\/strong>) is contained within the first loop and is designed to provide data specifically for the variables <em>SEX, SEVERITY, and SMOKING<\/em>.<\/p>\n<div>\n<p>DO K=1 TO 20;<\/p>\n<p>DO I=1 TO 3;<\/p>\n<\/div>\n<p>Figure 6.1 Functions of The Do-Loop To Generate Random Numbers For The Array SCORES<\/p>\n<p>Finally, we end the do loops with the following statement sequence.<\/p>\n<p>END;<\/p>\n<p>OUTPUT;<\/p>\n<p>END;<\/p>\n<p>&nbsp;<\/p>\n<p>The first END; statement closes the inside loop that begins with DO I=1 TO 3; the\u00a0 OUTPUT; \u00a0statement then writes the values generated for the current participant to the data set as one observation; and the second\u00a0 END; statement closes the outside loop (DO K=<strong>1<\/strong> TO <strong>20<\/strong>;).<\/p>\n<p>&nbsp;<\/p>\n<p>Figure 6.2\u00a0 Closing the do-loops and producing 
output<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>The actual statement sequence to generate a random number for each of the variables in the array is shown here as a three-step process, beginning by seeding the Random Number Generator (RNG) with CALL STREAMINIT(N); where N can be any positive integer you wish to use.\u00a0 In this first example we used the number 13 (only because 13 is MY lucky number!).<\/p>\n<div>\n<p>CALL STREAMINIT(13);<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p>The call statement initiates or seeds the random number generator.<\/p>\n<div>\n<p>OPTIONS PAGESIZE=63 LINESIZE=90 DATE;<\/p>\n<p>DATA SASRNG;<\/p>\n<p>&nbsp;<\/p>\n<p>ARRAY SCORES SEX SEVERITY SMOKING;<\/p>\n<p>DO K=1 TO 20;<\/p>\n<p>&nbsp;<\/p>\n<p>DO I=1 TO 3;<\/p>\n<p>&nbsp;<\/p>\n<p>CALL STREAMINIT(13);<\/p>\n<p>SCORES(I)=RAND(&#8220;NORMAL&#8221;)*100000000;<\/p>\n<p>SCORES(I)=ROUND(SCORES(I));<\/p>\n<p>SCORES(I)=1+ABS((MOD(SCORES(I),333)));<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p>This sequence of statements invokes the random number generator and places a value in each element of the array (i.e., the list of variables).<\/p>\n<p>After generating the random numbers for each variable in the array \u00a0<strong>SCORES<\/strong> (SEX, SEVERITY, SMOKING) we then process the number with a logic filter so that it makes sense in relation to the range of scores that we would expect to see for each given variable.<\/p>\n<p>For example, if the RNG produces a value of 75 for the variable sex, then what does that mean?<\/p>\n<p>Well, actually it is meaningless until you assign the meaning.<\/p>\n<p>We assign meaning to the values within a variable using logic statements. For each of the elements (variables) within the array we process the RNG value with logic statements that will make the data relevant to our variables.<\/p>\n<p>For example, the logic statements to convert the RNG values for sex are shown here. 
In this situation we convert the numeric variable for sex to a text variable that we call sex. Since we have text labels that extend beyond 8 characters we use the length statement with the $ to ensure that the full length of the text label is used.<\/p>\n<div>\n<p>\/* LOGIC STATEMENTS FOR THE VARIABLE: SEX *\/<\/p>\n<p>&nbsp;<\/p>\n<p>LENGTH SEX $12;<\/p>\n<p>IF SEX &gt;= 175 THEN SEX = &#8216;NOT STATED&#8217;;<\/p>\n<p>IF SEX &gt;54 AND SEX&lt;175 THEN SEX = &#8216;FEMALE&#8217;;<\/p>\n<p>IF SEX &gt;0 AND SEX&lt;55 THEN SEX = &#8216;MALE&#8217;;<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p>In SAS, the logic statements use the conventional if-then approach. That is, for every IF statement we use a corresponding THEN statement. In this way we process the RNG values to be within the range of logical outcomes for the variable that we are creating.<\/p>\n<div>\n<p>\/* LOGIC STATEMENTS TO CREATE CATEGORIES FOR THE VARIABLE: COPD SEVERITY TYPE *\/<\/p>\n<p>IF SEVERITY &gt;55\u00a0 THEN SEVERITY = 3;<\/p>\n<p>IF SEVERITY &gt;27 AND SEVERITY&lt;56 THEN SEVERITY = 2;<\/p>\n<p>IF SEVERITY &gt;4 AND SEVERITY&lt;28 THEN SEVERITY = 1;<\/p>\n<p>&nbsp;<\/p>\n<p>\/* LOGIC STATEMENTS TO CREATE CATEGORIES FOR THE VARIABLE: SMOKING STATUS *\/<\/p>\n<p>IF SMOKING &gt;55\u00a0 THEN SMOKING = 3;<\/p>\n<p>IF SMOKING &gt;27 AND SMOKING&lt;56 THEN SMOKING = 2;<\/p>\n<p>IF SMOKING &gt;4 AND SMOKING&lt;28 THEN SMOKING = 1;<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<p>Next we generate random values for the remaining seven variables that we plan to include in the analysis. We do not need to include these in the array and can simply generate the values when SAS walks through the outer do loop. 
Each independent call to the rand(\u201cnormal\u201d) function can be scaled to a new maximum score. Note, however, that SAS honors only the first CALL STREAMINIT in a DATA step, so the additional seed values shown below do not restart the random number stream.\u00a0 Notice that these statements follow the array processing statements.<\/p>\n<p>The next group of variables, labelled CONTINUOUS DISCRETE in the code comments, were age, years smoked, resting heart rate, weight, and distance walked in 6 minutes (measured in metres); all but weight are rounded to whole numbers.<\/p>\n<p>\/* CONTINUOUS DISCRETE VARIABLE AGE *\/<\/p>\n<p>CALL STREAMINIT(13); AGE=RAND(&#8220;NORMAL&#8221;)*1000000000000;<\/p>\n<p>AGE=40+ABS((MOD(AGE,62))); AGE=ROUND(AGE);<\/p>\n<p>IF AGE&gt;72 THEN AGE=35+ABS((MOD(AGE,50)));<\/p>\n<p>&nbsp;<\/p>\n<p>\/* CONTINUOUS DISCRETE VARIABLE YRSMOKE *\/<\/p>\n<p>CALL STREAMINIT(13); YRSMOKE=RAND(&#8220;NORMAL&#8221;)*1000000000000;<\/p>\n<p>YRSMOKE=1+ABS((MOD(YRSMOKE,12))); YRSMOKE=ROUND(YRSMOKE);<\/p>\n<p>&nbsp;<\/p>\n<p>\/* CONTINUOUS DISCRETE VARIABLE RHR *\/<\/p>\n<p>CALL STREAMINIT(69); RHR=RAND(&#8220;NORMAL&#8221;)*1000000000000;<\/p>\n<p>RHR=54+ABS((MOD(RHR,80))); RHR=ROUND(RHR);<\/p>\n<p>&nbsp;<\/p>\n<p>\/* CONTINUOUS DISCRETE VARIABLE WT *\/<\/p>\n<p>CALL STREAMINIT(45); WT=RAND(&#8220;NORMAL&#8221;)*1000000000000;<\/p>\n<p>WT=45+ABS((MOD(WT,65)));WT=ROUND(WT,0.01);<\/p>\n<p>IF WT&gt;85 THEN WT=55+ABS((MOD(WT,12)));<\/p>\n<p>&nbsp;<\/p>\n<p>\/* CONTINUOUS DISCRETE VARIABLE WALKDIST *\/<\/p>\n<p>CALL STREAMINIT(69);WALKDIST=RAND(&#8220;NORMAL&#8221;)*1000000000000;<\/p>\n<p>WALKDIST=54+ABS((MOD(WALKDIST,80))); WALKDIST=ROUND(WALKDIST);<\/p>\n<p>&nbsp;<\/p>\n<p>Next we create the continuous decimal variables. 
Again these statements are placed within the do loops to produce a full set of 20 outputs.<\/p>\n<p>\/* CONTINUOUS DECIMAL VARIABLE FEV1 *\/<\/p>\n<p>CALL STREAMINIT(99); FEV1=RAND(&#8220;NORMAL&#8221;)*1000000000000;<\/p>\n<p>FEV1=1+ABS((MOD(FEV1,3)));FEV1=ROUND(FEV1,0.01);<\/p>\n<p>&nbsp;<\/p>\n<p>\/* CONTINUOUS DECIMAL VARIABLE HT *\/<\/p>\n<p>CALL STREAMINIT(21); HT=RAND(&#8220;NORMAL&#8221;)*1000000000000;<\/p>\n<p>HT=1+ABS((MOD(HT,1.1)));HT=ROUND(HT,0.01);<\/p>\n<p>IF HT&lt;1.5 THEN HT=1.2+ABS((MOD(HT,1.1)));<\/p>\n","protected":false},"author":56,"menu_order":3,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-991","chapter","type-chapter","status-publish","hentry"],"part":982,"_links":{"self":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapters\/991","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/wp\/v2\/users\/56"}],"version-history":[{"count":46,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapters\/991\/revisions"}],"predecessor-version":[{"id":2192,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapters\/991\/revisions\/2192"}],"part":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/parts\/982"}],"metadata":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapters\/991\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/wp\/v2\/media?parent=991"}],"wp:term":[{"taxonomy":"chapter-type","embeddable"
:true,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/pressbooks\/v2\/chapter-type?post=991"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/wp\/v2\/contributor?post=991"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.library.upei.ca\/montelpare\/wp-json\/wp\/v2\/license?post=991"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}