Histograms and bar plots Roulette is a fascinating example of a betting game using
random outcomes. In order to explore some properties of roulette spins, let's
visualize some randomly drawn numbers in the range of those in an European roulette game (0 to 36). Histograms allow the graphic representation of the distribution of variables. Let's have a look at it! Type in the following code: 1 set.seed(1) 2 drawn = sample(0:36, 100, replace = T) 3 hist(drawn, main = "Frequency of numbers drawn", 4 xlab = "Numbers drawn", breaks=37) Here we first set the seed number to 1 (see line 1). For reproducibility reasons, computer generated random numbers are generally not really random (they are in fact called pseudo-random). Exceptions exist, such as numbers generated on the website http://www.random.org (which bases the numbers on atmospheric variations). Setting the seed number to 1 (or any number really) makes sure the numbers we generate here will be the same as you will have on your screen, when using the same code and the same seed number. Basically, setting it allows the reproduction of random drawings. On line 2, we use the sample() function to generate 100 random numbers in a range of 0 to 36 (0:36). The replace argument is set to true (T), which means that the same number can be drawn several times. The hist() function (lines 3 and 4) will plot the frequency of these numbers. The hist() function takes a multitude of arguments, of which we use 4 here; main, which sets the title of the graphic, xlab, which sets the title of the horizontal axis (similarity, ylab would set the title of the vertical axis), and breaks, which forces the display of 37 breaks (corresponding to the number of possible outcomes of the random drawings). For more information about the hist() function, you can simply type ?hist() in your R console. Chapter 2 [ 19 ] As you can notice on the graph below, the frequencies are quite different between numbers, even though each number has an equal theoretical probability to be drawn on each roll. The output is provided in the figure below: A histogram of the frequency of numbers drawn Let's dwell a little upon the representation of mean values using bar plots. This will allow us to have a look at other properties of the roulette drawings. The mean values will represent the proportions of presence of characteristics of the roulette outcomes (for example, proportion of red number drawn). We will therefore build some new functions. Visualizing and Manipulating Data Using R [ 20 ] The buildDf() function will return a data frame with a number of rows that correspond to how many numbers we want to be drawn, and a number of columns that correspond to the total number of attributes we are interested in (the number drawn, its position on the wheel and several possible bets), totaling 14 columns. The matrix is first filled with zeroes, and will be populated at a later stage: 1 buildDf = function(howmany) { 2 Matrix=matrix(rep(0, howmany * 14), nrow=howmany,ncol=14) 3 DF=data.frame(Matrix) 4 names(DF)=c("number","position","isRed","isBlack", 5 "isOdd","isEven","is1to18","is19to36","is1to12", 6 "is13to24","is25to36","isCol1","isCol2","isCol3") 7 return(DF) 8 } Let's examine the code in detail: on line one, we declare the function, which we call buildDF. We tell R that it will have an argument called howmany. On line 2, we assign a matrix of howmany rows and 14 columns to an object called Matrix. The matrix is at this stage filled with zeroes. On line 3, we make a data frame called DF of the matrix, which will make some operations easier later. On lines 4 to 6, we name the columns of the data frame using names() functions. The first column will be the number drawn, the second the position on the wheel (the position for 0 will be 1, the position for 32 will be 2, and so on). The other names correspond to possible bets on the betting grid. We will describe these later when declaring the function that will fill in the matrix. On line 7, we specify that we want the function to return the data frame. On line 8, we close the function code block (using a closing bracket), which we opened on line 1 (using an opening bracket). Our next function, attributes(), will fill the data frame with numbers drawn from the roulette, their position on the roulette, their color, and other attributes (more about this below): 1 attributes = function(howmany,Seed=9999) { 2 if (Seed != 9999) set.seed(Seed) 3 DF = buildDf(howmany) 4 drawn = sample(0:36, howmany, replace = T) 5 DF$number=drawn 6 numbers = c(0, 32, 15, 19, 4, 21, 2, 25, 17, 34, 6, 27, 7 13, 36, 11, 30, 8, 23, 10, 5, 24, 16, 33, 1, 20, 14, 8 31, 9, 22, 18, 29, 7, 28, 12, 35, 3, 26)