So previously we talked about expected values, their properties, and a little bit about how to calculate them.

Now let's talk about the expected value operator itself. The expected value operator is in fact a linear operator, and that will greatly simplify calculating expected values for some relatively complicated things. So, suppose that a and b are not random numbers. When you think about a and b, think of a as something like five; it's just a number you can plug in. And X and Y are two random variables. Then the expected value of aX plus b works out to be a times the expected value of X, plus b, exactly what you'd hope it would work out to be. And the expected value of X plus Y works out to be the expected value of X plus the expected value of Y.

The reason it works out in these cases is that the expected value is a linear operator. It is not always the case that the expected value of g of X is equal to g of the expected value of X, where g is some general function that's not linear. This can happen with specific random variables and specific choices of g, but in general it's not the case. The most famous example where it's not the case is that the expected value of X squared is not equal to the expected value of X, quantity squared.

Now, let's talk about what the difference between these two entities is. Here X is a random variable. X squared is the random variable you obtain by squaring X. So, for example, if X is a die roll, it can take the values one, two, three, four, five, six. X squared can then take the values one, four, nine, sixteen, twenty-five, and thirty-six, and it takes those values with probability one-sixth each. So the expected value of X squared represents the expected value of the squared random variable. On the other hand, the expected value of X, quantity squared, represents what you obtain if you first calculate the expected value of X and then square the result. And these two things are not equal. For the die roll, the expected value of X squared is 91/6, about 15.17, while the expected value of X, quantity squared, is 3.5 squared, or 12.25. This is a well-known example where the expected value of g of X is not equal to g of the expected value of X, and we'll see in a couple of slides why it is a well-known example of that property.

But in general, I would like you to remember that if g is not a linear function, then you can't just commute the expected value from outside of g to the inside of g. If g is a linear function, you can always do it. If g is not linear, then in general you cannot. The expected value rules hold no matter what constitutes X and Y. X could be discrete, continuous, or mixed discrete and continuous; Y could be discrete, continuous, or mixed; and the rules still hold.

So, let me go through an example. Suppose you flip a coin, X, where, as we normally do, X is zero if it's tails and one if it's heads, and you simulate a uniform random number Y between zero and one. What's the expected value of their sum? Well, the sum of a coin flip and a uniform random variable has a weird distribution. It's not obvious, especially if all you've had is the handful of lectures from this class, how you would calculate that distribution and then, from that distribution, calculate the expected value. However, we do know how to calculate the expected value of a uniform random variable and the expected value of a coin flip, and so the expected value of their sum is the sum of their expected values. We know the expected value of the coin flip is 0.5. We know that the expected value of the uniform random variable is 0.5, so the expected value of the sum is one. So you can see how these expected value operator rules make calculating things associated with expected values a lot easier.

Another example: suppose you roll a die twice. What is the expected value of the average of the two die rolls? You often roll two dice when you're playing a board game, for example. Okay, let's let x1 be the result of the first die and x2 be the result of the second die. The variable that we're interested in, let's call it y, is equal to x1 plus x2, divided by two. Now, one way you could calculate the

expected value of y is to figure out what the distribution of the average of two die rolls is. So, let me give you a sense of this really quickly. The reason we think the distribution of a single die roll is one-sixth on each number is that if you roll a die a lot of times, about one-sixth of the die rolls are one, one-sixth are two, one-sixth are three, one-sixth are four, and so on. So we are modeling the process as if the faces are all equally likely, and that's why we model the population of die rolls as having probability one-sixth on each number.

Now, this implies a distribution on the average of two die rolls, right? The smallest value it could take is one, which is one plus one divided by two, the average if you were to get two ones. And the largest it could take is six plus six divided by two, or six, if you were to roll two sixes. But it takes different values in between, and they're not all equally likely. A one has probability 1/36, but some of the middle values have higher probabilities. So, at any rate, our variable y itself has a distribution.

And you could get a pretty good sense of it by taking two dice, rolling them, taking the average, rolling them again, taking the average, doing that over and over again, and plotting a bar plot of the frequency of the averages that you get. That would give you a good sense of what the population distribution is. Or you could work out with pen and paper what the distribution actually is. Once you have that worked out, you could use the expected value formula to calculate the expected value of y directly, by summing, over all the possible values of y, y times p of y. Another way to do it is to directly use the linear operator rule for expected values.
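The pen-and-paper route described above can be sketched in a few lines of Python. This is only an illustration, assuming fair six-sided dice; the names `pmf`, `expected_direct`, and `expected_linear` are made up for this sketch and do not come from the lecture. It enumerates the 36 equally likely ordered pairs of rolls, builds the distribution of the average, and then computes the expected value two ways.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = range(1, 7)  # assumption: a fair six-sided die

# Each of the 36 ordered pairs (x1, x2) has probability 1/36.
# Accumulate those probabilities by the value of the average y.
pmf = Counter()
for x1, x2 in product(faces, repeat=2):
    pmf[Fraction(x1 + x2, 2)] += Fraction(1, 36)

# Route 1: the definition, summing y * p(y) over all values of y.
expected_direct = sum(y * p for y, p in pmf.items())

# Route 2: linearity, one half times (E[x1] + E[x2]).
expected_die = sum(Fraction(x, 6) for x in faces)
expected_linear = Fraction(1, 2) * (expected_die + expected_die)

for y in sorted(pmf):
    print(f"P(y = {float(y)}) = {pmf[y]}")
print(expected_direct, expected_linear)
```

Exact fractions are used instead of floating point so the two routes can be compared for exact equality. The first route needs the whole distribution of y; the second needs only the mean of a single die.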
So in this case, the expected value of x1 plus x2, divided by two, is one-half times the quantity expected value of x1 plus expected value of x2, because the one-half is a non-random constant that we can just pull out, and the expected value goes across the sum to give the expected value of x1 plus the expected value of x2. That then yields 3.5 plus 3.5, divided by two, which is 3.5.

Now you might be wondering, after hearing this, "Oh, that's interesting: the expected value of the average of two die rolls is exactly the same as the expected value of an individual die roll." And that is exactly the case. But you're probably thinking, "Does this extend beyond that? Is the expected value of the average of n die rolls equal to 3.5 as well?" And the answer is yes, that's exactly true. In fact, that's a nice segue into our next slide, where we actually derive the property that we were hinting at in the previous slide: namely, that the expected value of the average of a collection of random variables from the same distribution is the same as the expected value of the individual random variables.

So let's let Xi, for i equal one to n, be a collection of random variables, each from a distribution with mean mu. I also want to point out that we tend to use Greek letters to represent population quantities; in this case, the population mean of the distribution is mu. So let's calculate the expected value of the sample average of the Xi. We want the expected value of the sample average, which is one over n times the summation from i equals one to n of the Xi. The one over n pulls out because it's not random. The expected value commutes across the sum. And the expected value of each of those Xi is itself mu. So we get the summation from i equals one to n of mu; that's mu added up n times, which is n mu, divided by the n on the outside, so we get mu. So what this says is that it doesn't matter what the distribution of the individual X's is: the distribution of the mean of the X's has the same mean as the individual X's.
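Written out symbolically, the derivation of the expected value of the sample average might look like the following. The symbol \bar{X} for the sample average and the bracket notation E[·] for expected value are shorthand assumed here; the lecture states each step in words.

```latex
\begin{align*}
E[\bar{X}]
  &= E\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right]
     && \text{definition of the sample average} \\
  &= \frac{1}{n}\, E\left[\sum_{i=1}^{n} X_i\right]
     && \text{$1/n$ is not random, so it pulls out} \\
  &= \frac{1}{n}\sum_{i=1}^{n} E[X_i]
     && \text{the expected value commutes across the sum} \\
  &= \frac{1}{n}\sum_{i=1}^{n} \mu
     && \text{each $X_i$ has mean $\mu$} \\
  &= \frac{n\mu}{n} = \mu
\end{align*}
```

The only assumption used is that every X_i has the same mean mu; the shape of the distribution never enters the argument.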

The expected value of the sample mean is the population mean that it's trying to estimate. Put another way, the population mean of the distribution of the sample mean of n observations is exactly the population mean that it's trying to estimate. And when this happens, when the expected value of an estimator is what it's trying to estimate, that's a good thing. We say that the estimator is unbiased. So sample means are unbiased estimators of population means. And again, there were some assumptions for this to be true, right? All the X's have to be from a distribution that has mean mu, mu being the value you want to estimate, and then the sample mean is an unbiased estimator of the population mean.

We're finally getting to the point where we can talk about how we're going to connect our probability modeling to the data that we observe. We're not quite there yet, but we're getting closer and closer. I want you to remember that we're throwing around the term "mean" a lot, so if you get confused, I want you to qualify the mean that we're talking about: whether it's a population quantity, a component of the probability distribution, or a sample quantity, an empirical quantity that you compute from the data. And remember, our goal in probability modeling is to connect our sample observations to the population using our probability model.
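To make the distinction between the population mean and a sample mean concrete, here is a small simulation sketch in Python. It assumes fair die rolls, and the sample size, number of repetitions, seed, and all names in it are arbitrary choices for illustration, not part of the lecture. A single sample mean is an empirical quantity that varies from sample to sample, while the average of many sample means should sit close to the population mean of 3.5, which is what unbiasedness suggests.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

POPULATION_MEAN = 3.5  # population quantity: mean of a fair die
N_ROLLS = 10           # observations per sample (arbitrary choice)
N_SAMPLES = 20_000     # number of repeated samples (arbitrary choice)


def sample_mean_of_rolls(n):
    """Roll a fair die n times and return the sample mean of the rolls."""
    return mean([random.randint(1, 6) for _ in range(n)])


# Each entry is a sample quantity computed from simulated data.
sample_means = [sample_mean_of_rolls(N_ROLLS) for _ in range(N_SAMPLES)]

one_sample_mean = sample_means[0]             # one sample's estimate
average_of_sample_means = mean(sample_means)  # approximates E[sample mean]

print("one sample mean:", one_sample_mean)
print("average of sample means:", average_of_sample_means)
print("population mean:", POPULATION_MEAN)
```

A simulation like this can only approximate the expected value of the sample mean; it illustrates the unbiasedness result without proving it.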