
Definitions

Probability Experiment - A process which leads to well-defined results called outcomes.
Outcome - The result of a single trial of a probability experiment.
Sample Space - The set of all possible outcomes of a probability experiment.
Event - One or more outcomes of a probability experiment.
Classical Probability - Uses the sample space to determine the numerical probability that an event will happen. Also called theoretical probability.
Equally Likely Events - Events which have the same probability of occurring.
Complement of an Event - All the outcomes in the sample space except the given event.
Empirical Probability - Uses a frequency distribution to determine the numerical probability. An empirical probability is a relative frequency.
Subjective Probability - Uses probability values based on an educated guess or estimate. It employs opinions and inexact information.
Mutually Exclusive Events - Two events which cannot happen at the same time.
Disjoint Events - Another name for mutually exclusive events.
Independent Events - Two events are independent if the occurrence of one does not affect the probability of the other occurring.
Dependent Events - Two events are dependent if the occurrence of the first event changes the probability of the second event.
Conditional Probability - The probability of an event occurring given that another event has already occurred.
Bayes' Theorem - A formula which allows one to find the probability that an event occurred as the result of a particular previous event.
Factorial - A positive integer factorial is the product of each natural number up to and including the integer.
Permutation - An arrangement of objects in a specific order.
Combination - A selection of objects without regard to order.
Tree Diagram - A graphical device used to list all possibilities of a sequence of events in a systematic way.

Introduction to Probability

Sample Spaces

A sample space is the set of all possible outcomes. However, some sample spaces are better than others. Consider the experiment of flipping two coins. It is possible to get 0 heads, 1 head, or 2 heads, so the sample space could be {0, 1, 2}. Another way to write it is by the result of each flip: {HH, HT, TH, TT}. The second way is better because each outcome is as likely to occur as any other. When writing the sample space, it is highly desirable to have outcomes which are equally likely.

Another example is rolling two dice. The possible sums are {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}, but these sums are not equally likely. The only way to get a sum of 2 is to roll a 1 on both dice, but you can get a sum of 4 by rolling 1-3, 2-2, or 3-1. The following table illustrates a better sample space for the sum obtained when rolling two dice.
                Second Die
First Die     1    2    3    4    5    6
    1         2    3    4    5    6    7
    2         3    4    5    6    7    8
    3         4    5    6    7    8    9
    4         5    6    7    8    9   10
    5         6    7    8    9   10   11
    6         7    8    9   10   11   12

Classical Probability

The above table lends itself to describing data another way -- using a probability distribution. Let's consider the frequency distribution for the above sums.
Sum    Frequency    Relative Frequency
 2         1              1/36
 3         2              2/36
 4         3              3/36
 5         4              4/36
 6         5              5/36
 7         6              6/36
 8         5              5/36
 9         4              4/36
10         3              3/36
11         2              2/36
12         1              1/36

If just the first and last columns were written, we would have a probability distribution. The relative frequency of a frequency distribution is the probability of the event occurring. This is only true, however, when the events are equally likely.

This gives us the formula for classical probability: the probability of an event occurring is the number in the event divided by the number in the sample space. Again, this is only true when the events are equally likely. A classical probability is the relative frequency of each event in the sample space when each event is equally likely.

P(E) = n(E) / n(S)

Empirical Probability

Empirical probability is based on observation. The empirical probability of an event is the relative frequency of the event in a frequency distribution based upon observation.

P(E) = f / n

Probability Rules

There are two rules which are very important.

All probabilities are between 0 and 1 inclusive: 0 <= P(E) <= 1
The sum of all the probabilities in the sample space is 1.

There are some other rules which are also important.

The probability of an event which cannot occur is 0, and the probability of any event which is not in the sample space is zero.
The probability of an event which must occur is 1, and the probability of the sample space is 1.
The probability of an event not occurring is one minus the probability of it occurring: P(E') = 1 - P(E)
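The notes do these calculations by hand or on the TI-82. As an alternative illustration only, here is a minimal Python sketch of classical probability and the complement rule using the two-dice sample space above; the function name classical_probability is my own and not from the notes.

```python
from itertools import product
from fractions import Fraction

# Sample space for rolling two dice: 36 equally likely outcomes.
sample_space = list(product(range(1, 7), repeat=2))

def classical_probability(event):
    """P(E) = n(E) / n(S) for equally likely outcomes."""
    return Fraction(len(event), len(sample_space))

sum_is_7 = [roll for roll in sample_space if sum(roll) == 7]
p7 = classical_probability(sum_is_7)
print(p7)        # 1/6 -- six of the 36 outcomes sum to 7

# Complement rule: P(E') = 1 - P(E)
print(1 - p7)    # 5/6 -- the probability the sum is not 7
```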

"OR" or Unions

Mutually Exclusive Events

Two events are mutually exclusive if they cannot occur at the same time. Another word that means mutually exclusive is disjoint. If two events are disjoint, then the probability of them both occurring at the same time is 0.

Disjoint: P(A and B) = 0

If two events are mutually exclusive, then the probability of either occurring is the sum of the probabilities of each occurring.

Specific Addition Rule

Only valid when the events are mutually exclusive.

P(A or B) = P(A) + P(B)

Example 1: Given P(A) = 0.20, P(B) = 0.70, A and B are disjoint.

I like to use what's called a joint probability distribution. (Since disjoint means nothing in common, joint is what they have in common -- so the values that go on the inside portion of the table are the intersections or "and"s of each pair of events.) "Marginal" is another word for totals -- it's called marginal because the totals appear in the margins.

            B       B'      Marginal
A          0.00    0.20      0.20
A'         0.70    0.10      0.80
Marginal   0.70    0.30      1.00

The values given in the problem are the two marginals, 0.20 and 0.70, and the intersection, 0.00. The grand total is always 1.00. The rest of the values are obtained by addition and subtraction.

Non-Mutually Exclusive Events

In events which aren't mutually exclusive, there is some overlap. When P(A) and P(B) are added, the probability of the intersection (and) is added twice. To compensate for that double addition, the intersection needs to be subtracted.

General Addition Rule

Always valid.

P(A or B) = P(A) + P(B) - P(A and B)

Example 2: Given P(A) = 0.20, P(B) = 0.70, P(A and B) = 0.15

            B       B'      Marginal
A          0.15    0.05      0.20
A'         0.55    0.25      0.80
Marginal   0.70    0.30      1.00

Interpreting the table

Certain things can be determined from the joint probability distribution. Mutually exclusive events will have a probability of zero in the intersection (the A-and-B cell). All inclusive events will have a zero opposite the intersection (the A'-and-B' cell). All inclusive means that there is nothing outside of those two events: P(A or B) = 1.

"AND" or Intersections

Independent Events

Two events are independent if the occurrence of one does not change the probability of the other occurring. An example would be rolling a 2 on a die and flipping a head on a coin: rolling the 2 does not affect the probability of flipping the head. If events are independent, then the probability of them both occurring is the product of the probabilities of each occurring.

Specific Multiplication Rule

Only valid for independent events.

P(A and B) = P(A) * P(B)

Example 3: P(A) = 0.20, P(B) = 0.70, A and B are independent.

            B       B'      Marginal
A          0.14    0.06      0.20
A'         0.56    0.24      0.80
Marginal   0.70    0.30      1.00

The 0.14 is because the probability of A and B is the probability of A times the probability of B: 0.20 * 0.70 = 0.14.

Dependent Events

If the occurrence of one event does affect the probability of the other occurring, then the events are dependent.

Conditional Probability

The probability of event B occurring given that event A has already occurred is read "the probability of B given A" and is written P(B|A).

General Multiplication Rule

Always works.

P(A and B) = P(A) * P(B|A)

Example 4: P(A) = 0.20, P(B) = 0.70, P(B|A) = 0.40

A good way to think of P(B|A) is that 40% of A is B. 40% of the 20% which was in event A is 8%, thus the intersection is 0.08.

            B       B'      Marginal
A          0.08    0.12      0.20
A'         0.62    0.18      0.80
Marginal   0.70    0.30      1.00

Independence Revisited

The following four statements are equivalent:

1. A and B are independent events
2. P(A and B) = P(A) * P(B)
3. P(A|B) = P(A)
4. P(B|A) = P(B)

The last two are because, if two events are independent, the occurrence of one doesn't change the probability of the occurrence of the other. This means that the probability of B occurring, whether A has happened or not, is simply the probability of B occurring.
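A small Python sketch, offered only as an illustration (the notes work these by table), builds the Example 2 joint probability table and checks the general addition rule and independence; the function name joint_table is my own.

```python
def joint_table(p_a, p_b, p_a_and_b):
    """Build a 2x2 joint probability table from P(A), P(B), and P(A and B)."""
    return {
        ("A", "B"): p_a_and_b,
        ("A", "B'"): p_a - p_a_and_b,
        ("A'", "B"): p_b - p_a_and_b,
        ("A'", "B'"): 1 - p_a - p_b + p_a_and_b,
    }

t = joint_table(p_a=0.20, p_b=0.70, p_a_and_b=0.15)   # Example 2
print(round(0.20 + 0.70 - 0.15, 2))                   # 0.85 = P(A or B), general addition rule
print(round(sum(t.values()), 2))                      # 1.0 -- the grand total is always 1.00

# Independence check: P(A and B) equals P(A) * P(B) only for independent events.
print(abs(t[("A", "B")] - 0.20 * 0.70) < 1e-9)        # False here, since 0.15 != 0.14
```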

Conditional Probability

Recall that the probability of an event occurring given that another event has already occurred is called a conditional probability. The probability that event B occurs, given that event A has already occurred, is

P(B|A) = P(A and B) / P(A)

This formula comes from the general multiplication principle and a little bit of algebra. Since we are given that event A has occurred, we have a reduced sample space: instead of the entire sample space S, we now have a sample space of A, since we know A has occurred. So the old rule about the probability being the number in the event divided by the number in the sample space still applies; it is the number in A and B (which must be in A, since A has occurred) divided by the number in A. If you then divide the numerator and denominator of the right hand side by the number in the sample space S, you have the probability of A and B divided by the probability of A.

Examples

Example 1: The question "Do you smoke?" was asked of 100 people. Results are shown in the table.

         Male    Female    Total
Yes        19       12       31
No         41       28       69
Total      60       40      100

What is the probability of a randomly selected individual being a male? This is the total for male divided by the total: 60/100 = 0.60. Since no mention is made of smoking or not smoking, it includes all the cases; this is a marginal probability.

What is the probability of a randomly selected individual smoking? Again, since no mention is made of gender, this is a marginal probability: the total who smoke divided by the total, 31/100 = 0.31.

What is the probability of a randomly selected individual being a male who smokes? This is just a joint probability: the number of "Male and Smoke" divided by the total, 19/100 = 0.19.

What is the probability of a randomly selected male smoking? This time, you're told that you have a male -- think of stratified sampling. What is the probability that the male smokes? Well, 19 males smoke out of 60 males, so 19/60 = 0.3167 (approx).

What is the probability that a randomly selected smoker is male? This time, you're told that you have a smoker and asked to find the probability that the smoker is also male. There are 19 male smokers out of 31 total smokers, so 19/31 = 0.6129 (approx).

After that last part, you have just worked a Bayes' Theorem problem. I know you didn't realize it -- that's the beauty of it. A Bayes' problem can be set up so it appears to be just another conditional probability. In this class we will treat Bayes' problems as another conditional probability and not involve the large messy formula given in the text (and every other text).

Example 2: There are three major manufacturing companies that make a product: Aberations, Brochmailians, and Chompieliens. Aberations has a 50% market share, and Brochmailians has a 30% market share. 5% of Aberations' product is defective, 7% of Brochmailians' product is defective, and 10% of Chompieliens' product is defective. This information can be placed into a joint probability distribution.

Company         Good                     Defective             Total
Aberations      0.50 - 0.025 = 0.475     0.05(0.50) = 0.025    0.50
Brochmailians   0.30 - 0.021 = 0.279     0.07(0.30) = 0.021    0.30
Chompieliens    0.20 - 0.020 = 0.180     0.10(0.20) = 0.020    0.20
Total           0.934                    0.066                 1.00
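As an aside (not part of the notes), the five answers for the smoking table can be checked with a few lines of Python; the variable names are my own.

```python
from fractions import Fraction

# Counts from the "Do you smoke?" table.
counts = {("Yes", "Male"): 19, ("Yes", "Female"): 12,
          ("No", "Male"): 41, ("No", "Female"): 28}
total = sum(counts.values())                                              # 100
male = sum(v for (smoke, sex), v in counts.items() if sex == "Male")      # 60
smoker = sum(v for (smoke, sex), v in counts.items() if smoke == "Yes")   # 31
male_smoker = counts[("Yes", "Male")]                                     # 19

print(Fraction(male, total))           # 3/5     marginal P(Male)
print(Fraction(smoker, total))         # 31/100  marginal P(Smoke)
print(Fraction(male_smoker, total))    # 19/100  joint P(Male and Smoke)
print(Fraction(male_smoker, male))     # 19/60   conditional P(Smoke | Male)
print(Fraction(male_smoker, smoker))   # 19/31   conditional P(Male | Smoke)
```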

The percent of the market share for Chompieliens wasn't given, but since the marginals must add to 1.00, they have a 20% market share.

Notice that the 5%, 7%, and 10% defective rates don't go into the table directly. This is because they are conditional probabilities and the table is a joint probability table. These defective probabilities are conditional upon which company was given. That is, the 7% is not P(Defective), but P(Defective|Brochmailians). The joint probability P(Defective and Brochmailians) = P(Defective|Brochmailians) * P(Brochmailians) = 0.07(0.30) = 0.021.

The "good" probabilities can be found by subtraction as shown above, or by multiplication using conditional probabilities: if 7% of Brochmailians' product is defective, then 93% is good, and 0.93(0.30) = 0.279.

Are these events independent? No. If they were, then P(Aberations and Defective) = 0.025 would have to be P(Aberations) * P(Defective) = 0.50 * 0.066 = 0.033, but it doesn't. Also, if they were independent, then P(Brochmailians|Defective) = 0.318 would have to equal P(Brochmailians) = 0.30, but it doesn't.

What is the probability a randomly selected product is defective? P(Defective) = 0.066.

What is the probability that a defective product came from Brochmailians? P(Brochmailians|Defective) = P(Brochmailians and Defective) / P(Defective) = 0.021/0.066 = 7/22 = 0.318 (approx).

The second question asked above is a Bayes' problem. Again, my point is, you don't have to know Bayes' formula just to work a Bayes' problem.
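The "joint divided by marginal" approach used here is easy to mirror in code. A minimal sketch, purely illustrative and using my own variable names, with the market shares and defect rates from Example 2:

```python
# Market shares (the "given" marginals) and conditional defect rates.
share = {"Aberations": 0.50, "Brochmailians": 0.30, "Chompieliens": 0.20}
defect_rate = {"Aberations": 0.05, "Brochmailians": 0.07, "Chompieliens": 0.10}

# Joint probabilities: P(company and Defective) = P(Defective | company) * P(company)
joint = {c: defect_rate[c] * share[c] for c in share}
p_defective = sum(joint.values())                      # marginal P(Defective)

print(round(p_defective, 3))                           # 0.066
print(round(joint["Brochmailians"] / p_defective, 3))  # 0.318 = P(Brochmailians | Defective)
```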

Bayes' Theorem

However, just for the sake of argument, let's say that you want to know what Bayes' formula is. Let's use the same example, but shorten each event to its one-letter initial: A, B, C, and D instead of Aberations, Brochmailians, Chompieliens, and Defective.

P(D|B) is not a Bayes' problem; this is given in the problem. Bayes' formula finds the reverse conditional probability, P(B|D). It is based on the fact that the given event (D) is made of three parts: the part of D in A, the part of D in B, and the part of D in C.

P(B|D) = P(B and D) / [ P(A and D) + P(B and D) + P(C and D) ]

Inserting the multiplication rule for each of these joint probabilities gives

P(B|D) = P(D|B)*P(B) / [ P(D|A)*P(A) + P(D|B)*P(B) + P(D|C)*P(C) ]

However, and I hope you agree, it is much easier to take the joint probability divided by the marginal probability. The table does the adding for you and makes the problems doable without having to memorize the formulas.

Counting Techniques

Fundamental Theorems

Every branch of mathematics has its fundamental theorem or theorems.

Fundamental Theorem of Arithmetic - Every integer greater than one is either prime or can be expressed as a unique product of prime numbers.
Fundamental Theorem of Algebra - Every polynomial function of one variable of degree n > 0 has at least one real or complex zero.
Fundamental Theorem of Linear Programming - If there is a solution to a linear programming problem, then it will occur at a corner point or on a boundary between two or more corner points.

Fundamental Counting Principle

In a sequence of events, the total possible number of ways all events can be performed is the product of the possible number of ways each individual event can be performed.

Factorials

If n is a positive integer, then n! = n (n-1) (n-2) ... (3)(2)(1), and n! = n (n-1)!. A special case is 0! = 1.

Permutations

A permutation is an arrangement of objects without repetition where order is important. Another definition of permutation is the number of such arrangements that can be formed.

Permutations using all the objects

A permutation of n objects, arranged into one group of size n, without repetition, and order being important, is:

nPn = P(n,n) = n!

Example: Find all permutations of the letters "ABC": ABC ACB BAC BCA CAB CBA

Permutations of some of the objects

A permutation of n objects, arranged in groups of size r, without repetition, and order being important, is:

nPr = P(n,r) = n! / (n-r)!

The calculator can be used to find the number of such permutations. On the TI-82 or TI-83, the permutation key is found under the Math, Probability menu.

Example: Find all two-letter permutations of the letters "ABC": AB AC BA BC CA CB

Shortcut formula for finding a permutation

Assuming that you start at n and count down to 1 in your factorials:

P(n,r) = first r factors of n!

Distinguishable Permutations

Sometimes letters are repeated and not all of the permutations are distinguishable from each other.

Example: Find all permutations of the letters "BOB". To help you distinguish, I'll write the second "B" as "b": BOb BbO OBb ObB bBO bOB. If you then write "b" as "B", you get BOB BBO OBB OBB BBO BOB, so there are really only three distinguishable permutations here: BOB BBO OBB.

If a word has N letters, k of which are unique, and you let n1, n2, n3, ..., nk be the frequency of each of the k unique letters, then the total number of distinguishable permutations is given by:

N! / ( n1! * n2! * ... * nk! )

Consider the word "STATISTICS". Here are the frequencies of each letter: S=3, T=3, A=1, I=2, C=1, and there are 10 letters total.

Permutations = 10! / (3! 3! 1! 2! 1!) = 3628800 / 72 = 50400

You can find distinguishable permutations using the TI-82.
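If you prefer a check outside the calculator, here is a short Python sketch (not part of the notes) that reproduces the ABC and STATISTICS counts; math.perm and math.factorial are standard-library functions, and the rest of the names are my own.

```python
import math
from itertools import permutations
from collections import Counter

# P(n, r): ordered arrangements without repetition.
print(math.perm(3, 3))   # 6 -> ABC, ACB, BAC, BCA, CAB, CBA
print(math.perm(3, 2))   # 6 -> AB, AC, BA, BC, CA, CB

# Distinguishable permutations of "STATISTICS": N! / (n1! n2! ... nk!)
word = "STATISTICS"
freq = Counter(word)                       # S:3, T:3, A:1, I:2, C:1
count = math.factorial(len(word))
for f in freq.values():
    count //= math.factorial(f)
print(count)                               # 50400

# Brute-force check (enumerates all 10! orderings, so it takes a moment).
print(len(set(permutations(word))))        # 50400
```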

Combinations

A combination is an arrangement of objects without repetition where order is not important. Note: the difference between a permutation and a combination is not whether there is repetition or not -- there must not be repetition with either. The only difference in the definitions is whether order is important.

A combination of n objects, arranged in groups of size r, without repetition, and order not being important, is:

nCr = C(n,r) = n! / ( (n-r)! * r! )

Another way to write a combination of n things, r at a time, is using the binomial notation, "n choose r".

Example: Find all two-letter combinations of the letters "ABC": AB = BA, AC = CA, BC = CB. There are only three two-letter combinations.

Shortcut formula for finding a combination

Assuming that you start at n and count down to 1 in your factorials:

C(n,r) = first r factors of n! divided by the last r factors of n!

Pascal's Triangle

Combinations are used in the binomial expansion theorem from algebra to give the coefficients of the expansion (a+b)^n. They also form a pattern known as Pascal's Triangle.

                  1   1
                1   2   1
              1   3   3   1
            1   4   6   4   1
          1   5  10  10   5   1
        1   6  15  20  15   6   1
      1   7  21  35  35  21   7   1

Each element in the table is the sum of the two elements directly above it. Each element is also a combination, C(n,r): the n value is the number of the row (start counting at zero) and the r value is the position in the row (start counting at zero). That would make the 20 in the next-to-last row C(6,3) -- it's in row #6 (the 7th row) and position #3 (the 4th element).

Symmetry

Pascal's Triangle illustrates the symmetric nature of a combination: C(n,r) = C(n,n-r).

Example: C(10,4) = C(10,6), or C(100,99) = C(100,1)

Shortcut formula for finding a combination

Since combinations are symmetric, if n-r is smaller than r, then switch the combination to its alternative form and then use the shortcut given above.

TI-82

You can use the TI-82 graphing calculator to find factorials, permutations, and combinations.

Tree Diagrams

Tree diagrams are a graphical way of listing all the possible outcomes. The outcomes are listed in an orderly fashion, so listing all of the possible outcomes is easier than just trying to make sure that you have them all. It is called a tree diagram because of the way it looks.

The first event appears on the left, and then each sequential event is represented as branches off of the first event. A tree diagram for flipping two coins would show the possible ways of flipping the two coins; the final outcomes are obtained by following each branch to its conclusion. They are, from top to bottom: HH HT TH TT.

Probability Distributions

Definitions

Random Variable - Variable whose values are determined by chance.
Probability Distribution - The values a random variable can assume and the corresponding probabilities of each.
Expected Value - The theoretical mean of the variable.
Binomial Experiment - An experiment with a fixed number of independent trials. Each trial can only have two outcomes, or outcomes which can be reduced to two outcomes. The probability of each outcome must remain constant from trial to trial.
Binomial Distribution - The outcomes of a binomial experiment with their corresponding probabilities.

Multinomial Distribution - A probability distribution resulting from an experiment with a fixed number of independent trials. Each trial has two or more mutually exclusive outcomes. The probability of each outcome must remain constant from trial to trial.
Poisson Distribution - A probability distribution used when a density of items is distributed over a period of time. The sample size needs to be large and the probability of success small.
Hypergeometric Distribution - A probability distribution of a variable with two outcomes when sampling is done without replacement.

Probability Distributions

Probability Functions

A probability function is a function which assigns probabilities to the values of a random variable.

All the probabilities must be between 0 and 1 inclusive.
The sum of the probabilities of the outcomes must be 1.

If these two conditions aren't met, then the function isn't a probability function. There is no requirement that the values of the random variable only be between 0 and 1, only that the probabilities be between 0 and 1.

Probability Distributions

A listing of all the values the random variable can assume, with their corresponding probabilities, makes a probability distribution.

A note about random variables: a random variable does not mean that the values can be anything (a random number). Random variables have a well-defined set of outcomes and well-defined probabilities for the occurrence of each outcome.

The "random" refers to the fact that the outcomes happen by chance -- that is, you don't know which outcome will occur next.

Here's an example probability distribution that results from the rolling of a single fair die.

x       1     2     3     4     5     6     sum
p(x)   1/6   1/6   1/6   1/6   1/6   1/6   6/6 = 1

Mean, Variance, and Standard Deviation

Consider the following. The definitions for the population mean and variance used with an ungrouped frequency distribution were:

mu = sum [ x * f ] / N          sigma^2 = sum [ (x - mu)^2 * f ] / N

Some of you might be confused by only dividing by N. Recall that this is the population variance; the sample variance, which was the unbiased estimator for the population variance, was the one divided by n-1.

Using algebra, this is equivalent to:

mu = sum [ x * (f/N) ]          sigma^2 = sum [ x^2 * (f/N) ] - mu^2

Recall that a probability is a long-term relative frequency, so every f/N can be replaced by p(x). This simplifies to:

mu = sum [ x * p(x) ]           sigma^2 = sum [ x^2 * p(x) ] - mu^2

What's even better is that the last portion of the variance is the mean squared. So the two formulas that we will be using are:

mu = sum [ x * p(x) ]
sigma^2 = sum [ x^2 * p(x) ] - mu^2
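Before the worked table below, here is a quick numeric check of these two formulas for the fair-die distribution. This Python sketch is only an illustration (the notes use the TI-82 or the PDIST program for this), and the variable names are mine.

```python
from fractions import Fraction
import math

# Probability distribution for one fair die.
dist = {x: Fraction(1, 6) for x in range(1, 7)}

mu = sum(x * p for x, p in dist.items())                 # sum[ x * p(x) ]
var = sum(x * x * p for x, p in dist.items()) - mu**2    # sum[ x^2 * p(x) ] - mu^2

print(mu)              # 7/2
print(var)             # 35/12
print(math.sqrt(var))  # 1.7078...
```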

Here's the example we were working on earlier.

x       p(x)      x * p(x)      x^2 * p(x)
1       1/6         1/6             1/6
2       1/6         2/6             4/6
3       1/6         3/6             9/6
4       1/6         4/6            16/6
5       1/6         5/6            25/6
6       1/6         6/6            36/6
sum    6/6 = 1    21/6 = 3.5     91/6 = 15.1667

The mean is 7/2 or 3.5. The variance is 91/6 - (7/2)^2 = 35/12 = 2.916666... The standard deviation is the square root of the variance, 1.7078. Do not use rounded off values in the intermediate calculations; only round off the final answer.

You can learn how to find the mean and variance of a probability distribution using lists with the TI-82 or using the program called PDIST.

Binomial Probabilities

Binomial Experiment

A binomial experiment is an experiment which satisfies these four conditions:

 A fixed number of trials
 Each trial is independent of the others
 There are only two outcomes
 The probability of each outcome remains constant from trial to trial

These can be summarized as: an experiment with a fixed number of independent trials, each of which can only have two possible outcomes. The fact that each trial is independent actually means that the probabilities remain constant.

Examples of binomial experiments:

 Tossing a coin 20 times to see how many tails occur.
 Asking 200 people if they watch ABC news.
 Rolling a die to see if a 5 appears.
 Asking 500 die-hard Republicans if they would vote for the Democratic candidate. (Just because something is unlikely doesn't mean that it isn't binomial. The conditions are met -- there's a fixed number [500], the trials are independent [what one person does doesn't affect the next person], and there are only two outcomes [yes or no].)

Examples which aren't binomial experiments:

 Rolling a die until a 6 appears (not a fixed number of trials)
 Asking 20 people how old they are (not two outcomes)
 Drawing 5 cards from a deck for a poker hand (done without replacement, so not independent)

Binomial Probability Function

Example: What is the probability of rolling exactly two sixes in 6 rolls of a die?

There are five things you need to do to work a binomial story problem.

1. Define Success first. Success must be for a single trial. Success = "Rolling a 6 on a single die"
2. Define the probability of success (p): p = 1/6

3. Find the probability of failure (q): q = 5/6
4. Define the number of trials: n = 6
5. Define the number of successes out of those trials: x = 2

Anytime a six appears, it is a success (denoted S), and anytime something else appears, it is a failure (denoted F). The ways you can get exactly 2 successes in 6 trials are given below. Because the trials are independent, the probability of the event (all six dice) is the product of the probabilities of each outcome (each die), so every one of these arrangements has exactly the same probability, (1/6)^2 * (5/6)^4:

FFFFSS  FFFSFS  FFFSSF  FFSFFS  FFSFSF
FFSSFF  FSFFFS  FSFFSF  FSFSFF  FSSFFF
SFFFFS  SFFFSF  SFFSFF  SFSFFF  SSFFFF

Note that the 1/6 is the probability of success and you needed 2 successes, while the 5/6 is the probability of failure, and if 2 of the 6 trials were successes, then 4 of the 6 must be failures. The 2 is the value of x and the 4 is the value of n-x.

Further note that there are fifteen ways this can occur. This is the number of ways 2 successes can occur in 6 trials without repetition and with order not being important -- a combination of 6 things, 2 at a time.

The probability of getting exactly x successes in n trials, with the probability of success on a single trial being p, is:

P(X=x) = nCx * p^x * q^(n-x)

Example: A coin is tossed 10 times. What is the probability that exactly 6 heads will occur?

1. Success = "A head is flipped on a single coin"

Variance. Example: Find the mean. However.205078125 Mean. and Standard Deviation The mean.015625 * 0. p = 0.5^6 * 0. and standard deviation of a binomial distribution are extremely easy to find.2. there are still a fixed number of independent trials. .5 n = 10 x=6 P(x=6) = 10C6 * 0. 5. 3.5 q = 0. p = 1/6. 4.5^4 = 210 * 0. The standard deviation is the square root of the variance = 2. Another way to remember the variance is mu-q (since the np is mu). q = 5/6. variance. The variance is 30 * (1/6) * (5/6) = 25/6. and the probability of each outcome must remain constant from trial to trial.041241452 (approx) Other Discrete Distributions Multinomial Probabilities A multinomial experiment is an extended binomial probability. The difference is that in a multinomial experiment. Success = "a six is rolled on a single die". The mean is 30 * (1/6) = 5. variance.0625 = 0. there are more than two possible outcomes. and standard deviation for the number of sixes that appear when rolling 30 dice.

Other Discrete Distributions

Multinomial Probabilities

A multinomial experiment is an extended binomial experiment. The difference is that in a multinomial experiment there are more than two possible outcomes. However, there are still a fixed number of independent trials, and the probability of each outcome must remain constant from trial to trial. Instead of using a combination, as in the case of the binomial probability, the number of ways the outcomes can occur is found using distinguishable permutations.

An example here will be much more useful than a formula. The probability that a person will pass a College Algebra class is 0.55, the probability that a person will withdraw before the class is completed is 0.40, and the probability that a person will fail the class is 0.05. Find the probability that in a class of 30 students, exactly 16 pass, 12 withdraw, and 2 fail.

Outcome      x     p(outcome)
Pass        16        0.55
Withdraw    12        0.40
Fail         2        0.05
Total       30        1.00

The probability is found using this formula:

P = [ 30! / (16! 12! 2!) ] * 0.55^16 * 0.40^12 * 0.05^2

You can do this on the TI-82. The multinomial experiment will be used later when we talk about the chi-square goodness of fit test.
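If you don't have the calculator handy, the same number can be produced with a short Python sketch (illustrative only; the function name multinomial_probability is my own):

```python
import math

def multinomial_probability(counts, probs):
    """Distinguishable permutations times the probability of one specific ordering."""
    n = sum(counts)
    ways = math.factorial(n) // math.prod(math.factorial(c) for c in counts)
    return ways * math.prod(p**c for p, c in zip(probs, counts))

# 16 pass, 12 withdraw, 2 fail out of 30 students.
print(multinomial_probability([16, 12, 2], [0.55, 0.40, 0.05]))   # about 0.039
```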

Poisson Probabilities

Named after the French mathematician Simeon Poisson, Poisson probabilities are useful when there are a large number of independent trials with a small probability of success on a single trial and the variable occurs over a period of time. It can also be used when a density of items is distributed over a given area or volume. Lambda in the formula is the mean number of occurrences. If you're approximating a binomial probability using the Poisson, then lambda is the same as mu, or n * p.

P(x) = e^(-lambda) * lambda^x / x!

Example: If there are 500 customers per eight-hour day in a check-out lane, what is the probability that there will be exactly 3 in line during any five-minute period?

The expected value during any one five-minute period would be 500 / 96 = 5.2083333. The 96 is because there are 96 five-minute periods in eight hours. So, you expect about 5.2 customers in 5 minutes and want to know the probability of getting exactly 3.

P(3; 500/96) = e^(-500/96) * (500/96)^3 / 3! = 0.1288 (approx)

Hypergeometric Probabilities

Hypergeometric experiments occur when the trials are not independent of each other and occur due to sampling without replacement -- as in a five-card poker hand. Hypergeometric probabilities involve the multiplication of two combinations together and then division by the total number of combinations.

Example: What is the probability of selecting exactly 3 men and 4 women when 7 people are chosen from a group of 7 men and 10 women?

The answer is C(7,3) * C(10,4) / C(17,7) = 7350 / 19448 = 0.3779 (approx).

Note that the sums of the numbers used in the numerator (7 + 10 = 17 people in the group, 3 + 4 = 7 people selected) are the numbers used in the combination in the denominator.

This can be extended to more than two groups, and is then called an extended hypergeometric problem. You can use the TI-82 to find hypergeometric probabilities.

Normal Distribution

Definitions

Central Limit Theorem - Theorem which states that, as the sample size increases, the sampling distribution of the sample means will become approximately normally distributed.
Correction for Continuity - A correction applied to convert a discrete distribution to a continuous distribution.
Finite Population Correction Factor - A correction applied to the standard error of the means when the sample size is more than 5% of the population size and the sampling is done without replacement.
Sampling Distribution of the Sample Means - Distribution obtained by using the means computed from random samples of a specific size.
Sampling Error - Difference which occurs between the sample statistic and the population parameter due to the fact that the sample isn't a perfect representation of the population.
Standard Error of the Mean - The standard deviation of the sampling distribution of the sample means. It is equal to the standard deviation of the population divided by the square root of the sample size.

Standard Normal Distribution - A normal distribution in which the mean is 0 and the standard deviation is 1. It is denoted by z.
Z-score - Also known as z-value. A standardized score in which the mean is zero and the standard deviation is 1. The z-score is used to represent the standard normal distribution.

Normal Distributions

Any Normal Distribution

 Bell-shaped
 Symmetric about the mean
 Continuous
 Never touches the x-axis
 Total area under the curve is 1.00
 Approximately 68% lies within 1 standard deviation of the mean, 95% within 2 standard deviations, and 99.7% within 3 standard deviations of the mean. This is the Empirical Rule mentioned earlier.
 Data values represented by x, which has mean mu and standard deviation sigma.

 Probability Function given by f(x) = 1 / ( sigma * sqrt(2 pi) ) * e^( -(x - mu)^2 / (2 sigma^2) )

Standard Normal Distribution

 Mean is zero
 Variance is one
 Standard deviation is one
 Data values represented by z
 Probability Function given by f(z) = 1 / sqrt(2 pi) * e^( -z^2 / 2 )

Non-Standard Normal

Mean is not 0 or variance is not 1.

Normal Probabilities

Comprehension of this table is vital to success in the course! There is a table which must be used to look up standard normal probabilities. The z-score is broken into two parts: the whole number and tenths are looked up along the left side, and the hundredths are looked up across the top. The value in the intersection of the row and column is the area under the curve between zero and the z-score looked up. Because of the symmetry of the normal distribution, look up the absolute value of any z-score.

(Note: the table reproduced here has not been verified against the book; please use the table out of your textbook.)

Computing Normal Probabilities

There are several different situations that can arise when asked to find normal probabilities.

Situation                                             Instructions
Between zero and any number                           Look up the area in the table
Between two positives, or between two negatives       Look up both areas in the table and subtract the smaller from the larger
Between a negative and a positive                     Look up both areas in the table and add them together
Less than a negative, or greater than a positive      Look up the area in the table and subtract from 0.5000
Greater than a negative, or less than a positive      Look up the area in the table and add to 0.5000

This can be shortened into two rules.

1. If there are two z-scores, look up both in the table. If there is only one z-score given, use 0.5000 for the second area and use the inequality to determine the second sign (< is negative, and > is positive).
2. If the two numbers are the same sign, then subtract; if they are different signs, then add.

Finding z-scores from probabilities

This is more difficult, and requires you to use the table inversely. You must look up the area between zero and the value on the inside part of the table, and then read the z-score from the outside. Finally, decide if the z-score should be positive or negative, based on whether it was on the left side or the right side of the mean. Remember, z-scores can be negative, but areas or probabilities cannot be.

Situation                                                       Instructions
Area between 0 and a value                                      Look up the area in the table; make negative if on the left side
Area in one tail                                                Subtract the area from 0.5000; look up the difference in the table; make negative if in the left tail
Area including one complete half                                Subtract 0.5000 from the area; look up the difference in the table; make negative if on the left side
  (less than a positive or greater than a negative)
Within z units of the mean                                      Divide the area by 2; look up the quotient in the table; use both the positive and negative z-scores
Two tails with equal area (more than z units from the mean)     Subtract the area from 1.000; divide the area by 2; look up the quotient in the table; use both the positive and negative z-scores

Using the table becomes easier with practice -- work lots of the normal probability problems!

Standard Normal Probabilities
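As a supplement to the printed table, the same areas can be computed directly; the sketch below (not part of the notes) uses Python's math.erf, and the helper names phi and area_zero_to are my own.

```python
import math

def phi(z):
    """Cumulative area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def area_zero_to(z):
    """The quantity the table lists: P(0 < Z < z)."""
    return phi(abs(z)) - 0.5

print(round(area_zero_to(1.00), 4))         # 0.3413
print(round(area_zero_to(1.96), 4))         # 0.4750

# "Less than a negative": look up the area and subtract from 0.5000.
print(round(0.5 - area_zero_to(1.28), 4))   # P(Z < -1.28) is about 0.1003
```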

[Standard normal table omitted here: it lists the areas P(0 < Z < z) for z = 0.00 to 3.09 in steps of 0.01, with the whole number and tenths down the left side and the hundredths across the top. Use the table in your textbook.]

The values in the table are the areas between zero and the z-score, that is, P(0 < Z < z-score).

Central Limit Theorem

Sampling Distribution of the Sample Means

Instead of working with individual scores, statisticians often work with means. What happens is that several samples are taken, the mean is computed for each sample, and then the means are used as the data, rather than individual scores. The resulting collection is a sampling distribution of the sample means.

Examples

Example 1: Sampling Distribution of Values (x)

Consider the case where a single fair die is rolled. Here are the values that are possible and their probabilities.

Value         1     2     3     4     5     6
Probability  1/6   1/6   1/6   1/6   1/6   1/6

Here are the mean, variance, and standard deviation of this probability distribution.

Mean: mu = sum [ x * p(x) ] = 3.5
Variance: sigma^2 = sum [ x^2 * p(x) ] - mu^2 = 35/12
Standard deviation: sigma = sqrt( variance ) = sqrt( 35/12 )

Example 2: Sampling Distribution of Sample Means (x-bar)

Consider the case where two fair dice are rolled instead of one. Here are the sums that are possible and their probabilities.

Sum    2     3     4     5     6     7     8     9    10    11    12
Prob  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

But we're not interested in the sum of the dice; we're interested in the sample mean. We find the sample mean by dividing the sum by the sample size.

Mean   1.0   1.5   2.0   2.5   3.0   3.5   4.0   4.5   5.0   5.5   6.0
Prob  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

Computing the mean, variance, and standard deviation of this distribution, we get:

Mean: mu = sum [ x * p(x) ] = 3.5
Variance: sigma^2 = sum [ x^2 * p(x) ] - mu^2 = 35/24
Standard deviation: sigma = sqrt( variance ) = sqrt( 35/24 )
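A short Python sketch (purely illustrative, variable names my own) rebuilds the distribution of the sample mean for two dice and confirms that its variance is the single-die variance divided by the sample size:

```python
from itertools import product
from fractions import Fraction
from collections import defaultdict

# Distribution of the sample mean (x-bar) for two fair dice.
dist = defaultdict(Fraction)
for a, b in product(range(1, 7), repeat=2):
    dist[Fraction(a + b, 2)] += Fraction(1, 36)

mu = sum(x * p for x, p in dist.items())
var = sum(x * x * p for x, p in dist.items()) - mu**2

print(mu)    # 7/2
print(var)   # 35/24, which is (35/12) / 2: the population variance over the sample size
```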

Properties of the Sampling Distribution of the Sample Means

When all of the possible sample means are computed, the following properties are true:

 The mean of the sample means will be the mean of the population.
 The variance of the sample means will be the variance of the population divided by the sample size.
 The standard deviation of the sample means (known as the standard error of the mean) will be smaller than the population standard deviation and will be equal to the standard deviation of the population divided by the square root of the sample size.
 If the population has a normal distribution, then the sample means will have a normal distribution.
 If the population is not normally distributed, but the sample size is sufficiently large, then the sample means will have an approximately normal distribution. Some books define sufficiently large as at least 30 and others as at least 31.

The formula for a z-score when working with the sample means is:

z = ( x-bar - mu ) / ( sigma / sqrt(n) )

Finite Population Correction Factor

If the sample size is more than 5% of the population size and the sampling is done without replacement, then a correction needs to be made to the standard error of the means. The adjustment is to multiply the standard error by the square root of the quotient of the difference between the population and sample sizes and one less than the population size. In the following, N is the population size and n is the sample size:

correction factor = sqrt( (N - n) / (N - 1) )

For the most part, we will be ignoring this in class.

Normal Approximation to Binomial

Recall that, according to the Central Limit Theorem, the sample mean of any distribution will become approximately normal if the sample size is sufficiently large. Furthermore, recall that the mean of a binomial distribution is np and the variance of the binomial distribution is npq. It turns out that the binomial distribution can be approximated using the normal distribution if np and nq are both at least 5.

Continuity Correction Factor

There is a problem with approximating the binomial with the normal. That problem arises because the binomial distribution is a discrete distribution while the normal distribution is a continuous distribution. The basic difference is that with discrete values we are talking about heights but no widths, while with the continuous distribution we are talking about both heights and widths.

The correction is to either add or subtract 0.5 of a unit from each discrete x-value. This fills in the gaps to make the distribution continuous. This is very similar to expanding the limits to form the boundaries that we did with grouped frequency distributions.

Examples

Discrete     Continuous
x = 6        5.5 < x < 6.5
x > 6        x > 6.5
x >= 6       x > 5.5
x < 6        x < 5.5
x <= 6       x < 6.5

As you can see, whether or not the "equal to" is included makes a big difference in the discrete distribution and in the way the conversion is performed; however, for a continuous distribution, equality makes no difference.

Steps to working a normal approximation to the binomial distribution

1. Identify success, the probability of success, the number of trials, and the desired number of successes. Since this is a binomial problem, these are the same things which were identified when working a binomial problem.
2. Convert the discrete x to a continuous x. Some people would argue that step 3 should be done before this step, but go ahead and convert the x before you forget about it and miss the problem.
3. Find the smaller of np or nq. If the smaller one is at least five, then the larger must also be, so the approximation will be considered good.
4. Find the mean, mu. When you find np, you're actually finding the mean, so denote it as such.
5. Find the standard deviation, sigma = sqrt(npq). It might be easier to find the variance and just stick the square root in the final calculation -- that way you don't have to work with all of the decimal places.
6. Compute the z-score using the standard formula for an individual score (not the one for a sample mean), and calculate the probability desired.
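The notes don't carry a worked example through these six steps, so here is a hypothetical one in Python (illustrative only; phi is my own helper): approximating P(X = 6) for the earlier 10-coin-toss example and comparing it with the exact binomial answer of 0.2051.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p, q = 10, 0.5, 0.5
# Step 3: np and nq are both 5, so the approximation is considered acceptable.
mu = n * p                      # step 4: mean
sigma = math.sqrt(n * p * q)    # step 5: standard deviation
# Step 2: the continuity correction turns x = 6 into 5.5 < x < 6.5.
z_low = (5.5 - mu) / sigma
z_high = (6.5 - mu) / sigma
approx = phi(z_high) - phi(z_low)   # step 6: probability between the two z-scores
print(round(approx, 4))             # about 0.2045, versus the exact 0.2051
```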

Importance of the Normal Distribution

Parametric Hypothesis Testing

All parametric hypothesis testing that we're going to perform requires normality in some sense.

Population Mean - Either the population was normally distributed, or the sample size was large enough (so the Central Limit Theorem applied and the sampling distribution was approximately normal), or the population was approximately normal and the Student's t was used.
Population Proportion - The binomial distribution (the one that really applies) was approximated using the normal as long as np and nq were at least five. That is another way of saying the expected frequency of each category (success and failure) is at least five.
Population Variance - It was required that the population be normally distributed.
Correlation and Regression - The pairs of data had to have a bivariate normal distribution.
Multinomial Experiment - The expected frequency of each category had to be at least five. This is analogous to approximating the binomial using the normal.
Independence - The expected frequency of each cell had to be at least five. This is analogous to approximating the binomial using the normal.

Distributions

The distributions have normality in them somewhere, too.

Normal Distribution - Well, obviously this one requires normality.
Student's T Distribution - Had to be approximately normal. As the sample size increases, the Student's t approaches the normal distribution.
Chi-squared Distribution - Required a normal population. There is another interesting relationship between the normal and chi-square distributions: if you take a critical value from the normal distribution and square it, you will get the corresponding chi-square value with one degree of freedom, but with twice the area in the tails.

Example: z(0.05)^2 = 1.645^2 = 2.706 = chi-square(1, 0.10)

Binomial Distribution - Obviously, the binomial doesn't require a normal population, but it can be approximated using a normal distribution if the expected frequency of each category is at least five.
Multinomial Distribution - Same as with the binomial: the multinomial can be approximated using the normal if the expected frequency of each category is at least five.
F Distribution - Since F is the ratio of two independent chi-squared variables divided by their respective degrees of freedom, and the chi-squares require a normal distribution, the F distribution is also going to require a normal distribution.

As stated in class and in the lecture notes, your comprehension of the normal distribution is vital for success in the class.
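Finally, the z-squared / chi-square(1) relationship in the example above can be checked numerically. This sketch is only an illustration (the helper names phi and chi2_1_cdf are mine); it uses the fact that a chi-square with one degree of freedom is the distribution of Z^2.

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def chi2_1_cdf(x):
    """CDF of chi-square with 1 df: P(Z^2 <= x) = P(-sqrt(x) <= Z <= sqrt(x))."""
    return 2 * phi(math.sqrt(x)) - 1

z_crit = 1.645                                # z(0.05): upper-tail area 0.05
print(round(phi(z_crit), 4))                  # 0.95 -- area 0.05 in the right tail
print(round(z_crit**2, 3))                    # 2.706
print(round(1 - chi2_1_cdf(z_crit**2), 4))    # 0.10 -- twice the area, now all in one tail
```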