
# Defining Hypotheses

In the real world we often need to make decisions based on population parameters.
Hypothesis testing helps us make these decisions.
Does a drug reduce blood pressure?
Does reduced class size increase test scores?
Is a person innocent or guilty of a crime?
Does more money spent on education in low-income areas improve student performance?

## Upper One-Sided Alternative Hypothesis
Assume that with the current drug 10% of all pancreatic cancer patients
survive for five years. A new drug is being tested. Let p =fraction of
pancreatic patients receiving the new drug who survive for five years.
Then we should test

H0: p ≤ .10  Ha: p > .10

To determine whether we should accept or reject the null hypothesis, we
would give the new drug to a sample of pancreatic cancer patients and
look at phat = fraction of sampled patients receiving the new drug who survive
five years. If phat ≤ .10 it is clear we should accept the null hypothesis, but
what if phat = .13 or phat = .15?

In this example our alternative hypothesis specifies that the population
parameter is greater than the values specified in the null hypothesis.
Such an alternative hypothesis is called an upper one-sided alternative
hypothesis.

## Lower One-Sided Alternative Hypothesis

The average US family income in 2015 was
\$79,263. You are interested in knowing
whether your Congressional District has a
lower average income than the US as a
whole. Define µ = average family income in
your Congressional District. Then the hypotheses are

H0: µ = \$79,263 (or µ ≥ \$79,263)  Ha: µ < \$79,263

Our null hypothesis is that our district is no
worse on income than the rest of the US.

We would now take a simple random sample
of families in our district and calculate the
sample mean xbar.

If xbar = \$80,000, it is clear that we should accept
the null hypothesis. However, if xbar = \$75,000 or
xbar = \$72,000, it is not clear whether we should
accept or reject the null hypothesis.

In this example our alternative hypothesis specifies
that the population parameter is smaller than the
values specified in the null hypothesis. Such an
alternative hypothesis is called a lower one-sided
alternative hypothesis.

## Two-Tailed Alternative Hypotheses

Often we want to know if it is reasonable to assume that two populations have equal variance. When looking at annual investment returns, the standard deviation of annual percentage returns is referred to as volatility. In this situation our hypotheses are

H0: Annual Variance of Stock Returns = Annual Variance of Bond Returns
Ha: Annual Variance of Stock Returns ≠ Annual Variance of Bond Returns

We could now look at, say, the last 10 years of annual returns on stocks and bonds. If the sample variances of the annual percentage returns on stocks and bonds are relatively close, we would accept H0, while if the sample variances differ greatly, we would reject the null hypothesis. In this example our alternative hypothesis does not specify a particular direction for the deviation of the variances from equality. Therefore, the alternative hypothesis is called a two-sided alternative hypothesis.

## One-Tailed or Two-Tailed Test

Some statisticians believe you should always use a two-tailed test, because a priori you have no idea of the direction in which deviations from the null hypothesis will occur. Other statisticians feel that if a deviation from the null hypothesis in either direction is of interest, then a two-tailed alternative hypothesis should be used, while if a deviation from the null hypothesis is of interest in only one direction, then a one-tailed alternative hypothesis should be used.

## Type I and Type II Error

There are two types of errors that can be made in hypothesis testing:

- Type I Error: Reject H0 given H0 is true.
- Type II Error: Accept H0 given H0 is not true.

We let α = Probability of making a Type I Error; α is often called the level of significance of the test. We define β = Probability of making a Type II Error.

In US criminal trials the defendant is innocent until proven guilty. In this situation, if we define H0: Defendant Innocent and Ha: Defendant Guilty, then a Type I error corresponds to convicting an innocent defendant, while a Type II error corresponds to allowing a guilty person to go free. Since a 12-0 vote is needed for conviction, it is clear that the US judicial system considers a Type I error to be costlier than a Type II error.

### Type I and Type II Error for Example 1

Let's return to Example 1: H0: p ≤ .10, Ha: p > .10. A Type I error results when we reject p ≤ .10 when in reality p ≤ .10. This corresponds to concluding the drug is an improvement when the drug is actually not an improvement.

A Type II error results when we accept p ≤ .10 when actually p > .10. This corresponds to concluding the drug is not an improvement when the drug is actually an improvement.

## Null and Alternative Hypotheses

The null hypothesis is the status quo (Defendant Innocent); it needs lots of evidence to overturn it. The alternative hypothesis competes with the null hypothesis (Defendant Guilty); you need lots of proof to reject the null hypothesis.

### Examples

Harvard vs. Penn State. We want to determine whether a Harvard education is superior to a Penn State education. Average salary ten years out is much higher for Harvard grads, but that is the wrong statistic; instead, look at incomes ten years out for students accepted to both schools.
H0: Mean income of PSU grads = Mean income of Harvard grads
Ha: Mean income of PSU grads ≠ Mean income of Harvard grads
This is a two-tailed test, and H0 was accepted!

Teacher attendance in India. Esther Duflo, MIT economist, has changed the world! How do you get teachers in India to be absent less, when some days 40% of teachers are absent? Divide schools randomly into two groups: in half the schools, teachers are paid an extra \$1.15 per day of attendance.
H0: Teacher attendance with \$0 incentive = Teacher attendance with \$1.15 incentive
Ha: Teacher attendance with \$0 incentive − Teacher attendance with \$1.15 incentive < 0
This is a lower one-sided alternative. It could also be set up as Ha: Teacher attendance with \$1.15 incentive − Teacher attendance with \$0 incentive > 0, an upper one-sided alternative.

Coffee and swimming. I swim 100 yards on my "speed test" in an average of 88 seconds. Does drinking coffee before practice make me swim faster?
H0: Mean time with coffee = 88 seconds
Ha: Mean time with coffee < 88 seconds
This is a left-tailed test.

Coke vs. Pepsi. Ask cola drinkers if they prefer Coke to Pepsi. You work for Coca-Cola and want to say in an ad that you won the taste test. Let p = fraction of people who prefer Coke to Pepsi.
H0: p ≤ 0.5  Ha: p > 0.5
This is an upper one-sided alternative.

A bottling company needs to produce bottles that will hold 12 ounces of liquid for a local beer maker. Periodically, the company gets complaints that their bottles are not holding enough liquid. What hypotheses would you test to help the beer company?

H0: Mean ounces in a bottle ≤ 12 oz  Ha: Mean ounces in a bottle > 12 oz

## Critical Region

The critical region is the range of values for a sample statistic that results in rejection of H0. Our approach to hypothesis testing will be to set a small probability α (usually 0.05) of making a Type I Error and then choose a critical region that minimizes the probability of making a Type II Error. In Example 1 the critical region will be phat ≥ (some cutoff); in Example 2 the critical region is xbar ≤ (some cutoff).

Recall the hypotheses from our two examples:

Example 1: H0: p ≤ .10  Ha: p > .10
Example 2: H0: µ = \$79,263 (or µ ≥ \$79,263)  Ha: µ < \$79,263

## One Sample Z-Test

Use the one sample Z-test to test a hypothesis about µ when the population variance is known, or when n ≥ 30, because then xbar will be approximately normal by the Central Limit Theorem. Use s for σ if σ is unknown. Useful percentiles: z.025 = -1.96 and z.05 = -1.645.

### Critical Region for a One Sample Z-Test

Passing the HISTEP test is required for graduation in the state of Fredonia. The average state score on the test is 75. A random sample of 49 students at Cooley High has xbar = 79 and s = 15. For α = 0.05, would you conclude that Cooley High students perform differently than the typical state student? Let µ = Cooley High mean score. There is no reason to believe that Cooley High is better or worse than the state, so we will use a two-tailed test:

H0: µ = 75  Ha: µ ≠ 75

Then we reject H0 if |79 − 75| ≥ 1.96 × 15/sqrt(49) = 4.2. Since this is false, we accept the null hypothesis and conclude that the average Cooley High score does not differ from the state average.

If we have reason to think Cooley High is better, use a one-tailed test: H0: µ = 75, Ha: µ > 75. We reject the null hypothesis if 79 ≥ 75 + (1.645 × 15)/sqrt(49) = 78.525. Since this is true, we reject with the one-tailed test but accept with the two-tailed test: for the same α it takes more proof to reject the null hypothesis with a two-tailed test.

## P-Values

The level of significance chosen is rather arbitrary. For that reason, most statisticians use the concept of probability values (P-values) to report the outcome of a hypothesis test. The P-value for a hypothesis test is the smallest value of α for which the data indicates rejection of H0. Thus P-value ≤ α if and only if we reject H0, and P-value > α if and only if we accept H0. The P-value may also be interpreted as the probability of observing (given H0 is true) a value of the test statistic at least as extreme as the observed value of the test statistic. If we let XBAR represent the random variable for the sample mean under H0 and xbar be the observed value of the sample mean, then the P-value for the one sample Z-test is computed as in the following example.

All probabilities are computed under the assumption that H0 is true. In our Cooley High example (xbar = 79, n = 49, s = 15), remember from Module 4 that the standard deviation of XBAR is σ/sqrt(n) = 15/sqrt(49). For the two-tailed test H0: µ = 75, Ha: µ ≠ 75, the P-value is Prob(|XBAR − 75| ≥ 4) = 2 × Prob(XBAR ≥ 79), which can be computed as 2*(1-NORM.DIST(79,75,15/SQRT(49),TRUE)) = 2 × 0.030974 = 0.061948. Since our P-value of 0.06 > 0.05, for the two-tailed test we accept H0. For the one-tailed test H0: µ = 75, Ha: µ > 75, the P-value is simply Prob(XBAR ≥ 79) = 0.030974. Since our P-value of 0.03 < 0.05, we reject H0.
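These Excel calculations are easy to check outside the spreadsheet. Here is a minimal sketch using Python's scipy library (an assumption of this note; NORM.DIST corresponds to `norm.cdf`, so the upper-tail probability is `norm.sf`):

```python
from math import sqrt
from scipy.stats import norm

# Cooley High example: n = 49, xbar = 79, s = 15, mu0 = 75
n, xbar, s, mu0 = 49, 79, 15, 75
se = s / sqrt(n)                    # standard deviation of XBAR
z = (xbar - mu0) / se               # observed z-statistic

p_one_tailed = norm.sf(z)           # Prob(XBAR >= 79), about 0.0310
p_two_tailed = 2 * norm.sf(abs(z))  # Prob(|XBAR - 75| >= 4), about 0.0619

print(p_one_tailed, p_two_tailed)
```

The one-tailed value reproduces the 0.030974 from NORM.DIST, and doubling it gives the two-tailed 0.061948.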


## Definition of the T Random Variable

If the population is normal with mean µ and the standard deviation is unknown, then

(xbar − µ) / (s/sqrt(n))

follows a T random variable with n − 1 degrees of freedom, where s = sample standard deviation and n = sample size.

As the degrees of freedom increase, the T random variable approaches the standard normal.

[Table: density values of the standard normal and of T random variables with 5, 15, and 30 degrees of freedom, tabulated for x from -3.5 to 3.5; omitted here.]


[Figure: T and Normal Densities. Density curves for the standard normal and for T random variables with 5, 15, and 30 degrees of freedom; the x-axis runs from -4 to 4 and the y-axis from 0 to 0.45.]

## One Sample Hypothesis for Mean: Small Sample, Normal Population, Variance Unknown

Use T.INV to get percentiles of the T random variable:

- 2.5th percentile, 28 df: -2.0484  =T.INV(0.025,28)
- 97.5th percentile, 28 df: 2.0484  =T.INV(0.975,28)
- 0.5th percentile, 13 df: -3.0123  =T.INV(0.005,13)
- 99.5th percentile, 13 df: 3.0123  =T.INV(0.995,13)

Use T.DIST to get T probabilities:

- Prob(T10 ≥ 2) = 0.0367  =1-T.DIST(2,10,1)
- Prob(T10 ≤ -2) = 0.0367  =T.DIST(-2,10,1)

Basically, one sample t-tests look just like one sample Z-tests, with s replacing σ and the t percentiles replacing the Z percentiles.

### P-Values for the T-Test

The P-value is computed from t, the observed value of the T-statistic.
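For readers working outside Excel, the same quantities come from scipy's t distribution (an assumption of this note; T.INV corresponds to `t.ppf`, and 1-T.DIST(x,df,TRUE) corresponds to `t.sf`):

```python
from scipy.stats import t

lo = t.ppf(0.025, 28)   # 2.5th percentile, 28 df: about -2.0484, matching T.INV(0.025,28)
hi = t.ppf(0.975, 28)   # 97.5th percentile, 28 df: about 2.0484
upper = t.sf(2, 10)     # Prob(T10 >= 2): about 0.0367, matching 1-T.DIST(2,10,1)
lower = t.cdf(-2, 10)   # Prob(T10 <= -2): about 0.0367, by symmetry

print(lo, hi, upper, lower)
```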
### Example

Passing the HISTEP test is required for graduation in the state of Fredonia. The average state score on the test is 75. A random sample of 25 students at Cooley High finds xbar = 81 and s = 15. For α = 0.05, would you conclude that Cooley High students perform differently than the typical state student?

We use a two-tailed test because, before doing the test, we have no view about whether Cooley High students will perform better or worse than the typical state student. Then we have

H0: µ = 75  Ha: µ ≠ 75

Using the function T.INV(0.025,24) we find t(.025,24) = -2.06. We reject H0 if

|81 − 75| ≥ 2.06 × 15/sqrt(25) = 6.18.

This is not true, so we accept H0.

The p-value for this test is 2 × Prob(T24 ≥ (81−75)/(15/sqrt(25))) = 2 × Prob(T24 ≥ 2). Prob(T24 ≥ 2) may be computed with the formula =1-T.DIST(2,24,TRUE), which returns 0.028. Therefore, the p-value for this test is 2 × 0.028 = 0.056. Since the P-value is > 0.05, we accept H0.
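The whole test can be scripted from the summary statistics (we do not have the raw scores). A sketch in Python with scipy, assumed available:

```python
from math import sqrt
from scipy.stats import t

# Cooley High t-test: n = 25, xbar = 81, s = 15, mu0 = 75
n, xbar, s, mu0 = 25, 81, 15, 75
t_stat = (xbar - mu0) / (s / sqrt(n))        # (81 - 75) / 3 = 2.0
p_two_tailed = 2 * t.sf(abs(t_stat), n - 1)  # two-tailed p-value, 24 df

print(t_stat, p_two_tailed)  # p is about 0.057 > 0.05, so accept H0
```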

## Testing a Proportion

A player makes 300 of 400 free throws. Has she improved from being a 70% foul shooter? Let p = chance the player makes a free throw after the change; trials = 400, successes = 300, p0 = 0.70, α = 0.05. We test H0: p ≤ 0.70 vs. Ha: p > 0.70. The general recipe, with the p-value from BINOM.DIST.RANGE:

- Right-tailed test (H0: p ≤ p0, Ha: p > p0): p-value = 0.01553  =BINOM.DIST.RANGE(trials,Pzero,successes,trials)
- Left-tailed test (H0: p ≥ p0, Ha: p < p0): p-value = 0.98838  =BINOM.DIST.RANGE(trials,Pzero,0,successes)
- Two-tailed test (H0: p = p0, Ha: p ≠ p0): p-value = 2*MIN(left-tailed, right-tailed) = 0.03106

In each case, reject H0 if p-value ≤ α. Here the right-tailed p-value of 0.016 ≤ 0.05, so we reject the null hypothesis: she has improved.
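The exact binomial p-value that BINOM.DIST.RANGE produces can be reproduced with scipy's `binomtest` (an assumption of this note; note that scipy's default two-sided method differs from Excel's 2*MIN rule, so the sketch shows the one-tailed test):

```python
from scipy.stats import binomtest

# 300 makes in 400 attempts; H0: p <= 0.70, Ha: p > 0.70
result = binomtest(k=300, n=400, p=0.70, alternative="greater")
print(result.pvalue)  # about 0.0155, matching the right-tailed BINOM.DIST.RANGE value
```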

A coin comes up heads in 217 of 400 tosses. Is the coin fair? Let p = chance the coin comes up heads; trials = 400, successes = 217, p0 = 0.5, α = 0.05. We test H0: p = 0.5 vs. Ha: p ≠ 0.5. The left-tailed p-value is 0.96001 and the right-tailed p-value is 0.04941, so the two-tailed p-value = 2*MIN(0.96001, 0.04941) = 0.09882. Since 0.098 > 0.05, we accept that the coin is fair.

Let p = fraction of late flights after changing the boarding process; trials = 300, successes = 89 late flights, p0 = 0.30, α = 0.01. We test H0: p ≥ 0.30 vs. Ha: p < 0.30, a left-tailed test. The left-tailed p-value =BINOM.DIST.RANGE(300,0.30,0,89) = 0.47823 (the right-tailed p-value is 0.57174 and the two-tailed p-value 0.95646). Since the left-tailed p-value far exceeds α = 0.01, we cannot reject H0. As always, reject H0 if p-value ≤ α.

## z-Test: Two Sample for Means

Do marketing and finance majors earn different average salaries? We take large independent samples of each group and run Excel's z-Test: Two Sample for Means.

H0: Mean Marketing = Mean Finance
Ha: Mean Marketing ≠ Mean Finance

| | Marketing | Finance |
|---|---|---|
| Mean | 98.65 | 109.19 |
| Known Variance | 131.66 | 144.02 |
| Observations | 227 | 211 |

The output gives Hypothesized Mean Difference 0, z = -9.38, P(Z<=z) two-tail ≈ 0, and z Critical two-tail 1.96. The P-value is essentially 0, so we reject the null hypothesis and conclude there is a significant difference between the average salaries of marketing and finance majors. (The raw salary data for all 227 marketing and 211 finance majors is omitted here.)
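Excel's z-Test: Two Sample for Means has no one-line scipy equivalent, but it is easy to compute from the summary statistics; a sketch using the sample statistics reported above (scipy assumed available):

```python
from math import sqrt
from scipy.stats import norm

# summary statistics from the worksheet output
mean_mkt, var_mkt, n_mkt = 98.6476, 131.6629, 227
mean_fin, var_fin, n_fin = 109.1896, 144.0210, 211

# two-sample z-statistic with known variances
z = (mean_mkt - mean_fin) / sqrt(var_mkt / n_mkt + var_fin / n_fin)
p_two_tailed = 2 * norm.sf(abs(z))

print(z, p_two_tailed)  # z is about -9.38; the p-value is essentially 0
```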

## Testing the Hypothesis of Equal Variances (Normal Populations)

A hybrid section (14 students) and an in-person section (18 students) take the same exam:

Hybrid: 87, 94, 86, 89, 74, 84, 85, 85, 92, 90, 77, 82, 94, 84
In Person: 88, 96, 84, 82, 81, 85, 90, 90, 89, 95, 88, 89, 93, 85, 87, 83, 88, 84

H0: Variance of hybrid scores = Variance of in-person scores
Ha: Variance of hybrid scores ≠ Variance of in-person scores

Use the F.TEST function to get the p-value, and reject H0 if the p-value ≤ α. Here =F.TEST(F5:F18,G5:G22) returns 0.2208, so we accept the null hypothesis that the population variances are equal.
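scipy has no direct F.TEST equivalent, but the same two-tailed p-value can be computed from the variance ratio and the F distribution; a sketch with the class data transcribed from the worksheet:

```python
from statistics import variance  # sample variance (n-1 denominator)
from scipy.stats import f

hybrid = [87, 94, 86, 89, 74, 84, 85, 85, 92, 90, 77, 82, 94, 84]
in_person = [88, 96, 84, 82, 81, 85, 90, 90, 89, 95, 88, 89, 93, 85, 87, 83, 88, 84]

F = variance(hybrid) / variance(in_person)           # ratio of sample variances
p_one = f.sf(F, len(hybrid) - 1, len(in_person) - 1)  # upper-tail probability
p_two = 2 * min(p_one, 1 - p_one)                     # two-tailed, as F.TEST reports

print(p_two)  # about 0.221, matching =F.TEST(...)
```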

## Four Tests for Differences Between Population Means

| Situation | Name of Test |
|---|---|
| Large sample size (n ≥ 30) from each population, and the samples from the two populations are independent | z-Test: Two Sample for Means |
| Small sample size (n < 30) from at least one population, populations normal, variances unknown but equal, and the samples from the two populations are independent | t-Test: Two-Sample Assuming Equal Variances |
| Small sample size (n < 30) from at least one population, populations normal, variances unknown but unequal, and the samples from the two populations are independent | t-Test: Two-Sample Assuming Unequal Variances |
| The two populations are normal and the observations from the two populations can be paired in a natural fashion | t-Test: Paired Two Sample for Means |

## t-Test: Two-Sample Assuming Equal Variances

For the hybrid vs. in-person exam scores, the sample skewness and kurtosis of both groups are close to 0, consistent with normality, and we accepted H0 that the population variances are equal (F.TEST p-value 0.2208). So we test the mean difference with the equal-variance t-test:

Test H0: Mean Hybrid = Mean In Person
Test Ha: Mean Hybrid ≠ Mean In Person

The t-Test: Two-Sample Assuming Equal Variances output: means 85.93 and 87.61; variances 33.92 and 18.02; observations 14 and 18; pooled variance 24.91; hypothesized mean difference 0; df 30; t Stat -0.946; P(T<=t) one-tail 0.175; t Critical one-tail 1.697; P(T<=t) two-tail 0.352; t Critical two-tail 2.042. For α = 0.05, the two-tailed P-value of 0.35 > 0.05, so we accept H0 that the mean scores in the hybrid and in-person classes are the same.
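scipy's `ttest_ind` performs the same equal-variance test (its default is `equal_var=True`); a sketch with the class data transcribed from the worksheet:

```python
from scipy.stats import ttest_ind

hybrid = [87, 94, 86, 89, 74, 84, 85, 85, 92, 90, 77, 82, 94, 84]
in_person = [88, 96, 84, 82, 81, 85, 90, 90, 89, 95, 88, 89, 93, 85, 87, 83, 88, 84]

# pooled-variance two-sample t-test
stat, p = ttest_ind(hybrid, in_person, equal_var=True)
print(stat, p)  # t about -0.946, two-tailed p about 0.35: accept H0
```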

## t-Test: Two-Sample Assuming Unequal Variances

A placebo group (14 patients) and a drug group (18 patients) record their reductions in cholesterol.

Test H0: Variance Placebo = Variance Drug vs. Ha: Variance Placebo ≠ Variance Drug. The sample variances differ greatly (about 11.12 for placebo vs. 1.12 for drug), skewness and kurtosis are consistent with normality, and the F.TEST p-value is far below 0.05, so we reject H0 that the variances are equal.

We therefore test the mean difference with the unequal-variance t-test:

Test H0: Mean Placebo = Mean Drug
Test Ha: Mean Placebo < Mean Drug

The t-Test: Two-Sample Assuming Unequal Variances output: observations 14 and 18; hypothesized mean difference 0; df 15; t Stat -8.08; P(T<=t) one-tail 3.8E-07; P(T<=t) two-tail 7.6E-07. The one-tailed p-value is about 4 in 10 million, so we reject the null hypothesis and conclude the drug is significantly better at reducing cholesterol than the placebo. (The raw data is omitted here.)
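With only the summary output available, the Welch test can be re-run from the statistics via scipy's `ttest_ind_from_stats`. The means below are approximate values inferred from the worksheet output (a mean difference of about -7.5 reproduces the reported t Stat), so treat this as a sketch rather than the exact calculation:

```python
from math import sqrt
from scipy.stats import ttest_ind_from_stats

# approximate worksheet summary statistics (placebo vs. drug cholesterol reduction);
# the means are assumptions of this sketch, the variances come from the worksheet
stat, p = ttest_ind_from_stats(
    mean1=2.59, std1=sqrt(11.12), nobs1=14,
    mean2=10.07, std2=sqrt(1.12), nobs2=18,
    equal_var=False,  # Welch: do not pool the unequal variances
)
print(stat, p)  # t near -8.1; two-tailed p far below 0.001: reject H0
```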

## t-Test: Paired Two Sample for Means

| Goal | Blocking Variable | Design | Treatment Variable |
|---|---|---|---|
| Test whether a drug reduces cholesterol | Physical characteristics of patients | Pick ten pairs of people matched on age, weight, and cholesterol; flip a coin to choose which member of each pair receives the drug and which the placebo | Difference between drug and placebo |
| Test whether a new type of insulation reduces heating bills | Size and design of home | Pick ten pairs of houses that had the same heating bill last winter; flip a coin to choose which house in each pair gets the new insulation, while the other keeps its old insulation | Difference between new and old insulation |
| Test whether cross training (not just swimming) improves a swimmer's time | Swimmer's ability | Pick 15 pairs of swimmers with identical best times in their event; flip a coin to choose the swimmer in each pair who starts cross training | Difference between cross training and in-water-only training |

In each of these situations we are blocking the effect of a variable on the response and focusing on the differences due to the treatment variable.

For the insulation study, the table below gives each house's change in heating bill.

H0: Mean change in heating bill with old insulation = mean change with new insulation
Ha: Mean change in heating bill with old insulation ≠ mean change with new insulation

| Observation | Old Insulation | New Insulation |
|---|---|---|
| 1 | -34 | 23 |
| 2 | 6 | 16 |
| 3 | 31 | -28 |
| 4 | 10 | 29 |
| 5 | -2 | 30 |
| 6 | -12 | -72 |
| 7 | 49 | -46 |
| 8 | -15 | -55 |
| 9 | -45 | 21 |
| 10 | -17 | -61 |

The t-Test: Paired Two Sample for Means output: means -2.9 and -14.3; hypothesized mean difference 0; df 9; t Stat 0.653; P(T<=t) two-tail 0.53. Since the P-value = 0.53 > 0.05, we accept H0.
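scipy's `ttest_rel` runs the paired test directly on the two columns; a sketch with the insulation data:

```python
from scipy.stats import ttest_rel

old_insulation = [-34, 6, 31, 10, -2, -12, 49, -15, -45, -17]
new_insulation = [23, 16, -28, 29, 30, -72, -46, -55, 21, -61]

# paired t-test on the ten house-by-house differences
stat, p = ttest_rel(old_insulation, new_insulation)
print(stat, p)  # t about 0.65, two-tailed p about 0.53: accept H0
```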

## Chi-Square Test for Independence

Are eye color and gender independent?

H0: Eye color and gender are independent
Ha: Eye color and gender are not independent

Observed counts:

| Gender | Blue | Brown | Green | Hazel | Total |
|---|---|---|---|---|---|
| Female | 370 | 352 | 198 | 187 | 1107 |
| Male | 359 | 290 | 110 | 169 | 928 |
| Total | 729 | 642 | 308 | 356 | 2035 |

Within each gender the eye-color percentages differ noticeably (for example, 17.89% of females but only 11.85% of males have green eyes), suggesting dependence. Under independence, the expected count in each cell is Eij = (row total × column total) / grand total; for example, the expected number of blue-eyed females is 1107 × 729 / 2035 = 396.56. The test statistic is the sum over all cells of (Oij − Eij)²/Eij, which here equals 16.59, with (R−1)(C−1) = (2−1)(4−1) = 3 degrees of freedom. The 95th percentile cutoff is =CHISQ.INV(0.95,3) = 7.81, and 16.59 > 7.81, so we reject H0: eye color and gender are not independent! The p-value can be obtained with =CHISQ.DIST.RT(16.59,3) = 0.000858, or directly with =CHISQ.TEST(observed range, expected range), which also returns 0.000858.
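The whole chi-square calculation (expected counts, test statistic, and p-value) is one call to scipy's `chi2_contingency`; a sketch:

```python
from scipy.stats import chi2_contingency

observed = [
    [370, 352, 198, 187],  # Female: Blue, Brown, Green, Hazel
    [359, 290, 110, 169],  # Male
]

chi2, p, dof, expected = chi2_contingency(observed, correction=False)
print(chi2, p, dof)   # statistic about 16.59, p about 0.00086, 3 df
print(expected[0][0]) # expected blue-eyed females: about 396.56
```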