
Anderson

Homework: Data Analysis (1/3)

Name___________________________________________ Date____________________
Test for Independence (Chi-Squared $\chi^2$ test)
1. Form hypothesis.
Ho “null hypothesis”: No relationship exists (variables are Independent)
Ha “alternative hypothesis”: Relationship exists (variables are dependent)
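
2. Calculate the “chi-squared” ($\chi^2$) statistic.

Expected count for each cell:

$$\text{expected} = \frac{(\text{row total}) \times (\text{column total})}{\text{overall total}}$$

$$\chi^2_{\text{statistic}} = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}$$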

3. Find the $\chi^2$ significant value in the table.

$$\chi^2_{\alpha,\,df} = \chi^2_{.05,\ (\#\text{rows} - 1)(\#\text{columns} - 1)}$$

4. Compare the $\chi^2$ statistic to the $\chi^2$ significant value.
$\chi^2$ statistic > $\chi^2$ significant: REJECT Ho. Accept Ha. Relationship exists.
$\chi^2$ statistic < $\chi^2$ significant: FAIL TO REJECT Ho. Not enough evidence to show relationship exists.
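
The four steps above can be sketched in Python. This is only an illustration, not part of the original handout: the 2 x 3 table of counts is hypothetical (it is NOT the brewery survey data), the function names are invented for this sketch, and the critical value 5.991 is the df = 2, alpha = .05 entry of a standard chi-square table.

```python
def expected_counts(observed):
    """Expected count per cell = row total * column total / overall total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_squared_statistic(observed):
    """Sum of (observed - expected)^2 / expected over every cell."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(observed):
    """df = (# rows - 1)(# columns - 1)."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# HYPOTHETICAL counts, made up for illustration only.
observed = [
    [10, 20, 30],
    [20, 20, 20],
]

statistic = chi_squared_statistic(observed)   # about 5.33
df = degrees_of_freedom(observed)             # 2
critical_05 = 5.991                           # table value for alpha = .05, df = 2

if statistic > critical_05:
    print("REJECT Ho: relationship exists.")
else:
    print("FAIL TO REJECT Ho: not enough evidence of a relationship.")
```

With these made-up counts the statistic (about 5.33) is below 5.991, so the sketch prints the "FAIL TO REJECT" message.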
Example:
The market research group for Alber’s Brewery of Tucson, AZ, wants to know whether preferences of beer type (light,
regular, dark) differ by gender (male, female). If beer preference is independent of gender, one advertising
campaign will be initiated. However, if beer preference depends on the gender of the beer drinker, the firm will tailor its
promotions to different target markets.
A survey was conducted and the following data was collected:
At the .05 level of significance, is there a statistically significant
relationship between beer preference and gender? Also
compare at the .01 level of significance.
1. Form Hypothesis:
Ho:
Ha:

2. Find the Chi-Squared Statistic.

$$\chi^2_{\text{statistic}} = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}} =$$

3. Find Chi-Squared Significant value in the table.

$df = (\#\text{rows} - 1)(\#\text{columns} - 1) =$ ________

$\chi^2_{\alpha,\,df} = \chi^2_{.05,\,df} =$ ________

$\chi^2_{\alpha,\,df} = \chi^2_{.01,\,df} =$ ________

4. Compare the $\chi^2$ statistic to the $\chi^2$ significant value.

At the .05 level of significance, is there a statistically significant relationship between beer preference and gender?

At the .01 level of significance, is there a statistically significant relationship between beer preference and gender?


Goodness of Fit test (Chi-Squared $\chi^2$ test)
1. Form hypothesis.
Ho “null hypothesis”: The data are consistent with a specified distribution.
Ha “alternative hypothesis”: The data are NOT consistent with a specified distribution.

2. Calculate chi-squared ($\chi^2$) statistic.

3. Find $\chi^2$ significant. Found in Chi-Square table.

$$\chi^2_{\alpha,\,df} = \chi^2_{.05,\ (\#\text{outcomes of variable} - 1)}$$

4. Compare $\chi^2$ statistic to $\chi^2$ significant.
$\chi^2$ statistic > $\chi^2$ significant: REJECT Ho. Accept Ha. Assumed distribution not true.
$\chi^2$ statistic < $\chi^2$ significant: FAIL TO REJECT Ho. Not enough evidence to reject assumed distribution.

Example: Is Sudden Infant Death Syndrome (SIDS) Seasonal? Data from King County, Washington regarding the number of deaths from SIDS for each season:

Season    f
Winter    78
Spring    71
Summer    87
Fall      86
Total     322
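
As a rough check on the SIDS example, the statistic can be computed with a few lines of Python. This sketch rests on an assumption the handout does not state explicitly: that "not seasonal" means the 322 deaths are expected to split equally across the four seasons (80.5 each). The critical value 7.815 is the df = 3, alpha = .05 entry of a standard chi-square table.

```python
# Observed SIDS deaths per season (from the handout).
observed = {"Winter": 78, "Spring": 71, "Summer": 87, "Fall": 86}

total = sum(observed.values())            # 322
# ASSUMPTION: under Ho every season has the same expected count.
expected = total / len(observed)          # 80.5

chi_sq = sum((f - expected) ** 2 / expected for f in observed.values())
df = len(observed) - 1                    # (# outcomes of variable - 1) = 3
critical_05 = 7.815                       # table value for alpha = .05, df = 3

reject = chi_sq > critical_05
print(f"chi-squared = {chi_sq:.3f}, df = {df}, reject Ho: {reject}")
```

Under that equal-split assumption the statistic is about 2.10, well below 7.815, so Ho would not be rejected at the .05 level.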

We will learn how to use probability distributions to describe a variable and to perform simulations. This assignment will focus on how we can test the hypothesis that observed data follow a particular distribution. The Goodness of Fit test quantifies how well or how poorly a model fits the data using a chi-squared test.

Problem definition: Let X be the number of defects in printed circuit boards. A random sample of n = 56 printed circuit boards is taken and the number of defects recorded. The results were as follows:

Number of Defects     0     1     2+
Observed frequency    32    15    9

Does the assumption of a Poisson distribution seem appropriate as a model for these data?
The null hypothesis is H0: The data follow a Poisson distribution.
The alternative hypothesis is H1: The data do not follow a Poisson distribution.

The mean of the (assumed) Poisson distribution is unknown, so it must be estimated from the data. To estimate the mean from the frequency table, you must calculate the EXPECTED VALUE. Use the following formula for the mean:

µ = [(32 × 0) + (15 × 1) + (9 × 2)] / 56 = 0.59

Using the Poisson distribution with µ = 0.59, we can compute the hypothesized probabilities associated with each class. From these we can calculate the expected frequencies (under the null hypothesis). Use the following formula to find P(0 defects), P(1 defect), and P(2+ defects). Then, multiply these probabilities by 56 to find the hypothesized expected frequencies. Fill in the boxes below.

$p(x = 0) = \dfrac{e^{-\mu}\mu^{x}}{x!} =$ ________     $E(x) =$ ________ $\times\ 56 =$ ________

$p(x = 1) = \dfrac{e^{-\mu}\mu^{x}}{x!} =$ ________     $E(x) =$ ________ $\times\ 56 =$ ________

$p(x = 2) = \dfrac{e^{-\mu}\mu^{x}}{x!} =$ ________     $E(x) =$ ________ $\times\ 56 =$ ________

You can now construct the following table.
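
The mean estimate and the expected frequencies described above can be checked with a short Python sketch. Two choices here are assumptions made to mirror a hand calculation rather than anything the handout prescribes: the mean and each probability are rounded to two decimals before use, and the last class is evaluated at exactly x = 2, as in the handout's formula (a tail probability 1 - P(0) - P(1) would be the alternative for a "2+" class and gives a slightly larger value). The name `poisson_pmf` is invented for this sketch.

```python
import math

# Observed frequencies: number of defects -> number of boards (n = 56).
counts = {0: 32, 1: 15, 2: 9}
n = sum(counts.values())

# Estimated mean: [(32 x 0) + (15 x 1) + (9 x 2)] / 56, rounded to 0.59.
mu = round(sum(x * f for x, f in counts.items()) / n, 2)


def poisson_pmf(x, mu):
    """Poisson probability p(x) = e^(-mu) * mu^x / x!."""
    return math.exp(-mu) * mu ** x / math.factorial(x)


# ASSUMPTION: round each probability to two decimals, as in a hand calculation.
probs = {x: round(poisson_pmf(x, mu), 2) for x in counts}

# Expected frequency = probability x 56.
expected = {x: round(p * n, 2) for x, p in probs.items()}

for x in counts:
    print(f"x = {x}: p = {probs[x]:.2f}, expected = {expected[x]:.2f}")
```

With this rounding the probabilities come out to 0.55, 0.33 and 0.10, giving expected frequencies of 30.80, 18.48 and 5.60.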

Number of Defects     0        1        2+
Observed frequency    32       15       9
Expected Frequency    30.80    18.48    5.60

Using a chi-square test to perform a goodness of fit test, determine if the observed data follow a Poisson distribution. Use a .05 level of significance. Interpret your results.
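
A minimal Python sketch of this goodness of fit calculation follows, using the observed frequencies (32, 15, 9) and the expected frequencies (30.80, 18.48, 5.60) from the table. The degrees of freedom are shown two ways, and which one applies is a judgment call: the handout's rule (# outcomes - 1) gives df = 2, while many textbooks subtract one more because the Poisson mean was estimated from the data, giving df = 1. The critical values 5.991 and 3.841 are the alpha = .05 entries for df = 2 and df = 1 in a standard chi-square table.

```python
observed = [32, 15, 9]
expected = [30.80, 18.48, 5.60]

# Chi-squared statistic: sum of (observed - expected)^2 / expected.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Handout rule: df = (# outcomes of variable - 1).
df_handout = len(observed) - 1
# ASSUMPTION (common textbook adjustment): subtract one more df
# because the mean was estimated from the same data.
df_adjusted = len(observed) - 1 - 1

critical_05 = {1: 3.841, 2: 5.991}   # alpha = .05 table values

for df in (df_handout, df_adjusted):
    reject = chi_sq > critical_05[df]
    print(f"df = {df}: chi-squared = {chi_sq:.3f}, "
          f"critical = {critical_05[df]}, reject Ho: {reject}")
```

The statistic is about 2.77, which is below both critical values, so under either choice of df the sketch fails to reject the Poisson assumption at the .05 level.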

Table: Chi-Square Probabilities

                                 Statistical significance, α
 df    0.995     0.99    0.975     0.95     0.90     0.10     0.05    0.025     0.01    0.005
  1      ---      ---    0.001    0.004    0.016    2.706    3.841    5.024    6.635    7.879
  2    0.010    0.020    0.051    0.103    0.211    4.605    5.991    7.378    9.210   10.597
  3    0.072    0.115    0.216    0.352    0.584    6.251    7.815    9.348   11.345   12.838
  4    0.207    0.297    0.484    0.711    1.064    7.779    9.488   11.143   13.277   14.860
  5    0.412    0.554    0.831    1.145    1.610    9.236   11.070   12.833   15.086   16.750
  6    0.676    0.872    1.237    1.635    2.204   10.645   12.592   14.449   16.812   18.548
  7    0.989    1.239    1.690    2.167    2.833   12.017   14.067   16.013   18.475   20.278
  8    1.344    1.646    2.180    2.733    3.490   13.362   15.507   17.535   20.090   21.955
  9    1.735    2.088    2.700    3.325    4.168   14.684   16.919   19.023   21.666   23.589
 10    2.156    2.558    3.247    3.940    4.865   15.987   18.307   20.483   23.209   25.188
 11    2.603    3.053    3.816    4.575    5.578   17.275   19.675   21.920   24.725   26.757
 12    3.074    3.571    4.404    5.226    6.304   18.549   21.026   23.337   26.217   28.300
 13    3.565    4.107    5.009    5.892    7.042   19.812   22.362   24.736   27.688   29.819
 14    4.075    4.660    5.629    6.571    7.790   21.064   23.685   26.119   29.141   31.319
 15    4.601    5.229    6.262    7.261    8.547   22.307   24.996   27.488   30.578   32.801
 16    5.142    5.812    6.908    7.962    9.312   23.542   26.296   28.845   32.000   34.267
 17    5.697    6.408    7.564    8.672   10.085   24.769   27.587   30.191   33.409   35.718
 18    6.265    7.015    8.231    9.390   10.865   25.989   28.869   31.526   34.805   37.156
 19    6.844    7.633    8.907   10.117   11.651   27.204   30.144   32.852   36.191   38.582
 20    7.434    8.260    9.591   10.851   12.443   28.412   31.410   34.170   37.566   39.997
 21    8.034    8.897   10.283   11.591   13.240   29.615   32.671   35.479   38.932   41.401
 22    8.643    9.542   10.982   12.338   14.041   30.813   33.924   36.781   40.289   42.796
 23    9.260   10.196   11.689   13.091   14.848   32.007   35.172   38.076   41.638   44.181
 24    9.886   10.856   12.401   13.848   15.659   33.196   36.415   39.364   42.980   45.559
 25   10.520   11.524   13.120   14.611   16.473   34.382   37.652   40.646   44.314   46.928
 26   11.160   12.198   13.844   15.379   17.292   35.563   38.885   41.923   45.642   48.290
 27   11.808   12.879   14.573   16.151   18.114   36.741   40.113   43.195   46.963   49.645
 28   12.461   13.565   15.308   16.928   18.939   37.916   41.337   44.461   48.278   50.993
 29   13.121   14.256   16.047   17.708   19.768   39.087   42.557   45.722   49.588   52.336
 30   13.787   14.953   16.791   18.493   20.599   40.256   43.773   46.979   50.892   53.672
 40   20.707   22.164   24.433   26.509   29.051   51.805   55.758   59.342   63.691   66.766
 50   27.991   29.707   32.357   34.764   37.689   63.167   67.505   71.420   76.154   79.490
 60   35.534   37.485   40.482   43.188   46.459   74.397   79.082   83.298   88.379   91.952
 70   43.275   45.442   48.758   51.739   55.329   85.527   90.531   95.023  100.425  104.215
 80   51.172   53.540   57.153   60.391   64.278   96.578  101.879  106.629  112.329  116.321
 90   59.196   61.754   65.647   69.126   73.291  107.565  113.145  118.136  124.116  128.299
100   67.328   70.065   74.222   77.929   82.358  118.498  124.342  129.561  135.807  140.169