# Tutorial 12

Scope of this tutorial:

1. Goodness-of-Fit Test (chi sq test of proportions)
   Ho: π1=xxx, π2=xxx, π3=xxx, etc.
   (Note: One categorical variable only.)
2. χ² Test of Independence (chi sq test of association)
   Ho: There is no association between X and Y (Or: X and Y are independent)
   (Note: Two categorical variables.)

## Worksheet 1: Chi sq Goodness-of-Fit Test

Research question: Have the proportions of students holidaying within NSW, interstate and overseas changed over the past 8 years?

2003 past history: 50% NSW, 20% interstate, 30% overseas

Now 2011: Sample: One tutorial class, n=25

NSW: 14, Interstate: 2, Overseas: 9

Ho: π1=0.5, π2=0.2, π3=0.3


| | NSW | Interstate | Overseas | Total |
|---|---|---|---|---|
| Observed O (from survey) | 14 | 2 | 9 | 25 |
| Expected E (from Ho) = n×πi = 25×πi | 25×0.5 = 12.5 | 25×0.2 = 5 | 25×0.3 = 7.5 | (25) |

Note: Expected counts are n×πi, NOT n1×π1, n2×π2, etc.

Check: sum of Ei = 12.5+5+7.5 = 25 = n

A: Check every Ei ≥ 5. Hence chi sq test is valid.

T: chi-sq statistic

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(14 - 12.5)^2}{12.5} + \frac{(2 - 5)^2}{5} + \frac{(9 - 7.5)^2}{7.5} = 2.28$$

df = no. of categories – 1 = 3-1 = 2 (NOT n-1)

P: From chi-sq table (df=2) (see next slide), 0.2 < p-val < 0.5. Since p-val > 0.05, do not reject Ho.

C: Conclusion: The proportions of students holidaying within NSW, interstate and overseas in 2011 could be the same as in 2003. There is no strong evidence to indicate otherwise.

Note: No manual calculation of C.I.; get from computer output, if needed.
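The Worksheet 1 calculation can be cross-checked with a few lines of Python. This is only a sketch: the function name `gof_chi_square` is made up for illustration, and the p-value shortcut `exp(-x/2)` is a closed form that holds for df = 2 only, so it is not a general replacement for the chi-sq table.

```python
import math

def gof_chi_square(observed, null_props):
    """Chi-square goodness-of-fit statistic, df and expected counts (hypothetical helper)."""
    n = sum(observed)
    expected = [n * p for p in null_props]      # E_i = n * pi_i
    if min(expected) < 5:
        raise ValueError("some expected count is below 5; test not valid")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1                       # no. of categories - 1, NOT n - 1
    return stat, df, expected

observed = [14, 2, 9]             # NSW, Interstate, Overseas (2011 sample)
null_props = [0.5, 0.2, 0.3]      # 2003 proportions under Ho

stat, df, expected = gof_chi_square(observed, null_props)

# Assumption: df = 2, where the upper-tail probability is exactly exp(-x/2).
p_value = math.exp(-stat / 2)

print(expected, round(stat, 2), df, round(p_value, 4))
```

The statistic comes out at about 2.28 with a p-value between 0.2 and 0.5, in line with the hand calculation.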

## Worksheet 2: chi sq test of Independence

Research question: Do males and females differ in their opinion on Government’s spending on health, education or transport?

Ho: There is no association between Sex and Opinion

First form the column and row totals.

| Sex | Health | Education | Transport | Total |
|---|---|---|---|---|
| Males | 6 | 7 | 5 | 18 |
| Females | 3 | 6 | 3 | 12 |
| Total | 9 | 13 | 8 | 30 |

χ² Test of Independence: expected counts are shown in brackets (row total × column total / grand total, e.g. 9×18/30 = 5.4 and 8×18/30 = 4.8).

| Sex | Health | Education | Transport | Total |
|---|---|---|---|---|
| Males | 6 (5.4) | 7 (7.8) | 5 (4.8) | 18 |
| Females | 3 (3.6) | 6 (5.2) | 3 (3.2) | 12 |
| Total | 9 | 13 | 8 | 30 |

Problem: Expected counts < 5 !!!

Solution: Pool (merge) cells together: combine 1st & 3rd columns together, and re-do again from the beginning.

Pooling cells together: Pool Health and Transport together.

| Sex | Health & Transport | Education | Total |
|---|---|---|---|
| Males | 6+5=11 | 7 | 18 |
| Females | 3+3=6 | 6 | 12 |
| Total | 17 | 13 | 30 |

Then check if sum of column totals and sum of row totals equals grand total before proceeding.
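The expected-count check and the pooling step can be sketched in Python as follows. The helper names `expected_counts` and `pool_columns` are hypothetical, chosen only for this illustration; the counts are the Sex by Opinion table above.

```python
def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def pool_columns(table, i, j):
    """Merge column j into column i and drop column j (hypothetical helper)."""
    pooled = []
    for row in table:
        new_row = [v for k, v in enumerate(row) if k != j]
        new_row[i if i < j else i - 1] += row[j]
        pooled.append(new_row)
    return pooled

# Rows: Males, Females. Columns: Health, Education, Transport.
observed = [[6, 7, 5],
            [3, 6, 3]]

expected = expected_counts(observed)
too_small = [e for row in expected for e in row if e < 5]
print("expected counts below 5:", [round(e, 1) for e in too_small])

# Pool Health (column 0) with Transport (column 2), then re-check from the beginning.
pooled = pool_columns(observed, 0, 2)       # columns: Health & Transport, Education
pooled_expected = expected_counts(pooled)
print(pooled, [[round(e, 1) for e in row] for row in pooled_expected])
```

After pooling, every expected count is at least 5, so the test can proceed on the 2 by 2 table.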

Back to square one!

| Sex | Health & Transport | Education | Total |
|---|---|---|---|
| Males | 11 (10.2) | 7 (7.8) | 18 |
| Females | 6 (6.8) | 6 (5.2) | 12 |
| Total | 17 | 13 | 30 |

Expected counts in brackets, e.g. 17×18/30 = 10.2 and 13×18/30 = 7.8.

(Recall: Ho: There is no association between Sex and Opinion.)

A: Now, all Ei ≥ 5 (Please check column totals and row totals of Ei again.)

T: chi-sq statistic

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(11 - 10.2)^2}{10.2} + \frac{(7 - 7.8)^2}{7.8} + \frac{(6 - 6.8)^2}{6.8} + \frac{(6 - 5.2)^2}{5.2} = 0.362$$

df = (r-1)×(c-1) = 1×1 = 1

P: From chi-sq table (df=1), p-val > 0.5. Since p-val > 0.05, retain (do NOT reject) Ho.

C: There could be no association between Sex and Opinion; there is no strong evidence to indicate otherwise.

Note: No formula for CI – it has no meaning.

## Practice Exercises

Note: In Q4-6, always check if Sum of Ei = n (apart from some rounding errors) before continuing the HATPCs.
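As an illustration of the "Sum of Ei = n" check, here is a minimal Python sketch applied to the pooled Worksheet 2 table. The p-value line uses `math.erfc`, a closed form that is valid for df = 1 only; that shortcut is an assumption of this sketch and does not replace the chi-sq table for other df.

```python
import math

observed = [[11, 7],     # Males:   Health & Transport, Education
            [6, 6]]      # Females: Health & Transport, Education

row_totals = [sum(r) for r in observed]
col_totals = [sum(c) for c in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Check before continuing: the expected counts must add up to n.
assert math.isclose(sum(map(sum, expected)), n)

stat = sum((o - e) ** 2 / e
           for o_row, e_row in zip(observed, expected)
           for o, e in zip(o_row, e_row))
df = (len(observed) - 1) * (len(observed[0]) - 1)    # (r-1)*(c-1)

# Assumption: df = 1, where P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(stat / 2))

print(round(stat, 3), df, round(p_value, 3))
```

The statistic is about 0.362 with p-val above 0.5, so Ho is retained, as in the worksheet.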

### Question 1

Refer to the table on the next slide.

Table columns: Course, Grade, SNG, UAI, GPA, School, Sex. The Course, Grade and SNG entries are:

| Course | Grade | SNG |
|---|---|---|
| Accounting And Finance | CR | 69 |
| Accounting And Finance | CR | 66 |
| Applied Finance | CR | 70 |
| Applied Finance | P | 63 |
| Business | F | 27 |
| Business | HD | 91 |
| Computer Science | P | 58 |
| Economics | CR | 71 |
| Environmental Management | HD | 91 |
| Financial Management | D | 79 |
| Geoscience | P | 57 |
| Information Systems & Technology | P | 58 |
| Mathematics | HD | 87 |
| Medical Chemistry | P | 58 |
| Philosophy | D | 77 |
| Psychology | CR | 69 |
| Psychology | D | 76 |
| Psychology | HD | 88 |

School: 1=Govt, 2=Catholic, 3=independent

Sex: 1=male, 2=female

Question 1 (continued)

(a) How many variables are there in the first column Course?

(b) How many variables are there in the last column Sex?

(c) Make up a research question that can be asked for the column Sex.

(d) What is the research question that can be asked for the last 2 columns School and Sex?

### Question 2

It is hypothesized that 4 types of peas should occur in the ratio 9:3:3:1 (Mendel’s theory).

(a) What type of test should be used to test Mendel’s theory? (Just quote the name of the test.)

(b) Write down the null hypothesis for this test.

### Question 3

The computer output below refers to a survey of Chinese males who were living in the Minhang District of China. They were classified by their level of education and their smoking status:

[Computer output: level of education by smoking status]

Question 3 (answer)

Research Question: Is there an association between level of education and smoking status of Chinese males from the Minhang District?

Write a conclusion only to the above research question. Do NOT perform the formal hypothesis test (HATPC).

### Question 4 (continued from Q.3)

Assume the sample of Chinese males was selected randomly from the Minhang District.

(a) Estimate the proportion of males who are educated to each of the three levels: primary school, middle school and college. (Note: These proportions are NEVER used to perform the hypothesis test.)

Suppose you are asked to determine whether equal proportions of males are educated to primary school level, middle school level and college level.

(b) Write down an appropriate null hypothesis to test this claim.

(c) Would you expect to reject this null hypothesis? Explain. (Do not perform the test)

(d) Write down the conclusion you would expect. (Do not perform the test)

### Question 5

The Stat170 survey was used to investigate whether students’ newspaper reading habits have changed over the past ten years. In 1998 studies conducted at Macquarie University revealed that 20% of students read a newspaper daily, 10% never read a newspaper and the rest divided equally between weekly and less than weekly. The Stat170 survey in 2008 revealed that, of 668 students who responded to this question, 118 read a newspaper daily, 211 weekly, 257 less than weekly and 82 never read a newspaper. Did the newspaper reading habits of students change between 1998 and 2008?

Source: Stat170 student database (2008)

Question 5 (answers)

| newspaper reading habits | daily | weekly | <weekly | never | total |
|---|---|---|---|---|---|
| observed | 118 | 211 | 257 | 82 | 668 |

(Ans: chi sq = 9.81)

### Question 6

Research Question: Are the proportions of students who read newspapers daily, weekly, less than weekly or never different for males and females? ie, is there an association between the newspaper reading habits and the sex of students?

The following results are obtained. Use a test of independence to answer the research question.

| newspaper reading habit | daily | weekly | <weekly | never | total |
|---|---|---|---|---|---|
| Male | 45 | 56 | 54 | 14 | 169 |
| Female | 26 | 69 | 109 | 16 | 220 |

Question 6 (answers)

Note: There are no π’s here. Do not use the π’s in Q.5 here under ANY circumstances, because there is no π whatsoever in this type of chi sq. Q.5 and Q.6 are unrelated. But even if they are related, the π’s in Q.5 should never be used in Q.6.

(Ans: chi sq = 18.76)
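The two quoted answers can be checked with a short Python sketch, which also makes the contrast concrete: Q.5 needs null proportions, while Q.6 uses only the row and column totals. The function names are hypothetical, and the Q.5 null values 0.35 and 0.35 are inferred from "the rest divided equally" (the remaining 70% split in two).

```python
def gof_statistic(observed, null_props):
    """Goodness-of-fit chi-square: expected counts are n * pi_i."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, null_props))

def independence_statistic(table):
    """Test-of-independence chi-square: expected counts come from the totals only."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    return sum((o - r * c / n) ** 2 / (r * c / n)
               for row, r in zip(table, row_totals)
               for o, c in zip(row, col_totals))

# Q5: one variable, null proportions taken from the 1998 study.
q5_observed = [118, 211, 257, 82]        # daily, weekly, <weekly, never
q5_null = [0.20, 0.35, 0.35, 0.10]       # inferred: the remaining 70% split equally
q5_stat = gof_statistic(q5_observed, q5_null)

# Q6: two variables, no null proportions anywhere.
q6_table = [[45, 56, 54, 14],            # Male
            [26, 69, 109, 16]]           # Female
q6_stat = independence_statistic(q6_table)

print(round(q5_stat, 2), round(q6_stat, 2))
```

Both values agree with the quoted answers to two decimal places.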

### Question 7

Research Question: Among students, is there an association between type of diet and how they feel about their weight?

Observed counts, with expected counts in brackets:

| preferred diet \ weightfeel | underweight | just right | overweight | total |
|---|---|---|---|---|
| meat | 35 (34.56) | 226 (227.26) | 76 (75.18) | 337 |
| vegetarian | 4 (4.51) | 33 (29.67) | 7 (9.82) | 44 |
| vegan | 1 (0.92) | 4 (6.07) | 4 (2.01) | 9 |
| total | 40 | 263 | 87 | 390 |

There is a problem with the test of association as seen from the given table. Re-construct the table and perform the test to answer the research question.

(Ans: chi sq = 0.16)

### Question 8

In order to check if a coin is fair, the coin is tossed 200 times. The results are 92 heads and 108 tails. Is there evidence to indicate that the coin is biased?

Question 8 (Answers)

(Partial Ans: chi sq = 1.28, 0.2 < p-val < 0.5. Alternatively, z = -1.131, p-val = 0.2585)

### Question 9

In order to check if a die is fair, the die is thrown 300 times. The results of 1, 2, 3, 4, 5 and 6 appearing are 42, 55, 38, 57, 64 and 44 respectively. Is there evidence to indicate that the die is biased?

Question 9 (Answers)

(Partial Ans: chi sq = 10.28, 0.05 < p-val < 0.1)

### Question 10

A survey was conducted on the zodiac (astrological) sign of 256 CEOs in the largest 400 companies. The results are given in the table.

Research Q.: Are some zodiac signs more common among CEOs than other zodiac signs? Perform an appropriate hypothesis test to answer the research question.

| Sign | # of CEOs |
|---|---|
| Aries | 23 |
| Taurus | 20 |
| Gemini | 18 |
| Cancer | 23 |
| Leo | 20 |
| Virgo | 19 |
| Libra | 18 |
| Scorpio | 21 |
| Sagittarius | 19 |
| Capricorn | 22 |
| Aquarius | 24 |
| Pisces | 29 |
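With twelve categories the statistic is tedious to compute by hand, so a short Python sketch may help. It assumes the null hypothesis is that all twelve signs are equally likely (each π = 1/12), which is how the research question is read here; that reading is an assumption of this sketch.

```python
# Observed counts from the Question 10 table.
ceo_counts = {
    "Aries": 23, "Taurus": 20, "Gemini": 18, "Cancer": 23,
    "Leo": 20, "Virgo": 19, "Libra": 18, "Scorpio": 21,
    "Sagittarius": 19, "Capricorn": 22, "Aquarius": 24, "Pisces": 29,
}

n = sum(ceo_counts.values())      # total number of CEOs
k = len(ceo_counts)               # number of categories
expected = n / k                  # assumed Ho: every sign has proportion 1/12

stat = sum((o - expected) ** 2 / expected for o in ceo_counts.values())
df = k - 1                        # no. of categories - 1

print(n, round(expected, 2), round(stat, 2), df)
```

Each expected count is about 21.33, well above 5, and the statistic rounds to 5.1 on 11 degrees of freedom.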

Question 10 (answers)

Given the chi sq statistic to be 5.1 (because the calculations are too long!)

Partial Ans: chi sq = 5.1, p-val > 0.5

## Computer (EcStat) Exercises

### Question 1

Re-do the problem in the beginning of this tutorial about students holidaying in NSW, interstate or overseas.

Given 2003 past history: 50% NSW, 20% interstate, 30% overseas

In the sample of 2011 (n=25), the numbers are: NSW = 14, Interstate = 2, and Overseas = 9

Research question: Have the proportions of students holidaying within NSW, interstate and overseas changed over the past 8 years?

Question 1 (continued)

Re-construct the original data file (of 2011) in this way:

• Row 1: Title (eg “Holidaying”)
• Row 2 – row 15, enter “1”
• Row 16 – row 17, enter “2”
• Row 18 – row 26, enter “3”

Type the labels (anywhere), and the null values 0.5, 0.2 and 0.3 beside them.
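The row layout above can be mimicked in Python to confirm that the re-constructed raw data reproduce the observed counts. This is only a sketch of the data layout, not of EcStat itself; the mapping 1 = NSW, 2 = Interstate, 3 = Overseas is inferred from the row counts (14, 2 and 9 rows).

```python
from collections import Counter

# One entry per student, coded as in the data file (rows 2-15, 16-17, 18-26).
raw = [1] * 14 + [2] * 2 + [3] * 9

labels = {1: "NSW", 2: "Interstate", 3: "Overseas"}    # inferred coding
null_values = {1: 0.5, 2: 0.2, 3: 0.3}                  # null proportions under Ho

counts = Counter(raw)
n = len(raw)

for code in sorted(counts):
    sample_prop = counts[code] / n
    print(labels[code], counts[code], round(sample_prop, 2), null_values[code])
```

Tallying the column gives back the summary counts, which is why entering either the raw data or the summary leads to the same test.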

Tell EcStat where to look for the labels and null values.

Optional but recommended: Prehighlight column A, then click Univariate icon. Tick CI (optional).

| Holidaying | Size | Proportion | StErr | 95% CI |
|---|---|---|---|---|
| NSW | 14 | 0.5600 | 0.0993 | (0.3654, 0.7546) |
| Interstate | 2 | 0.0800 | 0.0543 | (-0.0263, 0.1863) |
| Overseas | 9 | 0.3600 | 0.0960 | (0.1718, 0.5482) |

Goodness-of-fit test on population proportions:

| Holidaying | π0 | StErr0 | χ² dcmp. |
|---|---|---|---|
| NSW | 0.5 | 0.1000 | 0.1800 |
| Interstate | 0.2 | 0.0800 | 1.8000 |
| Overseas | 0.3 | 0.0917 | 0.3000 |

chisq(2): 2.280, pValue: 0.3198

“Proportion” refers to the 3 sample proportions p1, p2 and p3. But we do NOT use p1, p2 and p3 in the chi sq calculations.

Question 1 (continued)

Fill in the following answers:

(a) Ho: ___________________

(b) What is the value of test statistic? (Include symbol z/t/χ²) ____________

(c) What is the value of p-val? __________

(d) What is the value of df? _________

(e) Do you reject or not reject Ho? _________

(f) What are the expected counts? You do them. EcStat does not calculate them. E1 = ________, E2 = _________, E3 = _______

Alternatively, instead of the raw data file, we can input the summary (the observed counts or frequencies).

Type in the labels and null values of proportions (anywhere). CI optional. Outputs will be the same as from a raw data file.

### Question 2 (Pract 10 Exercises)

Load the file “Students.xls” (used in Pract/WASP 10)

Research question: Can the claim that the proportions of university students coming from Government schools, Catholic schools and independent schools are 0.5, 0.25 and 0.25 respectively be justified?

Perform the hypothesis test using EcStat. Then answer the questions that follow.

Question 2 (continued)

Fill in the following answers:

(a) Ho: ___________________

(b) What is the value of test statistic? (Include symbol z/t/χ²) ____________

(c) What is the value of p-val? __________

(d) What is the value of df? _________

(e) Do you reject or not reject Ho? _________

(f) What are the expected counts? You do them. EcStat does not calculate them. E1 = ________, E2 = _________, E3 = _______

### Question 3 (Pract 10 Exercises)

Continue with the file “Students.xls”.

Research question: Is there an association between School and Sex?

Question 3 (continued)

Steps:

1. Optional but recommended: Pre-highlight the columns School and Sex. Make sure you highlight them separately: highlight School, then press Ctrl and highlight Sex. It does not matter which one comes first.
2. Click Association (5th) EcStat icon.
3. Fill in the boxes as shown on the next slide.
4. Type the labels of the 2 variables as shown.
5. Tick bar chart (clustered bar chart).

Question 3 (continued)

Fill in the following answers:

(a) Ho: _______________________________

(b) What is the value of test statistic? (Include symbol z/t/χ²) ____________

(c) What is the value of p-val? __________

(d) What is the value of df? _________

(e) Do you reject or not reject Ho? _________