Tutorial 13
12.1

Calculation of sample and population odds
Calculation of sample and population odds ratio
Test of association using odds ratio
Comparison with chi sq test of association

Practice exercises:

Question 1
In a random sample of 50 students who passed STAT170, 20 are males and 30 are females.

(a) Suggest a target population.

(b) Research question: Are the proportions of males and females who passed STAT170 the same?
Suppose we employ the z-test for proportion.
  (i) Write down the null hypothesis.
      Ho: population proportion of males = 0.5 (or population proportion of females = 0.5)
  (ii) Write down the sample proportions for males and females.
      p_M = 20/50 = 2/5 = 0.4, p_F = 30/50 = 3/5 = 0.6

(c) Research question: Are the proportions of males and females who passed STAT170 the same? [Same as in (b)]
Suppose we employ the chi sq test of proportions.
  (i) Write down the null hypothesis.
      Ho: population proportion of males = population proportion of females = 0.5
  (ii) Write down the observed and expected counts.
      O_M = 20, O_F = 30
      E_M = 50*0.5 = 25, E_F = 50*0.5 = 25

(d) (i) Of those who passed STAT170, what is the sample odds of males (against females)?
      sample odds = 20/30 = 2/3
  (ii) Of those who passed STAT170, what is the estimated population odds of males (against females)?
      estimated population odds = sample odds = 2/3
  (iii) Can you find the exact population odds of males (against females)? Explain why or why not.
      No, since there is no information on the whole population.

Question 2
(a) Probability to odds
Calculate the odds of the following events if they occur with the given probabilities. Do NOT ask for formulas!
  (i) 1/6, (ii) 1/2, (iii) 2/5, (iv) 4/5
  Answers: 1/5, 1 (or 1/1), 2/3, 4 (or 4/1)

(b) Odds to probability
Calculate the probability of occurrence of the following events if the odds are:
  (i) 1/3 (ie 1 to 3), (ii) 4/9
  Answers: 1/4, 4/13

Question 3
In Australia, the prevalence of asthma is 20% among 5-9 year old males, while it is 12% for 5-9 year old females.
(a) What are the odds of a 5-9 year old male suffering from asthma, and of a 5-9 year old female suffering from asthma?
(b) Compute the population odds ratio comparing the odds of 5-9 year old males suffering from asthma with the odds of 5-9 year old females suffering from asthma.
(c) Compute the population odds ratio comparing the odds of 5-9 year old females suffering from asthma with the odds of 5-9 year old males suffering from asthma.
(d) Comment on the odds ratio in (b).

Question 3 (answers)

            Asthma   No asthma
  Male      __ %     __ %
  Female    __ %     __ %

(a) Ans: 0.25, 0.13636
(b) Ans: 0.25/0.13636 = 1.8333 (This is a population odds ratio, so it should be written with the population symbol, not w.)
(c)
(d)

Question 4
In Australia, 17.8% of 6-14 year old males have ADHD (Attention Deficit Hyperactivity Disorder) and 7.9% of 6-14 year old females have ADHD.

(a) What are the odds of a 6-14 year old male having ADHD, and of a 6-14 year old female having ADHD?
(b) Compute the population odds ratio comparing the odds of 6-14 year old males having ADHD with the odds of 6-14 year old females having ADHD.
(c) Compute the population odds ratio comparing the odds of 6-14 year old females having ADHD with the odds of 6-14 year old males having ADHD.
(d) Comment on your odds ratio in (b).
(e) Comment on the association between ADHD and gender.

Question 4 (answers)

            ADHD     No ADHD
  Male      __ %     __ %
  Female    __ %     __ %

(a) Ans: 0.217, 0.086
(b) Ans: 2.52 (This is a population odds ratio, so it should be written with the population symbol, not w.)
(c)
(d)
(e)

Question 5
A questionnaire was handed out to students. One question was "Do you believe in love at first sight?" There were 200 males, of whom 102 responded yes. Of the 281 females, 121 responded yes.

(a) Put these data into a table.
(b) Compute the sample odds ratio for a male student believing in love at first sight compared to a female student believing in love at first sight.
(c) Interpret the odds ratio in (b).

Question 5 (answers)
(b) Ans: w = 1.38 (This time it is a sample odds ratio, so it should be w, not the population symbol.)
(c) The odds of a male student believing in love at first sight are 1.38 times the odds of a female student believing in love at first sight.

            Love (Yes)   Love (No)
  Male         102           98
  Female       121          160

  w = (102/98) / (121/160) = 1.38

(d) Research question: Is there an association between Gender and Love at First Sight?
Perform a chi square test of association to answer the above research question.
  Answer: chi sq = 2.96, df = 1, 0.05 < p-val < 0.1, so do NOT reject Ho. There could be no association.

(e) Suppose we use a statistical package, eg EcStat, to obtain a 95% CI for the odds ratio for a male student believing in love at first sight compared to a female student believing in love at first sight. Using your conclusion in (d), do you think the CI will or will not include 1? Explain. (Do NOT find the CI from EcStat.)
  Answer: The chi sq test in (d) shows that there could be no association between gender and love at first sight. The odds ratio must lead to the same conclusion of probably NO association. Hence the CI for the population odds ratio should include 1.

Question 6
A study was done on births in New South Wales in 2002. The country of birth of the mothers and the mothers' ages were recorded as binary categorical variables. The results are tabulated below:

                              Mother's Age
  Mother's Birth Country     <30    30 or over
  Australia                  631       605
  Asia                       147       167

(a) What is the odds ratio of an Australian born mother being less than 30 years of age compared to an Asian born mother being less than 30?
(b) Interpret the odds ratio in (a).
(c) Research question: Is there an association between mother's birth country and mother's age?
Perform a chi square test of association to answer the above research question.
  Answer: chi sq = 1.80 (df = 1)

(d) Suppose we use a statistical package, eg EcStat, to obtain a 95% CI for the odds ratio of an Australian born mother being less than 30 years of age compared to an Asian born mother being less than 30. Using your conclusion in (c), do you think the CI will or will not include 1? Explain. (Do NOT find the CI from EcStat.)

Question 7
Continued from the previous question. The baby's weight is also recorded as a binary variable. The mother's age and the baby's weight are summarized in the table below:

                       Baby's Wt
  Mother's Age     < 2.5 kg    2.5 kg or over
  <30                 52            901
  30 or over          50            997

(a) What is the odds ratio of a mother who is under 30 years of age having a low birth weight baby (< 2.5 kg) compared to an older mother?
(b) Interpret the odds ratio in (a).
(c) Research question: Is there an association between mother's age and baby's weight?
Perform a chi square test of association to answer the above research question.
  Answer: chi sq = 0.478 (df = 1)
(d) Suppose we use a statistical package, eg EcStat, to obtain a 95% CI for the odds ratio of a mother who is under 30 years of age having a low birth weight baby (< 2.5 kg) compared to an older mother. Using your conclusion in (c), do you think the CI will or will not include 1? Explain. (Do NOT find the CI from EcStat.)

Question 8
The following shows the distribution of gender in university education.

  Education          Male    Female
  University          19       11
  Non-university     293      110

The 95% CI for the population odds ratio of a male entering university against a female is given to be (0.788, 3.724). Determine if there is an association between gender and entering university.
  Ans: Since the CI (0.788, 3.724) for the population odds ratio includes 1, there could be no association between gender and entering university.
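The hand calculations in Questions 2 to 7 can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration and are not EcStat commands, and the chi sq statistic is assumed to be the usual Pearson statistic for a 2x2 table without continuity correction.

```python
from fractions import Fraction


def odds_from_prob(p):
    """Odds in favour of an event that has probability p."""
    return p / (1 - p)


def prob_from_odds(odds):
    """Probability of an event whose odds in favour are `odds`."""
    return odds / (1 + odds)


def odds_ratio(a, b, c, d):
    """Odds ratio (a/b) / (c/d) for the 2x2 table with rows (a, b) and (c, d)."""
    return (a / b) / (c / d)


def chi_sq_2x2(a, b, c, d):
    """Pearson chi sq statistic (df = 1, no continuity correction) for a 2x2 table."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Question 2: exact fractions keep the answers in the same form as the slides.
print(odds_from_prob(Fraction(2, 5)))   # 2/3
print(prob_from_odds(Fraction(4, 9)))   # 4/13

# Question 3(b): ratio of the two population odds.
print(round(odds_from_prob(0.20) / odds_from_prob(0.12), 4))  # 1.8333

# Question 5: sample odds ratio and chi sq statistic.
print(round(odds_ratio(102, 98, 121, 160), 2))  # 1.38
print(round(chi_sq_2x2(102, 98, 121, 160), 2))  # 2.96

# Questions 6(c) and 7(c): chi sq statistics quoted in the answers.
print(round(chi_sq_2x2(631, 605, 147, 167), 2))  # 1.8
print(round(chi_sq_2x2(52, 901, 50, 997), 3))    # 0.478
```

The same `odds_ratio` call with the counts from Questions 6 and 7 gives the sample odds ratios asked for in part (a) of each question.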
Question 9
In the NSW Local Court in 2007, 14.5% of charges of assault occasioning actual bodily harm resulted in prison sentences, while 7.2% of charges of common assault resulted in prison sentences.

(a) Complete the following table.

                                      Sent to Prison   Set Free   Odds of going to prison
  Assault with body injury                14.5%
  Common assaults (no body injury)         7.2%                          12.9

(b) Find the odds ratio of the odds of going to prison for assault with body injury compared to the odds of going to prison for common assault. (Ans: 2.18)
(c) Is this the population odds ratio or a sample odds ratio (w)?

Question 10
The following 4 tables were shown in Lect 11, where for a chi sq test involving 2 variables we only use counts, not %. From this sample of exact counts, we form 3 other tables involving %.

  Table 1 (counts)
            Smoker   Non-smoker
  Male        14        36
  Female      16        84

  Table 2 (% of total)
            Smoker   Non-smoker
  Male       9.33%     24%
  Female    10.67%     56%

  Table 3 (% of column)
            Smoker   Non-smoker
  Male      46.67%     30%
  Female    53.33%     70%

  Table 4 (% of row)
            Smoker   Non-smoker
  Male       28%       72%
  Female     16%       84%

(a) Check that all % in the 3 tables are correct.
(b) Find the (sample) odds ratio for a male being a smoker compared to a female being a smoker, in ALL four (4) tables.
(c) Do you find the same answer for all 4 cases?
  Yes. All give the same answer: w = 2.04
    Table 1: w = (14/36) / (16/84) = 2.04
    Table 2: w = (9.33/24) / (10.67/56) = 2.04
    Table 3: w = (46.67/30) / (53.33/70) = 2.04
    Table 4: w = (28/72) / (16/84) = 2.04
(d) Is this expected? If yes, WHY?
  Yes. Whether we use actual counts or %, the ratios must be the same.

Question 11
While in the previous question all 4 tables can be considered equivalent, the following 3 tables are NOT. Find the OR of a male being a smoker to that of a female being a smoker. (Which table should you use? This seems obvious, but MANY students made mistakes in past exams.)
  Observed counts
                   Smoke
  Sex       smoker   non-smoker   total
  male         5         11         16
  female       6          8         14
  total       11         19         30

  Expected counts
                   Smoke
  Sex       smoker   non-smoker   total
  male        5.9       10.1       16.0
  female      5.1        8.9       14.0
  total      11.0       19.0       30.0

  Chi-squared decomposition
                   Smoke
  Sex       smoker   non-smoker   total
  male       0.128      0.074      0.202
  female     0.146      0.085      0.231
  total      0.274      0.159      0.433

  Test:     df   chi-sq   p-val
  Indep      1    0.43     0.51
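The points made in Questions 10 and 11 can be illustrated with a short Python sketch. The helper names are hypothetical, and the sketch assumes that the table intended for Question 11 is the observed counts. It shows that rescaling counts to percentages leaves the odds ratio unchanged, whereas expected counts are built under independence and so always give an odds ratio of 1.

```python
def odds_ratio(table):
    """Odds ratio (a/b) / (c/d) for a 2x2 table given as [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a / b) / (c / d)


def expected_counts(table):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Question 10: the count table and the three percentage tables derived from it.
counts = [[14, 36], [16, 84]]
n = sum(sum(row) for row in counts)
col_totals = [sum(col) for col in zip(*counts)]
tables = {
    "counts": counts,
    "% of total": [[100 * x / n for x in row] for row in counts],
    "% of column": [[100 * x / t for x, t in zip(row, col_totals)] for row in counts],
    "% of row": [[100 * x / sum(row) for x in row] for row in counts],
}
for name, table in tables.items():
    print(name, round(odds_ratio(table), 2))  # 2.04 for every table

# Question 11: only the observed counts carry information about the association.
observed = [[5, 11], [6, 8]]
expected = expected_counts(observed)
print(round(odds_ratio(observed), 3))  # 0.606
print(round(odds_ratio(expected), 3))  # 1.0
```

The cells of the chi-squared decomposition are contributions to the test statistic rather than counts, so a ratio formed from them has no interpretation as an odds ratio.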