
FF2613

MEDICINE & SOCIETY II

PRACTICAL SESSIONS:
EPIDEMIOLOGY & STATISTICS

FOR YEAR 2 STUDENTS ONLY

DEPARTMENT OF COMMUNITY HEALTH
FACULTY OF MEDICINE
UNIVERSITI KEBANGSAAN MALAYSIA
KUALA LUMPUR

Scenario
An outbreak of gastroenteritis occurred in Bandar Tun Razak, a suburban
neighborhood, on the evening of April 28. A total of 89 people went to the
emergency departments of the three local hospitals during that evening. No more
cases were reported afterward.
The patients complained of headache, fever, nausea, vomiting and diarrhea. The
disease was severe enough in 19 patients to require hospitalization for
rehydration.
The local health department was immediately notified of a potential food-borne
outbreak of gastroenteritis in Bandar Tun Razak.
Exercise 1
1. Define epidemic, endemic and pandemic.
2. Describe the gastroenteritis outbreak according to disease transmission and
epidemiological triad.
3. What are the possible causes of the outbreak?
4. List and discuss the steps that should be taken in an outbreak investigation.
5. What further information is needed?

Exercise 2
The epidemic team, including a medical epidemiologist (public health physician –
Health Officer), health inspectors and a nurse, visited the local hospitals to
interview the attending physicians, the patients and some of their relatives. Some
stool samples were obtained from patients for microbiologic identification of the
causative agent.
The distribution of the disease by person (age and gender) was found as follows:

Gastroenteritis Outbreak Findings by Person, Case Distribution by Age and Gender

Age group          Female            Male              Total by age
                   No    %Females    No    %Male       No    %
0 - 5 yr            1                 1
6 - 10 yr          38                37
11 yr and older    10                 2
Total by gender

Please calculate the totals for each column and row and their corresponding
percentages to try to determine if there are any important differences by age or
by gender. Interpret your findings.

Discuss the epidemic curve above

Exercise 3
Therefore the epidemic team investigated the places where affected persons,
their relatives and neighbors ate that day (April 28). The following table shows
the team's findings:

Gastroenteritis Outbreak Findings by Place

Place                        People who   Ill      Attack   People who did   Ill      Attack   Relative
                             attended     people   rate     not attend       people   rate     risk
Cafeteria LRT                207          61                157              47
Kedai Makan Ali              246          25                122              13
Restaurant ABC               475          68                189              29
Elementary school cafeteria  239          67                495              22

Please calculate the attack rates per 100 (incidence rates per 100) by place to try
to determine where the contaminated meal was served. For each place compare
attack rates (AR) for those who attended with attack rates for those who did not,
by using the relative risk (i.e., RR = AR in attendees/AR in non attendees).
Interpret your findings.
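
The calculation described above can be sketched in a few lines of Python. This is only an illustration, assuming the Cafeteria LRT counts as read from the table (207 attended with 61 ill; 157 did not attend with 47 ill); the function names are made up for this sketch.

```python
def attack_rate(ill, people):
    """Attack rate per 100 people."""
    return 100.0 * ill / people


def relative_risk(ill_att, n_att, ill_not, n_not):
    """RR = AR in attendees / AR in non-attendees."""
    return attack_rate(ill_att, n_att) / attack_rate(ill_not, n_not)


# Cafeteria LRT row, as read from the table (assumed transcription)
ar_attended = attack_rate(61, 207)
ar_not_attended = attack_rate(47, 157)
rr = relative_risk(61, 207, 47, 157)

print(f"AR attended     = {ar_attended:.1f} per 100")
print(f"AR not attended = {ar_not_attended:.1f} per 100")
print(f"Relative risk   = {rr:.2f}")
```

A relative risk close to 1 suggests that attending the place made little difference; a relative risk well above 1 points to the place where the contaminated meal was probably served.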

Exercise 4
Once the implicated place was determined, the investigation centered on the food. The following table includes the food items served in that place on April 28:

Gastroenteritis Outbreak Findings by Person

Food Item: Beef rendang, Burger, Salad, Baked potato, Fruit cocktail, Ice cream
Ate the food item: No. people, Ill people, Attack rate
Did not eat the food item: No. people, Ill people, Attack rate
Relative risk

Salad 276 218 105 28 21 49 266 131 297 27 14 15
Baked potato 139 11 213 31 88 48 279 25 175 18 203 49

Please calculate the attack rates per 100 (incidence rates per 100) by food item to try to determine the one that was probably contaminated. Compare attack rates (AR) for those who ate the food item with attack rates for those who did not eat the food item, by using the relative risk (i.e., RR = AR in those who ate the food/AR in those who did not eat the food). Interpret your findings.

The names of the kitchen personnel and their participation in the food preparation are as follows: Ms Mary prepared the beef rendang and the potatoes. Johan prepared the salad and the fruit. ... and Jamilah prepared the burgers and served the ice cream. Salmah served all dishes except the ice cream. The ice cream was a commercial brand and was bought at a nearby supermarket.

Important note: None of the kitchen personnel were ill.

Exercise 5
Given that the epidemic team worked fast enough and the implicated meal(s) was (were) identified before all food leftovers were discarded, food samples from some meal leftovers were taken to the laboratory. In addition, stool samples were taken from the kitchen personnel who prepared or handled each different food item.
The laboratory confirmed that Salmonella toxin was present in some of the food samples and that one of the kitchen personnel of that place had the same Salmonella species. Furthermore, the Salmonella species found in the food and the kitchen worker was the same species found in stool samples of the patients.
Please discuss these findings and identify the kitchen worker possibly responsible for the outbreak.
Discuss the general principle of prevention and control of gastroenteritis outbreak.

Screening Test

Screening: Definition
The identification, amongst apparently healthy individuals, of those who are sufficiently at risk of a specific disorder

Screening vs Diagnosis
• In screening, there is no intention to make a definitive diagnosis or offer therapeutic intervention solely based on a positive result

Screening program: Requirements (I)
• Natural history of disease must be understood
• Have an agreed policy on whom to treat
• Prevalence of undiagnosed disease high
• Disease has high morbidity and mortality
• Of public health concern
• Early treatment easier and more effective

Screening program: Requirements (II)
• Signs present to indicate disease presence
• Screening test acceptable and harmless
• Screening test must be valid
• Yield of screening must be high
• Diagnostic work-up for a positive test must have acceptable morbidity
• Screening exercise must be cost-effective

The 'ideal' screening test
• Would always give the right answer
• Quick, safe and simple
• Painless, reliable and inexpensive

Structure of a study involving a screening test
• Resembles an observational study
• Same concepts applied for 'diagnostic test'
• Designed to determine how well a test can discriminate between diseased and non-diseased
• A predictor variable (the test result)
• An outcome variable (presence or absence of disease)

Structure of a study involving a screening test
• The test result
  – Dichotomous: +ve or -ve
  – Categorical: +, ++, +++, ++++
  – Continuous: mg/dl, ng/L, etc.
• The disease as outcome variable
  – Presence or absence determined by a gold standard

Measures of accuracy for screening tests:
• Validity
  – Sensitivity and specificity
• Predictive values (PV)
  – Positive PV and Negative PV

Evaluation of a screening test
                      TRUTH
TEST RESULT   Disease             No disease
Positive      A True-positive     B False-positive
Negative      C False-negative    D True-negative

Sensitivity
• Sensitivity is the proportion of those with the disease who tested positive
• Indicates how good a test is at identifying the diseased
Sensitivity = A / (A + C) x 100

Specificity
• Specificity is the proportion of those without the disease who tested negative
• Indicates how good a test is at identifying the non-diseased
Specificity = D / (B + D) x 100

Sensitivity and specificity
• Describe the performance of a test
• A test with a high sensitivity is useful to RULE OUT the disease
• A test with a high specificity is useful to CONFIRM the presence of disease

Predictive values (PV)
• Assess the usefulness of a test
• A test of efficient use of time and resources
• PV estimate the probability of disease
• PV describe the frequency of correct identification
• Positive PV and Negative PV

Predictive values
                      TRUTH
TEST RESULT   Disease             No disease
Positive      A True-positive     B False-positive
Negative      C False-negative    D True-negative

PV+ = A / (A + B) x 100
PV- = D / (C + D) x 100

Predictive values
• PV of a positive test is the proportion of individuals who test +ve and have the disease
• The positive PV estimates the likelihood that a person who tests positive has the disease
• PV of a negative test is the proportion of individuals who test -ve and don't have the disease
• The negative PV indicates the likelihood that a person who tests negative is actually disease free
• Greatest value in deciding whether to implement a screening program
• Not useful if positive PV is low

Mammography (Shapiro et al., 1988)
                      Disease status
TEST RESULT   Cancer    No cancer    Total
Positive      132       985          1117
Negative      47        62295        62342
Total         179       63280        63459

Sensitivity, specificity and PVs
Prevalence:  179/63459 x 100   = 0.3%
Sensitivity: 132/179 x 100     = 73.7%
Specificity: 62295/63280 x 100 = 98.4%
+ve PV:      132/1117 x 100    = 11.8%  (False +ve 88.2%)
-ve PV:      62295/62342 x 100 = 99.9%  (False -ve 0.1%)

Comments
• Mammography had an excellent specificity (98%)
• False +ve tests outnumber the true +ve tests by over 7:1 (PV+ = 12%)
• ~7 in every 8 patients who had positive mammograms had normal biopsies
• Predictive value for a positive test is low (12%)
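
As a cross-check on the mammography figures, the four measures can be computed directly from the 2 x 2 counts. This is a minimal sketch; the function name `screening_metrics` is an assumption made for illustration, and the cell labels follow the A/B/C/D layout used in the slides.

```python
def screening_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV as percentages.

    tp = A (true positive), fp = B (false positive),
    fn = C (false negative), tn = D (true negative).
    """
    return {
        "sensitivity": 100.0 * tp / (tp + fn),  # A / (A + C)
        "specificity": 100.0 * tn / (fp + tn),  # D / (B + D)
        "ppv": 100.0 * tp / (tp + fp),          # A / (A + B)
        "npv": 100.0 * tn / (fn + tn),          # D / (C + D)
    }


# Mammography counts from the table above
m = screening_metrics(tp=132, fp=985, fn=47, tn=62295)
for name, value in m.items():
    print(f"{name}: {value:.1f}%")
```

The same function can be applied to each of the HIV test kit tables in the data information sheets.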

Predictive Value Of A Test Is Affected By Prevalence Of Disease

SUMMARY
• A screening test study determines the usefulness of a test in identifying those at risk of a disease
• Students must be able to calculate and interpret sensitivity, specificity & predictive values.

Year-2, Semester-1

Trigger: You are the State Medical Officer for AIDS/HIV of Negeri Sembilan and you are expected to conduct a sentinel surveillance for HIV amongst:
o Pregnant mothers (Antenatal Screening)
o STD clinic patients

Data Information Sheet-1 – Choosing The Appropriate Screening Test

To select the appropriate screening test, you did a literature review and collated the following tables. Calculate the sensitivity, specificity, PPV and NPV of each test to help you decide.

                  Positive    Negative    Total
Disease Present   TP          FN          TP + FN
Disease Absent    FP          TN          FP + TN
Total             TP + FP     FN + TN     N

TP = True Positive, FP = False Positive, FN = False Negative, TN = True Negative

Sensitivity = TP/(TP+FN) x 100%
Specificity = TN/(TN+FP) x 100%
PPV = TP/(TP+FP) x 100%
NPV = TN/(TN+FN) x 100%

HIV Enzyme Immuno Assay (EIA)
                 Gold Standard
EIA (blood)      +        -        total
+                1000     9        1009
-                0        8991     8991
total            1000     9000     10,000

HIV Particle Agglutination Test
                 Gold Standard
PA               +        -        total
+                999      270      1269
-                1        8730     8731
total            1000     9000     10,000

HIV Rapid Test Kit
                 Gold Standard
Rapid Test       +        -        total
+                998      180      1178
-                2        8820     8822
total            1000     9000     10000

Oral Rapid Test Kit
                 Gold Standard
Oral Test Kit    +        -        total
+                930      180      1110
-                70       8820     8890
total            1000     9000     10000

Test     Sensitivity    Specificity    PPV    NPV
EIA
PA
Rapid
Oral

Which is the best screening test?

Data Information Sheet-2 – Effect of Prevalence on Sensitivity & Specificity

Based on the earlier analysis, HIV EIA, a test with sensitivity of 100.0% and specificity of 99.9%, was selected to be used for the sentinel surveillance in Negeri Sembilan. You decided to include the inmates of Pusat Serenti Tampin and Pusat Serenti Jelebu in the sentinel surveillance. Each study population consisted of 10,000 people. Calculate the PPV, NPV and prevalence rate of HIV for each study population.

Antenatal mothers
            Disease Present    Disease Absent    Total
Positive    3                  10                13
Negative    0                  9987              9987
Total       3                  9997              10000
PPV =                NPV =

Blood donors
            Disease Present    Disease Absent    Total
Positive    9                  10                19
Negative    0                  9981              9981
Total       9                  9991              10000
PPV =                NPV =

IVDU
            Disease Present    Disease Absent    Total
Positive    2000               8                 2008
Negative    0                  7992              7992
Total       2000               8000              10000
PPV =                NPV =

Population           Population with HIV    Population without HIV    TOTAL     Prevalence rate
Antenatal mothers    3                      9997                      10,000
Blood donors         9                      9991                      10,000
IVDUs                2000                   8000                      10,000

Since the sensitivity and specificity are the same for all three study populations, please discuss how PPV and NPV are affected by the prevalence of the disease in each study population.

PPV and NPV can also be calculated using the following formulas.

PPV = (Prevalence x Sensitivity) / [(Prev x Sen) + (1 - Prev) x (1 - Sp)]
NPV = [(1 - Prevalence) x Specificity] / [(1 - Prev) x Sp + Prev x (1 - Sen)]
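
The two prevalence-based formulas above translate directly into code. The sketch below is illustrative only: the function names are assumptions, prevalence, sensitivity and specificity are passed as proportions (not percentages), and the three prevalences are the ones implied by the study population tables (3, 9 and 2000 cases per 10,000).

```python
def ppv(prev, sens, spec):
    """PPV = Prev x Sen / (Prev x Sen + (1 - Prev) x (1 - Sp))."""
    return prev * sens / (prev * sens + (1 - prev) * (1 - spec))


def npv(prev, sens, spec):
    """NPV = (1 - Prev) x Sp / ((1 - Prev) x Sp + Prev x (1 - Sen))."""
    return (1 - prev) * spec / ((1 - prev) * spec + prev * (1 - sens))


SENS, SPEC = 1.0, 0.999  # HIV EIA: sensitivity 100.0%, specificity 99.9%

populations = {"Antenatal mothers": 0.0003, "Blood donors": 0.0009, "IVDU": 0.20}
results = {
    name: (ppv(p, SENS, SPEC), npv(p, SENS, SPEC))
    for name, p in populations.items()
}

for name, (pos, neg) in results.items():
    print(f"{name}: PPV = {100 * pos:.1f}%, NPV = {100 * neg:.1f}%")
```

With the test held fixed, only the prevalence changes between the three populations, which is the point the discussion question is driving at.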

Data Information Sheet-3 – Effect of Prevalence on Sensitivity & Specificity

Hypothetical Illustration of Screening Programme with Test Kit
Sensitivity 100.00%, Specificity 99.90%, Population 10,000

Test Kit    +       -
+           a       b       a+b
-           c       d       c+d
            a+c     b+d     a+b+c+d

Prevalence   TP (a)   FP (b)   FN (c)   TN (d)   a+c     b+d     PPV a/(a+b)   NPV d/(c+d)
0.01%        1        10       0        9,989    1       9,999   9.09%         100.00%
0.02%        2        10       0        9,988    2       9,998   16.67%        100.00%
0.03%        3        10       0        9,987    3       9,997   23.08%        100.00%   (Antenatal)
0.05%        5        10       0        9,985    5       9,995   33.34%        100.00%
0.09%        9        10       0        9,981    9       9,991   47.39%        100.00%   (Blood Donors)
1.00%        100      10       0        9,890    100     9,900   90.99%        100.00%
5.00%        500      10       0        9,491    500     9,500   98.14%        100.00%
10.00%       1000     9        0        8991     1000    9,000   99.11%        100.00%
20.00%       2000     8        0        7992     2000    8,000   99.60%        100.00%   (Pusat Serenti)
30.00%       3000     7        0        6993     3000    7,000   99.77%        100.00%

PPV based on Prevalence, Sensitivity & Specificity

sensitivity %    99%      95%      90%      80%
specificity %    99%      95%      90%      80%
prevalence
0.1%             9.0%     1.9%     0.9%     0.4%
1.0%             50.0%    16.1%    8.3%     3.9%
5.0%             83.9%    50.0%    32.1%    17.4%
10.0%            91.7%    67.9%    50.0%    30.8%
20.0%            96.1%    82.6%    69.2%    50.0%


PRACTICALS GUIDE
Medicine & Society Module (FF2613)

INTRODUCTION
In this module there will be 4 practical sessions for the research project and statistical exercises. Students will be guided by the respective lecturer/tutor assigned to each lab. Each lab is required to prepare a notebook for the session. The schedule for the practical sessions for this semester is as stated below.

DATE: 14/07/10
TIME: 10.30 – 12.30
TOPIC: Descriptive Statistics & Research Project 1
CONTENT: Manipulation and presentation of data using the given dataset, including calculating the measures of central tendencies and variability using statistical formulas. Determine the title, objective, problem framework, hypothesis and methodology. Once the above has been agreed upon, as homework, they are expected to write up the proposal, including the questionnaire, which will be discussed during the second practical session.

DATE: 21/07/10
TIME: 10.30 – 12.30
TOPIC: Analysis of Quantitative Data & Research Project 2
CONTENT: Calculation and interpretation of t-tests and proportionate tests using the given dataset. Presentation of the complete research proposal. Upon acceptance, as homework, the students are expected to distribute the questionnaires and collect the data for the study. All completed forms are to be brought to the third practical session.

DATE: 20/08/10
TIME: 2.30 – 4.30
TOPIC: Correlation & Research Project 3
CONTENT: Calculation and interpretation of correlation and regression using the given dataset. Students are guided on how to enter the data into the computer using Excel or SPSS. For homework, students will complete the data entry for all collected data and bring the complete file to the fourth practical session.

DATE: 27/08/10
TIME: 10.30 – 12.30
TOPIC: Chi-Square, Non-Parametric and Research Project 4
CONTENT: Calculation and interpretation of non-parametric and chi-square tests using the given dataset. Each lecturer will demonstrate how to analyse the data using computer and advise on the interpretation of results. For homework, the students will complete the analysis and prepare a PowerPoint presentation for the final practical session.

DATE: 20/09/10, 24/09/10
TIME: 2.00 – 4.00, 10.00 – 12.00
TOPIC: Research Project 5
CONTENT: Presentation of their findings. For homework, the students will prepare a written report of the study, to be submitted in two weeks' time from their presentation.

Practical 1
Descriptive Statistics

Introduction
In the old curriculum, the practical sessions were slotted immediately after the respective lectures. In the past we had 25 hours of lectures and 8 practical sessions just for statistics and research methodology. Now we only have 7 hours of lecture and 4 practical sessions for statistics and research methodology in the new curriculum. Whenever possible, we try to slot the practical sessions according to lectures. But we can't cover everything, therefore students are also expected to learn on their own.

For this session, we will learn about measures of central tendency and variability. We use these measures of central tendency and variability to describe the data that we collected. The measures of central tendency are mean, mode and median. For variability, it is standard deviation (sd). Please be patient and persist in doing the exercises. Kindly refer to your formula sheet or your books for help.

Measures of Central Tendency for Quantitative Data
1. Write down the formulas for mean in the boxes below.
   Basic Formula                    Formula for grouped data (Formula A)

2. Calculate the mean, mode and median for the age xi of the following respondents.
   35 24 36 21 21 20 34
   29 37 30 26 27 29 34
   33 33 27 25 21 26 32
   30 33 36 28 33 19 29
   27 29 22 23 31 32 31
   Total = ___________   n = ________   Mean = __________
   Median = __________   Mode = __________

3. Write down the formulas for standard deviation in the boxes below.
   Basic Formula                    Formula for grouped data (Formula A)
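
Once the exercise has been worked by hand, the answers can be checked with Python's standard `statistics` module. This is a sketch for self-checking only, assuming the 35 ages are transcribed as listed above; `multimode` is used because a dataset can have more than one mode.

```python
import statistics

# Ages of the 35 respondents, as listed in Q2
ages = [
    35, 24, 36, 21, 21, 20, 34,
    29, 37, 30, 26, 27, 29, 34,
    33, 33, 27, 25, 21, 26, 32,
    30, 33, 36, 28, 33, 19, 29,
    27, 29, 22, 23, 31, 32, 31,
]

total = sum(ages)
n = len(ages)
mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
modes = statistics.multimode(ages)   # every value tied for highest frequency
sd_age = statistics.stdev(ages)      # sample standard deviation (n - 1)
var_age = statistics.variance(ages)  # sample variance

print(total, n, round(mean_age, 2), median_age, modes)
print(round(sd_age, 2), round(var_age, 2))
```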

4. Using the data from Q.2, calculate the standard deviation and variance of the age xi of respondents. Complete the table below.

   x        x-mean    (x-mean)2
   19.00
   20.00
   21.00
   21.00
   21.00
   22.00
   23.00
   24.00
   25.00
   26.00
   26.00
   27.00
   27.00
   27.00
   28.00
   29.00
   29.00
   29.00
   29.00
   30.00
   30.00
   31.00
   31.00
   32.00
   32.00
   33.00
   33.00
   33.00
   33.00
   34.00
   34.00
   35.00
   36.00
   36.00
   37.00
   Total

   Total (x-mean)2 = _______________
   Therefore standard deviation s = _________________

It is easy to calculate the mean and standard deviation for data with few observations. But for studies with a large number of samples, it is much harder. Therefore for large studies, the quantitative data are sorted in frequency tables such as the one below.

5. These are data from a case-control study to identify factors that are associated with small for gestational age amongst newborn babies. For the table below, the factor being studied is the weight of the mothers during first trimester (first three months of pregnancy) and the incidence of babies with low birth weight.

   Weight during first    All            Frequency    Frequency of
   trimester in kg        Frequencies    of Cases     Controls
   30.0-39.9              5              5            0
   40.0-49.9              69             48           21
   50.0-59.9              82             43           39
   60.0-69.9              45             10           35
   70.0-79.9              10             2            8
   80.0-89.9              3              1            2
   90.0-99.9              4              1            3
   Total                  218            110          108

For the following exercise, calculate the mean, mode, median and standard deviation for both cases and controls. To simplify matters, just fill up the table below.

For cases,
   Weight in kg    Frequency    m.p      f.mp    f.mp2    f cumulative
   30.0-39.9       5            34.95                     5
   40.0-49.9       48           44.95                     53
   50.0-59.9       43           54.95                     96
   60.0-69.9       10           64.95                     106
   70.0-79.9       2            74.95                     108
   80.0-89.9       1            84.95                     109
   90.0-99.9       1            94.95                     110
   Total           110

For controls,
   Weight in kg    Frequency    m.p      f.mp    f.mp2    f cumulative
   30.0-39.9       0            34.95    0       0        0
   40.0-49.9       21           44.95                     21
   50.0-59.9       39           54.95                     60
   60.0-69.9       35           64.95                     95
   70.0-79.9       8            74.95                     103
   80.0-89.9       2            84.95                     105
   90.0-99.9       3            94.95                     108
   Total           108

☻ f.mp2 means "frequency x (midpoint)2", not (f.mp)2

Fill up your answers in the table below.
                         Case        Control
   Mean
   Mode
   Median
   Standard deviation

The answers above will be used in the coming practical sessions.

Copyright reserved Dr Azmi Mohd Tamil. Amali1.doc 8-4-07.
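
The grouped-data mean and standard deviation can be checked with a short script. The sketch below assumes the usual grouped-data formulas, mean = ∑f.mp / n and s = sqrt[(∑f.mp2 − (∑f.mp)2 / n) / (n − 1)], which may differ in notation from Formula A on your formula sheet; it uses the frequencies for cases from the table above.

```python
import math

# Class midpoints and frequencies for cases, from the table above
midpoints = [34.95, 44.95, 54.95, 64.95, 74.95, 84.95, 94.95]
freq_cases = [5, 48, 43, 10, 2, 1, 1]


def grouped_mean_sd(mp, f):
    """Mean and sample standard deviation for grouped data."""
    n = sum(f)
    sum_fmp = sum(fi * m for fi, m in zip(f, mp))        # sum of f x midpoint
    sum_fmp2 = sum(fi * m * m for fi, m in zip(f, mp))   # sum of f x midpoint squared
    mean = sum_fmp / n
    variance = (sum_fmp2 - sum_fmp ** 2 / n) / (n - 1)
    return mean, math.sqrt(variance)


mean_cases, sd_cases = grouped_mean_sd(midpoints, freq_cases)
print(f"cases: mean = {mean_cases:.2f} kg, sd = {sd_cases:.2f} kg")
```

Swapping in the control frequencies gives the corresponding figures for the control group.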

Practical 1b
Research Proposal

Each lab group is required to come up with a research proposal, collect the data required, analyse the data, present their findings and write up the final report for submission. For this session, the students are expected to agree on the:
• Title of the research
• Objectives
• Problem Framework
• Hypothesis
• Methodology
Once the above has been agreed upon, as homework, they are expected to write up the proposal, including the questionnaire, which will be discussed during the second practical session.

Practical 2
Inferential Statistics

Statistical Tests & Types of Variables
In general there are 2 types of variables, qualitative & quantitative. When you want to test the association between 2 variables, the type of test to be utilised depends on the type of variables. The tables below give a general guide on the correct statistical test for the respective variable types.

Parametric Analysis
Variable 1                 Variable 2                 Criteria                                              Type of Test
Qualitative Dichotomous    Quantitative               Normally distributed data                             Student's t Test
Qualitative Polytomous     Quantitative               Normally distributed data                             ANOVA
Quantitative               Quantitative               Repeated measurement of the same individual & item    Paired t Test
                                                      (e.g. Hb level before & after treatment).
                                                      Normally distributed data
Quantitative continuous    Quantitative continuous    Normally distributed data                             Pearson Correlation & Linear Regression

Non-Parametric Analysis
Variable 1                 Variable 2                 Criteria                                              Type of Test
Qualitative Dichotomous    Qualitative Dichotomous    Sample size < 20 or (< 40 but with at least one       Fisher Test
                                                      expected value < 5)
Qualitative Dichotomous    Quantitative               Data not normally distributed                         Wilcoxon Rank Sum Test or Mann-Whitney U Test
Qualitative Polytomous     Quantitative               Data not normally distributed                         Kruskal-Wallis One Way ANOVA Test
Quantitative               Quantitative               Repeated measurement of the same individual & item    Wilcoxon Signed Rank Test
Quantitative continuous    Quantitative continuous    Data not normally distributed                         Spearman/Kendall Rank Correlation

Practical 2
This is the second practical session for this module. In this session, we will be conducting exercises on Student's t-test, paired t-test and proportionate test.

Student's t-test
1a. Write down the formula for Student's t-test in the boxes below.
    Basic Formula        Sample size > 30        Small sample size & equal variance

b. Based on results from the previous session, Q5, complete the boxes below.
                          Case    Control
    Mean
    Standard deviation
    n                     110     108

   The hypothesis that we want to test is that there is a difference of first trimester body weight between the cases (mothers with SGA babies) and controls (mothers with non-SGA babies).
c. Write down the null hypothesis.
d. Calculate the t for Student's t-test for the above exercise.

e. Please refer to table A1 and A3, and try to estimate the p value from the t value calculated. Discuss which table is more appropriate for this exercise.
f. Based on the above p value, is the null hypothesis rejected?
g. Is there a significant difference of first trimester weight between the two groups? Explain your answer.

During the examination, we will not tell you what test to use. Instead the students are expected to choose the appropriate one based on the problem and the data given. For example, try to do the exercise below.

2. A case-control study to identify factors that can cause small for gestational age – SGA was conducted. Among the factors studied were the mothers' heights. It is believed that the shorter mothers were of higher risk to get SGA babies.

                          Case     Control    Both groups
   Total of samples n     110      108        218
   Total of height ∑x     16620    16439      33059
   Total of (x-mean)2     2326     3605       5931

a. State the hypothesis and null hypothesis for the above problem.
b. What is the appropriate statistical test to prove this hypothesis?
c. Using the data given, conduct the statistical test.
d. What is your conclusion, based on your answers in Q2c?
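
For Q2, one possible approach is a Student's t-test with pooled variance, computed from the summary figures alone. The sketch below is an illustration under that assumption (equal variances in both groups); the variable names are invented for this example, and the p value would still be read from the t table with n1 + n2 − 2 degrees of freedom.

```python
import math

# Summary data from the Q2 table (mothers' heights)
n_case, sum_case, ss_case = 110, 16620, 2326   # n, sum of x, sum of (x - mean)^2
n_ctrl, sum_ctrl, ss_ctrl = 108, 16439, 3605

mean_case = sum_case / n_case
mean_ctrl = sum_ctrl / n_ctrl

# Pooled variance: combined sum of squares over combined degrees of freedom
df = n_case + n_ctrl - 2
pooled_var = (ss_case + ss_ctrl) / df

# Standard error of the difference between the two means
se_diff = math.sqrt(pooled_var * (1 / n_case + 1 / n_ctrl))
t = (mean_case - mean_ctrl) / se_diff

print(f"mean case = {mean_case:.2f}, mean control = {mean_ctrl:.2f}")
print(f"t = {t:.3f} with {df} degrees of freedom")
```

With a sample this large, the critical value at the 0.05 level (two-tailed) is close to 1.96, so the absolute value of t can be compared against that.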

Paired t-test
3a. Write down the formula for paired t-test in the box below.
    Basic Formula

b. Thirty of the pregnant mothers were found to be anaemic during their second trimester follow-up. They were treated with haematinics for 2 months and their haemoglobin levels were measured again. To measure the effectiveness of the treatment, please complete the table below.

    No. (1 to 30)    Hb1    Hb2    D    D2
    Total

c. Is the intervention effective? Do a paired t-test analysis using the data above.
d. Discuss the result of your statistical test.

Proportionate Test
4a. Write down the formula for proportionate test in the box below.
    Basic Formula

   The rate of SGA for mothers exposed to cigarette smoke (passive smoker) was 89/156. The rate of SGA for mothers not exposed to cigarette smoke was 20/61.
b. State the appropriate null hypothesis.
c. Do the proportionate test and discuss its result using 0.05 as the level of significance (the z value in the normal distribution table for 0.05 as the level of significance is 1.96).

Research Project 2
Presentation of the complete research proposal. Upon acceptance of the proposal, as homework, the students are expected to distribute the questionnaires and collect the data for the study. All completed forms are to be brought to the third practical session.

Copyright reserved Dr Azmi Mohd Tamil. Amali4.doc 8-8-06.
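
The proportionate test in Q4 can be sketched as a two-proportion z test. The version below assumes the pooled-proportion form of the standard error, which is one common textbook formula and may not match the one on your formula sheet exactly.

```python
import math

# SGA rates from Q4
sga_exposed, n_exposed = 89, 156       # passive smokers
sga_unexposed, n_unexposed = 20, 61    # not exposed to cigarette smoke

p1 = sga_exposed / n_exposed
p2 = sga_unexposed / n_unexposed

# Pooled proportion under the null hypothesis of no difference
p_pooled = (sga_exposed + sga_unexposed) / (n_exposed + n_unexposed)
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_exposed + 1 / n_unexposed))
z = (p1 - p2) / se

print(f"p1 = {p1:.3f}, p2 = {p2:.3f}, z = {z:.2f}")
print("reject H0 at 0.05" if abs(z) > 1.96 else "do not reject H0 at 0.05")
```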

Practical 3
Inferential Statistics 2

Introduction
This is the third practical session. In this session we will do exercises on Pearson correlation and linear regression.

Pearson Correlation
1a. Write down the formula for Pearson Correlation in the boxes below.
    Basic Formula for r
    (x-mean x)2        (y-mean y)2        (x-mean x)(y-mean y)

As you can see from the formulas above, to calculate the correlation coefficient (r), you need to identify the following:
• Total of the first variable (∑x),
• Total of the first variable squared (∑x2),
• Total of the second variable (∑y),
• Total of the second variable squared (∑y2) and
• Total of the two variables multiplied (∑xy).

2. A case-control study to identify factors that can cause small for gestational age – SGA was conducted. In the past exercise, we have proven that there is an association between the mothers' first trimester weight and SGA. Now we want to see whether there is an association between the mothers' first trimester weight (WEIGHT2) and the child's birth weight (BIRTHWGT). For exercise, please complete the following table. Measure the time required to complete it. Once done, please note that you may have to do the same thing again for a dataset 5 times larger than this. ☺
Just imagine the number of calculations that you have to do before you even get to calculate the correlation coefficient (r). If the sample size is 150, you will have to do more than 455 calculations. Since you'll be doing these calculations manually, the chance of error occurring is quite high indeed ☺.

INDEX    WEIGHT2    WEIGHT22    BIRTHWGT    BIRTHWGT2    ∑xy
(30 respondents, INDEX 9, 10, 12, 20, 21, 29, 31, 32, 34, 43, 60, 70, 72, 79, 90, 97, 117, 126, 131, 138, 145, 146, 156, 159, 171, 173, 174, 175, 178, 181)
TOTAL

a. State the null hypothesis for correlation test between the two variables.
b. Conduct the correlation test and calculate the r (correlation coefficient). How strong is the relationship between the two variables?
c. Is the r significant? What is the p value? How is it calculated?

If the r is significant, usually, it is best to demonstrate it using a scatter diagram like the one below.

[Scatter diagram: Babies' Birthweight against Mothers' Weight; r = 0.431, p = 0.017, Rsq = 0.1874]

To expect the students to calculate all that during the examination would be rather cruel. Instead, all the required data will be given, along with some extraneous data, just to confuse the students. It is up to the students to select the appropriate data and use it in the appropriate statistical test.

3. A case-control study to identify factors that can cause small for gestational age – SGA was conducted. Among the factors studied were whether there is an association between the mothers' height in cm (HEIGHT) and the child's birth weight in kilogram (BIRTHWGT).

   n = 218                                HEIGHT        BIRTHWGT
   Mean                                   151.65        2.79
   Standard deviation                     5.26          0.54
   ∑(observation)                         33059.00      608.46
   ∑(observation2)                        5019291.00    1760.98
   ∑(observation 1 x observation 2)       92386.35

a. State the null hypothesis for the above statistical test.
b. Name the appropriate statistical test to test the association between the two variables.
c. Conduct the statistical test including the test of significance. Discuss the result of the test.
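
For Q3, the correlation coefficient can be computed from the sums alone. The sketch below assumes the summary figures as read from the table (the placement of some decimals is an assumption because the source table is hard to read) and uses the computational form of Pearson's r together with the t statistic t = r sqrt(n − 2) / sqrt(1 − r2) for the test of significance.

```python
import math

# Summary figures for HEIGHT (x) and BIRTHWGT (y); decimals are assumed readings
n = 218
sum_x, sum_x2 = 33059.00, 5019291.00
sum_y, sum_y2 = 608.46, 1760.98
sum_xy = 92386.35

# Corrected sums of squares and cross-products
sxx = sum_x2 - sum_x ** 2 / n
syy = sum_y2 - sum_y ** 2 / n
sxy = sum_xy - sum_x * sum_y / n

r = sxy / math.sqrt(sxx * syy)

# Test of significance for r, with n - 2 degrees of freedom
t_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(f"r = {r:.3f}, t = {t_r:.2f}, df = {n - 2}")
```

The mean and standard deviation rows of the table are the extraneous data here; only the five sums and n are needed.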

Linear Regression
4a. Write down the formula for linear regression in the boxes below.
    Basic Formula
    b
    a

b. Using the data from Q2, conduct the test for linear regression and calculate the regression co-efficient (b) and constant (a).
c. Write down the final equation of the calculation.
d. Draw a rough diagram of the final equation from the calculation.

Research Project 3
Students will be guided on how to enter the data that they have collected into the computer using Excel or SPSS. Each lab is required to prepare a notebook for the session. For homework, students are required to complete the data entry for all collected data and bring the completed file to the fourth practical session.

Copyright reserved Dr Azmi Mohd Tamil. Amali5.doc 16-8-06.
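
The regression coefficient and constant follow the least-squares formulas b = ∑(x − mean x)(y − mean y) / ∑(x − mean x)2 and a = mean y − b x mean x. The sketch below uses a small hypothetical dataset (five made-up pairs of mother's weight and birth weight, not the Q2 data) purely to show the arithmetic.

```python
# Hypothetical data for illustration only (not the Q2 dataset)
weight = [40, 50, 60, 70, 80]           # x: mother's weight in kg
birthwgt = [2.4, 2.8, 3.0, 3.3, 3.5]    # y: birth weight in kg

n = len(weight)
mean_x = sum(weight) / n
mean_y = sum(birthwgt) / n

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weight, birthwgt))
sxx = sum((x - mean_x) ** 2 for x in weight)

b = sxy / sxx              # regression coefficient (slope)
a = mean_y - b * mean_x    # constant (intercept)

print(f"y = {a:.2f} + {b:.3f}x")
```

The final equation has the form y = a + bx, which is the line to sketch in part d.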

Practical 4
Inferential Statistics 3

Introduction
In this practical session, we will be doing exercises on chi-square test and non-parametric analysis.

Chi-square Test
This is the most frequent statistical analysis that is tested for during examination, so make sure that you really understand it. This analysis is done to test the association between two qualitative variables. Observed data would be sorted accordingly in a contingency table, as illustrated below:

Observation
         +      -
   +     a      b      g
   -     c      d      h
         e      f      n

The expected value table is then calculated, using the row and column totals:

Expected Value
         +       -
   +     eg/n    fg/n    g
   -     eh/n    fh/n    h
         e       f       n

Chi-square is calculated by summing up (observed − expected)2/expected for each cell.

χ2 = ∑ (O − E)2 / E
df = (r − 1)(c − 1)

1. Do the following exercise, then compare with the answer for Q.4c from Practical 2.
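
The chi-square calculation described above can be sketched as follows. The counts are the passive-smoking figures quoted in Practical 2 (89 of 156 exposed mothers and 20 of 61 unexposed mothers had SGA babies); the function name is an assumption made for this illustration, and no continuity correction is applied.

```python
def chi_square(observed):
    """Chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected value for the cell
            chi2 += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(col_totals) - 1)
    return chi2, df


# Rows: passive smoker, non-smoker; columns: SGA, normal
table = [[89, 67], [20, 41]]
chi2, df = chi_square(table)
print(f"chi-square = {chi2:.2f}, df = {df}")
```

For a 2 x 2 table the chi-square value equals the square of the z value from the proportionate test, which is why the exercise asks for a comparison with Q.4c.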

The rate of SGA for mothers exposed to cigarette smoke ("passive smoker") is 89/156. The rate of SGA for mothers not exposed to cigarette smoke is 20/61.

Observation
                     SGA    Normal
   Passive Smoker    89     67        156
   Non-Smoker        20     41        61
                     109    108       217

a. Complete the table of expected values below.
                     SGA    Normal
   Passive Smoker
   Non-Smoker
b. What is the null hypothesis?
c. What is the rate of SGA for passive smokers and non-smokers? Is there any difference?
d. Conduct the appropriate statistical test to prove your hypothesis. Discuss your findings.

Fisher's Exact Test
Fisher's Exact test is conducted to test the association between 2 qualitative variables when the sample size is small: less than 20, or less than 40 and one of the expected values is less than 5. The formula is as follows:

probability p = e! f! g! h! / (n! a! b! c! d!)

         +      -
   +     a      b      g
   -     c      d      h
         e      f      n
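
The Fisher formula above gives the probability of one particular table. A minimal sketch is shown below; the example counts (10, 0, 7, 6) are taken from the small passive-smoking table in the exercise that follows, and because one cell is already 0 there is no more extreme table in that direction, so this single probability is also the one-tailed p value.

```python
from math import factorial


def fisher_table_probability(a, b, c, d):
    """p = e! f! g! h! / (n! a! b! c! d!) for a single 2 x 2 table."""
    g, h = a + b, c + d        # row totals
    e, f = a + c, b + d        # column totals
    n = a + b + c + d
    numerator = factorial(e) * factorial(f) * factorial(g) * factorial(h)
    denominator = (factorial(n) * factorial(a) * factorial(b)
                   * factorial(c) * factorial(d))
    return numerator / denominator


# SGA row: 10 passive smokers, 0 non-smokers; Normal row: 7 and 6
p = fisher_table_probability(10, 0, 7, 6)
print(f"p = {p:.4f}")
```

When no cell is 0, the probabilities of all tables at least as extreme as the observed one have to be added together.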

[…] analysing this group […] association […] variable […] of the earlier […] cigarette smoke […] smoker

              […]     […]     Total
   SGA         10       0       10
   N[…]         7       6       13
   Total       17       6       23

[…] your findings.

Non-[…]

[…] following exercise. […] test.

Passive Sm[…] values […] just […] from the earlier […] test to […] appropriate statistical […] groups and […] group […] conducted […] can you […] method is simple […] Non-Sm[…] give your […] The data is a subset of […] order, rank them, sum up the ranks according […] Conduct the appropriate […] of the baby. Since the sample […] according to […] compare the value […] hypothesis?

e. […] is there an association between exposure to […] Wi[…] appropriate test is a non-[…] association between exp[…] result […] value […] normal […] contingency table […] test.

3. F[…] obstetric history […] group […] earlier study. We are trying to see whether there is any association […] respondents had miscarriages in the past. By analysing […] sample size is quite small […] association between a qua[…] analysis.

a. What is the null […] sort the data in an ascending […]

[…] Sum Test

[…] smoke and the weight […] association […] you make from […] non-parametric equivalent […] exposure to […] normally distributed data. It is used to […] value with the table […] your hypothesis […] hypothesis […] earlier SGA study, 23 […] on the […] value?

b. What […] to cigarette smoke […] normal […] Rank Sum […] equivalent […] This test is the non-[…]

[…]moker (n = 15)
   Birth Weight    Rank
   3.76
   3.60
   3.55
   3.48
   3.25
   3.06
   3.05
   2.55
   2.47
   2.46
   2.45
   2.45
   2.43
   2.30
   2.09

[…] test for […] […]itative dich[…]

[…]moker (n = 15)
   Birth Weight    Rank
   4.20
   3.96
   3.70
   3.61
   3.26
   3.15
   3.12
   3.00
   2.98
   2.84
   2.81
   2.57
   2.44
   2.43
   2.10

Passive S[…]

[…] of the Student's t test, but […] of critical […] from the above […] hypothesis. Discuss your […] smoker N[…] Conduct the appropriate […] association […] The method […] smoke and SGA? Based […] for practise, do […], the appropriate […] of the respondents […] non-parametric analysis […] with a quantitative variable […]

a. What is the p value […] 2. Fr[…] results?

Wilcoxon Rank S[…]

[…] the […] following […] of patients with […] appropriate statistical […]
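The ranking procedure described in the exercise (sort the combined data in ascending order, rank them, sum up the ranks per group) can be sketched as follows. This is an illustrative sketch only: the helper `average_ranks` is hypothetical, tied values are given the mean of the ranks they occupy, the group labels `group_a` and `group_b` are placeholders because the original column headings are only partly legible, and the z value uses the large-sample normal approximation without a tie correction rather than the critical-value table the handout refers to.

```python
from math import sqrt


def average_ranks(values):
    """Rank values in ascending order; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1      # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


# Birth weights from the two columns of the exercise (15 babies each).
group_a = [3.76, 3.60, 3.55, 3.48, 3.25, 3.06, 3.05, 2.55,
           2.47, 2.46, 2.45, 2.45, 2.43, 2.30, 2.09]
group_b = [4.20, 3.96, 3.70, 3.61, 3.26, 3.15, 3.12, 3.00,
           2.98, 2.84, 2.81, 2.57, 2.44, 2.43, 2.10]

ranks = average_ranks(group_a + group_b)
rank_sum_a = sum(ranks[:len(group_a)])
rank_sum_b = sum(ranks[len(group_a):])

# Normal approximation for the rank sum of group A (no tie correction).
n1, n2 = len(group_a), len(group_b)
expected_sum_a = n1 * (n1 + n2 + 1) / 2
sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z = (rank_sum_a - expected_sum_a) / sd

print("rank sums:", rank_sum_a, rank_sum_b)
print(f"z = {z:.3f}")
```

A quick check on the ranking is that the two rank sums add up to N(N + 1)/2, which is 465 for N = 30.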

[…] on the same individual […] equivalent […] analysis and prepare a P[…]

Sign Test

[…] smoke and the weight […] session […] demo[…] practical […] sample is quite small […] value […] for the final […] of the data […] hypothesis?

b. […] of the data.

4. F[…] appropriate statistical […] exposure to […]

[…] for practise, do […] work, the students will […] following exercise. […] your findings.

   INDEX    HB2     HB3     Hb Diff    Rank
       1    10.4    10.0
       5    10.7    11.0
      13    10.5    11.0
      17    10.6    10.8
      19    10.5    11.0
      20    10.7    11.0
      29    10.3     9.5
      60     9.3     9.5
      61    10.0    11.5
      92    10.5    10.8
      94    10.4     8.2
      95    10.0     7.2
     134    10.2    11.0
     188    10.0    10.0
     197    10.2    10.0

Research Project […]

Each […] see whether the intervention […] of the earlier […] cigarette smoke […]

Copyright reserved (Hakcipta terpelihara) Dr Azmi […] 24-8-06.

[…] to interpret the result […] of the same thing, at different times. As indicated by the name, the calculation […] association […] test […] depends […] conducted to […] of the paired t test, but […]bin […] PowerP[…] non-parametric equivalent […] of the real […], the appropriate […] hypothesis. Discuss your […]

Wilcoxon Rank S[…]

[…] mothers. […] there is any association […] analysis.

a. What is the null […] lecturer will […] test whether there is any association […] variables which are repeated measures […] normal […] value […] presentation […] non-parametric analysis […] Conduct the appropriate […] analysis […] normally distributed data. It is […] demonstrate how […] computer and advise the students […] association […] This test is the non-[…] give you […] of haem[…] follow […] association between 2 quantitative variables […] your hypothesis […] of the anaemic m[…] of haematinics can increase the […] hypothesis […] The data is a subset of […] of the baby. Since the sample […] conducted […] complete the analysis […] relative magnitude […] earlier study. We are trying to […] appropriate test is a non-[…] association between exp[…] on the sign and re[…] analyse the data using the co[…]
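For the paired haemoglobin readings in the table above, the two non-parametric calculations named in this practical (the sign test, and the paired test that depends on the sign and relative magnitude of the differences) can be sketched as below. Several things here are assumptions rather than the handout's own method: the difference is taken as HB3 minus HB2, zero differences are dropped, tied absolute differences share the mean rank, the sign test p-value is the exact two-sided binomial probability, and the variable names are hypothetical.

```python
from math import comb

# Paired haemoglobin values from the exercise table (columns HB2 and HB3).
hb2 = [10.4, 10.7, 10.5, 10.6, 10.5, 10.7, 10.3, 9.3,
       10.0, 10.5, 10.4, 10.0, 10.2, 10.0, 10.2]
hb3 = [10.0, 11.0, 11.0, 10.8, 11.0, 11.0, 9.5, 9.5,
       11.5, 10.8, 8.2, 7.2, 11.0, 10.0, 10.0]

# Assumed direction: second reading minus first. Rounding to one decimal
# removes floating-point noise so that equal differences compare as ties.
diffs = [round(after - before, 1) for before, after in zip(hb2, hb3)]
nonzero = [d for d in diffs if d != 0]       # pairs with no change are dropped
n = len(nonzero)

# Sign test: count the signs, then take the exact two-sided binomial p value.
n_pos = sum(d > 0 for d in nonzero)
n_neg = n - n_pos
k = min(n_pos, n_neg)
p_sign = min(1.0, 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n)

# Signed ranks: rank the absolute differences, then sum the ranks by sign.
abs_sorted = sorted(abs(d) for d in nonzero)


def mean_rank(x):
    """Mean of the 1-based positions that value x occupies in abs_sorted."""
    first = abs_sorted.index(x) + 1
    last = first + abs_sorted.count(x) - 1
    return (first + last) / 2


w_plus = sum(mean_rank(abs(d)) for d in nonzero if d > 0)
w_minus = sum(mean_rank(abs(d)) for d in nonzero if d < 0)

print("positive / negative differences:", n_pos, n_neg)
print(f"sign test two-sided p = {p_sign:.3f}")
print("signed rank sums (W+, W-):", w_plus, w_minus)
```

The smaller of the two signed rank sums is the value normally compared against a table of critical values for the number of non-zero pairs.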