
Problems on Regression and Correlation
Prepared by: Dr. Elias Dabeet
Q1. Dr. Green (a pediatrician) wanted to test whether there is a correlation between the number of meals consumed by a child per day (X) and the child's weight (Y). The table below contains the information on 4 of the children. Use the table to answer the following:

Child     Meals per day (X)   Weight (Y)   X²     Y²     XY
Ahmad     11                  8            121    64     88
Ali       16                  11           256    121    176
Osama     12                  9            144    81     108
Husien    19                  13           361    169    247
Total     58                  41           882    435    619

a. Determine the simple linear regression equation.
b. Determine the correlation coefficient. Interpret it in words.
c. What is the expected child weight if the number of meals is increased by 2 meals per day?

Q2. A hospital supervisor wishes to find the relationship between the number of nurses on a job and the number of patients examined for a shift. Listed below are the results for a sample of 4 days. Let the number of patients (y) be the dependent variable.

Nurses (x)   Patients (y)   x²   y²   xy
9            12
3            14
5            11
8            13

a. Compute Σx, Σy, Σx², Σy², Σxy.

b. Determine the coefficient of correlation. Interpret its meaning.

c. Determine the estimated simple linear regression equation.

d. If the number of nurses on a job changes by 3, what is the corresponding change in the number of examined patients?

Q3. A study was conducted on the relation between advertising expenditures and amount of sales for a medical device product. The data are in the following table:

Advertising expenditures ($1000), x    2   4   5   7   8
Amount of sales (1000 units), y        2   3   2   6   4

a. Determine the estimated linear regression equation.
b. Determine the coefficient of correlation and the coefficient of determination. Interpret their meanings.
c. Describe the relation between X and Y (direct or inverse). Explain.
d. If the advertising expenditures change by 2 units, what is the corresponding change in the amount of sales of the medical device?
Q4. A computer program has printed out r as equal to 0.8214 for the correlation between the amount of fat consumed daily in kg (X) and the percent increase in risk for CHD (Y). Find out whether the program is correct, then estimate the precise amount of increase in risk for CHD if the fat consumed is 8 kg.

Child number                 1   2   3   4   5   6   7
Daily fat consumption (X)    1   2   3   4   5   6   7
Increase in CHD risk (Y)     2   1   4   3   7   5   6
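
One possible way to check the printed value of r in Q4 is to build the sums Σx, Σy, Σx², Σy², Σxy and apply the usual computational formulas for the correlation coefficient and the least-squares line. The sketch below is illustrative only: the function name `fit_line` and the variable names are assumptions made for this example and are not part of the problem set.

```python
from math import sqrt

def fit_line(xs, ys):
    """Return (r, slope, intercept) for the simple linear regression of y on x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # Corrected sums of squares and cross-products.
    sxx = sum_x2 - sum_x ** 2 / n
    syy = sum_y2 - sum_y ** 2 / n
    sxy = sum_xy - sum_x * sum_y / n
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = sum_y / n - slope * sum_x / n
    return r, slope, intercept

# Q4 data: daily fat consumption (X) and increase in CHD risk (Y).
fat = [1, 2, 3, 4, 5, 6, 7]
risk = [2, 1, 4, 3, 7, 5, 6]
r, slope, intercept = fit_line(fat, risk)
predicted_at_8 = intercept + slope * 8
print(round(r, 4), round(predicted_at_8, 4))
```

The same helper could be applied to the data of Q1, Q2, Q3 and Q5, since each of them asks for the same three quantities.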

Q5. The college of orthopedics conducted a study to determine the correlation between the height and weight of female children. A sample of 6 children was selected, and the data are in the following table.

Height   12   10   14   11   12   9
Weight   18   17   23   19   20   15

1. Determine the correlation coefficient. Interpret the answer in words.
2. Determine the linear regression equation.
3. Predict the weight of a female child whose height is 13.

Problems on Probability:
Q6. If one person is selected randomly from a set of 75 persons who are classified according to three categories of age and three categories of weight:

                     Slim (S)   Normal (N)   Fat (F)   Total
A (05-24 years)      15         10           2         27
B (25-44 years)      10         12           3         25
C (45-64 years)      7          11           5         23
Total                32         33           10        75

1. ( )
2. ( )
3. ( )
4. Are the events B and S independent? Why?
5. ( )
6. ( )

Q7. A statistics instructor has noted from past experience that 75% of the students in the course do their homework. He also found that, of those who do their homework, 90% pass the course, while of those who do not do their homework, only 25% pass the course. Using the above percentages, answer the following:
a. What is the probability that a randomly selected student will pass the course?
b. Given that a student did not pass the course, what is the probability that he/she completed the homework?

Q8. In an experiment to study the relationship between hypertension and smoking habits, the following data were collected for 180 individuals:

                   Non-smokers   Moderate smokers   Heavy smokers
Hypertension       21            36                 19
No hypertension    48            26                 30

If one of these individuals is selected at random, find the probability that the person is

1. experiencing hypertension, given that the person is a moderate smoker;
2. a non-smoker or experiencing hypertension;
3. a smoker, given that the person is experiencing no hypertension.

Q9. 1418 men were cross-classified according to their smoking status and their lung cancer status as in the following table:

                 Lung cancer
Smoker     Present   Absent   Total
Yes        688       650      1338
No         21        59       80
Total      709       709      1418

If one of these men is selected at random, find the probability that he is
a) a smoker.
b) having lung cancer.
c) a non-smoker or having lung cancer.
d) having lung cancer, given that he is a smoker.
e) a smoker, given that he does not have lung cancer.
f) Are the events being a smoker and having lung cancer independent? Why?
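
Questions on a cross-classified table such as Q9 reduce to counting cells and dividing. The sketch below shows one way to organize that with exact fractions. The cell labels (`"smoker"`, `"cancer"` and so on) and the helper `prob` are names invented for this illustration.

```python
from fractions import Fraction

# Q9 counts keyed by (smoking status, lung cancer status).
# The string labels are illustrative names, not taken from the problem set.
counts = {
    ("smoker", "cancer"): 688,
    ("smoker", "no_cancer"): 650,
    ("non_smoker", "cancer"): 21,
    ("non_smoker", "no_cancer"): 59,
}
total = sum(counts.values())

def prob(event):
    """Probability of the set of cells for which event(row, col) is true."""
    favourable = sum(n for (row, col), n in counts.items() if event(row, col))
    return Fraction(favourable, total)

p_smoker = prob(lambda row, col: row == "smoker")
p_cancer = prob(lambda row, col: col == "cancer")
p_both = prob(lambda row, col: row == "smoker" and col == "cancer")

# Conditional probability: P(cancer | smoker) = P(both) / P(smoker).
p_cancer_given_smoker = p_both / p_smoker

# Two events are independent only if P(both) equals P(smoker) * P(cancer).
independent = p_both == p_smoker * p_cancer
print(p_smoker, p_cancer, p_cancer_given_smoker, independent)
```

Using `Fraction` keeps the independence comparison exact, which avoids a floating-point equality test. The same layout should carry over to the tables in Q6 and Q8 by changing the dictionary keys and counts.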

Q10. A lab test is 95 percent effective at detecting a certain disease when it is present. When the disease is not present, the test is 99 percent effective at declaring the subject negative. If 8 percent of the population has the disease, what is the probability that a subject has the disease given that his test is positive?
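
Q10 is a direct application of Bayes' theorem. A minimal sketch follows. It assumes that "95 percent effective when the disease is present" is read as the sensitivity and that "99 percent effective when it is not present" is read as the specificity. The function name `posterior` and its parameter names are hypothetical.

```python
def posterior(prior, sensitivity, specificity):
    """P(disease | positive test) by Bayes' theorem."""
    true_positive = sensitivity * prior                # P(+ and disease)
    false_positive = (1 - specificity) * (1 - prior)   # P(+ and no disease)
    return true_positive / (true_positive + false_positive)

# Q10: 8% prevalence, 95% sensitivity, 99% specificity (assumed reading).
p = posterior(prior=0.08, sensitivity=0.95, specificity=0.99)
print(round(p, 4))
```

The denominator is the total probability of a positive test, which is the same total-probability step that Q7(a) asks for before the conditional probability in Q7(b).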

Problems on BINOMIAL, POISSON, NORMAL:


Q11. The probability of a person with a headache finding relief with a pill is 0.8. Four randomly selected individuals with headaches are given the pill. Find the probability that:
(a) none will find relief.
(b) exactly two will find relief.
(c) at most two will find relief.

Q12. During the early years of the AIDS epidemic, 35% of the population with AIDS developed Kaposi's sarcoma (KS).
(a) Suppose one treated 20 patients from this population. What is the probability of having at least 8 patients with KS?
(b) Suppose one treated 20 patients from this population. What is the probability of having fewer than 8 but at least 5 patients with KS?
(c) If one treated 500 patients from this population, how many patients would you expect to have KS?

Q13. In the April 8, 1994 issue of MMWR on adolescent risk behavior, it was determined that 25% of the population of 12-13 year olds reported at least one health-risk behavior (smoking, not wearing seatbelts, alcohol or drug use, etc.) in the last 30 days.
(a) Suppose one has a class of 15 adolescents from this population. What is the probability of having at least 4 students exhibiting a health-risk behavior in the last 30 days?
(b) Suppose one has a class of 15 adolescents from this population. What is the probability of having at least 2 but fewer than 6 students exhibiting a health-risk behavior in the last 30 days?

Q14. Assuming the heights of college women are normally distributed with mean 65 in. and standard deviation 2.5 in., find the following:
(a) The percent of women who are taller than 67 in.
(b) The percent of women who are shorter than 62 in.
(c) The percent of women who are between 62.5 and 67.5 in.
(d) The percent of women who are between 60 and 70 in.

Q15. A vending machine pours soft drinks into cups. The amount of drink dispensed is normally distributed with mean 7.6 oz. and standard deviation 0.4 oz. Answer the following:
(a) Estimate the probability that the machine will overfill an 8 oz. cup.
(b) If one had to fill 800 8 oz. cups, how many would you expect to be overfilled?
(c) Suppose you filled 20 cups. What is the probability that at least one cup would be overfilled?

Q16. Suppose T-Lymphocyte (L2) counts are normally distributed with mean 30 and standard deviation 6. Find:
(a) The probability of having an L2 count greater than 40.
(b) The 5th percentile.
(c) The 97.5th percentile.
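
The binomial and normal questions above can be checked numerically instead of with printed tables. The sketch below uses only the Python standard library (`math.comb` and `statistics.NormalDist`, both available from Python 3.8). The helper `binom_pmf` and the variable names are assumptions made for illustration. It covers Q11, Q15 and part of Q16.

```python
from math import comb
from statistics import NormalDist

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Q11: n = 4 individuals, each finds relief with probability 0.8.
p_none = binom_pmf(0, 4, 0.8)
p_exactly_two = binom_pmf(2, 4, 0.8)
p_at_most_two = sum(binom_pmf(k, 4, 0.8) for k in range(3))  # k = 0, 1, 2

# Q15: amount poured ~ Normal(mean 7.6 oz, sd 0.4 oz); overfill means > 8 oz.
pour = NormalDist(mu=7.6, sigma=0.4)
p_overfill = 1 - pour.cdf(8)
expected_overfilled = 800 * p_overfill
# "At least one of 20" is the complement of "none of 20".
p_at_least_one = 1 - binom_pmf(0, 20, p_overfill)

# Q16: percentiles come from the inverse CDF.
l2 = NormalDist(mu=30, sigma=6)
fifth_percentile = l2.inv_cdf(0.05)

print(p_none, p_exactly_two, p_at_most_two)
print(p_overfill, expected_overfilled, p_at_least_one, fifth_percentile)
```

Results from software may differ slightly from hand solutions in the last decimal place, because z-tables round z to two decimals.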

Q17. Using the results of a recent clinical study, the mean CD4 cell count for healthy individuals ages 18-70 is 700. If one assumes that the CD4 counts are normally distributed with a standard deviation of 100, find the probability of having an individual from this population with a CD4 count of less than 550.

Q18. Suppose that one knows that the CD4 cell counts for healthy individuals are normally distributed with mean 1000 and standard deviation 200. Find the probability that the sample mean of 16 individuals will be
(a) less than 900.
(b) more than 1100.
(c) between 900 and 950.
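
Q18 differs from Q17 in that it concerns a sample mean, not a single individual. For a normal population, the mean of n observations is itself normal with the same mean and standard deviation σ/√n. The sketch below applies that fact. The variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

# Q18: population mean 1000, standard deviation 200, sample size 16.
mu, sigma, n = 1000, 200, 16
standard_error = sigma / sqrt(n)  # 200 / 4 = 50

# Sampling distribution of the mean of 16 individuals.
sample_mean = NormalDist(mu=mu, sigma=standard_error)

p_below_900 = sample_mean.cdf(900)
p_above_1100 = 1 - sample_mean.cdf(1100)
p_900_to_950 = sample_mean.cdf(950) - sample_mean.cdf(900)
print(p_below_900, p_above_1100, p_900_to_950)
```

Parts (a) and (b) come out equal because 900 and 1100 are each two standard errors from the mean and the normal curve is symmetric.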

Q19. Using the results of a recent clinical study, the mean CD4 cell count for healthy individuals ages 18-70 is 700. If one assumes that the CD4 counts are normally distributed with a standard deviation of 100, find the probability of having an individual from this population with a CD4 count greater than 850.

Q20. The lifetime of a medical device has a normal distribution with mean = 2000 hours and standard deviation = 200 hours. What is the probability that such a device will last
1. between 2100 and 2400 hours?
2. less than 1470 hours?
3. Find the 80th percentile.