
Yildiz Technical University, Computer Engineering Department
Multivariate Statistical Data Analysis, 2013, Homework 1
Due date: 26 October 2013

Q1: We have measured the height (in inches) and weight (in pounds) of five newborn babies (Table 1). Manually calculate the mean and standard deviation of height and weight, showing all the steps.
Table 1: Height (in inches) and weight (in pounds) for five newborn babies

Observation   1     2     3     4     5
Height        18    21    17    16    19
Weight        7.8   9.1   8.2   6.4   8.8
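
The manual steps for Q1 can be cross-checked with a short script. This is only a sketch under one assumption: the assignment does not say whether the sample or the population standard deviation is wanted, so the code below uses the sample version (dividing by n - 1). The function names `mean` and `sample_sd` are illustrative and do not come from the course material.

```python
import math

# Data from Table 1.
height = [18, 21, 17, 16, 19]
weight = [7.8, 9.1, 8.2, 6.4, 8.8]


def mean(values):
    """Arithmetic mean: sum of the observations divided by their count."""
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation (assumption: divisor n - 1, not n)."""
    m = mean(values)
    # Step 1: squared deviation of each observation from the mean.
    squared_deviations = [(v - m) ** 2 for v in values]
    # Step 2: sum them and divide by n - 1 to get the sample variance.
    variance = sum(squared_deviations) / (len(values) - 1)
    # Step 3: the standard deviation is the square root of the variance.
    return math.sqrt(variance)


if __name__ == "__main__":
    for name, data in (("height", height), ("weight", weight)):
        print(name, "mean:", round(mean(data), 4), "sd:", round(sample_sd(data), 4))
```

If the population standard deviation is intended instead, the only change is the divisor in step 2 (n rather than n - 1).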

Q2: Download BodyTemperature.txt from the course website and use the data set to answer the following:
a) Find the five-number summary for all numerical variables.
b) Create the scatterplot of body temperature by heart rate. Describe the pattern and comment on the possible relationship between the two variables.
c) Find the correlation coefficient between body temperature and heart rate.
d) Create boxplots of body temperature for men and women separately. Which one tends to be higher? Which one has higher dispersion?
e) For the numerical variables, provide the histograms. Comment on the central tendency and the shape of the histograms.
f) For the numerical variables, provide the boxplots. Are there any outliers in the data?

Q3: A person has received the result of his medical test and learned that his diagnosis was positive (affected by the disease). However, the lab report stated that this kind of test has a false positive probability of 0.06 (i.e., diagnosing a healthy person, H, as affected, D) and a false negative probability of 0.038 (i.e., diagnosing an affected person as healthy). Therefore, while this news was devastating, there is a chance that he was misdiagnosed. After some research, he found out that the probability of this disease in the population is P(D) = 0.02. Find the probability that he is actually affected by the disease given the positive lab result.
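
The setup in Q3 is an application of Bayes' theorem, and the arithmetic can be checked with a few lines of code. The sketch below is not an official solution. The variable names are illustrative, and it assumes the three probabilities are read exactly as stated: P(+|H) = 0.06, P(-|D) = 0.038, and P(D) = 0.02.

```python
# Quantities given in the problem statement.
p_disease = 0.02          # P(D): prevalence in the population
p_false_positive = 0.06   # P(+ | H): healthy person tests positive
p_false_negative = 0.038  # P(- | D): affected person tests negative

# Complements derived from the given values.
p_healthy = 1 - p_disease                 # P(H)
p_true_positive = 1 - p_false_negative    # P(+ | D)

# Law of total probability: overall chance of a positive result.
p_positive = p_true_positive * p_disease + p_false_positive * p_healthy

# Bayes' theorem: P(D | +) = P(+ | D) * P(D) / P(+).
p_disease_given_positive = p_true_positive * p_disease / p_positive

if __name__ == "__main__":
    print("P(+)     =", round(p_positive, 5))
    print("P(D | +) =", round(p_disease_given_positive, 4))
```

Writing out P(+) as a separate step mirrors the manual derivation, where the denominator is the sum over both the affected and the healthy cases.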
