UNIVERSITY OF TECHNOLOGY - VNUHCM
Faculty of Applied Science
Final Exam
year 1 2022-2023
Date: December 24th, 2022
Course title: Probability and Statistics
Course ID: MT2013
Question sheet code: 2211
Duration: 100 minutes
Shift: 16:00
Instructions to students:
- You are allowed to use your OWN materials and calculator. Total available score: 10.
- At the beginning of the working time, you MUST fill in your full name and student ID on this question
sheet. There are 22 questions on 4 pages. Do not round between steps. Round your final answers to 4
decimal places.
Questions 1 through 3. An e-mail filter is planned to separate valid e-mails from spam. The word
"free" occurs in 20% of the spam messages and only 5% of the valid messages. Also, 10% of the messages
are spam.
1. Find the probability that the message contains the word "free".
A 0.365 B 0.065 C 0.165 D 0.265 E 0.465
P(free) = 0.1 * 0.2 + 0.9 * 0.05 = 0.065 => B
2. Find the probability that the message is spam given that it contains the word "free".
A 0.0077 B 0.3077 C 0.5077 D 0.6077 E 0.4077
P(spam | free) = P(spam & free) / P(free) = 0.1 * 0.2 / 0.065 = 0.3077 => B
3. Compute the probability that the message is spam or contains the word "free".
A 0.145 B 0.445 C 0.545 D 0.345 E 0.245
P(spam or free) = 1 - P(valid & no "free") = 1 - 0.9 * (1 - 0.05) = 0.145 => A
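The three probabilities in Questions 1 through 3 can be checked numerically. The following is a minimal sketch using only the figures given in the problem statement; the variable names are illustrative and not part of the exam.

```python
# Probabilities taken from the problem statement.
p_spam = 0.10                # P(spam)
p_free_given_spam = 0.20     # P("free" | spam)
p_free_given_valid = 0.05    # P("free" | valid)
p_valid = 1 - p_spam         # P(valid)

# Question 1: law of total probability.
p_free = p_spam * p_free_given_spam + p_valid * p_free_given_valid

# Question 2: Bayes' rule.
p_spam_given_free = p_spam * p_free_given_spam / p_free

# Question 3: inclusion-exclusion, P(A or B) = P(A) + P(B) - P(A and B).
p_spam_or_free = p_spam + p_free - p_spam * p_free_given_spam

print(round(p_free, 4), round(p_spam_given_free, 4), round(p_spam_or_free, 4))
```

Question 3 is computed here by inclusion-exclusion rather than through the complement; both routes give the same value.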
Questions 4 through 8.
A particular brand of diet margarine was analyzed to determine the level of polyunsaturated fatty acid
(in percentages). A random sample of 5 packages resulted in the following data: 15, 14.2, 17.8, 14.1,
16.7.
It is assumed that the level of polyunsaturated fatty acid follows a normal distribution and a significance
level of 0.1 is used. Scientists want to know if the data show enough evidence to prove that the average
level of polyunsaturated fatty acid is not equal to 13.7 (%).
mean_sample = u = 15.56
std_sample = sqrt(((15-u)^2 + ... ) / 4) = 1.6288
4. Find the estimated standard deviation of the sample mean.
A 0.7284 B 0.0284 C 1.9284 D 2.4284 E 2.6284
std_SM = std_sample / sqrt(5) = 0.7284 => A
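The sample mean, sample standard deviation, and estimated standard error for Question 4 can be reproduced as follows. This is a small sketch on the five data points above; the variable names are illustrative.

```python
import math
import statistics

# Polyunsaturated fatty acid levels (%) from the five sampled packages.
levels = [15, 14.2, 17.8, 14.1, 16.7]

sample_mean = statistics.mean(levels)
# statistics.stdev uses the n - 1 denominator (sample standard deviation).
sample_std = statistics.stdev(levels)
# Estimated standard deviation (standard error) of the sample mean.
std_error = sample_std / math.sqrt(len(levels))

print(round(sample_mean, 4), round(sample_std, 4), round(std_error, 4))
```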
8. If the population variance of the polyunsaturated fatty acid levels is assumed to be 1.9, how many
packages must be collected to ensure that the radius of a 90% two-sided confidence interval for the
population mean is at most 0.25?
A 79 B 82 C 78 D 76 E 87
z_0.05 * sqrt(1.9) / sqrt(N) <= 0.25 => N >= (z_0.05 * sqrt(1.9) / 0.25)^2 = 82.2...
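The lower bound on the sample size in Question 8 follows from requiring z * sigma / sqrt(N) to be at most the target radius. The sketch below evaluates that bound with the standard normal quantile from Python's standard library. It assumes a known-variance z-interval, as the question's wording suggests.

```python
import math
from statistics import NormalDist

variance = 1.9      # assumed population variance (from the question)
radius = 0.25       # maximum allowed half-width of the interval
confidence = 0.90   # two-sided confidence level

alpha = 1 - confidence
# Upper alpha/2 quantile of the standard normal distribution, z_0.05.
z = NormalDist().inv_cdf(1 - alpha / 2)

# Solve z * sigma / sqrt(N) <= radius for N.
n_bound = (z * math.sqrt(variance) / radius) ** 2

print(round(z, 4), round(n_bound, 4))
```

The bound lands slightly above 82. Which integer the question sheet intends depends on how z_0.05 is rounded in the table being used, which the source does not state.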
11. Find the estimated standard error for the fitted slope coefficient β̂1 .
A 0.4053 B 0.399 C 0.0197 D 0.2364 E 0.1468
14. Find the coefficient of determination for the linear regression model.
A 84.8381 B 94.3945 C 97.1568 D 86.3936 E 75.5051
Questions 15 through 20. An article in Communications of the ACM reported on a study of different
algorithms for estimating software development costs. Three algorithms were applied to 15 software
development projects and the development costs (hours) were observed. The data are given as below.
Algorithm 1 4.7 4.9 5.8 4.8 3.7
Algorithm 2 6.7 5.8 7.7 6 8.7
Algorithm 3 8.6 9.6 9.5 10 10.8
Consider an ANOVA situation with a significance level α = 0.01.
15. Choose the correct quantity to describe the total variability between treatment means.
A 60.7413 B 672.7391 C 71.4373 D 300.7403 E 10.696
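The sums of squares for the one-way ANOVA in Questions 15 through 20 can be computed directly from the table. This is a minimal sketch assuming the usual decomposition SS_total = SS_treatments + SS_error; the names are illustrative.

```python
from statistics import mean

# Development costs (hours) for each algorithm, from the table above.
costs = {
    "Algorithm 1": [4.7, 4.9, 5.8, 4.8, 3.7],
    "Algorithm 2": [6.7, 5.8, 7.7, 6.0, 8.7],
    "Algorithm 3": [8.6, 9.6, 9.5, 10.0, 10.8],
}

all_values = [x for group in costs.values() for x in group]
grand_mean = mean(all_values)

# Between-treatment variability: group size times squared deviation
# of each group mean from the grand mean.
ss_treatments = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in costs.values())

# Within-treatment (error) variability.
ss_error = sum((x - mean(g)) ** 2 for g in costs.values() for x in g)

# Total variability around the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

print(round(ss_treatments, 4), round(ss_error, 4), round(ss_total, 4))
```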
18. Find the least significant difference (LSD) for Fisher’s multiple comparison.
A 1.8242 B 0.8843 C 2.5643 D 3.2843 E 2.0393
20. Find a 99% confidence interval for the difference in the mean costs between algorithms 1 and 2.
A [-3.1356,0.5128] B [-7.0241,-3.3757] C [-4.0242,-0.3758] D [-1.4691,2.1793] E [-3.6911,-0.0427]
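Fisher's LSD and the confidence interval for the difference between algorithms 1 and 2 both use the ANOVA mean square error. The sketch below assumes equal group sizes and takes the critical value t_{0.005, 12} = 3.055 from a standard t table. That constant is an assumption entered by hand, since the Python standard library has no t quantile function.

```python
import math
from statistics import mean

alg1 = [4.7, 4.9, 5.8, 4.8, 3.7]
alg2 = [6.7, 5.8, 7.7, 6.0, 8.7]
alg3 = [8.6, 9.6, 9.5, 10.0, 10.8]
groups = [alg1, alg2, alg3]

n = len(alg1)                                          # observations per group
df_error = sum(len(g) for g in groups) - len(groups)   # 15 - 3 = 12
ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)
ms_error = ss_error / df_error

# ASSUMPTION: t table value for alpha/2 = 0.005 and 12 degrees of freedom.
T_CRIT = 3.055

# Least significant difference for equal group sizes.
lsd = T_CRIT * math.sqrt(2 * ms_error / n)

# 99% confidence interval for mu_1 - mu_2.
diff = mean(alg1) - mean(alg2)
ci = (diff - lsd, diff + lsd)

print(round(lsd, 4), tuple(round(v, 4) for v in ci))
```

With a more precise critical value the last digit may shift slightly.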
21. A factory has 2 firms producing the same type of product. The numbers of errors per product
produced by firm A and firm B follow Poisson distributions with means of 0.1 and 0.2 respectively.
Furthermore, errors occur independently between products regardless of the producing firm.
(a) Suppose that the proportion of products produced by firm A is 0.25. In a random sample of 15
products produced by this factory, find the probability that there are more than 12 products
that have exactly 3 errors.
(b) In a random sample of 100 products produced by firm A, find the probability that there are
from 60 to 95 products that have at least 1 error.
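One way to set up Question 21 is to treat each product as an independent Bernoulli trial, so the counts asked for are binomial. The sketch below follows that reading and sums the binomial probabilities exactly. It is an illustration under that assumption, not an official solution, and the helper names are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def binomial_range(n, p, lo, hi):
    """P(lo <= X <= hi) for X ~ Binomial(n, p), by direct summation."""
    return sum(
        math.comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(lo, hi + 1)
    )

# (a) Probability that a random product has exactly 3 errors:
# mix the two firms with weights 0.25 (firm A) and 0.75 (firm B).
p_three = 0.25 * poisson_pmf(3, 0.1) + 0.75 * poisson_pmf(3, 0.2)
# More than 12 of 15 products with exactly 3 errors.
prob_a = binomial_range(15, p_three, 13, 15)

# (b) Probability that a product from firm A has at least 1 error.
p_any = 1 - poisson_pmf(0, 0.1)
# From 60 to 95 of 100 firm-A products with at least 1 error.
prob_b = binomial_range(100, p_any, 60, 95)

print(p_three, prob_a)
print(p_any, prob_b)
```

Both per-product probabilities are small, so the requested probabilities come out extremely close to zero. The exact summation avoids the poor accuracy a normal approximation would have this far into the tail.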
22. Ten adult males between the ages of 35 and 50 participated in a study to evaluate the effect of diet
and exercise on blood cholesterol levels. The total cholesterol was measured in each subject initially
and then three months after participating in an aerobic exercise program and switching to a low-fat
diet. The data are shown in the following table.
Before 230 243 256 260 295 283 212 287 269 272
After 229 240 267 257 280 280 230 280 270 205
Suppose that the blood cholesterol levels of adult males between the ages of 35 and 50 follow a
normal distribution. Do the data support the claim that low-fat diet and aerobic exercise are of
value in producing a mean reduction in blood cholesterol levels at the significance level α = 0.05?
22)
H_0: mu_before - mu_after <= 0
H_1: mu_before - mu_after > 0
u_before = 260.7, u_after = 253.8 => u_before - u_after = 6.9
var_before = 690.2333, var_after = 695.5111 => population variances assumed equal
s_before = 26.2723, s_after = 26.3725
s_p = sqrt(((n_1 - 1) * s_1^2 + (n_2 - 1) * s_2^2) / (n_1 + n_2 - 2)) = 26.3225, where s_p is the pooled standard deviation
std_(u_before - u_after) = s_p * sqrt(1/n_1 + 1/n_2) = 11.772
Test statistic = 6.9 / 11.772 = 0.5861
t_0.05_df_18 = 1.734; 0.5861 < 1.734 => Fail to reject H_0
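Because the same ten subjects are measured before and after the program, the observations are paired, and a paired t-test on the per-subject differences is the usual alternative to a pooled two-sample calculation. The sketch below runs that test on the table's data. The critical value t_{0.05, 9} = 1.833 is an assumption taken from a standard t table and entered by hand.

```python
import math
import statistics

before = [230, 243, 256, 260, 295, 283, 212, 287, 269, 272]
after = [229, 240, 267, 257, 280, 280, 230, 280, 270, 205]

# Per-subject reduction in cholesterol (before minus after).
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)

mean_diff = statistics.mean(diffs)
std_diff = statistics.stdev(diffs)      # sample std, n - 1 denominator

# Paired t statistic for H_0: mean difference <= 0 vs H_1: mean difference > 0.
t_stat = mean_diff / (std_diff / math.sqrt(n))

# ASSUMPTION: one-sided t table value for alpha = 0.05 and n - 1 = 9 df.
T_CRIT = 1.833
reject_h0 = t_stat > T_CRIT

print(round(mean_diff, 4), round(std_diff, 4), round(t_stat, 4), reject_h0)
```

Under this paired analysis the statistic also falls below its critical value, so at α = 0.05 the data do not show a significant mean reduction.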