Semester/Acad. year 1 2022-2023
Final Exam Date December 24th, 2022
Course title Probability and Statistics
UNIVERSITY OF TECHNOLOGY - VNUHCM Course ID MT2013 Question sheet code 2211
Faculty of Applied Science Duration 100 minutes Shift 16:00
Instructions to students:
- You are allowed to use your OWN materials and calculator. Total available score: 10.
- At the beginning of the working time, you MUST fill in your full name and student ID on this question
sheet. There are 22 questions on 4 pages. Do not round between steps. Round your final answers to 4
decimal places.
Questions 1 through 3. An e-mail filter is planned to separate valid e-mails from spam. The word
"free" occurs in 20% of the spam messages and only 5% of the valid messages. Also, 10% of the messages
are spam.
1. Find the probability that the message contains the word "free".
A 0.365 B 0.065 C 0.165 D 0.265 E 0.465
2. Find the probability that the message is spam given that it contains the word "free".
A 0.0077 B 0.3077 C 0.5077 D 0.6077 E 0.4077
3. Compute the probability that the message is spam or contains the word "free".
A 0.145 B 0.445 C 0.545 D 0.345 E 0.245
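The three quantities in Questions 1 through 3 follow from the law of total probability, Bayes' rule, and the addition rule. The sketch below is only an illustration with the percentages of this sheet; the function and variable names are hypothetical and not part of the exam.

```python
def word_filter_probabilities(p_spam, p_word_given_spam, p_word_given_valid):
    """Return P(word), P(spam | word) and P(spam or word)."""
    p_valid = 1.0 - p_spam
    # Law of total probability over the two message classes.
    p_word = p_spam * p_word_given_spam + p_valid * p_word_given_valid
    # Bayes' rule.
    p_spam_given_word = p_spam * p_word_given_spam / p_word
    # Addition rule: P(S or F) = P(S) + P(F) - P(S and F).
    p_spam_or_word = p_spam + p_word - p_spam * p_word_given_spam
    return p_word, p_spam_given_word, p_spam_or_word


# Values stated on this sheet: 10% spam, "free" in 20% of spam and 5% of valid mail.
p_word, p_spam_given_word, p_spam_or_word = word_filter_probabilities(0.10, 0.20, 0.05)
print(round(p_word, 4), round(p_spam_given_word, 4), round(p_spam_or_word, 4))
```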
Questions 4 through 8.
A particular brand of diet margarine was analyzed to determine the level of polyunsaturated fatty acid
(in percentages). A random sample of 5 packages resulted in the following data: 15, 14.2, 17.8, 14.1,
16.7.
It is assumed that the level of polyunsaturated fatty acid follows a normal distribution and a significance
level of 0.1 is used. Scientists want to know if the data show enough evidence to prove that the average
level of polyunsaturated fatty acid is not equal to 13.7 (%).
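A scenario like this one is commonly handled with a one-sample t-test on the mean, since the population variance is unknown and the sample is small. The sketch below assumes that reading; the critical value 2.132 is the usual table entry for t with 4 degrees of freedom and upper-tail area 0.05, entered by hand, and all names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

sample = [15, 14.2, 17.8, 14.1, 16.7]   # data from this sheet
mu_0 = 13.7                             # hypothesized mean (%)

n = len(sample)
x_bar = mean(sample)
s = stdev(sample)                       # sample standard deviation (n - 1 divisor)
t_stat = (x_bar - mu_0) / (s / sqrt(n))

# Assumption: table value t_{0.05, 4} for a two-sided test at level 0.1.
t_crit = 2.132
reject_h0 = abs(t_stat) > t_crit
print(round(x_bar, 4), round(s, 4), round(t_stat, 4), reject_h0)
```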
8. If the population variance of the polyunsaturated fatty acid levels is assumed to be 1.9, how many
packages must be collected to ensure that the radius of a 90% two-sided confidence interval for the
population mean is at most 0.25?
A 79 B 82 C 78 D 76 E 87
11. Find the estimated standard error for the fitted slope coefficient β̂1 .
A 0.4053 B 0.399 C 0.0197 D 0.2364 E 0.1468
14. Find the coefficient of determination for the linear regression model.
A 84.8381 B 94.3945 C 97.1568 D 86.3936 E 75.5051
Questions 15 through 20. An article in Communications of the ACM reported on a study of different
algorithms for estimating software development costs. Three algorithms were applied to 15 software
development projects and the development costs (hours) were observed. The data are given as below.
Algorithm 1 4.7 4.9 5.8 4.8 3.7
Algorithm 2 6.7 5.8 7.7 6 8.7
Algorithm 3 8.6 9.6 9.5 10 10.8
Consider an ANOVA situation with a significance level α = 0.01.
15. Choose the correct quantity to describe the total variability between treatment means.
A 60.7413 B 672.7391 C 71.4373 D 300.7403 E 10.696
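Reading "total variability between treatment means" as the between-treatment sum of squares of a one-way ANOVA (an interpretation, not stated on the sheet), the decomposition can be sketched as follows. Variable names are hypothetical.

```python
groups = {
    "Algorithm 1": [4.7, 4.9, 5.8, 4.8, 3.7],
    "Algorithm 2": [6.7, 5.8, 7.7, 6, 8.7],
    "Algorithm 3": [8.6, 9.6, 9.5, 10, 10.8],
}

all_values = [x for g in groups.values() for x in g]
grand_mean = sum(all_values) / len(all_values)

# Between-treatment sum of squares: group size times squared deviation
# of each group mean from the grand mean.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
# Within-treatment (error) sum of squares.
ss_error = sum((x - sum(g) / len(g)) ** 2 for g in groups.values() for x in g)
# Total sum of squares; equals ss_between + ss_error.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

print(round(ss_between, 4), round(ss_error, 4), round(ss_total, 4))
```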
18. Find the least significant difference (LSD) for Fisher’s multiple comparison.
A 1.8242 B 0.8843 C 2.5643 D 3.2843 E 2.0393
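For equal group sizes n, Fisher's LSD is usually written as t_{α/2, N−a} · sqrt(2·MSE/n). The sketch below applies that formula to the data of this sheet. The critical value 3.055 is a hand-entered table value for t with 12 degrees of freedom and upper-tail area 0.005, so it is an assumption rather than something computed here; names are hypothetical.

```python
from math import sqrt

groups = [
    [4.7, 4.9, 5.8, 4.8, 3.7],    # Algorithm 1
    [6.7, 5.8, 7.7, 6, 8.7],      # Algorithm 2
    [8.6, 9.6, 9.5, 10, 10.8],    # Algorithm 3
]

n = len(groups[0])                 # observations per treatment
a = len(groups)                    # number of treatments
df_error = a * (n - 1)             # 12 error degrees of freedom

# Mean square error: pooled within-group variation.
mse = sum((x - sum(g) / n) ** 2 for g in groups for x in g) / df_error

t_crit = 3.055                     # assumption: table value t_{0.005, 12}
lsd = t_crit * sqrt(2 * mse / n)
print(round(mse, 4), round(lsd, 4))
```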
20. Find a 99% confidence interval for the difference in the mean costs between algorithms 1 and 2.
A [-3.1356,0.5128] B [-7.0241,-3.3757] C [-4.0242,-0.3758] D [-1.4691,2.1793] E [-3.6911,-0.0427]
21. A factory has 2 firms producing the same type of product. The numbers of errors per product
produced by firm A and firm B follow Poisson distributions with means of 0.1 and 0.2 respectively.
Furthermore, errors occur independently between products, regardless of the producing firm.
(a) Suppose that the proportion of products produced by firm A is 0.25. In a random sample of 15
products produced by this factory, find the probability that there are more than 12 products
that have exactly 3 errors.
(b) In a random sample of 100 products produced by firm A, find the probability that there are
from 60 to 95 products that have at least 1 error.
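One way to set up both parts is to turn each product into a Bernoulli trial and sum exact binomial probabilities. The sketch below assumes that a sampled product comes from firm A with probability 0.25 and from firm B otherwise, and that products are independent, as the question states. Helper names are hypothetical, and a normal approximation could be used instead of the exact sum.

```python
from math import comb, exp, factorial


def poisson_pmf(k, mean):
    """P(X = k) for X ~ Poisson(mean)."""
    return exp(-mean) * mean ** k / factorial(k)


def binomial_range(n, p, lo, hi):
    """P(lo <= X <= hi) for X ~ Binomial(n, p), summed exactly."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(lo, hi + 1))


# (a) A product has exactly 3 errors: mix the two firms by their proportions.
p_three = 0.25 * poisson_pmf(3, 0.1) + 0.75 * poisson_pmf(3, 0.2)
prob_a = binomial_range(15, p_three, 13, 15)      # more than 12 out of 15

# (b) A firm A product has at least 1 error.
p_at_least_one = 1 - poisson_pmf(0, 0.1)
prob_b = binomial_range(100, p_at_least_one, 60, 95)

print(p_three, prob_a)
print(p_at_least_one, prob_b)
```

Both probabilities are extremely small, so they are printed unrounded.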
22. Ten adult males between the ages of 35 and 50 participated in a study to evaluate the effect of diet
and exercise on blood cholesterol levels. The total cholesterol was measured in each subject initially
and then three months after participating in an aerobic exercise program and switching to a low-fat
diet. The data are shown in the following table.
Before 230 243 256 260 295 283 212 287 269 272
After 229 240 267 257 280 280 230 280 270 205
Suppose that the blood cholesterol levels of adult males between the ages of 35 and 50 follow a
normal distribution. Do the data support the claim that low-fat diet and aerobic exercise are of
value in producing a mean reduction in blood cholesterol levels at the significance level α = 0.05?
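Because each subject is measured twice, a paired t-test on the differences Before − After is a natural setup, with the one-sided alternative that the mean difference is positive. The sketch below follows that reading; the critical value 1.833 is a hand-entered table value for t with 9 degrees of freedom and upper-tail area 0.05, and the names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

before = [230, 243, 256, 260, 295, 283, 212, 287, 269, 272]
after = [229, 240, 267, 257, 280, 280, 230, 280, 270, 205]

# Paired differences; a positive value means cholesterol went down.
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)                     # sample standard deviation of the differences
t_stat = d_bar / (s_d / sqrt(n))

# Assumption: table value t_{0.05, 9} for the one-sided test H1: mean difference > 0.
t_crit = 1.833
reject_h0 = t_stat > t_crit
print(round(d_bar, 4), round(s_d, 4), round(t_stat, 4), reject_h0)
```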
Semester/Acad. year 1 2022-2023
Final Exam Date December 24th, 2022
Course title Probability and Statistics
UNIVERSITY OF TECHNOLOGY - VNUHCM Course ID MT2013 Question sheet code 2212
Faculty of Applied Science Duration 100 minutes Shift 16:00
Instructions to students:
- You are allowed to use your OWN materials and calculator. Total available score: 10.
- At the beginning of the working time, you MUST fill in your full name and student ID on this question
sheet. There are 22 questions on 4 pages. Do not round between steps. Round your final answers to 4
decimal places.
Questions 1 through 3. An e-mail filter is planned to separate valid e-mails from spam. The word
"free" occurs in 20% of the spam messages and only 5% of the valid messages. Also, 15% of the messages
are spam.
1. Find the probability that the message contains the word "free".
A 0.4725 B 0.1725 C 0.2725 D 0.3725 E 0.0725
2. Find the probability that the message is spam given that it contains the word "free".
A 0.0138 B 0.8138 C 0.7138 D 0.4138 E 0.5138
3. Compute the probability that the message is spam or contains the word "free".
A 0.2925 B 0.0925 C 0.1925 D 0.4925 E 0.3925
Questions 4 through 8.
A particular brand of diet margarine was analyzed to determine the level of polyunsaturated fatty acid
(in percentages). A random sample of 6 packages resulted in the following data: 15.7, 15.3, 16.6, 16.1,
14.6, 15.3.
It is assumed that the level of polyunsaturated fatty acid follows a normal distribution and a significance
level of 0.01 is used. Scientists want to know if the data show enough evidence to prove that the average
level of polyunsaturated fatty acid is less than 15.2 (%).
8. If the population variance of the polyunsaturated fatty acid levels is assumed to be 1, how many
packages must be collected to ensure that the radius of a 99% two-sided confidence interval for the
population mean is at most 0.3?
A 74 B 80 C 68 D 83 E 75
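With a known variance, the half-width of a two-sided z-interval is z_{α/2}·σ/√n, so the smallest n with half-width at most E is the ceiling of (z_{α/2}·σ/E)². The sketch below uses the exact normal quantile from the standard library; a rounded table value of z can shift the result by one when the bound lies close to an integer. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_radius(variance, radius, confidence):
    """Smallest n such that z * sigma / sqrt(n) <= radius."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)    # upper alpha/2 quantile of N(0, 1)
    return ceil(z ** 2 * variance / radius ** 2)


# Values from this sheet: variance 1, 99% confidence, radius at most 0.3.
n_required = sample_size_for_radius(variance=1, radius=0.3, confidence=0.99)
print(n_required)
```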
11. Find the estimated standard error for the fitted slope coefficient β̂1 .
A 0.1035 B 0.3198 C 0.1082 D 0.2824 E 0.0168
14. Find the coefficient of determination for the linear regression model.
A 97.3693 B 76.9244 C 98.6759 D 92.4794 E 95.5904
Questions 15 through 20. An article in Communications of the ACM reported on a study of different
algorithms for estimating software development costs. Three algorithms were applied to 12 software
development projects and the development costs (hours) were observed. The data are given as below.
Algorithm 1 6.3 6.6 5 6.4
Algorithm 2 9.9 8.2 6.6 7.5
Algorithm 3 2.4 2.2 1.5 4.4
Consider an ANOVA situation with a significance level α = 0.05.
15. Choose the correct quantity to describe the total variability between treatment means.
A 12.085 B 72.3967 C 920.3087 D 60.3117 E 734.3093
18. Find the least significant difference (LSD) for Fisher’s multiple comparison.
A 0.6785 B 2.2385 C 3.6535 D 1.8534 E 3.3235
20. Find a 95% confidence interval for the difference in the mean costs between algorithms 1 and 2.
A [-6.2728,-2.566] B [-3.4953,0.2115] C [-2.9398,0.767] D [-3.8284,-0.1216] E [-6.8283,-3.1215]
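In a one-way ANOVA with equal group sizes, an interval for the difference of two treatment means is usually built from the pooled mean square error: (ȳ1 − ȳ2) ± t_{α/2, N−a}·sqrt(2·MSE/n). The sketch below applies this to the data of this sheet. The critical value 2.262 is a hand-entered table value for t with 9 degrees of freedom and upper-tail area 0.025, and the names are hypothetical.

```python
from math import sqrt

groups = [
    [6.3, 6.6, 5, 6.4],      # Algorithm 1
    [9.9, 8.2, 6.6, 7.5],    # Algorithm 2
    [2.4, 2.2, 1.5, 4.4],    # Algorithm 3
]

n = len(groups[0])                       # observations per treatment
df_error = len(groups) * (n - 1)         # 9 error degrees of freedom
means = [sum(g) / n for g in groups]

# Pooled mean square error over all three treatments.
mse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / df_error

t_crit = 2.262                           # assumption: table value t_{0.025, 9}
half_width = t_crit * sqrt(2 * mse / n)
diff = means[0] - means[1]               # Algorithm 1 minus Algorithm 2
lower, upper = diff - half_width, diff + half_width
print(round(lower, 4), round(upper, 4))
```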
21. A factory has 2 firms producing the same type of product. The numbers of errors per product
produced by firm A and firm B follow Poisson distributions with means of 0.1 and 0.2 respectively.
Furthermore, errors occur independently between products, regardless of the producing firm.
(a) Suppose that the proportion of products produced by firm A is 0.25. In a random sample of 15
products produced by this factory, find the probability that there are more than 12 products
that have exactly 3 errors.
(b) In a random sample of 100 products produced by firm A, find the probability that there are
from 60 to 95 products that have at least 1 error.
22. Ten adult males between the ages of 35 and 50 participated in a study to evaluate the effect of diet
and exercise on blood cholesterol levels. The total cholesterol was measured in each subject initially
and then three months after participating in an aerobic exercise program and switching to a low-fat
diet. The data are shown in the following table.
Before 230 243 256 260 295 283 212 287 269 272
After 229 240 267 257 280 280 230 280 270 205
Suppose that the blood cholesterol levels of adult males between the ages of 35 and 50 follow a
normal distribution. Do the data support the claim that low-fat diet and aerobic exercise are of
value in producing a mean reduction in blood cholesterol levels at the significance level α = 0.05?
Semester/Acad. year 1 2022-2023
Final Exam Date December 24th, 2022
Course title Probability and Statistics
UNIVERSITY OF TECHNOLOGY - VNUHCM Course ID MT2013 Question sheet code 2213
Faculty of Applied Science Duration 100 minutes Shift 16:00
Instructions to students:
- You are allowed to use your OWN materials and calculator. Total available score: 10.
- At the beginning of the working time, you MUST fill in your full name and student ID on this question
sheet. There are 22 questions on 4 pages. Do not round between steps. Round your final answers to 4
decimal places.
Questions 1 through 3. An e-mail filter is planned to separate valid e-mails from spam. The word
"free" occurs in 15% of the spam messages and only 5% of the valid messages. Also, 15% of the messages
are spam.
1. Find the probability that the message contains the word "free".
A 0.165 B 0.065 C 0.465 D 0.265 E 0.365
2. Find the probability that the message is spam given that it contains the word "free".
A 0.0462 B 0.3462 C 0.1462 D 0.5462 E 0.7462
3. Compute the probability that the message is spam or contains the word "free".
A 0.5925 B 0.0925 C 0.1925 D 0.4925 E 0.2925
Questions 4 through 8.
A particular brand of diet margarine was analyzed to determine the level of polyunsaturated fatty acid
(in percentages). A random sample of 5 packages resulted in the following data: 13.5, 15.5, 16.4, 15.4,
14.5.
It is assumed that the level of polyunsaturated fatty acid follows a normal distribution and a significance
level of 0.1 is used. Scientists want to know if the data show enough evidence to prove that the average
level of polyunsaturated fatty acid is not equal to 13.9 (%).
8. If the population variance of the polyunsaturated fatty acid levels is assumed to be 0.8, how many
packages must be collected to ensure that the radius of a 90% two-sided confidence interval for the
population mean is at most 0.3?
A 18 B 25 C 21 D 24 E 32
11. Find the estimated standard error for the fitted slope coefficient β̂1 .
A 0.0226 B 0.3584 C 0.1631 D 0.1677 E 0.2807
14. Find the coefficient of determination for the linear regression model.
A 83.8238 B 94.7123 C 88.7137 D 94.188 E 77.6018
Questions 15 through 20. An article in Communications of the ACM reported on a study of different
algorithms for estimating software development costs. Three algorithms were applied to 15 software
development projects and the development costs (hours) were observed. The data are given as below.
Algorithm 1 10 9.1 10.8 10.6 9.2
Algorithm 2 6.2 6.1 6 6.4 7.9
Algorithm 3 2.8 6.6 4.3 3.5 4.2
Consider an ANOVA situation with a significance level α = 0.01.
15. Choose the correct quantity to describe the total variability between treatment means.
A 104.249 B 259.2485 C 94.3373 D 13.088 E 81.2493
18. Find the least significant difference (LSD) for Fisher’s multiple comparison.
A 1.7779 B 1.1229 C 2.0178 D 2.6679 E 2.9429
20. Find a 99% confidence interval for the difference in the mean costs between algorithms 1 and 2.
A [3.9573,7.9929] B [-1.5977,2.4379] C [1.4022,5.4378] D [3.4018,7.4374] E [0.0688,4.1044]
21. A factory has 2 firms producing the same type of product. The numbers of errors per product
produced by firm A and firm B follow Poisson distributions with means of 0.1 and 0.2 respectively.
Furthermore, errors occur independently between products, regardless of the producing firm.
(a) Suppose that the proportion of products produced by firm A is 0.25. In a random sample of 15
products produced by this factory, find the probability that there are more than 12 products
that have exactly 3 errors.
(b) In a random sample of 100 products produced by firm A, find the probability that there are
from 60 to 95 products that have at least 1 error.
22. Ten adult males between the ages of 35 and 50 participated in a study to evaluate the effect of diet
and exercise on blood cholesterol levels. The total cholesterol was measured in each subject initially
and then three months after participating in an aerobic exercise program and switching to a low-fat
diet. The data are shown in the following table.
Before 230 243 256 260 295 283 212 287 269 272
After 229 240 267 257 280 280 230 280 270 205
Suppose that the blood cholesterol levels of adult males between the ages of 35 and 50 follow a
normal distribution. Do the data support the claim that low-fat diet and aerobic exercise are of
value in producing a mean reduction in blood cholesterol levels at the significance level α = 0.05?
Semester/Acad. year 1 2022-2023
Final Exam Date December 24th, 2022
Course title Probability and Statistics
UNIVERSITY OF TECHNOLOGY - VNUHCM Course ID MT2013 Question sheet code 2214
Faculty of Applied Science Duration 100 minutes Shift 16:00
Instructions to students:
- You are allowed to use your OWN materials and calculator. Total available score: 10.
- At the beginning of the working time, you MUST fill in your full name and student ID on this question
sheet. There are 22 questions on 4 pages. Do not round between steps. Round your final answers to 4
decimal places.
Questions 1 through 3. An e-mail filter is planned to separate valid e-mails from spam. The word
"free" occurs in 20% of the spam messages and only 5% of the valid messages. Also, 10% of the messages
are spam.
1. Find the probability that the message contains the word "free".
A 0.365 B 0.465 C 0.165 D 0.065 E 0.265
2. Find the probability that the message is spam given that it contains the word "free".
A 0.2077 B 0.4077 C 0.3077 D 0.1077 E 0.0077
3. Compute the probability that the message is spam or contains the word "free".
A 0.345 B 0.045 C 0.245 D 0.145 E 0.445
Questions 4 through 8.
A particular brand of diet margarine was analyzed to determine the level of polyunsaturated fatty acid
(in percentages). A random sample of 7 packages resulted in the following data: 14.5, 19.5, 15.3, 13,
18.3, 18.9, 16.
It is assumed that the level of polyunsaturated fatty acid follows a normal distribution and a significance
level of 0.1 is used. Scientists want to know if the data show enough evidence to prove that the average
level of polyunsaturated fatty acid is not equal to 17 (%).
8. If the population variance of the polyunsaturated fatty acid levels is assumed to be 1, how many
packages must be collected to ensure that the radius of a 90% two-sided confidence interval for the
population mean is at most 0.2?
A 68 B 66 C 61 D 70 E 75
11. Find the estimated standard error for the fitted slope coefficient β̂1 .
A 0.335 B 0.0857 C 0.0262 D 0.425 E 0.0921
14. Find the coefficient of determination for the linear regression model.
A 89.1966 B 59.1154 C 76.2259 D 60.6709 E 79.5603
Questions 15 through 20. An article in Communications of the ACM reported on a study of different
algorithms for estimating software development costs. Three algorithms were applied to 15 software
development projects and the development costs (hours) were observed. The data are given as below.
Algorithm 1 7.2 6.9 8.1 7.5 8.5
Algorithm 2 4 3.5 5.1 4.2 3
Algorithm 3 9.4 8.9 8.6 9.6 8.7
Consider an ANOVA situation with a significance level α = 0.01.
15. Choose the correct quantity to describe the total variability between treatment means.
A 711.8457 B 4.976 C 68.848 D 804.8454 E 73.824
18. Find the least significant difference (LSD) for Fisher’s multiple comparison.
A 1.2793 B 2.4693 C 1.2442 D 3.1943 E 2.7193
20. Find a 99% confidence interval for the difference in the mean costs between algorithms 1 and 2.
A [4.9909,7.4793] B [4.4354,6.9238] C [3.3244,5.8128] D [-0.0086,2.4798] E [2.4358,4.9242]
21. A factory has 2 firms producing the same type of product. The numbers of errors per product
produced by firm A and firm B follow Poisson distributions with means of 0.1 and 0.2 respectively.
Furthermore, errors occur independently between products, regardless of the producing firm.
(a) Suppose that the proportion of products produced by firm A is 0.25. In a random sample of 15
products produced by this factory, find the probability that there are more than 12 products
that have exactly 3 errors.
(b) In a random sample of 100 products produced by firm A, find the probability that there are
from 60 to 95 products that have at least 1 error.
22. Ten adult males between the ages of 35 and 50 participated in a study to evaluate the effect of diet
and exercise on blood cholesterol levels. The total cholesterol was measured in each subject initially
and then three months after participating in an aerobic exercise program and switching to a low-fat
diet. The data are shown in the following table.
Before 230 243 256 260 295 283 212 287 269 272
After 229 240 267 257 280 280 230 280 270 205
Suppose that the blood cholesterol levels of adult males between the ages of 35 and 50 follow a
normal distribution. Do the data support the claim that low-fat diet and aerobic exercise are of
value in producing a mean reduction in blood cholesterol levels at the significance level α = 0.05?
Semester/Acad. year 1 2022-2023
Final Exam Date December 24th, 2022
Course title Probability and Statistics
UNIVERSITY OF TECHNOLOGY - VNUHCM Course ID MT2013 Question sheet code 2215
Faculty of Applied Science Duration 100 minutes Shift 16:00
Instructions to students:
- You are allowed to use your OWN materials and calculator. Total available score: 10.
- At the beginning of the working time, you MUST fill in your full name and student ID on this question
sheet. There are 22 questions on 4 pages. Do not round between steps. Round your final answers to 4
decimal places.
Questions 1 through 3. An e-mail filter is planned to separate valid e-mails from spam. The word
"free" occurs in 20% of the spam messages and only 10% of the valid messages. Also, 15% of the messages
are spam.
1. Find the probability that the message contains the word "free".
A 0.315 B 0.115 C 0.215 D 0.015 E 0.415
2. Find the probability that the message is spam given that it contains the word "free".
A 0.6609 B 0.2609 C 0.3609 D 0.0609 E 0.4609
3. Compute the probability that the message is spam or contains the word "free".
A 0.535 B 0.635 C 0.435 D 0.235 E 0.135
Questions 4 through 8.
A particular brand of diet margarine was analyzed to determine the level of polyunsaturated fatty acid
(in percentages). A random sample of 5 packages resulted in the following data: 16.1, 15, 14.3, 16.2,
15.3.
It is assumed that the level of polyunsaturated fatty acid follows a normal distribution and a significance
level of 0.1 is used. Scientists want to know if the data show enough evidence to prove that the average
level of polyunsaturated fatty acid is not equal to 16 (%).
8. If the population variance of the polyunsaturated fatty acid levels is assumed to be 0.7, how many
packages must be collected to ensure that the radius of a 90% two-sided confidence interval for the
population mean is at most 0.25?
A 33 B 39 C 28 D 40 E 31
11. Find the estimated standard error for the fitted slope coefficient β̂1 .
A 0.3982 B 0.0161 C 0.1534 D 0.2017 E 0.2711
14. Find the coefficient of determination for the linear regression model.
A 94.2667 B 98.7098 C 89.3768 D 97.0911 E 80.0438
Questions 15 through 20. An article in Communications of the ACM reported on a study of different
algorithms for estimating software development costs. Three algorithms were applied to 15 software
development projects and the development costs (hours) were observed. The data are given as below.
Algorithm 1 4.1 3.6 4.5 3.5 4.4
Algorithm 2 8.6 7.8 11.1 7.8 8.3
Algorithm 3 6.2 5.9 6.6 6.5 4.8
Consider an ANOVA situation with a significance level α = 0.05.
15. Choose the correct quantity to describe the total variability between treatment means.
A 202.6806 B 55.6813 C 10.476 D 78.681 E 66.1573
18. Find the least significant difference (LSD) for Fisher’s multiple comparison.
A 2.7827 B 0.8327 C 1.2876 D 3.0827 E 3.2127
20. Find a 95% confidence interval for the difference in the mean costs between algorithms 1 and 2.
A [-8.9875,-6.4123] B [-4.5435,-1.9683] C [-7.321,-4.7458] D [-5.9876,-3.4124] E [-7.8765,-5.3013]
21. A factory has 2 firms producing the same type of product. The numbers of errors per product
produced by firm A and firm B follow Poisson distributions with means of 0.1 and 0.2 respectively.
Furthermore, errors occur independently between products, regardless of the producing firm.
(a) Suppose that the proportion of products produced by firm A is 0.25. In a random sample of 15
products produced by this factory, find the probability that there are more than 12 products
that have exactly 3 errors.
(b) In a random sample of 100 products produced by firm A, find the probability that there are
from 60 to 95 products that have at least 1 error.
22. Ten adult males between the ages of 35 and 50 participated in a study to evaluate the effect of diet
and exercise on blood cholesterol levels. The total cholesterol was measured in each subject initially
and then three months after participating in an aerobic exercise program and switching to a low-fat
diet. The data are shown in the following table.
Before 230 243 256 260 295 283 212 287 269 272
After 229 240 267 257 280 280 230 280 270 205
Suppose that the blood cholesterol levels of adult males between the ages of 35 and 50 follow a
normal distribution. Do the data support the claim that low-fat diet and aerobic exercise are of
value in producing a mean reduction in blood cholesterol levels at the significance level α = 0.05?
Semester/Acad. year 1 2022-2023
Final Exam Date December 24th, 2022
Course title Probability and Statistics
UNIVERSITY OF TECHNOLOGY - VNUHCM Course ID MT2013 Question sheet code 2216
Faculty of Applied Science Duration 100 minutes Shift 16:00
Instructions to students:
- You are allowed to use your OWN materials and calculator. Total available score: 10.
- At the beginning of the working time, you MUST fill in your full name and student ID on this question
sheet. There are 22 questions on 4 pages. Do not round between steps. Round your final answers to 4
decimal places.
Questions 1 through 3. An e-mail filter is planned to separate valid e-mails from spam. The word
"free" occurs in 20% of the spam messages and only 10% of the valid messages. Also, 10% of the messages
are spam.
1. Find the probability that the message contains the word "free".
A 0.41 B 0.01 C 0.31 D 0.11 E 0.21
2. Find the probability that the message is spam given that it contains the word "free".
A 0.0818 B 0.3818 C 0.1818 D 0.5818 E 0.4818
3. Compute the probability that the message is spam or contains the word "free".
A 0.09 B 0.29 C 0.19 D 0.59 E 0.39
Questions 4 through 8.
A particular brand of diet margarine was analyzed to determine the level of polyunsaturated fatty acid
(in percentages). A random sample of 5 packages resulted in the following data: 16.9, 14.4, 15.4, 15.3,
16.3.
It is assumed that the level of polyunsaturated fatty acid follows a normal distribution and a significance
level of 0.1 is used. Scientists want to know if the data show enough evidence to prove that the average
level of polyunsaturated fatty acid is less than 15.5 (%).
8. If the population variance of the polyunsaturated fatty acid levels is assumed to be 1.8, how many
packages must be collected to ensure that the radius of a 90% two-sided confidence interval for the
population mean is at most 0.1?
A 492 B 479 C 476 D 486 E 485
11. Find the estimated standard error for the fitted slope coefficient β̂1 .
A 0.0151 B 0.0282 C 0.439 D 0.2824 E 0.0728
14. Find the coefficient of determination for the linear regression model.
A 97.2131 B 82.9902 C 92.3232 D 95.4342 E 98.5967
Questions 15 through 20. An article in Communications of the ACM reported on a study of different
algorithms for estimating software development costs. Three algorithms were applied to 12 software
development projects and the development costs (hours) were observed. The data are given as below.
Algorithm 1 7.5 5.5 7.3 6.3
Algorithm 2 3.1 2.9 5.7 4.1
Algorithm 3 9.2 9.1 10.8 11.2
Consider an ANOVA situation with a significance level α = 0.05.
15. Choose the correct quantity to describe the total variability between treatment means.
A 935.3787 B 439.3803 C 86.3892 D 75.3817 E 11.0075
18. Find the least significant difference (LSD) for Fisher’s multiple comparison.
A 1.7689 B 0.294 C 2.409 D 3.539 E 2.134
20. Find a 95% confidence interval for the difference in the mean costs between algorithms 1 and 2.
A [-0.4023,3.1355] B [2.9307,6.4685] C [-0.9578,2.58] D [0.9311,4.4689] E [2.3752,5.913]
21. A factory has 2 firms producing the same type of product. The numbers of errors per product
produced by firm A and firm B follow Poisson distributions with means of 0.1 and 0.2 respectively.
Furthermore, errors occur independently between products, regardless of the producing firm.
(a) Suppose that the proportion of products produced by firm A is 0.25. In a random sample of 15
products produced by this factory, find the probability that there are more than 12 products
that have exactly 3 errors.
(b) In a random sample of 100 products produced by firm A, find the probability that there are
from 60 to 95 products that have at least 1 error.
22. Ten adult males between the ages of 35 and 50 participated in a study to evaluate the effect of diet
and exercise on blood cholesterol levels. The total cholesterol was measured in each subject initially
and then three months after participating in an aerobic exercise program and switching to a low-fat
diet. The data are shown in the following table.
Before 230 243 256 260 295 283 212 287 269 272
After 229 240 267 257 280 280 230 280 270 205
Suppose that the blood cholesterol levels of adult males between the ages of 35 and 50 follow a
normal distribution. Do the data support the claim that low-fat diet and aerobic exercise are of
value in producing a mean reduction in blood cholesterol levels at the significance level α = 0.05?
Semester/Acad. year 1 2022-2023
Final Exam Date December 24th, 2022
Course title Probability and Statistics
UNIVERSITY OF TECHNOLOGY - VNUHCM Course ID MT2013 Question sheet code 2217
Faculty of Applied Science Duration 100 minutes Shift 16:00
Instructions to students:
- You are allowed to use your OWN materials and calculator. Total available score: 10.
- At the beginning of the working time, you MUST fill in your full name and student ID on this question
sheet. There are 22 questions on 4 pages. Do not round between steps. Round your final answers to 4
decimal places.
Questions 1 through 3. An e-mail filter is planned to separate valid e-mails from spam. The word
"free" occurs in 15% of the spam messages and only 10% of the valid messages. Also, 15% of the messages
are spam.
1. Find the probability that the message contains the word "free".
A 0.2075 B 0.5075 C 0.3075 D 0.1075 E 0.4075
2. Find the probability that the message is spam given that it contains the word "free".
A 0.1093 B 0.5093 C 0.3093 D 0.4093 E 0.2093
3. Compute the probability that the message is spam or contains the word "free".
A 0.035 B 0.435 C 0.635 D 0.235 E 0.335
Questions 4 through 8.
A particular brand of diet margarine was analyzed to determine the level of polyunsaturated fatty acid
(in percentages). A random sample of 7 packages resulted in the following data: 16.1, 14, 11.4, 11.6,
13.2, 17.8, 14.6.
It is assumed that the level of polyunsaturated fatty acid follows a normal distribution and a significance
level of 0.1 is used. Scientists want to know if the data show enough evidence to prove that the average
level of polyunsaturated fatty acid is less than 13.8 (%).
8. If the population variance of the polyunsaturated fatty acid levels is assumed to be 1.1, how many
packages must be collected to ensure that the radius of a 90% two-sided confidence interval for the
population mean is at most 0.2?
A 74 B 78 C 66 D 75 E 82
11. Find the estimated standard error for the fitted slope coefficient β̂1 .
A 0.1425 B 0.4805 C 0.0242 D 0.107 E 0.039
14. Find the coefficient of determination for the linear regression model.
A 99.985 B 96.874 C 90.8754 D 85.9855 E 95.3286
Questions 15 through 20. An article in Communications of the ACM reported on a study of different
algorithms for estimating software development costs. Three algorithms were applied to 15 software
development projects and the development costs (hours) were observed. The data are given as below.
Algorithm 1 4.5 3.6 3.8 5.4 5.8
Algorithm 2 8.2 9 8.6 9.1 7.7
Algorithm 3 10.9 9.2 9.4 11.2 9.8
Consider an ANOVA situation with a significance level α = 0.05.
15. Choose the correct quantity to describe the total variability between treatment means.
A 79.5613 B 350.5602 C 722.559 D 87.8773 E 8.316
18. Find the least significant difference (LSD) for Fisher’s multiple comparison.
A 1.1373 B 0.1973 C 3.0423 D 2.5623 E 1.1472
20. Find a 95% confidence interval for the difference in the mean costs between algorithms 1 and 2.
A [-7.4916,-5.1972] B [-3.6031,-1.3087] C [-4.7141,-2.4197] D [-5.0472,-2.7528] E [-3.0476,-0.7532]
21. A factory has 2 firms producing the same type of product. The numbers of errors per product
produced by firm A and firm B follow Poisson distributions with means of 0.1 and 0.2 respectively.
Furthermore, errors occur independently between products, regardless of the producing firm.
(a) Suppose that the proportion of products produced by firm A is 0.25. In a random sample of 15
products produced by this factory, find the probability that there are more than 12 products
that have exactly 3 errors.
(b) In a random sample of 100 products produced by firm A, find the probability that there are
from 60 to 95 products that have at least 1 error.
22. Ten adult males between the ages of 35 and 50 participated in a study to evaluate the effect of diet
and exercise on blood cholesterol levels. The total cholesterol was measured in each subject initially
and then three months after participating in an aerobic exercise program and switching to a low-fat
diet. The data are shown in the following table.
Before 230 243 256 260 295 283 212 287 269 272
After 229 240 267 257 280 280 230 280 270 205
Suppose that the blood cholesterol levels of adult males between the ages of 35 and 50 follow a
normal distribution. Do the data support the claim that low-fat diet and aerobic exercise are of
value in producing a mean reduction in blood cholesterol levels at the significance level α = 0.05?
Semester/Acad. year 1 2022-2023
Final Exam Date December 24th, 2022
Course title Probability and Statistics
UNIVERSITY OF TECHNOLOGY - VNUHCM Course ID MT2013 Question sheet code 2218
Faculty of Applied Science Duration 100 minutes Shift 16:00
Instructions to students:
- You are allowed to use your OWN materials and calculator. Total available score: 10.
- At the beginning of the working time, you MUST fill in your full name and student ID on this question
sheet. There are 22 questions on 4 pages. Do not round between steps. Round your final answers to 4
decimal places.
Questions 1 through 3. An e-mail filter is planned to separate valid e-mails from spam. The word
"free" occurs in 20% of the spam messages and only 5% of the valid messages. Also, 15% of the messages
are spam.
1. Find the probability that the message contains the word "free".
A 0.0725 B 0.1725 C 0.2725 D 0.3725 E 0.4725
2. Find the probability that the message is spam given that it contains the word "free".
A 0.8138 B 0.4138 C 0.1138 D 0.6138 E 0.2138
3. Compute the probability that the message is spam or contains the word "free".
A 0.3925 B 0.2925 C 0.0925 D 0.1925 E 0.5925
Questions 4 through 8.
A particular brand of diet margarine was analyzed to determine the level of polyunsaturated fatty acid
(in percentages). A random sample of 7 packages resulted in the following data: 15.4, 17.8, 15, 16.8, 16,
14.4, 13.9.
It is assumed that the level of polyunsaturated fatty acid follows a normal distribution and a significance
level of 0.01 is used. Scientists want to know if the data show enough evidence to prove that the average
level of polyunsaturated fatty acid is not equal to 16.8 (%).
8. If the population variance of the polyunsaturated fatty acid levels is assumed to be 1.8, how many
packages must be collected to ensure that the radius of a 99% two-sided confidence interval for the
population mean is at most 0.25?
A 202 B 194 C 192 D 189 E 201
11. Find the estimated standard error for the fitted slope coefficient β̂1 .
A 0.2173 B 0.2915 C 0.2053 D 0.0259 E 0.0225
14. Find the coefficient of determination for the linear regression model.
A 92.3202 B 95.2078 C 73.4308 D 79.6528 E 96.0834
Questions 15 through 20. An article in Communications of the ACM reported on a study of different
algorithms for estimating software development costs. Three algorithms were applied to 15 software
development projects and the development costs (hours) were observed. The data are given as below.
Algorithm 1 8.8 8.2 9.3 8.6 8.3
Algorithm 2 4 3.3 1 3.5 3.5
Algorithm 3 6.3 6.9 4.4 7.9 6.8
Consider an ANOVA situation with a significance level α = 0.01.
15. Choose the correct quantity to describe the total variability between treatment means.
A 815.0787 B 505.0797 C 92.0773 D 12.996 E 79.0813
18. Find the least significant difference (LSD) for Fisher’s multiple comparison.
A 2.7058 B 0.1908 C 2.0107 D 1.2458 E 2.7558
20. Find a 99% confidence interval for the difference in the mean costs between algorithms 1 and 2.
A [3.5693,7.5907] B [5.5689,9.5903] C [5.0134,9.0348] D [4.4579,8.4793] E [6.1244,10.1458]
21. A factory has 2 firms producing the same type of product. The numbers of errors per product
produced by firm A and firm B follow Poisson distributions with means of 0.1 and 0.2 respectively.
Furthermore, errors occur independently between products, regardless of the producing firm.
(a) Suppose that the proportion of products produced by firm A is 0.25. In a random sample of 15
products produced by this factory, find the probability that there are more than 12 products
that have exactly 3 errors.
(b) In a random sample of 100 products produced by firm A, find the probability that there are
from 60 to 95 products that have at least 1 error.
22. Ten adult males between the ages of 35 and 50 participated in a study to evaluate the effect of diet
and exercise on blood cholesterol levels. The total cholesterol was measured in each subject initially
and then three months after participating in an aerobic exercise program and switching to a low-fat
diet. The data are shown in the following table.
Before 230 243 256 260 295 283 212 287 269 272
After 229 240 267 257 280 280 230 280 270 205
Suppose that the blood cholesterol levels of adult males between the ages of 35 and 50 follow a
normal distribution. Do the data support the claim that low-fat diet and aerobic exercise are of
value in producing a mean reduction in blood cholesterol levels at the significance level α = 0.05?