Students Performance
Subject: MAS291
Semester: Spring 2023
Class: IA1701
Group: 1
Instructor: Nguyễn Văn Thiện
Table of Contents
I. Abstract
II. Data
III. Test a hypothesis and construct a confidence interval for the mean of a population
IV. Test a hypothesis and construct a confidence interval for the proportion of a population
V. Test a hypothesis and construct a confidence interval for the difference in means of two populations
VI. Test a hypothesis and construct a confidence interval for the difference in proportions of two populations
VII. Regression Analysis
I. Abstract
Learning is the process of acquiring basic skills and knowledge for oneself. The quality
of learning can be measured in many ways, and the measure most often chosen is the
average score. Several factors may influence this value, such as family, physical
environment, and psychological state. Our group found this topic interesting, so we
discussed and researched it to answer the following questions:
1. How can students' performance on each test be improved?
2. What are the major factors influencing the test scores?
3. How effective is the test preparation course?
II. Data
Many data sources are available on the internet. In searching for data, our team
prioritized completeness, accuracy, and reliability. The data set we selected for the
analysis and evaluation in this report contains the following variables:
● parental_edu_level: the parents' highest level of education
● lunch: whether the student had lunch before the test (normal or abnormal)
● prepare_test: whether the student completed the test preparation course before the test (complete or none)
● math_score: the student's score on the math test
● reading_score: the student's score on the reading test
● writing_score: the student's score on the writing test
III. Test a hypothesis and construct a confidence interval for the mean of a population
1. Confidence Intervals
From the survey data, we can calculate confidence intervals for the students' average
score by gender. The confidence interval is $\bar{x} \pm E$, where the margin of error is
$$E = t_{\alpha/2,\,n-1} \times \frac{s}{\sqrt{n}}$$
a. Male
Mean $\bar{x} = 62.56$
Critical value $t_{0.025,\,99} = 1.984$
$$E = 1.984 \times \frac{14.64}{\sqrt{100}} = 2.905$$
→ The 95% confidence interval for the average score of male students is (59.7, 65.5)
b. Female
Mean $\bar{x} = 68.96$
Critical value $t_{0.025,\,98} = 1.984$
$$E = 1.984 \times \frac{15.49}{\sqrt{99}} = 3.089$$
→ The 95% confidence interval for the average score of female students is (65.9, 72.0)
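The interval formula above can be checked from the summary statistics alone. The following is a minimal Python sketch, not necessarily the procedure used in this report. The function name `mean_confidence_interval` is hypothetical, and the critical value is passed in by hand: 1.984 is the usual table value for t(0.025, 99) and is an assumption here.

```python
import math


def mean_confidence_interval(mean, s, n, t_crit):
    """Return (lower, upper) for mean ± t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return mean - margin, mean + margin


# Male summary statistics from this section.
# t_crit = 1.984 is an assumed table value for t(0.025, 99).
low, high = mean_confidence_interval(mean=62.56, s=14.64, n=100, t_crit=1.984)
print(f"95% CI for the male mean: ({low:.1f}, {high:.1f})")
```

Passing the critical value as an argument keeps the sketch free of third-party dependencies; a statistics library could supply it instead.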
2. Hypothesis Testing
a. Problem
We predict that “if a student's score is higher than the sample mean, the student is
likely to get grade B”. In other words, the question is whether the population mean of
the average scores equals 65.7 (the sample mean of the average scores).
b. Solution
We test this hypothesis separately for each gender. At α = 0.05, we test whether the
mean score is equal to 65.7 or not:
H0: μ = 65.7 vs. H1: μ ≠ 65.7
Male
Test Statistic:
t = -2.995
→ Reject H0
Female
Fail to reject H0 if -2.0243 < t < 2.0243
Test Statistic:
t = 2.898
→ Reject H0
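The decision rule above can be sketched in Python as follows. This is an illustration under stated assumptions, not the computation behind the reported statistics: the function names `one_sample_t` and `reject_h0` are hypothetical, and the short score list is made-up data used only to show the call. The critical value 2.0243 and the two test statistics are the ones quoted in this section.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """t = (sample mean - mu0) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)
    return (mean - mu0) / (s / math.sqrt(n))


def reject_h0(t_stat, t_crit):
    """Two-tailed rule: fail to reject only when -t_crit < t < t_crit."""
    return abs(t_stat) >= t_crit


# Hypothetical scores, for illustration only (not from the data set).
t_demo = one_sample_t([60, 70, 65, 75, 55], mu0=65.7)

# Decisions for the two statistics reported in this section.
male_rejected = reject_h0(-2.995, 2.0243)
female_rejected = reject_h0(2.898, 2.0243)
print(t_demo, male_rejected, female_rejected)
```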
3. Conclusion
Both tests reject H0. According to these results, we can conclude that not all students
whose score is greater than the average get grade B.
IV. Test a hypothesis and construct a confidence interval for the proportion of a population.
1. Hypothesis Testing and Confidence Interval Construction:
a. Problem
“Do the data support the claim that students with an average score under 80 make up
70% of the whole student population?”
b. Solution
We test a hypothesis about the proportion of students whose average score is under 80
and construct a confidence interval for the population proportion.
Test the hypothesis (at α = 0.05) whether the proportion is equal to 70% or not:
H0: p = 0.7 vs. H1: p ≠ 0.7
Fail to reject H0 if -1.96 ≤ z ≤ 1.96
Test Statistic: z = 3.36
→ Reject H0
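A one-proportion z-test of this kind can be sketched in Python as below. This is a minimal sketch under stated assumptions: the function name `one_proportion_z_test` is hypothetical, and the count of 161 students under 80 is a hypothetical input chosen for illustration, since this section does not state the count. The sample size 199 is the total of the two gender groups used elsewhere in this report.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alpha=0.05):
    """Two-tailed z-test of H0: p = p0; returns (z, z_crit, reject)."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z_crit, abs(z) > z_crit


# successes=161 is a hypothetical count, not a figure from the report.
z, z_crit, reject = one_proportion_z_test(successes=161, n=199, p0=0.7)
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
```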
2. Conclusion
According to the above results, we can conclude that the proportion of students with an
average score under 80 is NOT 70% of the whole student population.
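V. Test a hypothesis and construct a confidence interval for the difference in means of two populations
1. The Gender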
a. Confidence Intervals
We compare grades by gender. The male sample has 100 students, with an average grade of
approximately 62.56 and a sample standard deviation of 14.64000331. The female sample
has 99 students, with an average grade of approximately 68.96 and a sample standard
deviation of 15.48792785.
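The pooled sample variance is
$$S_p^2 = \frac{S_1^2 (n_1 - 1) + S_2^2 (n_2 - 1)}{n_1 + n_2 - 2} = 227.0379614$$
Thus:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = -6.4 \pm 4.166$$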
→ We are 95% confident that the difference in the population means lies in the
interval [−10.566, −2.234], in the sense that in repeated sampling 95% of all intervals
constructed from the sample data in this manner will contain μ1 − μ2.
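The pooled-variance interval can be sketched in Python from the summary statistics given in this section. This is a minimal sketch, not necessarily the procedure used in this report: the function names `pooled_variance` and `diff_means_ci` are hypothetical, and the critical value is passed in by hand (1.972079034 is the value this report quotes for df = 197). Because the inputs are rounded summary statistics, the last digits may differ slightly from the figures reported above.

```python
import math


def pooled_variance(s1, n1, s2, n2):
    """Sp^2 = (S1^2 (n1 - 1) + S2^2 (n2 - 1)) / (n1 + n2 - 2)."""
    return (s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / (n1 + n2 - 2)


def diff_means_ci(mean1, mean2, s1, n1, s2, n2, t_crit):
    """Return (lower, upper) for (mean1 - mean2) ± t_crit * sqrt(Sp^2 (1/n1 + 1/n2))."""
    sp2 = pooled_variance(s1, n1, s2, n2)
    margin = t_crit * math.sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    return diff - margin, diff + margin


# Gender summary statistics from this section (male first, then female).
sp2 = pooled_variance(14.64000331, 100, 15.48792785, 99)
low, high = diff_means_ci(62.56, 68.96, 14.64000331, 100,
                          15.48792785, 99, t_crit=1.972079034)
print(f"Sp^2 = {sp2:.4f}, 95% CI = ({low:.3f}, {high:.3f})")
```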
b. Hypothesis Testing
Male                  Female
n1 = 100              n2 = 99
x̄1 = 62.56            x̄2 = 68.96
S1 = 14.64000331      S2 = 15.48792785
Let μ1 and μ2 be the population mean grades of male and female students. We want to
know whether male or female students have the greater grade. Test at the 5% level of
significance whether the data provide sufficient evidence to conclude that the two
population means are different.
H0: μ1 = μ2
vs.
H1: μ1 ≠ μ2
Since the symbol in H1 is “≠”, this is a two-tailed test, so there are two critical
values, ± t(α/2) = ± t(0.025). With df = 197 we read off t(0.025) = 1.972079034. The
rejection region is (−∞, −1.972079034] ∪ [1.972079034, ∞).
→ The test statistic falls in the rejection region. The decision is “reject H0”.
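The two-tailed decision for the gender comparison can be sketched in Python as follows. This is a sketch under stated assumptions: the function name `pooled_t_statistic` is hypothetical, the inputs are the rounded summary statistics from this section, and the critical value is the one quoted above for df = 197.

```python
import math


def pooled_t_statistic(mean1, mean2, s1, n1, s2, n2):
    """Pooled two-sample t statistic for H0: mu1 = mu2 (equal variances assumed)."""
    sp2 = (s1**2 * (n1 - 1) + s2**2 * (n2 - 1)) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))


t_stat = pooled_t_statistic(62.56, 68.96, 14.64000331, 100, 15.48792785, 99)
t_crit = 1.972079034  # t(0.025) for df = 197, as quoted in this section

# The rejection region is (-inf, -t_crit] U [t_crit, inf).
decision = "reject H0" if abs(t_stat) >= t_crit else "fail to reject H0"
print(f"t = {t_stat:.3f}: {decision}")
```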
c. Using Excel
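2. The Test Preparation
a. Confidence Intervals
We compare grades by test preparation. The “Complete” sample has 73 students, with an
average grade of approximately 71.5 and a sample standard deviation of 13.74886186. The
“None” sample has 126 students, with an average grade of approximately 62.4 and a
sample standard deviation of 15.32123747. We construct a point estimate and a 95%
confidence interval for μ1 − μ2: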
$$S_p^2 = \frac{S_1^2 (n_1 - 1) + S_2^2 (n_2 - 1)}{n_1 + n_2 - 2} = 218.034448$$
Thus:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = 9.1 \pm 4.278668296$$
→ We are 95% confident that the difference in the population means lies in the
interval [4.821331704, 13.3786683], in the sense that in repeated sampling 95% of all
intervals constructed from the sample data in this manner will contain μ1 − μ2.
b. Hypothesis Testing:
Complete              None
n1 = 73               n2 = 126
x̄1 = 71.5             x̄2 = 62.4
S1 = 13.74886186      S2 = 15.32123747
Let μ1 and μ2 be the population mean grades of the “Complete” and “None” groups. We
want to know which group has the greater grade. Test at the 5% level of significance
whether the data provide sufficient evidence to conclude that the two population means
are different.
H0: μ1 = μ2
vs.
H1: μ1 ≠ μ2
Since the samples are independent, the test statistic (with μ1 − μ2 = 0 under H0) is:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = 4.180854822$$
Since the symbol in H1 is “≠”, this is a two-tailed test, so there are two critical
values, ± t(α/2) = ± t(0.025). With df = 197 we read off t(0.025) = 1.972079034. The
rejection region is (−∞, −1.972079034] ∪ [1.972079034, ∞).
→ The test statistic falls in the rejection region. The decision is “reject H0”.
c. Using Excel
VI. Test a hypothesis and construct a confidence interval for the difference in proportions of two populations.
1. Confidence Intervals
From the survey data, we can calculate confidence intervals for the proportion of
students receiving each grade, by gender. The confidence interval is $\hat{p} \pm E$, where
$$E = z_{\alpha/2} \times \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
a) Male
n = 100, α = 0.05, Z(α/2) = 1.96
Grade    Count x    Proportion p    Margin of error E
A        6          6%              0.0466
B        27         27%             0.087
C        33         33%             0.0921
D        27         27%             0.087
F        7          7%              0.05
b) Female
n = 99, α = 0.05, Z(α/2) = 1.96
Grade    Count x    Proportion p    Margin of error E
A        13         13.1%           0.0665
B        32         32.3%           0.092
C        37         37.4%           0.095
D        13         13.1%           0.0665
F        4          4.0%            0.0388
→ The 95% confidence interval for the proportion of female students with grade A is (0.0648, 0.1978)
→ The 95% confidence interval for the proportion of female students with grade B is (0.231, 0.415)
→ The 95% confidence interval for the proportion of female students with grade C is (0.278, 0.469)
→ The 95% confidence interval for the proportion of female students with grade D is (0.0648, 0.1978)
→ The 95% confidence interval for the proportion of female students with grade F is (0.0016, 0.0792)
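The proportion intervals in this section follow one formula applied to each grade count, which can be sketched in Python as below. This is a minimal sketch, not necessarily the procedure used in this report: the function name `proportion_ci` is hypothetical, and the critical value is computed with the standard library's normal distribution rather than taken as the rounded 1.96, so results may differ in the last digit. The counts are the male grade counts listed in this section.

```python
import math
from statistics import NormalDist


def proportion_ci(count, n, alpha=0.05):
    """Return (lower, upper) for p ± z(alpha/2) * sqrt(p (1 - p) / n)."""
    p = count / n
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# Male grade counts from this section (n = 100).
male_counts = {"A": 6, "B": 27, "C": 33, "D": 27, "F": 7}
male_intervals = {g: proportion_ci(c, 100) for g, c in male_counts.items()}
for grade, (low, high) in male_intervals.items():
    print(f"Grade {grade}: ({low:.4f}, {high:.4f})")
```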
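VII. Regression Analysis
$\bar{x} = 67.62814$
$\bar{y} = 65.74372$
$$S_{xy} = (72 \times 72 + 90 \times 82 + \dots + 53 \times 51) - 199 \times 67.62814 \times 65.74372 = 46900.03$$
$$S_{xx} = (72^2 + 90^2 + \dots + 53^2) - 199 \times 67.62814^2 = 49332.27$$
$$B_1 = \frac{S_{xy}}{S_{xx}} = \frac{46900.03}{49332.27} = 0.9507; \quad B_0 = \bar{y} - B_1 \bar{x} = 65.74372 - 0.9507 \times 67.62814 = 1.450$$
Regression line: $\hat{y} = 0.9507x + 1.450$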
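The least-squares computation above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `fit_line` is hypothetical, and the four data points are made-up values used only to show the call, since the full score columns are not listed in this section. The last lines recompute the slope and intercept from the sums and means reported above.

```python
def fit_line(xs, ys):
    """Return (b0, b1) of the least-squares line y = b1 * x + b0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    s_xx = sum(x * x for x in xs) - n * x_bar * x_bar
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical points lying exactly on y = 2x + 1, for illustration only.
b0_demo, b1_demo = fit_line([1, 2, 3, 4], [3, 5, 7, 9])

# Slope and intercept from the sums and means reported in this section.
b1 = 46900.03 / 49332.27
b0 = 65.74372 - b1 * 67.62814
print(f"demo: y = {b1_demo:.1f}x + {b0_demo:.1f}")
print(f"report: y = {b1:.4f}x + {b0:.3f}")
```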