Report Guide 1
TASK 1:
-Do not pick just 3 columns to explain; explain all of the columns.
-But only comment on the MODE; there is no need to comment on mean vs. median.
E.g.: Dataset 1: SEX, Mode = 0. There are more female students than male students.
(Do the same for all the other datasets. You can use the file from Khương's group as a sample, since
Task 1 uses the same data, but you MUST WRITE DIFFERENT SENTENCES and can reorder the datasets,
so the two reports do not look awkwardly alike :D)
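The mode of every column can be checked quickly before writing the comments. The following is a minimal sketch in Python; the sample rows and the AGE column are made up for illustration, and the coding 0 = female, 1 = male is assumed from the SEX example above.

```python
from collections import Counter


def column_modes(rows):
    """Return {column: most frequent value} for a list of dict rows."""
    columns = rows[0].keys()
    return {
        col: Counter(row[col] for row in rows).most_common(1)[0][0]
        for col in columns
    }


# Hypothetical rows, not the real dataset.
# SEX is assumed to be coded 0 = female, 1 = male.
sample = [
    {"SEX": 0, "AGE": 16},
    {"SEX": 0, "AGE": 17},
    {"SEX": 1, "AGE": 16},
]

modes = column_modes(sample)
print(modes)
```

With the real data, each mode is then turned into one sentence, as in the SEX example.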
TASK 2:
-Restate the full hypothesis exactly as written on paper (insert the table)
Ho: µ1 - µ2 = 0
H1: µ1 - µ2 ≠ 0
The test statistic value: Using t-Test: Two-Sample Assuming Unequal Variances
(Put the table here)
According to the table, the t-stat lies within ±t-critical (write out the numbers), so we cannot
reject Ho: µ1 - µ2 = 0.
Therefore, there is no evidence that the average Portuguese grades of students from the two schools differ.
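The t-stat in the Excel table can be cross-checked by hand. The following is a minimal sketch of the unequal-variance (Welch) t statistic using only the Python standard library; the two grade lists and the critical value 2.0 are hypothetical, and the real t-critical should be read from the Excel output.

```python
import math
from statistics import mean, variance


def welch_t(sample1, sample2):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n1, n2 = len(sample1), len(sample2)
    # Sample variance of each group divided by its size.
    v1 = variance(sample1) / n1
    v2 = variance(sample2) / n2
    t_stat = (mean(sample1) - mean(sample2)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


def reject_h0(t_stat, t_critical):
    """Two-tailed rule: reject Ho only if the t-stat lies outside ±t-critical."""
    return abs(t_stat) > t_critical


# Hypothetical grades for two schools, not the real data.
school_a = [10, 12, 14]
school_b = [11, 12, 13]

t_stat, df = welch_t(school_a, school_b)
# 2.0 is a placeholder critical value; take the real one from the table.
print(t_stat, df, reject_h0(t_stat, 2.0))
```

If `reject_h0` returns `False`, the conclusion is worded as "we cannot reject Ho", as in the text above.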
TASK 3:
In order to test if the variables of the regression model are significant or not, we conduct the test
of individual regression parameters:
Hypothesis:
Ho: β1 = β2 = β3 = β4 = β5 = 0
So, at the 0.05 level of significance, we can reject H0 because the P-values of Medu and Mjob are < 0.05. It
means that, based on the hypothesis testing, we have enough evidence that these
variables are significant.
(Similarly for Portuguese Grade. Since all the P-values for PG are > 0.05, in the part "According to
the data, we compare the p-value" write out the p-values of
Medu, Fjob, Mjob, Fedu and reason. The conclusion is different:)
So, at the 0.05 level of significance, we cannot reject H0 because all P-values are > 0.05. It means that, based
on the hypothesis testing, we have no evidence that any of the variables is significant.
Model Y = 49.92548461
TASK 4:
-Present it exactly like Task 3, but with different data (Schoolup, famup, paid,
activities, nursery, higher, internet, romantic)
-Also write: "We take the regression test from Schoolup to romantic for both GM and PG,
which means the Output is Math/Port Grade and the Inputs are Schoolup, famup, paid,
activities, nursery, higher, internet, romantic:"
-(Exactly like Task 3: if the p-value of any variable is < 0.05, write it down, then conclude in the
same way as for Math grade in the example above)
Model Y = 39.14105 + (-10.9199*schoolup) + (13.85493*higher)
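To show what the fitted model says, it can be evaluated for a few students. The following is a minimal sketch that uses the coefficients of the Model Y line above; treating schoolup and higher as 0/1 indicators (0 = no, 1 = yes) is an assumption, not something the notes state.

```python
# Coefficients copied from the Model Y line above.
INTERCEPT = 39.14105
COEF_SCHOOLUP = -10.9199
COEF_HIGHER = 13.85493


def predict_grade(schoolup, higher):
    """Predicted grade Y; both inputs are assumed to be 0/1 indicators."""
    return INTERCEPT + COEF_SCHOOLUP * schoolup + COEF_HIGHER * higher


# Every combination of the two indicators.
for schoolup in (0, 1):
    for higher in (0, 1):
        print(schoolup, higher, round(predict_grade(schoolup, higher), 5))
```

Only the variables whose p-value is < 0.05 appear in the model, which is why it has two terms besides the intercept.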
TASK 5: (Tân has already redone this one; just insert the table)
-Restate the problem
H0 : μ1-μ2 = 0
H1: μ1-μ2 ≠ 0
t-stat = ± 6.435024955
Since the t-stat falls outside ±t-critical (two-tail) and the p-value < 0.05, we can reject Ho.
Hence, there is no evidence that students who perform well in Math also perform well in
Portuguese.
PART 2:
TASK 1: (Tân and Thảo did this task below, since explaining it in words is hard to follow
:D)
Based on the descriptive statistics, we conclude that students with a higher Activities level
have a significantly higher English level. To be specific: at the 0.0 and 0.3 Activities
levels, the mean English level is approximately 1 and 2, respectively. At the 1.0 and 2.0 Activities
levels, the mean English level of students is 2.410 and 3, respectively. In particular, the mode of English level
increases with each Activities level (1 at the 0.0 Activities
level, 2 at the 1.0 Activities level, and 3 at the 2.0 Activities level). It means that students with a
higher Activities level study at a higher English level.
As can be seen from the data, students seem to receive nearly the same GPA whether or not they
participate in university activities. The average GPA of students at Activities levels 0.0,
1.0 and 2.0 is approximately 74 (we do not consider Activities level 0.3 because it appears
so rarely). It means that whether students spend more or less time on university activities, it
has no impact on their GPA.
According to the data, the number of students increases steadily at higher levels of university
activities. In detail, there are 558 students at the 0.0 Activities level, 970 students at the 0.3 Activities level, and 1104
and 1316 students at the 1.0 and 2.0 Activities levels, respectively. That means that, when
attending university, students tend to participate more in university activities rather than
just study the whole time.
HYPOTHESIS:
F-ratio = 117.0293592
F-critical = 3.002170142
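The two numbers above are compared with the usual rule: reject the null hypothesis when the F-ratio is greater than F-critical. The following is a minimal sketch, assuming the F-ratio comes from a one-way ANOVA; the three small groups are hypothetical and only show how an F-ratio is computed.

```python
from statistics import mean


def one_way_f(groups):
    """F-ratio of a one-way ANOVA: between-group over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def reject_h0(f_ratio, f_critical):
    """Reject the null hypothesis when the F-ratio exceeds F-critical."""
    return f_ratio > f_critical


# Values reported in the notes above.
decision = reject_h0(117.0293592, 3.002170142)
print(decision)

# Hypothetical groups, only to illustrate the computation.
demo_f = one_way_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(demo_f)
```

Here the reported F-ratio is far above F-critical, so the null hypothesis is rejected.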
TASK 3:
-Restate the problem
-We did this task correctly but wrote it too tersely. Rewrite it in full sentences, with subject and
predicate, so it is easier to understand
Part a: add a sentence "According to the chart…" (the gist: the number of males at English level (EL)
1.0 is the highest compared with males at the other ELs (2.0 and 3.0). Conversely, the number of females
at EL 2.0 and 3.0 is greater than the number of males at these two levels. A higher EL means better
English, so EL 2.0 and 3.0 are higher than EL 1.0, i.e. EL 1.0
is the lowest. It follows that female students are better at English than male students.)
Part b: Looking at the chart: (this one is fine as written :D)
Group 3 has the highest income (Family Income/Month is from $800 to $1,600). Besides, in
group 3, the number of students with English level 3 is greater than in the other groups. Therefore, the
higher the income, the better the students' English level.
TASK 4:
-We did this task correctly; copy it over and tidy up the formatting
-Remember to add "At the 0.05 level of significance" at the start of each conclusion sentence (before the word
because)
E.g. part a:
At the significance level α = 0.05, since F > FC, we can reject the null hypothesis. It
means that, based on the ANOVA table and hypothesis testing, we have sufficient evidence
that the students who participated in extra activities at college tend to have a higher Salary
after Graduation.
(similarly for b, c)
TASK 5: (Tân has already done this one as well; just re-insert the table)
Part a:
In order to test whether the variables of the regression model are significant, we have to conduct the
test of individual regression parameters.
HYPOTHESIS:
Ho: β1 = β2 = 0
P-value(Family Income/Month) = 0
Part b:
Since the variables Family Income/Month and Gender above are significant, we have statistical
evidence that these variables have a linear relationship with Y and explanatory power with respect to
the dependent variable. Thus, it is reasonable to say that there is inequality in Salary after Graduation
based on Gender.
TASK 6: (Tân has already done this one as well; just re-insert the table and it is OK)
In order to test whether the variables of the regression model are significant, we have to conduct the
test of individual regression parameters.
HYPOTHESIS:
P-value(Family Income/Month) = 0
P-value(Gender) = 0.04284
P-value(GPA) = 4.27055E-58
At the 0.05 level of significance, we reject H0 because all of the above P-values are < 0.05. We conclude that, based on
the hypothesis testing, we have enough evidence that all of these variables are significant.
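The comparison of each P-value with 0.05 can be written as a small check. The following is a minimal sketch that uses the three P-values listed above; the P-value shown as 0 is presumably a rounded display of a very small number, which is an assumption.

```python
ALPHA = 0.05

# P-values copied from the list above (0 is taken as reported).
p_values = {
    "Family Income/Month": 0.0,
    "Gender": 0.04284,
    "GPA": 4.27055e-58,
}


def significant_variables(p_values, alpha=ALPHA):
    """Return the variables whose P-value is below the significance level."""
    return [name for name, p in p_values.items() if p < alpha]


selected = significant_variables(p_values)
print(selected)
print(len(selected) == len(p_values))  # True when every variable is significant
```

Gender is the closest to the threshold (0.04284), so it would no longer be significant at a stricter level such as 0.01.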
b) Compared with the regression model obtained in section 5 (which has only 2
explanatory variables for the dependent variable Y), the model for the dependent variable Y (Salary after
Graduation) here is a linear relationship containing 6 variables (4 new variables:
English level, Activities level, GPA, University Ranking; 2 unchanged variables: Family
Income/Month, Gender).