
SHD 2793 / SHAD2033

Statistics II
GROUP ASSIGNMENT
REMAATHI A/P PERANIANDY SX123240HAFF04

NUR FATHIRAH BT JAMALLUDIN SX150212HAFS04

NUR EMELIA BT ZULKIFLE SX150211HAFS04

ZURAIDAH BT ZULKIFLI SX150215HAFS04

NAZLINDA BT NASIR SX150178HADS04

VIJAYA SAKTHI UTHAIYA KUMAR SX120610HAFS04

LATHA D/O THULASIMANI SX120526HADS04

Lecturer name: Dr Noor Azmi

INTRODUCTION

Students are the key assets of universities. Students' performance plays an important role in
producing high-quality graduates who will become the leaders and manpower responsible for the
country's economic and social development. Academic achievement is one of the major factors employers
consider when hiring, especially for fresh graduates. Students therefore have to put great effort into
their studies to obtain good grades, to prepare themselves for future career opportunities and to meet
employers' demands.
This study aimed to determine the factors affecting the academic performance and study patterns of
university students. Understanding these factors may help both students and teachers improve. Students
who understand what affects their academic performance may be able to improve it using the findings
established by this study.
In addition, business, education and all fields of science have come to rely heavily on the computer.
This dependence has become so great that it is no longer possible to understand research without
substantial knowledge of statistics and at least a rudimentary understanding of statistical software. A
quantitative-descriptive design was therefore used. A self-reported questionnaire was the main method of
data gathering, and the average weighted mean was used to determine the level of impact of the different
factors on the respondents' academic performance.

The major objectives of the study were:

1. To determine the different factors that affect students' performance.
2. To examine the normality of, and the correlations between, the variables, i.e. study hours, seriousness,
and the trial and final test marks.

STATISTICAL APPROACH

Each of the nine analysis methods introduced below requires data obtained from a questionnaire designed
to capture students' performance. All nine classes of analysis methods (frequency and descriptive
statistics, normality tests, correlation tests, one-sample test, two-sample independent test, two-sample
dependent test, regression, chi-square test and ANOVA) are suitable for data on students' performance.
However, the design of each study can be optimized with respect to performance effectiveness and the
selected analysis approach. The number and spacing of the concentrations will depend on the study being
conducted and the type of data analysis to be utilized. For each of the approaches introduced below, the
following is provided:

• A brief description of the use of each method in students' performance tests.
• A brief outline of specific analysis methods presented in the later chapters of this document.
• A listing of some major assumptions and limitations for each approach.

A statistical approach is a method of collecting, summarizing, analyzing and interpreting variable
numerical data. Statistical procedures can be contrasted with deterministic methods, which are appropriate
where observations are exactly reproducible or are assumed to be so. While statistical procedures are
widely used in the life sciences, in economics and in agricultural science, they also have an important
role in the physical sciences in the study of measurement errors, of random phenomena such as
radioactivity or meteorological events, and in obtaining approximate results where deterministic
solutions are hard to apply.

Statistical software is widely used for statistical analysis in social science. It is also used by market
researchers, health researchers, survey companies, government, education researchers, marketing
organizations, data miners and others. In addition to statistical analysis, data management and data
documentation are features of the base software.

Data collection involves deciding what to observe in order to obtain information relevant to the
questions whose answers are required, and then making the observations. Sampling involves choosing a
sufficient number of observations representing an appropriate population. Experiments with variable
outcomes should be conducted according to principles of experimental design.

Data summarization is the calculation of appropriate statistics and the display of such information in
the form of tables, graphs or charts. Data may also be adjusted to make different samples more comparable,
using ratios, compensating factors, etc.

Statistical analysis relates observed statistical data to theoretical models, such as probability
distributions or models used in regression analysis. By estimating parameters in the proposed model and
testing hypotheses about rival models, one can assess the value of the information collected and the
extent to which the information can be applied to similar situations. Statistical prediction is the
application of the model thought to be most appropriate, using the estimated values of the parameters.

In this research, we use SPSS to describe a sample data set. SPSS stands for 'Statistical Package for
the Social Sciences'. It is one of the main data analysis packages used for research. SPSS has a multitude
of commands and covers a wide range of quantitative analyses, of which only a fraction are covered for
the purposes of this course.

Data are entered and edited in the Data Entry Window, while the Output Window shows the results, tables
and graphs. The Data Entry Window and the Output Window are two separate windows, so each must be saved
independently. The Output file therefore cannot be opened or viewed in the Data Entry Window.

It is important to understand the level of data in order to choose future tests of significance
correctly. Non-parametric data are used in situations where subjects are not measured or tested; instead,
participants are placed in groups or categories, or are rated.

• Nominal (non-parametric): numbers are used to label categories (race, accommodation, having a special
boy/girlfriend).
• Ordinal (non-parametric): numbers are used to define an order of performance (study seriousness).
• Interval data (parametric): time, speed and distance can all be measured by interval scales, as we have
clocks, speedometers and measures (study hours).
• Ratio data (parametric): like interval data but with an absolute zero. Temperature, for instance, is only
interval, as it does not have an absolute zero (statistics marks, trial and final).

FINDINGS

1. Checking the data by producing frequency and descriptive statistics

• The first frequency table is for race of respondent. The first column lists the possible values of race
(melayu [Malay], cina [Chinese] and india [Indian]) and shows whether there are any missing values. The
second column, Frequency, is the count: 40 respondents were melayu, 8 were cina and 12 were india. There is
no missing information in this analysis. The Percent column tells us that, of the 60 respondents, 66.7%
were melayu, 13.3% were cina and 20.0% were india. The Valid Percent column, based on the 60 individuals
for whom we have information, gives the same figures: 66.7% melayu, 13.3% cina and 20.0% india. The
Cumulative Percent column adds up the Valid Percent values as you move down the table.

• The first column lists the possible values of having a special friend (yes and no). The second column,
Frequency, is the count: 24 respondents answered yes and 36 answered no. There is no missing information
in this analysis. The Percent column tells us that, of the 60 respondents, 40% have a special friend and
60% do not. The Valid Percent column, based on the 60 individuals for whom we have information, gives the
same figures. The Cumulative Percent column adds up the Valid Percent values as you move down the table.

• The first column lists the possible values of consistency (1, 2, 3, 4, 5, 6 and 7). The second column,
Frequency, is the count: consistency 1 had a frequency of 11, consistency 2 had 5, consistency 3 had 17,
consistency 4 had 12, consistency 5 had 7, consistency 6 had 3 and consistency 7 had 5. There is no missing
information in this analysis. The Percent column tells us that consistency 1 accounted for 18.3%,
consistency 2 for 8.3%, consistency 3 for 28.3%, consistency 4 for 20.0%, consistency 5 for 11.7%,
consistency 6 for 5.0% and consistency 7 for 8.3%. The Valid Percent column, based on the 60 individuals
for whom we have information, gives the same figures. The Cumulative Percent column adds up the Valid
Percent values as you move down the table.

• The first column lists the observed values of study hour (20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 34, 35, 36, 37, 38, 39, 40, 42, 43, 44, 45, 48, 53, 55, 57, 58, 60, 65 and 75). The second column,
Frequency, is the count: the values 20, 21, 23, 32, 34, 42, 43, 44, 45, 48, 53, 55, 57, 58, 65 and 75 each
occurred once; 24, 25, 26, 27, 31, 36 and 60 each occurred twice; 22, 30, 35, 37, 38 and 39 each occurred
three times; and 28, 29 and 40 each occurred four times. There is no missing information in this
analysis. The Percent column gives the corresponding share of the 60 respondents: 1.7% for each value that
occurred once, 3.3% for each value that occurred twice, 5.0% for each value that occurred three times and
6.7% for each value that occurred four times. The Valid Percent column, based on the 60 individuals for
whom we have information, gives the same figures. The Cumulative Percent column adds up the Valid Percent
values as you move down the table.

• The first column lists the possible values of accommodation (on campus, off campus and squatting on/off
campus). The second column, Frequency, is the count: 17 respondents live on campus, 19 live off campus and
24 are squatting on/off campus. There is no missing information in this analysis. The Percent column tells
us that, of the 60 respondents, 28.3% live on campus, 31.7% live off campus and 40.0% are squatting on/off
campus. The Valid Percent column, based on the 60 individuals for whom we have information, gives the same
figures. The Cumulative Percent column adds up the Valid Percent values as you move down the table.

• The first column lists the possible values of readiness (1, 2, 3, 4, 5, 6 and 7). The second column,
Frequency, is the count: readiness 1 had a frequency of 14, readiness 2 had 4, readiness 3 had 12,
readiness 4 had 13, readiness 5 had 8, readiness 6 had 5 and readiness 7 had 4. There is no missing
information in this analysis. The Percent column tells us that readiness 1 accounted for 23.3%, readiness
2 for 6.7%, readiness 3 for 20.0%, readiness 4 for 21.7%, readiness 5 for 13.3%, readiness 6 for 8.3% and
readiness 7 for 6.7%. The Valid Percent column, based on the 60 individuals for whom we have information,
gives the same figures. The Cumulative Percent column adds up the Valid Percent values as you move down
the table.

• The first column lists the possible values of focus (1, 2, 3, 4, 5, 6 and 7). The second column,
Frequency, is the count: focus 1 had a frequency of 16, focus 2 had 5, focus 3 had 14, focus 4 had 14,
focus 5 had 5, focus 6 had 3 and focus 7 had 3. There is no missing information in this analysis. The
Percent column tells us that focus 1 accounted for 26.7%, focus 2 for 8.3%, focus 3 for 23.3%, focus 4 for
23.3%, focus 5 for 8.3%, focus 6 for 5.0% and focus 7 for 5.0%. The Valid Percent column, based on the 60
individuals for whom we have information, gives the same figures. The Cumulative Percent column adds up
the Valid Percent values as you move down the table.

• The first column lists the observed values of StatF (10, 25, 35, 40, 45, 49, 55, 58, 61, 62, 63, 64, 65,
68, 76, 82, 88, 89, 95, 96 and 97). The second column, Frequency, is the count: the values 10, 40, 49, 63
and 64 each occurred once; 25, 35, 65, 88, 95 and 96 each occurred twice; 45, 76, 82 and 97 each occurred
three times; 58 and 89 each occurred four times; 55 and 61 each occurred five times; and 62 occurred six
times. There is no missing information in this analysis. The Percent column gives the corresponding share
of the 60 respondents: 1.7% for each value that occurred once, 3.3% for each value that occurred twice,
5.0% for each value that occurred three times, 6.7% for each value that occurred four times, 8.3% for each
value that occurred five times and 10.0% for 62. The Valid Percent column, based on the 60 individuals for
whom we have information, gives the same figures. The Cumulative Percent column adds up the Valid Percent
values as you move down the table.

• The first column lists the observed values of StatT (10, 25, 31, 35, 37, 38, 40, 41, 45, 49, 52, 53, 55,
56, 57, 58, 61, 62, 63, 64, 65, 68, 75, 76, 77, 78, 79, 82, 83, 88, 89, 95, 96, 97, 98 and 100). The second
column, Frequency, is the count: the values 10, 25, 31, 37, 38, 41, 49, 52, 53, 56, 57, 63, 64, 75, 77, 78,
79, 83, 88, 95, 98 and 100 each occurred once; 35, 55, 61, 65, 96 and 97 each occurred twice; 40, 45, 58,
62, 76 and 82 each occurred three times; and 68 and 89 each occurred four times. There is no missing
information in this analysis. The Percent column gives the corresponding share of the 60 respondents: 1.7%
for each value that occurred once, 3.3% for each value that occurred twice, 5.0% for each value that
occurred three times and 6.7% for each value that occurred four times. The Valid Percent column, based on
the 60 individuals for whom we have information, gives the same figures. The Cumulative Percent column
adds up the Valid Percent values as you move down the table.

• The Descriptive Statistics table contains many different numerical summaries, some of which we have
deleted to save space. The first column lists the variables: race of respondent, accommodation campus,
having special friend, study hour, readiness, focus, consistency, statT and statF. The second column, N,
is the number of respondents (60). The third column, Minimum, is 1 (race of respondent, accommodation
campus, having special friend, readiness, focus and consistency), 10 (statT and statF) and 20 (study
hour). The fourth column, Maximum, is 2 (having special friend), 3 (race of respondent and accommodation
campus), 7 (readiness, focus and consistency), 75 (study hour), 97 (statF) and 100 (statT). The fifth
column, Mean, is 1.53 (race), 2.12 (accommodation), 1.60 (having special friend), 36.07 (study hour), 3.47
(readiness), 3.13 (focus), 3.47 (consistency), 64.73 (statT) and 65.83 (statF). The Std. Deviation column
is .812 (race), .825 (accommodation), .494 (having special friend), 11.883 (study hour), 1.845 (readiness),
1.732 (focus), 1.761 (consistency), 21.025 (statT) and 19.462 (statF).
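
The percentages and descriptive statistics described in this section can be cross-checked by hand from the frequency counts. The following is a minimal Python sketch, not the SPSS procedure itself. It uses the race counts reported above and assumes the numeric coding 1 = melayu, 2 = cina, 3 = india, which is not stated in the report but is consistent with the reported mean of 1.53. The function names are illustrative.

```python
import math

def frequency_table(counts):
    """Return rows of (value, frequency, percent, cumulative percent)."""
    n = sum(counts.values())
    rows, running = [], 0
    for value in sorted(counts):
        running += counts[value]
        rows.append((value,
                     counts[value],
                     round(100 * counts[value] / n, 1),
                     round(100 * running / n, 1)))
    return rows

def mean_and_sd(counts):
    """Mean and sample standard deviation (n - 1 denominator) of coded values."""
    n = sum(counts.values())
    mean = sum(value * c for value, c in counts.items()) / n
    ss = sum(c * (value - mean) ** 2 for value, c in counts.items())
    return mean, math.sqrt(ss / (n - 1))

# Race counts reported above; the codes 1, 2, 3 are an assumed coding.
race = {1: 40, 2: 8, 3: 12}
table = frequency_table(race)
mean, sd = mean_and_sd(race)

for row in table:
    print(row)
print(round(mean, 2), round(sd, 3))
```

Under that assumed coding, the sketch gives percentages, a mean and a standard deviation that agree with the figures reported for race of respondent.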

2. Testing the normality of the variables: study hours, seriousness of the study and statistics test marks, and
statistics final marks

• SPSS produces quite a bit of output. The first table is the Case Processing Summary, which shows that
there are 60 respondents with a valid value for each variable.

• The second table produced is the Descriptives table. This table includes many different numerical
summaries, some of which we have deleted to save space. We highlight the following for study hour:
  • The mean is 36.07 hours. This is boxed in red.
  • The median is 35.00 hours. This is boxed in yellow.
  • The standard deviation is 11.883 hours. This is boxed in green.
  • The range is 55 hours. This is boxed in blue.
  • The interquartile range is 12 hours. This is boxed in gray.

• We highlight the following for statT:
  • The mean is 64.73 marks. This is boxed in red.
  • The median is 65.24 marks. This is boxed in yellow.
  • The standard deviation is 21.025 marks. This is boxed in green.
  • The range is 90 marks. This is boxed in blue.
  • The interquartile range is 32 marks. This is boxed in gray.

• We highlight the following for focus:
  • The mean is 3.13. This is boxed in red.
  • The median is 3.00. This is boxed in yellow.
  • The standard deviation is 1.732. This is boxed in green.
  • The range is 6. This is boxed in blue.
  • The interquartile range is 3. This is boxed in gray.

• We highlight the following for statF:
  • The mean is 65.83 marks. This is boxed in red.
  • The median is 62.50 marks. This is boxed in yellow.
  • The standard deviation is 19.462 marks. This is boxed in green.
  • The range is 87 marks. This is boxed in blue.
  • The interquartile range is 26 marks. This is boxed in gray.
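
The study hour summaries highlighted in this section can be reproduced from the study hour frequency table in section 1. The sketch below rebuilds the 60 observations from that table and recomputes the summaries with Python's standard `statistics` module. The skewness line is a simple moment estimate added only as a rough indication of asymmetry; it is not the statistic SPSS prints, and it is no substitute for a formal normality test.

```python
import statistics

# Study hour values grouped by how often each occurred (from the frequency table).
values_by_frequency = {
    1: [20, 21, 23, 32, 34, 42, 43, 44, 45, 48, 53, 55, 57, 58, 65, 75],
    2: [24, 25, 26, 27, 31, 36, 60],
    3: [22, 30, 35, 37, 38, 39],
    4: [28, 29, 40],
}

# Expand the table back into one observation per respondent.
hours = sorted(
    value
    for frequency, values in values_by_frequency.items()
    for value in values
    for _ in range(frequency)
)

mean = statistics.mean(hours)
median = statistics.median(hours)
sd = statistics.stdev(hours)                   # sample SD (n - 1 denominator)
value_range = max(hours) - min(hours)
q1, _, q3 = statistics.quantiles(hours, n=4)   # default "exclusive" method
iqr = q3 - q1

# Moment-based skewness: positive values indicate a longer right tail.
m2 = sum((h - mean) ** 2 for h in hours) / len(hours)
m3 = sum((h - mean) ** 3 for h in hours) / len(hours)
skewness = m3 / m2 ** 1.5

print(len(hours), round(mean, 2), median, round(sd, 3), value_range, iqr)
```

The recomputed mean, median, standard deviation, range and interquartile range agree with the values reported for study hour. The positive skewness reflects the few respondents who report very long study hours.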

3. Testing the correlations between the variables study hours, seriousness of the study, statistics test marks
and statistics final mark achievements

• The results are presented in a matrix, so each correlation appears twice. The table presents the Pearson
correlation coefficient, the significance value and the sample size that the calculation is based on.
• For study hour, the Pearson correlation coefficient, r, is .100. This is not statistically significant:
with n = 60, |r| must exceed about .25 to reach p < .05.
• For statT, the Pearson correlation coefficient, r, is .044, which is not statistically significant.
• For statF, the Pearson correlation coefficient, r, is .116, which is not statistically significant.
• For focus, the Pearson correlation coefficient, r, is -.152, which is not statistically significant.
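
One way to check whether a Pearson correlation of a given size is significant with 60 respondents is to convert r to a t statistic with n - 2 degrees of freedom. The sketch below does this for the four coefficients reported above. The critical value 2.0017 (two-tailed, 5%, 58 degrees of freedom) is taken from standard t tables and is an assumption of this sketch, as are the function names.

```python
import math

def t_from_r(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

def critical_r(t_crit, n):
    """Smallest |r| that reaches the given two-tailed critical t value."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)

N = 60
T_CRIT_58 = 2.0017  # assumed table value: two-tailed 5% critical t, 58 df

# Correlation coefficients as reported in the text.
reported = {"study hour": 0.100, "statT": 0.044, "statF": 0.116, "focus": -0.152}

significant = {name: abs(t_from_r(r, N)) > T_CRIT_58 for name, r in reported.items()}
r_needed = critical_r(T_CRIT_58, N)

print(significant)
print(round(r_needed, 3))
```

Under these assumptions none of the four coefficients reaches the 5% level, since each is smaller in absolute value than the threshold of roughly .25.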

4. Testing the dependency of
a. Special friend on race

• When reading this table we are interested in the results of the "Pearson Chi-Square" row. We can see
that χ²(2) = 3.681, p = .159 (a 2 × 3 table has 2 degrees of freedom). This tells us that there is no
statistically significant association between having a special friend and race of respondent; that is,
the proportions answering yes and no are similar among melayu, cina and india respondents.

b. Special friend on accommodation

• When reading this table we are interested in the results of the "Pearson Chi-Square" row. We can see
that χ²(2) = 3.516, p = .172 (a 2 × 3 table has 2 degrees of freedom). This tells us that there is no
statistically significant association between having a special friend and accommodation; that is, the
proportions answering yes and no are similar among respondents living on campus, off campus and squatting
(on and off campus).
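
The two reported p-values can be checked against the chi-square statistics. Both tests cross a two-category variable (special friend: yes, no) with a three-category variable, which gives (2 - 1) × (3 - 1) = 2 degrees of freedom, and for exactly 2 degrees of freedom the upper-tail probability has the closed form exp(-x/2). The sketch below uses that shortcut and also shows how the Pearson statistic is built from a table of counts. The example table is hypothetical, chosen so that the two rows are exactly proportional; it is not the survey cross-tabulation.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def p_value_df2(stat):
    """Upper-tail p-value of a chi-square statistic with exactly 2 df."""
    return math.exp(-stat / 2)

# Statistics reported in the text.
p_race = p_value_df2(3.681)
p_accommodation = p_value_df2(3.516)

# Hypothetical table with proportional rows, so the statistic is zero.
independent_example = chi_square_statistic([[10, 20, 30], [20, 40, 60]])

print(round(p_race, 3), round(p_accommodation, 3), independent_example)
```

Both p-values come out at the reported .159 and .172, which supports reading the tests as having 2 degrees of freedom.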

5. Testing
a) Whether the statistics final marks of the students is equal to 75 marks

• This section of the table shows that the difference between the sample mean and the test value of 75 is
-9.167 ("Mean Difference" column) and that the 95% confidence interval (95% CI) of the difference runs
from -14.19 to -4.14 ("Lower" to "Upper" columns). Because the interval does not contain zero, the mean
statistics final mark differs significantly from 75. For the measures used, it is sufficient to report the
values to 2 decimal places.
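
The confidence interval can be approximately reconstructed from the summary statistics already reported: the mean difference of -9.167, the statF standard deviation of 19.462 and n = 60. The sketch below does so. The critical value 2.0010 (two-tailed, 5%, 59 degrees of freedom) is taken from standard t tables and is an assumption of this sketch.

```python
import math

def one_sample_t_ci(mean_diff, sd, n, t_crit):
    """Return (t statistic, lower limit, upper limit) for a one-sample t-test."""
    se = sd / math.sqrt(n)            # standard error of the mean
    margin = t_crit * se
    return mean_diff / se, mean_diff - margin, mean_diff + margin

T_CRIT_59 = 2.0010  # assumed table value: two-tailed 5% critical t, 59 df

t_stat, lower, upper = one_sample_t_ci(-9.167, 19.462, 60, T_CRIT_59)
print(round(t_stat, 2), round(lower, 2), round(upper, 2))
```

The limits agree with the reported interval to within rounding, and the t statistic lies well beyond the critical value, consistent with a mean final mark below 75.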

b) The hours of study time on different types of accommodation

• This table shows the output of the ANOVA analysis and whether there is a statistically significant
difference between our group means. The significance level is 0.726 (p = .726), which is above 0.05, so
there is no statistically significant difference in mean study hours between the different types of
accommodation. Because the overall test is not significant, the post-hoc tests in the Multiple Comparisons
table are not needed to identify which groups differ.
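
For readers who want to see where the F ratio in the ANOVA table comes from, the following is a minimal sketch of the one-way ANOVA computation. The three groups are small hypothetical numbers chosen for easy arithmetic; they are not the study hours of the survey respondents, which are not listed by accommodation type in this report.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical data for three groups (not the survey data).
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_stat, df_b, df_w)
```

A large F ratio relative to its degrees of freedom gives a small significance value; the p = .726 reported above corresponds to an F ratio close to what chance alone would produce.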

c) The seriousness of the study by special friend status

• When reporting the results of an independent samples t-test, we usually present a table with the sample
sizes, means and standard deviations. Regarding the significance test, we state that "on average, focus on
study did not differ significantly between students with and without a special friend; t(58) = 1.5,
p = .14."
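
The 58 degrees of freedom in this result follow from the group sizes: 24 respondents with a special friend and 36 without give 24 + 36 - 2 = 58. The sketch below shows the pooled-variance form of the independent samples t-test, which is the equal-variances case. The two short lists are hypothetical scores used only to exercise the function, since the group means and standard deviations are not reproduced in this report.

```python
import math
import statistics

def pooled_t(group_a, group_b):
    """Independent samples t-test assuming equal variances. Returns (t, df)."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * statistics.variance(group_a)
                  + (n2 - 1) * statistics.variance(group_b)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / se, df

# Hypothetical focus scores for two small groups (not the survey data).
t_stat, df = pooled_t([1, 2, 3], [2, 3, 4])
print(round(t_stat, 4), df)

# Degrees of freedom for the group sizes reported in this study.
df_study = 24 + 36 - 2
```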

d) The statistics final marks by special friend status

• We can see that the group means are significantly different because the value in the "Sig. (2-tailed)"
row is less than 0.50. Looking at the Group Statistics table, we can see that respondents who have a
special friend had lower statistics final marks than those who do not.

e) The statistics final marks by race category

• We can see that the group means are significantly different because the value in the "Sig. (2-tailed)"
row is less than 0.10. Looking at the Group Statistics table, we can see that cina respondents had lower
statistics final marks than melayu respondents.

f) The difference between trial and final statistics marks

• You might report the statistics in the following format: t(degrees of freedom) = t-value, p =
significance level. In our case this would be t(59) = -.268. The mean mark rose from 64.73 in the trial to
65.83 in the final, but a t-value this small in absolute terms is far below the two-tailed 5% critical
value of about 2.00 for 59 degrees of freedom, so the improvement is not statistically significant.
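
A dependent (paired) samples t-test works on the difference between each respondent's two marks. The sketch below shows the calculation, with differences taken as trial minus final so that an improvement gives a negative t, matching the sign convention of the result above. The four pairs of marks are hypothetical and serve only to exercise the function; the respondents' paired marks are not listed in this report.

```python
import math
import statistics

def paired_t(first, second):
    """Paired samples t-test on first - second. Returns (t, df)."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)   # standard error of the mean difference
    return statistics.mean(diffs) / se, n - 1

# Hypothetical trial and final marks for four students (not the survey data).
trial = [60, 70, 80, 90]
final = [62, 71, 83, 92]
t_stat, df = paired_t(trial, final)
print(round(t_stat, 3), df)
```

With 60 respondents the same calculation has 59 degrees of freedom, as in the reported result.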

6. Testing the statistics mark final achievements that may be contributed by special friend, seriousness of the
study, duration of the study time and statistics trial marks.

Notice that all of the significance levels are < .05, so all of the predictors are significant (we reject
the null hypothesis that they are not associated with the dependent variable).
The unstandardized coefficients can be used to create an equation for Y.
b0 = (Constant), b1 = having special friend, b2 = study hour, b3 = statT
Y = b0 + b1*x1 + b2*x2 + b3*x3
Y = 65.074 + 4.920*x1 + 0.206*x2 - 0.225*x3
For this example, we'll stop here, but to calculate Y you would just plug in the data from an individual
observation for each of these variables.
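
As an illustration of plugging values into the fitted equation, the sketch below evaluates Y for one set of inputs. The coefficients are those given above. The input values are illustrative only: a special friend code of 1 (the coding is assumed, not stated here) together with the sample means of study hour (36.07) and statT (64.73).

```python
# Unstandardized coefficients from the regression equation above.
B0 = 65.074   # constant
B1 = 4.920    # having special friend
B2 = 0.206    # study hour
B3 = -0.225   # statT (trial mark)

def predict_final_mark(special_friend, study_hour, stat_t):
    """Predicted statistics final mark from the fitted equation."""
    return B0 + B1 * special_friend + B2 * study_hour + B3 * stat_t

# Illustrative inputs: assumed code 1 for special friend, sample means otherwise.
predicted = predict_final_mark(1, 36.07, 64.73)
print(round(predicted, 2))
```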

CONCLUSIONS

In this study, 60 questionnaires were given to students on one campus. The frequency tables and the
descriptive statistics list the number of respondents associated with each item in the questionnaire.
The questionnaire collected seven items of data: race of the student, accommodation, whether the student
has a special girlfriend or boyfriend, study hours, study seriousness, and the statistics marks for the
trial and the final. These data were recorded on each questionnaire given to students all around campus.
The descriptive table shows the basic summary statistics for study hour and for the trial and final
statistics marks. The statistics include the mean, standard deviation, variance and skewness. The
stem-and-leaf plots show how study hours relate to the test and final marks. The test and final marks
show similar values in their stem-and-leaf plots compared with study hour.
The cross-tabulation of respondents shows that students, regardless of race, prefer not to have a special
girlfriend or boyfriend. The chi-square test gives a Pearson chi-square value of χ² = 3.681 with
p = 0.159. This tells us that there is no statistically significant association between having a special
friend (boyfriend or girlfriend) and race. All three races, melayu, cina and india, prefer not to have a
special friend.
The ANOVA table shows a significance value of 0.726. With α = 0.05, the significance value is higher than
the alpha value. Thus, the type of accommodation does not affect students' study time.
In conclusion, race of the student, accommodation, whether the student has a special girlfriend or
boyfriend, study hours and study seriousness do not affect students' trial and final marks.
