Covenant Journal of Informatics and Communication Technology (CJICT) Vol. 4, No. 1, June, 2016

Statistical Analysis on Students’ Performance

Elepo, Tayo Afusat¹ & Balogun, Oluwafemi Samson*²

¹Department of Statistics, Kwara State Polytechnic, Ilorin, Kwara State, Nigeria
¹elepotayo1@yahoo.com
*²Department of Statistics and Operations Research, Modibbo Adama University of Technology, P.M.B. 2076, Yola, Adamawa State, Nigeria
*²balogun.os@mautech.edu.ng
Abstract: This research uses Cohen's Kappa to examine the performance of students in the Faculty of Science, University of Ilorin. The data were collected from eight departments in the faculty and cover students' performance, measured by their Grade Point Average (GPA) and Cumulative Grade Point Average (CGPA) in their first and final years, for the 2000-2006 academic sessions. It is of interest to determine, using a psychometric approach, the proportion of students that improved on their performance, the proportion that dropped from the class of grade point with which they started, and the proportion that maintained their performance. The strength of agreement between the first-year and final-year results is also examined.

Keywords: Cohen's Kappa, Intra-class Kappa, Agreement, Raters.
1. Introduction
Education, in a broad sense, is the process of exposing individuals to concepts and activities which physically, mentally, morally and spiritually help to equip them with knowledge of the things around them. Education also exposes the individual to further knowledge by means of books, the mass media and academic institutions.

From the foregoing, thousands of people apply to the Nigerian universities, the peak of the tertiary institutions, every year. The search for knowledge, and the little recognition given to certificates of the lower tertiary institutions in the Nigerian labor market, have turned the university into an over-crowded community, with many still outside, eager to add to the congestion. In order to bring about fair play and to exercise justice in the admission of candidates into the universities, the National Universities Commission (NUC) was set up to look into the affairs of the universities. The commission was established in 1978 and has since embarked on policies to ensure that
admission processes are conducted without duplication of admissions.

The information used for this research was obtained by transcription from records and is therefore secondary data. Data were collected from eight departments in the Faculty of Science: Biochemistry, Chemistry, Physics, Geology, Computer Science, Mathematics, Statistics and Microbiology. The data include students' performance, measured by their GPA and CGPA as appropriate, in both their first and final years.

The aim of this research is to use Cohen's Kappa to study students' performance in some departments of the Faculty of Science, University of Ilorin. The objectives are: to determine the strength of agreement that exists between students' grade points in their first and final years; to determine the proportion of students that maintained their CGPA (i.e. those that maintained what they started with); to determine the proportion of students that dropped from the class of grade point they started with; and to determine the proportion of students that improved on their performance.

In a study conducted by Akinrefon & Balogun (2014), a control chart was used to monitor students' performance. The causes underlying charting statistics falling below the lower control limit, which indicate a negative shift in students' CGPA, were identified, as were the reasons for charting statistics falling above the upper control limit, which indicate a positive shift in students' CGPA; a solution to correct students' poor performance was then suggested. If the charting statistics for all semesters fall within the control limits, the student has maintained the desired target GPA value.

According to Balogun et al. (2014), the main focus of their research was to develop models that can be used to study the trend of graduate emigration in Nigeria using log-linear modelling. Based on the likelihood ratio (G²) and the Akaike Information Criterion (AIC), the saturated model gave a perfect fit for modelling graduate emigration in Nigeria; this implies that all three factors involved (discipline, year and sex) have to be included in the model in order to obtain an appropriate result.

Since the introduction of the Kappa statistic, several authors have applied the concept in different fields. For instance, Khan et al. (2015) carried out an initial audit evaluating the case notes of each team against the TONK score; in order to evaluate the reproducibility of this score, Cohen's kappa was used, and substantial agreement was noted. Viera & Garrett (2015) provided a basic overview of the Kappa statistic as one measure of inter-observer agreement. They concluded that "Kappa is affected by prevalence but nonetheless kappa can provide more information than a simple
calculation of the raw proportion of agreement".

Gwet (2002) established that the unpredictable behavior of the Pi and Kappa statistics is due to a wrong method of computing the chance-agreement probability. Warrens (2015) reviewed five ways to look at Cohen's kappa; the five approaches illustrate the diversity of interpretations available to researchers who use kappa. Tang et al. (2015) used Cohen's kappa coefficient to assess between-rater agreement, noting its desirable property of correcting for chance agreement. They concluded that, despite its limitations, the kappa coefficient is an informative measure of agreement in most circumstances and is widely used in clinical research.

1.1 Categorical Response Data
A categorical variable is one whose measurement scale consists of a set of categories. For instance, political philosophy may be measured as "liberal", "moderate" or "conservative"; smoking status might be measured using the categories "never smoked", "former smoker" and "current smoker". Although categorical scales are most common in the social and biomedical sciences, they occur frequently in the behavioral sciences, public health, ecology, education and marketing. They even occur in highly quantitative fields such as engineering science and industrial quality control.

1.2 Categorical Data Analysis
These are data consisting of a classification of behaviors or subjects into a number of mutually exclusive and exhaustive categories. Multivariate qualitative data are data in which each individual is described by a number of attributes. All individuals with the same description are enumerated, and the count is entered into a cell of the resulting contingency table.

1.3 Contingency Tables
A multidimensional table in which each dimension is specified by a discrete variable, or by a grouped (range) continuous variable, gives a basic summary of multivariate discrete and grouped continuous data. If the cells of the table contain the numbers of observations at the corresponding values of the discrete variables, the table is called a contingency table. The discrete or grouped continuous variables used to classify the table are known as factors; examples include sex (male or female) and religion (Christianity, Islam, Traditional, etc.). The main types of contingency table are:
One-dimensional (1×J) tables
Two-dimensional (I×J) tables
Square (I×I) tables
Multidimensional tables
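To make the structure concrete, the following is a minimal Python sketch (the grade-class labels and student records are hypothetical illustrations, not the study data) of cross-classifying first-year against final-year grade classes into a two-dimensional contingency table:

```python
# Minimal sketch: build an I x J contingency table from hypothetical records.
import pandas as pd

# Hypothetical records: first-year and final-year grade class per student.
records = pd.DataFrame({
    "first_year": ["Third", "Second Lower", "Second Lower", "First", "Pass"],
    "final_year": ["Third", "Second Upper", "Second Lower", "First", "Third"],
})

# Rows: first-year class (factor 1); columns: final-year class (factor 2).
table = pd.crosstab(records["first_year"], records["final_year"])
print(table)
```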
1.4 Measures of Agreement
Agreement is a special case of association which reflects the extent to which observers classify a given subject identically into the same category.
In order to assess the psychometric integrity of different ratings, inter-rater agreement is computed. Inter-rater reliability coefficients reveal the similarity or consistency of the pattern of responses, or the rank-ordering of responses, between two or more raters (or two or more rating sources), independent of the level or magnitude of those ratings. For example, consider Table 1, which shows the ratings of three subjects by three raters. One observes from Table 1 that all the raters were consistent in their ratings: rater 2 maintained the leading ratings, followed by rater 1 and rater 3 respectively.

Inter-rater agreement, on the other hand, measures whether ratings are similar in level or magnitude. It pertains to the extent to which the raters classify a given subject identically into the same category. Kozlowski & Hattrup (1992) noted that an inter-rater agreement index is designed to "reference the interchangeability among raters: it addresses the extent to which raters make essentially the same ratings". Thus, theoretically, obtaining high levels of agreement should be more difficult than obtaining high levels of reliability or consistency.

2. Materials and Method
2.1 Kappa Statistics
There is wide disagreement about the usefulness of the Kappa statistic for assessing rater agreement. At the least, it can be said that kappa statistics should not be viewed as the unequivocal standard or default way to quantify agreement; one should be cautious about using a statistic that is the source of so much controversy, and should consider alternatives and make an informed choice.

One can distinguish between two possible uses of kappa: as a way to test rater independence (that is, as a test statistic), and as a way to quantify the level of agreement (that is, as an effect-size measure).

2.2 Cohen's Kappa Coefficient
Cohen's kappa is one of the most commonly used statistics for assessing nominal agreement between two raters (Warrens, 2010; 2011). Cohen (1960) proposed a standardized coefficient of raw agreement for nominal scales in terms of the proportion of subjects classified into the same category by the two observers. However, the idea of an agreement measure was anticipated before 1960; for example, decades earlier Corrado Gini had already considered measures for assessing agreement on a nominal scale (Warrens, 2013).
The observed proportion of agreement is estimated as

$\pi_o = \sum_{i=1}^{I} \pi_{ii}$  (1)

and, under the baseline constraint of complete independence between the ratings of the two observers, the expected chance-agreement proportion is estimated as

$\pi_e = \sum_{i=1}^{I} \pi_{i\cdot}\,\pi_{\cdot i}$  (2)

The kappa statistic can then be written as

$K_c = \dfrac{\pi_o - \pi_e}{1 - \pi_e}$  (3)

where $\pi_o$ and $\pi_e$ are as defined above.
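As an illustration of equations (1)-(3), the following is a minimal Python sketch using a hypothetical 3×3 table of counts (not the study data):

```python
# Sketch of equations (1)-(3): observed agreement, chance agreement and
# Cohen's kappa from a square I x I table of counts (hypothetical numbers).
import numpy as np

counts = np.array([[20,  5,  0],
                   [ 4, 30,  6],
                   [ 1,  3, 11]])

n = counts.sum()
p = counts / n                      # cell proportions pi_ij
pi_o = np.trace(p)                  # eq. (1): sum of diagonal proportions
pi_e = (p.sum(axis=1) * p.sum(axis=0)).sum()   # eq. (2): sum of pi_i. * pi_.i
kappa = (pi_o - pi_e) / (1 - pi_e)  # eq. (3)
print(round(pi_o, 4), round(pi_e, 4), round(kappa, 4))
```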
Landis & Koch (1977a) characterized different ranges of kappa values with respect to the degree of agreement they suggest. Although these original suggestions were admitted to be "clearly arbitrary", they have become incorporated into the literature as standards for the interpretation of kappa values. For most purposes, values greater than about 0.75 may be taken to represent excellent agreement beyond chance, values below about 0.40 may be taken to represent poor agreement beyond chance, and values between 0.40 and 0.75 may be taken to represent fair to good agreement beyond chance; this is shown in Table 2.

Bias of one rater relative to another refers to discrepancies between the raters' marginal distributions. Bias decreases as the marginal distributions become more nearly equivalent. The effect of rater bias on kappa has been investigated by Feinstein & Cicchetti (1990) and Byrt et al. (1993).

Early approaches to this problem focused on the observed proportion of agreement; see Goodman & Kruskal (1954). This, however, amounts to assuming that chance agreement can be ignored. Later, Cohen's kappa was introduced for measuring nominal-scale chance-corrected agreement. Scott (1955) defined $\pi_e$ under the assumption that the distribution of proportions over the $I$ categories in the population is known and is equal for the two raters. Therefore, if the two raters are interchangeable, in the sense that the marginal distributions are identical, then Cohen's and Scott's measures are equivalent, because Cohen's kappa is an extension of Scott's index of chance-corrected measures.

To determine whether $K$ differs significantly from zero, one could use the asymptotic variance formulae given by Fleiss et al. (1969) for the general $I \times I$ table. For large $n$, Fleiss' formula is practically equivalent to the exact variance derived by Everitt (1968) based on the central hypergeometric distribution. Under the hypothesis of only chance agreement, the estimated large-sample variance of $K$ is given by

$\mathrm{Var}_0(K) = \dfrac{\pi_e + \pi_e^{2} - \sum_{i=1}^{I} \pi_{i\cdot}\,\pi_{\cdot i}\,(\pi_{i\cdot} + \pi_{\cdot i})}{n\,(1 - \pi_e)^{2}}$  (4)

Assuming that

$Z = \dfrac{K}{\sqrt{\mathrm{Var}_0(K)}}$  (5)

follows a standard normal distribution, one can test the hypothesis of chance agreement with reference to the standard normal distribution.
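Continuing the same hypothetical table, equations (4) and (5) can be computed as sketched below (scipy is assumed to be available for the normal tail probability):

```python
# Sketch of equations (4)-(5): large-sample variance of kappa under the
# hypothesis of chance agreement, and the corresponding z statistic.
import numpy as np
from scipy.stats import norm

counts = np.array([[20,  5,  0],
                   [ 4, 30,  6],
                   [ 1,  3, 11]])
n = counts.sum()
p = counts / n
row, col = p.sum(axis=1), p.sum(axis=0)

pi_o = np.trace(p)
pi_e = (row * col).sum()
kappa = (pi_o - pi_e) / (1 - pi_e)

# Eq. (4): [pi_e + pi_e^2 - sum_i pi_i. pi_.i (pi_i. + pi_.i)] / [n (1 - pi_e)^2]
var0 = (pi_e + pi_e**2 - (row * col * (row + col)).sum()) / (n * (1 - pi_e) ** 2)

# Eq. (5): z = K / sqrt(Var0(K)), referred to the standard normal distribution.
z = kappa / np.sqrt(var0)
p_value = 2 * (1 - norm.cdf(abs(z)))
print(round(kappa, 4), round(var0, 6), round(z, 3), round(p_value, 4))
```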
2.3 Intraclass Kappa
The intraclass kappa was defined for data consisting of blind dichotomous ratings on each of $n$ subjects by two fixed raters. It is assumed that the ratings on a subject are interchangeable; that is, in the population of subjects the two ratings for each subject have a distribution that is invariant under permutations of the raters, which ensures that there is no rater bias (Scott, 1955; Bloch & Kraemer, 1988; Donner & Eliasziw, 1992; Banerjee et al., 1999). The intraclass kappa is estimated as

$K_i = \dfrac{\pi_o - \pi_e}{1 - \pi_e}$  (6)

where

$\pi_o = \sum_{i} \pi_{ii}$  and  $\pi_e = \sum_{i} \left( \dfrac{\pi_{i\cdot} + \pi_{\cdot i}}{2} \right)^{2}$,

that is, the chance agreement is computed from the row and column marginal proportions averaged over the two raters.
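A minimal sketch of equation (6) for the same hypothetical table, with the chance agreement computed from the averaged marginals as described above:

```python
# Sketch of equation (6): intraclass kappa, with chance agreement based on
# the averaged row/column marginals (hypothetical counts, not the study data).
import numpy as np

counts = np.array([[20,  5,  0],
                   [ 4, 30,  6],
                   [ 1,  3, 11]])
p = counts / counts.sum()

pi_o = np.trace(p)                               # observed agreement
pooled = (p.sum(axis=1) + p.sum(axis=0)) / 2     # (pi_i. + pi_.i) / 2
pi_e = (pooled ** 2).sum()                       # chance agreement
kappa_intraclass = (pi_o - pi_e) / (1 - pi_e)
print(round(kappa_intraclass, 4))
```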
Furthermore, we obtain the proportion of students who maintained the performance or grade class they started with, the proportion who improved on their performance, and the proportion who dropped from the class of grade point they started with. Let
P1 = the proportion of those that maintained what they started with, that is, the cells on the diagonal of the table;
P2 = the proportion of those that improved on their performance, that is, the cells below the diagonal;
P3 = the proportion of those that dropped from the class of grade point they started with, that is, the cells above the diagonal.

3. Data Analysis
This section presents the Cohen's kappa and intra-class kappa statistics and the proportions of students who maintained, dropped and improved on their performance, as shown in Tables 3 and 4 below. The calculation below gives the overall percentages of students who maintained, improved and dropped in the CGPA they started with; the numerators are the sums of the departmental proportions P1, P2 and P3 in Table 4, and the denominator (7.9998 ≈ 8) is the sum of those proportions over the eight departments:

P1 = (4.3538 / 7.9998) × 100 = 54.42%
P2 = (2.7513 / 7.9998) × 100 = 34.39%
P3 = (0.8947 / 7.9998) × 100 = 11.18%
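A minimal sketch of how P1, P2 and P3 can be read off a square table (hypothetical counts; the rows and columns are assumed to be ordered so that, as in the text, cells below the diagonal correspond to improvement and cells above it to a drop):

```python
# Sketch: proportions of students who maintained (diagonal), improved
# (below the diagonal) and dropped (above the diagonal), following the
# convention used in the text (hypothetical counts, not the study data).
import numpy as np

counts = np.array([[20,  5,  0],
                   [ 4, 30,  6],
                   [ 1,  3, 11]])
p = counts / counts.sum()

p1 = np.trace(p)                # maintained: diagonal cells
p2 = np.tril(p, k=-1).sum()     # improved: cells strictly below the diagonal
p3 = np.triu(p, k=1).sum()      # dropped: cells strictly above the diagonal
print(round(p1, 4), round(p2, 4), round(p3, 4), round(p1 + p2 + p3, 4))
```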
4. Discussion, Summary and Conclusion
4.1 Discussion
For the Physics Department, 55.56% of the students were able to maintain their grade point, 35.56% of the students improved, while 8.89% dropped from the class of grade point they started with.
For the Statistics Department, 74.07% of the students were able to maintain their grade point, 18.52% of the students improved, while 7.40% dropped from the class of grade point they started with.
For the Microbiology Department, 48.98% of the students were able to maintain their grade point, 44.89% of the students improved, while 6.12% dropped from the class of grade point they started with.
For the Mathematics Department, 36.00% of the students were able to maintain their grade point, 64.00% of the students improved, while none dropped from the class of grade point they started with.
For the Geology Department, 46.15% of the students were able to maintain their grade point, 49.23% of the students improved, while 4.62% dropped from the class of grade point they started with.
For the Computer Science Department, 51.49% of the students were able to maintain their grade point, 1.79% of the students improved, while 46.71% dropped from the class of grade point they started with.
For the Biochemistry Department, 60.17% of the students were able to maintain their grade point, 29.66% of the students improved, while 10.17% dropped from the class of grade point they started with.
For the Chemistry Department, 62.96% of the students were able to maintain their grade point, 31.48% of the students improved, while 5.56% dropped from the class of grade point they started with.

4.2 Summary
From the above, 54.42% of the students were able to maintain the CGPA class they started with, 34.39% of the students improved, and 11.18% of the students dropped from the class of grade point they started with. Also, the strength of agreement between the first-year and final-year results is about 0.40.

4.3 Conclusion
It can be observed that the Mathematics department had the highest proportion of students that improved on their performance, the Statistics department had the highest proportion of students that maintained their grade point, and the Computer Science department had the highest proportion of students that dropped from the grade point they started with. Also, the strength of agreement between the first-year and final-year results is, on average, "fair".

References
Akinrefon, A.A. and Balogun, O.S. (2014). Use of Shewhart control chart technique in monitoring student performance. Bulgarian Journal of Science and Education Policy, 8(2), 311-324.
Balogun, O.S., Bright, D.E., Akinrefon, A.A. and Abdulkadir, S.S. (2014). Modeling graduate emigration in Nigeria using log-linear approach. Bulgarian Journal of Science and Education Policy, 8(2), 375-391.
Banerjee, M., Capozzoli, M., McSweeney, L. and Sinha, D. (1999). Beyond kappa: A review of interrater agreement measures. The Canadian Journal of Statistics, 27(1), 3-23.
Bloch, D.A. and Kraemer, H.C. (1988). 2×2 kappa coefficients: Measures of agreement or association. Biometrics, 45, 269-287.
Byrt, T., Bishop, J. and Carlin, J.B. (1993). Bias, prevalence and kappa. Journal of Clinical Epidemiology, 46, 423-429.
Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46.
Donner, A. and Eliasziw, M. (1992). A goodness-of-fit approach to inference procedures for the kappa statistic: confidence interval construction, significance-testing and sample size estimation. Statistics in Medicine, 11, 1511-1519.
Everitt, B.S. (1968). Moments of the statistics kappa and weighted kappa. British Journal of Mathematical and Statistical Psychology, 21, 97-103.
Feinstein, A.R. and Cicchetti, D.V. (1990). High agreement but low kappa: I. The problems of two paradoxes. Journal of Clinical Epidemiology, 43, 543-548.
Fleiss, J.L., Cohen, J. and Everitt, B.S. (1969). Large sample standard errors of kappa and weighted kappa. Psychological Bulletin, 72, 323-327.
Goodman, L.A. and Kruskal, W.H. (1954). Measures of association for cross classifications. Journal of the American Statistical Association, 49, 732-768.
Gwet, K. (2002). Kappa statistic is not satisfactory for assessing the extent of agreement between raters. Statistical Methods for Inter-Rater Reliability Assessment Series, No. 1, 1-5.
Khan, Z., Sayers, A.E., Khattak, M.U. and Chamber, I.R. (2015). The TONK score: a tool for assessing quality in trauma and orthopaedic note-keeping. SICOT-J, 1(29), 1-4.
Kozlowski, S.W.J. and Hattrup, K. (1992). A disagreement about within-group agreement: Disentangling issues of consistency versus consensus. Journal of Applied Psychology, 77(2), 161-167.
Landis, J.R. and Koch, G.G. (1977a). The measurement of observer agreement for categorical data. Biometrics, 33, 159-174.
Scott, W.A. (1955). Reliability of content analysis: The case of nominal scale coding. Public Opinion Quarterly, 19, 321-325.
Tang, W., Hu, J., Zhang, H., Wu, P. and He, H. (2015). Kappa coefficient: a popular measure of rater agreement. Shanghai Archives of Psychiatry, 27(1), 62-67.
Viera, A.J. and Garrett, J.M. (2015). Understanding interobserver agreement: The kappa statistic. Family Medicine, 37(5), 360-363.
Warrens, M.J. (2010). Inequalities between kappa and kappa-like statistics for k×k tables. Psychometrika, 75, 176-185.
Warrens, M.J. (2011). Cohen's kappa is a weighted average. Statistical Methodology, 8, 473-484.
Warrens, M.J. (2013). A comparison of Cohen's kappa and agreement coefficients by Corrado Gini. International Journal of Research and Reviews in Applied Sciences, 16, 345-351.
Warrens, M.J. (2015). Five ways to look at Cohen's kappa. Journal of Psychology & Psychotherapy, 5(4), 1-4.

Appendix
Table 1: Example of Raters
Subject Rater 1 Rater 2 Rater 3
1 5 6 2
2 3 4 2
3 1 2 1

Table 2: The Range of the Kappa Statistic with the Respective Strength of Agreement
Kappa statistic Strength of Agreement
<0.00 Poor
0.00-0.20 Slight
0.21-0.40 Fair
0.41-0.60 Moderate
0.61-0.80 Substantial
0.81-1.00 Almost perfect
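The bands in Table 2 can be expressed as a small helper function; the sketch below implements the two-decimal ranges of the table with inclusive upper bounds:

```python
# Sketch: map a kappa value to the strength-of-agreement label of Table 2
# (Landis & Koch, 1977a).
def strength_of_agreement(kappa: float) -> str:
    if kappa < 0.00:
        return "Poor"
    if kappa <= 0.20:
        return "Slight"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Substantial"
    return "Almost perfect"

print(strength_of_agreement(0.6291))   # e.g. 0.6291 -> "Substantial"
```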

Table 3: Cohen's and Intra-Class Kappa Estimates
S/No Department Cohen's Kappa Intra-class Kappa
1 Physics 0.3410 0.3280
2 Statistics 0.6291 0.6279
3 Microbiology 0.2409 0.2214
4 Mathematics 0.0315 0.0350
5 Computer Science 0.2861 0.2615
6 Geology 0.1955 0.1710
7 Biochemistry 0.4169 0.6017
8 Chemistry 0.3865 0.3822

Table 4: The Proportion of Students that Improved, Maintained and Dropped
S/No Department P1 P2 P3 Sum
1 Physics 0.5556 0.3556 0.0889 1.0001 ≈ 1
2 Statistics 0.7407 0.1852 0.0740 0.9999 ≈ 1
3 Microbiology 0.4898 0.4489 0.0612 0.9999 ≈ 1
4 Mathematics 0.3600 0.6400 0.0000 1.0000
5 Computer Science 0.5149 0.0179 0.4671 0.9999 ≈ 1
6 Geology 0.4615 0.4923 0.0462 0.9999 ≈ 1
7 Biochemistry 0.6017 0.2966 0.1017 0.9999 ≈ 1
8 Chemistry 0.6296 0.3148 0.0556 0.9999 ≈ 1

Table 5: The Strength of Agreement for each Department
S/No Department Strength of Agreement
1 Physics Fair
2 Statistics Substantial
3 Microbiology Fair
4 Mathematics Slight
5 Computer Science Fair
6 Geology Slight
7 Biochemistry Moderate
8 Chemistry Fair
