
International Journal of Computer Science Engineering and Information Technology Research (IJCSEITR), ISSN 2249-6831, Vol. 2, Issue 4, Dec 2012, 55-84, TJPRC Pvt. Ltd.

ACADEMIC PERFORMANCE EVALUATION USING FUZZY C-MEANS


RAMJEET SINGH YADAV1 & P. AHMED2

1 Research Scholar, Department of Computer Science and Engineering, School of Engineering and Technology, Sharda University, Greater Noida, UP, India

2 Professor, Department of Computer Science and Engineering, School of Engineering and Technology, Sharda University, Greater Noida, UP, India

ABSTRACT
In this paper we explore the applicability of the K-Means and Fuzzy C-Means clustering algorithms to the student allocation problem, which allocates new students to homogeneous groups of a specified maximum capacity, and we analyze the effects of such allocations on the academic performance of students. The paper also presents a Dynamic Fuzzy Expert System model, based on fuzzy sets and regression analysis, that is capable of dealing with the imprecision and missing data commonly inherent in student academic performance evaluation. This model automatically converts crisp sets into fuzzy sets by using the Fuzzy C-Means clustering algorithm. The comparative performance analysis indicates that the student groups formed by the Fuzzy C-Means clustering algorithm performed better than the groups formed by the K-Means and Hard C-Means clustering algorithms.

KEYWORDS: Fuzzy Logic, Clustering, K-Means Algorithm, Hard C-Means Algorithm, Fuzzy C-Means Algorithm, Fuzzy Expert Systems, Membership Function, Academic Performance Evaluation

INTRODUCTION
The student academic performance evaluation problem can be considered a clustering problem in which clusters (or classes) are formed on the basis of the intelligence level of students, and the class size must not exceed a predefined capacity. Grouping by intelligence level is essential for maintaining the homogeneity of a group; otherwise it is difficult to provide good educational services to a highly diverse student population. Moreover, homogeneous grouping of students with similar ranking (or some other measure) into classes makes academic performance results fairer, more realistic and more comparable. The existing practice of determining student similarity or rank by score aggregation is unrealistic because the same aggregate can be assembled from different score combinations. Universities use the GPA (Grade Point Average), an example of a score-aggregation measure, as a major criterion for student selection. Most universities consider a GPA of 3.0 and above an indicator of good academic performance; hence it remains the most common factor used by academic planners to evaluate progression in an academic environment (S. S. Sansgiry, et al., 2006), despite its limitations in providing a comprehensive view of student performance and in discovering important details from continuous performance assessments (O.J. Oyelade, et al., 2010).

Furthermore, an average score may lead to a wrong conclusion, especially when the details of the data from which it is computed are not given. It has been observed that factors other than academic ones pose barriers to students attaining and maintaining a high GPA. Therefore, grouping or clustering students into different categories using cognitive as well as affective factors, and then defining a performance measure, may be a realistic approach. For example, consider a scenario in which two students score 50, 60, 70 and 70, 60, 50, respectively, in three tests. The average mark obtained by each is 60. Can we conclude from the average that the intelligence level of both students is the same? Of course not. The data indicate that one student is improving while the other is deteriorating consistently; it may imply that one student is learning consistently from experience. The example illustrates that a student ranking or academic performance evaluation method should be based on class homogeneity, a viewpoint supported by other researchers (Z. Zukhri, et al., 2008).

In addition to such computational issues, imprecision and vagueness in the data collection process also affect the evaluation of performance indicators. Unfortunately, this aspect is ignored in practice because hard-computing processes, procedures and techniques are generally used in performance evaluation. Soft computing techniques are observed to be more powerful and better suited to providing feasible solutions to problems that involve uncertainty and vagueness. For instance, fuzzy logic handles imprecision and uncertainty in a natural manner by providing a human-oriented knowledge representation, but it is weak in self-learning and generalization of rules. A combination of fuzzy logic and genetic algorithms is expected to eliminate this weakness, and the power of such hybrids is now being investigated. In their recent work, Mankad K. et al. (2011) reported an evolving rule-based model for the identification of multiple intelligence; their genetic-fuzzy hybrid model identifies human intelligence. Zainudin Zukhri and Khairuddin Omar (2008) reported a successful application of a Genetic Algorithm to difficult optimization problems in the new student allocation problem. Vuda Sreenivasarao et al. (2012) developed a model for improving the academic performance evaluation of students based on data warehousing and data mining techniques that use soft computing intensively. Their analysis indicates that group homogeneity improves students' academic performance and thereby enhances education quality. An Artificial Neural Network (ANN) model reported in Obinity Afolayan Ayodele et al. (2010), along with computation, also derives meaning from imprecise data, extracts patterns and detects trends. This ability has added new dimensions to comprehending the complex phenomena buried in student data that might otherwise have gone unnoticed with hard computing techniques.

In practice, whether for phenomena discovery or performance indicator computation, accuracy depends on data quality, which in turn depends on the accuracy of the data collection process and the representation techniques. To address data-related issues in the education domain, Biswas (1995) suggested the use of fuzzy sets (Zadeh, 1965) in the evaluation of students' answer sheets. Wang H.Y. and Chen S.M. (2007) recommended the use of vague sets (Gau and Buehrer, 1993) instead of fuzzy sets to represent the vague mark of each question, where the evaluator can use vague values to indicate the degree of the evaluator's satisfaction for each question. In fuzzy sets, membership evaluation (the definition of the characteristic function) is a major issue. To apply fuzzy sets effectively in the education domain, there have been many efforts to define effective membership functions. Bai S.M. and Chen S.M. (2008) defined fuzzy membership functions for fuzzy rules; Law C.K. (1996) used fuzzy numbers. For more information on this issue consult: Chen S.M. and Lee C.H. (1999), Wang H.Y. and Chen S.M. (2006), Stathacopoulou R., et al. (2004), Guh Y.Y., et al. (2008), Gokmen E., et al. (2010), Hameed I.A. (2011), Baylari A. and Montazer Gh. A. (2009), Posey C.L. and Hawkes L.W. (1996), Stathacopoulou R., et al. (2007), Bhatt R. and Bhatt D. (2011), and Zhou D. and Ma J. (2000).

The research cited above indicates that fuzzy logic, neural networks and fuzzy neural networks have already been employed in student modeling systems, but very little has been said about the automatic generation of fuzzy membership functions. This paper describes a method for the automatic generation of membership functions for student academic performance evaluation, for which we use the Fuzzy C-Means clustering algorithm. To obtain homogeneous clusters (or classes) of students, we have studied the performance of the Fuzzy C-Means and K-Means clustering algorithms for student population clustering, and for both cases we have developed student academic performance evaluation models. The proposed Dynamic Fuzzy Expert System automatically converts crisp data into fuzzy sets and also calculates the total mark of a student who sat the semester-1 and semester-2 examinations.

The proposed idea is an initial attempt to apply the Fuzzy C-Means clustering algorithm to modeling academic performance and to improving the quality of student and teacher performance in educational domains. The Fuzzy C-Means clustering algorithm is a technique used in data warehousing and data mining, and for this reason it is well suited to improving the quality of education. Management can use the knowledge extracted in this way to improve course outcomes. Such knowledge can give faculty and managerial decision makers a good understanding of student enrollment patterns in the course under study, so that they can take the steps needed, such as providing extra classes. With such knowledge, management can also enhance its policies, improve its strategies and improve the quality of the system.

The paper has nine sections, including this introduction. The next section gives a survey of fuzzy approaches in academic performance evaluation. Section three describes data cluster analysis for academic performance evaluation. Section four describes expert systems and their components. Section five describes the architecture of the proposed Dynamic Fuzzy Expert System (DEFS). Section six describes the experimental results of the K-Means clustering technique. In section seven, we present the experimental results of DEFS. Section eight compares the classical, Fuzzy Expert System, K-Means and Fuzzy C-Means clustering methods for modeling academic performance evaluation. We conclude the paper with section nine.

SURVEY OF FUZZY APPROACHES IN ACADEMIC PERFORMANCE EVALUATION


While fuzzy logic techniques have earned their place in a variety of fields, ranging from engineering to the financial sector to medicine, few efforts have been made to test the potential usefulness of these methods in modeling academic performance evaluation. This section surveys past and current research applications of fuzzy logic to the academic achievement of students and teachers, prediction models, and academic performance evaluation.

A. Modeling Academic Performance Evaluation Using Soft Computing Techniques: A Fuzzy Logic Approach

Ramjeet Singh Yadav et al. (2011) presented a method for modeling academic performance evaluation using fuzzy logic. Academic performance evaluation with their fuzzy expert system comprises three steps:
1. Fuzzification of the inputs (semester examination results) and the output (performance value).
2. Determination of the application rules and the inference method.
3. Defuzzification of the performance value.

Fuzzification of Semester Examination Results and Performance Value

Fuzzification of the semester examination results was carried out using input variables and the membership functions of their fuzzy sets. Each student has two semester results, both of which form input variables of the fuzzy logic based expert system. Each input variable has five triangular membership functions. The fuzzy sets of the input and output variables are given in Table-1 and Table-2 respectively.
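The triangular fuzzification described above can be sketched in Python as follows. This is a minimal illustration rather than the implementation used by Yadav et al.: it assumes that each triple (a, b, c) in Table-1 denotes the left foot, peak and right foot of a triangle, and the helper names `triangular` and `fuzzify` are hypothetical.

```python
# Fuzzy sets of the input variable from Table-1, read as
# (left foot, peak, right foot). This reading is an assumption.
INPUT_SETS = {
    "VL": (0, 0, 25),
    "L": (0, 25, 50),
    "A": (25, 50, 75),
    "H": (50, 75, 100),
    "VH": (75, 100, 100),
}


def triangular(x, a, b, c):
    """Membership degree of x in the triangular fuzzy set (a, b, c)."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)  # rising edge
    return (c - x) / (c - b)      # falling edge


def fuzzify(score):
    """Map a crisp score in [0, 100] to its non-zero membership degrees."""
    degrees = {name: triangular(score, *abc) for name, abc in INPUT_SETS.items()}
    return {name: mu for name, mu in degrees.items() if mu > 0}


# A score of 40 belongs partly to Low and partly to Average.
print(fuzzify(40))
```

Under this reading, neighbouring sets overlap so that the membership degrees of any score sum to one, which is why a single mark can activate two linguistic values at once.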


Table 1: Fuzzy Set of Input Variable

Linguistic Variable    Interval
Very Low (VL)          (0, 0, 25)
Low (L)                (0, 25, 50)
Average (A)            (25, 50, 75)
High (H)               (50, 75, 100)
Very High (VH)         (75, 100, 100)

Table 2: Fuzzy Set of Output Variable

Linguistic Variable    Interval
Very Low (VL)          (0, 0, 25)
Low (L)                (0, 25, 50)
Average (A)            (25, 50, 75)
High (H)               (50, 75, 100)
Very High (VH)         (75, 100, 100)

Experimental Results

The Fuzzy Expert System proposed by Ramjeet Singh Yadav et al. (2011) was tested with the marks of 20 students of the Department of Computer Science and Applications, MG Kashi Vidyapith, Varanasi, UP, India, who appeared in the semester-1 and semester-2 examinations. For each student, both semester examination scores were fuzzified by means of the triangular membership functions. Active membership functions were calculated according to the rule table, using the Mamdani fuzzy decision technique. The output (performance value) was calculated and then defuzzified by calculating the centre (centroid) of the resulting geometrical shape. This sequence was repeated using the semester examination scores of each student. Table-3 shows the semester scores and the calculated performance values.

Table 3: Semester Score and Calculated Performance Value

                                    Performance Value
S.No.   Semester-1   Semester-2     Fuzzy-1   Fuzzy-2
1.      40           65             0.530     0.627
2.      20           35             0.243     0.243
3.      50           65             0.654     0.750
4.      10           20             0.203     0.203
5.      45           65             0.576     0.676
6.      65           45             0.576     0.625
7.      34           60             0.462     0.530
8.      48           55             0.533     0.758
9.      56           90             0.759     0.759
10.     74           70             0.735     0.440
11.     45           50             0.440     0.575
12.     89           100            0.908     0.908
13.     100          100            0.920     0.920
14.     65           35             0.500     0.387
15.     48           50             0.473     0.473
16.     45           55             0.500     0.490
17.     55           25             0.310     0.310
18.     84           80             0.765     0.778
19.     63           65             0.639     0.753
20.     28           30             0.310     0.241

Both inputs had the same triangular membership functions. In Table-3, students 5 and 6 have the same performance value (Fuzzy-1), from which one would conclude that the level of intelligence of both students is the same. This is a fallacious conclusion, since Table-3 shows that student 5 has improved consistently while student 6 has deteriorated consistently.
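The point can be checked directly on the semester scores of students 5 and 6 in Table 3. The sketch below only illustrates why an aggregate hides the direction of change; the function name `summarize` is hypothetical and is not part of the cited expert system.

```python
def summarize(semester1, semester2):
    """Return the mean score and the change from semester 1 to semester 2."""
    mean = (semester1 + semester2) / 2
    change = semester2 - semester1
    return mean, change


# Semester scores of students 5 and 6 from Table 3.
student5 = summarize(45, 65)
student6 = summarize(65, 45)

print(student5)  # mean and change for student 5
print(student6)  # mean and change for student 6
```

Both students have the same mean, but the sign of the change differs, so any measure built only on the aggregate cannot separate an improving student from a deteriorating one.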


This is the drawback of the Fuzzy Expert System proposed by Ramjeet Singh Yadav et al. (2011). A further problem with this method is that the fuzzy membership values are fixed by the domain expert. Problems of this type can be solved by the Fuzzy C-Means algorithm.

B. Evaluation of Teachers Performance Evaluation Using Fuzzy Logic Techniques

Sirigiri Pavani et al. (2012) presented a method for evaluating teachers' academic performance using fuzzy logic techniques. The method is described below.

Fuzzification of Semester Examination Results and Performance Value

Fuzzification of the input parameters of teacher performance was carried out using the input variables and the membership functions of their fuzzy sets given in Table-4.

Table 4: Fuzzy Set of Input Variables

Input      Input Name            Linguistic Variable   Range
Input-1    Knowledge             Bad                   01-50
                                 Good                  25-75
                                 Very Good             50-100
Input-2    Speed Delivery        Erratic               01-50
                                 Manageable            25-75
                                 Optimum               50-100
Input-3    Representation        Abstract              01-50
                                 Better                25-50
                                 Relevant              50-100
Input-4    Over All Impression   Very Unimpression     01-50
                                 Impression            25-75
                                 Very Impression       50-100

The fuzzy sets of the output (performance value) variable are shown in Table-5.

Table 5: Fuzzy Set of Output Variable of Teachers Performance

Output               Linguistic Variable   Range
Output Performance   Poor                  01-40
                     Good                  40-80
                     Excellent             90-100

Experimental Results

The input and output parameters were fuzzified, and the rule base was generated from the authors' own reasoning as domain experts in observing and deciding how to evaluate the performance of a teacher. For simplicity of discussion, only trapezoidal membership functions are presented; fuzzification of a real-valued variable is done with intuition, experience and analysis of the rules and conditions associated with the input data variables. In total, 34 rules were generated using the AND and OR operators. Some of the rules are given below:
1. If (knowledge is bad) then (performance is poor).
2. If (knowledge is good) and (speed of delivery is manageable) and (presentation is relevant) then (performance is good).
3. If (knowledge is very good) and (speed of delivery is manageable) and (presentation is relevant) then (performance is good).
4. If (knowledge is very good) and (speed of delivery is optimum) and (presentation is relevant) and (overall impression is high impressible) then (performance is excellent).

The experimental results of this method are given in Table-6.
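How such rules combine their conditions can be sketched as follows. The sketch assumes the common Mamdani convention in which AND is the minimum and OR is the maximum of the membership degrees; the text does not state which operators Pavani et al. used, and the membership degrees below are hypothetical values chosen only for illustration.

```python
def fuzzy_and(*degrees):
    """AND of membership degrees, taken as the minimum (assumed convention)."""
    return min(degrees)


def fuzzy_or(*degrees):
    """OR of membership degrees, taken as the maximum (assumed convention)."""
    return max(degrees)


# Hypothetical membership degrees for one teacher (not taken from the source).
mu = {
    ("knowledge", "good"): 0.7,
    ("knowledge", "very good"): 0.3,
    ("speed of delivery", "manageable"): 0.6,
    ("presentation", "relevant"): 0.9,
}

# Rule 2: knowledge is good AND speed is manageable AND presentation is relevant.
rule2 = fuzzy_and(
    mu[("knowledge", "good")],
    mu[("speed of delivery", "manageable")],
    mu[("presentation", "relevant")],
)

# Rule 3: the same conditions, with knowledge very good.
rule3 = fuzzy_and(
    mu[("knowledge", "very good")],
    mu[("speed of delivery", "manageable")],
    mu[("presentation", "relevant")],
)

# Both rules conclude "performance is good"; their strengths are merged with OR.
performance_good = fuzzy_or(rule2, rule3)
print(rule2, rule3, performance_good)
```

The weakest condition limits the strength of each rule, and the strongest rule determines how strongly the shared conclusion is asserted.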


Table 6: Input Variables and Teachers Performance Value

        Input                                                                             Output (Performance)
S.No.   Knowledge   Speed of Delivery   Presentation   Over All Impression   Explanation   Triangular
1.      06.0        12.9                18             15.9                  20.3          20.4
2.      07.0        12.2                24.5           9.85                  22.0          33.7
3.      32.6        35.6                28.0           37.0                  31.1          40.4
4.      44.7        38.6                40.2           31.1                  38.6          56.2
5.      40.2        47.7                52.2           41.1                  55.0          67.2
6.      53.8        41.7                53.5           55.2                  61.4          68.3
7.      64.4        58.3                62.8           64.4                  64.4          70.4
8.      68.8        76.5                70.5           75.0                  72.0          76.1
9.      78.5        81.1                70.5           70.0                  84.1          83.8
10.     97.7        87.6                97.7           96.2                  96.2          95.0

In Table-6, the inference process gives performance = 95 when knowledge = 97.7, speed of delivery = 87.6, presentation = 97.7, overall impression = 96.2 and explanation = 96.2. We point out that the membership values of the input and output variables are fixed by the domain expert, and that the method has no fixed procedure for fuzzification, which is another drawback. Problems of this type can be solved by the Fuzzy C-Means clustering algorithm.

C. Soft Computing Model for Academic Performance of Teachers Using Fuzzy Logic

O.K. Chaudhari et al. (2012) presented a method for evaluating teachers' academic performance using fuzzy logic. The method is described below.

Fuzzy Expert System for Academic Performance Evaluation

The steps involved in the Fuzzy Expert System are as follows:

Step-1 (Crisp Value (Data)): Teachers' self-appraisal forms are filled in by the respective teachers, with sub-activities, and are then recommended by the head of the department and the head of the institution. The crisp data is tabulated from these forms (Table-7).

Step-2 (Fuzzification (Fuzzy Input Value)): The input variables (elements) are then divided into the linguistic variables excellent, very good, good, average and poor. O.K. Chaudhari et al. (2012) used the trapezoidal membership function to convert the crisp set into a fuzzy set.

Step-3 (Fuzzy Rule and Inference Mechanism): The rules determine the input and output membership functions that will be used in the inference process. These rules are linguistic and are called IF-THEN rules. Some rules are formulated from discussion with academic experts, based on their practical and past experience. We point out a drawback of this study: an academic expert is needed to generate the fuzzy rules and membership functions.

Step-4 (Fuzzy Output (Overall Performance) and Defuzzification (Performance)): The output variable is the overall performance of the teacher, which has five linguistic variables. The degree of membership is given by equation (1): (1) This expression determines an output membership function value for each active rule. When a rule is active, an AND operation is applied between the inputs. The linguistic variables of the output variable are shown in Table-7.


Numerical Results and Discussions

To test the proposed model, using the fuzzy expert system and the rules defined in the study, data from a reputed engineering college were used. From the input data, the output variable (overall performance of a teacher) is determined both by the direct method and by the fuzzy model developed in the study. The last two columns of Table-7 show the values of teacher performance obtained by the direct method and by the fuzzy expert system respectively.

Table 7: Teachers Overall Performance (Crisp and Fuzzy)

        Input Variables                      Output Value
S.No.   F1    F2    F3    F4    F5    F6     Direct   Fuzzy
1.      86    85    70    12    13    33     86       80
2.      85    92    90    12    14    34     92       90
3.      95    98    60    09    08    26     73       80
4.      80    95    73    10    15    32     87       80
5.      89    75    60    09    08    33     77       73
6.      94    80    60    12    10    34     84       80
7.      75    80    75    12    04    28     72       71
8.      67    75    75    09    08    33     76       76
9.      70    85    75    09    13    25     74       76
10.     85    90    90    12    08    25     77       89
11.     93    100   75    10    08    28     78       80
12.     82    80    70    09    08    30     75       75
13.     83    91    70    12    00    35     76       70
14.     80    95    73    12    00    21     63       70
15.     71    89    83    12    00    21     63       72
16.     83    90    82    12    00    26     69       76
17.     97    90    95    12    01    34     81       80
18.     75    97    90    10    02    17     61       70
19.     85    96    84    12    08    34     86       84
20.     71    95    76    10    03    23     65       72
21.     73    95    94    06    04    19     60       70
22.     70    94    85    12    09    18     68       80
23.     76    89    75    12    00    24     65       71
24.     72    95    80    12    00    23     65       74
25.     79    99    84    12    00    17     61       70
26.     86    96    90    12    00    14     59       70
27.     95    95    85    12    00    28     72       80
28.     81    96    72    10    00    26     67       70
29.     83    98    85    12    01    30     75       80
30.     79    93    73    11    02    25     68       70
31.     70    100   77    09    01    28     67       71

O.K. Chaudhari et al. (2012) observed a difference between the direct values and the values determined using the fuzzy model. This is due to the weightage given, while framing the rules, to some important factors related to the teaching-learning process and the overall development of the institute. We observed that the membership function values of the input and output variables for the academic performance of teachers are fixed and decided by the domain expert. This is the drawback of the proposed Fuzzy Expert System. We also observed that this Fuzzy Expert System cannot group or cluster teachers by performance. Problems of this type can be solved by the Fuzzy C-Means clustering algorithm.

D. Using Fuzzy Numbers in Educational Grading System

Chiu-Keung Law (1996) presented a method for using fuzzy numbers in an educational grading system, and also discussed a method for building the membership functions of several linguistic values with different weights. The method is described below.


Fuzzy Numbers of Educational Grading System

Chiu-Keung Law (1996) assigned the linguistic values A, B, C, D and F to describe a student's performance. It is important that the performance criteria of the ideal population (students who take the same course in the same school or district) be set before students take an examination, so that the criteria cannot be influenced by how well the subjects in the sample (students in a particular class) do in the examination. The linguistic values A, B, C, D and F are made into corresponding reasonable normal fuzzy numbers with trapezoidal (or triangular) membership functions.

Advantage of the Fuzzy Educational Grading System

As the National Council of Teachers of Mathematics reported, only adding up scores on an examination will not give a full picture of what students know. The challenge for teachers is to try different ways of grading, scoring and reporting to determine the best ways to describe students' knowledge of mathematics. The raw scores of 10 students and their corresponding grades are listed in Table-8.

Table 8: The Raw Scores of 10 Students and their Corresponding Grade

S.No.   S1   S2   S3   S4   S5   Total   Fuzzy Performance Value   Grade
1.      10   15   20   25   30   100     0.8878                    A
2.      14   19   24   28   09   94      0.8562                    A
3.      08   12   15   24   27   86      0.7978                    B
4.      05   11   17   21   05   59      0.5671                    B
5.      02   11   19   02   11   45      0.4386                    C
6.      00   08   01   15   03   27      0.3274                    C
7.      02   03   09   12   00   26      0.2945                    D
8.      04   03   02   04   02   15      0.1734                    D
9.      01   00   02   00   01   04      0.0980                    F
10.     00   00   00   00   00   00      0.0781                    F

In Table-8, the highest and lowest degrees of membership are 0.8878 and 0.0781, and it is known that the ideal percentages of students receiving grade A and grade B are 15% and 10% respectively. It is important to emphasize that this approach applies not only to an individual but also to a group of individuals. We observed that the membership function values of the input and output variables for students' academic performance (grading system) are fixed and decided by the educational domain expert. This is the drawback of the proposed fuzzy-number grading system for students' academic performance.

E. An Evaluation of Students Performance in Oral Presentation Using Fuzzy Approach

Wan Suhan Wan Daud et al. (2011) presented a method for evaluating students' academic performance using a fuzzy logic approach. They pointed out that the evaluation of a student's performance is a process of making a judgment on the student based on several elements such as examinations, assignments, tests, quizzes, research work and so on. They used the following methodology for evaluating students' performance:

Step-1 (Normalize the Marks): The mark obtained by each student has to be converted to a normalized value. Normalized values lie in the range [0, 1] and are obtained by dividing the mark for each criterion by the total mark for that criterion. The normalized value is the input value of the evaluation. Table-9 shows example marks and the normalized values obtained by a student for all the criteria.


Table 9: An Example of Mark and Normalized Value

Criteria                            Total Mark   Mark Obtained   Normalized Mark
Introduction and Objective (C1)     15           11.67           0.78
Research (C2)                       20           15.33           0.77
System Implementation (C3)          15           12.00           0.80
Results (C4)                        15           12.67           0.84
Conclusion (C5)                     10           08.00           0.80
Organization (C6)                   05           03.67           0.73
Creativity (C7)                     05           03.00           0.60
Visual Aids (C8)                    05           03.12           0.62
Stage Presence (C9)                 05           04.17           0.83
Report with the panels (C10)        05           03.50           0.70
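The normalization of Step-1 can be reproduced from the marks in Table 9. The short sketch below is illustrative only; the names `MARKS` and `normalize` are hypothetical.

```python
# (total mark, mark obtained) for criteria C1..C10, as listed in Table 9.
MARKS = [
    (15, 11.67), (20, 15.33), (15, 12.00), (15, 12.67), (10, 8.00),
    (5, 3.67), (5, 3.00), (5, 3.12), (5, 4.17), (5, 3.50),
]


def normalize(total, obtained):
    """Scale a mark to [0, 1] by dividing it by the total mark of its criterion."""
    return obtained / total


# Rounded to two decimals, as in the last column of Table 9.
normalized = [round(normalize(total, obtained), 2) for total, obtained in MARKS]
print(normalized)
```

The rounded values agree with the Normalized Mark column, which confirms that each mark is divided by the total of its own criterion rather than by the overall total of 100.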

The graph of the membership function is developed in order to execute the fuzzification process. In this process, the input value is mapped onto the graph of the membership function to obtain the fuzzy membership value of that particular input value. Each membership value represents a level of satisfaction. Table-10 shows the 12 satisfaction levels proposed in the study.

Table 10: Standard Satisfaction Level and the Corresponding Degree of Satisfaction

Satisfaction Levels          Degree of Satisfaction   Maximum Degree of Satisfaction
Exceptional (E)              80-100 (0.8-1.0)         1.00
Excellent (EX)               75-79 (0.75-0.79)        0.79
Very Good (VG)               70-74 (0.70-0.74)        0.74
Fairly Good (FG)             65-69 (0.65-0.69)        0.69
Marginally Good (MG)         60-64 (0.60-0.64)        0.64
Competent (C)                55-59 (0.55-0.59)        0.59
Fairly Competent (FC)        50-54 (0.50-0.54)        0.54
Marginally Competent (MC)    45-49 (0.45-0.49)        0.49
Bad (B)                      40-44 (0.40-0.44)        0.44
Fairly Bad (FB)              35-39 (0.35-0.39)        0.39
Marginally Bad (MB)          30-34 (0.30-0.34)        0.34
Fail (F)                     00-29 (0.00-0.29)        0.29

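Step-2: Calculate the degree of satisfaction of each criterion C by the formula given below:

$$D(C) = \frac{\sum_{i=1}^{12} y_i \, T(X_i)}{\sum_{i=1}^{12} y_i} \qquad (2)$$

where y_i is the degree of membership value for each satisfaction level, i = 1, 2, 3, ..., 12, and T(X_i) is taken here to be the maximum degree of satisfaction of level X_i in Table-10, a reading that reproduces the degrees of satisfaction listed in Table-11.

Step-3: Compute the final mark. The final mark F_k of the kth student is given by the formula below:

$$F_k = \frac{\sum_{i=1}^{10} w_i \, D(C_i)}{\sum_{i=1}^{10} w_i} \qquad (3)$$

where w_i is the total mark of the ith criterion, i = 1, 2, ..., 10. The result obtained is put into the fuzzy grade sheet (Table-11) in the appropriate columns.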
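Steps 2 and 3 can be checked numerically against the published tables. The sketch below rests on two assumptions that are not stated explicitly in the surviving text: the degree of satisfaction of a criterion is the membership-weighted average of the maximum degrees of satisfaction in Table 10, and the final mark is the average of those degrees weighted by the total mark of each criterion. The level abbreviations follow Table 11, and all function names are hypothetical.

```python
# Maximum degree of satisfaction of each level used below, from Table 10.
T = {"CT": 0.59, "MG": 0.64, "FG": 0.69, "VG": 0.74, "EX": 0.79, "ET": 1.00}

# For criteria C1..C10: total mark w_i (Table 9) and the non-zero fuzzy
# membership values of student-1 (Table 11).
GRADE_SHEET = [
    (15, {"VG": 0.4, "EX": 0.6}),
    (20, {"VG": 0.62, "EX": 0.38}),
    (15, {"EX": 0.81, "ET": 0.19}),
    (15, {"VG": 0.50, "EX": 0.50}),
    (10, {"ET": 1.0}),
    (5, {"FG": 0.17, "VG": 0.83}),
    (5, {"CT": 0.8, "MG": 0.2}),
    (5, {"CT": 0.43, "MG": 0.57}),
    (5, {"EX": 0.2, "ET": 0.8}),
    (5, {"FG": 0.8, "VG": 0.2}),
]


def degree_of_satisfaction(memberships):
    """Membership-weighted average of the maximum degrees of satisfaction."""
    weighted = sum(y * T[level] for level, y in memberships.items())
    return weighted / sum(memberships.values())


def final_mark(sheet):
    """Average of the degrees of satisfaction, weighted by criterion marks."""
    total = sum(w * degree_of_satisfaction(m) for w, m in sheet)
    return total / sum(w for w, _ in sheet)


degrees = [degree_of_satisfaction(m) for _, m in GRADE_SHEET]
print([round(d, 3) for d in degrees])
print(round(final_mark(GRADE_SHEET), 4))
```

Under these assumptions the computed degrees agree with the Degree of Satisfaction column of Table 11 to within rounding, and the weighted average agrees with the final mark reported for student-1.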


Table 11: Fuzzy Grade Sheet Containing the Overall Fuzzy Marks of Student-1

           Fuzzy Membership Value                                               Degree of
Criteria   F    MB   FB   MC   FC   CT     MG     FG     VG     EX     ET      Satisfaction
C1         0    0    0    0    0    0      0      0      0.4    0.6    0       0.770
C2         0    0    0    0    0    0      0      0      0.62   0.38   0       0.759
C3         0    0    0    0    0    0      0      0      0      0.81   0.19    0.830
C4         0    0    0    0    0    0      0      0      0.50   0.50   0       0.765
C5         0    0    0    0    0    0      0      0      0      0      1       1.000
C6         0    0    0    0    0    0      0      0.17   0.83   0      0       0.732
C7         0    0    0    0    0    0.8    0.2    0      0      0      0       0.600
C8         0    0    0    0    0    0.43   0.57   0      0      0      0       0.619
C9         0    0    0    0    0    0      0      0      0      0.2    0.8     0.958
C10        0    0    0    0    0    0      0      0.8    0.2    0      0       0.700

The Final Mark of student-1 = 0.7869

Table 12: The Results for 10 Students Obtained from Fuzzy and Non-Fuzzy Method

      Non-Fuzzy Method                  Fuzzy Evaluation Method
St.   Final Mark   Linguistic Term      Linguistic Term                             Final Mark
1.    77           Excellent            Very Good at 0.17, Excellent at 0.83        0.79
2.    89           Exceptional          Exceptional at 1.0                          0.90
3.    71           Very Good            Fairly Good at 0.18, Very Good at 0.82      0.73
4.    56           Competent            Competent at 1.0                            0.59
5.    69           Fairly Good          Fairly Good at 0.6, Very Good at 0.4        0.71
6.    75           Excellent            Excellent at 0.81, Exceptional at 0.19      0.80
7.    73           Very Good            Very Good at 0.4, Excellent at 0.6          0.77
8.    83           Exceptional          Exceptional at 1.0                          0.87
9.    51           Fairly Competent     Fairly Competent at 1.0                     0.54
10.   68           Fairly Good          Fairly Good at 0.6, Very Good at 0.4        0.71

Table-12 shows that the fuzzy marks obtained are higher than the non-fuzzy marks. We point out that student-1 is assigned the performance Very Good at 0.17 and, at the same time, Excellent at 0.83; that is, the method places one student in two performance classes at once. This is the drawback of the proposed method. We also point out that the membership functions are fixed and decided by the domain expert.

F. Fuzzy Logic Based Evaluation of Performance of Students in Colleges

Mamatha S. Upadhya (2012) presented a method for evaluating students' performance based on fuzzy logic. The method is described below.

Details about the Set Applied

In the proposed fuzzy system, the range of possible values for the input and output variables is determined first. These ranges are, in the language of fuzzy set theory, the membership functions (input variable versus degree of membership) used to map real-world measurement values to fuzzy values. Values of the input variables are expressed as percentages. The membership functions of the input and output variables are given in Table-13, Table-14, Table-15 and Table-16.

Table 13: Fuzzy Membership Function for the Input Variable (Student Attendance)

Linguistic Variable   Interval
Medium                (0, 0, 40)
Good                  (20, 50, 80)
Very Good             (60, 100, 100)


Table 14: Fuzzy Membership Function for the Input Variable (Teaching Effectiveness)

Linguistic Variable   Interval
Less Effective        (0, 0, 40)
Effective             (20, 50, 80)
Highly Effective      (60, 100, 100)

Table 15: Fuzzy Membership Function for the Input Variable (Facilities)

Linguistic Variable   Interval
Medium                (0, 0, 40)
Good                  (20, 50, 80)
Very Good             (60, 100, 100)

Table 16: Fuzzy Membership Function for the Output Variable (Student Performance)

Linguistic Variable   Interval
Poor                  (0, 0, 30)
Medium                (0, 30, 60)
Good                  (30, 60, 90)
Very Good             (60, 100, 100)

The rules framed for this study are given below:
1. If student attendance is Medium and teaching effectiveness is Less Effective and facilities is Medium then performance of student is Poor.
2. If student attendance is Good and teaching effectiveness is Less Effective and facilities is Medium then performance of student is Medium.
3. If student attendance is Very Good and teaching effectiveness is Less Effective and facilities is Medium then performance of student is Medium.

Defuzzification

Finally, the crisp value of the performance of students is obtained by defuzzifying the fuzzy output. Many defuzzification methods are available in the literature, but the most commonly used are the centroid and maximum methods. Selecting a suitable defuzzification method is difficult. In the proposed method, centroid defuzzification is used, which is given by:

$$z^{*} = \frac{\int \mu_A(z)\, z \, dz}{\int \mu_A(z)\, dz} \qquad (4)$$

where A is the output fuzzy set and \mu_A is its membership function.
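Centroid defuzzification can be approximated numerically by sampling the output membership function. The sketch below is an illustration under stated assumptions, not the implementation of the cited study: the triples of Table 16 are read as (left foot, peak, right foot) of triangles, rule outputs are clipped at their firing strength and merged with the maximum (the usual Mamdani scheme), and the firing strengths 0.3 and 0.7 are hypothetical.

```python
def triangular(x, a, b, c):
    """Membership degree of x in the triangular fuzzy set (a, b, c)."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def centroid(mu, lo=0.0, hi=100.0, steps=1000):
    """Approximate the centroid of mu on [lo, hi] from evenly spaced samples."""
    xs = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    weights = [mu(x) for x in xs]
    return sum(x * w for x, w in zip(xs, weights)) / sum(weights)


def medium(x):
    """Output set Medium from Table 16."""
    return triangular(x, 0, 30, 60)


def good(x):
    """Output set Good from Table 16."""
    return triangular(x, 30, 60, 90)


def aggregated(x):
    """Medium clipped at 0.3 and Good clipped at 0.7, merged with max.

    The two firing strengths are hypothetical illustration values.
    """
    return max(min(0.3, medium(x)), min(0.7, good(x)))


print(round(centroid(good), 2))        # centroid of the symmetric set Good
print(round(centroid(aggregated), 2))  # centroid of the aggregated output
```

A symmetric triangle defuzzifies to its peak, while the aggregated shape yields a crisp value pulled towards the rule that fired more strongly.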

RESULTS AND DISCUSSIONS


Given the input values and the above model, the inputs are fuzzified; then, using simple IF-THEN rules and other simple fuzzy set operations, the output fuzzy set is obtained, and the centroid criterion gives the output value for the performance of students. The fuzzy output for a few different input values is provided in Table-17.


Table 17: Performance of Students for Different Input Values

S.No.   Student Attendance   Teaching Effectiveness   Facilities   Performance of Students
1.      40                   60                       50           60.00
2.      80                   60                       70           64.54
3.      80                   90                       70           84.70
4.      30                   90                       40           47.20
5.      90                   90                       30           72.76
6.      35                   45                       65           53.80
7.      65                   45                       35           53.80

In Table-17, students 6 and 7 have the same performance value and therefore belong to the same class (cluster), from which one would conclude that the level of intelligence of both students is the same. This is a fallacious conclusion, since Table-17 shows that the inputs of student 6 increase consistently while those of student 7 decrease consistently. This is the drawback of the proposed fuzzy model for student academic performance. Problems of this type can be solved by the Fuzzy C-Means algorithm.

DATA CLUSTER ANALYSIS TECHNIQUES FOR ACADEMIC PERFORMANCE EVALUATION


The clustering problem can be stated simply as follows: given a finite set of data, X, develop a grouping scheme for grouping the objects into classes. In classical cluster analysis, these classes are required to form a partition of X such that the degree of association is strong for data within blocks of the partition and weak for data in different blocks. However, this requirement is too strong in many practical applications, and it is thus desirable to replace it with a weaker requirement. When the requirement of a crisp partition of X is replaced with the weaker requirement of a fuzzy partition or a fuzzy pseudo-partition on X, we refer to the emerging problem area as fuzzy clustering. Fuzzy pseudo-partitions are often called fuzzy c-partitions, where c designates the number of fuzzy classes in the partition (S. Gagula-Palalic and M. Can, 2008).

Pattern recognition techniques can be classified into two broad categories: unsupervised techniques and supervised techniques. An unsupervised technique works on unlabeled data, whereas a supervised technique uses a dataset with known classifications. These two types of techniques are complementary to each other. The Hard C-Means and Fuzzy C-Means clustering techniques fall in the unsupervised category. In this paper, we use the K-Means, Hard C-Means and Fuzzy C-Means clustering techniques for students' academic performance evaluation.

A. K-Means Clustering

The K-Means clustering technique is an iterative algorithm in which items are moved among sets of clusters until the desired set is reached. A high degree of similarity among elements within a cluster and a high degree of dissimilarity among elements in different clusters are achieved simultaneously. The K-Means clustering technique is used to classify data in a crisp sense. By this we mean that each data point is assigned to one, and only one, data cluster. In this sense these clusters are also called partitions, that is, partitions of the data.
Define a family of sets $\{A_i,\ i = 1, 2, \ldots, c\}$ as a hard c-partition of X, where the following set-theoretic forms apply to those partitions:

$$\bigcup_{i=1}^{c} A_i = X \tag{5}$$

$$A_i \cap A_j = \emptyset \quad \text{for all } i \neq j \tag{6}$$

$$\emptyset \subset A_i \subset X \quad \text{for all } i \tag{7}$$

where $X = \{x_1, x_2, \ldots, x_n\}$ is a finite set space comprised of the universe of data samples, and c is the number of classes, or partitions, or clusters, into which we want to classify the data. We note the obvious,

$$2 \le c < n \tag{8}$$

where c = n classes just places each data sample into its own class, and c = 1 places all data samples into the same class; neither case requires any effort in classification, and both are intrinsically uninteresting. Equation (5) expresses the fact that the set of all classes exhausts the universe of data samples. Equation (6) indicates that none of the classes overlap, in the sense that a data sample cannot belong to more than one class. Equation (7) simply expresses that a class cannot be empty and cannot contain all the data samples.

An objective function (or classification criterion) is needed to classify or cluster the data. The one proposed for the hard K-Means algorithm is known as a within-class sum of squared errors approach, using a Euclidean norm to characterize distance. This objective function is denoted $J(U, v)$, where U is the partition matrix and the parameter v is a vector of cluster centers. It is given by:

$$J(U, v) = \sum_{k=1}^{n} \sum_{i=1}^{c} u_{ik}\, (d_{ik})^{2} \tag{9}$$

where $d_{ik}$ is a Euclidean distance measure (in the m-dimensional feature space $\mathbb{R}^m$) between the kth data sample $x_k$ and the ith cluster centre $v_i$, given by

$$d_{ik} = \lVert x_k - v_i \rVert = \left[ \sum_{j=1}^{m} (x_{kj} - v_{ij})^{2} \right]^{1/2} \tag{10}$$

Since each data sample requires m coordinates to describe its location in $\mathbb{R}^m$-space, each cluster centre also requires m coordinates to describe its location in this same space. Therefore, the ith cluster centre is a vector of length m, $v_i = (v_{i1}, v_{i2}, \ldots, v_{im})$.

The flow of the main optimization activities in K-Means clustering can be outlined in the following manner:

Step-I: Start with some initial configuration of prototypes $v_i$ (e.g., choose them randomly).

Step-II: Compute the value of $d_{ik}$, the distance from the sample $x_k$ (a data point) to the centre $v_i$ of the ith class, using equation (10).

Step-III: Construct a partition matrix by assigning numeric values to U according to the following rule:

$$u_{ik} = \begin{cases} 1 & \text{if } d_{ik} = \min_{j} d_{jk} \\ 0 & \text{otherwise} \end{cases} \tag{11}$$

Step-IV: Update the prototypes by computing the weighted average, which involves the entries of the partition matrix:

$$v_i = \frac{\sum_{k=1}^{n} u_{ik}\, x_k}{\sum_{k=1}^{n} u_{ik}} \tag{12}$$

Steps II to IV are repeated until the convergence criterion is met.

B. Hard C-Means (HCM) Clustering Algorithms

HCM is used to classify data in a crisp sense. By this we mean that each data point is assigned to one, and only one, data cluster. In this sense these clusters are also called partitions, that is, partitions of the data. Assuming that a dataset contains c well-separated clusters, the goals of the hard C-Means algorithm are twofold (J. Yen, et al., 1999):

1. To find the centres of these clusters.
2. To determine the clusters (i.e., labels) of each point in the dataset.


In fact, the second goal can easily be achieved once the first has been accomplished, on the assumption that the clusters are compact and well separated (J. Yen, et al., 1999). Given the cluster centers, a point in the dataset belongs to the cluster whose center is the closest, i.e.,

$$x_k \in C_i \quad \text{if} \quad \lVert x_k - v_i \rVert < \lVert x_k - v_j \rVert \ \text{ for all } j \neq i \tag{13}$$

where $C_i$ denotes the ith cluster and $v_i$ its center.

In order to achieve the first goal (i.e., finding the cluster centers), we need to establish a criterion that can be used to search for these cluster centers. One such criterion is the sum of the squared distances between the points in each cluster and their center:

$$J(P, V) = \sum_{i=1}^{c} \sum_{x_k \in C_i} \lVert x_k - v_i \rVert^{2} \tag{14}$$

where V is a vector of cluster centers to be identified. This criterion is useful because a set of true cluster centers will give a minimal J value for a given dataset. Based on these observations, the hard C-Means algorithm tries to find the cluster centers V that minimize J. However, J is also a function of the partition P, which is determined by the cluster centers V according to equation (13). Therefore, the Hard C-Means (HCM) algorithm searches for the true cluster centers by iterating the following two steps:

1. Calculating the current partition based on the current cluster centers.
2. Modifying the current cluster centers using a gradient descent method to minimize the J function.

The cycle terminates when the difference between the cluster centers in two successive cycles is smaller than a threshold. This means that the algorithm has converged to a local minimum of J.

C. Fuzzy C-Means (FCM) Clustering Algorithm

The Fuzzy C-Means algorithm (FCM) generalizes the hard C-Means algorithm to allow a point to partially belong to multiple clusters. Therefore, it produces a soft partition for a given dataset. In fact, it produces a constrained soft partition (J. Yen, et al., 1999). To this end, the objective function $J_1$ of hard C-Means has been extended in two ways:

1. The fuzzy membership degrees in clusters were incorporated into the formula.
2. An additional parameter m was introduced as a weight exponent in the fuzzy membership.

The extended objective function, denoted $J_m$, is

$$J_m(P, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} \left( \mu_{C_i}(x_k) \right)^{m} \lVert x_k - v_i \rVert^{2} \tag{15}$$

where P is a fuzzy partition of the dataset X formed by $C_1, C_2, \ldots, C_c$. The parameter m is a weight that determines the degree to which partial members of a cluster affect the clustering result. Like hard C-Means, fuzzy C-Means also tries to find a good partition by searching for prototypes $v_i$ that minimize the objective function $J_m$. Unlike hard C-Means, however, the fuzzy C-Means algorithm also needs to search for membership functions $\mu_{C_i}$ that minimize $J_m$. The fuzzy C-Means (FCM) algorithm is given below:

FCM(X, c, m, ε)
X : an unlabeled data set
c : the number of clusters to form
m : the parameter in the objective function
ε : a threshold for the convergence criterion

Initialize the prototypes $V = \{v_1, v_2, \ldots, v_c\}$
Repeat
    Compute the membership functions using equation (16).
    Update the prototypes $v_i$ in V using equation (17).
Until the convergence criterion is met.

Fuzzy C-Means Theorem

A constrained fuzzy partition $\{C_1, C_2, \ldots, C_c\}$ can be a local minimum of the objective function $J_m$ only if the following conditions are satisfied:

$$\mu_{C_i}(x) = \frac{1}{\sum_{j=1}^{c} \left( \dfrac{\lVert x - v_i \rVert^{2}}{\lVert x - v_j \rVert^{2}} \right)^{\frac{1}{m-1}}} \tag{16}$$

$$v_i = \frac{\sum_{x \in X} \left( \mu_{C_i}(x) \right)^{m} x}{\sum_{x \in X} \left( \mu_{C_i}(x) \right)^{m}} \tag{17}$$
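The alternating update described by the FCM algorithm and its theorem can be sketched in Python as follows. This is a minimal illustration using the standard membership and prototype updates for Fuzzy C-Means, not the MATLAB toolbox routine used in this paper. The function name `fcm`, the random initialization from data points, the default values of `m`, `eps` and `max_iter`, and the stopping rule based on the total prototype shift are all assumptions made for this sketch. The data are the Semester-1 and Semester-2 scores of Table-18; the result depends on the initialization and is not expected to reproduce the reported memberships exactly.

```python
import math
import random


def fcm(X, c, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Fuzzy C-Means. Returns (V, U): V[i] is prototype i, U[k][i] the
    membership of point X[k] in cluster i (memberships computed from the
    prototypes of the last completed iteration)."""
    rng = random.Random(seed)
    V = [list(map(float, p)) for p in rng.sample(X, c)]  # assumed initialization
    U = []
    for _ in range(max_iter):
        # Membership update: 1 / sum_j (d_i^2 / d_j^2)^(1/(m-1))
        U = []
        for x in X:
            d2 = [sum((a - b) ** 2 for a, b in zip(x, v)) for v in V]
            if min(d2) == 0.0:
                # x coincides with a prototype: full membership there
                row = [1.0 if d == 0.0 else 0.0 for d in d2]
                s = sum(row)
                row = [r / s for r in row]
            else:
                row = [
                    1.0 / sum((d2[i] / d2[j]) ** (1.0 / (m - 1.0)) for j in range(c))
                    for i in range(c)
                ]
            U.append(row)
        # Prototype update: membership^m weighted mean of the data
        V_new = []
        for i in range(c):
            w = [U[k][i] ** m for k in range(len(X))]
            total = sum(w)
            V_new.append(
                [sum(w[k] * X[k][j] for k in range(len(X))) / total
                 for j in range(len(X[0]))]
            )
        shift = sum(math.dist(old, new) for old, new in zip(V, V_new))
        V = V_new
        if shift <= eps:  # assumed convergence criterion
            break
    return V, U


# Semester-1 and Semester-2 scores from Table-18
SCORES = [(40, 65), (20, 35), (50, 65), (10, 20), (45, 65), (34, 60), (48, 55),
          (56, 90), (74, 70), (45, 50), (65, 45), (89, 100), (100, 100), (65, 35),
          (48, 50), (45, 55), (55, 25), (84, 80), (63, 65), (28, 30)]

V, U = fcm(SCORES, c=5)
for centre in sorted(V, reverse=True):
    print([round(v, 2) for v in centre])
```

The cluster indices returned by such a routine are arbitrary, so the linguistic labels (Very High to Very Low) have to be attached afterwards, for example by ordering the prototypes.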

Based on this theorem, FCM updates the prototypes and the membership functions iteratively using equations (16) and (17) until a convergence criterion is reached.

D. Regression Model

Regression is one of the most common problems in statistics. It consists in exploring the association between dependent and independent variables and in identifying the impact of the independent variables on the dependent variable. Ordinarily, we do not know the exact functional relationship between the two random variables x and y, where to each vector x sampled according to a distribution P(x) there corresponds a scalar y in accordance with a conditional distribution P(y | x). Typically we proceed by assuming that the target variable y is given by some deterministic function of x with added Gaussian noise that represents a measurement error or, more generally, our ignorance about the dependence of y on x (H. White, 1989):

$$y = f(x) + \varepsilon \tag{18}$$

The function f(x) is called the regression function and the statistical model described by the above equation is called the regression model. The error $\varepsilon$ is a random variable having a normal distribution with zero mean and a standard deviation $\sigma$ which does not depend on x or y, that is:

$$\varepsilon \sim N(0, \sigma^{2}) \tag{19}$$


This common assumption can be partly justified by results from experimental measurements and by the central limit theorem, which states that the sample mean of any reasonable distribution can be approximated by a normal distribution for sufficiently large samples. It follows from this assumption and from (18) that the conditional distribution of y given x is a normal distribution with mean f(x) and variance $\sigma^{2}$. Hence we obtain:

$$E[y \mid x] = f(x) \tag{20}$$

That is, f(x) is the conditional mean of the output y given the input x. In other words, the regression of y on x is that (deterministic) function of x that gives the mean value of y conditional on x. It can be shown that the regression function is the best solution to the problem of fitting the data in the squared-error sense, i.e., among all functions of x, the regression function is the best predictor of y given x. Precisely, it can be shown that the minimum of the risk functional

$$R[g] = \int \left( y - g(x) \right)^{2} P(x, y)\, dx\, dy \tag{21}$$

is attained by the regression function f(x). Thus the problem of regression estimation can be addressed in the statistical learning framework, once the learning machine is assessed by a quadratic loss function:

$$L(y, g(x)) = \left( y - g(x) \right)^{2} \tag{22}$$

In the case of a quadratic loss function, the empirical risk functional becomes:

$$R_{emp}[g] = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - g(x_i) \right)^{2} \tag{23}$$

which is usually referred to as the Mean Squared Error (MSE).
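For the linear case used later in this paper, the least-squares fit and its mean squared error can be sketched in Python as below. The function names `fit_line` and `mse` and the five data points are assumptions introduced purely for illustration; the data are hand-made perturbations of a straight line and are not results from this study.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for y = a + b*x via the normal equations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def mse(xs, ys, f):
    """Empirical risk under quadratic loss: mean of (y - f(x))^2."""
    return sum((y - f(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Illustrative data only: roughly y = 2 + 3x with small perturbations
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.1, 4.9, 8.2, 10.8, 14.0]

a, b = fit_line(xs, ys)


def fitted(x):
    return a + b * x


print(a, b, mse(xs, ys, fitted))
```

Among all straight lines, the fitted one has the smallest mean squared error on the sample, which is the empirical counterpart of the statement that the regression function minimizes the quadratic risk.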

EXPERT SYSTEM
An expert system is a class of computer programs first developed by researchers in artificial intelligence (AI) during the 1970s (J.C. Giarratano and G. Riley, 2005) and applied commercially throughout the 1980s. Prof. Edward Feigenbaum of Stanford University, an early pioneer of expert systems technology, has defined an expert system as an intelligent computer program that uses knowledge and inference procedures to solve problems that are difficult enough to require significant human expertise for their solution. In other words, an expert system is a computer system that emulates the decision-making ability of a human expert. Expert systems have been combined with databases for human-like pattern recognition and automated decision systems to yield knowledge discovery through data mining and thus produce an intelligent database. The knowledge in an expert system may be either expertise or knowledge that is generally available from books, magazines, and knowledgeable persons. For example, when we consult an expert (e.g., a doctor, lawyer, or teacher) about a problem, the expert asks for the current information about our condition, searches his or her knowledge base (memory) for existing knowledge that relates to elements of the current situation, processes the information, arrives at a decision, and presents his or her solution.


Figure-1 shows the basic concept of a knowledge-based expert system. The user supplies facts or other information to the expert system and receives expert advice or expertise in response. Internally, the expert system consists of two main components: the knowledge base and the inference engine. The former contains the knowledge that is used by the latter to draw conclusions. These conclusions are the expert system's responses to the user's queries for expertise. The expert's knowledge about solving specific problems is called the knowledge domain of the expert. An expert's knowledge is commonly specific to one problem domain, as opposed to general problem solving. Inference, or reasoning, is particularly important in an expert system because it is the technique by which the expert system solves problems. Numerical techniques for reasoning under uncertainty have been applied to expert systems, such as Bayesian networks, the Dempster-Shafer theory of evidence and fuzzy logic. The inference engine embodies the reasoning strategy. It directs the search through the knowledge base, a process that may involve the application of inference rules in what is called pattern matching. The control program decides which rule to investigate, which alternative to eliminate, and which attribute to match. The most common knowledge representation in computational form is the IF-THEN control structure.

PROPOSED DYNAMIC FUZZY EXPERT SYSTEM (DFES) FOR ACADEMIC PERFORMANCE EVALUATION
In this paper, we propose a Dynamic Fuzzy Expert System (DFES) for student academic performance evaluation. The proposed DFES consists of fuzzy logic, the Fuzzy C-Means clustering algorithm and a regression analysis model. The Fuzzy C-Means clustering algorithm is used to classify the input space into different classes or clusters, and the regression analysis model is used to estimate the output for the input data.

A. Dynamic Fuzzy Expert System (DFES)

The world of information is surrounded by uncertainty and imprecision. The human reasoning process can handle inexact, uncertain, and vague concepts in an appropriate manner. Usually, the human thinking, reasoning, and perception processes cannot be expressed precisely. These types of experience can rarely be expressed or measured using statistical or probability theory. Fuzzy logic provides a framework to model uncertainty, the human way of thinking, reasoning, and the perception process. Fuzzy sets were introduced by Zadeh (1965). A fuzzy expert system is simply an expert system that uses a collection of fuzzy membership functions and rules, instead of Boolean logic, to reason about data (Schneider et al. 1996). The rules in a fuzzy expert system are usually of a form similar to the following:

If A is Low and B is High then (X = Medium),

where A and B are input variables and X is an output variable. Here Low, High and Medium are fuzzy sets defined on A, B and X respectively. The antecedent (the rule's premise) describes to what degree the rule applies, while the rule's consequent assigns a membership function to each of one or more output variables.

Let X be a space of objects and x be a generic element of X. A classical set A, $A \subseteq X$, is defined as a collection of elements or objects $x \in X$, such that x can either belong or not belong to the set A. A fuzzy set A in X is defined as a set of ordered pairs, $A = \{(x, \mu_A(x)) \mid x \in X\}$, where $\mu_A(x)$ is called the membership function (MF) for the fuzzy set A. The MF maps each element of X to a membership grade (or membership value) between zero and one. Figure-2 shows the basic architecture of the proposed fuzzy expert system for modeling academic performance evaluation.


The main components of the proposed dynamic fuzzy expert system are: a fuzzification interface, a fuzzy rule base (knowledge base), an inference engine (decision-making logic), and a defuzzification interface.

1. Fuzzification Interface: The input variables are fuzzified by the Fuzzy C-Means clustering algorithm.

2. Fuzzy Rule Base (Knowledge Base): Fuzzy if-then rules and fuzzy reasoning are the backbone of fuzzy expert systems, which are the most important modeling tools based on fuzzy set theory. The rule base is characterized in the form of if-then rules in which the antecedents and consequents involve linguistic variables. In this paper, we use very high, high, average, low and very low as linguistic values. The collection of these rules forms the rule base for the fuzzy logic system. In the proposed dynamic fuzzy expert system, we have used the following rules for the knowledge base:

   1. If the student belongs to very high then $Y_1 = a_1 + b_1 X$
   2. If the student belongs to high then $Y_2 = a_2 + b_2 X$
   3. If the student belongs to average then $Y_3 = a_3 + b_3 X$
   4. If the student belongs to low then $Y_4 = a_4 + b_4 X$
   5. If the student belongs to very low then $Y_5 = a_5 + b_5 X$

   where X is the student's mark obtained in the Semester-1 examination, and $a_i$ and $b_i$ (i = 1, ..., 5) are constants determined by the regression analysis model.

3. Inference Engine (Decision-Making Logic): Using a suitable inference procedure, the truth value for the antecedent of each rule is computed and applied to the consequent part of each rule. Here, we have used the regression analysis model for decision making. This results in one fuzzy subset being assigned to each output variable for each rule. Then, by using a suitable composition procedure, all the fuzzy subsets assigned to each output variable are combined to form a single fuzzy subset for each output variable.

4. Defuzzification Interface: Defuzzification means converting the fuzzy output into a crisp output. Here, we have used the height defuzzification technique for converting the fuzzy output into a crisp output (the performance value of the student).

The defuzzification formula is given below:

$$SP = \frac{\sum_{i=1}^{5} \mu_i\, Y_i}{\sum_{i=1}^{5} \mu_i} \tag{24}$$

where $\mu_i$ is the student's degree of membership in the ith cluster and $Y_i$ is the output of the ith rule. With the help of equation (24), we can convert the fuzzy output into a crisp output (the performance value of a student).
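One common reading of height defuzzification for rules with linear consequents is a membership-weighted average of the per-rule outputs. The Python sketch below illustrates that reading only. The function name `dfes_output` and the regression constants in `coeffs` are hypothetical placeholders, since the fitted constants are not reproduced here; the membership values are those reported for student 1 in Table-20.

```python
LABELS = ["very high", "high", "average", "low", "very low"]


def dfes_output(memberships, coeffs, x):
    """Membership-weighted average of the per-cluster outputs a_i + b_i * x."""
    outputs = [a + b * x for a, b in coeffs]
    total = sum(memberships)
    if total == 0:
        raise ValueError("the student has zero membership in every cluster")
    return sum(mu * y for mu, y in zip(memberships, outputs)) / total


# HYPOTHETICAL regression constants (a_i, b_i), one pair per cluster in the
# order of LABELS. They are placeholders for illustration, not fitted values.
coeffs = [(60.0, 0.4), (40.0, 0.5), (35.0, 0.5), (10.0, 0.4), (5.0, 0.8)]

# Memberships of student 1 (Semester-1 mark 40) as reported in Table-20
mu = [0.0138, 0.0554, 0.8574, 0.0412, 0.0322]

sp = dfes_output(mu, coeffs, 40.0)
print(round(sp, 3))
```

With a crisp membership vector such as [0, 0, 1, 0, 0] the function simply returns the output of the single active rule, so the hard clustering case is recovered as a special case.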

EXPERIMENTAL RESULTS OF K-MEANS TECHNIQUE


Let us consider the marks obtained by 20 students in the Semester-1 and Semester-2 examinations. Table-18 shows the scores achieved by 20 B.Tech. 2nd year students in the Department of Computer Science and Engineering, Ashoka Institute of Technology and Management, Aktha, Saranath, Varanasi-221007, Uttar Pradesh, India, who appeared in the Semester-I and Semester-II examinations.

Table 18: Data Set of Students Score in Semester-I and Semester-II

S.No.  Sem-1  Sem-2      S.No.  Sem-1  Sem-2
1.     40     65         11.    65     45
2.     20     35         12.    89     100
3.     50     65         13.    100    100
4.     10     20         14.    65     35
5.     45     65         15.    48     50
6.     34     60         16.    45     55
7.     48     55         17.    55     25
8.     56     90         18.    84     80
9.     74     70         19.    63     65
10.    45     50         20.    28     30

The above data points (Table-18) are first divided into different clusters using the K-Means clustering technique. For this purpose, we use MATLAB software to group (cluster) the students' score data into five groups (clusters), namely cluster (very high), cluster (high), cluster (average), cluster (low) and cluster (very low), as shown in Table-19.

Table 19: The Membership Functions for Crisp Clustering of Students Academic Performance Evaluation by K-Means Algorithm

                      Classical Clustering (K-Means Clustering)
S.No.  Sem-1  Sem-2   Very High (VH)  High (H)  Average (A)  Low (L)  Very Low (VL)
1.     40     65      0               0         1            0        0
2.     20     35      0               0         0            1        0
3.     50     65      0               0         1            0        0
4.     10     20      0               0         0            0        1
5.     45     65      0               0         1            0        0
6.     34     60      0               0         1            0        0
7.     48     55      0               0         1            0        0
8.     56     90      1               0         0            0        0
9.     74     70      1               0         0            0        0
10.    45     50      0               0         1            0        0
11.    65     45      0               1         0            0        0
12.    89     100     1               0         0            0        0
13.    100    100     1               0         0            0        0
14.    65     35      0               1         0            0        0
15.    48     50      0               0         1            0        0
16.    45     55      0               0         1            0        0
17.    55     25      0               0         0            1        0
18.    84     80      1               0         0            0        0
19.    63     65      0               1         0            0        0
20.    28     30      0               0         0            1        0

Table-19 shows that 5 students belong to cluster (very high), 3 students belong to cluster (high), 8 students belong to cluster (average), 3 students belong to cluster (low) and 1 student belongs to cluster (very low). Table-19 also shows that the 5th student belongs to cluster (average) and the 11th student belongs to cluster (high). This would lead us to conclude that the 11th student is more intelligent than the 5th student. The conclusion is fallacious, since Table-19 shows that the 5th student has improved consistently while the 11th student has deteriorated consistently. This is a drawback of the K-Means clustering algorithm. Another drawback is that the K-Means clustering algorithm cannot calculate the total mark of a student. We have solved these problems with the proposed Dynamic Fuzzy Expert System based on the Fuzzy C-Means clustering algorithm and a regression model.
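The kind of crisp partition discussed above can be illustrated with a short Python sketch of the basic K-Means loop (nearest-centre assignment followed by a centre update) applied to the Table-18 scores. This is not the MATLAB routine used in this study. The function name `kmeans`, the random initialization from data points and the stopping rule are assumptions for illustration, and because K-Means depends on its initialization the resulting clusters need not coincide with those of Table-19.

```python
import math
import random

# Semester-1 and Semester-2 scores from Table-18
SCORES = [(40, 65), (20, 35), (50, 65), (10, 20), (45, 65), (34, 60), (48, 55),
          (56, 90), (74, 70), (45, 50), (65, 45), (89, 100), (100, 100), (65, 35),
          (48, 50), (45, 55), (55, 25), (84, 80), (63, 65), (28, 30)]


def kmeans(X, c, max_iter=100, seed=0):
    """Basic K-Means. Returns (V, U): cluster centres and a 0/1 partition matrix."""
    rng = random.Random(seed)
    V = [tuple(map(float, p)) for p in rng.sample(X, c)]  # assumed initialization
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest centre (Euclidean distance)
        new_labels = [min(range(c), key=lambda i: math.dist(x, V[i])) for x in X]
        if new_labels == labels:  # no point changed cluster: converged
            break
        labels = new_labels
        # Move each centre to the mean of its members
        for i in range(c):
            members = [x for x, lab in zip(X, labels) if lab == i]
            if members:  # keep the old centre if a cluster became empty
                V[i] = tuple(sum(col) / len(members) for col in zip(*members))
    U = [[1 if labels[k] == i else 0 for i in range(c)] for k in range(len(X))]
    return V, U


V, U = kmeans(SCORES, c=5)
for k, row in enumerate(U, start=1):
    print(k, SCORES[k - 1], row)
```

Each row of the partition matrix contains a single 1, which is exactly why two students with different trends can end up ranked only by the cluster they fall into.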


EXPERIMENTAL RESULT OF DYNAMIC FUZZY EXPERT SYSTEM (DFES) FOR MODELING ACADEMIC PERFORMANCE EVALUATION
The main goal of this paper is to propose a new methodology for evaluating the academic performance of students. To this end, the Dynamic Fuzzy Expert System (DFES) is built with the help of fuzzy sets and the Fuzzy C-Means clustering technique. Figure 2 illustrates the components of the Dynamic Fuzzy Expert System. The proposed Dynamic Fuzzy Expert System is implemented using the Takagi-Sugeno-Kang (TSK) model and, to defuzzify the resulting fuzzy set, the center of gravity (COG) defuzzification method is selected. The first step in using Fuzzy C-Means clustering within this model is to identify the parameters that will be fuzzified dynamically and to determine their respective ranges of values. The final result of this interaction is the value for each performance parameter. The proposed system has been simulated using the Fuzzy Logic Toolbox in MATLAB.

Here, we use the Fuzzy C-Means clustering algorithm to classify the students' score data set given in Table-18 (conversion of crisp scores into fuzzy sets). For this purpose, we use Fuzzy Logic Toolbox 2.2.7 by MathWorks to classify (cluster) the students' score data into five classes or clusters, namely Very High, High, Average, Low, and Very Low, for modeling students' academic performance evaluation, as shown in Table-20. Figure-3 shows the students' dataset partitioned into five classes or clusters. Figure-4 shows the performance of the objective function for students' academic performance evaluation.

Table 20: The Membership Functions for Fuzzy Clustering of Students Academic Performance Evaluation by Fuzzy C-Means Algorithm

                      Fuzzy C-Means Clustering Method
S.No.  Sem-1  Sem-2   Very High (VH)  High (H)  Average (A)  Low (L)  Very Low (VL)
1.     40     65      0.0138          0.0554    0.8574       0.0412   0.0322
2.     20     35      0.0036          0.0085    0.0290       0.0194   0.9395
3.     50     65      0.0180          0.1135    0.7891       0.0547   0.0247
4.     10     20      0.0115          0.0236    0.0563       0.0517   0.8569
5.     45     65      0.0106          0.0518    0.8862       0.0323   0.0191
6.     34     60      0.0181          0.0610    0.7755       0.0669   0.0784
7.     48     55      0.0054          0.0260    0.9163       0.0379   0.0145
8.     56     90      0.1674          0.4805    0.2206       0.0826   0.0489
9.     74     70      0.0150          0.9490    0.0184       0.0137   0.0039
10.    45     50      0.0120          0.0485    0.7708       0.1161   0.0525
11.    65     45      0.0192          0.0893    0.1196       0.7410   0.0309
12.    89     100     0.9713          0.0176    0.0052       0.0039   0.0019
13.    100    100     0.9518          0.0272    0.0092       0.0079   0.0038
14.    65     35      0.0021          0.0071    0.0107       0.9751   0.0050
15.    48     50      0.0137          0.0595    0.7240       0.1538   0.0491
16.    45     55      0.0029          0.0126    0.9566       0.0186   0.0093
17.    55     25      0.0173          0.0478    0.0975       0.7416   0.0957
18.    84     80      0.2989          0.5613    0.0661       0.0540   0.0197
19.    63     65      0.0364          0.6519    0.2004       0.0875   0.0237
20.    28     30      0.0066          0.0505    0.0543       0.0505   0.8722


Figure 3: Partition of the Students Score Dataset for Academic Performance Evaluation

Table 21: The Cluster Centers of Very High, High, Average, Low and Very Low

Cluster Center               Sem.-1    Sem.-2
Cluster Centre of Very High  93.2948   98.8680
Cluster Centre of High       70.5267   72.6503
Cluster Centre of Average    44.7493   58.5596
Cluster Centre of Low        61.8312   35.7363
Cluster Centre of Very Low   19.8020   28.9976

Figure 4: Performance of Objective Function

The component values of the fuzzy partition P, stored as the membership matrix U, and of the prototypes V are obtained by solving the fuzzy clustering problem (the academic performance evaluation problem), which is basically the constrained optimization problem in equation (15). A description of each item of notation is as follows. The variable n represents the number of students who sat the Semester-1 and Semester-2 examinations and who will be allocated into c classes or clusters. The variable c represents the number of classes or clusters; the value of this variable can be determined by institutional policy. The matrix U consists of n rows and c columns, of which the element $u_{ik}$ represents the degree of membership (or the suitability level) of the kth student in the ith class. The matrix V consists of m rows and c columns, of which each element represents the (weighted) average of the grades achieved by the students who belong to the ith cluster (or class). In the extreme condition, the value of the objective function in equation (15) is 0, which indicates that the obtained clusters are ideal, since they consist of students with the same level of mastery. In principle, the smaller the value of $J_m$, the better the clustering.

The application of the Fuzzy C-Means algorithm (FCM) is illustrated by the dataset of students' score marks shown in Table-20. Table-22 gives the values of the elements of $U_i$ (i = 1, ..., 5). As an illustration, the values in the 11th row of Table-20 can be interpreted as:

1. The degree of membership of the 11th student in cluster (Very High) is 0.0192.
2. The degree of membership of the 11th student in cluster (High) is 0.0893.
3. The degree of membership of the 11th student in cluster (Average) is 0.1196.
4. The degree of membership of the 11th student in cluster (Low) is 0.7410.
5. The degree of membership of the 11th student in cluster (Very Low) is 0.0309.


From those five values, the 11th student is most suitable for class or cluster (Low), since he/she has the highest degree of membership in this class compared to the other four. The 5th student is most suitable for class or cluster (Average), since he/she has the highest degree of membership in this class compared to the other four. Thus, we conclude that the 5th student has improved consistently while the 11th student has deteriorated consistently. By the same observations, the following classes or clusters were obtained for partitioning the students in the Semester-1 and Semester-2 examinations:

1. The first class or cluster (Very High) consists of student numbers 12 and 13.
2. The second class or cluster (High) consists of student numbers 8, 9, 18 and 19.
3. The third class or cluster (Average) consists of student numbers 1, 3, 5, 6, 7, 10, 15 and 16.
4. The fourth class or cluster (Low) consists of student numbers 11, 14 and 17.
5. The fifth class or cluster (Very Low) consists of student numbers 2, 4 and 20.

Thus, two students belong to class or cluster (Very High), four students belong to class or cluster (High), eight students belong to class or cluster (Average), three students belong to class or cluster (Low) and three students belong to class or cluster (Very Low).

Output Estimation: Regression problems deal with the estimation of an output value based on input values. When used for classification, the input values are values from the database and the output values represent the classes. Regression can thus be used to solve classification problems. In practice, regression takes a set of data and fits the data to a formula. The linear regression formula in two-dimensional space is given below:

$$Y = a + bX \tag{25}$$

where a and b are constants determined by the normal equations for the best linear fit between input and output. This model estimates the actual relationship between input and output. We can use the generated linear regression model to predict an output value given an input value. In this work, we use a linear regression model, computed with MATLAB, to estimate the output of the Dynamic Fuzzy Expert System (DFES) for modeling academic performance evaluation. The outputs of cluster (Very High), cluster (High), cluster (Average), cluster (Low) and cluster (Very Low) are given below:

Average Low

Where X is students mark of semester-1.


Rule Generation

1. If the student belongs to cluster (very high) then student performance is very high.
2. If the student belongs to cluster (high) then student performance is high.
3. If the student belongs to cluster (average) then student performance is average.
4. If the student belongs to cluster (low) then student performance is low.
5. If the student belongs to cluster (very low) then student performance is very low.

If we take the first student of Table-20, then the output of Y is given by

Defuzzification (Calculation of Student Academic Performance) The final calculation of student academic performance is determined by the following formula:

Similarly, we can calculate the academic performance of the other students, as given in Table-22.

Table 22: The Membership Functions and Students Academic Performance Calculated by the Dynamic Fuzzy Expert System

                      Dynamic Fuzzy Expert System Method (Fuzzy C-Means Clustering Method)
S.No.  Sem-1  Sem-2   Very High (VH)  High (H)  Average (A)  Low (L)  Very Low (VL)  Student Performance (SP)
1.     40     65      0.0138          0.0554    0.8574       0.0412   0.0322         58.257320
2.     20     35      0.0036          0.0085    0.0290       0.0194   0.9395         29.438568
3.     50     65      0.0180          0.1135    0.7891       0.0547   0.0247         57.545231
4.     10     20      0.0115          0.0236    0.0563       0.0517   0.8569         24.382494
5.     45     65      0.0106          0.0518    0.8862       0.0323   0.0191         57.753239
6.     34     60      0.0181          0.0610    0.7755       0.0669   0.0784         56.775181
7.     48     55      0.0054          0.0260    0.9163       0.0379   0.0145         56.118908
8.     56     90      0.1674          0.4805    0.2206       0.0826   0.0489         71.297348
9.     74     70      0.0150          0.9490    0.0184       0.0137   0.0039         74.884071
10.    45     50      0.0120          0.0485    0.7708       0.1161   0.0525         53.238884
11.    65     45      0.0192          0.0893    0.1196       0.7410   0.0309         46.385464
12.    89     100     0.9713          0.0176    0.0052       0.0039   0.0019         99.079208
13.    100    100     0.9518          0.0272    0.0092       0.0079   0.0038         98.510788
14.    65     35      0.0021          0.0071    0.0107       0.9751   0.0050         40.595856
15.    48     50      0.0137          0.0595    0.7240       0.1538   0.0491         51.915192
16.    45     55      0.0029          0.0126    0.9566       0.0186   0.0093         57.329090
17.    55     25      0.0173          0.0478    0.0975       0.7416   0.0957         34.151695
18.    84     80      0.2989          0.5613    0.0661       0.0540   0.0197         79.207535
19.    63     65      0.0364          0.6519    0.2004       0.0875   0.0237         69.206512
20.    28     30      0.0066          0.0505    0.0543       0.0505   0.8722         35.532959


Table-22 shows that the 11th student is most suitable for class or cluster (Low), since he/she has the highest degree of membership in this class compared to the other four. The 5th student is most suitable for class or cluster (Average), since he/she has the highest degree of membership in this class compared to the other four. Thus, we conclude that the 5th student has improved consistently while the 11th student has deteriorated consistently. We therefore observe that the Fuzzy C-Means clustering algorithm is more suitable than the classical K-Means clustering algorithm for evaluating academic performance.
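The assignment of each student to the cluster in which he or she has the highest degree of membership can be checked mechanically. The short Python sketch below applies that rule to the membership values reported for the Fuzzy C-Means method; the names `LABELS`, `MEMBERSHIPS` and `strongest_cluster` are introduced here only for illustration.

```python
LABELS = ["VH", "H", "A", "L", "VL"]

# Membership values of students 1..20 as reported for the Fuzzy C-Means method
MEMBERSHIPS = [
    [0.0138, 0.0554, 0.8574, 0.0412, 0.0322],
    [0.0036, 0.0085, 0.0290, 0.0194, 0.9395],
    [0.0180, 0.1135, 0.7891, 0.0547, 0.0247],
    [0.0115, 0.0236, 0.0563, 0.0517, 0.8569],
    [0.0106, 0.0518, 0.8862, 0.0323, 0.0191],
    [0.0181, 0.0610, 0.7755, 0.0669, 0.0784],
    [0.0054, 0.0260, 0.9163, 0.0379, 0.0145],
    [0.1674, 0.4805, 0.2206, 0.0826, 0.0489],
    [0.0150, 0.9490, 0.0184, 0.0137, 0.0039],
    [0.0120, 0.0485, 0.7708, 0.1161, 0.0525],
    [0.0192, 0.0893, 0.1196, 0.7410, 0.0309],
    [0.9713, 0.0176, 0.0052, 0.0039, 0.0019],
    [0.9518, 0.0272, 0.0092, 0.0079, 0.0038],
    [0.0021, 0.0071, 0.0107, 0.9751, 0.0050],
    [0.0137, 0.0595, 0.7240, 0.1538, 0.0491],
    [0.0029, 0.0126, 0.9566, 0.0186, 0.0093],
    [0.0173, 0.0478, 0.0975, 0.7416, 0.0957],
    [0.2989, 0.5613, 0.0661, 0.0540, 0.0197],
    [0.0364, 0.6519, 0.2004, 0.0875, 0.0237],
    [0.0066, 0.0505, 0.0543, 0.0505, 0.8722],
]


def strongest_cluster(row):
    """Label of the cluster with the highest membership degree."""
    return LABELS[max(range(len(row)), key=row.__getitem__)]


assigned = [strongest_cluster(row) for row in MEMBERSHIPS]
counts = {label: assigned.count(label) for label in LABELS}

print(assigned[4], assigned[10])  # 5th and 11th student -> A L
print(counts)  # -> {'VH': 2, 'H': 4, 'A': 8, 'L': 3, 'VL': 3}
```

Unlike the crisp labels, the membership rows also show how close a student is to a neighbouring cluster, for example student 8, whose highest membership (High) is below 0.5.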

COMPARISON OF CLASSICAL, FUZZY EXPERT SYSTEM, K-MEANS, FUZZY C-MEANS CLUSTERING ALGORITHM METHOD FOR MODELING ACADEMIC PERFORMANCE EVALUATION
The comparison of the Classical, Fuzzy Expert System, K-Means and Fuzzy C-Means clustering algorithm methods for students' academic performance is given in Table-23.

Table 23: Comparison of Classical, Fuzzy Expert System, K-Means, Fuzzy C-Means Clustering Algorithm Method

S.No.   Sem-1   Sem-2   Classical Method   Fuzzy Expert System Method   Dynamic Fuzzy Expert System: Student Performance (SP)
1.      40      65      52.50              62.70                        58.257320
2.      20      35      27.50              24.30                        29.438568
3.      50      65      57.50              75.00                        57.545231
4.      10      20      15.00              20.30                        24.382494
5.      45      65      55.00              67.60                        57.753239
6.      34      60      47.00              62.50                        56.775181
7.      48      55      51.50              53.30                        56.118908
8.      56      90      73.00              75.80                        71.297348
9.      74      70      72.00              75.90                        74.884071
10.     45      50      47.50              44.00                        53.238884
11.     65      45      55.00              57.50                        46.385464
12.     89      100     94.50              90.80                        99.079208
13.     100     100     100.0              92.00                        98.510788
14.     65      35      50.00              38.70                        40.595856
15.     48      50      49.00              47.30                        51.915192
16.     45      55      50.00              49.00                        57.329090
17.     55      25      40.00              31.00                        34.151695
18.     84      80      82.00              77.80                        79.207535
19.     63      65      64.00              75.30                        69.206512

K-Means Clustering Method cluster indicators and Dynamic Fuzzy Expert System (Fuzzy C-Means Clustering Method) membership degrees for the classes Very Low (VL), Low (L), Average (A), High (H) and Very High, listed in source order.

S.No. 1 to 7:
0.0322 0.9395 0.0247 0.8569 0.0191 0.0784 0.0145
0.0554
0.8574 0.0290 0.7891 0.0563 0.8862 0.7755 0.9163
0.0036 0.0085 0.0180 0.1135 0.0115 0.0236 0.0106 0.0518 0.0181 0.0610 0.0054 0.0260 0.0379
0.0669 0.0323 0.0517 0.0547 0.0194 0.0412 0.0138

S.No. 8 to 19:
0 0 0 1 0 0.0173 0.0478 0.0975 0.7416 0.0186 0.1538 0.9751 0.9566 0.7240 0.0107 0.0126 0.0595 0.0071 0.0272 0.0092 0.0079 0.0029 0.0137 0.0021 0.9518 0 0 0 0 0 0.9713 0.0176 0.0052 0.0039 0 0 0 0 0 1 1 0 0 0 0 0 0 0.0192 0.0893 0.1196 0.7410 0 0 1 0 0 1
0 0 1 0 0 0.0120 0.0485 0.7708 0.1161
1 0 0 0 0 0.0150 0.9490 0.0184 0.0137
1 0 0 0 0 0.1674 0.4805 0.2206 0.0826
0.0364 0.2989 0.6519 0.5613 0.2004 0.0661 0.0875 0.0540 0.0237 0.0197 0.0957 0.0093 0.0491 0.0050 0.0038 0.0019 0.0309 0.0525 0.0039 0.0489


Table-23 shows that the average marks of the 11th student and the 5th student are the same under the classical method. Table-23 also shows that, under the K-Means method, the 5th student belongs to the Average cluster and the 11th student to the High cluster, whereas under the Fuzzy C-Means method the 5th student belongs to the Average cluster and the 11th student to the Low cluster. We conclude that the classical (mean) method assigns both students the same level of intelligence, while the Fuzzy C-Means clustering method rates the 5th student as more intelligent than the 11th student. Thus, we can say that the Fuzzy C-Means clustering algorithm is a more powerful clustering algorithm than the K-Means clustering algorithm for academic performance evaluation. The Fuzzy C-Means clustering algorithm automatically generates the membership values of the students' semester-1 and semester-2 examination scores for further treatment of student academic performance, such as rule generation for the fuzzy expert system. Figure-5 and Table-24 show the comparison of the K-Means and Fuzzy C-Means clustering algorithms for academic performance evaluation.

The proposed Dynamic Fuzzy Expert System also calculates the total marks of a student who sat the semester-1 and semester-2 examinations. It is based on the Fuzzy C-Means clustering algorithm, a regression analysis model and fuzzy logic. Therefore, we can say that the proposed Dynamic Fuzzy Expert System is a more powerful method for modeling student academic performance evaluation than the classical (mean) method, the fuzzy logic method (Sirigiri Pavani et al., 2012, Chiu-Keung Law, 1996, Wan Suhan Wan Daud et al., 2011, Mamatha S. Upadhya, 2012) and the Fuzzy Expert System method (Ramjeet et al. 2011, O.K. Chaudhari et al., 2012). The proposed Dynamic Fuzzy Expert System automatically converts crisp sets into fuzzy sets, so there is no need for a domain expert. Thus, the proposed Dynamic Fuzzy Expert System is a more powerful method for evaluating student academic performance. This method also evaluates teachers' academic performance on different attributes.

Table 24: Comparison of K-Means and Fuzzy C-Means Clustering Algorithm

Clusters or Classes   K-Means Clustering   Fuzzy C-Means Clustering
Very High             05                   02
High                  03                   04
Average               08                   08
Low                   03                   03
Very Low              01                   03
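The difference between graded Fuzzy C-Means membership and a crisp cluster label, as discussed above, can be sketched with a minimal pure-Python implementation of the standard Fuzzy C-Means updates. This is an illustrative sketch, not the authors' implementation (the paper reports using the MathWorks Fuzzy Logic Toolbox). The function names, the deterministic choice of initial centres, the fuzzifier m = 2 and the mark pairs are all assumptions made for the example.

```python
import math

def membership_row(point, centers, exponent):
    """Membership degrees of one point in every cluster (they sum to 1)."""
    dists = [math.dist(point, v) for v in centers]
    if 0.0 in dists:
        # The point coincides with a centre: give it full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]

def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6):
    """Standard FCM iteration; returns (centers, memberships). Assumes c >= 2."""
    ordered = sorted(points, key=sum)
    n = len(ordered)
    # Assumption for reproducibility: spread the initial centres over the
    # points ordered by total marks (random initialisation is more usual).
    centers = [list(ordered[i * (n - 1) // (c - 1)]) for i in range(c)]
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(max_iter):
        memberships = [membership_row(p, centers, exponent) for p in points]
        new_centers = []
        for i in range(c):
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            new_centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(len(points[0]))
            ])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships

# Hypothetical (semester-1, semester-2) marks, NOT the paper's dataset.
marks = [(10, 15), (20, 12), (15, 25), (50, 55),
         (55, 48), (60, 52), (90, 95), (95, 88)]
centers, memberships = fuzzy_c_means(marks, c=3)
# A crisp label keeps only the largest membership of each row.
hard_labels = [row.index(max(row)) for row in memberships]
for mark, row, label in zip(marks, memberships, hard_labels):
    print(mark, [round(v, 3) for v in row], "->", label)
```

Each printed row shows the full membership vector next to the single label a crisp method would report, which is the additional information the fuzzy approach makes available for later rule generation.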

CONCLUSIONS AND FUTURE WORK


In this paper, we have proposed a Dynamic Fuzzy Expert System for modeling students' academic performance evaluation, based on the Fuzzy C-Means clustering algorithm, fuzzy logic and a regression analysis model. The proposed Dynamic Fuzzy Expert System automatically converts crisp data into fuzzy sets and also calculates the total marks of a student who sat the semester-1 and semester-2 examinations. The K-Means clustering algorithm is based on crisp sets (classical logic), whereas the Fuzzy C-Means clustering algorithm is based on fuzzy logic techniques. We have provided a simple, qualitative methodology to compare the predictive power of the clustering algorithms and the Euclidean distance. We demonstrated our techniques using the K-Means and Fuzzy C-Means clustering algorithms for modeling academic performance evaluation, combined with the deterministic model, on a dataset of B.Tech. (Computer Science and Engineering) students from Sarnath, Varanasi, UP, India, who sat the semester-1 and semester-2 examinations. The marks of these 20 students provide the numerical interpretation of the results for modeling students' academic performance evaluation. Both the K-Means and the Fuzzy C-Means clustering models improve on some limitations of the existing traditional methods, such as the average method and the statistical method. The Fuzzy C-Means model, which is based on fuzzy logic, is a better model for academic performance evaluation than the K-Means clustering model, which is based on crisp sets (classical logic). In this paper, we have observed that the Fuzzy C-Means algorithm performed best among the models compared for modeling academic performance in the educational domain. Therefore, the Fuzzy C-Means clustering algorithm serves as a good benchmark for monitoring the progression of students in the educational domain. It also supports decision making by academic planners, semester by semester, with a view to improving academic results in subsequent academic sessions. It is worth future research to combine Fuzzy C-Means with artificial neural networks, in a so-called Neuro-Dynamic Fuzzy Expert System, to evaluate student and teacher academic performance and to develop adaptive learning systems and Intelligent Tutoring Systems for Internet-based education such as distance education. The system is implemented using the Fuzzy Logic Toolbox™ 2.2.7 by MathWorks.

Table 23 (continued)

S.No.   Sem-1   Sem-2   Classical Method   Fuzzy Expert System Method   Dynamic Fuzzy Expert System: Student Performance (SP)
20.     28      30      29.00              24.10                        35.532959

Fuzzy C-Means membership degrees for S.No. 20, in source order: 0.0066 0.0505 0.0543 0.0505 0.8722

Figure 5: Comparison of K-Means and Fuzzy C-Means Clustering Algorithm for Modeling Academic Performance Evaluation

ACKNOWLEDGEMENTS
I would like to express my deep sense of gratitude and respect to my supervisor, Prof. Pervez Ahmed, for his excellent guidance and suggestions. He has been a source of inspiration for me. I would like to render heartiest thanks to various friends for their priceless help and support. Last but not least, we thank our parents and wife, and the Almighty, whose blessings are always with us.

REFERENCES
1. K. Mankad, P.S. Sajja and R. Akerkar, Evolving Rules Using Genetic Fuzzy Approach: An Educational Case Study, International Journal on Soft Computing, 2(1), pp. 35-46, 2011.
2. R. Biswas, An Application of Fuzzy Sets in Students Evaluation, Fuzzy Sets and Systems, ELSEVIER, pp. 187-194, 1995.
3. L.A. Zadeh, Fuzzy Sets, Information and Control, 8, pp. 338-354, 1965.


4. H.Y. Wang and S.M. Chen, Artificial Intelligence Approach to Evaluate Students Answerscripts Based on the Similarity Measure Between Vague Sets, Educational Technology and Society, 10(4), pp. 224-241, 2007.
5. W.L. Gau and D.J. Buehrer, Vague Sets, IEEE Transactions on Systems, Man and Cybernetics, 23(2), pp. 610-614, 1993.
6. S.M. Bai and S.M. Chen, Evaluating Students Learning Achievement Using Fuzzy Membership Functions and Fuzzy Rules, Expert Systems with Applications, ELSEVIER, 34, pp. 399-410, 2008.
7. C.K. Law, Using Fuzzy Numbers in Educational Grading System, Fuzzy Sets and Systems, 83, pp. 311-323, 1996.
8. S.M. Chen and C.H. Lee, New Methods for Students Evaluation Using Fuzzy Sets, Fuzzy Sets and Systems, 104, pp. 209-218, 1999.
9. H.Y. Wang and S.M. Chen, New Methods for Evaluating Students Answerscripts Using Fuzzy Numbers Associated with Degrees of Confidence, 2006 IEEE International Conference on Fuzzy Systems, pp. 1004-1009, 2006.

10. R. Stathacopoulou, G.D. Magoulas, M. Grigoriadou and M. Samarakou, Neuro-Fuzzy Knowledge Processing in Intelligent Learning Environments for Improved Student Diagnosis, Information Sciences, ELSEVIER, 170(2-4), pp. 273-307, 2005.
11. Y.Y. Guh, M.S. Yang, R.W. Po and E.S. Lee, Establishing Performance Evaluation Structures by Fuzzy Relation Based Cluster Analysis, Computers and Mathematics with Applications, 56, pp. 572-582, 2008.
12. E. Gokmen, T.C. Akinci, M. Tektas, N. Onat, G. Kocyigit and N. Tektas, Evaluation of Student Performance in Laboratory Applications Using Fuzzy Logic, Procedia Social and Behavioral Sciences, 2, pp. 902-909, 2010.
13. I.A. Hameed, Using Gaussian Membership Functions for Improving the Reliability and Robustness of Students Evaluation System, Expert Systems with Applications, 38(6), pp. 7135-7142, 2011.
14. A. Baylari and G.A. Montazer, Design a Personalized E-learning System Based on Item Response Theory and Artificial Neural Network Approach, Expert Systems with Applications, 36, pp. 8013-8021, 2009.
15. C.L. Posey and L.W. Hawkes, Neural Networks Applied in the Student Model, Intelligent Systems, 88, pp. 275-298, 1996.
16. R. Stathacopoulou, M. Grigoriadou, M. Samarakou and D. Mitoropoulou, Monitoring Students Action and Using Teachers Expertise in Implementing and Evaluating the Neural Network-based Fuzzy Diagnostic Model, Expert Systems with Applications, 32, pp. 955-975, 2007.
17. R. Bhatt and D. Bhatt, Fuzzy Logic Based Student Performance Evaluation Model for Practical Components of Engineering Institutions Subjects, International Journal of Technology and Engineering Education, 8(1), pp. 1-7, 2011.
18. C.R. Gupta and A.K. Dhawan, Diagnosis, Modeling and Prognosis of Learning System Using Fuzzy Logic and Intelligent Decision Vectors, International Journal of Computer Applications, 37(6), pp. 975-987, 2012.
19. J. Ma and D. Zhou, Fuzzy Set Approach to the Assessment of Student Centered Learning, IEEE Transactions on Education, 43(2), pp. 112-120, 2000.
20. K.J. Cios, W. Pedrycz, R.W. Swiniarski and L.A. Kurgan, Data Mining: A Knowledge Discovery Approach, Springer, pp. 263-265, 2007.
21. S. Gagula-Palalic and M. Can, Fuzzy Clustering Models and Algorithms for Pattern Recognition, Master Thesis, pp. 13-17, 2008.
22. J. Yen and R. Langari, Fuzzy Logic: Intelligence, Control and Information, Center for Fuzzy Logic, Robotics and Intelligent Systems, Texas A & M University, pp. 375-401, 1999.


23. S.S. Sansgiry, M. Bhosle and K. Sail, Factors that Affect Academic Performance among Pharmacy Students, American Journal of Pharmaceutical Education, pp. 231-243, 2006.
24. O.J. Oyelade, O.O. Oladipupo and Obagbua, Application of K-Means Clustering Algorithm for Prediction of Students Academic Performance, International Journal of Computer Science and Information Security, 7(1), pp. 292-295, 2010.
25. Z. Zukhri and K. Omar, Solving New Student Allocation Problem with Genetic Algorithm: A Hard Problem for Partition Based Approach, International Journal of Soft Computing Applications, Euro Journal Publishing Inc., pp. 6-15, 2008.
26. S. Pavani, P.V.S.S. Gangadhar and K.K. Gulhare, Evaluation of Teachers Performance Evaluation Using Fuzzy Logic Techniques, International Journal of Computer Trends and Technology, 3(2), pp. 200-205, 2012.
27. V. Sreenivasarao and G. Yohannes, Improving Academic Performance of Students of Defense University Based on Data Warehousing and Data Mining, Global Journal of Computer Science and Technology, 12(2), pp. 201-209, 2012.
28. O. Afoayan and E. El-Shamir Absalom, Design and Implementation of Students Information System for Tertiary Institutions Using Neural Network Techniques, International Journal of Green Computing, 1(1), pp. 115, 2010.
29. O.K. Chaudhari, P.G. Khot and K.C. Deshmukh, Soft Computing Model for Academic Performance of Teachers Using Fuzzy Logic, British Journal of Applied Science and Technology, 2(2), pp. 213-226, 2012.
30. A. Neogi, A.C. Mondal and S. Mandal, A Cascaded Fuzzy Inference System for University Non-Teaching Staff Performance Appraisal, Journal of Information Processing Systems, 7(4), pp. 595-612, 2011.
31. R.S. Yadav and V.P. Singh, Modeling Academic Performance Evaluation Using Soft Computing Techniques: A Fuzzy Logic Approach, International Journal on Computer Science and Engineering, 3(2), pp. 676-686, 2011.
32. W.S.W. Daud, K.A.A. Aziz and E. Sakib, An Evaluation of Students Performance in Oral Presentation Using Fuzzy Approach, Empowering Science, Technology and Innovation towards a Better Tomorrow, UMTAS 2011, (MO36), pp. 157-162, 2011.
33. M.S. Upadhyay, Fuzzy Logic Based of Performance of Students in College, Journal of Computer Applications (JCA), 5(1), pp. 6-9, 2012.
34. H. White, Learning in Artificial Neural Networks: A Statistical Perspective, Neural Computation, 1, pp. 425-464, 1989.
35. J.C. Giarratano and G. Riley, Expert System: Principles and Programming, Fourth ed., PWS Publishing Com., Boston, MA, USA, 2005.
36. M. Schneider, G. Langholz, A. Kandel and G. Chew, Fuzzy Expert System Tools, John Wiley and Sons, USA, 1996.

Ramjeet Singh Yadav is working as an Associate Professor and Head in the Department of Computer Science and Engineering, Ashoka Institute of Technology and Management, Paharia, Sarnath, Varanasi (Uttar Pradesh), India. In addition, he is a Research Scholar in the Department of Computer Science and Engineering, Sharda University, Greater Noida, Uttar Pradesh, India. His research interests are Fuzzy Logic, Neural Networks, Genetic Algorithms, Neuro-Fuzzy Systems and Dynamic Fuzzy Expert Systems. He has published over four journal papers (one in an International Journal and three in National Journals) and fifteen papers in National and International Conference proceedings.


Professor Pervez Ahmed is working as a Professor in the Department of Computer Science and Engineering in Sharda University, Greater Noida, Uttar Pradesh, India. Professor Ahmed has more than three decades of teaching experience of Computer Science courses, at undergraduate and graduate levels, in universities in Iraq (1975-78), Canada (1979-88), India (1989-89) and Saudi Arabia (1990-2010). In 1999, he was appointed as Visiting Professor of Computer Science by the Commonwealth Secretariat, UK. He is the founder chairman of the Computer Science department of Aligarh Muslim University, UP, India, and has served as Chairman, Computer Science and Engineering department, International Science College, Al-Baha, Saudi Arabia. He has been a Senior Software Designer at PHILIPS/MICOM, Montreal, Canada; Research Fellow (MRI imaging) at Montreal Neurological Institute, McGill University, Canada, and visiting Scientist, Centre for Pattern Recognition and Machine Intelligence (CENPARMI), Montreal, Canada. His primary area of research is Pattern Recognition and Machine Intelligence. During his Ph.D. he developed, implemented and tested a novel technique for postal mail sorting by automatically recognizing the zip-codes that were extracted from the totally unconstrained handwritten mail addresses. The technique was tested on real-life data collected by the US postal service department.