swer the following research questions: Can we find classes of students? In other words, do there exist groups of students who use these online resources in a similar way? If so, can we identify that class for any individual student? With this information, can we help a student use the resources better, based on the usage of the resource by other students in their groups? We hope to find similar patterns of use in the data gathered from LON-CAPA, and eventually be able to make predictions as to the most beneficial course of studies for each learner based on their present usage. The system could then make suggestions to the learner as to how best to proceed.
2 Map the Problem to Genetic Algorithm
Genetic Algorithms (GAs) have been shown to be an effective tool in data mining and pattern recognition. An important aspect of GAs in a learning context is their use in pattern recognition. There are two different approaches to applying a GA in pattern recognition:

1. Apply a GA directly as a classifier. Bandyopadhyay and Murthy applied a GA to find the decision boundary in N-dimensional feature space.
2. Use a GA as an optimization tool for resetting the parameters in other classifiers. Most applications of GAs in pattern recognition optimize some parameters in the classification process. Many researchers have used GAs in feature selection. GAs have also been applied to find an optimal set of feature weights that improve classification accuracy: first, a traditional feature extraction method such as Principal Component Analysis (PCA) is applied, and then a classifier such as k-NN is used to calculate the fitness function for the GA. Combination of classifiers is another area that GAs have been used to optimize. Kuncheva and Jain used a GA for selecting the features as well as selecting the types of individual classifiers in their design of a Classifier Fusion System. GAs are also used in selecting the prototypes in case-based classification.

In this paper we will focus on the second approach and use a GA to optimize a combination of classifiers. Our objective is to predict the students' final grades based on their web-use features, which are extracted from the homework data. We design, implement, and evaluate a series of pattern classifiers with various parameters in order to compare their performance on a dataset from LON-CAPA. Error rates for the individual classifiers, their combination, and the GA-optimized combination are presented.
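To make the second approach concrete, the following is a minimal sketch of a GA that searches for feature weights, using the leave-one-out accuracy of a 1-NN classifier as the fitness function. It is an illustration only, not the implementation evaluated in this paper: the function names (`loo_1nn_accuracy`, `evolve_weights`, `make_toy_data`), the GA settings (population size, tournament selection, uniform crossover, Gaussian mutation, elitism), and the synthetic two-feature data are all assumptions made for the example. The PCA step and the combination of classifiers are omitted.

```python
import random


def weighted_sq_dist(a, b, w):
    """Squared Euclidean distance with one non-negative weight per feature."""
    return sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w))


def loo_1nn_accuracy(X, y, w):
    """Fitness: leave-one-out accuracy of a 1-NN classifier under weights w."""
    correct = 0
    for i, xi in enumerate(X):
        nearest = min((j for j in range(len(X)) if j != i),
                      key=lambda j: weighted_sq_dist(xi, X[j], w))
        correct += y[nearest] == y[i]
    return correct / len(X)


def evolve_weights(X, y, pop_size=20, generations=20, mutation_rate=0.2, seed=0):
    """Hypothetical GA loop: returns (best weight vector, its fitness)."""
    rng = random.Random(seed)
    n = len(X[0])
    # Seed the population with the unweighted vector so that, with elitism,
    # the result is never worse than plain (unweighted) 1-NN.
    pop = [[1.0] * n] + [[rng.random() for _ in range(n)]
                         for _ in range(pop_size - 1)]
    for _ in range(generations):
        # Evaluate each chromosome once per generation.
        scored = sorted(((loo_1nn_accuracy(X, y, w), w) for w in pop),
                        key=lambda t: t[0], reverse=True)
        children = [w for _, w in scored[:2]]  # elitism: keep the two best
        while len(children) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            p1 = max(rng.sample(scored, 3), key=lambda t: t[0])[1]
            p2 = max(rng.sample(scored, 3), key=lambda t: t[0])[1]
            # Uniform crossover, then Gaussian mutation clipped to [0, 1].
            child = [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, 0.1)))
                     if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
    best_fit, best_w = max(((loo_1nn_accuracy(X, y, w), w) for w in pop),
                           key=lambda t: t[0])
    return best_w, best_fit


def make_toy_data(n=30, seed=1):
    """Synthetic data: feature 0 separates the classes, feature 1 is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([label + rng.gauss(0.0, 0.2), rng.gauss(0.0, 3.0)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    weights, accuracy = evolve_weights(X, y)
    print("unweighted accuracy:", loo_1nn_accuracy(X, y, [1.0, 1.0]))
    print("GA weights:", weights, "accuracy:", accuracy)
```

In this sketch each chromosome is a vector of per-feature weights, so a weight near zero effectively deselects a feature. The same loop could in principle be reused to weight the votes of several classifiers by changing what the chromosome encodes and what the fitness function measures.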