
A Decision Support System to improve e-Learning Environments
Marta Zorrilla
Department of Mathematics, Statistics and Computation, University of Cantabria
Avda. Los Castros s/n, Santander, Spain
34 942 20 20 63
zorrillm@unican.es

Diego García
Department of Mathematics, Statistics and Computation, University of Cantabria
Avda. Los Castros s/n, Santander, Spain
34 942 20 14 20
diego.garcias@alumnos.unican.es

Elena Álvarez
Department of Applied Mathematics and Computer Science, University of Cantabria
Avda. Los Castros s/n, Santander, Spain
34 942 20 18 44
alvareze@unican.es

ABSTRACT
Nowadays, due to the lack of face-to-face contact, distance course
instructors have real difficulties knowing who their students are, how
their students behave in the virtual course, what difficulties they find
and what probability they have of passing the subject; in short, they need
feedback which helps them to improve the learning-teaching process.
Although most Learning Content Management Systems (LCMS) offer a reporting
tool, in general, these do not show a clear vision of each student’s
academic progression. In this work, we propose a decision making system
which helps instructors to answer these and other questions using data
mining techniques applied to data from LCMS databases. The goal of this
system is that instructors do not require data mining knowledge: they only
need to request a pattern or model, interpret the result and take the
educational actions which they consider necessary.

Categories and Subject Descriptors
H.2.8 Database Applications: data mining

General Terms
Algorithms, Management, Design, Experimentation

Keywords
Data mining, Web mining, Data warehouse, E-learning, Distance education.

Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy
otherwise, to republish, to post on servers or to redistribute to lists,
requires prior specific permission and/or a fee.
EDBT 2010, March 22-26, 2010, Lausanne, Switzerland.
Copyright 2010 ACM 978-1-60558-945-9/10/0003 ...$10.00.

1. INTRODUCTION
In recent years, more and more universities and educational centers offer
the possibility of enrolling in their degrees and masters in a
semi-presential or completely virtual (online) way in order to facilitate
lifelong learning and to make this compatible with other activities. In
general, they use e-learning platforms such as learning content management
systems (LCMS), intelligent tutorial systems, adaptive and intelligent
web-based systems, etc. to support the learning and teaching process.

A lot has been written and said about guidelines for designing virtual
courses [3][5][25] and more and more instructors follow them with the aim
of increasing the pass rate. But, even if the course is well-designed, it
may not be suitably adapted to students’ learning styles [10][14] or
perhaps students feel under-attended or lost in the hyperspace of the
course and they require extra motivation [6].

Instructors unfortunately have very few tools to monitor and track the
student activity in the platform and so be able to detect and solve these
problems. These systems offer some reporting tools that, in general, show
raw data (nº of accesses, time spent in the course, nº of messages read,
etc.) in a tabular format. As a consequence of this, getting a clear vision
of each student or group academic progression during the course is
difficult and time consuming for instructors [8].

But, at the same time, these systems accumulate a vast amount of
information which is very valuable and can be used to analyze students’
behavior and the effectiveness of course design, to predict students’
performance and their final mark, to group students according to their
preferences, and in short, to improve the educational process, if it is
suitably treated. It is here that the decision support systems have a
practical application of great interest for the educational community,
since they are oriented towards defining and measuring business key
performance indicators (KPI), understanding their behavior, and
processing, summarizing, reporting and distributing the relevant
information on time.

We have already developed a tool called MATEP [36] for monitoring and
analyzing the learners’ behavior in e-learning platforms, in particular for
WebCT 4.0 (now BlackBoard). This tool uses the data registered in a data
warehouse database which come from log files and LCMS database (see
Figure 1). MATEP allows instructors to know useful, consistent,
understandable information by means of expressive and easy-to-use static
and dynamic reports which are built querying the data warehouse database
directly or the OLAP cubes.

With these reports, instructors know how their students progress in the
virtual course, compare their activity with respect to the average student
activity, get an idea of the learning style according to the resources
students use and assess the course design seeing the click stream carried
out by the students. But, they do not yet answer other questions such as:

• Knowing students’ profiles according to demographic and navigation
  information
• Grouping students according to their style of learning
• Knowing drop-out students’ profile
• Predicting students’ grades
• Finding out the questions which students fail more frequently
• And so on.

But questions such as the previous ones cannot be answered if data mining
techniques are not used. For that, it is convenient that LCMSs add modules
which support “intelligent techniques”. Currently, data mining tools
(Weka [30], Keel [2], etc.) are normally designed more for power and
flexibility than for simplicity. Most of the current data mining tools are
too complex for educators to use and their features go well beyond the
scope of what an educator may want to do [23]. Consequently, these modules
must incorporate data mining capacities using an intuitive and easy to use
interface which requires neither choosing algorithms nor their parameters
and offers good visualization facilities to make their results meaningful
to educators and e-learning designers.

As far as we know there are two works in this sense, TADA-Ed [15] and
Moodle Data Mining Tool [23], but in both, instructors have to have certain
knowledge of data mining to use them.

Our proposal is to extend our data warehouse architecture in order to
generate and store the data mining models [18]. That means choosing the
variables that allow us to answer each question, specifying the
pre-processing tasks they require, storing them suitably in the data
warehouse, determining the algorithm to be used in such a way that
instructors only need to interpret the result and take the educational
actions which they consider necessary, and finally, storing the data mining
models obtained in order to calculate incremental and refined patterns
later.

The paper is organized as follows. In Section 2 data mining applied to the
educational context is introduced and works published in this area are
mentioned. Section 3 explains the proposed architecture to develop the
decision system. Section 4 gives details about how to obtain some patterns
which are interesting for instructors. Finally, Section 5 summarizes the
goals of the work which we propose.

Figure 1. Extended MATEP architecture


2. RELATED WORK
Educational data mining (EDM) is an emergent discipline concerned with
developing methods for exploring the unique types of data that come from
the educational context [22].

In short, EDM is the application of the data mining techniques in the area
of education, with the aim of obtaining a better comprehension of the
students’ learning process and of how they participate in it, in order to
improve the quality of the educational system.

Data mining techniques are extensively used in other fields such as
business, marketing, bioinformatics, science and so on, but the specific
characteristics of data from e-learning environments make their application
particular. One of these characteristics is the fact that it is difficult,
or even impossible, to compare different methods or measures a posteriori
and decide which is the best [16]. Take the example of building a system to
transform hand-written documents into printed documents. This system has to
discover the printed letters behind the hand-written ones. It is possible
to try several sets of measures or parameters and experiment what works
best. Such an experimentation phase is difficult in the educational field
because the data is very dynamic and can vary a lot among samples
(different course design, students with different skills, different methods
of assessment, different resources used, etc.). This reduces the amount of
data available to mine, only that corresponding to the students enrolled in
the course. Furthermore, as a consequence of not using more data than that
stored in the database of the e-learning platform, data mining models lack
context information. That means that we will obtain a model but it will
surely not be the best. We would obtain more accurate patterns if we knew
more about course details, had background knowledge of the students or
their interest in the course and so on (this information could be obtained
from surveys, for example). One advantage is that data sets are usually
very clean, i.e., the values are correct, so that few pre-processing tasks
are required.

There are a great number of works in which data mining techniques are used
in order to understand learner behaviour [12][27], to recommend activities,
topics, etc. [34] or to provide instructional messages to learners [29]
with the aim of improving the effectiveness of the course, promoting
group-based collaborative learning [20], predicting students’ performance
[12], etc. A survey about the application of data mining to educational
systems is found in [22].

Other data mining fields related to our aim are interactive data mining and
visual data mining. The first one aims to investigate ways in which the
user can become an integral part of the mining process [19][35]. The need
for user inclusion is based on the premise that the concept of
interestingness is subjective rather than objective and cannot therefore be
defined in heuristic terms. Regarding visual data mining, it focuses on
integrating the user in the KDD process in terms of effective and efficient
visualization techniques, interaction capabilities and knowledge transfer
[9][13][31]. Although our aim is to avoid instructors needing to know data
mining techniques in order to take advantage of them, the advances in both
areas will help us twofold: how to include the instructor’s participation
in the discovery knowledge process if necessary [4]; and how to explain the
data mining results [33].

Finally, it must be mentioned that there are other research works which
focus on analysing distance student data using other technologies such as
data warehousing and OLAP, for example [26][32][37].

3. ARCHITECTURE
The proposed system, with the aim of being generic and usable for different
e-learning platforms, is designed based on a modular architecture as can be
observed in Figure 1. This will have at least the following modules:

• A module to read and gather data from the e-learning platform, to carry
  out the pre-processing tasks related to the application of data mining
  algorithms and to store this data in the data warehouse database (this
  module gathers ETL processes and the Data Staging Area)
• A module which wraps the data mining algorithms (Data Mining Module)
• A user-friendly interface oriented towards the analysis of results.

Three open source data mining software packages, RapidMiner [17], Weka [30]
and Keel [2] will be mainly used and tested for our proposal. We have
chosen these tools because they are open-source, their algorithms can
either be applied directly to a dataset from their own interface or used in
your own Java code and all of them contain tools for data pre-processing,
classification, regression, clustering, association rules and
visualization. Furthermore, RapidMiner is currently the leading open-source
data mining solution according to KDnuggets Data Mining Software Usage
polls in 2009 [21]. Although RapidMiner incorporates most of the Weka
algorithms, it still contains some algorithms, especially in the area of
descriptor selection, which are not available in other software. Finally
Keel will be used to build models using evolutionary algorithms which are
not gathered in the other tools. With regard to commercial tools, BI SQL
Server 2005 will be used due to the fact that our data warehouse is
developed with it.

The communication among modules will be done by means of XML files.
Wherever it is possible, we use the standard Predictive Model Markup
Language (PMML). This is an XML-based language which provides a way for
applications to define statistical and data mining models and to share
models between PMML compliant applications.

The user interface will be designed according to web standards.

Likewise, we also think that a data mining service [28] independent from
this architecture would be good, in such a way that instructors could
benefit without having to deploy all the system: they would only need to
prepare the input data file according to a template, request the generation
of the model and interpret the results.

4. PROPOSAL OF PATTERNS
Initially, the set of models which we will design will use descriptive
techniques such as clustering and association. This allows us to gain an
insight into students’ characteristics and depict students’ learning
patterns. Later, we will deal with
prediction and classification tasks, once we have determined which
parameters are more relevant.

We start giving answer to questions such as:

• Knowing students’ profiles according to demographic and navigation
  information
• Knowing drop-out students’ profile and successful students’ profile
• Knowing session patterns
• Grouping students according to their use of course resources
• Finding out the questions which students fail more frequently
• Discovering the resources which are commonly used together

As it has been said, we have a data warehouse database, so we have the
parameters from LCMS database already prepared to make decisions. This
means that the pre-processing tasks are reduced. Table 1 gathers the set of
variables with which we can work. These can be obtained with different
levels of aggregation. For example, student time spent can be calculated
per session, per resource, per action, per week, etc.

Table 1. Some of available variables.

Variables
Nº sessions                     Learner gender
Nº hits                         Learner academic level
Time spent                      Learner age
Delay among sessions            Nº chat rooms entered
Nº content pages viewed         Nº wiki pages edited
Nº messages posted to forum     Nº quizzes done
Nº messages read on forum       Nº quizzes passed
Nº messages replied to forum    Nº quizzes failed
Nº messages sent by mail        Nº items in test
Nº messages read on mail        Nº attempts in item
Nº assignments read             Grade in item
Nº assignments submitted        Mark in each assessment, test, or
                                assessed task

In general, numerical data will be discretized into categorical classes
that are easier to understand for the instructor (categorical values are
more user-friendly for the instructor than precise magnitudes and ranges)
[7]. For example, the mark could be divided in 4 classes: fail, pass, good
and excellent if the system has the criteria to do it. The equal-width
method will be generally used to divide the range of the attribute into
intervals, for example nº of sessions could be divided in low, medium or
high.

We define session as a series of requests by the same identified student
(user) from the moment he or she connects to the course until he or she
disconnects or leaves it. We consider hit as each click in a web page.
Resource is any tool available in the LMS such as mail, forum, wiki, and so
on. Lastly, we define action as any activity inside a resource, for example
sending a mail, browsing content pages, etc.

Next, we show two possible templates for building the student profile and
session profile and a template for discovering the resources which are used
together.

For this case study we have used the data from a virtual course entitled
“Introduction to multimedia methods”. It is a subject of 6 ECTS which was
taught in the first semester of 2009 at the largest virtual campus in
Spain, called G9 (this group is composed of 9 Spanish universities; one of
them is the University of Cantabria). It is a practical subject in which a
multimedia tool is taught. The course is designed by means of web pages
conformed to SCORM [24] and includes some video tutorials, flash animations
and interactive elements. It is registered in Blackboard LMS. Although the
number of students enrolled in the course was 80, only 45 did the first
assignment, whose submission was 15 days after the beginning of the course,
and finally, 37 students followed the course until the end.

First, we analyse the student profile. This template utilizes the following
input parameters: gender, age, number of sessions in the course, time spent
in the course, average sessions per week, average time spent per week.

These parameters were chosen in accordance with the instructor, since the
model obtained will be explained depending on them. We use EM
(Expectation-Maximization) and KMeans as clustering algorithms [30]. Given
that the EM algorithm provides a probability distribution that can be used
as a similarity criterion to characterize the data, we utilize it in order
to know the number of clusters with which the KMeans algorithm will be
executed (required parameter). We show the result obtained with KMeans
because it is easier to understand it graphically and statistically. Each
cluster is represented by its centroid, that is, the "average" of all its
points.

As can be extracted from Figure 2, 44% of the students are females with an
age around 22 and they carry out more sessions than the rest and these are
longer. Males about 24 represent 26% of the population and dedicate less
time per session (practically half) and the number of visits is lower. The
youngest males (31%) have a behavior quite similar to the female group. The
image on the right shows each cluster graphically, using the percentile in
which the cluster variable value is located. Cluster 1 and 2 are very
similar; the difference is in the gender variable, which is not represented
due to the fact that it is not numerical.

Next, we show the template for discovering the resources which are commonly
used together. This contains the following parameters: session id, and a
Boolean variable indicating if each one of the available resources in the
course was visited: content-page, mail, discussion, chat, assignments,
weblink, organizer, learning objectives, assessment, calendar and others.
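As an illustration of the resource-usage template, the following minimal sketch turns a list of hits into one Boolean row per session. The input format (pairs of session id and resource name), the function name and the shortened resource list are assumptions made for this example; they are not taken from the data warehouse described in the paper.

```python
from collections import defaultdict

# Hypothetical, shortened resource list; the template in the text has more entries.
RESOURCES = ["content-page", "mail", "discussion", "assignments", "organizer"]

def session_resource_table(hits, resources=RESOURCES):
    """Turn (session_id, resource) hits into one Boolean row per session."""
    visited = defaultdict(set)
    for session_id, resource in hits:
        visited[session_id].add(resource)
    # One row per session: resource name -> was it visited in that session?
    return {
        session_id: {r: (r in seen) for r in resources}
        for session_id, seen in sorted(visited.items())
    }

# Hypothetical hits for three sessions.
hits = [
    ("s1", "organizer"), ("s1", "discussion"), ("s1", "discussion"),
    ("s2", "organizer"), ("s2", "content-page"), ("s2", "assignments"),
    ("s3", "mail"),
]
table = session_resource_table(hits)
for session_id, row in table.items():
    print(session_id, row)
```

Repeated hits on the same resource collapse into a single True value, which matches the template's "was visited" semantics rather than a hit count.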
                 Average      Cluster 0    Cluster 1    Cluster 2
Age              22.4872      24           22.1765      21.6667
Gender           Male         Male         Female       Male
TotalTime        1976.8065    1313.4806    2290.4953    2085.185
Sessions         128.0323     80.6032      146.4175     141.5108
AvgTimeWeek      115.8065     76.8806      134.26       122.1022
AvgSessionWeek   7.0323       4.3032       8.1233       7.7608
Instances                     26% (10)     44% (17)     31% (12)

Figure 2. Student profile.
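To make the notion of a centroid concrete, here is a minimal pure-Python sketch of k-means on two numerical attributes. It is not the implementation used to produce Figure 2: the student records are hypothetical, the number of clusters is simply fixed to 2 (the EM step that selects it is not shown), and the attributes are left unscaled for brevity.

```python
import random

def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means on tuples of floats; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its points.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
    return centroids, labels

# Hypothetical students described by (age, number of sessions).
students = [
    (22.0, 146.0), (21.0, 141.0), (22.0, 150.0),  # frequent visitors
    (24.0, 80.0), (25.0, 78.0), (23.0, 83.0),     # occasional visitors
]
centroids, labels = kmeans(students, k=2)
for c, centroid in enumerate(centroids):
    print(f"cluster {c}: centroid={centroid} size={labels.count(c)}")
```

In practice the attributes would normally be standardized first, since an attribute with a large range (such as total time) otherwise dominates the distance.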

Table 2. More interesting frequent itemset

Resources                                   Percentage (%)
Organizer                                   80.3
Discussion                                  61.3
Other                                       43.5
Assignment                                  36.9
Content                                     31.8
Mail                                        19.4
Assignment, Discussion, Organizer           18.0
Content, Discussion, Organizer              15.8
Content, Assignment, Organizer              12.5
Assignment, Other, Discussion, Organizer    11.4

We use the Apriori algorithm [1] (association rule) in this case since its
goal is to find frequent item sets. As a consequence of the fact that some
of the resources were used very little, those variables with a rate of use
inferior to 15% of the instances (sessions) were removed. The file used had
5666 instances. We executed the algorithm with a minimum support of 0.2 and
a minimum confidence of 0.7 as parameters. A support of 20% means that 20%
of all the sessions under analysis show that the resources in the
antecedent and consequent of the rule are used together. A confidence of
70% means that 70% of the sessions which used the resources shown in the
antecedent of the rule, also used the resources which appear in the
consequent of the rule.

The minimum support parameter is obtained from the frequent item set
calculation (first phase of the algorithm, see Table 2), 0.114 in this case
and it is established a little higher, 0.2, in order to obtain more
interesting rules; and the confidence parameter is established slightly
below the value of the most frequent item set, organizer with 80.3% in this
case, in order to obtain rules in which other resources appear.
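The definitions of support and confidence given above can be sketched as follows. This only evaluates one candidate rule against the two thresholds; it does not implement the Apriori search itself. The session rows, the resource names and the function name are hypothetical.

```python
def support_and_confidence(sessions, antecedent, consequent):
    """Support and confidence of the rule antecedent => consequent.

    `sessions` is a list of dicts mapping resource name -> bool;
    `antecedent` and `consequent` map resource name -> required value.
    """
    def matches(row, conditions):
        return all(row[name] == value for name, value in conditions.items())

    n_antecedent = sum(matches(s, antecedent) for s in sessions)
    n_both = sum(matches(s, antecedent) and matches(s, consequent)
                 for s in sessions)
    support = n_both / len(sessions)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence

# Hypothetical toy data: five sessions, three resources.
sessions = [
    {"discussion": True,  "content": False, "mail": False},
    {"discussion": True,  "content": False, "mail": True},
    {"discussion": True,  "content": True,  "mail": False},
    {"discussion": False, "content": True,  "mail": False},
    {"discussion": True,  "content": False, "mail": False},
]

# Candidate rule: discussion used => content-pages not used.
support, confidence = support_and_confidence(
    sessions, {"discussion": True}, {"content": False}
)
MIN_SUPPORT, MIN_CONFIDENCE = 0.2, 0.7  # thresholds quoted in the text
rule_kept = support >= MIN_SUPPORT and confidence >= MIN_CONFIDENCE
print(f"support={support:.2f} confidence={confidence:.2f} kept={rule_kept}")
```

Treating "resource not used" as an item in its own right is what allows rules with a "No" consequent, such as those reported for this course.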

Rule Disc. Assig. Content Instances ⇒ Mail Instances Support Confidence
8 Yes 1799 ⇒ No 1593 0.28 0.89
16 Yes 2093 ⇒ No 1735 0.30 0.83
20 Yes No 2252 ⇒ No 1834 0.32 0.81
23 Yes 3476 (61%) ⇒ No 2795 0.49 0.80
24 Yes No No 1772 ⇒ No 1411 0.24 0.80
27 Yes No 2544 ⇒ No 2002 0.35 0.79

Rule Disc. Assig. Mail Instances ⇒ Content Instances Support Confidence
28 Yes No 2252 ⇒ No 1772 0.31 0.79
31 Yes No No 1834 ⇒ No 1411 0.24 0.77
41 Yes 3476 (61%) ⇒ No 2544 0.44 0.73
44 0 Yes 2765 ⇒ No 2002 0.35 0.72

Rule Cont. Assig. Mail Instances ⇒ Disc. Instances Support Confidence
33 No No No 1843 ⇒ Yes 1411 0.24 0.77
45 No No 2489 ⇒ Yes 1772 0.31 0.71 (43%)
Figure 3. More interesting rules obtained about the use of resources.
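The percentages quoted when discussing Figure 3 can be recomputed from the instance counts of rules R23, R41 and R45 and the 5666 sessions in the file. The sketch below does this arithmetic; the antecedent and consequent of each rule are read from the column headers of Figure 3 and from the accompanying discussion, and the results agree with the printed figures only to within rounding.

```python
TOTAL_SESSIONS = 5666  # size of the file mined with Apriori

# (antecedent instances, rule instances) as printed in Figure 3.
RULES = {
    "R23": (3476, 2795),  # discussion used => mail not used
    "R41": (3476, 2544),  # discussion used => content-pages not used
    "R45": (2489, 1772),  # no content-pages, no assignments => discussion used
}

def metrics(antecedent_count, rule_count, total=TOTAL_SESSIONS):
    """Return (antecedent share, support, confidence) for one rule."""
    return (
        antecedent_count / total,
        rule_count / total,
        rule_count / antecedent_count,
    )

for name, (antecedent_count, rule_count) in RULES.items():
    share, support, confidence = metrics(antecedent_count, rule_count)
    print(f"{name}: antecedent share={share:.3f} "
          f"support={support:.3f} confidence={confidence:.3f}")

# Complement of a confidence: forum sessions that DID use the other resource.
forum_with_content = 1 - metrics(*RULES["R41"])[2]
forum_with_mail = 1 - metrics(*RULES["R23"])[2]
print(f"forum sessions that also use content-pages: {forum_with_content:.1%}")
print(f"forum sessions that also use mail: {forum_with_mail:.1%}")
```

The two complements correspond to the "27%" and "20%" figures in the discussion: a rule of the form "forum used, so resource not used" with confidence c implies that a share 1 - c of forum sessions did use that resource.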
The proposed system will directly offer instructors the rules although they
will be able to modify this latter parameter and request a new rule set
again.

Table 2 and Figure 3 illustrate the interesting results obtained. Table 2
shows that students do not use all the resources in each session. What’s
more, the use of all the resources in a session is quite infrequent. Only
11.4% of sessions used 4 resources. The organizer is the most used resource
because it is the main page of the course, next the discussion, after that
others (announcement, calendar, urls, web-links, etc.), and finally
assignments and content-pages.

As can be observed in Figure 3, in 61% of the sessions, the use of the
forum is present and in only 27% of these sessions, students also access
content-pages (R41). In the case of the mail, this percentage is lower,
only 20% (R23). There are 43% of sessions in which students access neither
content-pages nor the assignment tool and of these, in 71% students access
the discussion tool (R45).

After this result, it is reasonable to want to know what the session
pattern is. The template for obtaining session profile uses the following
input variables: time spent in session (minutes), hits and time spent in
content-pages, hits and time spent in collaborative resources (mail,
discussion, chat) and in the rest of resources of the course. The criterion
used to build the clustering model was the same as the one utilised to
generate the student profile.

Observing Figure 4, we can discover that most of sessions are very short
(6 minutes) and generally focused on reading the forum (cluster 0).
Sessions in which students spend more than half an hour are hardly 14%
(cluster 2 and 3) and although the number of hits in the discussion tool is
the highest, most of the time is dedicated to content-pages and
assignments. Finally, cluster 1 gathers brief sessions which can be
considered as consulting visits.

Average Cluster 0 Cluster 1 Cluster 2 Cluster 3


SessionTime 14.0658 6.0482 27.0222 51.8116 66.7015
hit_mail 0.6873 0.6191 2.2519 0.9799 0.6741
hit_discussion 8.9338 7.1086 23.9259 17.4246 16.9726
hit_chat 0.0625 0.0021 2.3185 0.0402 0.0373
hit_contentpage 1.4481 0.6111 2.363 0.8769 11.5572
hit_assignments 1.1112 0.5813 3.3111 6.2739 1.4975
hit_weblinks 0.0672 0.0184 0.6222 0.0804 0.4428
hit_organizer 2.3489 1.5293 3.9259 3.1206 10.7015
hit_learningobjectives 0.1315 0.0955 0.7333 0.1834 0.301
hit_other 1.0856 0.7269 5.3333 3.4648 1.5249
time_mail 0.725 0.5591 4.1852 1.2965 0.9502
time_discussion 3.1068 1.9746 5.8667 9.0101 9.6592
time_chat 0.0018 0 0.0741 0 0
time_contentpage 4.9652 1.7227 5.8963 3.2613 44.5
time_assignments 2.9017 0.6796 5.3111 27.7412 3.6517
time_weblinks 0.0321 0.0116 0.6296 0.0226 0.0821
time_organizer 0.6511 0.2741 1.1037 0.9573 4.6318
time_learningobjectives 0.0178 0.0148 0.0444 0.0201 0.0423
time_other 1.201 0.501 2.5407 8.3367 1.9254
Instances: Instances: Instances: Instances:
83% (4731) 2% (135) 7% (398) 7% (402)

Figure 4. Session profile.
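Two of the readings of Figure 4 can be checked with simple arithmetic on the cluster sizes and the SessionTime centroids. The sketch below assumes that the Average column is the mean over all sessions, in which case it should equal the instance-weighted mean of the cluster centroids; the variable names are illustrative.

```python
# Cluster sizes and SessionTime centroids (minutes) copied from Figure 4.
CLUSTER_SIZES = {0: 4731, 1: 135, 2: 398, 3: 402}
SESSION_TIME = {0: 6.0482, 1: 27.0222, 2: 51.8116, 3: 66.7015}

total = sum(CLUSTER_SIZES.values())

# Share of sessions that falls into each cluster.
shares = {c: n / total for c, n in CLUSTER_SIZES.items()}

# Instance-weighted mean of the centroids; compare with the Average column.
weighted_mean = sum(
    CLUSTER_SIZES[c] * SESSION_TIME[c] for c in CLUSTER_SIZES
) / total

# Share of sessions in clusters whose centroid exceeds half an hour.
long_share = sum(shares[c] for c in CLUSTER_SIZES if SESSION_TIME[c] > 30)

print(f"sessions: {total}")
print(f"weighted mean session time: {weighted_mean:.4f} min")
print(f"share of sessions in long clusters: {long_share:.1%}")
```

The same weighted-mean check applies to any other row of the figure, since each centroid is the average of the sessions assigned to its cluster.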


These reports were shown to the instructor in charge of the course, and in
her opinion, they allow her to gain an insight into the characteristics of
her students with relation to the time spent and the use of resources
available in the course. Although it is true that the learning process can
be carried out without being connected, the interaction of the students
with the different resources contains information to improve it. This
allows instructors to validate or refute hypotheses used in the design of
the learning process. For example, knowing that there are few sessions in
which students access content-pages makes the instructor suppose that most
of the students do not study connected or, what would be worse, they do not
read the content-pages.

This data can alert instructors and they can detect, for example, a bad
design of content-pages. Likewise, knowing that the forum is visited in
practically each session and that it is the main resource used obliges
instructors to have knowledge and skills for the suitable utilization of
this learning tool.

On the other hand, the instructor suggested that we improve the
presentation of the results in order for them to be more understandable,
for example by adding an explanation similar to the one used in this work.

5. CONCLUSIONS
Adding intelligence to e-learning platforms means giving tools the ability
to understand and profit from data (experience). Consequently, in this
paper we present the proposal of a decision making system which helps
distance instructors to know who their students are, how they work, how
they use the virtual course, where they find the problems and so on, and in
this way, instructors can act as soon as they detect any difficulty, for
example, proposing new tasks, reorganizing the content-pages, adding new
information, opening discussions and so on.

Likewise we propose some questions that, in our opinion, are of interest to
teaching staff and show how the answers are very useful for improving the
learning and teaching process. These answers are obtained by means of data
mining techniques. Lastly we also suggest a modular architecture for its
implementation.

This work presents two main challenges: firstly, to determine the input
variables, the technique and the parameters with which to execute the
algorithms to answer the teachers’ questions appropriately; and secondly,
to define a graphical interface which allows instructors to interpret the
results easily.

Regarding the first challenge, this work defines three templates which must
be validated with data from other virtual courses in order to be considered
adequate; and with regard to the second, we are studying different research
works carried out in this field such as [9][13][31].

6. ACKNOWLEDGMENTS
The authors are deeply grateful to CEFONT, the department of the University
of Cantabria which is responsible for LCMS maintenance, for their help and
collaboration. Likewise, the authors gratefully acknowledge the valuable
suggestions of the anonymous reviewers.

This work has been partially financed by Spanish Ministry of Science and
Technology under project ‘TIN2007-67466-C02-02’ and ‘TIN2008 – 05924’.

7. REFERENCES
[1] Agrawal, R. and Srikant, R.: Fast Algorithms for Mining Association
    Rules in Large Databases. In: 20th International Conference on Very
    Large Data Bases, 478-499, 1994.
[2] Alcalá-Fdez, J., Sánchez, L., García, S., Jesus, M.J., Ventura, S.,
    Garrell, J.M., Otero, J., Romero, C., Bacardit, J., Rivas, V.M.,
    Fernández, J.C. and Herrera, F. KEEL: A Software Tool to Assess
    Evolutionary Algorithms to Data Mining Problems. Soft Computing 13:3
    (2009) 307-318, doi: 10.1007/s00500-008-0323-y
[3] Álvarez, E., Zorrilla, M. E. 2008. Orientaciones en el diseño y
    evaluación de un curso virtual para la enseñanza de aplicaciones
    informáticas. Revista Iberoamericana de Tecnologías del Aprendizaje
    (IEEE-RITA), 3(2), 61-70. http://webs.uvigo.es/cesei/RITA/200811/
[4] Brin, Sergey and Page, Lawrence (1999) Dynamic Data Mining: Exploring
    Large Rule Spaces by Sampling. Technical Report. Stanford InfoLab.
[5] Brown, A.R., Bradley D. 2005. Elements of Effective e-Learning Design.
    International Review of Research in Open and Distance Learning.
    EDUCASE publications.
[6] Conrad, D. L. 2002. Engagement, excitement, anxiety and fear: Learners’
    experiences of starting an online course. American Journal of Distance
    Education, 16(4), pp. 205–226.
[7] Dougherty, J., Kohavi, R., and Sahami, M. (1995). Supervised and
    unsupervised discretization of continuous features. In Int. Conf.
    Machine Learning, Tahoe City, CA, pp. 194–202.
[8] Douglas, I., 2008. Measuring Participation in Internet Supported
    Courses. International Conference on Computer Science and Software
    Engineering, 5, pp. 714-717.
[9] Durand, N., Cremilleux, B. and Suzuki, E. 2006. Visualizing
    transactional data with multiple clusterings for knowledge discovery.
    16th International Symposium on Methodologies for Intelligent Systems,
    Bari, Italy.
[10] Graf, S., Kinshuk and Liu, T. Identifying Learning Styles in Learning
    Management Systems by Using Indications from Students’ Behaviour. Proc.
    of the 8th IEEE International Conference on Advanced Learning
    Technologies. July, Santander, Spain. 2008.
[11] Han, J. Data mining: Concepts and Techniques. Morgan Kaufmann. 2006.
[12] Hung, J., and Zhang, K. 2008. Revealing Online Learning Behaviors and
    Activity Patterns and Making Predictions with Data Mining Techniques in
    Online Teaching. Journal of Online Learning and Teaching 8(4),
    pp. 426-436
[13] Kreuseler, M. and Schumann, H. 2002. A flexible approach for visual
    data mining. IEEE Transaction on Visualization and Computer Graphics,
    Vol. 8 (1), pp. 39-51
[14] Krichen, J. 2007. Investigating Learning Styles in the Online
    Educational Environment. Proceedings of the 8th ACM SIG-information
    Conference on Information Technology Education, 127-134, Destin,
    Florida, USA, 18-20 October 2007.
[15] Merceron, A., and Yacef, K. 2005. TADA-Ed for Educational Data Mining.
    Interactive Multi-media Electronic Journal of Computer-Enhanced
    Learning, Vol. 7, nº 1, May 2005.
[16] Merceron, A. and Yacef, K. 2008. Interestingness Measures for
    Association Rules in Educational Data. 1st International Conference on
    Educational Data Mining (EDM08). Montreal, Canada
[17] Mierswa, I., Wurst, M., Klinkenberg, R., Scholz, M., and Euler, T.:
    YALE: Rapid Prototyping for Complex Data Mining Tasks, in Proceedings
    of the 12th ACM SIGKDD International Conference on Knowledge Discovery
    and Data Mining (KDD-06), 2006.
[18] Millan, S., Zorrilla, M. E., Menasalvas, E. 2005. Intelligent
    e-learning platforms infrastructure. XXXI Latin American Informatics
    Conference (CLEI’2005). Cali, Colombia.
[19] Pendharkar, P. C. 2003. Managing Data Mining technologies in
    Organizations: Techniques and Applications. ISBN 1-59140-057-0.
[20] Perera, D., Kay, J., Koprinska, I., Yacef, K., and Zaïane, O. R. 2009.
    Clustering and Sequential Pattern Mining of Online Collaborative
    Learning Data. IEEE Trans. on Knowledge and Data Eng. 21, 6 (Jun.
    2009), 759-772. DOI= http://dx.doi.org/10.1109/TKDE.2008.138
[21] Piatetsky-Shapiro. 2009. Data Mining Tools Used Poll. KDNuggets.com
[22] Romero, C. and Ventura, S. Educational Data Mining: A Survey from 1995
    to 2005. Expert Systems with Applications, 33(1), 135-146, 2007.
[23] Romero, C., Ventura, S., Espejo, P.G., and Hervas, C. Data Mining
    Algorithms to Classify Students. International Conference on
    Educational Data Mining, Canada, 2008.
[24] SCORM 2004 3rd Edition. The Sharable Content Object Reference Model,
    ADL. 2009.
[25] Steen, H. L. 2008. Effective eLearning Design. Journal of Online
    Learning and Teaching, 4 (4).
    http://jolt.merlot.org/vol4no4/steen_1208.htm
[26] Silva, D.R., Vieira, M.T.P. (2002). Using Data Warehouse and Data
    Mining Resources for Ongoing Assessment of Distance Learning. In:
    Proceedings of IEEE Intl. Conf. on Advanced Learning Technologies
    (ICALT 2002)
[27] Talavera, L., and Gaudioso, E. 2004. Mining student data to
    characterize similar behaviour groups in unstructured collaboration
    spaces. In Workshop on artificial intelligence in CSCL. 16th European
    conference on artificial intelligence, pp. 17–23.
[28] Tsai, C., and Tsai, M. 2005. A dynamic Web service based data mining
    process system. The Fifth International Conference on Computer and
    Information Technology, pp. 1033–1039. 21-23 Sept. 2005
[29] Ueno, M., and Okamoto, T. 2007. Bayesian Agent in e-Learning.
    Proceedings of the Seventh IEEE International Conference on Advanced
    Learning Technologies (ICALT), pp. 282-284
[30] Witten, I. H., and Frank, E. 2005. Data Mining: Practical Machine
    Learning Tools and Techniques (Second Edition). Morgan Kaufmann.
    ISBN 0-12-088407-0
[31] Wong, P., Whitney, P. and Thomas, J. Visualizing association rules for
    text mining. In Proceedings of IEEE Information Visualization INFOVIS.
    IEEE Computer Society Press, 1999.
[32] Hu, X., Cercone, N. 2004. A data warehouse/online analytic processing
    framework for web usage mining and business intelligence reporting.
    International Journal of Intelligent Systems, Vol. 19 (7), pp. 585-606
[33] Yao, Y.Y., Zhao, Y. and Maguire, R.B. (2003). Explanation-oriented
    association mining using rough set theory. Proceedings of Rough Sets,
    Fuzzy Sets and Granular Computing, pp. 165-172.
[34] Zaïane, O. 2002. Building a recommender agent for e-learning systems.
    Computers in Education, 2002. Proceedings of the International
    Conference on Computers in Education, pp. 55–59.
[35] Zhao, Y., Chen, Y.H. and Yao, Y.Y., User-centered interactive data
    mining, in: Yao, Y.Y. and Shi, Z.Z. (Eds.) International Journal of
    Cognitive Informatics and Natural Intelligence (IJCiNi), 2(1), 58-72,
    IGI Global, 2008.
[36] Zorrilla, M., and Álvarez, E. 2008. MATEP: Monitoring and Analysis
    Tool for e-Learning Platforms. Proceedings of the 8th IEEE
    International Conference on Advanced Learning Technologies. Santander,
    Spain.
[37] Zorrilla, M. E. 2009. Data Warehouse Technology for E-Learning. In
    book Methods and Supporting Technologies for Data Analysis. Studies in
    Computational Intelligence 225, pp. 1–20. D. Zakrzewska et al. (Eds.)
    Springer-Verlag Berlin Heidelberg.
