
“Awareness is the greatest agent for change” – Eckhart Tolle

Chapter 1
Introduction
Education is the platform that enables individuals and society to contribute to the
development of the country. Each individual is equipped with the weapons of knowledge
and skills. Providing the right guidance and skills to the youth has a significant effect on
the overall progress and economic growth of the nation. In the last decade, with the increase
in the number of students enrolled, the amount of data related to students has also grown
tremendously in the education domain, making it difficult to analyze the academic
performance of the students. Therefore, it has become essential to develop powerful
means for analyzing the students' academic data for the extraction of useful knowledge.
This discovered knowledge will assist in improving the performance of students. The
academic achievement of students depends on various factors, such as demographic,
examination, and socio-economic parameters. In the education domain, various factors are involved
corresponding to different entities viz. student, faculty, infrastructure, and learning
environment that are multi-dimensional in nature. The student entity describes the basic,
personal, academic, and examination details about each student. Similarly, the faculty and
the infrastructure entities represent the teacher profile and facilities provided in a specific
course. There can be direct or indirect relationships among various attributes of students
that ought to be identified. Moreover, excessive growth in databases necessitates
developing new technologies that use information and knowledge intelligently. Hence,
there is a need for an automated decision support system that discovers the educational
patterns to analyze the performance of the students.

1.1 Data Mining

Data mining is the technique of extracting knowledge from heterogeneous data sources. It
is the technology that explores a large amount of data in search of consistent patterns and
systematic relationships between variables. It involves validating the discovered
patterns against new subsets of data once the underlying relations among different entities are
identified [1]. Data mining is an iterative process that starts with problem definition in
phase 1, followed by the collection of data through data sampling and data preparation via
transformation of data in phase 2. A model is then constructed and evaluated in
phase 3, and finally, the process ends with knowledge deployment, as shown in Figure
1.1.

[Flow diagram: Problem Definition → Data Assembly and Data Preparation (accessing, sampling,
and transformation of data) → Model Construction and Assessment (generate, test, assess, and
understand model) → Knowledge Deployment (model apply, custom reports, external applications)]

Figure 1.1: Process of Data Mining
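
As a rough illustration of the phases in Figure 1.1, the following Python sketch walks a toy
dataset through problem definition, data preparation, model construction and assessment, and
deployment. The attendance records, the 6/2 train-test split, and the midpoint-threshold model
are assumptions made only for this example; they are not taken from the cited sources.

```python
from statistics import mean
from typing import Callable, List, Tuple

Record = Tuple[float, bool]  # (attendance on a 0-1 scale, passed)

# Phase 1: problem definition (hypothetical): predict pass/fail from attendance.
RAW_DATA: List[Tuple[int, bool]] = [
    (92, True), (85, True), (78, True), (60, False),
    (55, False), (40, False), (88, True), (50, False),
]

def prepare(raw: List[Tuple[int, bool]], n_train: int) -> Tuple[List[Record], List[Record]]:
    """Phase 2: transform attendance to a 0-1 scale and split into train/test samples."""
    scaled = [(pct / 100.0, passed) for pct, passed in raw]
    return scaled[:n_train], scaled[n_train:]

def build_model(train: List[Record]) -> float:
    """Phase 3a: toy model, the midpoint between the mean attendance of the two classes."""
    passed = [a for a, ok in train if ok]
    failed = [a for a, ok in train if not ok]
    return (mean(passed) + mean(failed)) / 2.0

def assess(threshold: float, test: List[Record]) -> float:
    """Phase 3b: accuracy of the threshold rule on held-out records."""
    hits = sum((a >= threshold) == ok for a, ok in test)
    return hits / len(test)

def deploy(threshold: float) -> Callable[[float], bool]:
    """Phase 4: wrap the assessed model as a reusable predictor."""
    return lambda attendance: attendance >= threshold

train, test = prepare(RAW_DATA, n_train=6)
threshold = build_model(train)
accuracy = assess(threshold, test)
predict = deploy(threshold)
print(f"threshold={threshold:.3f}, accuracy={accuracy:.2f}")
```

In practice the process is iterative: a poor assessment in phase 3 would send the analyst back
to data preparation or even to the problem definition.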

The actual task of data mining is the semi-automatic or automatic analysis of huge
amounts of data to extract previously unidentified, useful patterns, clusters of data records,
uncommon records, and dependencies. To gather valuable facts and retrieve data more
quickly, numerous fields have adopted data mining techniques. A few
trends [2] in data mining that reflect this quest for knowledge, such as the construction of
integrated and interactive data mining environments, are:

• Domain-specific applications, which include biomedical (DNA), finance, retail, and
telecommunication data mining.
• Visual data mining, comprising visualization of data, visualization of the mining
result, visualization of the mining process, and interactive visual mining.
• Web mining is the form of data mining that is useful in discovering
patterns from the web.
• Educational data mining is the research area with a set of computational approaches
to explore educational datasets. Data mining techniques, when applied in the
education sector, result in Educational Data Mining.
• Distributed data mining is the approach that distributes the workload among several
sites to make the system scalable.
• Real-time data mining involves the intrusion detection system, a technology to
protect network systems that are used in real time.
• Multi-database mining is a distributed approach of data mining in which data is
distributed across multiple databases. These databases can be aggregated using
several techniques to create a single dataset that can be mined using standard
approaches.
• Intelligent query answering helps in semantic query optimization and querying
database knowledge. Data mining tools can be applied to database systems for
intelligent query answering.
• Privacy protection and information security can be achieved with the help of data
mining.
Apart from these, data mining techniques can also be applied to the areas of corporate
analysis, production control, risk management, science exploration, and customer
retention. Data mining that specializes in the educational domain is called
Educational Data Mining (EDM). It deals with the approaches, tools, and research designs
that are used to obtain information from educational records, viz. online logs and
examination results, and then evaluates this information to formulate decisions.

1.2 Educational Data Mining

EDM is the research area that analyzes the educational data to identify meaningful
information about different types of learners and their learning behavior, and the effect of
educational policies implemented in various learning environments. The process of EDM
takes raw data from various educational settings as input and transforms it into valuable
facts and information. This information assists educational policymakers, school
administrators, teachers, and students in making informed decisions about how to manage
and interact with educational resources. It enables data-driven decision-making for
improving current educational practices and learning materials. With the growing
use of technology in educational information systems, a large amount of student data
has been made available. This makes it important to use EDM for analyzing students’
learning behavior. EDM helps in the effective assessment of educational institutions for the
optimal usage of learning resources.

[Diagram with the labels: Learners; Use learning resources; Performance predictions;
Educational Information System; Collects & uses; Educational data (Learning objects, Learner
profiles); Knowledge modeling; Discovered Knowledge (Learning patterns); Recommendations;
Educators]

Figure 1.2: Educational Information Mining in a nutshell [3]

Figure 1.2 depicts the information mining from an educational information system that
collects and uses educational data, available in the form of learning objects, event logs,
and student profiles. From the educational data, EDM tasks are formulated in the form of
student profiling and knowledge modeling. This educational data helps students
use learning resources effectively and also helps educators use the
discovered knowledge, i.e. learning patterns, for performance prediction. EDM focuses on
various difficulties and challenges associated with diverse segments of the learning
phenomenon [4]. These challenges are as follows:

• Finding relevant course materials for students.
• Organizing and delivering course material to end-users.
• Identifying the factors that affect the performance of students.
• Getting feedback from end-users on the delivered course material.
• Analyzing the significance of the teaching aids provided, the delivered course material, and
the feedback received from end-users.

Applying various practices to different educational datasets reveals
several problems. So, it is essential to select the right problem formulation technique
corresponding to the research objectives to get the desired results.

1.3 Problem Formulations under EDM

The selection of the optimal technique is centered on the structure of the educational
dataset. However, the decision to apply these techniques is always based on research
objectives. So, the researcher has to select a specific technique keeping in mind the value-
set of available datasets. Table 1.1 describes the most prominent problem formulation
techniques related to educational data mining [3]. All these techniques are elaborated
corresponding to their specific objectives.

Table 1.1: Problem formulation techniques under EDM [3]

Techniques                  Purpose

Classification              • Detection of student behavior.
                            • Development of domain models.
                            • Discovering students' learning styles and preferences.
                            • Understanding the educational outcomes of the students.
Clustering                  • Grouping similar students based on learning behavior and
                              their performance.
Predictive Modeling         • Prediction of whether a student qualifies a course or not.
Relationship Mining         • Discovering the curriculum associations in a course.
                            • Finding bottlenecks in specific study programs.
Visual Analytics            • Analyzing educational processes or learning outcomes by
                              visualizing the model.
Discovery with Models       • Student characteristics or contextual variables.
                            • Determining the relations among different student behaviors.
Refinement of Data for      • Labeling the data that helps in the improvement of the
Individual Decision           prediction model.
                            • Identification of the students’ learning patterns.

All the techniques discussed in Table 1.1 are suitable for finding results corresponding to
the desired objectives of the researchers. However, prominent researchers in the field of
education propose that association rule mining, classification, regression, and clustering
are the most common and useful techniques in the domain of educational data
mining.

1.4 Academic Performance Modeling

Several studies [6, 7, 17, 18, 19, 20] have identified different factors that have a
direct or indirect impact on students' academic performance, although very few of
them have produced prediction models. In particular, an accurate prediction of
students’ academic performance at the higher secondary level is required for offering
a student the necessary support in the learning process. Academic performance modeling
can be categorized into four segments, namely ‘student modeling’, ‘decision support
system’, ‘adaptive system’, and ‘scientific evaluation’, that are centered on the target
audience [5], as labeled in Figure 1.3. The first two categories can be further classified
into subcategories that define the specifications of their specific
domain. There might not be any similarity between these subcategories, but there are
differences in their objectives. These differences need to be identified to
distinguish the subcategories clearly. Details of all these categories and their
corresponding subcategories are given below.

[Diagram: Target Audience at the center, surrounded by Student Modeling, Decision Support
System, Adaptive System, and Scientific Evaluation]

Figure 1.3: Segmentation of Academic Performance Model

1.4.1 Student Modeling (SM): It is used to estimate the value of the attributes of
students for analyzing their performance. This segment accounts for the cognitive aspects
of student activities, such as analyzing the student’s performance or behavior,
identifying the student’s goals and plans, discerning his/her previously acquired
knowledge, maintaining an episodic memory, and describing characteristics of the
learners. Student modeling is further sub-classified
as ‘Students Performance and Characteristics’, ‘Detecting Students Learning Behavior’,
‘Students Profiling and Grouping’, and ‘Social Network Analysis’ to highlight the
similarities and differences between them. The details of each
sub-segment are given below:

• Students Performance and Characteristics: This segment focuses on the
assessment of variables that describe the behavioral characteristics of students.
These values indicate students’ performance and achievement of their learning
outcomes. Most research work is based on the prediction of students' academic
performance, but some studies also look into the behavioral characteristics of
students, such as their collaboration with other students [5]. Classification and
regression techniques can be used for the prediction of student performance and
characteristics. Zimmermann et al. [6] introduced a model-based approach to
predict graduate-level performance using indicators of undergraduate-level
performance. This model can also be combined with decision trees to estimate the
motivation level of students.

• Detecting Students Learning Behavior: This segment focuses on the detection of
students’ learning behavior, such as dropout, retention, academic failure,
cheating cases, etc. Classification and clustering methods are used for the
detection of behavioral characteristics of students, but feature selection and outlier
detection can also be applied for the same purpose. For example, Dekker et al. [7]
made use of a decision tree classifier to predict student dropout in electrical
engineering courses. This segment can also be used for the production of rules for
the detection of potential symptoms of low performance in different subjects.
• Students Profiling and Grouping: The objective of this segment is to profile
students on the basis of their various characteristics and knowledge expertise.
Kinnebrew et al. [8] used sequence-mining techniques to identify learning behavior
that differentiates distinct groups of students. This task is different from clustering
because clustering seeks the greatest dissimilarity between clusters, which
is not the case in grouping tasks. When using a grouping task to form teams
in a specific course, one prefers groups that are similar to one another and that comprise
dissimilar students who can complement each other. As in other applications of
EDM, different data mining methods, such as feature selection and clustering
techniques, can be used to profile and group the students.
• Social Network Analysis: The objective of this segment is to recognize the
relations between individuals and illustrate them using a graph. The other segments of
student performance modeling focus on the individuals, but social network
analysis focuses only on the properties of the relationships
among individuals. Reyes and Tchounikine [9] applied social network analysis
techniques to discover structural characteristics of learning groups. Consequently,
social network analysis can be used to structure educational communities, to
model their work, and to measure cohesion in the learning environment.
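
The contrast drawn above between clustering and grouping can be made concrete with a small
sketch. The snake-draft assignment below is only one assumed way of forming teams that are
similar to one another while mixing dissimilar students; the student names and scores are
hypothetical, and the method is not the one used in the cited studies.

```python
from typing import Dict, List

def form_balanced_teams(scores: Dict[str, float], n_teams: int) -> List[List[str]]:
    """Deal ranked students to teams in snake order so each team mixes strong and weak students."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    teams: List[List[str]] = [[] for _ in range(n_teams)]
    for i, student in enumerate(ranked):
        round_no, pos = divmod(i, n_teams)
        # Reverse the dealing direction on odd rounds to even out team averages.
        team_idx = pos if round_no % 2 == 0 else n_teams - 1 - pos
        teams[team_idx].append(student)
    return teams

# Hypothetical examination scores.
scores = {"A": 95, "B": 88, "C": 76, "D": 70, "E": 64, "F": 51}
teams = form_balanced_teams(scores, n_teams=3)
team_means = [sum(scores[s] for s in team) / len(team) for team in teams]
print(teams, team_means)
```

A clustering algorithm applied to the same scores would instead place the strongest students
together and the weakest together, which maximizes the difference between groups.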

1.4.2 Decision Support System: This segment of academic performance modeling
aims to enhance the performance of students by assisting the administration in
decision-making activities. Decision support systems help instructors get better
insight into students’ performance. Based on their specialization and target user,
decision support systems (DSS) have been categorized into five sub-segments named ‘report
generation’, ‘planning and scheduling’, ‘alerts for the administration’, ‘concept maps’,
and ‘recommendation system’. The details of each sub-segment of DSS are given
below:

• Report Generation: Visualization is used in educational environments to provide
useful information to educators and administrators and help them with decision
making. The core purpose of this segment is to find and highlight the information
related to course activities and provide feedback to educators and administrators. For
example, Romero et al. [10] used the association rule mining technique to provide
instructors with feedback derived from multiple-choice quiz data.
• Planning and Scheduling: The objective of this segment is to help the
administration in planning and scheduling students' academic courses. Various
methods, such as discovery with models, cluster analysis, and classification, can be
used for planning and scheduling on a specific dataset. Tai-chang et al. [11] enhanced
course planning by establishing the probability of enrollees completing their
courses based on the students’ preferences. This segment can also help educators
in the allocation of resources, counseling processes, or other tasks involved in
planning and scheduling.
• Alerts for Administration: Alerts for administration serve as an online tool for
updating the administration in real time. This segment also works on the detection of
unwanted behavior of students. Knowles [12] introduced a dropout early warning
system using statistical models and regression techniques. However, in the case
of an offline learning environment, this segment can act as an early performance
prediction warning system for the administration.
• Concept Maps: This segment is a domain model for educators, which acts as a
graphical tool for organizing and representing knowledge. Chun-hsiung et al. [13]
used the Apriori algorithm to automatically construct the concept map of learners.
Such concept maps assist the educator in enhancing the academic performance of the
student. Examples of concept maps are the hierarchy of topics in the
course material, relationships of skills and test items, correlation of test items, and
knowledge components.
• Generating Recommendation System: This segment focuses on performance
enhancement in the student learning management system. The recommendation
system typically provides recommendations to students, but it can be targeted at any
stakeholder. Examples of this category are course recommendations to
students and test item recommendations to educators. Collaborative filtering and
association rule-based techniques can prove useful in this system. The
discovery-with-models method can also be used in the development of a
recommendation system. Vialardi et al. [14] used a performance predictor model
for generating recommendations. This predictor model predicts the success of each
student in each course and thereby recommends courses in which the student is
most likely to be successful.

1.4.3 Adaptive System: This segment of application is related to the use of intelligent
systems in the online learning environment. It assists educators in adapting
to the learning characteristics of students. Online learning systems involve numerous
learners with different needs, and as the number of
learners grows, it becomes difficult to meet the specific needs of all of them. Adaptive
systems can help educators in meeting the needs of every individual learner. This
adaptation can take the form of adapting the course material or instruction pace,
providing hints, ordering and generating tests, etc. For example, Alaofi et al. [15]
explored the personalization of a digital library using students' profile information to
improve search results about learning content.

1.4.4 Scientific Evaluation: Evaluation is one of the aspects of the education
environment that might not always be intuitive. Therefore, an evaluator has to be
provided to help educators in exploratory learning environments. As in other fields of
study, theories and hypotheses about the process of learning and possible improvement
methods have been used in education. This category of application mostly targets
researchers as the end-users, but any of the developed or tested theories can later be used in
other applications targeting other stakeholders. Jiangang et al. [16] proposed a new
method for scoring in a game or a scenario-based task using a distance function.

1.4.5 Target Audience: One important goal of EDM is improving the quality of learning,
and in this process of learning only two groups of users come to mind, i.e. learners and
educators. We also look into the target users corresponding to each formulated segment.
The end-users in educational environments are students, educators, administrators, and
researchers, all of whom correspond to the modeling segments of academic performance
[5]. Any research in EDM may address one or more of the segments at the same
time. Table 1.2 maps the possible relationships between the target users and
the application domains of academic performance modeling.

Table 1.2: Target users of various segments of academic performance modeling [5].

Academic Performance Model Students Educators Administrators Researchers

Learner performance & characteristics Y Y Y
Detecting students learning behavior Y Y
Student profiling and grouping Y Y Y
Social network analysis Y Y Y
Reports generation Y Y Y
Planning and scheduling Y Y Y
Alerts for administration Y Y
Concept maps Y Y
Generating a recommendation system Y Y Y Y
Adaptive systems Y Y Y Y
Scientific evaluation Y Y Y

Table 1.2 describes the possible target user corresponding to each application under
EDM. It also reveals that many applications can target more than one user. Learners have
been the target of EDM in various applications, such as grouping students, generating
recommendations, and adaptive systems. Most of the applications in categories of student
modeling and decision support systems target educators as their end-users. Student
modeling provides a better understanding of the students’ state of learning, and decision
support systems can directly help educators in making improvements in the learning
process. Student modeling also assists the administrators of educational
institutions in making higher-level decisions. Researchers also represent a group of end-
users, as the objective of the research is to understand the learning process, develop
theories and test them. For example, researchers can use social network analysis (SNA) to
pinpoint the properties that are valuable in the prediction of the performance of the
student. Therefore, more than one application can collaborate to serve multiple users.

1.5 Motivation

In the last decade, the school education system in India has increased in size, but the
quality of the education system in terms of student success rate and retention rate is still
unsatisfactory. Several facts about the student dropout rate come to light
when referring to the MHRD education census from 2011 to 2014 [21]. Figure 1.4
presents statistics about students’ dropout rates from 2011 to 2014. The education survey
of 2011 reports that the dropout rate for classes I to V was 28.9%; for I to VIII, it was
42.4%; and for I to X, it was 52.8% [Appendix-A]. In 2012, the dropout rate for classes I
to V was 27.0%; for I to VIII, it was 40.6%; and for I to X, it was 49.3%
[Appendix-B]. Similarly, in 2013, the dropout rate for classes I to V was 27.0%; for I to
VIII, it was 40.6%; and for I to X, it was 49.3% [Appendix-C]. Moreover, in 2014, the
situation remained largely unchanged: the dropout rate up to class V was 19.8%; up to class
VIII, it was 36.3%; and up to class X, it was 47.4% [Appendix-D].

Figure 1.4: Students’ dropout rate from 2011 to 2014, a census by MHRD
in ‘Education Statistics at a Glance.’
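
The figures quoted above can be tabulated to make the trend easier to read. The short Python
sketch below only restates the percentages given in the text and computes the change, in
percentage points, between the first and last survey years; the dictionary layout and the
function name `change_in_points` are choices made for this illustration.

```python
from typing import Dict

# Cumulative dropout rates (%) exactly as quoted in the text, keyed by survey year.
DROPOUT: Dict[int, Dict[str, float]] = {
    2011: {"I-V": 28.9, "I-VIII": 42.4, "I-X": 52.8},
    2012: {"I-V": 27.0, "I-VIII": 40.6, "I-X": 49.3},
    2013: {"I-V": 27.0, "I-VIII": 40.6, "I-X": 49.3},
    2014: {"I-V": 19.8, "I-VIII": 36.3, "I-X": 47.4},
}

def change_in_points(first: int, last: int) -> Dict[str, float]:
    """Difference (last minus first) in percentage points for each class range."""
    return {
        stage: round(DROPOUT[last][stage] - DROPOUT[first][stage], 1)
        for stage in DROPOUT[first]
    }

change = change_in_points(2011, 2014)
for stage, delta in change.items():
    print(f"{stage}: {delta:+.1f} percentage points")
```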

The size of the dataset plays a crucial role in discovering valuable facts and
information. However, in the national context, bigger datasets have rarely been used for the
analytical process [22]. For better understanding and implementation of the educational
system at the national level, there is a need for large standardized databases. Beyond
dataset size, the versatility of the available dataset is another challenge. A
dataset that covers only one academic session may fail to reflect the versatility of
the educational system. Moreover, the data acquisition process also needs to be
streamlined so that it reflects the correct picture of the learning phenomenon. For
an effective analytical process, the dataset must be in the requisite format and must be
free from anomalies. Thus, working with a bigger, versatile dataset poses a
challenge of data transformation and integration in the data warehouse, as it may limit the
analytical mining techniques of a generic system, and the results may not be up to
the mark [173].

So, there is a need to explore the possibility of a multidimensional dataset corresponding
to the educational domain that would specifically work with a subject-oriented educational
dataset. Along with streamlined data collection, a larger dataset requires transformation
and preprocessing, as data consistency always plays a crucial role in an
effective analytical process. With the availability of larger attribute sets, generic
mining techniques do not contribute as much as collaborative mining techniques.
Similarly, academic performance also depends on various factors like demographic,
examination, and socio-economic parameters of students. These factors necessitate the
development of an explicit tool for the analysis of educational data. Such a tool will also be
useful for framing new policies for better utilization of current resources, and it
helps in getting better insight into the causes of unexpectedly high failure rates.
Therefore, considering the social as well as research issues, we need explicit research to
analyze the educational patterns that assist in the improvement of the learning
phenomena.

1.6 Objectives of the Study

The core objective of this research work is to propose an approach in the educational area
to analyze the academic performance of students. In line with the above-mentioned
proposition, we have studied the problems in detail to meet the following
objectives:

• To acquire a subject-oriented multidimensional dataset from different data sources,
with the main focus on data banks of authorized departments of school education.
• To integrate and preprocess the dataset to remove anomalies and later transform the
dataset to a consistent format for effective feature extraction and analysis.
• To propose and implement hybrid techniques/algorithms that better exploit
the features of different approaches to perform constraint-based multidimensional
frequent pattern mining for an association rule-based classification system on the
preprocessed dataset.
• To identify the interesting patterns that emerge from the mined data and to perform
a subject-oriented multidimensional analysis of these entities.
• To build confidence in EDM based on the validation of results and efficiency of the
proposed techniques using objective and empirical analysis.
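
To give a feel for the preprocessing and multidimensional objectives listed above, the sketch
below cleans a few records and counts them along several dimensions at once. The attribute
names, the sample records, and the choice to drop records with missing values are assumptions
made for illustration only; they are not the procedure developed in this thesis.

```python
from collections import Counter
from typing import Dict, List, Optional, Sequence, Tuple

RawRecord = Dict[str, Optional[str]]

# Hypothetical student records with inconsistent formatting and one missing value.
RAW: List[RawRecord] = [
    {"gender": " F ", "area": "Rural", "result": "PASS"},
    {"gender": "m", "area": "urban", "result": "fail"},
    {"gender": "M", "area": None, "result": "pass"},
    {"gender": "f", "area": "rural", "result": "pass"},
]

def clean(records: List[RawRecord]) -> List[Dict[str, str]]:
    """Drop records with missing values and normalize the rest to trimmed lowercase."""
    cleaned = []
    for rec in records:
        if any(value is None or not value.strip() for value in rec.values()):
            continue  # assumed policy: discard incomplete records
        cleaned.append({key: value.strip().lower() for key, value in rec.items()})
    return cleaned

def build_cube(records: List[Dict[str, str]], dims: Sequence[str]) -> Counter:
    """Count records for every combination of values along the chosen dimensions."""
    return Counter(tuple(rec[d] for d in dims) for rec in records)

records = clean(RAW)
cube = build_cube(records, dims=("gender", "area", "result"))
for cell, count in sorted(cube.items()):
    print(cell, count)
```

Each key of `cube` is one cell of a three-dimensional view of the data, which is the kind of
structure that multidimensional frequent pattern mining operates on.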

This research is an attempt to apply the data mining technique in the education area for
the analysis and evaluation of the students’ academic data so as to enhance the quality of
the school education system. Appropriate methodologies and datasets have to be used to
meet the objectives formulated under this study.

1.7 Research Contributions

The core objective of this proposed research is to discover the learning patterns of students
for the betterment of the educational scenario. This study puts forward the need for an
automated analytical tool that would discover the similarities among students' attributes.
The development of an automated association rule-based classification system helps in
attaining the formulated research objectives. Therefore, the proposed study makes
the following key contributions to discovering frequent patterns from the
multidimensional educational dataset.

• Analysis of EDM Systems: Numerous studies were carried out on EDM systems
involved in the discovery of educational patterns. Most of these systems target
web-based learning platforms. But in the national context, there is only an offline
dataset, which is also non-volatile in nature. The size and
versatility of the datasets used were also not enough to describe the complete
result of the target source. Furthermore, the metrics involved in existing EDM
systems are not enough to discover the multi-hidden dependencies among learners’
attributes. All these factors point to the domain of multidimensional
frequent pattern discovery in educational datasets. Therefore, we chose it as the
research objective to fulfill the requirement of the education department for
exploring the huge amount of students’ data and answering educational queries.
• Structural Design: The architecture was designed specifically to handle the huge
amount of educational data and analyze the learning behavior of students. This
architecture comprises various sub-sections, namely ‘multidimensional dataset
extraction’, ‘association rule classification system’, ‘data visualization’, and
‘objective & subjective validation’. We have adopted this architecture as our guiding
model to design an EDM-based analytical tool to analyze the educational patterns of
students.
• Multidimensional Dataset Construction: The initial step of the proposed
methodology is the production of the required dataset. This process starts with the
acquisition of a dataset from the aforesaid source. After the required dataset is obtained,
data cleaning is performed to remove all the anomalies in the collected dataset.
Working attributes are then set up to choose suitable
attributes for the analytical purpose. The filtered dataset is then
transformed into the required format. The array-based
multidimensional feature is used to handle the multidimensional kind of dataset.
Various arrays were derived to store the dataset corresponding to each attribute of
students. Lastly, the prepared dataset was loaded into the warehouse, which was
developed to store the multidimensional kind of dataset. This warehouse is connected
with the proposed analytical tool to make it available as an internal warehouse.
• Development of an Automated Association Rule Classification System: The main
phase is the development of an automated association rule classification system,
which is based on the hybridization of frequent pattern mining techniques. The
collaboration of associative techniques is devised to perform constraint-based
frequent pattern mining on a multidimensional dataset. The results of collaborative
associative pattern mining serve as input for the classification of association rules
to discover the multi-hidden dependencies among students' attributes.
• Visualization of Discovered Pattern Set: Data visualization is a prominent way to
represent a massive amount of information graphically. This
communication is achieved through a systematic mapping between graphic charts and
data values in the creation of the visualization. Therefore, a visual analytics feature
has been embedded within the proposed analytical tool to graphically demonstrate
the result set. Various kinds of charts have also been made available to map selected
attributes corresponding to the data, because suitable mapping is necessary to
illustrate the correct picture of the discovered pattern set.
• Objective Validation of the Proposed Methodology: The predefined constraints and
thresholds are the benchmarks to validate the result set objectively. Therefore,
various parameters have to be set up to validate the result set in an objective manner.
Some thresholds are input by the analyst to perform the mining on the input
dataset. Similarly, some benchmarks are also embedded within the analytical
tool to validate the proposed methodology against the input dataset. Therefore, the
analyst will get only those pattern sets that satisfy the desired benchmarks. In
such a case, the quality of the result set should not be in question.
 Empirical Analysis: An empirical analysis also needs to be done to validate the
proposed methodology in a subjective manner. A questionnaire is formulated to
qualitatively validate the developed techniques against the derived result sets. We
have targeted domain experts to give their responses to the questionnaire. The
responses are then recorded to statistically validate the proposed methodology so as
to derive the optimal hypothesis for the betterment of educational phenomena.
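
The mining and objective-validation objectives above can be illustrated with a small
sketch. It is not the hybrid technique developed in this thesis; it is a brute-force
enumeration, assuming that the analyst-supplied thresholds are a minimum support and
a minimum confidence and that the constraint restricts the rule consequent to a
target attribute. The records, the attribute names, and the function names
`support` and `mine_rules` are hypothetical.

```python
from itertools import combinations

# Hypothetical student records: each record is a set of "attribute=value" items.
records = [
    {"attendance=high", "parent_edu=graduate", "result=pass"},
    {"attendance=high", "parent_edu=school", "result=pass"},
    {"attendance=low", "parent_edu=school", "result=fail"},
    {"attendance=high", "parent_edu=graduate", "result=pass"},
    {"attendance=low", "parent_edu=graduate", "result=fail"},
]


def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    return sum(itemset <= r for r in records) / len(records)


def mine_rules(records, target, min_support, min_confidence, max_len=2):
    """Return rules (antecedent, outcome, support, confidence).

    Constraint (an assumption of this sketch): the consequent is a single
    value of the target attribute, and the antecedent has at most max_len items.
    """
    items = sorted(set().union(*records))
    predictors = [i for i in items if not i.startswith(target + "=")]
    outcomes = [i for i in items if i.startswith(target + "=")]
    rules = []
    for k in range(1, max_len + 1):
        for antecedent in combinations(predictors, k):
            a = frozenset(antecedent)
            sup_a = support(a, records)
            # Skip antecedents that never occur or are too rare.
            if sup_a == 0 or sup_a < min_support:
                continue
            for outcome in outcomes:
                sup_rule = support(a | {outcome}, records)
                if sup_rule < min_support:
                    continue
                confidence = sup_rule / sup_a
                # Objective validation: keep only rules meeting both thresholds.
                if confidence >= min_confidence:
                    rules.append((tuple(sorted(a)), outcome, sup_rule, confidence))
    return rules


rules = mine_rules(records, target="result", min_support=0.4, min_confidence=0.9)
for antecedent, outcome, sup, conf in rules:
    print(antecedent, "->", outcome, f"support={sup:.2f}", f"confidence={conf:.2f}")
```

Only rules that pass both thresholds reach the analyst, which mirrors the idea that
the returned pattern sets must satisfy the desired benchmarks. A practical system
would replace the exhaustive enumeration with a proper frequent pattern mining
algorithm.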

1.8 Thesis Outline

 Chapter 1: This chapter presents a brief introduction to data mining and
educational data mining. The motivation for the proposed research and academic
performance modeling are explained, along with the objectives of the research in
detail.

 Chapter 2: This chapter discusses the literature selected for this research work.
A timeline is fixed for reviewing the progress of research in the area of
educational data mining. The reviewed literature is categorized according to the
respective research work.

 Chapter 3: This chapter introduces the proposed research methodology and describes
the high-level design and detailed design of the system architecture. The high-level
design highlights the overall design of the proposed system, and the detailed design
explains the working of each sub-module of the proposed analytical tool.

 Chapter 4: This chapter describes the process of constructing the required dataset
about learners. The construction of the required dataset comprises various steps,
viz. acquisition of relevant data from educational sources, removal of anomalies,
handling of missing values, setting up of working attributes, and the transformation
of data into the requisite format. The transformed dataset has to be loaded into the
warehouse with all attribute sets of students, along with the metadata of the loaded
dataset.

 Chapter 5: This chapter describes the core phase of the proposed research, namely
the association rule-based classification system. Associative and classification
mining are combined to discover the multi-hidden dependencies among learners’
attributes. This frequent pattern discovery has been devised to discover educational
patterns from a dataset that is multidimensional in nature. This module also allows
domain experts to input their self-derived educational patterns that may not be
returned by the mining algorithms. The integrated pattern set is further classified
according to the target attribute to illustrate the decision tree of the association
rules. The discovered pattern set assists in the discovery of learning patterns
about students and in deriving an optimal solution for the betterment of the
educational scenario.

 Chapter 6: This chapter describes the data visualization module of the EDM
analytical tool. The visual analytics feature is embedded within this module to
present the discovered rule set graphically. This chapter also demonstrates the
visualization of the discovered rule set that was taken into consideration for the
experimental study.

 Chapter 7: This chapter describes the validation and analysis of the proposed
methodology. The validation is performed in both an objective and a subjective
manner. The objective validation brings out the correctness of the discovered result
set against the desired parameters. The subjective validation empirically validates
the proposed methodology based on the responses of domain experts.

 Chapter 8: The final chapter summarizes the established research and provides the
conclusion corresponding to the formulated objectives. It also lists the future
scope within which researchers can continue with this novel methodology to derive
more efficient results from the educational dataset.

1.9 Chapter Summary


This chapter brings to light the relevance of the education domain to society. It begins
with an explanation of the concept of data mining and its collaboration with other
disciplines, which reveals the quest for knowledge. This is followed by the inclination of
data mining towards the education domain to explain the concept of ‘educational data
mining’. The details of various problem formulation techniques under EDM are also
included in this chapter. In addition, various segments of academic performance
modeling corresponding to the target audience are explained to differentiate the current
state of the art from the requirement analysis. Subsequently, this chapter describes the
motivational study that was considered for formulating the research objectives. The
research contribution is also part of this chapter, which concludes with the outline of
the thesis.
