A Case Assessment of Knowledge-Based Fit in Frame For Diagnosis of Human Eye Diseases

Article · April 2020

DOI: 10.31838/jcr.07.06.133

Journal of Critical Reviews
ISSN- 2394-5125 Vol 7, Issue 6, 2020
Review Article
A CASE ASSESSMENT OF KNOWLEDGE-BASED FIT IN FRAME FOR DIAGNOSIS OF HUMAN EYE DISEASES
Nilamadhab Mishra1, Gnanaprakasam Thangavel2*, Johny Melese Samuel3
1 School of Computer Science and Engineering, VIT Bhopal University, India
2* Dept. of CSE, Gayatri Vidya Parishad College of Engineering, India
3 IS research scholar, College of Computing, Debre Berhan University, Ethiopia
* Corresponding Author: gnanagvp@gvpce.ac.in
Received: 09.02.2020 Revised: 12.03.2020 Accepted: 22.04.2020
Abstract
The eye is the most important sensory organ for vision. Some eye diseases can lead to vision loss, so it is important to identify and treat eye disease as early as possible. Eye care professionals can help protect their patients from vision loss or blindness by recognizing common eye diseases and recommending an eye exam. With early detection, treatment, and appropriate follow-up care, vision loss and blindness from eye disease can be prevented or delayed. In this study, a rule-based knowledge-based system for eye disease identification and advising is proposed. The proposed system uses hidden knowledge extracted by the rule extraction algorithms of data mining. To identify the best prediction model for the diagnosis of eye disease, four experiments with four classification algorithms were performed. Finally, the researchers decided to use the rules of the J48 pruned classification algorithm to develop the knowledge base of the KBS because it exhibited the best performance, with a 98.5% evaluation result. In this work, the J48 pruned classifier is integrated with PROLOG by converting its rule representation into a PROLOG-understandable format. SWI-Prolog 7.6.4 was used to implement the prototype of the eye disease advising KBS, and Java NetBeans IDE 8.2 with JDK 1.8.0 was used to integrate the model.
Keywords: Eye Disease, Data Mining, Knowledge-Based System, Integrator, Prolog, NetBeans IDE
© 2019 by Advance Scientific Research. This is an open-access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)
DOI: http://dx.doi.org/10.31838/jcr.07.06.133
INTRODUCTION

Eye problems have been recognized worldwide as one of the major public health problems, particularly in developing countries where 90% of the blind live, and international actions to prevent avoidable blindness have been gaining momentum over the last decade. According to the World Health Organization (WHO), about 37 million people are blind and 124 million people have low vision worldwide [1-4]. A large proportion of low vision (91.2%) and blindness (87.4%) is due to avoidable (either preventable or treatable) causes. Females and rural residents carry a greater risk of eye problems. The burden of eye disease is believed to pose huge economic and social impacts on individuals, society, and the nation at large [5-7]. With a computer-based system (expert system), over-dependence on human experts can be minimized. A knowledge-based system (KBS) benefits the individual by providing a high-quality decision within a given time frame and by facilitating job security and personal development [8-10]. Also, artificial expertise (AE) has some features that make it more beneficial than human expertise: it is permanent, easy to transfer, easy to document, consistent, and affordable [11-12].

Classification is the process of classifying a data instance into one of several predefined categorical classes based on the training set containing known observations. A regression task begins with data instances in which the target values are known. The relationships between predictors and the target are summarized in a regression model that can be applied to different data instances in which the target values are unknown [13-15]. Classification is the derivation of a function or model which determines the class of an object based on its attributes. A set of objects is given as the training set in which every object is represented by a vector of attributes along with its class. A classification function or model is constructed by analysing the relationship between the attributes and the classes of the objects in the training set. Such a classification function or model can be used to classify future objects and develop a better understanding of the classes of the objects in the database [16-17]. As mentioned in [18-20] and [21-22], classification is also called supervised learning. It is called supervised learning because it works on labelled attributes in which there is a specially chosen attribute, and the aim is to use the data given to predict the values of that attribute for instances that have not yet been seen. The chosen attributes in classification are categorical, such as ‘high’, ‘low’, or ‘medium’ [23-26]. Classification is a two-step process [27-30] consisting of model construction and model usage. In the first step, a classifier is built describing a predetermined or labelled set of data classes or concepts. This is the learning step (or training phase), where a classification algorithm builds the classifier by analysing or learning from a training set made up of database instances and their associated class labels. This step is called model construction. Generally, classification is the process of constructing a model that defines data classes and is used to predict the class of objects whose class label is unknown. It finds the relationship between the predictor values and the target value. The model is based on the analysis of a set of training data. The historical data for classification is typically divided into two datasets: one for building the model and the other for testing the model. Thus, various classification approaches can be employed on medical data for obtaining specific information and for disease diagnosis. Decision tree, Bayes classifier, neural network, support vector machine, and rule-based learning are some of the classification data mining techniques.

The general objective of this study is to construct a knowledge-based system prototype that can update its knowledge base using the hidden knowledge extracted from an eye disease dataset by using data mining classification techniques. The following specific objectives help to achieve the general objective of the study: to understand methods, approaches, techniques, and tools from the literature, to apply different data mining (DM) algorithms and select suitable data mining classification algorithms for constructing predictive models, to
acquire knowledge from the predictive model for knowledge base construction, to build an integrator aimed at automatically building a knowledge base from the predictive model, to update the knowledge base based on the new knowledge obtained from data mining, and to evaluate the performance and user acceptance levels of the knowledge-based system [31-35]. The knowledge acquisition for the knowledge-based system is carried out automatically by employing data mining techniques rather than by interviewing experts. Data mining results are integrated into the knowledge base by using the integrator application. The integrator directly creates knowledge after mining rules from the dataset. It has a graphical user interface for selecting evaluated rules, which is developed by using the SWI-Prolog and NetBeans IDE programming tools. The advice given after detecting and identifying an eye disease is targeted mainly at beginner eye care professionals and primary health care workers. However, the prototype does not give detailed treatment advising services involving medications. Treating eye patients with medication requires selection of the appropriate medicine, dosage, duration, frequency of the medication, etc., which needs a detailed study of those medications. Due to the short time available for the research, the study does not address all the treatment planning.

Also, time and resource constraints make it impossible to cover the entire KBS of all diseases in this study. Hence, this study does not include the diagnosis of all diseases of the eye. The diseases covered are the following: Bacterial conjunctivitis, Cataract, Viral Conjunctivitis, Trachoma, Glaucoma, Chalazion, Corneal ulcer, Blepharitis, Nearsightedness (Myopia), and Farsightedness (Hyperopia). Thus, the proposed system covers the diagnosis of common diseases which can be identified by the domain experts with differential diagnosis of clinical patient complaints, eye condition, and symptoms. The Knowledge Discovery in Databases (KDD) model is followed for the data mining task. KDD is the process of extracting and refining useful knowledge from large databases [36-37]. KDD has been used by different researchers to discover knowledge from a large collection of records. It has seven steps: data cleaning, data integration, data selection, data transformation, data mining, pattern evaluation, and knowledge presentation [38-40]. The knowledge that the researcher acquired from data mining classification techniques is in the form of rules. Rules are constructed in an if-then format. These if-then rule statements are used to formulate the conditional statements that constitute the knowledge base. Rule-based representation is highly expressive, easy to interpret, and easy to generate. To mine the hidden knowledge from the pre-processed dataset and compare the performance of classifiers, the researchers used the WEKA 3.9.0 data mining tool. Also, to develop an application that maps the knowledge acquired from the data mining classifiers to the knowledge-based system, Java NetBeans IDE 8.2 with JDK 1.8.0 is employed. NetBeans offers easy and efficient project management, has the best support for the latest Java technologies, and can be installed on all operating systems supporting Java. To represent rules in the knowledge base and construct the rule-based eye diagnosis and advising knowledge-based system, PROLOG is used. PROLOG is used because the researcher is more familiar with it than with other AI programming languages used to develop a knowledge-based system. The SWI-Prolog editor is used to represent rules. A Prolog program consists of a set of facts accompanied by a set of conditions that the solution must satisfy; the computer can figure out for itself how to infer the solution from the given facts. This is called logic programming [41-42]. Prolog is based on formal logic and solves problems by applying techniques originally developed to prove theorems in logic. It is a versatile language. Also, it classifies new instances rapidly [43-45]. Hence, rules are used to represent knowledge for the knowledge-based system. The set of discovered rules has to be verified for accuracy, consistency (no redundant or contradictory rules), and usefulness (rules showing the decision-making process) for the knowledge base being developed [46-50]. The accuracy of the models developed using data mining techniques is evaluated based on classifier accuracy, Precision, Recall, F-measure, and True Positive rate. The researcher also evaluated the KBS using system performance testing with prepared test cases and a user acceptance testing questionnaire, which helps the researcher to establish whether the potential users would like to use the proposed system frequently and whether the proposed system meets user requirements. The main benefits of using a knowledge-based system are increased output and productivity, improved quality, reduced downtime, capture of scarce expertise, flexibility and reliability, integrated knowledge, educational benefits/ease of training, enhanced problem-solving capability, and knowledge documentation and ease of knowledge transfer [51-52]. With the proper utilization of knowledge, the KBS increases productivity and enhances problem-solving capabilities most flexibly. Such systems also document knowledge for future use and training. This leads to increased quality in the problem-solving process [53].

The rest of this paper is organized as follows. Section 2 discusses the integrated framework outline of a data mining model with the knowledge-based system. Our implementation and analysis are presented in section 3. Section 4 highlights the discussion of the classifier models. Finally, section 5 concludes this paper along with future work.

INTEGRATED FRAMEWORK OUTLINE
The purpose of this research is to integrate data mining results for the development of a knowledge-based system. The knowledge base is the core of any knowledge-based system. Knowledge acquisition for it is done using the J48 pruned rule extraction algorithm, which achieves the best performance on the eye disease dataset. The challenge here is how to integrate data mining with the knowledge-based system. The following sections discuss this issue. Figure 1(a)(b) shows the general system design and structure for integrating the hidden knowledge about eye diseases and their types, extracted by data mining from the eye disease dataset, with the knowledge-based system. The structure shows that the data mining tasks are used for generating knowledge from a collection of eye disease datasets. Then, following the validation of rules, the generated ruleset is encoded into the knowledge base.

Figure-1(a) (b): Implementation architecture
In the context of data pre-processing, raw data cannot be used directly for processing with machine learning algorithms. It first needs to be pre-processed into a machine-understandable format. Raw data can be stored in several formats, including text, Excel, or other database file types. Sometimes the raw data is not in any format. Having data already in a format understandable by algorithms can result in better time efficiency for the processing of the data [36]. Of the 1120 instances of the KDD dataset collected for this study, 53 are found to be redundant; they belong to the Bacterial conjunctivitis, Nearsightedness, Farsightedness, and Blepharitis disease types. Therefore, before the actual mining task is performed, these instances are removed at the data pre-processing stage. Rule extraction algorithms are used to build a predictive model that classifies instances into labeled classes. In this study, algorithms such as J48 pruned, J48 unpruned, PART, and JRip, which are capable of generating rules, are selected and employed for the mining task. Extraction algorithms produce knowledge in the form of rules from the dataset. These rules should be validated to make sure that they faithfully reflect the dataset. Attributes or combinations of attributes together form the rules. These rules should be evaluated to make sure that the attributes and their values represent or reveal the conclusion. For this study, several rules are generated by the algorithm to identify an instance of the KDD dataset as Bacterial conjunctivitis, Cataract, Viral Conjunctivitis, Trachoma, Glaucoma, Chalazion, Corneal ulcer…. Most rules used a combination of attributes and a small number of them used a single attribute with the respective values. Therefore, before using the generated rules as part of the knowledge base, the rules are evaluated in consultation with domain experts. The Knowledge Base is a container of rules about eye disease which are generated by the J48 pruned rule extraction algorithm and then converted by the integrator into PROLOG logical format. The User Interface is the interaction point between the user and the system. The user interface can be a graphical user interface (GUI). In the sequence of integrating data mining with a knowledge-based system, a graphical user interface is provided for the integrator of the knowledge-based system. In this study, an attempt is made to design an automatic integration of the result of the data mining model with the knowledge base. To achieve this, Java NetBeans IDE 8.2 with JDK 1.8.0 and PROLOG have been used. This is done by understanding the standard format followed by J48 pruned rule construction and the PROLOG formalism [37-40].

Once the mining is accomplished, the result is written to a text file. The rule Pre-processor element is responsible for removing some special characters, removing unwanted tokens, replacing some logical operators with other logical operators, and replacing comparison operators with other comparison operators. The replacement and removal of special characters, logical operators, and comparison operators are based on the tokenization process illustrated in Table 1. Hence, all the rules are pre-processed in the same fashion. Then the Fact and Rule Generator module continues its task of moving the right-hand side to the left-hand side of the tokenized rules.

After rules are cleaned by the rule Pre-processor, the next step is integrating rules and facts following the syntax of PROLOG to create the knowledge base needed for the reasoning process. This requires reversing the order of the rules from IF…THEN constructs to THEN…IF constructs for backward chaining. Hence this component exchanges the positions of the left-hand side and right-hand side of J48 pruned rules. Prolog rules have both a head and a body, but facts have only heads. Hence, the component first builds the heads of the rules in the format ‘predicate(conclusion) :-’; that is, after the head, ‘:-’ is concatenated, which means IF in PROLOG. To make it a complete rule, the body part (antecedents) must be concatenated with the head. The implementation phase of a KBS is the transformation of the acquired knowledge into a computer program. The diagnosis process is undertaken by interacting with the user through a series of questions to which the user responds “yes” or “no”; based on the answers, the system provides an eye disease and a recommendation to advise the user in decision making through the user interface. The medical knowledge acquired cannot immediately be interpreted by system development or programming tools. The discovered knowledge has to be transformed into a format that makes it easier for the implementation tools to act on and encode the knowledge. Knowledge representation is the means of describing the acquired knowledge, which enables encoding the knowledge obtained from discovered knowledge as well as human experts’ advice into the software tool that will be used to develop the proposed system. Thus, for this study, the acquired knowledge has been represented using the production rule method of knowledge representation. Therefore, the proposed diagnostic knowledge-based system for diseases of the eye has been designed as a rule-based system in which the experts’ knowledge is represented using production rules.
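The question-driven diagnosis described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three rules are taken from Table 1, the names `RULES` and `diagnose` are hypothetical, and the matching is a simple forward scan over the rules rather than the backward chaining that PROLOG performs in the actual prototype.

```python
# Illustrative rule base. The attribute codes and conclusions come from
# Table 1; the data structure and names are assumptions made for this sketch.
RULES = [
    ({"A2": "YES", "A14": "YES", "A20": "NO"}, "Bacterial conjunctivitis"),
    ({"A1": "YES", "A2": "YES", "A10": "YES", "A13": "YES"}, "Viral Conjunctivitis"),
    ({"A18": "YES"}, "Nearsightedness (Myopia)"),
]


def diagnose(answers, rules=RULES):
    """Return the conclusion of the first rule whose conditions all hold.

    `answers` maps attribute codes to the user's "YES"/"NO" replies.
    Returns None when no encoded rule matches the answers.
    """
    for conditions, disease in rules:
        if all(answers.get(attr) == value for attr, value in conditions.items()):
            return disease
    return None


if __name__ == "__main__":
    print(diagnose({"A2": "YES", "A14": "YES", "A20": "NO"}))
```

An unanswered attribute is treated as not matching, so a rule fires only when every one of its conditions has been explicitly confirmed.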

Table-1: Rules before and after tokenization
Before rule preprocessing                                                After rule preprocessing
IF A2=YES AND A14=YES AND A20=NO THEN Bacterial conjunctivitis           A2=YES, A14=YES, A20=NO :- bacterial conjunctivitis
IF A1=YES AND A2=YES AND A10=YES AND A13=YES THEN Viral Conjunctivitis   A1=YES, A2=YES, A10=YES, A13=YES :- Viral Conjunctivitis
IF A20=YES THEN Corneal ulcer                                            A20=YES :- Corneal ulcer
IF A1=YES AND A8=YES AND A15=YES AND A17=YES AND A20=YES THEN Corneal ulcer   A1=YES, A8=YES, A15=YES, A17=YES, A20=YES :- Corneal ulcer
IF A18=YES THEN Nearsightedness(Myopia)                                  A18=YES :- Nearsightedness(Myopia)
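The two integrator steps (tokenizing an IF…THEN rule, then reversing it into a head followed by ‘:-’ and the body) can be illustrated with a short Python sketch. The paper's integrator is written in Java; the function name `rule_to_prolog`, the `disease/1` head predicate, and the one-predicate-per-attribute body encoding are all assumptions made here for illustration, not the authors' actual output format.

```python
import re


def rule_to_prolog(rule):
    """Convert 'IF a=v AND b=w THEN class' into 'disease(class) :- a(v), b(w).'

    The predicate names are hypothetical; only the overall head ':-' body
    shape follows the description in the text.
    """
    match = re.fullmatch(r"\s*IF\s+(.+?)\s+THEN\s+(.+?)\s*", rule)
    if match is None:
        raise ValueError(f"not an IF...THEN rule: {rule!r}")
    antecedents, conclusion = match.groups()

    goals = []
    for condition in re.split(r"\s+AND\s+", antecedents):
        # 'A2 =YES' -> attribute 'a2', value 'yes' (lowercase Prolog atoms)
        attribute, value = (part.strip().lower() for part in condition.split("="))
        goals.append(f"{attribute}({value})")

    # Prolog atoms start with a lowercase letter and contain no spaces.
    atom = re.sub(r"[^a-z0-9]+", "_", conclusion.lower()).strip("_")
    return f"disease({atom}) :- {', '.join(goals)}."


if __name__ == "__main__":
    print(rule_to_prolog("IF A2 =YES AND A14=YES AND A20=NO THEN Bacterial conjunctivitis"))
```

Building the head first and appending the antecedents afterwards mirrors the THEN…IF reversal, so the conclusion becomes the goal that backward chaining tries to prove.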
IMPLEMENTATION AND ANALYSIS
A total of four experiments aimed at building predictive models are undertaken. In this study, 1055 instances are ready for experimentation after data pre-processing. The dataset contains 22 attributes and all of them are involved in all experiments. Also, after undertaking several experiments, the default parameter values are used for each classifier algorithm, since they achieve better accuracy than modified parameter values. The researcher also compared implementation tools, as shown in Table 2 below.
Table 2: Critical comparisons of implementation tools

Criteria                             SWI-Prolog   NetBeans   JESS   Eclipse
Open source and good for research    Yes          Yes        No     No
Windows compatibility                Yes          Yes        Yes    Yes
Support GUI                          Yes          Yes        Yes    Yes
Easy to use                          Yes          Yes        Yes    Yes
In the first experiment, the model is built using the J48 pruned classifier with its default parameter values, and 10-fold cross-validation is selected as the test option. The model developed using the J48 pruned classifier has a tree of size 39 with 20 leaves. The algorithm correctly classified 1039 instances and incorrectly classified only 16 instances, taking 0.05 seconds to build the model. An overview is given in Table 3.

In the second experiment, the model is built using the J48 unpruned decision tree algorithm. This experiment sets the unpruned parameter to “True” and uses the 10-fold cross-validation test mode. The algorithm registered a prediction accuracy of 98.2938%: J48 unpruned correctly classified 1037 instances out of 1055, which means it incorrectly classified 18 instances, taking 0.04 seconds to build the model.
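The 10-fold cross-validation test option used in these experiments can be pictured as follows. This is a minimal Python sketch, not WEKA's implementation: the function names are hypothetical, and the folds here are formed by a plain shuffle, whereas a tool such as WEKA may additionally stratify the folds by class.

```python
import random


def k_fold_indices(n_instances, k=10, seed=0):
    """Shuffle instance indices and deal them into k near-equal folds."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_test_splits(n_instances, k=10, seed=0):
    """Yield (train, test) index lists; each fold is the test set exactly once."""
    folds = k_fold_indices(n_instances, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    # 1055 is the number of instances left after pre-processing in this study.
    for train, test in train_test_splits(1055):
        print(len(train), len(test))
```

Every instance is used for testing exactly once and for training nine times, which is why the reported counts of correctly and incorrectly classified instances add up to the full 1055.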
Table 3: Confusion matrix for the J48 pruned classification algorithm
Rows are actual classes; columns are the classes assigned by the classifier.
  a    b    c    d    e    f    g    h    i    j   Actual class
138    0    0    0    0    0    0    0    0    0   a = Bacterial conjunctivitis
  0  133    0    0    0    0    0    0    0    0   b = Cataract
  0    0  144    0    0    0    0    0    0    0   c = Viral Conjunctivitis
  0    0    0  119    0    0    0    0    0    0   d = Trachoma
  0    0    0    0  106    0    0    0    7    0   e = Glaucoma
  0    0    0    0    0   76    0    0    0    0   f = Chalazion
  0    0    0    2    0    1   85    0    5    0   g = Corneal ulcer
  0    0    0    0    0    0    0   80    1    0   h = Blepharitis
  0    0    0    0    0    0    0    0   80    0   i = Nearsightedness (Myopia)
  0    0    0    0    0    0    0    0    0   78   j = Farsightedness (Hyperopia)
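Per-class figures can be read directly off a confusion matrix such as Table 3. The Python sketch below assumes the usual definitions (TP rate or recall is the diagonal entry divided by its row sum; precision is the diagonal entry divided by its column sum); the names `MATRIX` and `per_class_metrics` are illustrative only.

```python
# Confusion matrix from Table 3 (rows = actual class, columns = predicted).
MATRIX = [
    [138, 0, 0, 0, 0, 0, 0, 0, 0, 0],    # a = Bacterial conjunctivitis
    [0, 133, 0, 0, 0, 0, 0, 0, 0, 0],    # b = Cataract
    [0, 0, 144, 0, 0, 0, 0, 0, 0, 0],    # c = Viral Conjunctivitis
    [0, 0, 0, 119, 0, 0, 0, 0, 0, 0],    # d = Trachoma
    [0, 0, 0, 0, 106, 0, 0, 0, 7, 0],    # e = Glaucoma
    [0, 0, 0, 0, 0, 76, 0, 0, 0, 0],     # f = Chalazion
    [0, 0, 0, 2, 0, 1, 85, 0, 5, 0],     # g = Corneal ulcer
    [0, 0, 0, 0, 0, 0, 0, 80, 1, 0],     # h = Blepharitis
    [0, 0, 0, 0, 0, 0, 0, 0, 80, 0],     # i = Nearsightedness (Myopia)
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 78],     # j = Farsightedness (Hyperopia)
]


def per_class_metrics(matrix):
    """Return a list of (recall, precision) pairs, one per class."""
    n = len(matrix)
    metrics = []
    for c in range(n):
        true_positive = matrix[c][c]
        actual = sum(matrix[c])                           # row sum
        predicted = sum(matrix[r][c] for r in range(n))   # column sum
        recall = true_positive / actual if actual else 0.0
        precision = true_positive / predicted if predicted else 0.0
        metrics.append((recall, precision))
    return metrics


if __name__ == "__main__":
    for label, (recall, precision) in zip("abcdefghij", per_class_metrics(MATRIX)):
        print(f"{label}: recall={recall:.3f} precision={precision:.3f}")
```

The diagonal sums to 1039 and the off-diagonal entries to 16, which matches the counts reported for the J48 pruned experiment.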
An overview of the performance of the classifiers is given in Table 5. In this study, knowledge is generated from the sampled collection of the eye disease dataset, and the knowledge base is constructed automatically as rules and facts.

In the next experiment, the PART rule induction algorithm is employed. It generated 16 rules by involving all the attributes of the dataset with the 10-fold cross-validation test option. The algorithm registered a prediction accuracy of 98.2938%, with 1037 instances out of 1055 correctly classified. The algorithm incorrectly classified only 18 instances, taking 0.06 seconds to build the model (Table 4). The other rule induction algorithm selected for this study is JRip. To generate IF-THEN rules from the experimental eye disease dataset, the JRip algorithm is employed with its default parameter values and the 10-fold cross-validation test mode. JRip correctly classified 1036 instances out of 1055. The number of incorrectly classified instances is 19. The algorithm generated 19 rules and takes 0.08 seconds to build the model.

The use of a data mining approach in knowledge base development involves a set of techniques for searching through datasets, looking for hidden correlations and trends which are inaccessible using conventional data analysis techniques. The basic techniques for data mining include decision-tree induction, rule induction, instance-based learning, Bayesian learning, support vector machines, ensemble techniques, clustering, and association rules. This study is aimed at designing a prototype rule-based human eye disease diagnosis and advising knowledge-based system by using an automatically constructed knowledge base based on knowledge acquired from data mining models, and at providing advice for eye care professionals and primary health care.

KBS systems are programmed to imitate human problem-solving by referencing databases of knowledge on a particular subject. Expert systems are a type of KBS: a computer model of human expertise in a specific domain of work. They are capable of offering advice and decision support related to specific problem-solving in a well-defined knowledge domain. A KBS acts like an expert consultant, asking for information, applying this information to the rules it has learned, and drawing conclusions, as in Table 5.
Table 4: Model Performance analysis by class for the PART classification algorithm
TP Rate   Precision   Recall   F-measure   Class
1.00      1.00        1.00     1.000       Bacterial conjunctivitis
0.985     1.00        0.985    0.992       Cataract
1.00      0.986       1.00     0.993       Viral Conjunctivitis
1.00      0.983       1.00     0.992       Trachoma
0.938     1.00        0.938    0.968       Glaucoma
1.000     0.974       1.000    0.987       Chalazion
0.903     1.00        0.903    0.949       Corneal ulcer
0.988     1.00        0.988    0.994       Blepharitis
1.000     0.860       1.000    0.925       Nearsightedness (Myopia)
1.000     1.000       1.000    1.000       Farsightedness
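The F-measure column in Table 4 is consistent with the standard definition of the F-measure as the harmonic mean of precision and recall. The Python sketch below assumes that definition (the paper does not spell out the formula) and uses a hypothetical helper name; because the tabulated precision and recall are already rounded, recomputed values can differ from the table in the third decimal place for some rows.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # (precision, recall) pairs copied from three rows of Table 4.
    for name, p, r in [
        ("Glaucoma", 1.00, 0.938),
        ("Corneal ulcer", 1.00, 0.903),
        ("Nearsightedness (Myopia)", 0.860, 1.000),
    ]:
        print(f"{name}: {f_measure(p, r):.3f}")
```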
Table 5: Summarised Performance of Classifiers

Classifier      Correctly classified instances   Incorrectly classified instances   Time taken to build
                No.      Percentage              No.      Percentage                the model (seconds)
J48 pruned      1039     98.4834%                16       1.5166%                   0.05
J48 unpruned    1037     98.2938%                18       1.7062%                   0.01
JRip            1036     98.1991%                19       1.8009%                   0.08
PART            1037     98.2938%                18       1.7062%                   0.06
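The percentages in Table 5 follow from the instance counts by simple arithmetic: accuracy is the number of correctly classified instances divided by the 1055 instances in the dataset. The short Python sketch below reproduces that calculation; the names `CORRECT` and `accuracy_table` are illustrative.

```python
TOTAL_INSTANCES = 1055

# Correctly classified instance counts taken from Table 5.
CORRECT = {
    "J48 pruned": 1039,
    "J48 unpruned": 1037,
    "JRip": 1036,
    "PART": 1037,
}


def accuracy_table(correct, total):
    """Map each classifier to (accuracy in percent, number of errors)."""
    return {
        name: (round(100 * n / total, 4), total - n)
        for name, n in correct.items()
    }


if __name__ == "__main__":
    for name, (accuracy, errors) in accuracy_table(CORRECT, TOTAL_INSTANCES).items():
        print(f"{name}: {accuracy}% correct, {errors} errors")
```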

Figure 2: Accuracy of J48 pruned, J48 Unpruned, JRip, and PART Algorithms
As shown in Figure 2, there is only a minor difference among the classifiers in terms of classifying the dataset correctly. Despite the minor difference, J48 pruned registered the best prediction accuracy by classifying 1039 instances out of 1055 correctly. J48 unpruned and PART show an equal number of incorrectly classified instances. The highest number of incorrect classifications is registered by the JRip algorithm. The prediction accuracy shows the general classification accuracy of the algorithms. Apart from prediction accuracy, classifiers are also evaluated to measure how well they assigned each instance to its correct class or incorrectly assigned it to another class. As stated earlier, pruned and unpruned decision trees and two decision rule induction algorithms are used for the experiments.

All the selected algorithms allow generating rules from the dataset. The results of the algorithms are evaluated based on prediction accuracy in classifying the instances of the dataset into Bacterial conjunctivitis, Viral Conjunctivitis, Trachoma, Glaucoma, Corneal ulcer, Cataract, Blepharitis, Chalazion, Nearsightedness (Myopia), and Farsightedness (Hyperopia). Hence, to evaluate the performance of the classifiers employed in this study, True Positive rate, Precision, Recall, and F-measure are used, as shown in Figure 3.
Figure 3: Performance Comparison of the Classifier Models
J48 pruned registered the best result in terms of precision, recall, and F-measure values compared to the other classification algorithms. Moreover, the True Positive rate of the classifiers is also compared. From Table 3-7 we see that the highest TP Rate of 98.5% was scored by the J48 pruned model, followed by the J48 unpruned model and PART, which achieved 98.3%. The lowest TP Rate of 97.2% was scored by JRip.

For precision, the highest score of 98.7% was registered by J48 pruned, followed by the J48 unpruned model and PART, which achieved 98.5%. The lowest score of 98.4% was registered by JRip. For Recall and F-Measure, the J48 pruned model with all attributes achieved the highest scores of 98.5% and 98.5% respectively, whereas the lowest Recall of 98.2% and F-Measure of 98.2% were obtained by JRip.

The rules acquired from the classifier algorithms are used for constructing the knowledge base.

To develop an effective knowledge-based system, acquiring relevant rules is essential. Hence, from the four algorithms, the researcher selected the classifier which performed best on classifying the dataset. J48 pruned has the best performance among the four classifiers. Its prediction accuracy and TP rate for all types of diseases are above 98.48%, which is a strong performance in predicting, identifying, and diagnosing each disease correctly. The FP rate is small for most diagnosis classes. This shows that the model developed using J48 pruned is acceptable for constructing the rule base of the knowledge-based system.

Speed refers to the time it takes a classifier to be trained. To build the model, JRip, PART, J48 pruned, and J48 unpruned took the most to the least time, in that order.

The J48 pruned classifier generated 20 rules, of which 17 are selected for implementation because they correctly identify the disease, as confirmed by the domain expert. The rules involve 14 of the 22 features/attributes of the sample dataset. The algorithm generated one rule for each of Viral conjunctivitis, Nearsightedness (Myopia), Farsightedness (Hyperopia), and Chalazion, two rules for each of Bacterial conjunctivitis, Corneal ulcer, Glaucoma, Trachoma, and Blepharitis, and the remaining three rules for Cataract identification.

Based on the discovered classification rules, we proceed to design a working prototype that can be extended into a KBS product.

Figure 4: Prototype of an Integrated Knowledge Base System
This page consists of components such as combo boxes and command buttons. The command buttons of the GUI are used to fire a Prolog query. Combo boxes allow users to select alternatives given by the inference engine. The Diagnose command button links to other dialog boxes to display the detected diagnosis and the recommendation to users (Figure 4).
Figure 5: Diagnosis and recommendation of bacterial conjunctivitis after “Diagnose”
Diagnosis begins when the user selects the “Yes” or “No” option from a combo box for each patient complaint, symptom, and eye condition or case listed in the label. The user then clicks the “Diagnose” command button and, if the selections match an encoded rule, the system immediately gives the detected disease followed by its recommendation to assist the practitioner in taking action. Figure 5 shows the result for detected bacterial conjunctivitis and its advice or recommendation. To ensure that the eye disease diagnostic system (EDDS-KBS) meets the requirements it was developed for, it has to be tested. Test cases are one of the predominant mechanisms for evaluating the performance of the proposed system; they help the researcher compare and contrast the system's responses with the domain experts’ judgment. The system performance testing focuses on testing the behavior of the knowledge-based system to check that it is satisfactory in the eyes of the user. However accurate the system is in performance measures, and however complete the knowledge-based system is, it will run into difficulty if it does not meet user requirements. This testing does not take into consideration the internal mechanics of the system and tends to be subjective. Toward industrial applicability of the model, the eye disease diagnostic system (EDDS-KBS) has met the desired performance with the least errors.

DISCUSSION AND INFERENCE
As discussed in the evaluation section, the proposed system achieves favourable results, with a 90.1% system performance testing result and an 86% user acceptance testing result. The overall performance of the prototype system is 88.1%. This indicates that using integrated knowledge acquisition techniques is better than using manual knowledge acquisition techniques separately.

At the outset, this study set four research questions; we now discuss how they have been answered. The first research question was “Is it possible to use rules resulted from production rules in data mining to construct the rule-based knowledge-based system and provide advice for the user?” To answer this question, four experiments with the rule classification algorithms J48 pruned, J48 unpruned, PART, and JRip were conducted under the ten-fold cross-validation test mode, and the experiments showed that the J48 pruned classification algorithm gave the best rule classification results for developing the prediction
model that can predict the type of eye disease. It records the best performance, 98.4, and the researcher decided to use its results in developing the knowledge base of the KBS, and then collected the advice for each detected disease from the domain expert to encode in the prototype. The second question was “How it is possible to describe the knowledge-based system from knowledge extracted using data mining techniques?” To answer this question, first the eye disease dataset was collected from DBRH, and feature selection and pre-processing were undertaken. Secondly, rule classification algorithms were run to extract hidden knowledge. Finally, the best rule classification results were pre-processed and converted into a format that Prolog understands, to integrate them with the knowledge base and describe it. The third question was “How Data Mining results could be integrated with Knowledge-Based System for Diagnosis and advice of Human Eye disease?” To answer this question, a prototype knowledge-based system was developed using the knowledge acquired through knowledge discovery, which enabled us to integrate the WEKA result automatically with the knowledge base. Then, to call the knowledge base constructed with Prolog from Java, the researcher added a JPL file to the Java library.

The fourth question was “How to evaluate the performance and user acceptance of the prototype?” To answer this question, system performance testing was undertaken by preparing test cases, which help to compare the domain experts’ judgment with the proposed system’s responses. The test case included 21 unlabelled samples of eye disease instances, each with 22 attributes and their respective values. Finally, 19 of the 21 eye disease instances were correctly labelled, which shows a system performance of 90.1%.

CONCLUSIONS AND RECOMMENDATIONS
In this study, the possibility of integrating data mining models with the knowledge-based system is explored and demonstrated. The integration process begins by taking samples of the DBRH eye disease dataset from Debre Birhan Hospital in the Amhara region, Ethiopia, Africa. The dataset is pre-processed and made suitable for the mining steps. Then the researcher extracted knowledge in the form of rules using the WEKA data mining tool. Data mining has been shown to extract hidden knowledge from large collections of data. Hence, four experiments for the four classification algorithms J48 pruned, J48 unpruned, PART, and JRip were conducted under the ten-fold cross-validation test mode. Finally, the data mining classifier J48 pruned is employed for the knowledge acquisition step since it performed best among the selected classifiers, with an accuracy of 98.5%. The prototype system is implemented using the SWI-Prolog tool, which supports GUI integration for the user interface. The user interface is designed using the NetBeans IDE in Java and connected to the Prolog knowledge base via the Prolog inference engine.

The study is a promising basis for further research to fully implement the integration of data mining with knowledge-based systems in eye health care and disease diagnosis in the domain area. Based on the results of the study, the following recommendations are suggested for further investigation on the applicability of integrating data mining with a knowledge-based system in eye health care.

✓ The research was conducted on a selected sample of eye diseases that can be differentiated by common typical symptoms by applying differential diagnosis techniques. To fully implement the system, further study should incorporate all eye health problems, drawing on emerging computing technologies.
✓ This study considered only the DBRH dataset to integrate the induced rules with the knowledge-based system. Future studies might therefore need to discover knowledge and patterns from different sources.
✓ To enhance the performance of the prototype knowledge-based system, hybrid approaches that incorporate case-based reasoning should be investigated. The addition of case-based reasoning helps the system learn from documented experiences.
✓ In this work, the prototype system displays clinical signs only as text, which makes the conditions hard to understand for users who have little or no experience in diagnosis. This can be enhanced by using multimedia files such as pictures and videos, so that users can understand the problem easily by matching the condition with these files.
✓ The scope of the prototype is limited to identifying eye disease and recommending first-line treatments and medications. For chronic and acute eye disease, a detailed specification of medications is required. Therefore, further investigation should be done on the treatment planning of eye problems.

CONFLICT OF INTERESTS
The authors declare that there is no conflict of interest regarding the publication of this paper.

ACKNOWLEDGMENT
The authors would like to express thanks to the School of Computing Sciences and Engineering, VIT Bhopal University, India; Gayatri Vidya Parishad College of Engineering, India; and the College of Computing, Debre Berhan University, Ethiopia, Africa, for supporting this research.