
DEFENCE UNIVERSITY, COLLEGE OF ENGINEERING

BISHOFTU

A CASE ASSESSMENT OF KNOWLEDGE-BASED FIT IN


FRAME FOR DIAGNOSIS OF
HUMAN EYE DISEASES
Group name ID NO
1. Kasech maze --------------------------------------------------------------------148/09

2. Ahmed Nur----------------------------------------------------------------------129/09

3. Tamrat wedbo-------------------------------------------------------------------160/09

4. Abrhalyi g.tasdik-----------------------------------------------------------------127/09

In Partial fulfillment for the award of the degree


Of
Bachelor of Technology
In
Computer Engineering
March 2020
Abstract
The eye is the most important sensory organ for vision. Some eye diseases can lead to vision loss, so it is important to identify and treat eye disease as early as possible. Eye care professionals can help protect their patients from vision loss or blindness by recognizing common eye diseases and recommending an eye exam. With early detection, treatment, and appropriate follow-up care, vision loss and blindness from eye disease can be prevented or delayed. In this study, a rule-based knowledge-based system for identifying eye diseases and giving advice is proposed. The proposed system uses hidden knowledge extracted by the rule extraction algorithms of data mining. To identify the best prediction model for the diagnosis of eye disease, four experiments with four classification algorithms were performed. Finally, the researchers decided to use the rules of the J48 pruned classification algorithm to develop the knowledge base of the KBS, because it exhibited the best performance with a 98% evaluation result.

INTRODUCTION
Eye problems have been recognized worldwide as one of the major public health problems, particularly in developing countries, where 90% of the blind live, and international action to prevent avoidable blindness has been gaining momentum over the last decade. According to the World Health Organization (WHO), about 37 million people are blind and 124 million people have low vision worldwide. A large proportion of low vision (91.2%) and blindness (87.4%) is due to avoidable (either preventable or treatable) causes. Females and rural residents carry a greater risk of eye problems. The burden of eye disease is believed to have huge economic and social impacts on individuals, society, and the nation at large. With a computer-based system (expert system), over-dependence on human experts can be minimized. A knowledge-based system (KBS) benefits the individual by providing a high-quality decision within a given time frame and by facilitating job security and personal development. Also, artificial expertise (AE) has features that make it more beneficial than human expertise: it is permanent, easy to transfer, easy to document, consistent, and affordable.

Classification is the process of assigning a data instance to one of several predefined categorical classes based on a training set containing known observations. A regression task likewise begins with data instances in which the target values are known. The relationships between predictors and the target are summarized in a regression model that can be applied to different data instances in which the target values are unknown. Classification derives a function or model that determines the class of an object based on its attributes. A set of objects is given as the training set, in which every object is represented by a vector of attributes along with its class. A classification function or model is constructed by analyzing the relationship between the attributes and the classes of the objects in the training set. Such a function or model can be used to classify future objects and to develop a better understanding of the classes of the objects in the database.
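
The idea of learning a classification function from attribute vectors can be illustrated with a very small rule learner. The sketch below uses OneR (a single-attribute rule learner), which is deliberately simpler than the algorithms used in this study; the attribute order (redness, discharge, blurred vision) and the training rows are invented for illustration only.

```python
from collections import Counter, defaultdict

def train_one_r(rows, labels):
    """Pick the single attribute whose value -> majority-class rule makes the fewest training errors."""
    default = Counter(labels).most_common(1)[0][0]  # fallback class for unseen values
    best = None
    for a in range(len(rows[0])):
        by_value = defaultdict(Counter)
        for row, label in zip(rows, labels):
            by_value[row[a]][label] += 1
        mapping = {value: counts.most_common(1)[0][0] for value, counts in by_value.items()}
        errors = sum(mapping[row[a]] != label for row, label in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, a, mapping)
    _, attribute, mapping = best
    return attribute, mapping, default

def classify(model, row):
    """Apply the learned one-attribute rule to a new attribute vector."""
    attribute, mapping, default = model
    return mapping.get(row[attribute], default)

# Hypothetical training set: (redness, discharge, blurred_vision) -> class
rows = [("yes", "yes", "no"), ("yes", "no", "no"), ("no", "no", "yes"),
        ("no", "no", "yes"), ("yes", "no", "yes")]
labels = ["conjunctivitis", "conjunctivitis", "cataract", "cataract", "cataract"]

model = train_one_r(rows, labels)
print(model[0], classify(model, ("yes", "yes", "yes")))
```

On this toy data the third attribute separates the two classes without error, so it is the one the learner keeps; a decision tree learner such as J48 extends the same idea by splitting on several attributes in turn.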
INTEGRATED FRAMEWORK OUTLINE
The purpose of this research is to integrate data mining results into the development of a knowledge-based system. The knowledge base is the core of any knowledge-based system. In this study, knowledge acquisition is done using the J48 pruned rule extraction algorithm, which achieves the best result on the eye disease dataset. The challenge is how to integrate data mining with a knowledge-based system, and the following sections discuss this integration. Figure 1(a)(b) shows the general system design and structure for integrating the hidden knowledge about eye diseases and their types, extracted by data mining from the eye disease dataset, with the knowledge-based system. The structure shows that data mining tasks are used to generate knowledge from a collection of eye disease datasets. Then, following validation of the rules, the generated rule set is encoded into the knowledge base.

During data pre-processing, raw data cannot be used directly by machine learning algorithms; it first needs to be converted into a machine-understandable format. Raw data can be stored in several formats, including text, Excel, or database files, and sometimes it has no defined format at all. Having data already in a format the algorithms understand makes processing more time-efficient. Of the 1120 instances of the KDD dataset collected for this study, 53 were found to be redundant, from the Bacterial conjunctivitis, Nearsightedness, Farsightedness, and Blepharitis disease types. These instances were therefore removed at the data pre-processing stage, before the actual mining task. Rule extraction algorithms are used to build a predictive model that classifies instances into labeled classes. In this study, J48 pruned, J48 unpruned, PART, and JRip, all of which are capable of generating rules, are selected and employed for the mining task. Extraction algorithms produce knowledge from the dataset in the form of rules. These rules should be validated to make sure that they faithfully reflect the dataset. Attributes or combinations of attributes together form the rules. The rules should also be evaluated to make sure that the attributes and their values support the conclusion.
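
The removal of redundant instances can be sketched as follows. This is a minimal illustration that assumes "redundant" means an exact repeat of an earlier instance; the rows and attribute values are hypothetical and are not taken from the study's dataset.

```python
def drop_redundant(instances):
    """Keep the first occurrence of each instance, preserving the original order."""
    seen = set()
    unique = []
    for instance in instances:
        key = tuple(instance)
        if key not in seen:
            seen.add(key)
            unique.append(instance)
    return unique

# Hypothetical rows: (symptom_a, symptom_b, disease_type)
raw = [
    ("yes", "no", "Blepharitis"),
    ("no", "yes", "Nearsightedness"),
    ("yes", "no", "Blepharitis"),      # exact repeat of the first row
    ("no", "no", "Farsightedness"),
]
cleaned = drop_redundant(raw)
print(len(raw) - len(cleaned), "redundant instance(s) removed")
```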

IMPLEMENTATION AND ANALYSIS


A total of four experiments aimed at building predictive models were undertaken. After data pre-processing, 1055 instances are ready for experimentation. The dataset contains 22 attributes, and all of them are used in all experiments. After several trial runs, the default parameter values were adopted for each classifier, since they achieved better accuracy than modified parameter values. The researchers' comparison of implementation tools is shown in Table 2.

In the first experiment, the model is built using the J48 pruned classifier with its default parameter values, and 10-fold cross-validation is selected as the test option. The resulting model has a tree of size 39 with 20 leaves. The algorithm correctly classified 1039 instances and misclassified only 16, taking 0.05 seconds to build the model. An overview is given in Table 3. The rule pre-processor element is responsible for removing special characters, removing unwanted tokens, and replacing certain logical and comparison operators with others. These replacements and removals are based on tokenization, so all rules are pre-processed in the same fashion. The fact and rule generator module then continues by moving the right-hand side of each tokenized rule to the left-hand side.
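
The pre-processing and reversal steps described above can be sketched as a small converter from a textual rule to a Prolog-style clause. The input rule format, the function name `rule_to_prolog`, and the `disease/1` and `symptom/2` predicates are assumptions made for this illustration; the text does not specify the exact formats the system uses.

```python
import re

def rule_to_prolog(rule_text):
    """Turn 'A = v AND B = w: Class (n/m)' into 'disease(class) :- symptom(a, v), symptom(b, w).'"""
    # Remove unwanted tokens: the trailing instance counts such as "(25.0/1.0)".
    cleaned = re.sub(r"\([\d./]+\)\s*$", "", rule_text).strip()
    # Separate the conditions (left of the last colon) from the conclusion (right of it).
    conditions, conclusion = cleaned.rsplit(":", 1)
    goals = []
    # Replace the logical operator AND with Prolog's comma, and "=" with a symptom/2 goal.
    for condition in re.split(r"\s+AND\s+", conditions.strip()):
        attribute, value = (token.strip() for token in condition.split("="))
        goals.append(f"symptom({attribute.lower()}, {value.lower()})")
    # Reverse the sides: the conclusion becomes the clause head on the left.
    head = f"disease({conclusion.strip().lower().replace(' ', '_')})"
    return f"{head} :- {', '.join(goals)}."

print(rule_to_prolog("Redness = yes AND Discharge = yes: Bacterial conjunctivitis (25.0/1.0)"))
```

Because every rule passes through the same function, all rules end up in the same clause shape, which matches the requirement that rules are pre-processed in the same fashion.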

In the second experiment, the model is built using the J48 unpruned decision tree algorithm, with the unpruned parameter set to “True” and the 10-fold cross-validation test mode. The algorithm registered a prediction accuracy of 98.2938%: J48 unpruned correctly classified 1037 instances out of 1055 and misclassified 18, taking 0.04 seconds to build the model.
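
The 10-fold cross-validation test mode used in these experiments can be sketched as below. This is a simplified illustration: it splits instance indices into ten contiguous folds and does not shuffle or stratify the data, which data mining tools typically do before splitting. The function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_instances, k=10):
    """Return k (train_indices, test_indices) pairs; each instance is tested exactly once."""
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_instances // k + (1 if i < n_instances % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_instances))
        folds.append((train, test))
        start += size
    return folds

folds = k_fold_indices(1055, k=10)
for number, (train, test) in enumerate(folds, start=1):
    print(f"fold {number}: train on {len(train)}, test on {len(test)}")
```

Each model is trained on nine folds and tested on the remaining one, and the reported accuracy aggregates the predictions over all ten test folds.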

An overview of the performance analysis of the classifiers is given in Table 5. The diagnosis process is carried out by interacting with the user: the system presents a series of questions, and the user responds “yes” or “no” to each. Based on the answers, the system reports an eye disease and a recommendation through the user interface to support the user's decision making. In this study, knowledge is generated from the sampled eye disease dataset, and the knowledge base is constructed automatically as rules and facts.
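
The yes/no diagnosis dialogue can be sketched as a lookup over encoded rules. The rules, symptom names, and advice strings below are hypothetical placeholders rather than the study's knowledge base or clinical guidance, and the prototype itself is described as using Prolog rather than Python.

```python
# Hypothetical rules: (required yes/no answers, disease, advice).
RULES = [
    ({"red_eye": "yes", "sticky_discharge": "yes"},
     "Bacterial conjunctivitis", "placeholder advice for bacterial conjunctivitis"),
    ({"red_eye": "no", "cloudy_lens": "yes"},
     "Cataract", "placeholder advice for cataract"),
]

def diagnose(answers, rules=RULES):
    """Return (disease, advice) for the first rule whose conditions all match, else None."""
    for conditions, disease, advice in rules:
        if all(answers.get(question) == expected for question, expected in conditions.items()):
            return disease, advice
    return None

result = diagnose({"red_eye": "yes", "sticky_discharge": "yes", "cloudy_lens": "no"})
print(result)
```

Returning `None` when no rule matches keeps the sketch from guessing a disease for an unfamiliar combination of answers.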

In the third experiment, the PART rule induction algorithm is employed. It generated 16 rules using all the attributes of the dataset and the 10-fold cross-validation test option. The algorithm registered a prediction accuracy of 98.2938%, correctly classifying 1037 instances out of 1055 and misclassifying only 18, and took 0.06 seconds to build the model (Table 4). The other rule induction algorithm selected for this study is JRip. To generate IF-THEN rules from the experimental eye disease dataset, JRip is employed with its default parameter values and the 10-fold cross-validation test mode. JRip correctly classified 1036 of the 1055 instances, leaving 19 misclassified (1055 - 1036). The algorithm generated 19 rules and took 0.08 seconds to build the model. The use of a data mining approach in knowledge base development involves a set of techniques for searching through datasets, looking for hidden correlations and trends that are inaccessible to conventional data analysis techniques.
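
The prediction accuracies of the four experiments follow directly from the counts of correctly classified instances reported above. The short sketch below recomputes them; the misclassified counts are derived here as the total minus the correct count.

```python
TOTAL = 1055  # instances available after pre-processing

# Correctly classified instances reported for each experiment.
CORRECT = {"J48 pruned": 1039, "J48 unpruned": 1037, "PART": 1037, "JRip": 1036}

def accuracy_percent(correct, total=TOTAL):
    """Prediction accuracy as a percentage of all instances."""
    return 100.0 * correct / total

for name, correct in CORRECT.items():
    print(f"{name}: {accuracy_percent(correct):.4f}% correct, {TOTAL - correct} misclassified")
```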

Knowledge-based systems are programmed to imitate human problem-solving by referencing databases of knowledge on a particular subject. Expert systems are a type of KBS: a computer model of human expertise in a specific domain of work. They are capable of offering advice and decision support for specific problem-solving in a well-defined knowledge domain. A KBS acts like an expert consultant, asking for information, applying this information to the rules it has learned, and drawing conclusions, as in Table 5.
As described in Figure 2, there is only a minor difference among the classifiers in classifying the dataset correctly. Despite this minor difference, J48 pruned registered the best prediction accuracy, classifying 1039 of the 1055 instances correctly. J48 unpruned and PART show an equal number of incorrectly classified instances. The highest number of incorrect classifications is registered by the JRip algorithm. As stated earlier, pruned and unpruned decision trees and two rule induction algorithms are used for the experiments, and all of the selected algorithms can generate rules from the dataset. The results are evaluated on prediction accuracy in classifying the instances of the dataset into Bacterial conjunctivitis, Viral conjunctivitis, Trachoma, Glaucoma, Corneal ulcer, Cataract, Blepharitis, Chalazion, Nearsightedness (Myopia), and Farsightedness (Hyperopia). Prediction accuracy shows the general classification accuracy of the algorithms. Apart from prediction accuracy, the classifiers are also evaluated on how well they assign the instances of each class to the correct class rather than to another class. Hence, True Positive rate, Precision, Recall, and F-measure are used to evaluate the performance of the classifiers employed in this study, as discussed in Figure 3.
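
For a single class, these measures are computed from the counts of true positives (TP), false positives (FP), and false negatives (FN); the true positive rate is the same quantity as recall. The sketch below uses made-up counts for one class, not figures from the study.

```python
def precision_recall_f1(tp, fp, fn):
    """Per-class precision, recall (true positive rate), and F-measure."""
    precision = tp / (tp + fp) if tp + fp else 0.0   # share of predicted positives that are right
    recall = tp / (tp + fn) if tp + fn else 0.0      # share of actual positives that are found
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    else:
        f_measure = 0.0
    return precision, recall, f_measure

# Hypothetical counts for one disease class.
precision, recall, f_measure = precision_recall_f1(tp=98, fp=2, fn=2)
print(f"precision={precision:.3f} recall={recall:.3f} F-measure={f_measure:.3f}")
```

A figure for the whole classifier is usually obtained by averaging the per-class values, weighted by the number of instances in each class.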

J48 pruned has registered the best result in terms of precision, recall, and F-measure compared to the other classification algorithms. The True Positive rate of the classifiers is also compared. Table 3-7 reveals that the highest TP rate, 98.5%, was scored by the J48 pruned model, followed by the J48 unpruned model and PART. For precision, the highest score, 98.7%, was registered by J48 pruned, followed by the J48 unpruned model and PART, which achieved 98.5%. The lowest score, 98.4%, was registered by JRip. For Recall and F-measure, the J48 pruned model with all attributes achieved the highest scores of 98.5% and 98.5% respectively, whereas the lowest Recall of 98.2% and F-measure of 98.2% were registered by JRip.

The rules acquired from the classifier algorithms are used for constructing the knowledge base. To develop an effective knowledge-based system, acquiring relevant rules is essential. Hence, from the four algorithms, the researcher selected the classifier that performed best at classifying the dataset. J48 pruned has the best performance: its results for all types of diseases are above 98.48%, which is strong performance in identifying and diagnosing each disease correctly. The FP rate is very low for most diagnosis classes.

This page consists of components such as combo boxes and command buttons. The command buttons of the GUI are used to fire a Prolog query. Combo boxes allow users to select among the alternatives given by the inference engine. The Diagnose command button links to other dialog boxes that display the detected diagnosis and the recommendation to users.
Eye disease diagnosis starts when the user selects the “Yes” or “No” option from a combo box based on the patient's complaint, symptom, and eye condition or case, which is listed in the label. The user then clicks the “Diagnose” command button and, if the answers match an encoded rule, the system immediately gives the detected diagnosis and recommendation to assist the practitioner in taking action. Figure 5 shows the result for detected bacterial conjunctivitis and its advice or recommendation.
To ensure that the eye disease diagnostic KBS meets the requirements it was developed for, it has to be tested. Test cases are one of the predominant mechanisms for evaluating the performance of the proposed system, and they let the researcher compare the system's output against the domain experts' judgment. System performance testing focuses on the behavior of the knowledge-based system, to check that it is satisfactory in the eyes of the user. However accurate the system is by performance measures, and however complete the knowledge base is, the system will be of little use if it does not meet user requirements. This kind of testing does not take the internal mechanics of the system into consideration and tends to be subjective. With a view to industrial applicability, the eye disease diagnostic system (EDDS-KBS) has met the desired performance with few errors.

DISCUSSION AND INFERENCE


As discussed in the evaluation section, the proposed system achieves favourable results, with 90.1% in system performance testing and 86% in user acceptance testing. The overall performance of the prototype system is 88.1%. This indicates that using integrated knowledge acquisition techniques is better than using manual knowledge acquisition techniques separately.
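
The text does not state how the overall figure is derived from the two test results. One plausible reading, assumed here purely for illustration, is an unweighted mean of the system performance and user acceptance results, which comes out close to the reported 88.1%.

```python
system_performance = 90.1   # % from system performance testing
user_acceptance = 86.0      # % from user acceptance testing

# Assumption: overall performance is the unweighted mean of the two results.
overall = (system_performance + user_acceptance) / 2
print(f"overall = {overall:.2f}%")
```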

At the outset, this study posed four research questions; the following discusses how they have been answered. The first research question was “Is it possible to use rules resulting from data mining to construct a rule-based knowledge-based system and provide advice for the user?” To answer this question, four experiments with the rule classification algorithms J48 pruned, J48 unpruned, PART, and JRip were conducted under the ten-fold cross-validation test option. The experiments showed that the J48 pruned classification algorithm produced the best rules for developing a prediction model that can predict the type of eye disease. It recorded the best performance, 98.4%, and the researcher decided to use its results in the development of the knowledge base of the KBS, and then collected the advice for each detected disease from the domain expert to encode in the prototype. The second question was “How is it possible to describe the knowledge-based system from knowledge extracted using data mining techniques?” To answer this question, the eye disease dataset was first collected from DBRH, and then feature selection and pre-processing were undertaken.

CONCLUSIONS AND RECOMMENDATIONS


In this study, the possibility of integrating data mining models with a knowledge-based system was explored. The integration process begins by taking samples of the DBRH eye disease dataset from Debre Birhan Hospital in the Amhara region, Ethiopia. The dataset is preprocessed and made suitable for the mining steps. The researcher then extracted knowledge in the form of rules using the WEKA data mining tool. Data mining has been shown to extract hidden knowledge from a large collection of data. Four experiments with four classification algorithms, namely J48 pruned, J48 unpruned, PART, and JRip, were conducted under the ten-fold cross-validation test option. Finally, the J48 pruned classifier is employed for the knowledge acquisition step, since it performed best among the selected classifiers with an accuracy of 98.5%. The prototype system is implemented using the SWI-Prolog tool, which supports GUI integration for the user interface.

The following recommendations are made for further investigation into the applicability of integrating data mining with a knowledge-based system in eye health care.

✓ The research was conducted on a selected sample of eye diseases that could be differentiated by common typical symptoms using differential diagnosis techniques. For a full implementation, further study is needed to incorporate all eye health problems.

✓ This study used the DBRH dataset to integrate the induced rules with the knowledge-based system. Future studies might therefore need to discover knowledge and patterns from different sources.

✓ To enhance the performance of the prototype knowledge-based system, hybrid approaches that combine rule-based reasoning with case-based reasoning should be investigated. The addition of case-based reasoning helps the system to learn.

✓ In this work, the prototype system displays clinical signs only in words, which makes the conditions hard to understand for users who have little or no experience in diagnosis. This can be improved by using multimedia files such as pictures and videos, so that users can understand the problem easily by matching the condition with these files.

✓ The scope of the prototype is limited to identifying eye disease and recommending first-line
treatments and medications. For chronic and acute eye disease, detailed specification of medications is
required. Therefore, further investigation should be done on the treatment planning of eye problems.
REFERENCES

1. Hemavathi, P. S., & Shenoy, P. (2014). Profile of microbial isolates in ophthalmic infections and
antibiotic susceptibility of the bacterial isolates: a study in an eye care hospital, Bangalore. Journal of
clinical and diagnostic research: JCDR, 8(1), 23.

2. World Health Organization. (2006). Sight test and glasses could dramatically improve the lives of 150
million people with poor vision. In Sight test and glasses could dramatically improve the lives of 150
million people with poor vision.

3. Aemero, A., Berhan, S., & Yeshigeta, G. (2015). Role of health extension workers in eye health
promotion and blindness prevention in Ethiopia. JOECSA, 18(2).

4. Abdulkerim, M. (2013). Towards integrating data mining with knowledge based system: the case of
network intrusion detection (Doctoral dissertation, Addis Ababa University).

5. Fayyad, U. M., Piatetsky-Shapiro, G., & Smyth, P. (1996, August). Knowledge Discovery and Data
Mining: Towards a Unifying Framework. In KDD (Vol. 96, pp. 82-88).

6. Prentzas, J., & Hatzilygeroudis, I. (2007). Categorizing approaches combining rule‐based and case‐
based reasoning. Expert Systems, 24(2), 97-122.
