
Fourth International Conference on Devices, Circuits and Systems (ICDCS'18) 223

DECISION SUPPORT SYSTEM FOR HEART DISEASE DIAGNOSIS USING INTERVAL VAGUE SET AND FUZZY ASSOCIATION RULE MINING
P. Umasankar
Assistant Professor, Department of Computer Science
M.I.E.T College of Arts and Science
Tiruchirappalli, India
pus.msu2013@gmail.com

V. Thiagarasu
Associate Professor, Department of Computer Science
Gobi Arts and Science College
Erode, India

Abstract—Cardiovascular disease (CVD), a condition that affects the heart, is the most common cause of death. An inadequate supply of oxygen to the heart leads to symptoms such as fatigue and chest pain (angina). This paper proposes a framework that incorporates a pre-processing step, the Interval Vague set, Fuzzy Association Rule mining and Fuzzy Correlation Rule mining for the decision-making process. The proposed framework focuses mainly on the criteria that cause heart attacks. The pre-processing step is used to reduce the size of the heart disease dataset. Using the rule mining algorithm, a set of rules is generated for the prediction of heart disease based on the selected criteria. The Interval Vague set is used to solve the decision-making problem that doctors face for patients who are in the hesitant state.

Keywords—Interval Vague set, Fuzzy Association Rule Mining, Fuzzy Correlation Mining, Heart Disease, Genetic Algorithm

I. INTRODUCTION
In 2011, the World Health Organization (WHO) stated that Cardiovascular Diseases (CVDs) are the most common cause of death worldwide [1]: more people die each year from CVDs than from any other disease. In 2004, 17.1 million people died from CVD. Of these, around 7.2 million deaths were due to coronary heart disease, and much of the remainder to stroke. CVD deaths occur mainly in low- and middle-income countries and are distributed almost equally between women and men. Projections indicate that nearly 23.6 million people will die from CVDs, mainly heart disease and stroke, in 2030. These are projected to remain the single leading causes of death. The largest percentage increase in deaths is expected in the Eastern Mediterranean Region. Owing to changes in food habits, lifestyle and work culture, the largest increase in the number of deaths is expected in the South-East Asia Region. There is therefore a great need for periodic screening to predict heart disease with efficient and careful methods [2]. This paper focuses on the hesitant group of heart disease patients. Decision making for the hesitant group in the dataset is the most complicated task, because it must be decided whether the group falls into the heart disease category or the no-heart-disease category.

To solve the decision-making issues for this type of group, the Vague set is introduced. A Vague set represents the group that does not fall into a definite category. Data vagueness has been studied and analyzed by various researchers through Vague Sets (VSs) and intuitionistic fuzzy sets. The concept of Vague sets was introduced by Gau & Buehrer in 1993 [3]. In 1996, Bustince & Burillo [4] showed that vague sets are in fact intuitionistic fuzzy sets. The differences between intuitionistic fuzzy sets and vague sets, in terms of their graphical and algebraic representations, have been examined by many researchers [5] [6] [7]. The Interval Vague set (IVS) concept was introduced by Zhi-feng et al. [8] in 2001 by merging vague sets and interval-valued fuzzy sets. The basic operations of interval vague sets and vague sets are given in [9] [10]. Many fields make use of vague sets, which are higher-order fuzzy sets. In an interval vague set, a truth-membership function, a false-membership function and an uncertainty function are used to represent an object more practically and realistically. Through these, an Interval Vague set expresses people's understanding in terms of a support degree, a negative degree and an uncertainty degree. In this paper, the interval vague set is used to rank the attributes in a decision-making problem with fuzzy numbers, with the aim of identifying the optimal attributes for predicting heart disease among patients in the health care industry.

II. PROPOSED FRAMEWORK
Figure 1 depicts the framework for predicting heart disease, which uses a Genetic Algorithm as the pre-processing step. This step is used to reduce the number of attributes. The Interval Vague set is used to make decisions based on multiple criteria; in this problem, the interval is based on the age of the patient. Association rule mining is used to find the general relationships among the random variables in the dataset. Correlation rule mining is used to find the magnitude of those relationships. Finally, the

ISBN: 978-1-5386-3476-9/18/$31.00 ©2018 IEEE Electronics and Communication Engineering


16th & 17th Mar 2018 Karunya Institute of Technology and Sciences

Authorized licensed use limited to: MULTIMEDIA UNIVERSITY. Downloaded on November 20,2023 at 14:45:44 UTC from IEEE Xplore. Restrictions apply.

prediction rate for heart disease diagnosis is obtained by using the Fuzzy Association Rule Mining and Fuzzy Correlation Rule Mining algorithms. This shows the importance of both rule mining algorithms and the Interval Vague set in the health care industry.

FIGURE 1: A Novel Framework for Prediction of Heart Disease by using Interval Vague set

A. Pre-Processing Step
A Genetic Algorithm (GA) is an intelligent probabilistic search algorithm that can be applied to a variety of combinatorial optimization problems. Its theoretical foundations were initially developed by Holland in the 1970s. GA is inspired by the evolutionary process of biological organisms in nature. During the course of evolution, natural populations evolve according to the principles of natural selection and survival of the fittest. Individuals that adapt easily to environmental conditions and have higher fitness are more likely to reproduce and generate offspring, while individuals with lower fitness are eliminated from the population. The combination of good characteristics from highly adapted ancestors may produce even fitter offspring. In this way, species evolve to become better and better adapted to their environment.

A Genetic Algorithm simulates these processes by taking an initial population of individuals and applying GA operators in each generation. Each individual is encoded as a chromosome, which is a solution to the problem. A chromosome is a collection of genes; that is, an individual is made up of genes. The fitness of each individual is calculated by an objective function. Highly fit individuals are given the chance to reproduce in the crossover procedure. Mutation optionally changes some of the genes of an individual to avoid duplication. This cycle of selection, crossover and mutation is repeated until the termination condition is fulfilled. The following algorithm is used to reduce the size of the input dataset.

Input: Heart Disease dataset (for testing), pre-calculated set of chromosomes.
Output: Various types of information
Step 1: Set the overall population
Step 2: Set the Crossover Rate to 0.15 and the Mutation Rate to 0.35.
Step 3: While the number of generations is not attained
Step 4:   For every chromosome in the overall population
Step 5:     For every pre-calculated chromosome
Step 6:       Get the fitness
Step 7:     End for
Step 8:     Assign the optimum fitness as the fitness of that chromosome
Step 9:   End for
Step 10:  Eradicate some chromosomes with worse fitness
Step 11:  Apply crossover to the chosen pairs of chromosomes of the population.
Step 12:  Apply mutation to each and every chromosome of the population.
Step 13: End while.

B. Interval Vague Set
The set of all Interval Vague Sets in $X$ is denoted by IVS($X$). For an interval vague set $A \in$ IVS($X$) and each $x \in X$, the truth-membership value $t_A(x)$ and the false-membership value $f_A(x)$ are closed intervals, and their lower and upper end points are denoted by $t_A^-(x)$, $t_A^+(x)$, $f_A^-(x)$ and $f_A^+(x)$. Then,

$$A = \{ \langle x, [t_A^-(x), t_A^+(x)], [f_A^-(x), f_A^+(x)] \rangle \mid x \in X \},$$

where $0 \le t_A^+(x) + f_A^+(x) \le 1$. For each $x \in X$, the hesitancy degree of a vague interval of $x$ in $A$ is defined and denoted as:

$$\pi_A(x) = 1 - t_A(x) - f_A(x) = [\,1 - t_A^+(x) - f_A^+(x),\; 1 - t_A^-(x) - f_A^-(x)\,].$$

C. Fuzzy Association Rule Mining
The Fuzzy Association Rule Mining and Fuzzy Correlation Rule Mining methods used here follow the previously published paper by the authors P. Umasankar and V. Thiagarasu [11].

Fuzzy association rules are used to find the fuzzy item-sets that frequently occur together in large databases. All the methods used for mining fuzzy association rules are based upon a support-confidence framework, in which fuzzy support and fuzzy confidence are used to identify the fuzzy association rules.

Let $F = \{f_1, f_2, \ldots, f_m\}$ be a set of fuzzy items and $T = \{t_1, t_2, \ldots, t_n\}$ be the set of fuzzy records, where each fuzzy record $t_i$ is represented as a vector with $m$ values, $(f_1(t_i), f_2(t_i), \ldots, f_m(t_i))$, and $f_j(t_i)$ is the degree to which the fuzzy item $f_j$ appears in record $t_i$.
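The loop in Steps 1 to 13 of the pre-processing algorithm (Section II-A) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: chromosomes are encoded as bit masks over the attributes, the worse half of the population is discarded in each generation, crossover is single-point, and mutation flips one gene of a child. The fitness function `toy_fitness` and its `WEIGHTS` are hypothetical stand-ins, because the text does not specify how fitness is computed. Only the crossover rate (0.15) and the mutation rate (0.35) are taken from the text.

```python
import random
from typing import Callable, List

Chromosome = List[int]  # one bit per attribute: 1 = keep, 0 = drop


def run_ga(
    n_genes: int,
    fitness: Callable[[Chromosome], float],
    pop_size: int = 20,
    generations: int = 30,
    crossover_rate: float = 0.15,  # rate stated in the text
    mutation_rate: float = 0.35,   # rate stated in the text
    seed: int = 0,
) -> Chromosome:
    """Return the fittest chromosome seen during the search."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Rank by fitness and discard the worse half (assumed selection rule).
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        children: List[Chromosome] = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = a[:]
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_genes)  # single-point crossover
                child = a[:cut] + b[cut:]
            if rng.random() < mutation_rate:
                i = rng.randrange(n_genes)       # flip one randomly chosen gene
                child[i] = 1 - child[i]
            children.append(child)
        population = survivors + children
        best = max(population + [best], key=fitness)
    return best


# Hypothetical relevance weights for six attributes; not taken from the paper.
WEIGHTS = [0.9, 0.1, 0.8, 0.05, 0.7, 0.1]


def toy_fitness(chrom: Chromosome) -> float:
    # Reward relevant attributes and charge a fixed cost per selected attribute.
    return sum(w for w, g in zip(WEIGHTS, chrom) if g) - 0.3 * sum(chrom)


selected = run_ga(len(WEIGHTS), toy_fitness)
print(selected)
```

Because the survivors are carried over unchanged, the best fitness in the population never decreases from one generation to the next. A real fitness function would score each attribute subset against the heart disease data rather than against fixed weights.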
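The hesitancy degree of Section II-B can be checked numerically with a small sketch. The class name `IntervalVagueValue`, its field names and the sample intervals are hypothetical. The formula assumes the usual convention that the hesitancy interval runs from 1 minus the two upper end points to 1 minus the two lower end points, which may differ in notation from the authors' definition.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class IntervalVagueValue:
    """Truth and false membership intervals of one element (hypothetical container)."""
    t_lo: float
    t_hi: float
    f_lo: float
    f_hi: float

    def __post_init__(self) -> None:
        # Each interval must be ordered and lie inside [0, 1].
        if not (0.0 <= self.t_lo <= self.t_hi <= 1.0):
            raise ValueError("invalid truth-membership interval")
        if not (0.0 <= self.f_lo <= self.f_hi <= 1.0):
            raise ValueError("invalid false-membership interval")
        # The upper end points together must not exceed 1.
        if self.t_hi + self.f_hi > 1.0:
            raise ValueError("upper end points must satisfy t_hi + f_hi <= 1")

    def hesitancy(self) -> Tuple[float, float]:
        """Assumed hesitancy interval: [1 - t_hi - f_hi, 1 - t_lo - f_lo]."""
        return (1.0 - self.t_hi - self.f_hi, 1.0 - self.t_lo - self.f_lo)


# Made-up example: support in [0.5, 0.6], opposition in [0.2, 0.3].
value = IntervalVagueValue(t_lo=0.5, t_hi=0.6, f_lo=0.2, f_hi=0.3)
lo, hi = value.hesitancy()
print(round(lo, 2), round(hi, 2))
```

A wider hesitancy interval corresponds to a patient whose evidence neither clearly supports nor clearly opposes the heart disease category, which is the hesitant group this paper targets.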
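As a rough illustration of the support-confidence framework over the fuzzy records of Section II-C, the sketch below stores each record as a vector of membership degrees and computes a fuzzy support and a fuzzy confidence. The minimum operator used to combine degrees, the confidence taken as a ratio of supports, and the function names `fuzzy_support` and `fuzzy_confidence` are assumptions based on a common convention in the fuzzy association rule literature; they are not necessarily the exact formulas used by the authors. The sample degrees are made up.

```python
from typing import Sequence


def fuzzy_support(records: Sequence[Sequence[float]], items: Sequence[int]) -> float:
    """Mean over all records of the smallest degree among the chosen items.

    Combining degrees with min is an assumption, not the paper's stated formula.
    """
    if not records:
        return 0.0
    return sum(min(rec[j] for j in items) for rec in records) / len(records)


def fuzzy_confidence(
    records: Sequence[Sequence[float]],
    antecedent: Sequence[int],
    consequent: Sequence[int],
) -> float:
    """Assumed definition: fsup(antecedent and consequent) / fsup(antecedent)."""
    base = fuzzy_support(records, antecedent)
    if base == 0.0:
        return 0.0
    both = list(antecedent) + [j for j in consequent if j not in antecedent]
    return fuzzy_support(records, both) / base


# Hypothetical fuzzy records: each row holds the degrees of items f1..f3.
records = [
    [0.2, 0.1, 0.1],
    [0.4, 0.5, 1.0],
    [1.0, 1.0, 1.0],
    [0.6, 0.2, 0.2],
]
fsup_12 = fuzzy_support(records, [0, 1])          # support of {f1, f2}
fconf_1_to_2 = fuzzy_confidence(records, [0], [1])  # confidence of f1 -> f2
print(round(fsup_12, 3), round(fconf_1_to_2, 3))
```

A rule would then be kept only if both values reach their predefined minimum thresholds.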


With the degrees of the fuzzy items in each record defined in this way, a fuzzy association rule is defined as an implication of the form $X \rightarrow Y$, where $X$ and $Y$ are two fuzzy item-sets. The fuzzy association rule $X \rightarrow Y$ holds in $T$ with a fuzzy support (fsup) and a fuzzy confidence (fconf). The fuzzy support and fuzzy confidence are given as follows:

(1)

(2)

If the fuzzy support of the rule is greater than or equal to a predefined threshold, the minimal fuzzy support, and its fuzzy confidence is also greater than or equal to a predefined threshold, the minimum fuzzy confidence, then $X \rightarrow Y$ is considered an interesting fuzzy association rule. This means that the presence of the fuzzy item-set $X$ in a record implies the presence of the fuzzy item-set $Y$ in the same record.

D. Fuzzy Correlation Rule Mining
Fuzzy association rules are typically mined by finding frequent fuzzy item-sets with the candidate generation method (Agrawal et al., 1993; Agrawal & Srikant, 1994) [12] [13]. Apriori is a seminal algorithm proposed for mining frequent fuzzy item-sets. The algorithm uses prior knowledge of frequent fuzzy item-set properties. Apriori employs an iterative approach known as level-wise search, where k-item-sets are used to explore (k+1)-item-sets. First, the set of frequent 1-item-sets is found by scanning the fuzzy database to accumulate the count for each item and collecting those items that satisfy the minimum support. The resulting set is denoted by L1. Next, L1 is used to find L2, the set of frequent fuzzy 2-item-sets, which is used to find L3, and so on, until no more frequent fuzzy k-item-sets can be found. Finding each Lk requires one full scan of the database. For candidate generation, the algorithm consists of two steps, namely (i) the join step and (ii) the prune step. The fuzzy correlation rule mining method previously proposed by Thiagarasu & Umasankar (2017) [14] is used in this section.

III. DATASET DESCRIPTION
The dataset used in this study is from the Hungarian Institute of Cardiology. The dataset contains 303 instances and 14 attributes of heart disease patients.

Table 1: The attributes of the heart disease dataset
Serial Number   Attributes
1               Age
2               Gender
3               Chest Pain
4               Rest SBP
5               Cholesterol
6               Fasting Blood
7               Rest ECG
8               Maximum HR
9               Exer Ind
10              ST by exercise
11              Slope peak exc ST
12              Major vessels colored
13              Thal
14              Diameter Narrowing

IV. RESULT AND DISCUSSION
Table 2 presents the result of the feature selection technique in the pre-processing step.

Table 2: The result obtained from Genetic Algorithm
PSO                     ACO                     Genetic Algorithm
Cholesterol             Chest Pain              Chest Pain
Maximum HR              Rest ECG                ExerInd
ST by exercise          ExerInd                 Major Vessels Colored
Thal                    Slope Peak by ST        Thal
Chest Pain              Major Vessels Colored   -
Major Vessels Colored   Thal                    -
Rest SBP                -                       -
ExerInd                 -                       -

Before mining the fuzzy correlation rules from this pre-processed dataset with the interval vague technique, the thresholds need to be determined in advance. Here, the three thresholds are set to 0.25, 0.80 and 0.30. First, formula (3) is used to compute the degree of interest (significance) of each attribute to heart disease for each patient, which transforms Table 3a into Table 4. Each entry of Table 4 is therefore the degree of interest (significance) of the corresponding feature or attribute for heart disease patient Pj.

Table 3a: A Pre-processed Heart Disease Dataset
A1 (Chest Pain)   A2 (ExerInd)   A3 (Major Vessel Colored)   A4 (Thal)
1                 0.0            0.6                         0.2
1.2               1.0            1.3                         0.3
1.5               1.0            0.4                         0.1
3.9               0.0            3.5                         0.1
3.6               1.0            6.3                         0.8
3.2               0.0            1.8                         3.8
1.3               1.0            1.6                         0.6
4.0               0.0            4.4                         1.2
3.8               0.0            6.8                         5.2
1.8               0.0            1.7                         2.4
1.3               1.0            5.4                         3.5
3.8               1.0            5.4                         2.4
3.6               1.0            5.1                         5.7


0.4               1.0            0.9                         1.1
2.5               7.4            0.8                         0.4

Table 3b: The values and their range for the attributes in the Pre-processed dataset
Input Field             Range     Linguistic Representation
Chest Pain              0-1       Typical Angina
                        1.1-2     Atypical Angina
                        2.1-3     Non Anginal Pain
                        3.1-4     Asymptomatic
Exercise Test           1         Yes
                        0         No
Scan Report             0-3       Normal
                        3.1-6     Fixed Defect
                        6.1-7     Reversible Defect
Major Colored Vessels   0-1.5     Low
                        1.6-3     Mild
                        3-6       High

(3)

Table 4: The degree of significance of each attribute for the heart disease patients
A1 (Chest Pain)   A2 (ExerInd)   A3 (Major Vessel Colored)   A4 (Thal)
0.2               0.1            0.1                         0.1
0.1               0.8            0.1                         0.0
0.1               0.1            0.0                         0.1
0.4               0.5            1.0                         0.0
0.4               0.1            1.0                         0.6
0.6               0.2            0.2                         0.4
0.1               0.4            0.2                         0.1
1.0               1.0            1.0                         0.2
1.0               1.0            1.0                         0.8
0.3               1.0            1.0                         0.2
0.1               1.0            0.5                         0.4
0.9               0.4            0.5                         0.2
0.4               0.1            0.5                         1.0
1.0               1.0            0.2                         0.1
0.2               0.1            0.1                         0.0

Table 5: A(Aj) of each attribute
Aj    A(Aj)
P1    0.43
P2    0.54
P3    0.47
P4    0.26

The fuzzy support and fuzzy simple correlation coefficient of each entry of P2 are computed and listed in Table 6.

Table 6: The fuzzy support and the fuzzy simple correlation coefficient of each entry of P2
P2          fsup    r
(A1, A2)    0.33    0.33
(A3, A2)    0.35    0.35
(A4, A2)    0.16    0.20

The fuzzy support and fuzzy simple correlation coefficient of the remaining attribute pairs are computed and listed in Table 7.

Table 7: The fuzzy support and the fuzzy simple correlation coefficient of the remaining attribute pairs
            fsup    r
(A1, A3)    0.31    0.45
(A1, A4)    0.19    0.31
(A3, A4)    0.22    0.42

Table 8: The fuzzy support and the fuzzy simple correlation coefficient of each entry of P3
P3                  fsup    r
({A1, A3}, A2)      0.24    0.36
(A1, {A2, A3})      0.27    0.52
({A2, A1}, A3)      0.27    0.46

Table 8 gives the fuzzy support and the fuzzy simple correlation coefficient of each entry of P3.

Table 9: The fuzzy confidences of the patient using Interval vague and fuzzy correlation rules generated
Generated Rules     fconf
                    0.77
                    0.61
                    0.65
                    0.74
                    0.63
                    0.50
                    0.57
                    0.77
                    0.87
                    0.82

V. CONCLUSION
To diagnose heart disease, hospitals need to select the most appropriate attributes for their patients. In this paper, to discover strong correlations between a specific attribute and other attributes, we propose an algorithm


for mining interval vague fuzzy correlation rules. Our experimental results show that there are strongly positive relationships among chest pain, thallium scan, exercise-induced angina and major colored vessels. This kind of information is useful for providing patients with an appropriate diagnosis and for removing irrelevant factors, leading to a more effective diagnosis system in the health care industry.

REFERENCES
[1] WHO (2011), "Fact Sheet: Cardiovascular Diseases", World Health Organization, Geneva.
[2] Tsipouras, M. G. and Fotiadis, D. I. (2008), "Automated Diagnosis of Coronary Artery Disease Based on Data Mining and Fuzzy Modeling", IEEE Transactions on Information Technology in Biomedicine, 12(4), pp. 447-458.
[3] Tsipouras, M. G. and Fotiadis, D. I. (2008), "Automated Diagnosis of Coronary Artery Disease Based on Data Mining and Fuzzy Modeling", IEEE Transactions on Information Technology in Biomedicine, 12(4), pp. 447-458.
[4] Bustince, H., Burillo, P. (1996), Vague sets are intuitionistic fuzzy sets, Fuzzy Sets and Systems, 79, 403-405.
[5] Liu, P. D. (2009), Multi-attribute decision-making method research based on interval vague set and TOPSIS method, Technological and Economic Development of Economy, Baltic Journal of Sustainability, 15(3), 453–463.
[6] Liu, P. D., & Guan, Z. L. (2008), Research on group decision making based on the vague set and hybrid aggregation operators, Journal of Wuhan University of Technology, 30(10), 152–155.
[7] Liu, P. D., & Guan, Z. L. (2009), An approach for multiple attribute decision-making based on Vague sets, Journal of Harbin Engineering University, 30(1), 106–110.
[8] Zhi-feng, Ma., Cheng, Z. H., Xiaomei, Z. (2001), Interval valued vague decision systems and an approach for its rule generation, Acta Electronica Sinica, 29(5), 585-589.
[9] Bustince, H., Burillo, P. (1995), Correlation of interval-valued intuitionistic fuzzy sets, Fuzzy Sets and Systems, 74, 237-244.
[10] Bustince, H., Burillo, P. (1995), Correlation of interval-valued intuitionistic fuzzy sets, Fuzzy Sets and Systems, 74, 237-244.
[11] Thiagarasu, V., & Umasankar, P. (2016), Mining Correlation Rules for Multiple Attribute Group Decision Making Models with Vague Sets, International Journal of Applied Engineering Research, 11(16), 8848-8857.
[12] Agrawal, R., Imielinski, T., & Swami, A. (1993), "Mining Association Rules between Sets of Items in Large Databases", Proceedings of the ACM SIGMOD International Conference on Management of Data, Washington D.C., 207-216.
[13] Agrawal, R. and Srikant, R. (1994), "Fast Algorithms for Mining Association Rules", Proceedings of the 20th International Conference on Very Large Data Bases, Santiago, Chile, pp. 487–499.
[14] Umasankar, P., Thiagarasu, V. (2017), "Mining Correlation Rules for Interval-Vague Sets", International Journal of Engineering and Computer Science, 6(2), 20362-20371.
