Professional Documents
Culture Documents
3.4 Gantt Chart and Process Model and Client Communication17Error! Bookmark not
defined.
1
4.1 Block Diagram................................................. Error! Bookmark not defined.26
List of Tables
Table 3.2.1: Feature Analysis ........................................................Error! Bookmark not defined.
2
Chapter 1
Introduction
1.1 Introduction
Data mining is the process of finding previously unknown patterns and trends in
databases and using that information to build predictive models. In today's world data mining
plays a vital role for prediction of diseases in medical industry. In medical diagnosis, the
information provided by the patients may include redundant and interrelated symptoms and signs
especially when the patients suffer from more than one type of disease of same category. The
physicians may not able to diagnose it correctly. So it is necessary to identify the important
diagnostic features of a disease and this may facilitate the physicians to diagnosis the disease
early and correctly.
Fuzzy Systems is used for solving a wide range of problems in different application
domains. The use of Genetic Algorithms for designing Fuzzy Systems allows us to introduce the
learning and adaptation capabilities. The topic has attracted considerable attention in the
Computation Intelligence community. The paper briefly reviews the classical models and the
most recent trends for Genetic Fuzzy Systems. Accurate and reliable decision making in
oncological prognosis can help in the planning of suitable surgery and therapy, and generally,
improve patient management through the different stages of the disease. To indicate that the
reliable prognostic marker model than the statistical and artificial neural-network-based methods.
1.1.1 Necessity of accurate heart disease prediction system
In today’s stressful & hectic lifestyle it is difficult to take care of heart health. Heart
disease (HD) is a major cause of morbidity and mortality in the modern society. Medical
diagnosis is an important but complicated task that should be performed accurately and
efficiently and its automation would be very useful. All doctors are unfortunately not equally
skilled in every sub specialty and they are in many places a scarce resource. A system for
automated medical diagnosis would enhance medical care and reduce costs. System not only
help medical professionals but it would give patients a warning about the probable presence of
heart disease even before he visits a hospital or goes for costly medical check-ups.
3
1.2 Motivation
Data mining techniques have been widely used in clinical decision support systems for
prediction and diagnosis of various diseases with good accuracy. These techniques have been
very effective in designing clinical support systems because of their ability to discover hidden
patterns and relationships in medical data. One of the most important applications of such
systems is in diagnosis of heart diseases because it is one of the leading causes of deaths all over
the world. Almost all systems that predict heart diseases use clinical data-set having parameters
and inputs from complex tests conducted in labs. None of the system predicts heart diseases
based on risk factors such as age, family history, diabetes, hypertension, high cholesterol,
tobacco smoking, alcohol intake, obesity or physical inactivity, etc. Heart disease patients have
lot of these visible risk factors in common which can be used very effectively for diagnosis.
System based on such risk factors would not only help medical professionals but it would give
patients a warning about the probable presence of heart disease even before he visits a hospital or
goes for costly medical checkups.
The main objective of this research is to develop an Intelligent Heart Disease Prediction
System using Weighted Associative Classifiers that can be used in making expert decision with
maximum accuracy. The system can be implemented in remote areas like rural regions or
country sides, to imitate like human diagnostic expertise for treatment of heart ailment. The
system can be easily updated as and when the new training data set will be available. Also the
system is user friendly and includes less number of steps, as the prediction rules are already
generated and stored in Rule Base. The system is helpful for practice nor to confirm his/her
decision during heart Disease prediction.
Data mining techniques are popularly used for various real world applications related to
predicting of things. Heart disease prediction system is also built with the help of such data
mining techniques.
In order to create a learning system, our system requires a learning algorithm like neural
network.
This system takes input from user in form of parameters and casts a prediction after
learning on thousands of train and test data.
This project proposes a new hybrid approach that uses a collaboration of Genetic
Algorithm , Neural Networks and Fuzzy Logic. It is basically a Neuro-Fuzzy System
with Genetic Algorithm.
4
1.4 Scope of the project
Complaint Processing: Getting ready to take appropriate action to satisfy the customers
based on any probable negative reviews identified through the analysis of tweets.
Sales Department: Identifying the region specific sales of a product and it's correlation
with those region's product.
1) Social impact: The project developed will provide a user friendly environment as the
distribution of the data will be at the client site.
2) Ethical or legal impact: The hybrid learning algorithms used are Open Source Entities which
are increasingly used in Data Mining.
3) Environmental impact: No direct environmental impact.
4) Industrial Impact: This will benefit not only doctors for prediction, but also patients who can
use this system at their homes thereby reducing cost of major reports.
5
4. Hospitals can use our product to train upcoming doctors (medical students).
5. Doctors can use our system at their respective clinics as well as the patients can use this
at their respective homes.
6
Chapter 2
Table 1:
Sr. Year of Title Conference Methodology used
no. Publication
1. 2013 Intelligent Heart IEEE This research has
Disease developed a
Prediction prototype using
System Using data mining
Data Mining techniques,
Techniques namely, Decision
Trees, Naïve
Bayes and Neural
Network. Results
show that each
technique has its
unique strength
in realizing the
objectives of the
defined mining
goals.
2. 2012 Intelligent and IJCSI It presents
effective heart methodology for
attack prediction the extraction of
system using significant
patterns from the
7
data mining and heart disease
artificial neural warehouses for
network heart attack
prediction.
The
preprocessed data
warehouse is
then clustered
using the K-
means clustering
algorithm
8
2.2 PROPOSED WORK
Problem Statement:
Data mining techniques have been widely used in clinical decision support systems for
prediction and diagnosis of various diseases with good accuracy. These techniques have been
very effective in designing clinical support systems because of their ability to discover hidden
patterns and relationships in medical data. One of the most important applications of such
systems is in diagnosis of heart diseases because it is one of the leading causes of deaths all over
the world. Almost all systems that predict heart diseases use clinical dataset having parameters
and inputs from complex tests conducted in labs. None of the system predicts heart diseases
based on risk factors such as age, family history, diabetes, hypertension, high cholesterol,
tobacco smoking, alcohol intake, obesity or physical inactivity, etc. Heart disease patients have
lot of these visible risk factors in common which can be used very effectively for diagnosis.
System based on such risk factors would not only help medical professionals but it would give
patients a warning about the probable presence of heart disease even before he visits a hospital or
goes for costly medical checkups.
Objective:
The objective of this research is to create an intelligent & cost effective system which
will overcome the limitations of existing system and improve its performance.A novel way to
enhance the performance of a model that combines genetic algorithms and neuro fuzzy logic for
feature selection and classification is proposed.
Neuro fuzzy systems are fuzzy systems which use ANN’s theory in order to determine
their properties (fuzzy sets and fuzzy rules) by processing data samples. The main objective of
this research is to develop a prototype Intelligent Heart Disease Prediction.
Scope:
Various soft computing methods have been used for the detection of a potential medical
problem. Thus, a reliable method for both feature selection and classification is required. The
feature selection is based on a new genetic algorithm and classification is based on neuro fuzzy
system (NFS). Feature selection is another factor that impacts classification accuracy. By
extracting as much information as possible from a given data set while using the smallest number
of features, we can save significant computation time and build models that generalize better for
unseen data points. System with NFS and genetic algorithm using historical heart disease
databases to make intelligent clinical decisions which traditional decision support systems
cannot. The NFS model integrates adaptable fuzzy inputs with a modular neural network to
rapidly and accurately approximate complex functions. Fuzzy inference systems are also
valuable, as they combine the explanatory nature of rules (MFs) with the power of neural
networks.
9
Proposed Architecture:
10
Chapter 3
Requirement Gathering, Analysis and Planning
Method Of Production:
We will follow the Spiral i.e. Iterative Model for software development.
We will be making use of Matlab IDE to develop our project.
Production Technique:
It might have happened so many times that you or someone yours need doctors
help immediately, but they are not available due to some reason.
The Heart Disease Prediction application is an end user support and online
consultation project.
Here, we propose a web application that allows users to get instant guidance on
their heart disease through an intelligent system online.
The application is fed with various details and the heart disease associated with
those details.
The application allows user to share their heart related issues.
It then processes user specific details to check for various illness that could be
associated with it.
Here we use some intelligent data mining techniques to guess the most accurate
illness that could be associated with patient’s details.
Based on result, system automatically shows the result specific doctors for further
treatment.
The system allows user to view doctor’s details.
11
Intel 1.66 GHz Processor Pentium 4
Windows XP, Windows 7,8
Visual Studio 2010
12
3.1.2 FINANCIAL FEASIBILITY STUDY:
As we are using Matlab, it is not going cost us anything as it is an open source
development toolkit available for free of cost. So, the total cost estimation will be:total cost of
electricity + total cost of Internet usage.
We propose some improvements to our project that increases the classification accuracy
over the previous accuracy baseline. The objective of this research is to create an intelligent &
cost effective system which will overcome the limitations of existing system and improve its
performance.A novel way to enhance the performance of a model that combines genetic
algorithms and neuro fuzzy logic for feature selection and classification is proposed.
We will be using GitHub which is also an open source tool which will add zero cost to
the total estimation cost. It is mainly used for our project management so that every member of
the team can work on the code simultaneously.
3.1.3MARKET SURVEY:
There is growing evidence that merging classification and association rule mining together can
produce more efficient and accurate classification system than traditional classification
techniques.
Genetic algorithms are typically used for problems that cannot be solved efficiently with
traditional techniques.
Genetic algorithms seem to be useful for searching very general spaces and optimization
problems.
Coronary heart disease is epidemic in India and one of the major causes of disease burden and
deaths. Data from registrar general of India shows that heart diseases are major cause of death in
India, studies to determine the precise cause of death in urban Chennai and rural areas of A.P
have revealed that CVD cause about 40% of the deaths in urban and 30% in rural areas.
In today’s stressful & hectic lifestyle it is difficult to take care of heart health. Heart disease
(HD) is a major cause of morbidity and mortality in the modern society. Medical diagnosis is an
important but complicated task that should be performed accurately and efficiently and its
automation would be very useful. All doctors are unfortunately not equally skilled in every sub
specialty and they are in many places a scarce resource. A system for automated medical
diagnosis would enhance medical care and reduce costs. System not only help medical
professionals but it would give patients a warning about the probable presence of heart disease
even before he visits a hospital or goes for costly medical check-ups.
13
3.1.4 LEGAL, SOCIAL, SCHEDULE FEASIBILITY:
LEGAL FEASIBILTY:
Our project will meet the legal requirements of Government of India.
The initial prototypes of our project will be based on dummy HEART DISEASE
PREDICTION data; hence there would be no legal conflicts with doctors as well as HEART
DISEASE PREDICTION data lenders.
In the later stages of development we will be testing out project on the Matlab, and will
acquire all the required permissions for accessing their structure from Matlab Inc.
SOCIAL FEASIBILITY:
The Application will reduce a lot of labour work. Hence the Efforts will be
reduced.
The app will not require use of papers which is good sign for society.
SCHEDULE FEASIBILITY:
A project will fail if it takes too long to be completed before it is useful. Typically this
means estimating how long the system will take to develop and if it can be completed in a given
time period using some methods like payback period.
This phase will help in many ways. For example, in the middle of the project you realize
that the project is not moving in the speed you want, you can hire more people to increase the
efforts and can complete the project in the given time. Also as the alternative, if your budget is
not allowing you to hire more staffs you can retrain your existing ones in order to make your
work in speed.
To review our schedule we have prepared a corresponding Timeline Gantt chart which
shows a proper execution of our project that will take place and we have kept some buffer time if
something takes longer than expected.
ECONOMIC FEASIBILITY:
We have conducted economic feasibility studies for public and private clients across a
broad range of industries and projects including agriculture, environment and natural resources,
recreation and tourism facilities, commercial and industrial businesses, business cooperatives,
water development and renewable energy projects.
14
3.2 Methodology and Technology
Methodology
The objective of this research is to create an intelligent & cost effective system which will
overcome the limitations of existing system and improve its performance. A novel way to
enhance the performance of a model that combines genetic algorithms and neuro fuzzy logic for
feature selection and classification is proposed.
Various soft computing methods have been used for the detection of a potential medical
problem. Thus, a reliable method for both feature selection and classification is required. The
feature selection is based on a new genetic algorithm and classification is based on neuro fuzzy
system (NFS). Feature selection is another factor that impacts classification accuracy. By
extracting as much information as possible from a given data set while using the smallest number
of features, we can save significant computation time and build models that generalize better for
unseen data points.
Neuro fuzzy systems are fuzzy systems which use ANN’s theory in order to determine their
properties (fuzzy sets and fuzzy rules) by processing data samples. The main objective of this
research is to develop a prototype Intelligent Heart Disease Prediction.
System with NFS and genetic algorithm using historical heart disease databases to make
intelligent clinical decisions which traditional decision support systems cannot. The NFS model
integrates adaptable fuzzy inputs with a modular neural network to rapidly and accurately
approximate complex functions. Fuzzy inference systems are also valuable, as they combine the
explanatory nature of rules (MFs) with the power of neural networks.
Technology
The different types of technologies that have been used are as follows-
SOFTWARE REQUIREMENTS:
Google Chrome
Matlab
HARDWARE REQUIREMENTS:
1 GB RAM.
200 GB HDD.
15
3.3 TIMELINE CHART, PROCESS MODEL AND CLIENT COMMUNICATION
3.3.1 TIMELINE CHART:
16
17
18
Figure 3: Time Line Charts
19
3.3.2 PROCESS MODEL
In this project after careful and relentless consideration; we as a team has come to the conclusion
that we would be proceeding with Iterative or Incremental (specifically Spiral Model) of
software development. The main motive behind choosing this model are its advantages in
making it feasible for us to iteratively modify our project according to client requirements.
After having a look at all the client specifications and requirements and after having a brief
discussion regarding the same with our Institute Guide we have come to the conclusion that our
project will require extensive and rigorous work in every expect of its domain and we would
further need to modify them according to client satisfaction, we will also need to analyze early
models of our project and identify various risks that are difficult to detect on paper and remove
them in later stages .
20
3.3.3 CLIENT COMMUNICATION
CLIENT:
Doctors
4) Is there any specific demand from your side for the software?
No, nothing as such, just the software should be able to provide a proper statics which
should be based on the data whether positive or negative.
7) How frequently do you want to know about the development of the software?
We would like to be involved in every step. So it would be good if we are informed every
month.
21
3.4 SYSTEM ANALYSIS
A data flow diagram (DFD) is a graphical representation of the "flow" of data through an
information system, modelling its process aspects. Level 0 DFD represents the whole system as
an abstract entity , with User, HEART DISEASE PREDICTION and Company being the
external entity,the user does not directly access the system, he/she send “data” on the HEART
DISEASE PREDICTION API which then pre-processes the data,the company sends precise
specification to the system based on which it extracts relevant data by accessing the HEART
DISEASE PREDICTION API,after processing and rating of the data , the system generates a
textual and graphical summary of result and sends it to company/user.
LEVEL- 0 DFD
22
LEVEL-1 DFD
Level 1 DFD is the conjecture of Level 0 DFD, which is represents the flow of the processes going
through the whole system.The following DFD consist of elaboration of the processes defined in level 0
DFD. Fuzzy logic and neural networks are natural complementary tools in building intelligent
systems. While neural networks are low-level computational structures that perform well when dealing
with raw data, fuzzy logic deals with reasoning on a higher level, using linguistic information acquired
from domain experts.
23
3.4.2 ULM DIAGRAM AND BLOCK DIAGRAM
Use-Case Diagram
The Use Case diagram consists of three main users which is Data Collector, HEART DISEASE
PREDICTION, User. The Data Collector is the main component of overall process. It will
collect all data which are entered by the user to which it will process by using modules such as –
Perform Sentiment Analysis, Extract significant phrases, Machine Learning Classifier, Generate
graph and summary text. The data collector will have only one connectivity for http request to be
confirmed. User will be entering data and also will get a review about his data.
24
Sequence Diagram
The Sequence diagram consists of four main components of our system i.e. User,
HEART DISEASE PREDICTION, Application and the Company. Here the interaction between
them is shown. To initiate the process company begins by requesting for HEART DISEASE
PREDICTION data analysis. Further the request is accessed by the application by extracting
information of the users from the HEART DISEASE PREDICTION and processing it. Finally
the result is sent to the company as well as the users.
25
Activity Diagram
Architecture Diagram
26
Chapter 4
27
4.2 Flowchart
Flowchart: The below chart shows the flow of activities in our system. It demonstrate
various stages of our project
28
Chapter 5
Design Methodology
5.1 Database Design
5.2 Gui Design
5.3 Testing Report
DESCRIPTION:
In this application there are certain requirements which need to be dealt with.
Requirements like its users/patients account, credentials etc. For that application has to have a
“Login” page that has these elements:
● Account/Username
● Password
● Login/Sign in Button
Unit testing of above mentioned case can be as follows:
● Field length – username and password fields
■ In here the module is concerned with the field length of certain attributes
like password which should be according to mentioned description
(special characters).
● Input field
■ Value of input field should be valid like in case of user name should be
appropriate.
● Login button will forward user to the home page only if the credentials match. If
not, then a pop up message of “invalid credentials” will appear.
Integration testing, of particular case can be as follows:
● Form Login
■ User sees the welcome message after entering valid values and pushing
the login button.
■ User should be navigated to the welcome page or home page after valid
entry and clicking Login button.
In this particular case considered for functional testing, description can be as follows:
1. The expected behavior is checked, i.e. the user will be able to login on clicking the
login button after entering valid username and password values
2. After successful login user is navigated to home page.
3. There should be an error message that should appear on invalid login.
4. To check whether there is any stored site cookies for login fields.
5. Confirmation of registration is send to respective student.
29
5.4 Sample Source Code
CODE:
function net=apply_ga_generation_variation(net,Opt,O,input)
input_weights=net.IW;
new_weights=input_weights{1,1};
new_weight(1:12,1:12)=rand;
input_weights=new_weight;
net.IW{1,1}=input_weights;
no_of_weights=144;
[x_ga_opt, err_ga] = ga(handle, no_of_weights, ga_opts);
ii=1;
kk=1;
for jj=1:12:length(x_ga_opt)
weights(kk,1:12)=x_ga_opt(1,jj:jj+11);
kk=kk+1;
end
net.IW{1,1}=weights;
function net=apply_ga(net,Opt,O)
input_weights=net.IW;
new_weights=input_weights{1,1};
new_weight(1:12,1:12)=rand;
input_weights=new_weight;
net.IW{1,1}=input_weights;
no_of_weights=144;
[x_ga_opt, err_ga] = ga(handle, no_of_weights, ga_opts);
ii=1;
kk=1;
30
for jj=1:12:length(x_ga_opt)
weights(kk,1:12)=x_ga_opt(1,jj:jj+11);
kk=kk+1;
end
net.IW{1,1}=weights;
function mse_error=calc_mse(output,targets)
for ii=1:length(output)
mse_error(ii) = sum((output(ii)-targets(ii)).^2)/length(output);
end
function pr=findminmax(p)
if iscell(p)
[m,n] = size(p);
pr = cell(m,1);
for i=1:m
pr{i} = minmax([p{i,:}]);
end
elseif isa(p,'double')
pr = [min(p,[],2) max(p,[],2)];
else
error('Argument has illegal type.')
end
function weights=fitnessfun(x)
% a=weight_to_optimise;
weights=x;
% The following code example describes a function that returns the mean squared error for a
given input of weights and biases,
% a network, its inputs and targets.
31
% weights and biases vector
y = net(inputs);
% Calculating the mean squared error
mse_calc = sum((y-targets).^2)/length(y);
% end
% The following code example describes a script that sets up a basic Neural Network problem
and the definition of a
% function handle to be passed to GA. It uses the above function to calculate the Mean Squared
Error.
clc
clear all
close all
load M;
32
b=M(1:length(M),1:12);
I=b';
O=M(1:length(M),13)';
a=b';
load net
output=[];
input=[10 20 30 40 50 60 70 80 90 100 ];
output_mse=[];
output1=[];
output_mse1=[];
for ii=1:length(M)
Opt(ii)=(sim(net,I(1:12,ii)));
end
for kk=1:length(input)
for it=1:10
net=apply_ga_generation_variation(net,Opt,O,input(kk));
net.trainParam.epochs = 50;
[net tr]=train(net,I,O);
for ii=1:length(M)
Optga(ii)=(sim(net,I(1:12,ii)));
end
mse_error=calc_mse(Optga,O);
output=[output (Optga)];
output_mse=[output_mse (mse_error) ];
end
output1=[output1 mean(output) ];
output_mse1=[output_mse1 mean(output_mse) ];
end
33
% applied to the GUI before loadingscreen_OpeningFunction gets called. An
% unrecognized property name or invalid value makes property application
% stop. All inputs are passed to loadingscreen_OpeningFcn via varargin.
%
% *See GUI Options on GUIDE's Tools menu. Choose "GUI allows only one
% instance to run (singleton)".
%
% See also: GUIDE, GUIDATA, GUIHANDLES
if nargout
[varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
gui_mainfcn(gui_State, varargin{:});
end
% End initialization code - DO NOT EDIT
34
% UIWAIT makes loadingscreen wait for user response (see UIRESUME)
% uiwait(handles.figure1);
axes(handles.axes4)
matlabImage = imread('logo.jpg');
image(matlabImage)
axis off
axis image
% --- Outputs from this function are returned to the command line.
function varargout = loadingscreen_OutputFcn(hObject, eventdata, handles)
% varargout cell array for returning output args (see VARARGOUT);
% hObject handle to figure
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
create_network
% --- Executes during object creation, after setting all properties.
function uipanel3_CreateFcn(hObject, eventdata, handles)
% hObject handle to uipanel3 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles empty - handles not created until after all CreateFcns called
35
% MAINGUI M-file for maingui.fig
% MAINGUI, by itself, creates a new MAINGUI or raises the existing
% singleton*.
%
% H = MAINGUI returns the handle to a new MAINGUI or the handle to
% the existing singleton*.
%
% MAINGUI('CALLBACK',hObject,eventData,handles,...) calls the local
% function named CALLBACK in MAINGUI.M with the given input arguments.
%
% MAINGUI('Property','Value',...) creates a new MAINGUI or raises the
% existing singleton*. Starting from the left, property value pairs are
% applied to the GUI before maingui_OpeningFunction gets called. An
% unrecognized property name or invalid value makes property application
% stop. All inputs are passed to maingui_OpeningFcn via varargin.
%
% *See GUI Options on GUIDE's Tools menu. Choose "GUI allows only one
% instance to run (singleton)".
%
% See also: GUIDE, GUIDATA, GUIHANDLES
function [M]=normalize_data(M)
%fields in Dataset as per column
% sex Age Cholestrol BP chest_pain ECG heart_rate Physical_activity Diebetis Slope
num_vessels Thal Result
[r c]=size(M);
for ii=1:r
% Normalize age
x=M(ii,1);
if x >= 20 && x <= 34
Age=-2;
elseif x >= 35 && x <= 50
Age=-1;
elseif x>= 51 && x <= 60
Age=0;
36
Age=1;
elseif x >= 80 ;
Age=2;
end
M(ii,1)=Age;
% normalize Blood Pressure
x=M(ii,4);
if x < 120
BP=-1;
elseif x > 120 && x <= 139
BP=0;
elseif x >= 140
BP=1;
end
M(ii,4)=BP;
x=M(ii,5);
if x < 200
Cholestrol=-1;
elseif x > 200 && x <= 239
Cholestrol=0;
elseif x >= 240
Cholestrol=1;
end
M(ii,5)=Cholestrol;
% Normalize Heart Rate
x=M(ii,8);
if x <= 100
HtRate=-1;
elseif x > 100 && x <= 150
HtRate=0;
elseif x >= 151
HtRate=1;
end
M(ii,8)=HtRate;
end
count=1;
row_num=[];
37
while count ~= length(M)
if M(count,9)==0
row_num=[row_num count];
end
count=count+1;
end
M(row_num(1,:),:)=[]; %consdier only dataset which has diabetes=true and delete other data
with sugar=0;
38
'gui_LayoutFcn', [] , ...
'gui_Callback', []);
if nargin && ischar(varargin{1})
gui_State.gui_Callback = str2func(varargin{1});
end
if nargout
[varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
gui_mainfcn(gui_State, varargin{:});
end
% End initialization code - DO NOT EDIT
% --- Outputs from this function are returned to the command line.
function varargout = plot_error_OutputFcn(hObject, eventdata, handles)
% varargout cell array for returning output args (see VARARGOUT);
% hObject handle to figure
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
39
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
h=findobj('Type','axes','Tag','axes1');
handles.axes1=h;
axes(h)
plot(mse_error1,'-k.');
hold on
plot(mse_error2,'-r*');
legend('error after ga','error before ga')
h=findobj('Type','axes','Tag','axes2');
handles.axes2=h;
axes(h)
generations=[10 20 30 40 50 60 70 80 90 100 ];
mean_mse_generations=[0.0002,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0004,0.0004,0.0
004;]
h=findobj('Type','axes','Tag','axes2');
axes(h)
plot(mean_mse_generations,generations,'-k.');
title('Variance in Generations');
ylabel('Generations')
xlabel('MEAN MSE')
guidata(hObject, handles);
40
% --- Executes during object creation, after setting all properties.
function axes1_CreateFcn(hObject, eventdata, handles)
% hObject handle to axes1 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles empty - handles not created until after all CreateFcns called
41
if nargin && ischar(varargin{1})
gui_State.gui_Callback = str2func(varargin{1});
end
if nargout
[varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
gui_mainfcn(gui_State, varargin{:});
end
% End initialization code - DO NOT EDIT
% --- Outputs from this function are returned to the command line.
function varargout = train_network_OutputFcn(hObject, eventdata, handles)
% varargout cell array for returning output args (see VARARGOUT);
% hObject handle to figure
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
42
maingui
% --- Executes on button press in pushbutton4.
function pushbutton4_Callback(hObject, eventdata, handles)
% hObject handle to pushbutton4 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles structure with handles and user data (see GUIDATA)
maingui;
% for ii=1:50
% Stockdata(ii).id=ii;
% % Stockdata(ii).date=M(ii,1);
% Stockdata(ii).stock_price_open=M(ii,2);
% Stockdata(ii).stock_price_high=M(ii,3);
% Stockdata(ii).stock_price_low=M(ii,4);
% Stockdata(ii).stock_price_close=M(ii,5);
% % Stockdata(ii).stock_volume=M(ii,6);
% % Stockdata(ii).stock_price_adj_close=M(ii,7);
% end
M=normalize_data(M);
save M;
for ii=1:length(M)
data_to_train(ii,1)=M(ii,1);
data_to_train(ii,2)=M(ii,2);
data_to_train(ii,3)=M(ii,3);
data_to_train(ii,4)=M(ii,4);
data_to_train(ii,5)=M(ii,5);
data_to_train(ii,6)=M(ii,6);
data_to_train(ii,7)=M(ii,7);
data_to_train(ii,8)=M(ii,8);
data_to_train(ii,9)=M(ii,9);
data_to_train(ii,10)=M(ii,10);
data_to_train(ii,11)=M(ii,11);
data_to_train(ii,12)=M(ii,12);
data_to_train(ii,13)=M(ii,13);
43
end
b=data_to_train(1:length(M),1:12);
Input=b';
Output=data_to_train(1:length(M),13)';
for ii=1:1:12
val= findminmax(Input(ii,1:length(M)));
RS(ii,1)=val(1,1);
RS(ii,2)=val(1,2);
end
ep = load('myfile.mat');
epchs = ep.epochs;
trn_fn = ep.transfer_fn;
if ep.layer == 3
S=[12 10 1];
[trainV, valV, testV] = dividevec(Input, Output, 0.10, 0.10);
net=newff(RS,S,{trn_fn,trn_fn, trn_fn});
else
S=[12 1];
[trainV, valV, testV] = dividevec(Input, Output, 0.10);
net=newff(RS,S,{trn_fn,trn_fn});
end
net.trainParam.epochs = epchs;
[net,tr,Y,E,Pf,Af]=train(net,Input,Output);
save net1 net
mean_error=tr.perf;
No_of_epoch=tr.epoch;
h=findobj('Type','axes','Tag','axes6');
axes(h)
% axes(handles)
for ii=2:5:length(No_of_epoch)
epoch(ii)=No_of_epoch(ii);
MSE(ii)=mean_error(ii);
end
44
plot(epoch,MSE,'-r*')
% test_data
guidata(hObject, handles);
% --- Executes during object creation, after setting all properties.
function axes3_CreateFcn(hObject, eventdata, handles)
% hObject handle to axes3 (see GCBO)
% eventdata reserved - to be defined in a future version of MATLAB
% handles empty - handles not created until after all CreateFcns called
clc
clear all
close all
load M;
b=M(1:length(M),1:12);
I=b';
O=M(1:length(M),13)';
a=b';
load net
output=[];
input=[20 40 60 80 100 ];
output_mse=[];
for ii=1:length(M)
Opt(ii)=(sim(net,I(1:12,ii)));
end
for kk=1:5
net=apply_ga_variation(net,Opt,O,input(kk));
net.trainParam.epochs = 50;
[net tr]=train(net,I,O);
for ii=1:length(M)
Optga(ii)=(sim(net,I(1:12,ii)));
end
mse_error=calc_mse(Optga,O);
output=[output mean(Optga)];
output_mse=[output_mse mean(mse_error) ];
end
45
5.5 Screenshot of the System
46
47
48
49
Chapter 6
Conclusions
Data mining techniques and methods applied in patient medical dataset has resulted in
innovations, standards and decision support system that have significant success in improving
the health of patients and the overall quality of medical services. But we still need systems which
could predict heart diseases in early stages. In this study, a new hybrid model of Neural
Networks and Genetic Algorithm to optimize the connection weights of ANN so as to improve
the performance of the Artificial Neural Network. The system uses identified important risk
factors for the prediction of heart disease and it does not require costly medical tests. Risk factors
data of 50 patients was collected and the results obtained showed training accuracy of 96.2% and
a validation accuracy of 89% as specified in TABLE IV. With using hybrid data mining
techniques we could design more accurate clinical decision support systems for diagnosis of
diseases.
We can build an intelligent system which could predict the disease using risk factors hence
saving cost and time to undergo medical tests and checkups and ensuring that the patient can
monitor his health on his own and plan preventive measures and treatment at the early stages of
the diseases.
50
References
[1]Syed Umar Amin, Kavita Agarwal and Dr. Rizwan Beg, “Genetic Neural Network Based
Data Mining in Prediction of Heart Disease Using Risk Factor”, Proceeding of IEEE Conference
on Information and Communication Technologies(ICT), pp. 1227-1231, April 2013.
[2] Awang, R. & Palaniappan, S., “Intelligent heart disease predication system using data mining
technique”. IJCSNS International Journal of Computer Science and Network Security. Vol. 8,
No. 8, 2008.
[3] Patil, S. & Kumaraswamy, Y., “Intelligent and effective heart attack prediction system using
data mining and artificial neural network, “European Journal of Science Research. Vol 31, No.
4.2009.
[4] P.K. Anooj, “Clinical Decision Support System : Risk Level Prediction of Heart Disease
Using Decision Tree Fuzzy Rules” , IJRRCS, Vol. 3, No. 3, pp. 1659-1667, June 2012.
[6] Ms. Preeti Gupta, Ms. Punam Bajaj, “Heart Disease Diagnosis System Based On Data
Mining And Neural Network”, International Journal Of Engineering Sciences & research
Technology ISSN: 2277-9655 Scientific Journal Impact Factor: 3.449 (ISRA), Impact Factor:
1.852
[7]Murat Karabatak, M. Cevdet Ince,"New feature selection method based on association rules
for diagnosis of erythemato-squamous diseases", ELSEVIER Expert Systems with Applications
36 (2009) 12500–12505.
[8]Sean N. Ghazavi, Thunshun W. Liao, “Data mining by fuzzy modeling with selected
features", ELSEVIER Artificial Intelligence in Medicine (2008) 43, pp.195—206.
[9] K.Priya, 2 T.Manju,3R.Chitra, “ Predictive Model Of Stroke Disease Using Hybrid Neuro-
Genetic Approach”, International Journal Of Engineering And Computer Science :2319-7242
Volume 2 Issue 3 March 2013 Page No. 781-788.
51
Appendix A
Abbreviation and symbols
Appendix B
Definitions
1. Data Analytics: Data analytics (DA) is the science of examining raw data withthe purpose
of drawing conclusions about that information.
3. Bioinformatics: The science of collecting and analyzing complex biological data such as
genetic codes
52