

Heart Disease Prediction System
A major challenge facing healthcare organizations (hospitals, medical centers) is
the provision of quality services at affordable costs. Quality service implies
diagnosing patients correctly and administering treatments that are effective.
Poor clinical decisions can lead to disastrous consequences and are therefore
unacceptable. Hospitals must also minimize the cost of clinical tests. They can
achieve these results by employing appropriate computer-based information and/or
decision support systems.
Most hospitals today employ some sort of hospital information system to manage
their healthcare or patient data. These systems are designed to support patient
billing, inventory management and generation of simple statistics. Some
hospitals use decision support systems, but they are largely limited. Clinical
decisions are often made based on doctors’ intuition and experience rather than
on the knowledge-rich data hidden in the database. This practice leads to
unwanted biases, errors and excessive medical costs, which affect the quality of
service provided to patients.
Existing system
• Clinical decisions are often made based on doctors’ intuition and experience
rather than on the knowledge-rich data hidden in the database.
• This practice leads to unwanted biases, errors and excessive medical costs,
which affect the quality of service provided to patients.
• There are many ways that a medical misdiagnosis can present itself. Whether a
doctor or hospital staff is at fault, a misdiagnosis of a serious illness can
have extreme and harmful effects.
• The National Patient Safety Foundation cites that 42% of medical patients feel
they have experienced a medical error or missed diagnosis. Patient safety is
sometimes negligently given the back seat to other concerns, such as the cost of
medical tests, drugs, and operations.
• Medical misdiagnoses are a serious risk to our healthcare profession. If they
continue, people will fear going to the hospital for treatment. We can put an
end to medical misdiagnosis by informing the public and filing claims and suits
against the medical practitioners at fault.

Mail us : info@ocularsystems.
Mobile No : 7385350430

Proposed Systems
• Relying on intuition and experience alone leads to unwanted biases, errors and
excessive medical costs, which affect the quality of service provided to
patients.
• Thus we proposed that integration of clinical decision support with
computer-based patient records could reduce medical errors, enhance patient
safety, decrease unwanted practice variation, and improve patient outcome.
• This suggestion is promising as data modeling and analysis tools, e.g., data
mining, have the potential to generate a knowledge-rich environment which can
help to significantly improve the quality of clinical decisions.
• The main objective of this research is to develop a prototype Intelligent Heart
Disease Prediction System (IHDPS) using three data mining modeling techniques,
namely, Decision Trees, Naïve Bayes and Neural Network.
• By providing effective treatments, it also helps to reduce treatment costs. It
also aims to enhance visualization and ease of interpretation.

Analyzing the Data set:
A data set (or dataset) is a collection of data, usually presented in tabular
form. Each column represents a particular variable. Each row corresponds to a
given member of the data set in question. It lists values for each of the
variables, such as height and weight of an object or values of random numbers.
Each value is known as a datum.

The data set may comprise data for one or more members, corresponding to the
number of rows. The values may be numbers, such as real numbers or integers, for
example representing a person's height in centimeters, but may also be nominal
data (i.e., not consisting of numerical values), for example representing a
person's ethnicity. More generally, values may be of any of the kinds described
as a level of measurement. For each variable, the values will normally all be of
the same kind. However, there may also be "missing values", which need to be
indicated in some way.

In our project, the data set is read from a .dat file: our file reader program
loads the records from it as input to the Naïve Bayes based mining process. A
total of 909 records with 15 medical attributes (factors) were obtained from the
Heart Disease database. The records were split equally into two datasets:
training dataset (455 records) and testing dataset (454 records). To avoid bias,
the records for each set were selected randomly. It is assumed that problems
such as missing data, inconsistent data, and duplicate data have all been
resolved. The attribute “Diagnosis” was identified as the predictable attribute
with value “1” for patients with heart disease and value “0” for patients with
no heart disease. The attribute “PatientID” was used as the key; the rest are
input attributes.

Naïve Bayes Implementation in Mining:
I recommend using Probability For Data Mining for a more in-depth introduction
to density estimation and general use of Bayes classifiers, with Naive Bayes
classifiers as a special case. But if you just want the executive summary bottom
line on learning and using Naive Bayes classifiers on categorical attributes,
then these are the slides for you.
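The random, near-equal split into training and testing sets described in the
data set section above can be sketched as follows. This is only an illustration:
the function name split_records, the fixed seed, and the use of plain PatientID
numbers in place of full records are assumptions, not the project's actual file
reader.

```python
import random


def split_records(records, seed=0):
    """Shuffle the records and split them into two near-equal halves."""
    shuffled = list(records)               # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)  # random selection avoids ordering bias
    half = (len(shuffled) + 1) // 2        # the training set gets the extra record
    return shuffled[:half], shuffled[half:]


# Hypothetical stand-in for the real data: 455 + 454 = 909 PatientID values.
patient_ids = list(range(1, 910))
training, testing = split_records(patient_ids)
print(len(training), len(testing))  # 455 454
```

Fixing the seed makes the split reproducible between runs; a different seed
gives a different but equally valid random split.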

Bayes' Theorem finds the probability of an event occurring given the probability
of another event that has already occurred. If B represents the dependent event
and A represents the prior event, Bayes' theorem can be stated as follows.

Bayes' Theorem: Prob(B given A) = Prob(A and B)/Prob(A)

To calculate the probability of B given A, the algorithm counts the number of
cases where A and B occur together and divides it by the number of cases where A
occurs (with or without B).

Naïve Bayes can also be applied to data with some numerical attributes, using
the Laplace correction, to predict the class of a new example.

Designing the Questionnaire:
Questionnaires have advantages over some other ways of collecting medical
symptoms: they are cheap, do not require as much effort from the questioner as
verbal or telephone surveys, and often have standardized answers that make it
simple to compile data. However, such standardized answers may frustrate users.
Questionnaires are also sharply limited by the fact that respondents must be
able to read the questions and respond to them.

Our questionnaire is based on the attributes given in the data set, so it
contains:

Input attributes
1. Sex (value 1: male; value 0: female)
2. Chest Pain Type (value 1: typical angina; value 2: atypical angina; value 3:
non-anginal pain; value 4: asymptomatic)
3. Fasting Blood Sugar (value 1: > 120 mg/dl; value 0: < 120 mg/dl)
4. Restecg – resting electrocardiographic results (value 0: normal; value 1:
having ST-T wave abnormality; value 2: showing probable or definite left
ventricular hypertrophy)
5. Exang – exercise-induced angina (value 1: yes; value 0: no)
6. Slope – the slope of the peak exercise ST segment (value 1: upsloping; value
2: flat; value 3: downsloping)
7. CA – number of major vessels colored by fluoroscopy (value 0 – 3)
8. Thal (value 3: normal; value 6: fixed defect; value 7: reversible defect)
9. Trest Blood Pressure (mm Hg on admission to the hospital)
10. Serum Cholesterol (mg/dl)
11. Thalach – maximum heart rate achieved
12. Oldpeak – ST depression induced by exercise relative to rest
13. Age in years
14. Weight in kg
15. Height in cm
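Since each categorical question accepts only a fixed set of codes, a response
can be checked before it is passed to the mining step. The sketch below shows
one possible check. The field names (for example chest_pain_type and weight_kg)
and the function validate_answers are hypothetical labels chosen for this
illustration; only the allowed codes are taken from the attribute list above.

```python
# Allowed codes for the categorical questions, taken from the attribute list.
ALLOWED_CODES = {
    "sex": {0, 1},
    "chest_pain_type": {1, 2, 3, 4},
    "fasting_blood_sugar": {0, 1},
    "restecg": {0, 1, 2},
    "exang": {0, 1},
    "slope": {1, 2, 3},
    "ca": {0, 1, 2, 3},
    "thal": {3, 6, 7},
}

# Questions answered with a measured number rather than a code.
NUMERIC_FIELDS = [
    "trest_blood_pressure", "serum_cholesterol", "thalach",
    "oldpeak", "age", "weight_kg", "height_cm",
]


def validate_answers(answers):
    """Return a list of problems found in one questionnaire response."""
    problems = []
    for field, allowed in ALLOWED_CODES.items():
        if field not in answers:
            problems.append(f"missing: {field}")
        elif answers[field] not in allowed:
            problems.append(f"invalid code for {field}: {answers[field]!r}")
    for field in NUMERIC_FIELDS:
        value = answers.get(field)
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"missing or non-numeric: {field}")
    return problems


# Hypothetical response, for illustration only.
example = {
    "sex": 1, "chest_pain_type": 4, "fasting_blood_sugar": 0, "restecg": 0,
    "exang": 1, "slope": 2, "ca": 0, "thal": 3,
    "trest_blood_pressure": 130, "serum_cholesterol": 250, "thalach": 150,
    "oldpeak": 1.2, "age": 54, "weight_kg": 72, "height_cm": 170,
}
print(validate_answers(example))                 # []
print(validate_answers({**example, "thal": 5}))  # one problem reported for thal
```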

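To make the counting rule from Bayes' Theorem concrete for coded questionnaire
answers, here is a minimal Naïve Bayes sketch with the Laplace correction. It
assumes categorical attributes only (numerical attributes would need binning or
a density estimate, which is not shown), and the six training rows are made-up
toy data, not records from the heart disease database. The function names
train_naive_bayes and predict are illustrative.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
    domains = defaultdict(set)           # attribute -> values seen in training
    for row, label in zip(rows, labels):
        for attribute, value in row.items():
            value_counts[(attribute, label)][value] += 1
            domains[attribute].add(value)
    return class_counts, value_counts, domains


def predict(model, row):
    """Return the most probable class and the normalized class probabilities."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / total  # prior Prob(class)
        for attribute, value in row.items():
            count = value_counts[(attribute, label)][value]
            # Laplace correction: add 1 to every count so that a value never
            # seen with this class does not force the whole product to zero.
            score *= (count + 1) / (n_label + len(domains[attribute]))
        scores[label] = score
    norm = sum(scores.values())
    posteriors = {label: s / norm for label, s in scores.items()}
    return max(posteriors, key=posteriors.get), posteriors


# Toy data (hypothetical): two coded attributes, Diagnosis as the label.
rows = [
    {"exang": 1, "chest_pain_type": 4},
    {"exang": 1, "chest_pain_type": 4},
    {"exang": 0, "chest_pain_type": 4},
    {"exang": 0, "chest_pain_type": 2},
    {"exang": 0, "chest_pain_type": 3},
    {"exang": 1, "chest_pain_type": 2},
]
labels = [1, 1, 1, 0, 0, 0]

model = train_naive_bayes(rows, labels)
best, posteriors = predict(model, {"exang": 1, "chest_pain_type": 4})
print(best, round(posteriors[1], 3))  # 1 0.857
```

Without the correction, chest pain type 4 never occurs with Diagnosis 0 in the
toy data, so the estimate for class 0 would be exactly zero regardless of the
other attributes.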

Conclusion
A prototype heart disease prediction system is developed using three data mining
classification modeling techniques. The system extracts hidden knowledge from a
historical heart disease database. DMX query language and functions are used to
build and access the models. The models are trained and validated against a test
dataset. Lift Chart and Classification Matrix methods are used to evaluate the
effectiveness of the models. All three models are able to extract patterns in
response to the predictable state.
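As an illustration of the Classification Matrix idea, the sketch below tallies
predicted against actual Diagnosis values on a test set. It is written in plain
Python rather than DMX, the function names are assumptions made for this
example, and the label lists are invented for demonstration; they are not
results from IHDPS.

```python
from collections import Counter


def classification_matrix(actual, predicted):
    """Count (actual, predicted) pairs for binary labels 0 and 1."""
    pairs = Counter(zip(actual, predicted))
    return {
        "true_positive": pairs[(1, 1)],   # disease present, predicted present
        "false_negative": pairs[(1, 0)],  # disease present, predicted absent
        "false_positive": pairs[(0, 1)],  # disease absent, predicted present
        "true_negative": pairs[(0, 0)],   # disease absent, predicted absent
    }


def accuracy(matrix):
    """Share of test cases whose predicted label matches the actual label."""
    total = sum(matrix.values())
    correct = matrix["true_positive"] + matrix["true_negative"]
    return correct / total if total else 0.0


# Hypothetical test-set labels, for illustration only.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

matrix = classification_matrix(actual, predicted)
print(matrix)
print(accuracy(matrix))  # 0.75
```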

The most effective model to predict patients with heart disease appears to be
Naïve Bayes, followed by Neural Network and Decision Trees. Five mining goals are
defined based on business intelligence and data exploration. The goals are
evaluated against the trained models. All three models could answer complex
queries, each with its own strength with respect to ease of model
interpretation, access to detailed information and accuracy. Naïve Bayes could
answer four out of the five goals; Decision Trees, three; and Neural Network,
two. Although not the most effective model, Decision Trees results are easier to
read and interpret. The drill through feature to access detailed patients’
profiles is only available in Decision Trees. Naïve Bayes fared better than
Decision Trees as it could identify all the significant medical predictors. The
relationship between attributes produced by Neural Network is more difficult to
understand.

IHDPS can be further enhanced and expanded. For example, it can incorporate
other medical attributes besides the 15 listed earlier. It can also incorporate
other data mining techniques, e.g., Time Series, Clustering and Association
Rules. Continuous data can also be used instead of just categorical data.
Another area is to use Text Mining to mine the vast amount of unstructured data
available in healthcare databases. Another challenge would be to integrate data
mining and text mining.