
A SURVEY ON PREDICTION IN HEALTH DATABASE

J.SUMITHA

Abstract:

Health can be affected by environmental factors such as pollution, by behavioral factors
such as smoking and drinking, by diet such as fast food, and by medicines such as cough
pills, headache pills and painkillers, which in turn can have adverse effects on the body.
This paper analyzes how health and the factors affecting it are predicted using Bayesian
networks. The study of health databases and the factors affecting health may raise
awareness of those factors in the future. The surveyed work uses various classification
techniques together with prediction to classify health problems and the factors affecting
them. The paper also surveys recently used methods for determining health and the factors
affecting it.

Keywords: Bayesian network, C4.5, CART, Drug safety, Health, Induction learning
algorithm, Prediction, Renal transplantation center.

1. INTRODUCTION

Health and the factors affecting it can be analyzed by many algorithms, including Bayesian
networks, which are used in many applications. This paper surveys a number of studies that
apply Bayesian networks together with prediction to health and its affecting factors. In
one study, an automatic learning method for Bayesian networks combined with decision trees
is used to predict the distribution of cancer, cardiological disease and dengue on the
basis of their characteristics [1]. Health can also be affected by changes occurring in
coastal and river areas. An increase in nitrogen inputs to the river flow promotes toxic
micro-organisms on the one hand and raises the carbon content on the other. This reduces
the oxygen content of the water, which in turn causes disease and even death in fish.
Humans who catch and eat the diseased fish are also affected. The number of diseased fish
and fish kills caused by these rich nitrogen inputs to the river flow within a particular
year can be predicted [3]. The coastal areas most affected by these chemical compositions
are classified and predicted using recent methodologies [4]. The probability that patients
with certain diseases are registered at a transplantation center is classified and
predicted using prediction algorithms; a combination of decision trees and a Bayesian
network is used to predict the registration of patients at a transplantation center [5].

In the recent decade, a new methodology has been used to detect existing adverse drug
reaction problems in China: an automatic algorithm incorporated in Bayesian networks
predicts the adverse events for the drugs concerned [7]. Statistical methods combined with
a Bayesian approach have recently been used to detect adverse drug events as early as
possible. The self-controlled case series (SCCS) method, as an integral part of a Bayesian
network, is used to estimate the relative incidence of adverse events following a vaccine,
for drug safety in pharmacovigilance [7]. Patients taking multiple drugs are considered,
and the corresponding drug reactions are recorded in order to make predictions for a
particular drug [7].
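
Signal detection of the kind described above is commonly based on disproportionality in a
2x2 table of spontaneous reports; this survey names the proportional reporting ratio (PRR)
and the reporting odds ratio (ROR) as the measures used in [7]. The sketch below computes
both from their standard definitions. The function name, the cell labels a, b, c, d and
the counts are assumptions made for illustration; they are not taken from [7].

```python
def disproportionality(a, b, c, d):
    """Return (PRR, ROR) for a 2x2 table of spontaneous reports.

    a: reports with the drug of interest and the event of interest
    b: reports with the drug of interest and any other event
    c: reports with any other drug and the event of interest
    d: reports with any other drug and any other event
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("this sketch requires all four cells to be positive")
    prr = (a / (a + b)) / (c / (c + d))  # ratio of event proportions
    ror = (a * d) / (b * c)              # ratio of reporting odds
    return prr, ror


# Hypothetical counts, for illustration only.
prr, ror = disproportionality(a=20, b=80, c=100, d=9800)
print(f"PRR = {prr:.2f}, ROR = {ror:.2f}")
```

Both measures compare reporting rates inside the database only, which is consistent with
the limitation noted for spontaneous reports: the number of people actually taking the
drug is unknown.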

2. HEALTH DATABASE

In this survey, many health databases are reviewed. The cancer database can be predicted
on the basis of tumor characteristics [1], the cardiological database on the basis of the
respective symptoms, and the dengue database on the basis of environmental
characteristics. A combination of the Bayesian and C4.5 algorithms is applied to predict
the distribution of these databases in a locality or an area.

The renal transplantation database is divided into two sets: a training set (90%) and a
validation set (10%) [5]. A chi-square test is made to check the characteristics of the
two data sets. The predictive performance of the CART and Bayesian algorithms is evaluated
on the validation set. The advantage reported for this database is that the McNemar test
gives more accurate performance when compared to other tests.

The Spontaneous Reporting System (SRS) is a health database that comprises suspected
adverse drug reactions along with a remedy [7]. Individual records in the SRS database
consist of age, sex, date of report, drugs and their adverse events. The advantage of the
SRS database is that pharmaceutical companies, health authorities and drug monitoring
centers use it for global safety screening purposes. The drawback of the SRS database is
that it contains only the reports of adverse effects and fails to determine the number of
individuals consuming a particular drug.

3. ALGORITHMS

Many algorithms are surveyed for predicting the performance of health and its affecting
factors, and many works have been done to classify health and its affecting factors. The
classification algorithms taken here are CART decision trees, C4.5 and Bayesian networks.
Some of them are considered as prediction tools for determining the predictive performance
on the health database. Apart from the classification algorithms, the other methodologies
used in this survey are induction learning techniques [1] and the self-controlled case
series method (SCCS) [8].

3.1 C4.5

C4.5 is one of the simplest approaches among the classification methods and is used in a
number of induction learning methods. The C4.5 analysis consists of four steps. They are
as follows:

Step 1: The health database is divided into i) a control database and ii) a validation
database.
Step 2: The C4.5 algorithm is applied to the control database.
Step 3: The same process is repeated for 10%, 20%, ... to 100% of the control database
(i.e., repeating 30 times for each iteration).
Step 4: The predictive performance of both C4.5 and Bayesian networks is evaluated on the
validation database.

The advantage of C4.5 in this survey is that it gives more accurate predictive performance
for the distribution of the cancer, cardiology and dengue databases in a locality or in an
area [1].
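
The four-step evaluation described for C4.5 can be sketched as a learning curve: hold out
a validation set, train on growing fractions of the control set, and score each model on
the validation set. This is a minimal sketch under stated assumptions, not the procedure
of [1]: the learner is a stand-in one-attribute rule rather than C4.5, the validation
share of 10% is an assumed default, and all function names and the toy data are
hypothetical.

```python
import random
from collections import Counter, defaultdict


def train_one_rule(rows, labels):
    """Stand-in learner (NOT C4.5): keep the single attribute whose
    value -> majority-label table makes the fewest training errors."""
    default = Counter(labels).most_common(1)[0][0]
    best = None
    for j in range(len(rows[0])):
        table = defaultdict(Counter)
        for row, y in zip(rows, labels):
            table[row[j]][y] += 1
        rule = {value: counts.most_common(1)[0][0] for value, counts in table.items()}
        errors = sum(rule[row[j]] != y for row, y in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, j, rule)
    _, attribute, rule = best
    # Unseen attribute values fall back to the overall majority label.
    return lambda row: rule.get(row[attribute], default)


def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def learning_curve(rows, labels, validation_share=0.1, repeats=30, seed=0):
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    n_val = max(1, int(len(indices) * validation_share))
    val, control = indices[:n_val], indices[n_val:]       # Step 1: split
    val_rows = [rows[i] for i in val]
    val_labels = [labels[i] for i in val]
    curve = {}
    for pct in range(10, 101, 10):                        # Step 3: 10% ... 100%
        size = max(1, len(control) * pct // 100)
        scores = []
        for _ in range(repeats):                          # 30 repetitions per fraction
            sample = rng.sample(control, size)
            model = train_one_rule([rows[i] for i in sample],
                                   [labels[i] for i in sample])  # Step 2: train
            scores.append(accuracy(model, val_rows, val_labels))  # Step 4: evaluate
        curve[pct] = sum(scores) / repeats
    return curve


# Synthetic toy data: attribute 0 determines the label, attribute 1 is noise.
toy_rng = random.Random(1)
rows = [(toy_rng.choice("ab"), toy_rng.choice("xyz")) for _ in range(200)]
labels = ["sick" if r[0] == "a" else "healthy" for r in rows]
curve = learning_curve(rows, labels)
for pct, acc in curve.items():
    print(f"{pct:3d}% of control set -> mean validation accuracy {acc:.3f}")
```

To compare two learners as in Step 4, the same loop would be run once per learner on the
same split, so that both are scored on an identical validation set.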

The drawback of C4.5 is that it improves classification precision by only 10% compared
with the other network. Only a small difference is observed.

3.2 CART

CART is one of the most popular classification methods in many application fields,
especially in medicine. The main advantages that make CART so popular are its simplicity
and its interpretability. The CART method uses binary recursive partitioning for analysis.
CART analysis consists of five steps in this survey [5]. They are as follows:

Step 1: The dataset is first split into a number of homogeneous classes.
Step 2: The root node is then split into a number of child nodes. A split criterion is
used to split the root node into child nodes and to select the best classifier sample
node.
Step 3: The splitting continues on the child nodes until the stopping criterion is
applied.
Step 4: The GINI split criterion is used until the stopping criterion allows a maximum of
5 tree nodes.
Step 5: Then, the predictive performance is predicted and measured. The predictive
performance is measured by the McNemar test and is evaluated on the validation sets.

The drawback of this algorithm in this survey is that the complexity underlying CART is
easily predicted by Bayesian networks. These complexities are predicted by selecting some
discriminating factors for performance evaluation.

4. RELATED WORKS

Pablo Felgaer [1] gave a combination of an automatic learning method for Bayesian networks
with decision trees for predicting the distribution of cancer, dengue and cardiological
diseases in a locality or in an area. The predictive performance is based upon the
characteristics of the corresponding diseases. In this work, he proved that C4.5 gives
better prediction when compared to the Bayesian networks and also that C4.5 is a better
suited algorithm for inductive machine learning methods.

Mark E. Borsuk [3], in his work, proposed a Bayesian network for determining how an
abundance of nitrogen inputs makes the oxygen supply in water deficient and causes fish to
die or become diseased. Humans who take these fish as food are also affected. In his work,
he analyzes the factors affecting a river, which then become harmful for human health.

Sahar Bayat [5], in recent work, proposed a probability of patient registration with
certain diseases in a transplantation center. For this, the diseases are categorized in
two models: one is the Bayesian network model and the other is the CART model. In this
work, a strong discriminating factor was selected for predicting the performance.
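
The survey states that the predictive performance of the two models in [5] is measured by
the McNemar test on the validation set. The sketch below shows the standard form of that
test for two classifiers scored on the same cases, using the chi-square approximation with
continuity correction. The function name and the example predictions are hypothetical; [5]
may have used a different variant of the test.

```python
import math


def mcnemar(y_true, pred_a, pred_b):
    """McNemar test (chi-square approximation with continuity correction)
    for two classifiers evaluated on the same validation cases."""
    cases = list(zip(y_true, pred_a, pred_b))
    b = sum(pa == y and pb != y for y, pa, pb in cases)  # A right, B wrong
    c = sum(pa != y and pb == y for y, pa, pb in cases)  # A wrong, B right
    if b + c == 0:
        return b, c, 0.0, 1.0  # the classifiers never disagree
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return b, c, stat, p_value


# Hypothetical validation labels and predictions, for illustration only.
y_true = [1] * 20
pred_a = [1] * 16 + [0] * 4
pred_b = [1] * 8 + [0] * 8 + [1] * 2 + [0] * 2
b, c, stat, p_value = mcnemar(y_true, pred_a, pred_b)
print(f"b = {b}, c = {c}, statistic = {stat:.2f}, p = {p_value:.3f}")
```

Only the cases on which the two classifiers disagree enter the statistic. When the number
of disagreements is small, an exact binomial version of the test is usually preferred
over this approximation.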

[Figure: Bayesian network model. This diagram has been adopted from [5].]

Chen Wen-ge [7], in his recent work, formulated a method for detecting adverse drug
reactions. For this, he used the SRS database for detecting drugs and their corresponding
side effects. The proportional ADR reporting ratio (PRR) and the reporting odds ratio
(ROR) are used for predicting the performance measure and for observing the data. The
BCPNN algorithm is used to detect the adverse events earlier. The disproportionality
concept is used behind it. He also proposed therapy along with the drugs and their side
effects.

David Madigan [7], in his review, used statistical methods to detect adverse drug events
earlier. A Bayesian approach is used for handling the high dimensionality of the data. He
surveyed the side effects in people taking multiple drugs at the same time and in people
frequently taking a single drug at a high dose [8].

Farrington [8] proposed the self-controlled case series method to estimate the incidence
of side effects occurring in humans. The relative incidence is predicted, and its
prediction is for drug safety [8].

5. DISCUSSIONS

In this survey, the Bayesian networks used for health affecting factors are reviewed. The
prediction used along with these Bayesian networks is discussed in terms of the
significant sectors in which it is used. In this survey, it is made clear that the
Bayesian network used for predicting health and its affecting factors is found to be a
better network than the other networks that can be used for prediction, for instance when
predicting the distribution of diseases [1] and when predicting the composition of the
rich nitrogen supply in the river flow [3]. The distribution of the disease is accurately
estimated by the Bayesian networks and, in the same way, an accurate result is derived for
the abundance of nitrogen input in the rivers [3]. An automatic algorithm is used to
predict the result. The BCPNN algorithm is used for adverse signal detection in ADRs [7].
The self-controlled case series method, which is used for drug safety in
pharmacovigilance, is also predicted using Bayesian networks [8].

Analyzing the existing problems, it is suggested that the recently used advanced Markov
model can show better prediction than the Bayesian networks. The Markov model is a
probabilistic model which is one of the advanced models among the classification methods.
Since it is an advanced, recently used probabilistic algorithm, it shows better prediction
in various applications, especially in medicine. It is also suggested that the Markov
model, whether it is applied for predicting the registration of patients in a
transplantation center [5] or for detecting the ADRs of the corresponding drugs [7], gives
more effective results than the Bayesian network. The paper suggests that the Markov model
can be implemented along with the other Bayesian networks for better prediction.

6. CONCLUSION

In this survey, the Markov model is recognized as a more advanced and more effective model
for recent times and is suggested as highly effective for prediction in many application
fields, especially in medicine. Through this survey, it is concluded that the Markov model
has an effective future in predicting medicines and their effects on health. In future,
the Bayesian networks are expected, with more advancement, to be used for estimating many
existing problems, not only in health areas but also in various other areas.

REFERENCES

[1] Pablo Felgaer, Paola Britos, Ramon Garcia Martinez. Prediction in health domain using
Bayesian networks optimization based on induction learning techniques. International
Journal of Modern Physics C, Vol. 17, No. 3 (2006) 447–455. World Scientific Publishing
Company.
[2] K. J. Ezawa and T. Schuermann. Fraud/uncollectible debt detection using a Bayesian
network based learning system: A rare binary outcome with mixed data structures. Proc.
Conf. Uncertainty Artif. Intell. (San Francisco, CA, 1995), p. 157.
[3] Mark E. Borsuk, Craig A. Stow, and Kenneth H. Reckhow (2002). Integrative
environmental prediction using Bayesian networks: A synthesis of models describing
estuarine eutrophication. Nicholas School of the Environment and Earth Sciences, Duke
University, Durham, North Carolina, USA.
[4] Borsuk, M. E., L. A. Eby, and L. B. Crowder. 2002b. Probabilistic prediction of fish
health and fish kills in the Neuse River estuary using the elicited judgment of scientific
experts. In review.
[5] Sahar Bayat, Marc Cuggia, Delphine Rossille, Michèle Kessler, Luc Frimat. Comparison
of Bayesian Network and Decision Tree Methods for Predicting Access to the Renal
Transplant Waiting List. Medical Informatics in a United and Healthy Europe, K.-P.
Adlassnig et al. (Eds.), IOS Press, 2009. © 2009 European Federation for Medical
Informatics.
[6] Bayat, S., et al. (2008). Modeling access to renal transplantation waiting list in a
French healthcare network using Bayesian method. Studies in Health Technology and
Informatics 136:605–610.
[7] Wen-ge, C., and Jian-xiong, D. A Study on Signal Detection and Automatic Warning
Algorithm for Adverse Drug Reaction. Computer Science and Software Engineering, 2008
International Conference on, 12-14 Dec. 2008, page(s): 202–205.
[8] Farrington, C. P. (1995). Relative incidence estimation from case series for vaccine
safety evaluation. Biometrics 51, 228-235.