Conrady Applied Science, LLC - Bayesia's North American Partner for Sales and Consulting
Table of Contents

Introduction
About the Authors: Stefan Conrady, Lionel Jouffe
Model Application
  Interactive Inference
  Target Interpretation Tree
Summary
References
Contact Information: Conrady Applied Science, LLC; Bayesia SAS
Copyright
www.conradyscience.com | www.bayesia.com
Introduction
Data classification is one of the most common tasks in statistical analysis, and countless methods have been developed for this purpose over time. A common approach is to develop a model based on known historical data, i.e. data in which the class membership of each record is known, and to use this generalization to predict the class membership of a new set of observations. Applications of data classification permeate virtually all fields of study, including the social sciences, engineering, and biology.

In the medical field, classification problems often appear in the context of disease identification, i.e. making a diagnosis about a patient's condition. The medical sciences have, over a long history, developed a large body of knowledge linking observable symptoms to known types of illnesses. It is the physician's task to use the available medical knowledge to make inferences based on the patient's symptoms, i.e. to classify the medical condition, in order to enable appropriate treatment. Over the last two decades, so-called medical expert systems have emerged, which are meant to support physicians in their diagnostic work. Given the sheer amount of medical knowledge in existence today, it should not be surprising that significant benefits are expected from such machine-based support for medical reasoning and inference.

In this context, several papers by Wolberg, Street, Heisey and Mangasarian became much-cited examples. They proposed an automated method for the classification of Fine Needle Aspirates1 through image processing and machine learning, with the objective of achieving greater accuracy in distinguishing between malignant and benign cells for the diagnosis of breast cancer. At the time of their study, the practice of visual inspection of FNA yielded an inconsistent diagnostic accuracy. The proposed new approach would reliably increase this accuracy to over 95%.
This research was quickly translated into clinical practice and has since been applied with continued success. As part of their studies in the late 1980s and 1990s, the research team generated what became known as the Wisconsin Breast Cancer Database, which contains measurements of hundreds of FNA samples and the associated diagnoses. This database has been extensively studied, especially outside the medical field. Statisticians and computer scientists have proposed a wide range of techniques for this classification problem and have continuously raised the benchmark for predictive performance.

Our objective with this paper is to present Bayesian networks as a very practical framework for working with this kind of classification problem. Furthermore, we intend to demonstrate how quickly and simply the BayesiaLab software can create a Bayesian network model whose performance is on par with virtually all existing models. Also, while most of our previous white papers focused on marketing science applications, we hope that this case study from the medical field can demonstrate the universal applicability of Bayesian networks. We speculate that our modeling approach with Bayesian networks (as the framework) and BayesiaLab (as the software tool) achieves 99% of the performance of the best conceivable, custom-developed model, while requiring only 10% of the development time. This allows researchers to focus more on the subject matter of their studies, because they are less distracted by the technicalities of traditional statistical tools. As a result, Bayesian networks and BayesiaLab are very important innovations for accelerating research and pursuing translational science.

1 Fine needle aspiration (FNA) is a percutaneous (through the skin) procedure that uses a fine gauge needle (22 or 25 gauge) and a syringe to sample fluid from a breast cyst or remove clusters of cells from a solid mass. With FNA, the cellular material taken from the breast is usually sent to the pathology laboratory for analysis.
10. Mitoses (1 - 10)
11. Class (benign/malignant)

Attributes 2 through 9 were computed from digital images of fine needle aspirates (FNA) of breast masses. These features describe the characteristics of the cell nuclei in the image. The class membership was established via subsequent biopsies or via long-term monitoring of the tumor.
Upon exclusion of the row identifier, this database is also ideally suited for the evaluation version of BayesiaLab.
We will not go into detail here regarding the definition of the attributes and their measurement. Rather, we refer the reader to the papers referenced in the bibliography. The Wisconsin Breast Cancer Database is available to any interested researcher from the UC Irvine Machine Learning Repository.3 We use this database in its original format, without any further transformation, so our results can be directly compared to the dozens of methods that have been developed since the original study.
Notation
To clearly distinguish between natural language, software-specific functions, and study-specific variable names, the following notation is used:

- BayesiaLab-specific functions, keywords, commands, etc., are shown in bold type.
- Attribute/variable/node names are capitalized and italicized.
Data Import
Our modeling process begins with importing the database, which is available in a CSV format, into BayesiaLab. The Data Import Wizard guides the analyst through the required steps.
In the first dialogue box of the Data Import Wizard, we can click Define Typing and specify that we wish to set aside a test set from the database. Following common practice, we randomly select 20% of the 699 records as test data; the remaining 80% will serve as our training data set.
In the next step, the Data Import Wizard suggests the data type for each variable (or attribute4). Attributes 2 through 10 are identified as continuous variables, and Class is read as a discrete variable. Only the first variable, Sample code, has to be manually set to Row Identifier by the analyst, so that it is not mistaken for a continuous predictor variable.
For the import process of this study, the most important step is the selection of the discretization algorithm. As we know that the exclusive objective is classification, we choose the Decision Tree algorithm, which discretizes each variable for optimum information gain with respect to the target variable Class. Bayesian networks are entirely non-parametric, probabilistic models, and their estimation requires a certain minimum number of observations. To help with the selection of the number of discretization levels, we use the heuristic of five observations per parameter and probability cell. Given that we have a relatively small database with only 560 observations,5 three discretization intervals per variable appear to be an appropriate choice. With a higher number of discretization levels, we would most likely need more observations for reliable estimation of the parameters.
4 Attribute and variable are used interchangeably throughout the paper.
5 560 cases are in the training set (80%) and 139 are in the test set (20%).
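The Decision Tree discretization idea can be sketched outside of BayesiaLab as well: fit a shallow decision tree of a single predictor against the class label and reuse its split points as interval boundaries. The sketch below, using scikit-learn and synthetic data, illustrates the principle only; it is not a reproduction of BayesiaLab's implementation.

```python
# Sketch of supervised, tree-based discretization: a shallow decision tree
# of one predictor against the class label yields split points that maximize
# information gain. Synthetic stand-in data; not BayesiaLab's algorithm.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def tree_bins(x, y, n_intervals=3):
    """Return cut points that split x into n_intervals bins chosen
    to maximize information gain with respect to y."""
    tree = DecisionTreeClassifier(criterion="entropy",
                                  max_leaf_nodes=n_intervals,
                                  random_state=0)
    tree.fit(x.reshape(-1, 1), y)
    # Internal nodes carry the learned thresholds; leaves are marked -2.
    thresholds = tree.tree_.threshold[tree.tree_.feature >= 0]
    return sorted(thresholds)

rng = np.random.default_rng(0)
# Hypothetical stand-in for one attribute (scale 1-10) and a binary class.
x = rng.integers(1, 11, size=500).astype(float)
y = (x + rng.normal(0, 2, size=500) > 6).astype(int)

cuts = tree_bins(x, y, n_intervals=3)
print(cuts)  # two cut points -> three intervals
```

Requesting three leaves yields two thresholds, i.e. the three intervals per variable chosen above.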
Upon clicking Finish, we will immediately see a representation of the newly imported database in the form of a fully unconnected Bayesian network. Each variable is now represented as a blue node in the graph panel of BayesiaLab.
The question mark symbol, which is associated with the Bare Nuclei node, indicates that there are missing values for this variable. Hovering over the question mark with the mouse pointer while pressing the i key will show the number of missing values.
Unsupervised Learning
When working with BayesiaLab, it is recommended to always perform Unsupervised Learning first on any newly imported database. This holds even when the exclusive objective is predictive modeling, for which Supervised Learning will later be the main tool. Learning>Association Discovering>EQ initiates the EQ algorithm, which, in this case, is suitable for an initial review of the database. For larger databases with significantly more variables, Maximum Weight Spanning Tree is a very fast algorithm and can be used first instead.
The analyst can visually review the learned network structure and compare it to his or her domain knowledge. This quickly provides a sanity check for the database and the variables and it may highlight any inconsistencies.
Furthermore, one can also display the Pearson correlation between the nodes by selecting Analysis>Graphic>Pearson's Correlation and clicking the Display Arc Comments button in the toolbar.
For instance, a potentially incorrect sign of a correlation would be noticed immediately by the analyst, as the arcs are color-coded: red and blue arcs indicate negative and positive Pearson correlations, respectively.
In most cases, the Markov Blanket algorithm is a good starting point for any predictive model. This algorithm is extremely fast and can even be applied to databases with thousands of variables and millions of records, although database size is not a concern in this particular study. The Markov Blanket of a node A is the set of nodes composed of A's parents, its children, and its children's other parents (spouses).
The Markov Blanket of node A contains all the variables which, if we know their states, shield node A from the rest of the network. This means that the Markov Blanket of a node holds all the knowledge needed to predict the behavior of that node. Learning a Markov Blanket selects the relevant predictor variables, which is particularly helpful when there is a large number of variables in the database. (In fact, this can also serve as a highly efficient variable selection method in preparation for other types of modeling outside the Bayesian network framework.) Upon Markov Blanket learning for our database, the resulting Bayesian network looks as follows:
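The definition above (parents, children, and the children's other parents) can be sketched in a few lines of code. The toy DAG below is hypothetical and unrelated to the network learned in this paper.

```python
# Minimal sketch of a Markov Blanket: parents, children, and the
# children's other parents (spouses). Toy DAG for illustration only.
def markov_blanket(dag, node):
    """dag maps each node to the set of its parents."""
    parents = set(dag.get(node, ()))
    children = {n for n, ps in dag.items() if node in ps}
    spouses = {p for c in children for p in dag[c]} - {node}
    return parents | children | spouses

dag = {
    "A": set(),
    "B": {"A"},
    "C": {"A", "D"},   # D is A's spouse via their common child C
    "D": set(),
    "E": {"B"},        # E is outside A's Markov Blanket
}
print(sorted(markov_blanket(dag, "A")))  # ['B', 'C', 'D']
```

Note that E, although connected to A through B, is not in A's Markov Blanket: once B is known, E adds nothing about A.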
This suggests that Class has a direct probabilistic relationship with all variables except Marginal Adhesion and Single Epithelial Cell Size, which are disconnected. The lack of a connection with the Target indicates that these nodes are independent of Class given the nodes in the Markov Blanket. For a better visual interpretation, we apply the Force Directed Layout algorithm and obtain a view with Class at its center. Both unconnected variables are shown at the bottom of the graph.
Beyond distinguishing between predictors (connected nodes) and non-predictors (disconnected nodes), we can further examine the relationships with the Target Node Class by highlighting the Mutual Information of the arcs connecting the nodes. This function is accessible within the Validation Mode via Analysis>Graphic>Arcs' Mutual Information.
The thickness of the arcs is now proportional to the Mutual Information, i.e. the strength of the relationship between the nodes. Intuitively, Mutual Information measures the information that X and Y share: it measures how much knowing one of these variables reduces our uncertainty about the other. For example, if X and Y are independent, then knowing X does not provide any information about Y and vice versa, so their Mutual Information is zero. At the other extreme, if X and Y are identical then all information conveyed by X is shared with Y: knowing X determines the value of Y and vice versa.
Formal Definition of Mutual Information
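The formula itself appears to have been lost in this document's layout; the standard definition for two discrete variables X and Y, consistent with the intuition given above, is:

```latex
I(X;Y) \;=\; \sum_{x \in X} \sum_{y \in Y} p(x,y)\,\log \frac{p(x,y)}{p(x)\,p(y)}
\;=\; H(X) - H(X \mid Y) \;=\; H(Y) - H(Y \mid X)
```

The entropy identities make the "reduction of uncertainty" reading explicit: Mutual Information is the drop in the entropy of one variable once the other is known.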
We can also show the values of the Mutual Information on the graph by clicking Display Arc Comments. In the top part of the comment box attached to each arc, the Mutual Information of the arc is shown. Below, expressed as a percentage and highlighted in blue, we see the relative Mutual Information in the direction of the arc (parent node → child node). And, at the bottom, we have the relative Mutual Information in the opposite direction (child node → parent node).
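For illustration, Mutual Information (and a normalized, "relative" version) can be computed directly from a joint probability table. Treating relative Mutual Information as MI divided by the entropy of the receiving node is our assumption about BayesiaLab's definition, and the joint distribution below is made up.

```python
# Mutual Information from a joint probability table, plus a "relative"
# version normalized by the entropy of the target node. Whether this matches
# BayesiaLab's exact definition is an assumption; toy numbers only.
import math

def entropy(p):
    return -sum(q * math.log2(q) for q in p if q > 0)

def mutual_information(joint):
    """joint[x][y] = p(X=x, Y=y); returns I(X;Y) in bits."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi

# Strongly dependent toy joint distribution.
joint = [[0.45, 0.05],
         [0.05, 0.45]]
mi = mutual_information(joint)
py = [sum(col) for col in zip(*joint)]
relative_mi = mi / entropy(py)   # share of the child's uncertainty removed
print(round(mi, 3), round(relative_mi, 3))  # -> 0.531 0.531
```

For an independent joint distribution the function returns 0, matching the intuition stated above; for identical variables it returns the full entropy.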
Model 1 Performance
As we are not equipped with specific domain knowledge about the variables, we will not further interpret these relationships, but rather run an initial test of Network Performance: we want to know how well this Markov Blanket model can predict the states of the Class variable, i.e. benign versus malignant. This test is available via Analysis>Network Performance>Targeted.
Using our previously defined test set for validating our model, we obtain the following, rather encouraging results: Markov Blanket - Test Set
Of the 87 benign cases in the test set, 96.5% were correctly identified (true negatives), which corresponds to a false positive rate of 3.5%. More importantly, of the 52 malignant cases, 100% were identified correctly (true positives), with no false negatives. This yields a total precision of 97.8%. Analogous to the original papers on this topic, we also perform a K-Fold Cross Validation, which iteratively selects different test and training sets and, based on those, learns and tests the model. The Cross Validation can be performed via Tools>Cross Validation>Targeted.
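Working backwards from the quoted percentages, the test-set confusion matrix must contain 84 true negatives and 3 false positives among the 87 benign cases. These counts are our inference, not stated in the source; note that 84/87 is 96.55%, which rounds to 96.6% rather than the quoted 96.5%.

```python
# Reconstructing the quoted test-set metrics from the implied raw counts
# (87 benign, 52 malignant; 3 benign cases misclassified as malignant).
tn, fp = 84, 3          # 84/87 true negatives among the benign cases
tp, fn = 52, 0          # 52/52 true positives among the malignant cases
total = tn + fp + tp + fn                # 139 test cases

true_negative_rate = tn / (tn + fp)
false_positive_rate = fp / (tn + fp)
overall_precision = (tn + tp) / total    # BayesiaLab's "total precision"

print(f"{true_negative_rate:.1%} {false_positive_rate:.1%} {overall_precision:.1%}")
# -> 96.6% 3.4% 97.8%
```

The overall figure, 136 correct out of 139, reproduces the 97.8% total precision exactly.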
We choose 10 samples, i.e. 10 iterations with 69 cases as test samples and 630 training cases.
The results from the Cross Validation confirm the good performance of this model: the overall precision is 96.7%, with a false negative rate of 2.9%.
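The K-Fold procedure itself is straightforward to sketch: split the data into K folds, train on K-1 of them, test on the held-out fold, and rotate through all K. The example below uses synthetic data and a Gaussian naive Bayes classifier purely as a stand-in; BayesiaLab's learner and the actual database are not reproduced here.

```python
# Sketch of K-fold cross validation on synthetic data, with a Gaussian
# naive Bayes classifier standing in for the learned Bayesian network.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
n = 699                              # same size as the Wisconsin database
X = rng.normal(size=(n, 9))          # hypothetical continuous attributes
y = (X[:, 0] + X[:, 1] + rng.normal(0, 0.5, n) > 0).astype(int)

accuracies = []
for train_idx, test_idx in KFold(n_splits=10, shuffle=True,
                                 random_state=0).split(X):
    model = GaussianNB().fit(X[train_idx], y[train_idx])
    accuracies.append(model.score(X[test_idx], y[test_idx]))

print(f"mean accuracy over 10 folds: {np.mean(accuracies):.3f}")
```

With 699 records and 10 folds, each iteration holds out roughly 69-70 cases for testing and trains on the remainder, mirroring the split described above.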
At this point we might be tempted to conclude our analysis, as our Markov Blanket model is already performing at a level comparable to the most sophisticated (and complex) models ever developed from this database. More remarkable, though, is the minimal effort required to create our model with the Supervised Learning algorithms in BayesiaLab. Even a new user of BayesiaLab would be expected to replicate the above steps in less than 30 minutes.
As can be expected, the resulting network is somewhat more complex than the standard Markov Blanket.
The additional arcs (compared to the Markov Blanket network) are highlighted with green markers.
Model 2 Performance
With this Augmented Markov Blanket network we now proceed to performance evaluations, analogous to the Markov Blanket model. Initially, we evaluate the performance on the test set.
To complete the evaluation of this model, we will also perform a K-Fold Cross Validation. Augmented Markov Blanket - Cross-Validation
Despite the greater complexity of the model, we only see a marginal improvement in overall precision.
Structural Coefficient
Up to this point, we have not addressed the Structural Coefficient (SC), which is the only adjustable parameter for all the learning algorithms in BayesiaLab. This parameter is available to manage network complexity. By default, the Structural Coefficient is set to 1, which reliably prevents the learning algorithms from overfitting the model to the data. In studies with relatively few observations, the analyst's judgment is needed to determine a potential downward adjustment of this parameter. On the other hand, when data sets are very large, increasing the parameter to values higher than 1 helps manage network complexity. Given the fairly simple network structure of Model 1, complexity was of no concern. Model 2 is more complex, but still very manageable. The question is: could a more complex network provide greater precision without overfitting? To answer this question, we perform the Structural Coefficient Analysis, which generates several metrics that help in making the trade-off between complexity and precision. The function Tools>Cross Validation>Structural Coefficient Analysis starts this process.
We are prompted to specify the range of the Structural Coefficient to be examined and the number of iterations. The Number of Iterations determines the interval steps to be taken within the specified range of the Structural Coefficient. Given the relatively light computational load, we choose 50 iterations. With more complex models, we might be more conservative, as each iteration re-learns and re-evaluates the network. Furthermore, we select Compute Structure/Target's Precision Ratio to compute our target metric.
The resulting report will show us how the network structure changes as a function of the Structural Coefficient. This can be interpreted as the degree of confidence the analyst should have in any particular arc in the structure.
Clicking Graphs will show a synthesized network consisting of all structures generated during the iterative learning process.
The reference structure is represented by black arcs, which show the original network learned prior to the start of the Structural Coefficient Analysis. The blue-colored arcs are not contained in the reference structure, but they appear in networks that have been learned as a function of the different Structural Coefficients (SC). The thickness of the arcs is proportional to how frequently individual arcs appear in the learned networks. More important for us, however, is determining the correct level of network complexity for reliable and accurate prediction performance while avoiding overfitting the data. We can plot several metrics in this context by clicking Curve. The Structure/Target's Precision Ratio is the most relevant metric in our case, and the corresponding plot is shown below. This first plot shows the metric computed for the whole database.
Typically, the elbow of the L-shaped curve identifies a suitable value for the Structural Coefficient (SC). More formally, we would look for the point on the curve where the second derivative is maximized. By visual inspection, an SC value of around 0.4 appears to be a good candidate for that point. The portion of the curve where SC values approach 0 shows the characteristic pattern of overfitting, which is to be avoided. To further validate this interpretation, we also compute the same metric for the training/test database.
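The elbow heuristic can be made concrete in a few lines. One caveat: on a strictly convex, monotonically decreasing curve, the literal maximum of the second derivative falls at the leftmost sample, so the sketch below uses the closely related maximum-curvature criterion, which picks an interior elbow. The curve itself is a made-up stand-in for the actual Structure/Target's Precision Ratio plot.

```python
# Locating the elbow of an L-shaped curve via the point of maximum
# curvature (a refinement of the max-second-derivative rule mentioned
# above). The curve is a hypothetical stand-in for the real metric.
import numpy as np

sc = np.linspace(0.1, 2.0, 200)     # candidate Structural Coefficients
ratio = 1.0 / sc + 0.2 * sc         # hypothetical L-shaped metric

d1 = np.gradient(ratio, sc)
d2 = np.gradient(d1, sc)
curvature = np.abs(d2) / (1.0 + d1 ** 2) ** 1.5
elbow = sc[np.argmax(curvature)]
print(f"elbow of the synthetic curve at SC ~ {elbow:.2f}")
```

On real, noisy cross-validation curves, a visual check of the candidate elbow (as done in the text) remains the prudent final step.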
This graph has the same properties as the previous one and suggests a similar SC value. As a result, we can have some confidence in this new value for the Structural Coefficient. We also plot the Target's Precision alone as a function of the SC. On the surface, this curve resembles an L-shape, too, but it moves only within roughly 1 percentage point, i.e. between 97% and 98%. For practical purposes, this means that the curve is virtually flat.
[Figure: Structure and Target's Precision (%) as functions of the Structural Coefficient]

The Structure/Target's Precision Ratio is therefore driven almost entirely by the structure, as the denominator, Target's Precision, is nearly constant across a wide range of SC values, as per the graph above. The joint interpretation of Target's Precision and the Structure/Target's Precision Ratio indicates that little can be gained by lowering the SC, but that there is a definite risk of overfitting. Nevertheless, we relearn the network with an SC of 0.4, generating, as expected, a more complex network, which is displayed below.
The performance of the model (with SC=0.4) on the test set appears to be virtually the same, and the result from the K-Fold Cross Validation is not materially different from the previous performance with SC=1.

Augmented Markov Blanket (SC=0.4) - Cross-Validation
Conclusion
The models reviewed, Markov Blanket and Augmented Markov Blanket (SC=0.4 and SC=1), performed at virtually indistinguishable levels of classification accuracy. The greater complexity of either Augmented Markov Blanket specification did not yield the expected precision gain. Precision and false negatives are shown as the key metrics in the summary table below.
[Table: Summary — Markov Blanket; Augmented Markov Blanket (SC=1); Augmented Markov Blanket (SC=0.4)]
In this situation, the choice of model should be determined by the most parsimonious specification, as this provides the best prospect of good generalization beyond the samples observed in this study. The originally specified Markov Blanket model is thus recommended as the model of choice. Re-estimating these models with more observations could potentially change this conclusion and might more clearly differentiate the classification performance. For now, however, we select the Markov Blanket model; it will serve as the basis for the next section of this paper, Model Application.
Model Application
Interactive Inference
Without further discussion of the merits of each model specification, we now show how the learned Markov Blanket model can be applied in practice. For instance, we can use BayesiaLab to review the individual classification predictions made by the model. This feature is called Interactive Inference and can be accessed via Inference>Interactive Inference.
This will bring up Monitors for all variables in the Monitor Panel, and the navigation bar above allows scrolling through each record of the test set. Record #0 can be seen below with all the associated observations highlighted in green. Given the observations shown, the model predicts a 99.76% probability that the cells from this FNA sample are malignant (the Monitor is highlighted in red).
For reference, we will also show record #22, which is classified as benign.
Most cases are rather clear-cut, as above, with record #19 being one of the few exceptions. Here, the probability of malignancy is 73%.
Target Interpretation Tree
In our particular example, this may not be relevant, as all pieces of evidence, i.e. all observations regarding the FNA are obtained simultaneously. However, in the context of other diagnostic methods, such as mammography and surgical biopsy, a tree-based decision structure can help prioritize the sequence of exams, given the evidence obtained up to that point.
Summary
By using Bayesian networks as the framework, we have shown a practical new modeling approach based on the widely studied Wisconsin Breast Cancer Database. Our prediction accuracy is comparable with the results of all known studies on this topic. With BayesiaLab as the software tool, modeling with Bayesian networks becomes accessible to a very broad range of analysts and researchers, including non-statisticians. The speed of modeling, analysis, and subsequent implementation makes BayesiaLab a suitable tool in many areas of research, especially translational science.
References
Abdrabou, E. A. M. L., and A. E. B. M. Salem. "A Breast Cancer Classifier Based on a Combination of Case-Based Reasoning and Ontology Approach."

El-Sebakhy, E. A., K. A. Faisal, T. Helmy, F. Azzedin, and A. Al-Suhaim. "Evaluation of Breast Cancer Tumor Classification with Unconstrained Functional Networks Classifier." In the 4th ACS/IEEE International Conference on Computer Systems and Applications, 281-287, 2006.

Hung, M. S., M. Shanker, and M. Y. Hu. "Estimating Breast Cancer Risks Using Neural Networks." Journal of the Operational Research Society 53, no. 2 (2002): 222-231.

Karabatak, M., and M. C. Ince. "An Expert System for Detection of Breast Cancer Based on Association Rules and Neural Network." Expert Systems with Applications 36, no. 2 (2009): 3465-3469.

Mangasarian, Olvi L., W. Nick Street, and William H. Wolberg. "Breast Cancer Diagnosis and Prognosis via Linear Programming." Operations Research 43 (1995): 570-577.

Mu, T., and A. K. Nandi. "Breast Cancer Diagnosis from Fine-Needle Aspiration Using Supervised Compact Hyperspheres and Establishment of Confidence of Malignancy."

Wolberg, W. H., W. N. Street, D. M. Heisey, and O. L. Mangasarian. "Computer-Derived Nuclear Features Distinguish Malignant from Benign Breast Cytology." Human Pathology 26, no. 7 (1995): 792-796.

Wolberg, William H., W. Nick Street, and O. L. Mangasarian. "Machine Learning Techniques to Diagnose Breast Cancer from Image-Processed Nuclear Features of Fine Needle Aspirates." http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.127.2109.

Wolberg, William H., W. Nick Street, and Olvi L. Mangasarian. "Breast Cytology Diagnosis via Digital Image Analysis" (1993). http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.9894.
Contact Information
Conrady Applied Science, LLC
312 Hamlets End Way
Franklin, TN 37067
USA
+1 888-386-8383
info@conradyscience.com
www.conradyscience.com

Bayesia SAS
6, rue Léonard de Vinci
BP 119
53001 Laval Cedex
France
+33(0)2 43 49 75 69
info@bayesia.com
www.bayesia.com
Copyright
© 2011 Conrady Applied Science, LLC and Bayesia SAS. All rights reserved. Any redistribution or reproduction of part or all of the contents in any form is prohibited other than the following: You may print or download this document for your personal and noncommercial use only. You may copy the content to individual third parties for their personal use, but only if you acknowledge Conrady Applied Science, LLC and Bayesia SAS as the source of the material. You may not, except with our express written permission, distribute or commercially exploit the content; nor may you transmit it or store it in any other website or other form of electronic retrieval system.