NBTree: A Naive Bayes/Decision-Tree Hybrid
Darin Morrison
Outline
1. Motivation
   - Problem and Solutions
2. Consideration of Existing Solutions
   - Naive-Bayes Classifiers
   - Decision-Trees
   - Learning Curves
3. New Solution: NBTree
   - Definition of NBTree
   - Performance
4. Summary
Problem

How can we generate a classifier from an arbitrarily sized database of labeled instances, where the attributes are not necessarily independent?

Solutions
1. Naive-Bayes classifiers
2. Decision trees
3. A hybrid of the two: NBTree
Naive-Bayes Classifiers

Pros
1. Fast
2. Induced classifiers are easy to interpret
3. Robust to irrelevant attributes
4. Uses evidence from many attributes

Cons
1. Assumes independence of attributes
2. Low performance ceiling on large databases
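Background note: the independence assumption in the cons above is the defining property of Naive-Bayes. Given attributes x_1, ..., x_n, the classifier scores each class c as if the attributes were conditionally independent given the class:

\[ \hat{y} \;=\; \arg\max_{c}\; P(c)\prod_{i=1}^{n} P(x_i \mid c) \]

This factorization is what keeps induction fast and the model easy to interpret, but it also caps accuracy when attributes are strongly correlated.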
Decision-Trees

Pros
1. Fast
2. Segmentation of the data

Cons
1. Fragmentation as the number of splits becomes large
2. Interpretability goes down as the number of splits increases
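A rough way to quantify the fragmentation concern (illustrative numbers, not from the slides): after d multi-way splits with an average branching factor of b, a leaf is trained on roughly

\[ \frac{N}{b^{d}} \]

instances. With N = 100,000 instances and three 10-way splits, each leaf sees on the order of 100 instances, so per-leaf estimates can become unreliable even when the database itself is large.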
Algorithm

makeNBTree
- dom(makeNBTree): a set T of labeled instances
- cod(makeNBTree): a decision tree with Naive-Bayes classifiers at the leaves

1. For each attribute X_i, evaluate the utility, u(X_i), of a split on attribute X_i. For continuous attributes, a threshold is also found at this stage.
2. Let j = argmax_i(u_i), i.e., the attribute with the highest utility.
3. If u_j is not significantly better than the utility of the current node, create a Naive-Bayes classifier for the current node and return.
4. Partition the set of instances T according to the test on X_j. If X_j is continuous, a threshold split is used; if X_j is discrete, a multi-way split is made for all possible values.
5. For each child, call the algorithm recursively on the portion of T that matches the test leading to the child.
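The steps above can be made concrete with a minimal Python sketch. It covers discrete attributes only (no threshold splits, no discretization), uses a crude cross-validation routine, and all of the names in it (NaiveBayes, utility, split_utility, make_nbtree) are mine rather than the paper's; treat it as an illustration of the control flow, not as the reference implementation.

# Illustrative sketch only: discrete attributes, simplified utility, no
# handling of continuous thresholds or data discretization.
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Minimal categorical Naive-Bayes classifier with Laplace smoothing."""

    def fit(self, rows, labels):
        self.n = len(labels)
        self.class_counts = Counter(labels)
        self.values = [set(col) for col in zip(*rows)]    # distinct values per attribute
        self.counts = defaultdict(Counter)                # (attribute, class) -> value counts
        for row, y in zip(rows, labels):
            for i, v in enumerate(row):
                self.counts[(i, y)][v] += 1
        return self

    def predict(self, row):
        def log_score(y):
            s = math.log(self.class_counts[y] / self.n)
            for i, v in enumerate(row):
                num = self.counts[(i, y)][v] + 1          # Laplace smoothing
                den = self.class_counts[y] + len(self.values[i])
                s += math.log(num / den)
            return s
        return max(self.class_counts, key=log_score)


def utility(rows, labels, k=5):
    """Utility of a node: k-fold cross-validated accuracy of Naive-Bayes on it."""
    if len(set(labels)) == 1 or len(rows) < k:
        return 1.0 if len(set(labels)) == 1 else 0.0
    correct = 0
    for fold in range(k):
        train = [(r, y) for j, (r, y) in enumerate(zip(rows, labels)) if j % k != fold]
        test = [(r, y) for j, (r, y) in enumerate(zip(rows, labels)) if j % k == fold]
        nb = NaiveBayes().fit([r for r, _ in train], [y for _, y in train])
        correct += sum(nb.predict(r) == y for r, y in test)
    return correct / len(rows)


def split_utility(rows, labels, attr):
    """Utility of a split: child utilities weighted by the fraction of instances reaching each child."""
    parts = defaultdict(list)
    for row, y in zip(rows, labels):
        parts[row[attr]].append((row, y))
    return sum(len(p) / len(rows) *
               utility([r for r, _ in p], [y for _, y in p])
               for p in parts.values())


def make_nbtree(rows, labels, min_instances=30, min_rel_reduction=0.05):
    """Steps 1-5 of the NBTree induction procedure, for discrete attributes."""
    node_u = utility(rows, labels)
    utilities = [split_utility(rows, labels, i) for i in range(len(rows[0]))]  # step 1
    j = max(range(len(utilities)), key=utilities.__getitem__)                  # step 2
    # Step 3: significance test -- relative error reduction > 5% and >= 30 instances.
    rel_reduction = (utilities[j] - node_u) / (1.0 - node_u) if node_u < 1.0 else 0.0
    if len(rows) < min_instances or rel_reduction <= min_rel_reduction:
        return NaiveBayes().fit(rows, labels)             # leaf: Naive-Bayes classifier
    children = defaultdict(lambda: ([], []))              # step 4: multi-way split on X_j
    for row, y in zip(rows, labels):
        children[row[j]][0].append(row)
        children[row[j]][1].append(y)
    return {"split_on": j,                                # step 5: recurse on each child
            "children": {v: make_nbtree(rs, ys, min_instances, min_rel_reduction)
                         for v, (rs, ys) in children.items()}}

A real implementation would also handle continuous attributes (finding thresholds in step 1), discretize the data before cross-validating, and avoid recomputing the utilities; those details are omitted to keep the sketch short.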
Utility

Utility of a node: computed by discretizing the data and computing a 5-fold cross-validation accuracy estimate of using a Naive-Bayes classifier at the node.

Utility of a split: computed as the weighted sum of the utilities of the child nodes, where the weight given to a node is proportional to the number of instances that reach that node.

Significance: a split is significant iff the relative reduction in error is greater than 5% and there are at least 30 instances in the node.

Intuition: attempt to approximate whether the generalization accuracy of a Naive-Bayes classifier at each leaf is higher than that of a single Naive-Bayes classifier at the current node.
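To make the significance test concrete (this formula is my reading of "relative reduction in error"; the numbers are illustrative):

\[ \text{relative error reduction} \;=\; \frac{(1 - u_{\text{node}}) - (1 - u_{\text{split}})}{1 - u_{\text{node}}} \]

For example, if the Naive-Bayes classifier at the current node scores u_node = 0.90 (10% error) and the best candidate split scores u_split = 0.91 (9% error), the relative reduction is (0.10 - 0.09) / 0.10 = 10%, which exceeds the 5% threshold, so the split is taken provided the node contains at least 30 instances.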
Graphs

(The performance plots from the original slides are not reproduced in this text.)
Statistics

Average accuracy: C4.5 81.91%, Naive-Bayes 81.69%, NBTree 84.47%

Number of nodes per tree:
          letter   adult   DNA   led24
  C4.5      2109    2213   131      49
  NBTree     251     137     3       1
Summary

In practice, NBTrees have been shown to scale to large databases and, in general, to outperform decision trees and Naive-Bayes classifiers used alone.
Appendix
References
R. Kohavi. Scaling Up the Accuracy of Naive-Bayes Classifiers: a Decision-Tree Hybrid. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, 1996.