Weekly Assignment 5
1. Clearly explain how each of the following methods manages the scalability issue in decision tree
algorithms.
a. SLIQ
b. SPRINT
c. RainForest
𝑃(𝑐|𝑥) is the posterior probability of the class (c, target) given the predictor (x, attributes).
𝑃(𝑐) is the prior probability of the class.
𝑃(𝑥|𝑐) is the likelihood, which is the probability of the predictor given the class.
𝑃(𝑥) is the prior probability of the predictor.
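The four quantities defined above are related by Bayes' theorem, 𝑃(𝑐|𝑥) = 𝑃(𝑥|𝑐) × 𝑃(𝑐) / 𝑃(𝑥). The following is a minimal Python sketch of that relationship. The function name `posterior` and all three input numbers are hypothetical and chosen only for illustration; they do not come from the assignment.

```python
def posterior(likelihood: float, prior: float, evidence: float) -> float:
    """Bayes' theorem: P(c|x) = P(x|c) * P(c) / P(x)."""
    if evidence <= 0:
        raise ValueError("P(x) must be positive")
    return likelihood * prior / evidence


# Hypothetical numbers, used only to illustrate the formula.
p_x_given_c = 0.5   # likelihood P(x|c)
p_c = 0.4           # prior P(c)
p_x = 0.25          # prior probability of the predictor, P(x)

p_c_given_x = posterior(p_x_given_c, p_c, p_x)
print(p_c_given_x)
```

Because 𝑃(𝑥) is the same for every class, it can be dropped when the goal is only to compare classes.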
Similarly, 𝑃(𝑋|𝐵𝑢𝑦𝑠𝐶𝑜𝑚𝑝𝑢𝑡𝑒𝑟 = 𝑁𝑜) = 0.019.
To find the class 𝐶𝑖 that maximizes 𝑃(𝑋|𝐶𝑖)𝑃(𝐶𝑖), we compute this product for each class and choose the largest.
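A minimal Python sketch of this selection rule follows. The helper name `best_class` is hypothetical. Only the value 0.019 for the class "No" comes from the text; the likelihood for "Yes" and both priors are placeholder values, so the printed result illustrates the mechanics rather than the assignment's answer.

```python
def best_class(likelihoods: dict, priors: dict) -> str:
    """Return the class C_i with the largest P(X|C_i) * P(C_i)."""
    scores = {c: likelihoods[c] * priors[c] for c in likelihoods}
    return max(scores, key=scores.get)


# P(X|C_i) per class. Only 0.019 ("No") is taken from the text;
# the value for "Yes" is a placeholder.
likelihoods = {"Yes": 0.05, "No": 0.019}

# P(C_i) per class. Both values are placeholders.
priors = {"Yes": 0.6, "No": 0.4}

predicted = best_class(likelihoods, priors)
print(predicted)
```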
Advantages
Naive Bayes can predict the class of a test data set easily and quickly. It also performs well in
multi-class prediction.
When the independence assumption holds, a naive Bayes classifier can perform better than other
models (such as logistic regression) and needs less training data.
It performs better with categorical input variables than with numerical ones. For numerical
variables, a normal distribution is assumed (a bell-shaped curve, which is a strong
assumption).
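The normal-distribution assumption for numerical variables can be sketched as follows: the likelihood of a numerical value given a class is read off a normal density fitted to that class. The function name `gaussian_likelihood` and the mean, standard deviation, and input value are hypothetical and serve only as an illustration.

```python
import math


def gaussian_likelihood(x: float, mean: float, std: float) -> float:
    """Normal density at x, used as P(x|c) for a numerical attribute."""
    if std <= 0:
        raise ValueError("std must be positive")
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


# Hypothetical per-class mean and standard deviation of one attribute.
density = gaussian_likelihood(x=35.0, mean=38.0, std=12.0)
print(density)
```

If the attribute is far from bell-shaped within a class, this density can be a poor estimate, which is why the assumption is described as strong.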
Disadvantages
If a categorical variable takes a value in the test data set that was never observed in the
training data set, the model assigns that value a probability of 0 (zero) and cannot make a
prediction. This is often referred to as the "zero frequency" problem.
Another limitation of naive Bayes is the assumption of independent predictors. In real life, it is
almost impossible to obtain a set of completely independent predictors.
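The zero-frequency problem described above can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function name `category_likelihood` and the training values are hypothetical, and the optional `alpha` parameter implements additive (Laplace) smoothing, a common remedy that the text itself does not discuss.

```python
from collections import Counter


def category_likelihood(values, category, alpha=0.0, n_categories=None):
    """Estimate P(category | class) from one class's training values.

    alpha=0 gives the raw relative frequency. alpha>0 applies additive
    (Laplace) smoothing, an addition that is not part of the text.
    """
    counts = Counter(values)
    k = n_categories if n_categories is not None else len(counts)
    denominator = len(values) + alpha * k
    if denominator == 0:
        raise ValueError("no training values and no smoothing")
    return (counts[category] + alpha) / denominator


# Hypothetical training values of one attribute for a single class.
train = ["low", "medium", "medium", "low"]

# "high" never occurs in training, so its raw estimate is zero and any
# product of likelihoods that includes it is zero as well.
raw = category_likelihood(train, "high")
smoothed = category_likelihood(train, "high", alpha=1.0, n_categories=3)
print(raw, smoothed)
```

With `alpha=1.0` and three possible categories, the unseen value receives a small non-zero estimate, so the product of likelihoods no longer collapses to zero.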
Bayesian Belief Networks, or "Bayesian Networks" for short, provide a simple way to apply
Bayes' theorem to complex problems.
Bayesian networks are a network-based framework for representing and analyzing models
involving uncertainty.
They are used for intelligent decision aids, intelligent diagnostic aids, data mining, and similar tasks.
They emerged from the cross-fertilization of ideas between the artificial intelligence,
decision analysis, and statistics communities.
7. Discuss how the BBN differs from other knowledge representation and probabilistic analysis tools.
9.
𝑃(𝑊|𝐶) = 𝑃(𝑊|𝑆, 𝑅) × 𝑃(𝑅|𝐶) × 𝑃(𝑆|𝐶)
+ 𝑃(𝑊|𝑆, 𝑅′) × 𝑃(𝑅′|𝐶) × 𝑃(𝑆|𝐶)
+ 𝑃(𝑊|𝑆′, 𝑅) × 𝑃(𝑅|𝐶) × 𝑃(𝑆′|𝐶)
+ 𝑃(𝑊|𝑆′, 𝑅′) × 𝑃(𝑅′|𝐶) × 𝑃(𝑆′|𝐶)
= 0.99 × 0.8 × 0.1 + 0.9 × 0.2 × 0.1 + 0.9 × 0.8 × 0.9 + 0 × 0.2 × 0.9
= 0.7452
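The sum above can be checked with a short Python sketch that enumerates the four combinations of S and R and their complements. All probabilities are taken from the calculation above; the variable names are hypothetical.

```python
from itertools import product

# Conditional probabilities read from the calculation above.
p_s_given_c = 0.1   # P(S|C), so P(not S|C) = 0.9
p_r_given_c = 0.8   # P(R|C), so P(not R|C) = 0.2

# P(W|S, R), keyed by (S is true, R is true).
p_w_given_sr = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.9,
    (False, False): 0.0,
}

# Sum over every combination of S and R.
p_w_given_c = 0.0
for s, r in product([True, False], repeat=2):
    p_s = p_s_given_c if s else 1.0 - p_s_given_c
    p_r = p_r_given_c if r else 1.0 - p_r_given_c
    p_w_given_c += p_w_given_sr[(s, r)] * p_r * p_s

print(round(p_w_given_c, 4))  # 0.7452
```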