Professional Documents
Culture Documents
1. Sex Classification
Problem: classify whether a given person is a male or a female based on the measured
features. The features include height, weight, and foot size.
Training
The classifier created from the training set using a Gaussian distribution assumption
would be:
∑𝑛 (
𝑖=1 𝑋𝑖 − � )2
𝑋
𝜎2 =
𝑛−1
Let's say we have equi-probable classes so P(male)= P(female) = 0.5. There was no
identified reason for making this assumption so it may have been a bad idea. If we
determine P(C) based on frequency in the training set, we happen to get the same
answer.
Testing
The evidence (also termed normalizing constant) may be calculated since the sum of
the posteriors equals one.
The evidence may be ignored since it is a positive constant (Normal distributions are
always positive). Now, let's determine the sex of the sample.
1 (𝑋−𝑋�)2
−
𝐺 (𝑋) = 𝑒 2𝜎 2
𝜎√2𝜋
𝑃(ℎ𝑒𝑖𝑔ℎ𝑡|𝑚𝑎𝑙𝑒)
(6−5.855)2
1 −
= 𝑒 2×0.00386667
√2𝜋 × 0.00386667
= 1.578883
P(weight|male) = 5.9881e-06
P(weight|female) = 1.6789e-2
Since posterior numerator (female) > posterior numerator (male), the sample is
female.