You are on page 1of 23

Ensemble Learning

Which of the two options increases your


chances of having a good grade on the exam?
Solving the test individually
Solving the test in groups

Why?

Ensemble Learning
Weak classifier A

Ensemble Learning
Weak classifier B

Ensemble Learning
Weak classifier C

Ensemble Learning
Ensemble of A, B, and C

Ensemble Learning
For an ensemble to work the following conditions must
be true:
The errors of the classifiers need not to be strongly correlated
(think about the exam example, if everyone knows by heart
exactly the same chapters, will it help to solve the test in
groups?)
The errors of the individual classifiers making up the example
need to be less than 0.5 (at least better than chance)

Ensemble Learning
Suppose we have a set of binary classifiers,
each with a probability of error of 1/3 and that
the errors of any two classifiers are independent.
(Two events A and B are independent if p(A&B)
= p(A)p(B)).
What is the probability of error of an ensemble of
3 classifiers? 7 classifiers? 21 classifiers?

Ensemble Learning
For 3 classifiers, each with probability of error of
1/3, combined by simple voting, the probability
of error is equal to the probability that two
classifiers make an error plus the probability that
all three classifiers make an error.

Ensemble Learning
For 3 classifiers, each with probability of error of
1/3, combined by simple voting, the probability
of error is equal to the probability that two
classifiers make an error plus the probability that
all three classifiers make an error.
pe(ens) = c(3,2) pe^2(1-pe) + c(3,3) pe^3
= 3* (1/3)^2 (2/3) + (1/3)^3
= 2/9 + 1/27 = 7/27 = 0.26
(down from 0.33 for a single classifier)

Ensemble Learning
For 3 classifiers, each with probability of error of
1/2, combined by simple voting, the probability
of error is equal to the probability that two
classifiers make an error plus the probability that
all three classifiers make an error.

Ensemble Learning
For 3 classifiers, each with probability of error of
1/2, combined by simple voting, the probability
of error is equal to the probability that two
classifiers make an error plus the probability that
all three classifiers make an error.
pe(ens) = c(3,2) pe^2(1-pe) + c(3,3) pe^3
= 3* (1/2)^2 (1/2) + (1/2)^3
= 3/8 + 1/8 = 1/2 = 0.5
(same as single classifier case)

Ensemble Learning
For 3 classifiers, each with probability of error of
2/3, combined by simple voting, the probability
of error is equal to the probability that two
classifiers make an error plus the probability that
all three classifiers make an error.

Ensemble Learning
For 3 classifiers, each with probability of error of
2/3, combined by simple voting, the probability
of error is equal to the probability that two
classifiers make an error plus the probability that
all three classifiers make an error.
pe(ens) = c(3,2) pe^2(1-pe) + c(3,3) pe^3
= 3* (2/3)^2 (1/3) + (2/3)^3
= 4/9 + 8/27 = 20/27 = 0.74
(up from 0.67 for a single classifier)

Ensemble Learning

Ensemble Learning
How to build ensembles:

Ensemble Learning
How to build ensembles:
1. Heterogeneous ensembles (same training data, different
learning algorithms)

Ensemble Learning
How to build ensembles:
1. Heterogeneous ensembles (same training data, different
learning algorithms)
2. Manipulate training data (same learning algorithm,
different training data)

Ensemble Learning
How to build ensembles:
1. Heterogeneous ensembles (same training data, different
learning algorithms)
2. Manipulate training data (same learning algorithm,
different training data)
3. Manipulate input features (use different subsets of the
attribute sets)

Ensemble Learning
How to build ensembles:
1. Heterogeneous ensembles (same training data, different
learning algorithms)
2. Manipulate training data (same learning algorithm,
different training data)
3. Manipulate input features (use different subsets of the
attribute sets)
4. Manipulate output targets (same data, same algorithm,
convert multiclass problems into many two-class
problems)

Ensemble Learning
How to build ensembles:
1. Heterogeneous ensembles (same training data, different
learning algorithms)
2. Manipulate training data (same learning algorithm,
different training data)
3. Manipulate input features (use different subsets of the
attribute sets)
4. Manipulate output targets (same data, same algorithm,
convert multiclass problems into many two-class
problems)
5. Inject randomness to learning algorithms.

Ensemble Learning
How to build ensembles:
The two dominant approaches belong to category 3:
Manipulate training data (same learning algorithm,
different training data)
They are: bagging and boosting

Ensemble Learning
Bagging - Training
1. k = 1;
2. pi = 1/m , for i=1,...,m
3. While k < EnSize
3.1 Create training set Tk (normally of size m) by sampling from
T with replacement according to probability distribution p.
3.2 Build classifier Ck using learning algorithm L and training set Tk
3.3 if errror_T (Ck) < threshold k = k+1
3.4 Goto 3.1
4. Output C1, C2,..., Ck
Classification: Classify new examples by voting among C_1, C_2,..

Ensemble Learning
Boosting - Training
1. k = 1, for i=1,...,m w1i = 1/m ,
2. While k < EnsSize
2.1 for i=1,...,m
pi = wi/sum(wi)
2.2 Create training set Tk (normally of size m) by sampling from T
with replacement according to probability distribution p.
2.3 Build classifier Ck using learning algorithm L and training set Tk
2.4 Classify examples in T
2.5 if errror_T (Ck) < threshold
k = k+1
For i = 1 to m
if Ck(x_i) != yi
wi = wi * Beta
-- (Beta >1) Increse w of misclassified examples
2.6 Goto 3.1
3. Output C1, C2,...,Ck

You might also like