
ADDIS ABABA UNIVERSITY

COLLEGE OF NATURAL AND COMPUTATIONAL SCIENCE

SCHOOL OF INFORMATION SCIENCE

DEPARTMENT OF INFORMATION SYSTEM

Data Mining and Data Warehouse

Assignment 2

By: Askual Assefa, NSR/8486/10


Classification

Description

In this report, I experimented with classification techniques on two datasets: Weather and Iris.
The weather dataset is used to build a classification model that predicts whether or not to play
for a given instance. It has 4 nominal attributes: outlook (sunny, overcast, rainy),
temperature (hot, mild, cool), humidity (high, normal) and windy (TRUE, FALSE). The class label
for the weather dataset is play (yes, no). The iris dataset, on the other hand, is used to predict
whether an instance belongs to the Iris-setosa, Iris-versicolor or Iris-virginica species. It
has 4 numeric attributes: sepallength, sepalwidth, petallength and petalwidth. The
weather dataset has only 14 instances, which is too few for reliable knowledge
extraction; the iris dataset has 150 instances. Both datasets are complete: no attribute has
missing values.
Results
I used three algorithms: RandomTree, J48 decision tree, and REPTree. This section presents
the results of the three algorithms on the two datasets.

Weather

A. RandomTree

Correctly Classified Instances 8 57.1429 %

Incorrectly Classified Instances 6 42.8571 %

Precision Recall F-Measure Class

0.667 0.667 0.667 yes

0.400 0.400 0.400 no

Weighted Avg. 0.571 0.571 0.571

=== Confusion Matrix ===

a b <-- classified as

6 3 | a = yes

3 2 | b = no
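
The figures reported above can be cross-checked directly from the confusion matrix. The following is a minimal sketch in plain Python, not the Weka implementation; the function name `per_class_metrics` is hypothetical, and it assumes rows are actual classes and columns are predicted classes, as in the matrix above.

```python
# Confusion matrix from the RandomTree run on the weather data.
# Rows are actual classes, columns are predicted classes (order: yes, no).
matrix = [[6, 3],
          [3, 2]]
labels = ["yes", "no"]

def per_class_metrics(matrix, index):
    """Return (precision, recall, f_measure) for the class at `index`."""
    true_positive = matrix[index][index]
    predicted = sum(row[index] for row in matrix)   # column total
    actual = sum(matrix[index])                     # row total
    precision = true_positive / predicted if predicted else 0.0
    recall = true_positive / actual if actual else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure

total = sum(sum(row) for row in matrix)
correct = sum(matrix[i][i] for i in range(len(matrix)))
accuracy = correct / total

print(f"accuracy = {correct}/{total} = {accuracy:.4%}")
for i, label in enumerate(labels):
    p, r, f = per_class_metrics(matrix, i)
    print(f"{label}: precision={p:.3f} recall={r:.3f} f-measure={f:.3f}")
```

The diagonal holds the correctly classified instances (6 + 2 = 8 of 14), which is where the 57.1429 % comes from.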

B. J48 decision tree

Correctly Classified Instances 7 50 %

Incorrectly Classified Instances 7 50 %

Precision Recall F-Measure Class

0.625 0.556 0.588 yes

0.333 0.400 0.364 no

Weighted Avg. 0.521 0.500 0.508

=== Confusion Matrix ===

a b <-- classified as

5 4 | a = yes

3 2 | b = no

C. REPTree

Correctly Classified Instances 8 57.1429 %

Incorrectly Classified Instances 6 42.8571 %

Precision Recall F-Measure Class

0.615 0.889 0.727 yes

0.000 0.000 0.000 no

Weighted Avg. 0.396 0.571 0.468

=== Confusion Matrix ===

a b <-- classified as

8 1 | a = yes

5 0 | b = no
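
The "Weighted Avg." row weights each class's score by the number of instances that actually belong to that class. The sketch below recomputes it for the REPTree matrix; it is an illustration in plain Python under that assumption, not Weka's code. The helper name `precision_recall_f` and the majority-class baseline (always predicting the most frequent class) are additions for comparison and are not part of the original experiment.

```python
# REPTree confusion matrix on the weather data (rows = actual, columns = predicted).
matrix = [[8, 1],   # actual yes
          [5, 0]]   # actual no

def precision_recall_f(matrix, i):
    """Return (precision, recall, f_measure) for class i."""
    tp = matrix[i][i]
    predicted = sum(row[i] for row in matrix)   # column total
    actual = sum(matrix[i])                     # row total
    p = tp / predicted if predicted else 0.0
    r = tp / actual if actual else 0.0
    f = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f

support = [sum(row) for row in matrix]   # instances per actual class: [9, 5]
total = sum(support)
scores = [precision_recall_f(matrix, i) for i in range(len(matrix))]

# Weighted average: each class's score weighted by its share of the instances.
weighted = [
    sum(support[i] * scores[i][k] for i in range(len(matrix))) / total
    for k in range(3)
]

# Hypothetical baseline for comparison: always predict the most frequent class.
majority_accuracy = max(support) / total

print("weighted precision, recall, f-measure:", [round(w, 3) for w in weighted])
print(f"majority-class baseline accuracy: {majority_accuracy:.2%}")
```

Because no "no" instance is classified correctly, the "no" row contributes zero to every weighted score, which is why the weighted precision (0.396) falls well below the accuracy (0.571).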

Iris
A. RandomTree

Correctly Classified Instances 138 92 %
Incorrectly Classified Instances 12 8 %
Precision Recall F-Measure Class

1.000 1.000 1.000 Iris-setosa

0.896 0.860 0.878 Iris-versicolor

0.865 0.900 0.882 Iris-virginica

Weighted Avg. 0.920 0.920 0.920

=== Confusion Matrix ===

a b c <-- classified as

50 0 0 | a = Iris-setosa

0 43 7 | b = Iris-versicolor

0 5 45 | c = Iris-virginica
B. J48 decision tree

Correctly Classified Instances 144 96 %
Incorrectly Classified Instances 6 4 %
Precision Recall F-Measure Class

1.000 0.980 0.990 Iris-setosa
0.940 0.940 0.940 Iris-versicolor
0.941 0.960 0.950 Iris-virginica

Weighted Avg. 0.960 0.960 0.960

=== Confusion Matrix ===

a b c <-- classified as
49 1 0 | a = Iris-setosa
0 47 3 | b = Iris-versicolor

0 2 48 | c = Iris-virginica
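
The same per-class calculation extends to three classes. As a rough sketch (plain Python, with the hypothetical helper `evaluate`; rows are assumed to be actual classes and columns predicted classes), the J48 matrix above gives the accuracy and per-class scores as follows:

```python
# J48 confusion matrix on the iris data (rows = actual, columns = predicted).
labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]
matrix = [[49, 1, 0],
          [0, 47, 3],
          [0, 2, 48]]

def evaluate(matrix, labels):
    """Return (accuracy, {label: (precision, recall, f_measure)})."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    report = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        predicted = sum(matrix[r][i] for r in range(n))  # column total
        actual = sum(matrix[i])                          # row total
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        report[label] = (p, r, f)
    return correct / total, report

accuracy, report = evaluate(matrix, labels)
print(f"accuracy = {accuracy:.2%}")
for label, (p, r, f) in report.items():
    print(f"{label}: precision={p:.3f} recall={r:.3f} f-measure={f:.3f}")
```

The off-diagonal cells show where the errors fall: most of the confusion is between Iris-versicolor and Iris-virginica.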
C. REPTree

Correctly Classified Instances 141 94 %


Incorrectly Classified Instances 9 6 %
Precision Recall F-Measure Class

1.000 1.000 1.000 Iris-setosa

0.902 0.920 0.911 Iris-versicolor

0.918 0.900 0.909 Iris-virginica

Weighted Avg. 0.940 0.940 0.940

=== Confusion Matrix ===

a b c <-- classified as

50 0 0 | a = Iris-setosa

0 46 4 | b = Iris-versicolor

0 5 45 | c = Iris-virginica
Conclusion
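On the weather dataset, RandomTree, J48 and REPTree reached accuracies of 57.14%, 50% and
57.14%, respectively. On the Iris dataset, the same three algorithms reached 92%, 96% and 94%,
so J48 was the most accurate there. The weather results should be read with caution, since the
dataset has only 14 instances.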
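
For reference, the accuracies can be recomputed from the counts of correctly classified instances, taken from the results above (for J48 on iris, from the diagonal of its confusion matrix). This is only an illustrative sketch; the dictionary layout and the names `results` and `summary` are assumptions made for the example.

```python
# (correctly classified, total instances) per algorithm, taken from the results.
results = {
    "weather": {"RandomTree": (8, 14), "J48": (7, 14), "REPTree": (8, 14)},
    "iris": {"RandomTree": (138, 150), "J48": (144, 150), "REPTree": (141, 150)},
}

summary = {}
for dataset, runs in results.items():
    accuracies = {name: correct / total for name, (correct, total) in runs.items()}
    best = max(accuracies.values())
    # Keep every algorithm that ties for the best accuracy.
    winners = [name for name, acc in accuracies.items() if acc == best]
    summary[dataset] = {"accuracies": accuracies, "winners": winners}

for dataset, info in summary.items():
    for name, acc in info["accuracies"].items():
        print(f"{dataset:8s} {name:11s} {acc:.2%}")
    print(f"{dataset:8s} best: {', '.join(info['winners'])}")
```

On the weather data, RandomTree and REPTree tie, so the sketch reports both rather than picking one arbitrarily.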
