Professional Documents
Culture Documents
By Kushal Anjaria
The second pattern that we will study is the classification algorithm. In Data Mining, one of the most common tasks is
to build models to predict the class of an object based on its attributes. Here the object can be seen as a customer, patient,
transaction, email message, or even a single character. Characteristics of such objects can be, for example, for the patient
object, heart rate, blood pressure, weight, and gender. In contrast, the patient object's class would most commonly be
positive/negative for a specific disease. This section considers learning a classification tree model using the data we
have about such objects. Figure-1 illustrates the classification task in detail.
Given a collection of records (training set), each record contains a set of attributes; one of the attributes is class. Find a
model of a class attribute as a function of the values of the other variable
If you observe the decision tree, then you will realize that
In data mining, we aim to design the best decision tree which best fits our training set. Best fit means for each row of
the training set; our decision tree should give us the correct result.