You are on page 1of 2

1.

10 Binaryzation :
By known that Data Mining process may require many algorithms to deal with
multiple tasks and also each of these algorithms require data to be presented
in a particular format. When the data is not in the desired format, then it
needs to be transformed by applying some conversion process. Binaryzation
is such kind of it.
Usually, best Binaryzation approach is the one that produces the best result
for the data mining algorithm that will be used to analyse the data.
It can be defined as, “the process of converting both continuous and discrete
attributes into binary attributes”.
This conversion process uses 3 steps, such as:
- Assigning numerical value
- Finding number of binary attributes required
- Conversion into binary
Ex.:
Suppose our algorithm uses a categorical attribute with ‘m’ number of
values.
Step 1 : Assigning numerical value
If it is nominal type, then numbers assigned would be between [0, m – 1 ].
If it is ordinal type as it has order, then first assignment has to follow the
order.
Step 2 : Finding number of Binary attributes required
Suppose n is the number of binary attributes required, then it can be
calculated using the formula as:

Step 3 : Conversion into binary


Here, number assigned is converted to its respective binary value.
Here, the assignment depends on the number of binary attributes
calculated in the previous step (n value).
Ex. If n is 3, then use 3 – bit representation, such that 2 = 010,
3 = 011……..likewise…..
Prob.: Attributes are : { awful, poor, ok, good, great }
1. Arrange them in the table form and assign the values in the range from
0 to m-1.
2. Finding the number of binary attributes using the formula such that
m=5. For ceil function, here n would be 3. So, the three binary
attributes are x1, x2 and x3.
3. Binary conversion such that n=3 as 3 – bit representation

Attribute Integer X1 X2 X3
values value
Awful 0 0 0 0
Poor 1 0 0 1
Ok 2 0 1 0
Good 3 0 1 1
great 4 1 0 0

Here , no mutual relationship should exist between the attributes. Some


situations, it may create an issue. To overcome this,
Number of binary attributes = number of values
So, in this case x1 to x5.

Attribute Integer X1 X2 X3 X4 X5
values value
Awful 0 1 0 0 0 0
Poor 1 0 1 0 0 0
Ok 2 0 0 1 0 0
Good 3 0 0 0 1 0
Great 4 0 0 0 0 1

You might also like