+ Stepwise backward elimination: the procedure starts with the full set of attributes. At each step, it
removes the worst attribute remaining in the set.
+ Combination of forward selection and backward elimination: at each step, the procedure selects the
best attribute and removes the worst from among the remaining attributes.
+ Decision tree induction: construct a flowchart-like structure where each internal node denotes a
test on an attribute, each branch corresponds to an outcome of the test, and each external (leaf) node
denotes a class prediction. At each node, the algorithm chooses the best attribute to partition the
data into individual classes. A tree is constructed from the given data; attributes that do
not appear in the tree are assumed to be irrelevant.
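
As an illustration of stepwise backward elimination from the list above, here is a minimal Python sketch. The function name backward_elimination, the score callback, and the toy scoring rule are assumptions made for this example; in practice the score would come from something like a statistical significance test or a classifier's validation accuracy.

```python
from typing import Callable, FrozenSet, Iterable, List


def backward_elimination(
    attributes: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
) -> List[str]:
    """Greedy backward elimination (hypothetical helper, not a library API).

    Starts with the full attribute set and, at each step, drops the
    attribute whose removal gives the highest score. Stops when every
    possible removal would lower the score.
    """
    current = frozenset(attributes)
    best = score(current)
    while len(current) > 1:
        # Score every subset obtained by dropping exactly one attribute.
        candidates = [(score(current - {a}), a) for a in sorted(current)]
        new_score, worst = max(candidates)
        if new_score < best:
            break  # removing anything else would hurt
        current, best = current - {worst}, new_score
    return sorted(current)


# Toy scoring rule (an assumption for illustration only): reward the two
# "relevant" attributes and charge a small penalty per attribute kept.
RELEVANT = {"age", "income"}


def toy_score(subset: FrozenSet[str]) -> float:
    return len(subset & RELEVANT) - 0.1 * len(subset)


selected = backward_elimination(["age", "income", "id", "zip"], toy_score)
print(selected)  # ['age', 'income']
```

Forward selection follows the same greedy pattern in the opposite direction: it starts from an empty set and adds the best remaining attribute at each step.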
5.
Purpose of normalization: to scale the data of an attribute so that it falls in a smaller range, such as [0, 1] or [-1, 1].
Some methods of data normalization:
- Decimal scaling: the decimal point of each value is moved by dividing the value by a power of 10.
The power depends on the maximum absolute value of the data. The formula: vi' = vi / 10^j
where j is the smallest integer such that max(|vi'|) < 1.
- Min-max normalization: a linear transformation is performed on the original data. The minimum and
maximum values of attribute A are found, and each value v is mapped into the new range
[new_min(A), new_max(A)] according to the formula:
v' = (v - min(A)) * (new_max(A) - new_min(A)) / (max(A) - min(A)) + new_min(A)
- Z-score normalization: values are normalized based on the mean and standard deviation of the data A.
The formula: new entry = (old entry - mean of A) / standard deviation of A
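
The three normalization methods above can be sketched in plain Python as follows. The function names are chosen for this example only, and the z-score version assumes the population standard deviation (statistics.pstdev); using the sample standard deviation instead would give slightly different values.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def decimal_scaling(values: Sequence[float]) -> List[float]:
    """v' = v / 10^j, with j the smallest integer such that max(|v'|) < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


def min_max(
    values: Sequence[float], new_min: float = 0.0, new_max: float = 1.0
) -> List[float]:
    """Linearly map [min(A), max(A)] onto [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max normalization is undefined for constant data")
    return [(v - lo) * (new_max - new_min) / (hi - lo) + new_min for v in values]


def z_score(values: Sequence[float]) -> List[float]:
    """v' = (v - mean(A)) / std(A), using the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population, not sample, deviation
    if sigma == 0:
        raise ValueError("z-score normalization is undefined for constant data")
    return [(v - mu) / sigma for v in values]


print(decimal_scaling([-986, 917]))        # [-0.986, 0.917]  (j = 3)
print(min_max([10, 20, 30]))               # [0.0, 0.5, 1.0]
print(z_score([2, 4, 4, 4, 5, 5, 7, 9]))   # mean 5, std 2
```

In the decimal scaling example the largest absolute value is 986, so j = 3 and every value is divided by 1000.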