THEORY :
Classification identifies the category, or class label, of a new observation. First, a set of labeled
data is used as training data: the input records and their corresponding class labels are given to the
algorithm. From this training dataset, the algorithm derives a model, also called a classifier. The
derived model can be a decision tree, a mathematical formula, or a neural network. When unlabeled
data is then given to the model, it should predict the class to which each record belongs. The new
data provided to the model is the test dataset.
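The train-then-predict workflow described above can be sketched in Python. This is only a minimal illustration, assuming a nearest-centroid classifier as the "mathematical formula" model; the function names (train_nearest_centroid, predict) and the toy data are hypothetical and are not taken from the practical itself.

```python
import math
from collections import defaultdict


def train_nearest_centroid(X_train, y_train):
    """Derive the model: one centroid (mean feature vector) per class label."""
    rows_by_label = defaultdict(list)
    for features, label in zip(X_train, y_train):
        rows_by_label[label].append(features)
    # zip(*rows) iterates over feature columns, so each centroid is a column-wise mean.
    return {
        label: [sum(column) / len(rows) for column in zip(*rows)]
        for label, rows in rows_by_label.items()
    }


def predict(model, features):
    """Return the label whose centroid is closest (Euclidean distance) to the record."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Training data: input records with their associated class labels (toy values).
X_train = [[1.0, 1.0], [1.5, 2.0], [8.0, 8.0], [9.0, 9.5]]
y_train = ["low", "low", "high", "high"]
model = train_nearest_centroid(X_train, y_train)

# Test data: unlabeled records that the model must assign to a class.
X_test = [[2.0, 1.0], [8.5, 9.0]]
predictions = [predict(model, record) for record in X_test]
print(predictions)  # ['low', 'high']
```

The model here is just a dictionary of class centroids; a decision tree or neural network would replace train_nearest_centroid and predict while the training and test split stays the same.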
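Clustering, in contrast, groups unlabeled objects so that objects in the same cluster are more similar
to each other than to objects in other clusters. In a hard partition each object belongs to exactly one
cluster, whereas a soft (fuzzy) partition assigns each object to every cluster with a certain degree of
membership. Finer distinctions are also possible: objects may belong to more than one cluster, every
object may be forced into exactly one cluster, or hierarchical trees can be built on the group
relations. These groupings can be implemented in different ways based on various cluster models, and
each model has its own algorithms, which differ in their properties and in their results. A good
clustering algorithm is able to identify clusters regardless of their shape. Clustering methods are
commonly divided into the following categories: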
Partitioning Method
Hierarchical Method
Density-based Method
Grid-Based Method
Model-Based Method
Constraint-based Method
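As an illustration of the partitioning method, the sketch below implements k-means from scratch in Python. It is a minimal example under stated assumptions: k-means is chosen here only as a familiar partitioning algorithm, and the function name k_means, its parameters (iterations, seed), and the toy points are hypothetical rather than part of the practical.

```python
import math
import random


def k_means(points, k, iterations=100, seed=0):
    """Hard-partition points into k clusters; returns (assignments, centroids)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct points drawn at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # No point changed cluster, so the partition has converged.
        assignments = new_assignments
        # Update step: move each centroid to the mean of its current members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # Keep the old centroid if a cluster ends up empty.
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


# Toy data: two visibly separate groups of 2-D points.
points = [[1.0, 1.0], [1.5, 2.0], [1.0, 0.5], [8.0, 8.0], [9.0, 9.5], [8.5, 9.0]]
assignments, centroids = k_means(points, k=2)
print(assignments)
print(centroids)
```

Because k-means assigns points to the nearest centroid, it tends to find roughly spherical clusters; density-based methods are usually the better fit when clusters have arbitrary shapes.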
OUTPUT :
LO3 : Implement appropriate data mining methods such as classification, clustering, or association
mining on large datasets using an open-source tool such as WEKA.
LO4 : Implement various data mining algorithms from scratch using a language such as Python or Java.