Professional Documents
Culture Documents
But some times users may not know what kinds of patterns in their data
may be interesting
Because some patterns may not hold for all of the data in the database, a measure of
certainty or “trustworthiness” is usually associated with each discovered pattern.
Data Mining Functionalities
Data mining functionalities, and the kinds of patterns they can discover, are:
Cluster Analysis
Outlier Analysis
Concept/Class Description
Data can be associated with classes or concepts.
Class : A collection of things sharing a common attribute
For example, to study the characteristics of software products whose sales increased
by 10% in the last year, the data related to such products can be collected by
executing an SQL query.
Examples include pie charts, bar charts, curves, multidimensional data cubes,
and multidimensional tables.
The result could be a general profile of the customers, such as they are 40–50
years old, employed, and have excellent credit ratings.
The system should allow users to drill down on any dimension, such as on
occupation in order to view these customers according to their type of
employment.
Concept/Class Description
The target and contrasting classes can be specified by the user, and the
corresponding data objects retrieved through database queries.
For example, the user may like to compare the general features of software
products whose sales increased by 10% in the last year with those whose sales
decreased by at least 30% during the same period.
Concept/Class Description
60% of the customers who infrequently buy such products are either seniors
or youths, and have no university degree.
Use the model to predict the class of objects whose class label is
unknown
Example:
Classify a large set of items in the store, based on three kinds of
responses to a sales campaign: good response, mild response, and no
response.
Derive a model for each of these three classes based on the descriptive
features of the items, such as price, brand, place made, type, and
category.
IF-THEN rules:
Classification and Prediction
Example:
Decision tree:
Predict the amount of revenue that each item will generate during an
upcoming sale at AllElectronics, based on previous sales data.
Cluster Analysis
A database may contain data objects that do not comply with the
general behaviour or model of the data.
These data objects are outliers. Most data mining methods discard
outliers as noise or exceptions.
Some applications such as fraud detection, the rare events can be more
interesting than the more regularly occurring ones.