
Mining Various Kinds of Association Rules


1. Mining Multilevel Association Rules

For many applications, it is difficult to find strong associations among data items at low or primitive
levels of abstraction due to the sparsity of data at those levels. Strong associations discovered at high
levels of abstraction may represent commonsense knowledge.

Therefore, data mining systems should provide capabilities for mining association rules at multiple levels
of abstraction, with sufficient flexibility for easy traversal among different abstraction spaces.

Let’s examine the following example.

Mining multilevel association rules: Suppose we are given the task-relevant set of transactional data shown in the table for sales at an All Electronics store, listing the items purchased in each transaction.

The concept hierarchy for the items is shown in Figure. A concept hierarchy defines a sequence of
mappings from a set of low-level concepts to higher level, more general concepts. Data can be
generalized by replacing low-level concepts within the data by their higher-level concepts, or ancestors,
from a concept hierarchy.
Association rules generated from mining data at multiple levels of abstraction are called multiple-level
or multilevel association rules. Multilevel association rules can be mined efficiently using concept
hierarchies under a support-confidence framework. In general, a top-down strategy is employed: at
each level, any algorithm for discovering frequent itemsets may be used, such as Apriori or its variations.

Using uniform minimum support for all levels (referred to as uniform support): The same minimum
support threshold is used when mining at each level of abstraction. For example, in Figure 5.11, a
minimum support threshold of 5% is used throughout (e.g., for mining from “computer” down to “laptop
computer”). Both “computer” and “laptop computer” are found to be frequent, while “desktop
computer” is not.

When a uniform minimum support threshold is used, the search procedure is simplified. The method is
also simple in that users are required to specify only one minimum support threshold. An Apriori-like
optimization technique can be adopted, based on the knowledge that an ancestor is a superset of its
descendants: The search avoids examining itemsets containing any item whose ancestors do not have
minimum support.
Using reduced minimum support at lower levels (referred to as reduced support): Each level of
abstraction has its own minimum support threshold. The deeper the level of abstraction, the smaller the
corresponding threshold is. For example, in Figure, the minimum support thresholds for levels 1 and 2
are 5% and 3%, respectively. In this way, “computer,” “laptop computer,” and “desktop computer” are
all considered frequent.
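The uniform- and reduced-support strategies above can be sketched in a few lines of Python. The concept hierarchy, toy transactions, and thresholds below are invented for illustration; only the level-wise logic mirrors the text.

```python
from collections import Counter

# Illustrative two-level concept hierarchy (leaf item -> ancestor);
# items, transactions, and thresholds are invented, not from the text.
parent = {"laptop computer": "computer", "desktop computer": "computer",
          "inkjet printer": "printer", "laser printer": "printer"}

transactions = [
    {"laptop computer"}, {"laptop computer"}, {"laptop computer"},
    {"laptop computer"}, {"desktop computer"}, {"desktop computer"},
    {"desktop computer"}, {"inkjet printer"}, {"inkjet printer"},
    {"laser printer"},
]

# Reduced support: the deeper the level, the smaller the threshold.
min_sup = {1: 0.5, 2: 0.3}  # level 1 = generalized items, level 2 = leaf items

def frequent_at_level(level):
    """Generalize each transaction to the given level, then count support."""
    counts = Counter()
    for t in transactions:
        counts.update({parent[i] for i in t} if level == 1 else t)
    return {i for i, c in counts.items() if c / len(transactions) >= min_sup[level]}

level1 = frequent_at_level(1)
# Apriori-like pruning: examine only leaf items whose ancestor is frequent.
level2 = {i for i in frequent_at_level(2) if parent[i] in level1}
```

With uniform support, one would simply set both thresholds to the same value; here the reduced level-2 threshold lets "desktop computer" (support 0.3) qualify as frequent alongside "laptop computer".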

Using item or group-based minimum support (referred to as group-based support):

Because users or experts often have insight as to which groups are more important than others, it is
sometimes more desirable to set up user-specific, item-based, or group-based minimum support thresholds
when mining multilevel rules. For example, a user could set the minimum support thresholds based
on product price, or on items of interest, such as setting particularly low support thresholds for
laptop computers and flash drives in order to pay particular attention to association patterns
containing items in these categories.

2. Mining Multidimensional Association Rules from Relational Databases and Data Warehouses

We have studied association rules that imply a single predicate, that is, the predicate buys. For instance,
in mining our All Electronics database, we may discover the Boolean association rule

Following the terminology used in multidimensional databases, we refer to each distinct predicate in a
rule as a dimension. Hence, we can refer to the rule above as a single-dimensional or intradimensional
association rule because it contains a single distinct predicate (e.g., buys) with multiple occurrences (i.e.,
the predicate occurs more than once within the rule). As we have seen in the previous sections of this
chapter, such rules are commonly mined from transactional data.

Considering each database attribute or warehouse dimension as a predicate, we can therefore mine
association rules containing multiple predicates, such as

Association rules that involve two or more dimensions or predicates can be referred to as
multidimensional association rules. The rule above contains three predicates (age, occupation, and buys),
each of which occurs only once in the rule. Hence, we say that it has no repeated predicates.
Multidimensional association rules with no repeated predicates are called interdimensional association
rules. We can also mine multidimensional association rules with repeated predicates, which contain
multiple occurrences of some predicates. These rules are called hybrid-dimensional association rules. An
example of such a rule is the following, where the predicate buys is repeated:
Note that database attributes can be categorical or quantitative. Categorical attributes have a finite
number of possible values, with no ordering among the values (e.g., occupation, brand, color).
Categorical attributes are also called nominal attributes, because their values are "names of things."
Quantitative attributes are numeric and have an implicit ordering among values (e.g., age, income, and
price). Techniques for mining multidimensional association rules can be categorized into two basic
approaches regarding the treatment of quantitative attributes.
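As a sketch, the support and confidence of one interdimensional rule, such as age ∧ occupation ⇒ buys, can be counted directly over relational tuples. The table, attribute values, and the rule itself are invented for illustration.

```python
# Toy customer relation; each row assigns a value to each predicate/dimension.
customers = [
    {"age": "20..29", "occupation": "student", "buys": "laptop"},
    {"age": "20..29", "occupation": "student", "buys": "laptop"},
    {"age": "20..29", "occupation": "student", "buys": "printer"},
    {"age": "30..39", "occupation": "teacher", "buys": "laptop"},
    {"age": "20..29", "occupation": "teacher", "buys": "camera"},
]

# Interdimensional rule: age="20..29" AND occupation="student" => buys="laptop"
antecedent = {"age": "20..29", "occupation": "student"}
consequent = {"buys": "laptop"}

def matches(row, preds):
    """True if the tuple satisfies every predicate-value pair."""
    return all(row[attr] == val for attr, val in preds.items())

n = len(customers)
both = sum(matches(r, {**antecedent, **consequent}) for r in customers)
ante = sum(matches(r, antecedent) for r in customers)

support = both / n        # fraction of tuples satisfying the whole rule
confidence = both / ante  # of tuples matching the antecedent, how many buy
```

Each distinct attribute plays the role of one dimension; no predicate repeats, so the rule is interdimensional in the sense defined above.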
CONSTRAINT-BASED ASSOCIATION RULES:

 A data mining process may uncover thousands of rules from a given set of data, most of which
end up being unrelated or uninteresting to the users.

 Often, users have a good sense of which “direction” of mining may lead to interesting patterns
and the “form” of the patterns or rules they would like to find.

 Thus, a good heuristic is to have the users specify such intuition or expectations as constraints to
confine the search space.

 This strategy is known as constraint-based mining.

 Constraint-based mining provides

o User Flexibility: provides constraints on what to be mined.

o System Optimization: explores constraints to help efficient mining.

 The constraints can include the following:

o Knowledge type constraints: These specify the type of knowledge to be mined, such as
association or correlation.

o Data constraints: These specify the set of task-relevant data.

o Dimension/level constraints: These specify the desired dimensions (or attributes) of the
data, or levels of the concept hierarchies, to be used in mining.

o Interestingness constraints: These specify thresholds on statistical measures of rule
interestingness, such as support, confidence, and correlation.

o Rule constraints: These specify the form of rules to be mined. Such constraints may be
expressed as rule templates, as the maximum or minimum numbers of predicates that
can occur in the rule antecedent or consequent, or as relationships among attributes,
attribute values, and/or aggregates.

The above constraints can be specified using a high-level declarative data mining query language and
user interface.

Constraint-based association rules: In order to make the mining process more efficient, rule-based
constraint mining:

 allows users to describe the rules that they would like to uncover.

 provides a sophisticated mining query optimizer that can be used to exploit the constraints
specified by the user.

 encourages interactive exploratory mining and analysis.

Constrained frequent pattern mining: Query optimization approach

 Given a frequent pattern mining query with a set of constraints C, the algorithm should be:

o Sound: it only finds frequent sets that satisfy the given constraints C.
o Complete: all frequent sets satisfying the given constraints are found.

 A naïve solution:

o Find all frequent sets and then test them for constraint satisfaction.

 More efficient approaches:

o Analyze the properties of constraints comprehensively.

o Push them as deeply as possible inside the frequent pattern computation.
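The naive solution and the "push constraints deep" approach can be contrasted on a small example. The prices, transactions, and the constraint sum(price) ≤ 70 are all invented; the point is only that such an anti-monotone constraint lets a candidate be discarded before its support is ever counted, since no superset of a violating itemset can satisfy it.

```python
from itertools import combinations

# Invented prices and transactions. The rule constraint sum(price) <= 70 is
# anti-monotone: once an itemset violates it, every superset violates it too.
price = {"flash drive": 10, "mouse": 20, "keyboard": 40, "monitor": 150}
transactions = [
    {"flash drive", "mouse"}, {"flash drive", "mouse"},
    {"flash drive", "keyboard"}, {"mouse", "keyboard"}, {"monitor"},
]
MIN_COUNT, BUDGET = 2, 70
items = sorted(price)

def support(itemset):
    """Number of transactions containing the itemset."""
    return sum(itemset <= t for t in transactions)

# Naive solution: find all frequent itemsets first, then test the constraint.
all_frequent = [set(c) for k in (1, 2) for c in combinations(items, k)
                if support(set(c)) >= MIN_COUNT]
naive = [s for s in all_frequent if sum(price[i] for i in s) <= BUDGET]

# Pushed deep: discard violating candidates before counting their support.
pushed = []
for k in (1, 2):
    for c in combinations(items, k):
        s = set(c)
        if sum(price[i] for i in s) > BUDGET:
            continue  # pruned without scanning the database
        if support(s) >= MIN_COUNT:
            pushed.append(s)
```

Both versions are sound and complete (they return the same answer set); the pushed version simply does less support counting, which is the efficiency gain the query-optimization approach aims for.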

There are two forms of data analysis that can be used to extract models describing important data
classes or to predict future data trends. These two forms are as follows −

 Classification

 Prediction

Classification models predict categorical class labels; and prediction models predict continuous valued
functions. For example, we can build a classification model to categorize bank loan applications as either
safe or risky, or a prediction model to predict the expenditures in dollars of potential customers on
computer equipment given their income and occupation.

What is classification?

Following are the examples of cases where the data analysis task is Classification −

 A bank loan officer wants to analyze the data in order to know which customers (loan applicants)
are risky and which are safe.

 A marketing manager at a company needs to analyze whether a customer with a given profile
will buy a new computer.

In both of the above examples, a model or classifier is constructed to predict the categorical labels.
These labels are risky or safe for loan application data and yes or no for marketing data.

What is prediction?

Following are the examples of cases where the data analysis task is Prediction −

Suppose the marketing manager needs to predict how much a given customer will spend during a sale at
his company. In this example we are asked to predict a numeric value. Therefore the data analysis
task is an example of numeric prediction. In this case, a model or a predictor will be constructed that
predicts a continuous-valued function, or ordered value.

Note − Regression analysis is a statistical methodology that is most often used for numeric prediction.
How Does Classification Work?

With the help of the bank loan application that we have discussed above, let us understand the working
of classification. The Data Classification process includes two steps −

 Building the Classifier or Model

 Using Classifier for Classification

Building the Classifier or Model

 This step is the learning step or the learning phase.

 In this step the classification algorithms build the classifier.

 The classifier is built from the training set made up of database tuples and their associated class
labels.

 Each tuple that constitutes the training set belongs to a predefined class or category. These
tuples can also be referred to as samples, objects, or data points.

Using Classifier for Classification

In this step, the classifier is used for classification. Here the test data is used to estimate the accuracy of
classification rules. The classification rules can be applied to the new data tuples if the accuracy is
considered acceptable.
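The two-step process can be sketched with a deliberately simple classifier — here a 1-nearest-neighbour rule, chosen only for brevity; it stands in for whatever algorithm the learning step actually uses. The loan tuples and features are invented.

```python
# Step 1 (learning): the training set -- tuples with known class labels.
# Invented loan data: (income in $1000s, years employed) -> class label.
train = [
    ((60, 5), "safe"), ((75, 8), "safe"), ((80, 10), "safe"),
    ((20, 1), "risky"), ((30, 2), "risky"), ((15, 0), "risky"),
]

def classify(x):
    """The learned model: return the label of the nearest training tuple."""
    nearest = min(train,
                  key=lambda tv: sum((a - b) ** 2 for a, b in zip(tv[0], x)))
    return nearest[1]

# Step 2 (classification): use held-out test tuples to estimate accuracy.
test = [((70, 6), "safe"), ((25, 1), "risky"), ((65, 7), "safe")]
accuracy = sum(classify(x) == label for x, label in test) / len(test)
# If the accuracy is considered acceptable, apply classify() to new tuples.
```

Note that the test tuples are kept separate from the training set; estimating accuracy on the training data itself would be overly optimistic.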
Classification and Prediction Issues

The major issue is preparing the data for Classification and Prediction. Preparing the data involves the
following activities −

 Data Cleaning − Data cleaning involves removing the noise and treating missing values. The
noise is removed by applying smoothing techniques, and the problem of missing values is solved
by replacing a missing value with the most commonly occurring value for that attribute.

 Relevance Analysis − The database may also have irrelevant attributes. Correlation analysis is
used to know whether any two given attributes are related.

 Data Transformation and Reduction − The data can be transformed by any of the following
methods.

o Normalization − The data is transformed using normalization. Normalization involves
scaling all values for a given attribute so that they fall within a small specified range.
Normalization is used when the learning step involves neural networks or methods
based on distance measurements.

o Generalization − The data can also be transformed by generalizing it to a higher-level
concept. For this purpose we can use concept hierarchies.

Note − Data can also be reduced by some other methods such as wavelet transformation, binning,
histogram analysis, and clustering.
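Two of the preparation steps above — replacing a missing value with the attribute's most common value, and min-max normalization into a small range — can be sketched as follows; the attribute values are invented.

```python
from collections import Counter

# Invented attribute column with one missing value (None).
ages = [25, 40, None, 40, 55]

# Data cleaning: replace the missing value with the attribute's mode
# (its most commonly occurring value).
mode = Counter(a for a in ages if a is not None).most_common(1)[0][0]
cleaned = [mode if a is None else a for a in ages]

# Normalization: scale all values for the attribute into the range [0, 1].
lo, hi = min(cleaned), max(cleaned)
normalized = [(a - lo) / (hi - lo) for a in cleaned]
```

The minimum value maps to 0.0 and the maximum to 1.0, which keeps attributes with large raw ranges from dominating distance-based methods in the learning step.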

Comparison of Classification and Prediction Methods

Here are the criteria for comparing the methods of Classification and Prediction −
 Accuracy − Accuracy of the classifier refers to its ability to predict the class label correctly;
accuracy of the predictor refers to how well a given predictor can guess the value of the
predicted attribute for new data.

 Speed − This refers to the computational cost in generating and using the classifier or predictor.

 Robustness − It refers to the ability of classifier or predictor to make correct predictions from
given noisy data.

 Scalability − Scalability refers to the ability to construct the classifier or predictor efficiently,
given a large amount of data.

 Interpretability − It refers to the extent to which the classifier or predictor can be understood.

Formal Classification and prediction Definition:

 Classification and prediction are two forms of data analysis that can be used to extract models
describing important data classes or to predict future data trends.

 Such analysis can help to provide us with a better understanding of the data at large.

 Classification predicts categorical (discrete, unordered) labels; prediction models
continuous-valued functions.

Let’s Understand Classification a morsel more:

 The goal of data classification is to organize and categorize data in distinct classes.

 A model is first created based on the data distribution.

 The model is then used to classify new data.

 Given the model, a class can be predicted for new data.

 Generally speaking, classification is for discrete and nominal values.

Let’s Understand Prediction a morsel more:

 The goal of prediction is to forecast or deduce the value of an attribute based on values of other
attributes.

 A model is first created based on the data distribution.

 The model is then used to predict future or unknown values.


Summarization of Classification and Prediction:

 If forecasting discrete value ( Classification )

 If forecasting continuous value ( Prediction )

Understanding Classification and prediction in Data Aspirant way:

Classification:

 Suppose from your past data (train data) you come to know that your best friend likes the above
movies.

 Now a new movie (test data) is released, and you want to know whether your best friend will
like it or not.

 If you are confident that your friend will like the movie, you can take your friend to it this
weekend.

 If you clearly observe the problem, it is simply whether your friend will like the movie or not.

 Finding a solution to this type of problem is called classification. This is because we are
classifying things into the classes they belong to (yes or no, like or dislike).

 Keep in mind that here we are forecasting a discrete value (classification), and this
classification belongs to supervised learning.

 This is because you are learning this from your train data.

 Mostly, classification is binary classification, in which we have to predict whether the output
belongs to class 1 or class 2 (class 1: yes, class 2: no).

 We can use classification for predicting more classes too, e.g., colors: RED, GREEN, BLUE,
YELLOW, ORANGE.

Prediction:

 Suppose from your past data (train data) you come to know that your best friend liked the above
movies, and you also know how many times your friend watched each particular movie.

 Now a new movie (test data) is released, just as above, and you want to find out how many
times your friend will watch this newly released movie: 5 times, 6 times, 10 times, anything.

 If you clearly observe the problem, it is about finding a count; we can also describe this as
predicting a value.

 Keep in mind that here we are forecasting a continuous value (prediction), and this prediction
also belongs to supervised learning.

 This is because you are learning this from your train data.
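The watch-count example is numeric prediction, so a regression model fits it. Below is a minimal least-squares sketch in which the past ratings and watch counts are invented; a real predictor could use any regression method.

```python
# Invented train data: (friend's rating of a movie, times the friend watched it).
past = [(2, 1), (4, 3), (6, 5), (8, 7)]

# Fit a least-squares line y = slope * x + intercept through the past data.
n = len(past)
sx = sum(x for x, _ in past)
sy = sum(y for _, y in past)
sxx = sum(x * x for x, _ in past)
sxy = sum(x * y for x, y in past)

slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
intercept = (sy - slope * sx) / n

def predict(rating):
    """Continuous-valued estimate of how many times the movie will be watched."""
    return slope * rating + intercept

# The new movie (test data) gets a numeric estimate rather than a class label.
estimate = predict(9)
```

Contrast this with the classification example: the output is a continuous value (a count estimate), not a discrete class, which is exactly the distinction the summary above draws.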
