
Associative Classification: Common Research Challenges

Neda Abdelhamid
Information Technology, Auckland Institute of Studies, Auckland, New Zealand
nedah@ais.ac.nz

Ahmad Abdul Jabbar
E-business Department, Canadian University of Dubai, Dubai, UAE
121200344@students.cud.ac.ae

Fadi Thabtah
Applied Business, Nelson Marlborough Institute of Technology, Auckland, New Zealand
Fadi.Fayez@nmit.ac.nz

Abstract. Association rule mining discovers concealed correlations among variables, often from sales transactions, to help managers make key business decisions involving item shelving, sales and planning. In the last decade, association rule mining methods have been employed to derive rules from classification datasets in different business domains. This has resulted in the emergence of a new classification approach called Associative Classification (AC), which often produces more predictive classifiers than classic approaches such as decision trees, greedy methods and rule induction. Nevertheless, AC suffers from noticeable challenges, some inherited from association rule mining and others arising from the classifier-building phase. These challenges include, but are not limited to, the massive number of candidate ruleitems found, the very large classifiers derived, the inability to handle multi-label datasets, and the design of rule pruning, ranking and prediction procedures. This article highlights and critically analyzes common challenges faced by AC algorithms that remain unresolved. It thereby opens the door for interested researchers to investigate these challenges further, with the aim of enhancing the overall performance of this approach and increasing its applicability in research domains.

Keywords: Associative Classification Challenges, Data Mining, Prediction, Rules


1. Introduction
Data mining is a research field combining other scientific domains, primarily computer science (Artificial Intelligence (AI), databases, search methods) and mathematics (statistics and probability) (Abdelhamid, et al., 2015). There are many definitions of data mining; for example, (Thabtah, et al., 2015) defined it as producing new useful patterns from datasets using intelligent tools. We define data mining as a research field concerned with revealing unseen knowledge from scattered data, in a user-preferred form, for a specific use.
In market basket analysis (Agrawal and Srikant, 1994), as in large retail stores such as Walmart or Asda, a huge number of transactions is executed at the different cash registers by customers. These transactions contain valuable information that management can exploit in making primary decisions related to product shelving, seasonal sales, and marketing promotions. Such information can be generated from transactional databases using association rule mining tools. Association rule mining is considered a descriptive task in which correlations among items sold are discovered as If-Then rules so that store managers can make use of them.
In recent years, association rule algorithms have been modified to treat data related to classification problems (Liu et al., 1998), shifting the aim from descriptive to predictive. This shift necessitates re-modelling the entire algorithm, since prediction steps, namely rule ranking and pruning, are added. In classification data such as medical diagnoses, the goal is not descriptive but predictive, e.g. to guess the "type of illness"; hence only rules having a class attribute value in their consequent are relevant. In other words, the General Practitioner (GP) is not interested in rules having the patient's features or symptoms in their consequent, which a typical association rule algorithm would normally produce on this application's data. These new requirements led to the appearance of AC (Li et al., 2001) (Thabtah et al., 2006) (Schmida, et al., 2015), which combines association rule and classification steps to produce classifiers. The role of association rule mining in AC is limited to learning rules that have a class value in their consequent (Class Association Rules - CARs) and discarding all other rules. Once CARs are derived, new phases including rule sorting, pruning and test data prediction are imposed.
Hereunder, we briefly list the main steps performed by an AC algorithm.
1) Pre-processing (optional): Discretisation of continuous attributes in the training dataset.
2) Rule induction: This step consists of two sub-steps:
• Discovering frequent ruleitems (items plus class; Definition 7 in Section 2.1): attribute values plus a class whose frequency is above a predefined user threshold named minimum support (minsupp) (defined in Section 2.1).
• Rule formation: frequent ruleitems whose confidence values are above a predefined user threshold called minimum confidence (minconf) (defined in Section 2.1).
3) Classifier construction: Choosing the most accurate rules after applying all candidate rules extracted in step 2 on the training data. This step also sorts the rules.
4) Test data class prediction: Predicting the class of test data using the classifier built in step 3. In this step, the performance of the classifier is also recorded.
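The four steps above can be condensed into a minimal, toy pipeline. The sketch below is illustrative only: the function names, the tuple-based rule representation, and the restriction to ruleitems of size one and two are our assumptions, not part of any published AC implementation.

```python
from collections import Counter
from itertools import combinations

def train_ac(rows, classes, minsupp, minconf):
    """Toy AC training: induce CARs passing minsupp/minconf (step 2),
    then sort them by confidence, support and generality (step 3)."""
    n = len(rows)
    counts = Counter()
    # Count every (attribute-value set, class) pair of size 1 or 2
    for attrs, c in zip(rows, classes):
        for size in (1, 2):
            for combo in combinations(sorted(attrs), size):
                counts[(combo, c)] += 1
    body_occ = Counter()  # occurrences of each antecedent regardless of class
    for (combo, c), k in counts.items():
        body_occ[combo] += k
    rules = [(combo, c, k / n, k / body_occ[combo])
             for (combo, c), k in counts.items()
             if k / n >= minsupp and k / body_occ[combo] >= minconf]
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules

def predict(rules, attrs, default):
    """Step 4 (simplest form): the first matching rule assigns the class."""
    for combo, c, _supp, _conf in rules:
        if set(combo) <= set(attrs):
            return c
    return default
```

Running `train_ac` on the dataset of Table 1 with minsupp = 30% and minconf = 50% yields exactly the five bold rules of Table 2.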
One of the main advantages of AC is its ability to discover new knowledge (If-Then rules) that other classification approaches are unable to find. The main reason for finding these rules is the learning methodology employed, which tests every single correlation between the attribute value(s) in the training dataset and the class value. These new rules have been shown to enhance the predictive power of the classifier; however, some of them can be redundant or conflicting, and if no appropriate pruning is invoked they can cause the exponential growth problem (Li et al., 2001). This problem usually happens when minsupp is set to a very small value. Another important advantage of AC is the simplicity of the output it generates, which contains simple rules. This enables the decision maker to easily understand, interpret and maintain the classifier.
A number of AC algorithms have been proposed in the last decade, including Classification based on Association (CBA) (Liu et al., 1998), Classification based on Multiple Association Rules (CMAR) (Li et al., 2001), Multiclass Multilabel Classification based on Association Rules (MMAC) (Thabtah et al., 2006), Classification based on Boosting Association Rules (CBAR) (Yoon and Lee, 2008), Looking at the Class AC (LCA) (Thabtah, et al., 2010), Multiclass Associative Classification (MAC) (Abdelhamid, et al., 2012b), and others, e.g. (Li and Zaiane, 2015) (Sasirekha and Punitha, 2015). These algorithms employ different methodologies for rule induction, rule sorting, rule pruning, and test data prediction.
The ultimate aim of this article is to shed light on the current primary challenges related to AC in data mining. We would like to direct interested researchers' attention to possible research areas related to AC, aiming to enhance the overall performance of AC methods and expand their applicability across domains. We have noticed that a large number of AC algorithms have been disseminated in the last decade, but few applications have utilized this data mining approach; this paper therefore shifts researchers' awareness to the vital issues around AC algorithm design and implementation.
The paper is structured as follows: Section 2 highlights the AC problem along with its relevant definitions, followed by the general solution scheme and a simple example. Section 3 is devoted to challenges related to AC and possible open research problems. Finally, conclusions are given in Section 4.

2. Associative Classification Approach


2.1 The Problem and Related Definitions
The AC problem has been defined in (Thabtah, et al., 2007) as follows: Given a training dataset D with n distinct attributes A1, A2, …, An and a list of classes C, where the number of cases in D is denoted |D|, an attribute may be categorical (taking a value from a known set of possible values) or continuous. For categorical attributes, all possible values are mapped to a set of positive integers; in the case of continuous attributes, any discretisation method can be applied. The goal is to construct a classifier from D, e.g. Cl : A → C, which can forecast the class of test cases, where A is the set of attribute values and C is the set of classes.
The majority of AC algorithms depend mainly on a threshold called minsupp, which represents the frequency of an attribute value and its associated class (AttributeValue, class) in the training dataset relative to the size of that dataset. Any attribute value plus its related class that passes minsupp is known as a frequent ruleitem, and when the frequent ruleitem comprises a single attribute, it is said to be a frequent 1-ruleitem. Another important threshold in AC is minconf, which can be defined as the frequency of the attribute value and its related class in the training dataset relative to the frequency of the attribute value alone in the training data. Hereunder are the main definitions related to AC:
Definition 1: An AttributeValue can be described as an attribute name Ai and its value ai, denoted (Ai, ai).
Definition 2: The jth row or training case in D can be described as a list of attribute values (Aj1, aj1), …, (Ajv, ajv), plus a class denoted by cj.
Definition 3: An AttributeValueSet can be described as a set of disjoint attribute values contained in a training case, denoted <(Ai1, ai1), …, (Aiv, aiv)>.
Definition 4: A ruleitem r is of the form <antecedent, c>, where antecedent is an AttributeValueSet and c ∈ C is a class.
Definition 5: The actual occurrence (actoccr) of a ruleitem r in D is the number of examples in D that match r's antecedent.
Definition 6: The support count (suppcount) of a ruleitem r is the number of examples in D that match r's antecedent and belong to class c.
Definition 7: A ruleitem r passes minsupp if suppcount(r)/|D| ≥ minsupp. Such a ruleitem is said to be a frequent ruleitem.
Definition 8: A ruleitem r passes the minconf threshold if suppcount(r)/actoccr(r) ≥ minconf.
Definition 9: A rule is represented as antecedent → c, where antecedent is an AttributeValueSet and the consequent is a class.

2.2 General Solution Scheme


The majority of AC algorithms operate in three steps: in step 1, rules are induced; in step 2, a classifier is built from the candidate rules discovered in step 1; and in step 3, the classifier is evaluated on test data. To show the process of rule induction and classifier construction, consider the training data displayed in Table 1, which has two attributes (Att1, Att2) and the class attribute (Class). For presentation purposes, minsupp and minconf are assumed to be set to 30% and 50%, respectively. A typical AC algorithm initially discovers all frequent ruleitems that hold enough support (Table 2). Once all frequent ruleitems are found, CBA transforms the subset of them that hold enough confidence into rules. The bold rows in Table 2 are the rules, from which the classifier is derived. A rule is inserted into the classifier if it covers a certain number of training examples; that is, a subset of the candidate rules is chosen to form the classifier, which in turn is evaluated on a test dataset to obtain its effectiveness.
Normally, AC algorithms discover frequent ruleitems by passing over the training dataset multiple times. In the initial scan, the set of frequent 1-ruleitems is derived, and each subsequent scan starts with the frequent ruleitems of size n discovered in the previous scan in order to derive the candidate frequent (n+1)-ruleitems, and so on. Once all frequent ruleitems are derived, the algorithm generates the set of candidate rules from the frequent ruleitems that pass the minconf threshold. Overall, the step of generating the frequent ruleitems is a hard task that requires excessive processing because of the possible ruleitem support counting in each iteration (Abdelhamid, et al., 2012b).

Table 1: Training dataset

Row # | Att1 | Att2 | Class
1     | a1   | b1   | c2
2     | a1   | b1   | c2
3     | a2   | b1   | c1
4     | a1   | b2   | c1
5     | a3   | b1   | c1
6     | a1   | b1   | c2
7     | a2   | b2   | c1
8     | a1   | b2   | c1
9     | a1   | b2   | c1
10    | a1   | b2   | c2

Table 2: Frequent ruleitems derived by CBA from Table 1 (bold rows denote rules)

Frequent attribute value | Support | Confidence
<a1>, c2                 | 40%     | 57.10%
<a1>, c1                 | 30%     | 42.85%
<b1>, c2                 | 30%     | 60%
<b2>, c1                 | 40%     | 80%
<a1,b1>, c2              | 30%     | 100%
<a1,b2>, c1              | 30%     | 75%
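The support and confidence figures of Table 2 can be reproduced directly from Definitions 5-8 with a few lines of Python. This is a sketch for the worked example only; the data encoding and the helper name `support_confidence` are ours, not from any published AC system.

```python
# Table 1 encoded as (Att1, Att2, Class) rows
DATA = [("a1", "b1", "c2"), ("a1", "b1", "c2"), ("a2", "b1", "c1"),
        ("a1", "b2", "c1"), ("a3", "b1", "c1"), ("a1", "b1", "c2"),
        ("a2", "b2", "c1"), ("a1", "b2", "c1"), ("a1", "b2", "c1"),
        ("a1", "b2", "c2")]

def support_confidence(antecedent, klass):
    """Support = suppcount / |D|; confidence = suppcount / actoccr
    (Definitions 5-8 in Section 2.1)."""
    actoccr = sum(1 for row in DATA if set(antecedent) <= set(row[:2]))
    suppcount = sum(1 for row in DATA
                    if set(antecedent) <= set(row[:2]) and row[2] == klass)
    return suppcount / len(DATA), suppcount / actoccr
```

For instance, `support_confidence(("a1",), "c2")` gives support 40% and confidence 4/7 ≈ 57.1%, matching the first row of Table 2.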
3. Common Research Challenges in Associative Classification
3.1 Multi-label Rule Generation
One challenge in AC is that most current algorithms are unable to generate all class labels associated with an attribute value in the dataset. Commonly, an AC algorithm finds only the highest-frequency class linked with the attribute value. Nevertheless, more than one class may be linked with the rule's body, with high frequency, in different rows of the training dataset, making the choice of only one class questionable. For instance, consider the attribute value <x1, x2> in a training dataset of 100 examples with two classes (c2, c3). Assume that <x1, x2> is connected with classes c2 and c3 10 and 9 times, respectively. A typical AC algorithm will induce a rule such as x1 ∧ x2 → c2 and not consider the rule x1 ∧ x2 → c3, since the attribute value <x1, x2> appeared with class c2 only one more time than with class c3. However, class c3 should be included in the rule rather than discarded, since it brings crucial information to the decision maker and has a large frequency. Favoring class c2 over c3 because of one additional training example is not justified. In fact, not generating all possible class labels for each rule can be seen as ignoring useful knowledge that may be important to the classifier's accuracy. The research question raised by this problem is: Would deriving additional useful knowledge (rules) from single-label data improve the predictive performance of the classifier?
Two promising research directions have been proposed in AC to handle the discovery of multi-label rules from single-label classification datasets. One solution, proposed in (Thabtah, et al., 2006), indicated that single-label classifiers can first be derived from the training dataset and then merged to form multi-label rules. The authors introduced a separate phase during rule induction, called recursive learning, that initially produces the first single-label classifier, removes all data examples covered by that classifier, and then produces the next classifier from the remaining uncovered training examples, and so forth. One obvious shortcoming of this approach is that classifiers are produced from parts of the training dataset rather than from the entire set of data examples at once. A recently modified version of MMAC called eMCAC was developed by (Abdelhamid, 2014). The author relaxed the recursive learning and was able to induce multi-label rules directly from the training dataset by keeping conflicting rules and merging them later while building the classifier. Experiments using UCI datasets, a trainer scheduling dataset and website phishing classification datasets showed competitive error-rate results when comparing eMCAC with MMAC and other rule induction algorithms.
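The merge step in the spirit of eMCAC can be illustrated as follows. This is a speculative sketch of the idea (keep conflicting rules, then fuse rules sharing an antecedent into one multi-label rule), not the published algorithm; the `tolerance` parameter, which decides when a runner-up class is frequent enough to keep, is our own assumption.

```python
from collections import defaultdict

def merge_multilabel(rules, tolerance=0.10):
    """rules: (antecedent, class, confidence) triples, possibly conflicting.
    Merge rules sharing an antecedent into one multi-label rule whenever the
    runner-up class's confidence is within `tolerance` of the top class's."""
    by_body = defaultdict(list)
    for body, klass, conf in rules:
        by_body[body].append((klass, conf))
    merged = []
    for body, cands in by_body.items():
        cands.sort(key=lambda x: -x[1])       # strongest class first
        top = cands[0][1]
        labels = [k for k, conf in cands if top - conf <= tolerance]
        merged.append((body, labels))
    return merged
```

On the example above, the rules <x1, x2> → c2 (10/19 confidence) and <x1, x2> → c3 (9/19) would be merged into a single rule carrying both labels, instead of discarding c3.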

3.2 Improving Classifier Performance


Several important issues related to enhancing the various steps performed in current AC algorithms have to be investigated:
• Cutting down the number of rules in the classifier without harming classification accuracy
• Enhancing the class assignment process
• Improving rule sorting
Hereunder we briefly discuss each issue.
A. Reducing the Classifier Size
One of the main problems associated with AC is that the classifiers derived by this type of algorithm are normally large. The main reason for extracting a huge number of rules is the rule induction mechanism inherited from association rule mining, in which every relationship between an attribute value and a class value is discovered. These massive rule sets create three possible obstacles:
• Very large classifiers that are not easy for decision makers to maintain
• The combinatorial explosion problem, where the available machine resources become unable to handle information processing during the different steps, causing the algorithm to crash
• The derivation of a number of redundant rules
Thus, rule pruning, particularly after producing the candidate rules and before constructing the classifier, with the aim of minimizing the classifier's size, is essential. The research question to be answered for this problem is: Can the number of rules in the classifier be reduced, and how?
A number of classification algorithms, e.g. MAC and LCA, have tackled this problem, yet the number of generated candidate rules is still massive. Pruning methods such as database coverage (Liu, et al., 1998), partial and full matching (Thabtah, et al., 2011), specific rule pruning (Li et al., 2001) and others have minimized the classifier size, yet the remaining number of rules is still uncontrollable. We believe that new approaches not based on traditional confidence and support counting can be a research direction for this persistent problem in AC. A possible encouraging direction is Causal AC (Yu et al., 2011), which restricts the rule induction phase to causality relationships rather than traditional statistical counting of items and their classes in the training dataset. Another interesting direction is producing dynamic classifiers that consider data examples overlapping among rules. (Qabajeh, et al., 2015) proposed a new rule induction methodology in which, whenever a rule is generated, all counters associated with the candidate rules still waiting to be generated are amended on the fly to reflect the removal of the generated rule's data examples. This ensures that classifiers are formed in real time with no data overlapping among rules.
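The database coverage heuristic mentioned above, introduced by CBA (Liu, et al., 1998), can be sketched as follows. This is a simplified rendering of the idea rather than the exact CBA procedure; the `min_cover` threshold and the tuple-based data layout are our assumptions.

```python
def database_coverage(sorted_rules, train_rows, min_cover=1):
    """sorted_rules: (antecedent, class) pairs in rank order.
    train_rows: ((attribute values), class) pairs.
    Keep a rule only if it correctly covers at least `min_cover` remaining
    training rows; rows covered by a kept rule are then removed."""
    remaining = set(range(len(train_rows)))
    classifier = []
    for body, klass in sorted_rules:
        covered = {i for i in remaining
                   if set(body) <= set(train_rows[i][0])
                   and train_rows[i][1] == klass}
        if len(covered) >= min_cover:
            classifier.append((body, klass))
            remaining -= covered          # each row supports at most one rule
        if not remaining:
            break
    return classifier
```

Rules that cover no remaining example are discarded, which is how this heuristic shrinks the classifier without (in principle) losing training coverage.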

B. Test Data Class Assignment


When a test case requires a class, most current AC algorithms, e.g. (Liu et al., 1998) (Thabtah et al., 2006), seek the first rule in the classifier that matches the test case and allocate its class to it. Nevertheless, more than one rule in the classifier may match the test case, which makes selecting just a single rule an inappropriate and unfair decision. Using all relevant rules matching the test case would be a more legitimate decision, simply because a) no single-rule preference occurs, and b) a larger set of rules is utilised. Hence, enhancing the prediction step in AC is a crucial task that necessitates careful design of the prediction method in order to maximize the chance of allocating the right class to the test case.
In the last few years, a number of prediction methods that aggregate scores using more than one rule to allocate the appropriate class to a test case have been proposed. One example is (Jabez, 2011), which utilises Chi-square hypothesis testing from statistics to compute a final score for a group of rules. The authors cluster the rules that match the test case into groups based on the class labels; a score for each group is then computed using Chi-square, and the class belonging to the group with the largest score is assigned to the test case. Another promising method, which uses both the rule's rank in the classifier and the cluster size (number of rules) together, was proposed in (Ayyat et al., 2014). This method showed some superiority in improving the predictive power of AC compared to previously developed methods based on a one-rule prediction decision. Finally, different prediction methods using scores aggregated from rules' confidence and support values were proposed in (Thabtah, et al., 2011). A comprehensive comparative section in that article showed the necessity of careful design of prediction methods that utilise multiple rules rather than a single rule for test data allocation.
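Group-based prediction of this kind can be illustrated by grouping all matching rules per class and aggregating their confidences. Note this is a simplified stand-in for the Chi-square scoring described above, using a plain confidence sum as the group score; the function name and scoring choice are ours.

```python
from collections import defaultdict

def predict_by_group(classifier, test_attrs, default_class):
    """classifier: (antecedent, class, confidence) triples.
    Group all rules matching the test case by class, sum their confidences,
    and return the class of the highest-scoring group."""
    scores = defaultdict(float)
    for body, klass, conf in classifier:
        if set(body) <= set(test_attrs):
            scores[klass] += conf
    if not scores:                     # no rule matches: fall back to default
        return default_class
    return max(scores, key=scores.get)
```

With the classifier below, a single-rule method sorted by confidence would consult only <b2> → c1, while the group method lets both c1 rules reinforce each other.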

C. Rule Sorting Evaluation


Usually, AC and rule induction classification approaches require discriminating among rules during the test data prediction phase. Therefore, choosing the appropriate rule sorting formula in AC is a critical task that may impact the selection of rules during the classification of test data, and thus the accuracy of the classifier. Rules with a high rank in the classifier are checked first when predicting test data; therefore, one wants to ensure that highly ranked rules have a strong positive influence on accuracy. In addition, multiple rules may tie on the sorting criteria (confidence, support, etc.), which sometimes forces the algorithm to perform a random selection. The research question that needs to be answered for rule sorting is:
Can we reduce random rule selection in the ranking process so as to end up with high-quality rules?
A comprehensive experimental study performed by (Abdelhamid, et al., 2012a) revealed that additional criteria are needed to discriminate among rules once the classifier is formed. In that study, different rule ranking methods were experimentally compared on a number of classification datasets. The authors showed that discriminating among rules using (confidence, support, rule length) is the best combination with respect to the accuracy of the resulting classifiers in AC. There is also a need to employ information theory approaches in rule ranking, such as Entropy and Correlation-based Feature Selection (CFS). These mathematical measures have shown consistency in the feature selection research domain and have been used in a wide range of applications.
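The (confidence, support, rule length) ordering found best in that study can be written as a sort key, with a stable sort supplying a deterministic fallback for full ties instead of random selection. A minimal sketch, assuming rules are stored as (antecedent, class, support, confidence) tuples:

```python
def rank_key(rule):
    """Higher confidence first, then higher support, then shorter antecedent
    (the more general rule)."""
    antecedent, _klass, support, confidence = rule
    return (-confidence, -support, len(antecedent))

def sort_rules(rules):
    # sorted() is stable, so rules tied on all three criteria keep their
    # original generation order rather than being selected at random
    return sorted(rules, key=rank_key)
```

Replacing the third component with an entropy- or CFS-based score would be one way to explore the information-theoretic ranking suggested above.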

3.3 Rule Discovery without Candidate Generation


One obvious problem inherited from association rule mining is the discovery of frequent items using the candidate generation function (Agrawal and Srikant, 1994). In this step, at iteration i, the AC algorithm must pass over the training dataset counting the frequency of frequent i-ruleitems (item, class) so that it can derive candidate ruleitems of size i+1. This process is highly computation-intensive, especially when the input dataset has high dimensionality, since the expected number of candidate ruleitems at any iteration is then massive (Yoon and Lee, 2008). One way to minimise the search space of candidate ruleitems is to use early pruning procedures such as closed itemsets, negative rules or tid-list intersection (Thabtah, et al., 2006) (Li and Zaiane, 2015). Nevertheless, even when these procedures are employed, the size of the candidate ruleitem set remains very large, which may cause the mining algorithm to crash.
We believe there is a need to develop a new learning mechanism in AC that does not rely on the candidate generation function inherited from association rule mining, thus making the mining process resource-effective with regard to training time, I/O overhead and memory use. One promising direction for reducing this issue is to integrate the rule discovery and rule evaluation steps in AC into a single step, similar to greedy algorithms such as PRISM, yet using the minimum support and minimum confidence thresholds as early pruning mechanisms. In this way, the process of finding the rules depends only on the confidence (the rule's expected accuracy): whenever a rule is discovered, it is inserted into the classifier and its covered instances are removed. Hence there is no need to evaluate the rule any further, as current AC algorithms do.
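The proposed integrated step might look like the following. This is a speculative sketch of the direction suggested above (PRISM-style greedy covering gated by minsupp and minconf, restricted to 1-ruleitems for brevity), not an existing algorithm; all names are illustrative.

```python
from collections import Counter

def greedy_induce(rows, minsupp, minconf):
    """rows: ((attribute values), class) pairs. Repeatedly pick the single
    best attribute-value/class pair passing both thresholds, insert it as a
    rule, and remove the rows it covers -- discovery and evaluation in one
    step, with no candidate generation.  Support is measured against the
    original dataset size as a simplification."""
    n = len(rows)
    remaining = list(rows)
    classifier = []
    while remaining:
        counts, occ = Counter(), Counter()
        for attrs, klass in remaining:
            for a in attrs:
                counts[(a, klass)] += 1
                occ[a] += 1
        best = None
        for (a, klass), k in counts.items():
            conf = k / occ[a]
            if k / n >= minsupp and conf >= minconf:
                if best is None or conf > best[2]:
                    best = (a, klass, conf)
        if best is None:               # nothing left passes both thresholds
            break
        a, klass, _ = best
        classifier.append((a, klass))
        remaining = [r for r in remaining if not (a in r[0] and r[1] == klass)]
    return classifier
```

On the Table 1 data with minsupp = 30% and minconf = 50%, this yields a two-rule classifier, <b2> → c1 followed by <a1> → c2, without ever materialising a candidate ruleitem set.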

3.4 Multi-label Data Applications


The traditional multi-label classification problem, e.g. (Tawiah and Sheng, 2013), involves input data where each training example may be linked with one or many classes. Typical applications are scene classification and text categorization, where an object in the training data may be associated with multiple class labels; for instance, a document can be categorized into both "sport" and "health". This means the classification problem involves training datasets with class overlapping, which makes the problem harder and fundamentally different from single-label classification. Most multi-label classification algorithms in the literature utilise a class membership function to measure the correlation between a test data example and the available class labels. For instance, to classify a test case, each available class in the training data is evaluated, and the class or set of classes that are relevant (binary relevance) are assigned to it. It is currently the firm belief of the authors that no AC algorithm handles multi-label classification datasets. There have been some attempts to use more than one rule to make a class prediction decision, but not yet to generate actual rules with more than one class from multi-label datasets. Since there are several multi-label data applications related to classification, there is a large demand for extending current AC algorithms to benefit such applications.

4. Conclusions

Association rule and classification are two major data mining tasks that have been studied extensively by
researchers in the last two decades. Using association rule mining in the training phase has resulted in a promising
classification approach in data mining called associative classification (AC). In the last decade, several AC
techniques have been proposed in the literature and applied on different application datasets including medical
diagnoses, website phishing classification, email classification and others. Still, there are many challenges
associated with current AC techniques that if investigated may improve the overall performance of this family of
algorithms. This paper has investigated possible challenges linked to AC algorithms that can be tackled by scholars
in data mining and machine learning communities. Vital problems such as extending current AC algorithms to
handle multi-label data, not relying on association rule candidate generation function, reducing the number of
candidate rules induced, minimising classifiers’ size are among important issues this paper highlighted. Moreover,
the test data prediction step as well as the need for new tie breaking criteria in rule ranking are other possible areas
that require deep investigation. We also directed researchers to possible starting research points in solving the
abovementioned problems. In near future, we are going to develop a new AC algorithm with no candidate
generation function hoping to resolve a major deficiency in AC which is the exponential growth of rules.

References
1. Abdelhamid N., Ayesh A., Thabtah F. (2015) Emerging Trends in Associative Classification Data Mining. International Journal of Electronics and Electrical Engineering, Vol. 3, No. 1, pp. 50-53, February 2015.
2. Abdelhamid N. (2014) Multi-label rules for phishing classification. Applied Computing and Informatics, 11 (1), 29-46.
3. Abdelhamid N., Ayesh A., Thabtah F., Ahmadi S., Hadi W. (2012b) MAC: A multiclass associative classification algorithm. Journal of Information and Knowledge Management (JIKM), 11 (2), pp. 1250011-1 - 1250011-10. WorldScinet.
4. Abdelhamid N., Ayesh A., Thabtah F. (2012a) An Experimental Study of Three Different Rule Ranking Formulas in Associative Classification Mining. Proceedings of the 7th IEEE International Conference for Internet Technology and Secured Transactions (ICITST-2012), pp. 795-800, UK.
5. Agrawal R., Srikant R. (1994) Fast algorithms for mining association rules. Proceedings of the 20th International Conference on Very Large Data Bases (VLDB), 487-499.
6. Ayyat S., Lu J., Thabtah F. (2014) Class Strength Prediction Method for Associative Classification. Proceedings of IMMM 2014, The Fourth International Conference on Advances in Information Mining and Management, pp. 5-10. Paris, France, July 2014.
7. Jabez C. (2011) A statistical approach for associative classification. European Journal of Scientific Research, Vol. 58, No. 2, 140-147.
8. Li J., Zaiane O. (2015) Associative Classification with Statistically Significant Positive and Negative Rules. Proceedings of the 24th ACM International Conference on Information and Knowledge Management (CIKM '15), pp. 633-642.
9. Li W., Han J., Pei J. (2001) CMAR: Accurate and efficient classification based on multiple class-association rules. Proceedings of the IEEE International Conference on Data Mining (ICDM), 369-376.
10. Liu B., Hsu W., Ma Y. (1998) Integrating classification and association rule mining. Proceedings of the Knowledge Discovery and Data Mining Conference (KDD), 80-86. New York.
11. Qabajeh I., Thabtah F., Chiclana F. (2015) A dynamic rule-induction method for classification in data mining. Journal of Management Analytics, Volume 2, Issue 3, pp. 233-253.
12. Sasirekha D., Punitha A. (2015) A Comprehensive Analysis on Associative Classification in Medical Datasets. Indian Journal of Science and Technology, Volume 8, Issue 33, December 2015.
13. Schmid M. R., Iqbal F., Fung B. (2015) E-mail authorship attribution using customized associative classification. Proceedings of the Fifteenth Annual DFRWS Conference, Volume 14, Supplement 1, August 2015, pp. S116-S126.
14. Thabtah F., Hammoud H., Abdel-Jaber H. (2015) Parallel Associative Classification Data Mining Frameworks Based on MapReduce. Parallel Processing Letters, Volume 25 (2).
15. Thabtah F., Hadi W., Abdelhamid N., Issa A. (2011) Prediction Phase in Associative Classification. Journal of Knowledge Engineering and Software Engineering, Volume 21, Issue 6, pp. 855-876. WorldScinet.
16. Thabtah F., Mahmood Q., McCluskey L., Abdel-Jaber H. (2010) A New Classification based on Association Algorithm. Journal of Information and Knowledge Management, Vol. 9, No. 1, 55-64. World Scientific.
17. Thabtah F. (2007) A review of associative classification mining. Knowledge Engineering Review, Vol. 22:1, 37-65. Cambridge University Press.
18. Thabtah F., Cowling P., Peng Y. (2006) Multiple Label Classification Rules Approach. Knowledge and Information Systems, Volume 9, 109-129. Springer.
19. Yoon Y., Lee G. (2008) Efficient implementation of associative classifiers for document classification. Information Processing & Management, 43(2): 393-405.
20. Yu K., Wu X., Ding W., Wang H. (2011) Causal Associative Classification. Proceedings of the 11th IEEE International Conference on Data Mining (ICDM '11), 914-923.
