Professional Documents
Culture Documents
1
Birla Institute of Technology, Mesra, Ranchi, India
2
Birla Institute of Technology, Mesra, Ranchi, India
* BIT, off-campus, A-7, Sector-1, Noida, Mesra, Ranchi, India. Email: pgupta@bitmesra.ac.in
Received on: October 9th, 2017 Accepted on: February 16th, 2018 Available online: May 1st, 2018
How to cite this article: P. Gupta and B. Bhushan-Sagar, “Determining Weighted, Utility-Based Time Variant Association
Rules using Frequent Pattern Tree”, Ing. Sol., vol. 14, no. 25 (Special issue), pp. 11 , May 2018. doi: https://doi.org/10.16925/
.v14i0.2228
Abstract
Introduction: The present research was conducted at Birla Institute of Technology, off Campus in Noida,
India, in 2017.
Methods: To assess the efficiency of the proposed approach for information mining a method and an
algorithm were proposed for mining time-variant weighted, utility-based association rules using fp-tree.
Results: A method is suggested to find association rules on time-oriented frequency-weighted, utili-
ty-based data, employing a hierarchy for pulling-out item-sets and establish their association.
Conclusions: The dimensions adopted while developing the approach compressed a large time-variant
dataset to a smaller data structure at the same time fp-tree was kept away from the repetitive dataset,
which finally gave us a noteworthy advantage in articulations of time and memory use.
Originality: In the current period, high utility recurrent-pattern pulling-out is one of the mainly notewor-
thy study areas in time-variant information mining due to its capability to account for the frequency rate
of item-sets and assorted utility rates of every item-set. This research contributes to maintain it at a
corresponding level, which ensures to avoid generating a big amount of candidate-sets, which ensures
further development of less execution time and search spaces.
Limitations: The research results demonstrated that the projected approach was efficient on tested
datasets with pre-defined weight and utility calculations.
Keywords: association rule, frequent pattern tree, information mining, time variant, weighted
transactions.
BY NC ND
BY NC ND
doi: https://doi.org/10.16925/.v14i0.2228 3 de 11
B t1, t2, t3 3 t1 B
Quarter-1 t2 D, B
C t2, t3 2 Quarter-1
D t1, t2, t4 3 t3 B
A t7 1 t4 D, A
B t5, t7 2 t5 B
E t5, t6, t8 3 t8 D
Step 6: Insert transactions into frequency-weighted Repeat subsequently after the insertion until
time variant utility frequent pattern tree as shown reaching the last transaction for quarter 1 (Fig. 2).
below. After processing the first transaction for the Similarly, after inserting the transactions for
first quarter into fp-tree (Fig.1). the second quarter, get the fp-tree for all the trans-
actions for all valid time periods as shown in Fig. 3.
Root
Valid Time Legacy Items
B
Quarter-1 For time period::Quarter-1
D
A
b: 0.044
Root
Valid Time Legacy Items
B For time period::Quarter-1
Quarter-1 D
A
d: 0.176
b: 0.088
b: 0.044
a: 0.022
Root
d: 0.176 d: 0.096
b: 0.048
b: 0.088
b: 0.044
a: 0.022
Step 7: Mine the frequency-weighted utility item Table 6. Test dataset description
sets from fp-tree (Fig. 3) based on conditional fre-
quent pattern methodology as shown in Table 5. Number of Number
Dataset Size transactions of distinct
After mining the time variant frequent item
(td) items (It)
sets, the association-rules as per required mini-
T10I4D100K 3.95mb 100,000 867
mum utility support value are obtained.
(Divided into
So here, only two frequent item sets (da and two terms)
db) have min_req_utility_support within its valid
Calamity 5.07mb 79,175 (Divided 472
time period. into two terms)
Thus the possible rules are:
Reference: the authors
800
sets with these three recurrent pattern-mining
600
algorithms i.e. projected time variant frequency
400
weighted utility algorithm, Efficient Tree Structures
200
for High Utility Pattern Mining in Incremental
0
Databases (ihup-twu algorithm) and standard fre- 0.1 0.2 0.4 0.5
quent pattern growth. These algorithms are used to utility_support
investigate ihup-twu algorithm and the standard Proposed approach ����-��� algorithm ��- Growth
frequent pattern tree-growth algorithm with our
projected approach for efficiently mining elevated Fig . 4a. Number of patterns (one-length) produced using
utility time-variant item sets; unusual outcomes different utility-support
are achieved by shifting the utility support values. Reference: the authors
Item Time period Conditional pattern base Conditional fp-tree Frequent item
A Quarter1 {d:0.022} {d:0.022}|a {da:0.022}
A Quarter2 ∅ ∅ ∅
B Quarter1 {d:0.044} {d:0.044}| b {db:0.044}
B Quarter2 ∅ ∅ ∅
D Quarter* ∅ ∅ ∅
1200 600
1000 500
No. of patterns
No. of patterns
800 400
600 300
400 200
200 100
0 0
0.1 0.2 0.4 0.5 1 2 3 4
utility_support
Length of patterns
Proposed approach ����-��� algorithm ��- Growth Proposed approach ����-��� algorithm ��- Growth
Fig . 4b. Number of patterns (two-lengths) produced using Fig . 5a. Number of patterns of varying length with utility-
different utility-support support=0.5
Reference: the authors Reference: the authors
600
500
1200 No. of patterns 400
1000
No. of patterns
300
800
200
600
100
400
0
200 1 2 3 4
0 Length of patterns
0.1 0.2 0.4 0.5
utility_support Proposed approach ����-��� algorithm ��- Growth
Proposed approach ����-��� algorithm ��- Growth Fig . 5b. Number of patterns of varying length with utility-
support=0.6
Fig . 4c. Number of Patterns (three-lengths) produced using Reference: the authors
different utility-support
Reference: the authors 600
500
No. of patterns
400
300
The test results show that the proposed 200
approach works better than the state-of-the-art 100
algorithms roughly in all cases. 0
1 2 3 4
References
7. Discussion
[1] R. Ivancsy and I. Vajk, “Fast discovery of fre-
The experimental results are obtained by varying quent itemsets: a cubic structure-based approach”,
the utility_support values on the calamity datasets Informatica, vol. 29, pp. 71-78, 2005 [Online].
Available: https://pdfs.semanticscholar.org/9356/
that show the proposed approach’s performance
284f90973ba254f2cea9ef5426773883da12.pdf
is capable in mining high weighted-utility item-
[2] R. Agrawal, T. Imielinski, and A. Swami, “Mining
sets. The performance is evaluated over different association rules between sets of items in large
utility_support values (between 0.1 to 0.5) and databases”, in Proceedings of the ACM sigmod in-
the equivalent generated number of patterns. Our ternational conference on management of data,
approach obtains better outcomes in comparison Washington, D.C, 1993, pp. 207-216. doi: https://
to standard fp-growth and ihup-twu algorithms doi.org/10.1145/170036.170072
by avoiding the repetitive item-sets. [3] R. Agrawal and R. Srikant, “Fast algorithms for mi-
The experimental results are taken by vary- ning association rules”, in Proceedings of the 20th
ing the utility_support values on the T10I4D100K international conference on very large databases,
datasets. The performance is studied by the corre- Santiago, Chile, 1994, pp. 487-499, [Online]. Avai-
lable: http://www.vldb.org/conf/1994/P487.pdf
sponding number of patterns generated with vary-
[4] W. Wang, J. Yang, and P. S. Yu, “Efficient mining of
ing utility_support, which shows that the projected
weighted association rules”, in Proceeding of the sixth
approach generates less in comparison to the other acm sigkdd international conference on knowledge
two algorithms. discovery and data mining, San Francisco, ca, usa,
Also, the projected approach reduces the over- 2000. doi: https://doi.org/10.1145/347090.347149
all runtime by generating less numbers of interme- [5] N. B. Nazar and R. Senthilkumar, “An online
diate candidate sets. approach for feature selection for classification in
This study will help the researcher to uncover big data”, Turkish Journal of Electrical Engineering &
the critical attribute weight and utility of item-sets Computer Sciences, vol. 25, pp. 163-171, 2017. DOI:
in transactional data, which subsequently reduces https://doi.org/10.3906/elk-1501-98
the generation of repetitive groups, which finally [6] J. Pillai, S. Soni, O. P. Vyas, and M. Muyeba,“A con-
gives us more concrete results and association rules. ceptual approach to temporal weighted itemset
utility mining”, International Journal of Computer
Applications, vol. 28, pp. 0887-0895, 2010. doi: ht-
tps://doi.org/10.5120/510-827
8. Conclusions [7] Yh. Kim, Wy. Kim, and Um. Kim, “Mining frequent
itemsets with normalized weight in continuous data
In the current study, we addressed an incipient streams”, Journal of Information Processing Systems,
mixed approach for the information mining pro- vol. 6, pp. 79-90, 2010. doi: https://doi.org/10.3745/
cess. The anticipated algorithm gives a proficient JIPS.2010.6.1.079
doi: https://doi.org/10.16925/.v14i0.2228 11 de 11
[8] B. Christian, “An implementation of the fp-growth Discovery and Data Mining, Lecture Notes in Com-
algorithm”, in Proceedings of the 1st international wor- puter Science, vol. 76, pp. 749-756, 2009. doi: https://
kshop on open source data mining: frequentpattern doi.org/10.1007/978-3-642-01307-2_76
miningimplementations, Chicago, Illinois, 2005, pp. [17] V. S. Tseng, C. J. Chu, and T. Liang, “An efficient al-
1-5. doi: https://doi.org/10.1145/1133905.1133907 gorithm for mining temporal high utility itemsets
[9] G. Gatuha and T. Jiang, “Smart frequent itemsets from data streams”, Journal of System and Software,-
mining algorithm based on fp-tree and diffset data vol. 81, no. 7, pp. 1105-1117, 2008. doi: https://doi.
structures”, Turkish Journal of Electrical Engineering org/10.1016/j.jss.2007.07.026
& Computer Sciences, vol. 25, pp. 2096-2107, 2017. [18] A. Ahmed, N. Makky, and Y. Taha, “Incre-
doi: http://doi.org/10.3906/elk-1602-113 mental mining of constrained association ru-
[10] V. S. Tseng, B. E. Shie, C. W. Wu, and P. S. Yu, “Effi- les”, in Proceeding of the First Siam Conference
cient algorithms for mining high utility itemsets on Data Mining, Atlanta, 2008. doi: https://doi.
from transactional databases”, ieee tkde, vol. 25, org/10.1137/1.9781611972719.24
pp. 1772–1786, 2013. doi: http://doi.org/10.1109/ [19] Y. Li, P Ning, X. S. Wang, and S. Jajodia, “Disco-
TKDE.2012.59 vering calendar based temporal association ru-
[11] V. Goyal and A. Sureka, “Efficient skyline item- les”, Data and Knowledge Engineering, vol. 44, pp.
sets mining”, in Proceedings of the eighth inter- 193-214, 2003. doi: https://doi.org/10.1016/S0169-
national C3S2E conference on computer science 023X(02)00135-0
&software engineering, Japan, 2015. doi: https://doi. [20] P. Gupta and B. B. Sagar, “Discovering Interesting
org/10.1145/2790798.2790816 Weighted Temporal Relationship Rules”, Interna-
[12] S. Cankurt and A. Subasi, “Tourism demand mode- tional Journal of Control Theory and Applications,
lling and forecasting using data mining techniques vol. 9, no. 19, pp. 9091-9099, 2016. doi: https://doi.
in multivariate time series: a case study in turkey”, org/10.17485/ijst/2016/v9i28/98455
Turkish Journal of Electrical Engineering & Compu- [21] J. M. Ale and G. H. Rossi, “An approach to disco-
ter Sciences, vol. 24, pp. 3388-3404, 2016. doi: ht- vering temporal association rules”, in ACM Sympo-
tps://doi.org/10.3906/elk-1311-134 sium on Applied Computing, Italy, 2000, pp. 294-300.
[13] J. Hu and A. Mojsilovic, “High utility pattern mining: doi: https://doi.org/10.1145/335603.335770
a method for discovery of high utility item sets”, Pa- [22] V. Mittal and I. Kashyap, “Empirical study of impact
ttern Recognition, vol. 40, pp. 3317-3324, 2007. doi: of various concept drifts in data stream mining me-
https://doi.org/10.1016/j.patcog.2007.02.003 thods”, International Journal of Intelligent Systems
[14] Y. C. Li, J. S. Yeh, and C. C. Chang, “Isolated items and Applications (ijisa), vol. 8, pp. 65-72, 2016. doi:
discarding strategy for discovering high utility https://doi.org/10.5815/ijisa.2016.12.08
itemsets”, Data & Knowledge Engineering, vol. 64, [23] W. Novitasari, A. Hermawan, Z. Abdullah, and T.
pp. 198-217, 2008. doi: https://doi.org/10.1016/j. Herawan, “A method of discovering interesting as-
datak.2007.06.009 sociation rules from student admission dataset”, In-
[15] H. Yao and H. J. Hamilton, “Mining itemset utilities ternational Journal of Software Engineering and Its
from transaction databases”, Data & Knowledge En- Applications,vol. 9, pp. 51-66, 2015. doi: https://doi.
gineering, vol. 59, pp. 603-626, 2006. doi: https://doi. org/10.14257/ijseia.2015.9.8.05
org/10.1016/j.datak.2005.10.004 [24] J. Wang, “Application of association rule mining algo-
[16] C. F. Ahmed, S. K. Tanbeer, B. S. Jeong, and Y. K. rithm in logistics information system design”, Jour-
Lee, “An efficient candidate pruning technique for nal of Software Engineering, vol. 11, pp. 217-22370,
high utility pattern mining”, Advances in Knowledge 2017. doi: https://doi.org/10.3923/jse.2017.217.223