Abstract - Recommendation systems play a major role in web mining and are applied in many domains such as e-commerce, e-government and e-library. The key challenge for a recommendation system is to make recommendations to users based on their interests, given a large number of visitors and a huge volume of information. Meeting this challenge requires a clustering algorithm that can handle the data. Hence, this research focuses on designing an effective clustering algorithm for e-commerce applications. A grey wolf optimization based fuzzy clustering algorithm is proposed as an efficient method for grouping users by interest, and it is compared with Fuzzy C Means (FCM) based Genetic Algorithm (GA), Entropy based FCM and Improved Genetic FCM (FCM-GA). The experimental results show that it performs better than the traditional algorithms and that clustering quality is improved at the same time.

Keywords - Recommender systems, E-commerce, Fuzzy C Means, Grey Wolf optimization, Fuzzy Clustering.

I. INTRODUCTION

The interaction between the algorithm and customer information must be fast; if it is delayed, the data will be lost because customer data is volatile.
It must work for long-standing customers who have thousands of purchases and ratings.
The recommendation process must run in real time to achieve the best recommendation system.

Fu et al. (1999) [2] discussed clustering web users based on their access patterns, that is, their browsing history on the web. Linden et al. (2013) [3] presented an industry report on recommendations in online shopping. Their recommendation algorithm works from the set of customer ratings and each user's purchase history, and it comes in two versions, namely collaborative filtering and cluster models. The algorithm bases its recommendations on the similarity between two customers, A and B, which they represented in a common vector format as indicated in equation 1.1. They also stated that collaborative filtering is computationally expensive.
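As an illustration of a vector-format similarity between two customers A and B, the minimal sketch below uses the cosine of the angle between their purchase vectors. This is an assumption made for illustration only: equation 1.1 is not reproduced in this text, and the function name `cosine_similarity` and the toy vectors are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length customer vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a customer with no purchases is treated as dissimilar
    return dot / (norm_a * norm_b)

# Hypothetical purchase vectors: one entry per catalogue item, 1 = purchased.
customer_a = [1, 0, 1, 1]
customer_b = [1, 1, 0, 1]
similarity = cosine_similarity(customer_a, customer_b)
```

Computing this for every pair of customers grows quadratically with the number of customers, which is one way to see why collaborative filtering is described as computationally expensive.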
Integrated Intelligent Research (IIR) International Journal of Data Mining Techniques and Applications
Volume: 06 Issue: 01 June 2017 Page No.12-16
ISSN: 2278-2419
II. LITERATURE SURVEY

Büchner and Mulvenna (1998) [6] described the combination of data mining techniques with internet data in order to support correct action in electronic commerce. The data considered are server data, web meta information, marketing data and marketing knowledge. Yuan and Cheng (2004) [7] presented clustering-based heterogeneous product recommendation in mobile marketing. The evolution of the contemporary similarity matrix is analyzed with a proposed design named Ontology Based Personalized Coupled Clustering (OBPC). Recommender systems for automatic product recommendation acquire customers' preferences and products. The products available online vary on a one-to-one basis with the price factor. Generally, a recommender system consists of content based and collaborative filtering (Sarwar et al 2000; Lawrence et al 2001) [8-10]. Content based recommendations are made by matching customer interests with product attributes. In a collaborative system, by contrast, product recommendations rely on the overlap of preference ratings among customers.

Table I: Comparison of Related Clustering Work

Sl. No | Citation | Methodology | Function
1 | Zhang and Zhou (2005) [13] | Latent Usage Information (LUI) Model for Clustering Web Transaction using Modified K-Means | Clustering and building user profiles
2 | Gupta and Shrivastava (2012) [18] | Improved Genetic Fuzzy C-Means Algorithm and Entropy based FCM | Used to cluster the user sessions and overcome the defects present in FCM
3 | Su et al (2010) [19] | New Fuzzy Clustering Algorithm | Used to initialize cluster centers and to reduce the dependence on the initial cluster centers
4 | Vellingiri and Pandian (2011) | Fuzzy possibilistic C-Means algorithm | Designed to predict user behavior
5 | Zhu et al (2009) | Generalized Fuzzy C-Means Clustering Algorithm | To test the clustering capability

Kim and Ahn (2008) [11] presented a K-Means clustering algorithm for a system in the online shopping market. It is presented as a novel clustering algorithm that uses genetic algorithms to segment the online shopping market effectively. The genetic algorithm addresses the NP-complete global optimization problem. They compared K-Means and self-organizing maps (SOM) (Kohonen and Somervuo 1998) [12]. The Latent Usage Information algorithm is discussed by Zhang (2005) [13]. The various approaches to clustering online user data are reviewed in Table I. Lang et al (2005) [14-15] studied the search behavior of online shoppers; the main contribution is tracking the search behavior of users. Ahmeda et al (2015) discussed consumer online shopping based on a classification algorithm.

III. PROBLEM DEFINITION

The survey shows that current clustering techniques do not address all the requirements concurrently. For example, real-time applications in e-commerce, e-government, etc. face many problems in meeting business requirements. GA and fuzzy clustering methods suffer from certain drawbacks due to the restriction placed on the sum of the membership values of a data point, and hence they face more issues across iterations. Managing data with a large number of dimensions and a large number of items becomes problematic because of time complexity. Therefore, a clustering algorithm is needed that reduces time complexity and minimizes the error.

IV. METHODOLOGY

A). The Fuzzy C-Means clustering (FCM)
Fuzzy C-Means clustering (FCM) is a simple and the most popular fuzzy clustering algorithm; its steps are explained in figure 1. The algorithm described by Bezdek et al (1984) [22] is considered. The FCM algorithm takes a set of unlabeled feature vectors and classifies them into C classes, where C is given by the user. It assumes that the number of clusters, C, is known a priori, and minimizes

$$ J_m(U, V) = \sum_{i=1}^{C} \sum_{k=1}^{N} u_{ik}^{m} \, \lVert x_k - v_i \rVert^2 \qquad (4.1) $$

with the constraint of:

$$ \sum_{i=1}^{C} u_{ik} = 1, \quad k = 1, \dots, N \qquad (4.2) $$

Here, m > 1 is known as the fuzzifier parameter. In this standard formulation of Bezdek et al [22], x_k is the k-th of the N feature vectors, v_i is the center of cluster i, and u_ik is the membership of x_k in cluster i.

The algorithm provides the fuzzy membership matrix 'u' and the fuzzy cluster center matrix 'v'. To apply the C-Means algorithm, the following steps should be considered.

Step 1: Initialize the clusters by choosing 'C' data points.
Step 2: Find the cluster center that is closest to each data point.
Step 3: Assign the data point to the corresponding cluster.
Step 4: Update the cluster center of each cluster using the mean of the data points which are assigned to that cluster.
Step 5: Terminate if there are no changes; else go to Step 2.

FCM has two main deficiencies: its inability to distinguish outliers from non-outliers by weighting the memberships, and the attraction of the centroids towards the outliers.

B). Entropy based FCM (EFCM)
EFCM was proposed by Vellingiri and Pandian (2011), who used information entropy to initialize the cluster centers and to determine the number of cluster centers. The main aim of this method is to reduce some errors and to improve the algorithm by introducing a weighting parameter; its steps are
discussed in figure 2. After assignment, the combination process applies the idea of merging: large chunks are first divided into small clusters, and the small clusters are then merged according to the merging conditions, which finally resolves the clustering of irregular datasets.

Step 1: In the algorithm, 'T' is the input data with 'N' data points, each of which has 'M' dimensions.
Step 2: Calculate the entropy for each Xi in T, for i = 1...N.
Step 3: Choose XiMin with the least entropy.
Step 4: Remove XiMin and the data points having similarity with XiMin greater than p from T.
Step 5: If T is not empty, go to Step 2.

C). Improved Genetic FCM Algorithm
The fuzzy C-Means clustering is taken as input to this algorithm, which follows several steps based on serial number coding, following the algorithm described by Bandyopadhyay et al (2007) [23]; it is shown in figure 3.

Initially, a random point is selected from each cluster to build a chromosome. For each chromosome, the fitness value is calculated using a new fitness function, as shown in figure 3. Based on this, genetic operators such as reproduction, crossover and mutation are applied to produce a new population.

Step 1: Set the number of clusters c, population size 'N', evolutionary algebra (number of generations) 'T', crossover probability Pc and mutation probability Pm.
Step 2: Set the serial numbers to the individuals of the cluster object.
Step 3: Randomly generate 'c x n' serial numbers and make chromosomes with c serial numbers each.
Step 4: Calculate the fitness function with the formula,

Step 1: Initialize the grey wolf population Xi (i = 1, 2... n).
Step 2: Initialize a, A, and C.
Step 3: Calculate the fitness of each search agent.
        Xα = the best search agent
        Xβ = the second best search agent
        Xδ = the third best search agent
Step 4: While (t < Max number of iterations), for each search agent update the position of the current search agent.
Step 5: Update a, A and C, and calculate the fitness value for all search agents.
Step 6: Update Xα, Xβ and Xδ.

The grey wolf optimizer (GWO) is selected to solve the web usage data clustering problem effectively; it is proposed for clustering web users, that is, organizing objects into groups. The advantage of the social hierarchy is that it helps GWO retain the best solutions, i.e., it keeps similar users in the same cluster or group across iterations. It also allows candidate solutions to locate the exact position of the groups. The flow diagram of the proposed clustering is shown in figure 5.

[Figure 5: flow diagram of the proposed clustering. Recoverable boxes: Start; Parameters initialization (size n, coefficient vectors A and C, maximum iterations, number of generation); Stop.]
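The grey wolf steps listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the position update averages three estimates derived from the three best agents (commonly called alpha, beta and delta in the GWO literature), `a` decreases linearly from 2 to 0, and the fitness function `clustering_cost` is a simple hard-clustering cost chosen only for the demo. The names `gwo_minimize` and `clustering_cost`, the parameter defaults and the toy data are all hypothetical.

```python
import random

def gwo_minimize(fitness, dim, lower, upper, n_wolves=20, max_iter=100, seed=0):
    """Minimise `fitness` over [lower, upper]^dim with a basic grey wolf optimizer."""
    rng = random.Random(seed)
    # Step 1: initialise the grey wolf population.
    wolves = [[rng.uniform(lower, upper) for _ in range(dim)]
              for _ in range(n_wolves)]
    # Step 3: the three best agents found so far, stored as (score, position).
    leaders = sorted(((fitness(w), list(w)) for w in wolves),
                     key=lambda pair: pair[0])[:3]
    history = [leaders[0][0]]
    for t in range(max_iter):
        # Steps 2 and 5: `a` shrinks linearly from 2 to 0; A and C are redrawn below.
        a = 2.0 - 2.0 * t / max_iter
        # Step 4: move every search agent towards the three leaders.
        for w in wolves:
            for d in range(dim):
                estimate = 0.0
                for _, leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    D = abs(C * leader[d] - w[d])
                    estimate += leader[d] - A * D
                # Average the three estimates and keep the agent inside the bounds.
                w[d] = min(upper, max(lower, estimate / 3.0))
        # Steps 5 and 6: re-evaluate all agents and refresh the three leaders.
        scored = [(fitness(w), list(w)) for w in wolves]
        leaders = sorted(leaders + scored, key=lambda pair: pair[0])[:3]
        history.append(leaders[0][0])
    return leaders[0][1], leaders[0][0], history

def clustering_cost(centers, data):
    """Demo fitness: sum of squared distances to the nearest 1-D center."""
    return sum(min((x - c) ** 2 for c in centers) for x in data)

# Toy 1-D data with two visible groups; each wolf encodes two candidate centers.
data = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]
best, score, history = gwo_minimize(lambda p: clustering_cost(p, data),
                                    dim=2, lower=0.0, upper=6.0)
```

Because the leaders are kept as the best agents found so far, the best score recorded in `history` never increases from one iteration to the next, which mirrors the remark that the social hierarchy helps GWO retain the best solutions.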
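For comparison, the baseline FCM iteration of Section IV-A can be sketched in the same style. The sketch assumes the standard alternating updates from the FCM literature (membership-weighted means for the centers, inverse-distance memberships normalised to satisfy constraint (4.2)); these update formulas are not quoted from this paper. The function name `fcm`, its defaults and the toy data are hypothetical.

```python
import random

def fcm(data, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Basic fuzzy C-means; u[k][i] is the membership of point k in cluster i."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Random initial memberships, normalised so each row sums to 1 (constraint 4.2).
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(max_iter):
        # Center update: mean of the data weighted by u^m.
        centers = []
        for i in range(c):
            weights = [u[k][i] ** m for k in range(n)]
            total = sum(weights)
            centers.append([sum(weights[k] * data[k][d] for k in range(n)) / total
                            for d in range(dim)])
        # Membership update from squared distances to the new centers.
        new_u = []
        for k in range(n):
            d2 = [sum((data[k][d] - centers[i][d]) ** 2 for d in range(dim))
                  for i in range(c)]
            if min(d2) == 0.0:
                # The point coincides with one or more centers: share membership there.
                hits = [i for i in range(c) if d2[i] == 0.0]
                row = [1.0 / len(hits) if i in hits else 0.0 for i in range(c)]
            else:
                inv = [dist ** (-1.0 / (m - 1.0)) for dist in d2]
                total = sum(inv)
                row = [v / total for v in inv]
            new_u.append(row)
        change = max(abs(new_u[k][i] - u[k][i])
                     for k in range(n) for i in range(c))
        u = new_u
        if change < tol:
            break  # memberships have stopped changing
    return u, centers

# Toy 2-D data; the two-cluster setting is chosen only for the demo.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
memberships, centers = fcm(points, c=2)
```

Unlike the hard assignment in Steps 2 and 3 of Section IV-A, every point here keeps a graded membership in every cluster, and the fuzzifier m controls how soft those memberships are.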