
Final Minor Project Presentation

Online Shopper's Purchasing Intention Prediction
Overview

Retail shopping is continuing to shift to e-commerce shopping and, as a
result, the dynamics of shopping are changing around the world. E-commerce
has already become a major form of the retail market. Online customers
often browse pages of e-commerce sites before they place orders or abandon
their browsing without purchase.
Objective

● The project aims to use the information customers leave behind, in the
form of browsing-history traces or user information, when they visit an
online shopping site.
● With the help of this information, it aims to predict online
shoppers' purchasing intention by using clickstream and
session information data.
● It also aims to compare the accuracy of different algorithms.
DATA POINTS

Feature                  Description

Administrative           Number of administrative pages the user visited.
Administrative_Duration  Amount of time spent on administrative pages.
Informational            Number of informational pages the user visited.
Informational_Duration   Amount of time spent on informational pages.
ProductRelated           Number of product-related pages the user visited.
ProductRelated_Duration  Amount of time spent on product-related pages.
BounceRates              Percentage of visitors who enter the website through
                         that page and exit without triggering any additional
                         tasks.
ExitRates                Percentage of pageviews on the website that end at
                         that specific page.
Month                    Month in which the pageview occurred, in string form.
OperatingSystems         Integer representing the operating system the user
                         was on when viewing the page.
Browser                  Integer representing the browser the user was using
                         to view the page.
Region                   Integer representing the region the user is located
                         in.
TrafficType              Integer representing the type of traffic the user is
                         categorized into.
VisitorType              String representing whether a visitor is New Visitor,
                         Returning Visitor, or Other.
Weekend                  Boolean representing whether the session is on a
                         weekend.
Revenue                  Boolean representing whether or not the user
                         completed the purchase.
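As an illustration of how one session with the fields above could be turned into a numeric feature vector, here is a minimal sketch. The function name encode_session, the three-letter month labels and the choice to pass the integer-coded columns through unchanged are assumptions made for this example, not part of the project.

```python
# Columns that are already numeric in the table above.
NUMERIC_COLUMNS = [
    "Administrative", "Administrative_Duration",
    "Informational", "Informational_Duration",
    "ProductRelated", "ProductRelated_Duration",
    "BounceRates", "ExitRates",
    "OperatingSystems", "Browser", "Region", "TrafficType",
]

# Assumption: months are stored as three-letter abbreviations.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Visitor categories as listed in the table above.
VISITOR_TYPES = ["New Visitor", "Returning Visitor", "Other"]


def encode_session(session):
    """Convert one session record (a dict) into (features, label)."""
    features = [float(session[name]) for name in NUMERIC_COLUMNS]
    # Month string -> month number 1..12.
    features.append(float(MONTHS.index(session["Month"]) + 1))
    # VisitorType -> one-hot vector with one slot per category.
    features.extend(
        1.0 if session["VisitorType"] == kind else 0.0
        for kind in VISITOR_TYPES
    )
    # Booleans -> 0/1.
    features.append(1.0 if session["Weekend"] else 0.0)
    label = 1 if session["Revenue"] else 0
    return features, label


# Made-up session, for illustration only.
example = {name: 0 for name in NUMERIC_COLUMNS}
example.update({"ProductRelated": 12, "Month": "Nov",
                "VisitorType": "Returning Visitor",
                "Weekend": False, "Revenue": True})
features, label = encode_session(example)
```

The resulting vector has 17 entries: 12 numeric columns, the month number, three visitor-type indicators and the weekend flag. Revenue is kept separate as the target to predict.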
SELECTION OF AN ALGORITHM

1. Random Forest
2. Logistic Regression
3. Artificial Neural Network
Random Forest
As the name suggests, "Random Forest is a classifier
that contains a number of decision trees on various
subsets of the given dataset and takes the average to
improve the predictive accuracy of that dataset." A
greater number of trees in the forest generally leads to
higher accuracy and helps reduce the problem of overfitting.

[Diagram: working of the Random Forest algorithm]
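The idea of training many trees on different subsets and combining their outputs can be sketched in a few lines. This is a toy illustration, not the model used in the project: each "tree" is a hypothetical one-threshold rule on a single numeric feature, the subsets are bootstrap samples, and the outputs are combined by majority vote, which is the usual way to aggregate for classification. All function names here are assumptions.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement: the subset one tree sees."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows):
    """Stand-in for a decision tree: the best single-threshold rule.

    rows are (value, label) pairs with labels 0 or 1. Every value in
    the sample is tried as a threshold, in both orientations.
    """
    best = None
    for threshold, _ in rows:
        for above_label in (0, 1):
            correct = sum(
                1 for value, label in rows
                if (above_label if value > threshold
                    else 1 - above_label) == label
            )
            if best is None or correct > best[0]:
                best = (correct, threshold, above_label)
    _, threshold, above_label = best

    def predict(value):
        return above_label if value > threshold else 1 - above_label

    return predict


def fit_forest(rows, n_trees, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, rng)) for _ in range(n_trees)]


def forest_predict(forest, value):
    """Majority vote over the individual tree predictions."""
    votes = Counter(tree(value) for tree in forest)
    return votes.most_common(1)[0][0]


# Made-up data: small values belong to class 0, large values to class 1.
rows = [(0.0, 0), (1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1), (10.0, 1)]
forest = fit_forest(rows, n_trees=25)
```

Using an odd number of trees avoids tied votes in the two-class case. A real random forest would also grow deeper trees and sample a random subset of features at each split.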
Logistic Regression

Logistic regression is a supervised classification algorithm. In a
classification problem, the target variable (or output), y, can take only
discrete values for a given set of features (or inputs), X.
Logistic regression builds a regression model to predict
the probability that a given data entry belongs
to the category numbered as "1".
Logistic regression models the data using the
sigmoid function.
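To make the role of the sigmoid concrete, the sketch below computes the probability of category "1" from a weighted sum of the features. The weights and bias are hypothetical placeholders; in practice they would be learned from the training data.

```python
import math


def sigmoid(z):
    """Map any real number to (0, 1); written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, features):
    """Probability that a data entry belongs to category 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict_label(weights, bias, features, threshold=0.5):
    """Turn the probability into a discrete class, 0 or 1."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0
```

A weighted sum of zero gives a probability of exactly 0.5, larger sums push the probability toward 1, and smaller sums push it toward 0.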
Artificial Neural Network

An Artificial Neural Network is a model in the field of artificial
intelligence that attempts to mimic the network of neurons that makes up a
human brain, so that computers can understand things and make decisions in
a human-like manner.
An Artificial Neural Network can be best represented as a weighted
directed graph, where the artificial neurons form the nodes. The
association between the neuron outputs and neuron inputs can be
viewed as the directed edges with weights. The Artificial Neural
Network receives the input signal from an external source, such as a
pattern or an image, in the form of a vector. These inputs are then
denoted mathematically by x(n) for each of the n inputs.
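The weighted-graph view can be sketched as a forward pass: each neuron sums its weighted inputs, adds a bias and applies an activation. The layer sizes, the sigmoid activation and the weight values below are assumptions chosen for illustration, not the architecture used in the project.

```python
import math


def sigmoid(z):
    """Activation function, written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def layer_forward(inputs, weights, biases):
    """Compute one layer's outputs.

    weights[j][i] is the weight on the directed edge from input i
    to neuron j; biases[j] is the bias of neuron j.
    """
    return [
        sigmoid(bias + sum(w * x for w, x in zip(row, inputs)))
        for row, bias in zip(weights, biases)
    ]


def network_forward(inputs, layers):
    """Feed the input vector through each (weights, biases) layer."""
    activations = inputs
    for weights, biases in layers:
        activations = layer_forward(activations, weights, biases)
    return activations


# Made-up network: 3 inputs -> 2 hidden neurons -> 1 output neuron.
layers = [
    ([[0.2, -0.4, 0.1], [0.5, 0.3, -0.2]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
output = network_forward([1.0, 0.5, -1.5], layers)
```

The single output value lies between 0 and 1, so it can be read as the predicted probability of a purchase once the weights have been trained.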
RESULTS
Customer behavior was predicted by analysing the data customers leave
behind in the form of navigational history, which is captured through
clickstream.
1. RANDOM FOREST

2. LOGISTIC REGRESSION

3. ARTIFICIAL NEURAL NETWORK


ACCURACY TABLE

Algorithm                    Accuracy (%)

Random Forest                90.3
Logistic Regression          86
Artificial Neural Network    88.77
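For reference, accuracy here is assumed to mean the percentage of sessions whose predicted label matches the true label; the slides do not state the evaluation protocol. The sketch below shows that calculation and a simple comparison of the reported figures. The function name is illustrative.

```python
def accuracy_percent(y_true, y_pred):
    """Percentage of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("need two non-empty label lists of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return 100.0 * correct / len(y_true)


# Accuracies as reported in the table above.
reported = {
    "Random Forest": 90.3,
    "Logistic Regression": 86.0,
    "Artificial Neural Network": 88.77,
}

# Algorithm with the highest reported accuracy.
best = max(reported, key=reported.get)
```

With these figures, the comparison selects Random Forest.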


CONCLUSION

It can be seen that, of Random Forest (90.3%), Logistic Regression (86%)
and Artificial Neural Network (88.77%), Random Forest has the
highest accuracy in predicting the intention of the shopper.
FUTURE SCOPE
The customer purchase intention prediction can be combined with the e-commerce
website's product recommendation system. Further work can be done to see
whether recommending products based on customer intention has an impact on
increasing sales.
Thank you
