on
Fall 2021
Submitted by: Group C10
Rahul Surve - 10475549 , Adit Ghanekar - 20007066, Jemin Patel - 10474454
Table of Contents
Abstract
Introduction
Executive Summary
Hypothesis
Data Description
Exploratory Data Analysis
Model and Regression Method
Correlation Analysis
Variable Selection
Final Model Description
Discussions and Limitations
Model Overview
Final Model Evaluation
Final Model Report
Model Summary
Conclusion
References
Abstract
This project presents a way of answering a real-life question that
every company faces in digital marketing. Digital advertising is a
multi-billion-dollar industry that grows rapidly each year. The
project model predicts whether a given person in the dataset will
click on an ad link on a particular site, based on parameters
related to the person or the ad.
Introduction
Executive Summary
In the past few years, digital marketing has overtaken traditional
marketing. While traditional marketing brings in a trickle of
business, digital marketing offers a productive and cost-effective
way to reach a larger audience. Consequently, companies have been
choosing to advertise their products on websites and social media
platforms. For any advertising agency, it is very important to
identify the most profitable users, that is, those most likely to
respond to targeted advertisements. By predicting the click-through
rate, an advertising company can select the visitors who are most
likely to respond to the ads, analyze their browsing history, and
show the most relevant ads based on each user's interests. This task
matters to every advertising agency because the commercial value of
promotions on the Internet depends on how users respond to them. A
user's response to ads is therefore very valuable to every ad
company, because it allows the company to select the ads that are
most relevant to its users.
Hypothesis
Data Description
Variable Name             Data Type  Description
Daily Time Spent on Site  Float      Time spent on the site by each customer, in minutes
Exploratory Data Analysis
2. Area Income versus Age
3. Daily Time Spent on Site vs Daily Internet Usage
In this cluster plot, we can see two major groups of points that
we can target. The first has daily internet usage of 100 to 125
minutes and daily time spent on the specific site of 40 to 60
minutes. The second has daily internet usage of 200 to 250
minutes and time spent on our site of 70 to 90 minutes.
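The two groups described above can be isolated with simple range filters. The following is a minimal sketch, not the report's own code: the column names follow the variable names used in this report, while the sample rows are hypothetical values made up purely for illustration.

```python
import pandas as pd

# Hypothetical sample rows (assumption): illustrative values only,
# not taken from the project's dataset.
df = pd.DataFrame({
    "Daily Time Spent on Site": [45.0, 55.2, 80.1, 85.0, 65.0, 72.5],
    "Daily Internet Usage": [110.0, 120.5, 230.0, 245.3, 150.0, 180.0],
})

time_on_site = df["Daily Time Spent on Site"]
usage = df["Daily Internet Usage"]

# Group 1: internet usage of 100-125 mins and time on site of 40-60 mins.
group_low = df[usage.between(100, 125) & time_on_site.between(40, 60)]

# Group 2: internet usage of 200-250 mins and time on site of 70-90 mins.
group_high = df[usage.between(200, 250) & time_on_site.between(70, 90)]

print(len(group_low), len(group_high))
```

Note that `Series.between` includes both endpoints by default, so a user with exactly 125 minutes of daily internet usage falls into the first group.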
Model and Regression Method
Correlation Analysis
Variable Selection
regression model, we predicted the labels for the test set, which
is 30% of the total data, and compared them against y_test. We
then generated the classification report using the sklearn.metrics
package in Python. From the classification report, we obtained
the precision, the recall, the f1-score and, ultimately, the
accuracy of the model. This classification report summarizes the
performance of the model.
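The workflow described above (a 70/30 split, a logistic regression fit, and a classification report) can be sketched as follows. This is an illustration under stated assumptions rather than the project's actual code: the data is synthetic, the two features and the labelling rule are hypothetical, and `random_state` and `max_iter` are arbitrary choices.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): the report's dataset is not reproduced here.
rng = np.random.default_rng(0)
n = 1000
daily_time_on_site = rng.uniform(30, 95, n)      # minutes
daily_internet_usage = rng.uniform(100, 270, n)  # minutes
X = np.column_stack([daily_time_on_site, daily_internet_usage])

# Hypothetical labelling rule with noise, used only to obtain two classes.
score = -0.08 * (daily_time_on_site - 62) - 0.03 * (daily_internet_usage - 185)
y = (score + rng.normal(0, 0.5, n) > 0).astype(int)

# Hold out 30% of the rows as the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42
)

# Fit on the training rows, then predict labels for the test features.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Precision, recall, f1-score and accuracy, comparing y_test with y_pred.
report = classification_report(y_test, y_pred)
print(report)
```

The predictions come from the test features (`X_test`); the true labels (`y_test`) are used only to score those predictions.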
Final Model Evaluation
As a result, we can say that the model performed well, with high
accuracy, precision, recall and f1-score.
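For reference, the four metrics named above follow directly from the counts in a binary confusion matrix. The sketch below uses a hypothetical helper, `classification_metrics`, and made-up counts; neither comes from the project's results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # share of all predictions that are correct
    precision = tp / (tp + fp)                  # share of predicted clicks that were real clicks
    recall = tp / (tp + fn)                     # share of real clicks that were predicted
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts (assumption), chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=40, fp=10, fn=10, tn=40)
print(metrics)
```

This simple version assumes at least one positive prediction and at least one positive label; otherwise the precision or recall denominator would be zero.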
Final Model Report
Model Summary
Conclusion
After implementing the logistic regression model on the data
set, we conclude that we should target the audience in the age
group between their 30s and 40s, as they spend the most time on
the website and have a higher chance of clicking on the ad. We
can also target people in their 20s and 30s with an income of
50k to 70k, who spend more time on the website and have a higher
probability of clicking the advertisement. We can also see that
people with daily internet usage of 100 to 125 mins and 200 to
250 mins spend the most time on the website and hence have a
high probability of clicking on the ads.
References
[1] H. Brendan McMahan, “Ad Click Prediction: a View from the
Trenches”
[2] Matthew Richardson, Ewa Dominowska, Robert Ragno,
“Predicting clicks: estimating the click-through rate for new ads”
[3] Azin Ashkan, Charles L.A. Clarke, Eugene Agichtein, Qi Guo,
“Estimating Ad Clickthrough Rate through Query Intent Analysis”
[4] Rohit Kumar, Sneha Manjunath Naik, Vani D Naik, Smita
Shiralli, Sunil V.G, Moula Husain, “Predicting clicks: CTR
estimation of advertisements using Logistic Regression
classifier”
[5] Kuk Lida Lee, Gary M. Ingersoll, “An Introduction to Logistic
Regression Analysis and Reporting”