
COMSATS UNIVERSITY ISLAMABAD

WAH CAMPUS
(Project Report)
Submitted by:
Mian M. Shoaib Mehboob (FA18-BSE-7B-008)
M. Kawish Feroz (FA18-BSE-7B-073)
Fahad Ali (FA18-BSE-7B-076)
Submitted To:
Dr. Hikmat Ullah Khan
Submission date:
20th December, 2021
Project Title:
Sentiment Analysis of Amazon Product Reviews using Python
1. Introduction + Aim:
Sentiment analysis can play a vital role in almost any industry today. Automatically classifying
tweets, Facebook comments, or product reviews can save considerable time and money compared
with reading them manually, and it can reduce the likelihood of human error.
Our aim is to perform sentiment analysis on Amazon product reviews. We scrape the product data
of two selected products and then scrape the reviews of these products. We use Python and a
few Python libraries to perform the scraping and to apply the sentiment analysis algorithms.

2. Selected Algorithm:
We use a supervised learning algorithm for classification, specifically the Naive Bayes
classifier.

Classification:
Classification is the process of predicting the class of given data points. Classes are sometimes
called targets, labels, or categories. Classification predictive modeling is the task of
approximating a mapping function (f) from input variables (X) to discrete output variables (y).

Naive Bayes
Naive Bayes classifiers are a collection of classification algorithms based on Bayes’ Theorem.
Naive Bayes is not a single algorithm but a family of algorithms that share a common principle:
every pair of features is assumed to be independent of each other, given the class.
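
As a brief illustration of this shared principle, Bayes’ Theorem combined with the independence assumption can be written as follows. The notation is introduced here for illustration only: y is the class and x_1, ..., x_n are the features of one data point.

```latex
P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y),
\qquad
\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

The classifier picks the class with the largest product of the class prior and the per-feature likelihoods.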

Advantages:
• It is simple and easy to implement
• It requires relatively little training data
• It handles both continuous and discrete data
• It is highly scalable with the number of predictors and data points
• It is fast and can be used to make real-time predictions
• It is relatively insensitive to irrelevant features
Disadvantages:
• Naive Bayes assumes that all predictors (or features) are independent, which rarely holds in
real life. This can limit the applicability of the algorithm in real-world use cases.
• The algorithm faces the ‘zero-frequency problem’: it assigns zero probability to a
categorical value that appears in the test data set but was not present in the training
data set. A smoothing technique, such as Laplace smoothing, is needed to overcome this issue.
• Its probability estimates can be poorly calibrated, so its probability outputs should not be
taken at face value.
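
The zero-frequency problem and the smoothing fix can be sketched in a few lines of plain Python. This is a minimal multinomial Naive Bayes written for illustration, not the code used in this project. The function names, the alpha parameter, and the toy training reviews are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model on tokenised documents.

    alpha is the Laplace (add-one) smoothing constant.
    """
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in zip(documents, labels):
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return {
        "class_counts": class_counts,
        "word_counts": word_counts,
        "vocabulary": vocabulary,
        "alpha": alpha,
        "total_docs": len(labels),
    }


def predict(model, tokens):
    """Return the class with the highest smoothed log-posterior."""
    vocab_size = len(model["vocabulary"])
    best_label, best_score = None, -math.inf
    for label, doc_count in model["class_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(doc_count / model["total_docs"])  # log prior
        for token in tokens:
            if token not in model["vocabulary"]:
                continue  # ignore words never seen in training
            # Smoothing keeps a (word, class) pair that never occurred
            # in training from contributing a zero probability.
            score += math.log(
                (counts[token] + model["alpha"])
                / (total_words + model["alpha"] * vocab_size)
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for illustration only.
train_docs = [
    "great phone love it".split(),
    "excellent camera great battery".split(),
    "bad battery poor quality".split(),
    "terrible phone waste of money".split(),
]
train_labels = ["positive", "positive", "negative", "negative"]

model = train_naive_bayes(train_docs, train_labels)
print(predict(model, "great camera".split()))
print(predict(model, "poor battery".split()))
```

The word "poor" never appears in a positive training review, so without smoothing the positive class would receive a probability of exactly zero for any review containing it. With alpha set to 1, the count is treated as one instead of zero and both classes remain comparable.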
3. Dataset:
We scraped the reviews of two products from the Amazon website using Python.
Data Source:
Product 1 link:
https://www.amazon.in/OnePlus-Mirror-Black-128GB-Storage/product-reviews/B07DJHV6VZ/ref=cm_cr_dp_d_show_all_btm?ie=UTF8&reviewerType=all_reviews%27
Product 2 link:
https://www.amazon.com/adidas-Mens-Questar-White-Black/dp/B08779MB6H/ref=sr_1_42?keywords=shoes&qid=1639316391&s=fashion-mens-intl-ship&sr=1-42&th=1
Statistics Table:

Statistics table of Product 1


Statistics table of Product 2

4. EDA (Exploratory Data Analysis):


Graphs for Product 1

Bar chart of Positive and Negative reviews for Product 1


Box Plot of Positive and Negative reviews for Product 1

Scatter Plot of Positive and Negative reviews for Product 1


Graphs for Product 2

Bar chart of Positive and Negative reviews for Product 2

Box Plot of Positive and Negative reviews for Product 2

Scatter Plot of Positive and Negative reviews for Product 2

5. Experimental Setup:

Import the required libraries
To perform sentiment analysis, we imported the libraries shown in the code below:

Importing Reviews dataset


Find the polarity of a single review:

Drop Null values from dataset
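
A minimal sketch of this step, using only the Python standard library so that it runs without extra packages. The column names and sample rows are hypothetical. With pandas, the same step is usually a single dropna() call on the DataFrame.

```python
import csv
import io

# Stand-in for the scraped reviews file; the column names and rows
# are hypothetical.
raw_csv = io.StringIO(
    "title,review\n"
    "Good phone,Battery lasts all day\n"
    "No text,\n"
    "Average,   \n"
    "Bad,Stopped working after a week\n"
)

rows = list(csv.DictReader(raw_csv))

# Keep only rows whose review text is present and not just whitespace.
cleaned = [row for row in rows if row["review"] and row["review"].strip()]

print(len(rows), "rows before,", len(cleaned), "rows after")
```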

Import the libraries and functions needed for sentiment scoring:

Find positive, negative and total values of reviews:
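
One possible reading of this step is a lexicon-based count: the number of positive words, the number of negative words, and a net total for each review. The sketch below illustrates that idea under stated assumptions. The word lists are tiny hand-made examples, score_review is a hypothetical helper, and defining the total as positive minus negative is a choice made for this example.

```python
import re

# Tiny hand-made word lists, for illustration only; a real analysis
# would rely on a full sentiment lexicon.
POSITIVE_WORDS = {"good", "great", "excellent", "love", "amazing"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "hate", "worst"}


def score_review(text):
    """Count positive and negative words and return a net total."""
    tokens = re.findall(r"[a-z']+", text.lower())
    positive = sum(token in POSITIVE_WORDS for token in tokens)
    negative = sum(token in NEGATIVE_WORDS for token in tokens)
    return {
        "positive": positive,
        "negative": negative,
        "total": positive - negative,
    }


print(score_review("Great camera and good battery, but poor speaker."))
```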

Merge table with data set


Convert values into positive and negative Sentiments:

Ranking reviews (positive to negative)

Find the mean sentiment values of the reviews:
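
The last three steps (labelling, ranking, and averaging) can be sketched together as follows. The review texts and scores are invented for illustration, and treating a score of zero as negative is an arbitrary assumption of this sketch.

```python
from statistics import mean

# Hypothetical per-review net scores (positive minus negative counts).
reviews = [
    {"review": "Great camera, love it", "total": 3},
    {"review": "Poor battery, terrible support", "total": -2},
    {"review": "Good value", "total": 1},
    {"review": "Worst purchase", "total": -1},
]

# Convert each numeric score into a sentiment label.
# Treating a score of zero as negative is an arbitrary choice here.
for row in reviews:
    row["sentiment"] = "positive" if row["total"] > 0 else "negative"

# Rank reviews from most positive to most negative.
ranked = sorted(reviews, key=lambda row: row["total"], reverse=True)

# Average sentiment score across all reviews.
average_score = mean(row["total"] for row in reviews)

print([row["review"] for row in ranked])
print("mean score:", average_score)
```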


6. Results:
Detailed Values:

Sentiment Values of Product 1

Sentiment Values of Product 2

Averaged Values:

Average Sentiment Values of Product 1


Average Sentiment Values of Product 2

Final Result:

7. References:
• https://www.kaggle.com/holfyuen/tutorial-scatter-plots-in-python
• https://www.youtube.com/watch?v=AnvrJNLKp0k&t=659s
• https://www.youtube.com/watch?v=O_B7XLfx0ic
• https://theappsolutions.com/blog/development/sentiment-analysis/
• https://www.geeksforgeeks.org/naive-bayes-classifiers/

8. Appendix:
Scraping using the BeautifulSoup library:
Import Link and Check Response

Target Data to Scrape

Data Cleaning
Scrape Reviews
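
The extraction idea can be sketched without network access or third-party packages. The example below swaps in the standard library's html.parser in place of BeautifulSoup and parses an inline HTML snippet. The data-hook="review-body" attribute is an assumption about the page markup, and ReviewExtractor is a hypothetical helper. With BeautifulSoup, the equivalent selection would typically be a find_all call on span tags with that attribute.

```python
from html.parser import HTMLParser


class ReviewExtractor(HTMLParser):
    """Collect the text of every <span data-hook="review-body">."""

    def __init__(self):
        super().__init__()
        self.reviews = []
        self._depth = 0  # greater than 0 while inside a review-body span
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if self._depth:
            if tag == "span":
                self._depth += 1  # nested span inside the review body
        elif tag == "span" and ("data-hook", "review-body") in attrs:
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if self._depth and tag == "span":
            self._depth -= 1
            if self._depth == 0:
                # Collapse runs of whitespace left over from the markup.
                text = " ".join("".join(self._buffer).split())
                self.reviews.append(text)

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


# Inline sample standing in for a downloaded review page.
sample_html = """
<div class="review">
  <span data-hook="review-body"><span>  Battery lasts
  all day. </span></span>
</div>
<div class="review">
  <span data-hook="review-body"><span>Stopped working after a week.</span></span>
</div>
"""

parser = ReviewExtractor()
parser.feed(sample_html)
print(parser.reviews)
```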
Store Data into a Table

Convert the table into a CSV file (readable in Excel) and save the file to the device
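
A minimal sketch of the export step using the standard csv module. The records and field names are hypothetical, and the example writes to an in-memory buffer so that it runs anywhere.

```python
import csv
import io

# Hypothetical scraped records; the field names are assumptions.
records = [
    {"title": "Good phone", "rating": "5", "review": "Battery lasts all day."},
    {"title": "Bad", "rating": "1", "review": "Stopped working, sadly."},
]

# Write to an in-memory buffer here; replace it with
# open("reviews.csv", "w", newline="", encoding="utf-8") to save to disk.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["title", "rating", "review"])
writer.writeheader()
writer.writerows(records)

csv_text = buffer.getvalue()
print(csv_text)
```

The csv module quotes the second review automatically because it contains a comma, so the file opens with the correct columns in Excel.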
