
AMAZON PRODUCT REVIEW SENTIMENT ANALYSIS USING
MACHINE LEARNING AND DEEP LEARNING

Abstract:
As online marketplaces have become popular over the past decades, online sellers
and merchants ask their purchasers to share their opinions about the products
they have bought. As a result, millions of reviews are generated daily, which
makes it difficult for a potential consumer to decide whether to buy a product.
Analyzing this enormous number of opinions is also hard and time-consuming for
product manufacturers. With machine learning now widely available, going through
thousands of reviews becomes much easier if a model is used to classify the
polarity of those reviews and learn from them. This thesis considers the problem
of classifying reviews by their overall sentiment (positive, negative, or neutral).
To conduct the study, several supervised machine learning techniques (Support
Vector Machine, Naive Bayes, Decision Tree, Random Forest, and Logistic
Regression) were applied to a product review dataset from Amazon. Their
accuracies were then compared.
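
To make the classification task concrete, the following is a minimal sketch of one of the listed techniques, a multinomial Naive Bayes classifier with add-one smoothing, written in plain Python. The toy reviews, the whitespace tokenizer, and the names `train_nb` and `predict_nb` are assumptions made for illustration; they are not taken from the study, which used an Amazon review dataset.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data; the study itself used an Amazon review dataset.
TOY_REVIEWS = [
    ("great product works perfectly", "positive"),
    ("love it excellent quality", "positive"),
    ("terrible broke after one day", "negative"),
    ("awful quality waste of money", "negative"),
    ("it is okay nothing special", "neutral"),
    ("average product does the job", "neutral"),
]

def train_nb(samples):
    """Count documents per class and word occurrences per class."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_docs[label] += 1
        for token in text.lower().split():
            word_counts[label][token] += 1
            vocab.add(token)
    return class_docs, word_counts, vocab

def predict_nb(model, text):
    """Return the class with the highest log-posterior for the text."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        # Log prior of the class.
        score = math.log(class_docs[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothed word likelihood.
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_nb(TOY_REVIEWS)
print(predict_nb(model, "excellent product love it"))
```

With so few training examples the sketch only shows the mechanics; a real comparison would train on thousands of labeled reviews and evaluate on held-out data.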

Existing System:
• Existing works reached a prediction accuracy of
only about 80%, which was inadequate.
• The existing system used the Naive Bayes, Random
Forest, and SVM algorithms. The accuracy obtained
for a product depends on the number of reviews it
receives.
• Existing works involve data understanding and
exploration before implementing the method.
• In the comparative analysis, the learning
algorithms are compared by their accuracy.
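
The accuracy-based comparison described above can be sketched as follows. Accuracy is the metric named in the text; macro-averaged F1 is added here only as an assumption, to illustrate why a single accuracy figure can hide weak performance on a minority class. The labels and the always-positive baseline are made up for the example.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical imbalanced test set: 8 positive reviews, 2 negative.
y_true = ["positive"] * 8 + ["negative"] * 2
# A baseline that labels every review as positive.
always_positive = ["positive"] * 10

print(accuracy(y_true, always_positive))   # 0.8
print(round(macro_f1(y_true, always_positive), 3))
```

In this made-up case the baseline reaches 80% accuracy without ever identifying a negative review, which is why a comparison may want to report more than one metric.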

Drawbacks/Limitations/Disadvantages of Existing
System:
• Noise and Irrelevant Features:
Amazon reviews often contain noisy data, such as
spelling errors, typos, or irrelevant content. Models might focus on
these noise factors rather than meaningful sentiment-bearing words.

• Adaptation to Product Changes:
Machine learning models might struggle to adapt quickly to
changes in product features or quality. A model trained on reviews for
an older version of a product might not accurately reflect the sentiment
of the latest version.

• Sarcasm and Figurative Language:
Sarcasm, irony, metaphor, and figurative language
can easily mislead sentiment analysis models, causing
them to misclassify reviews.

• Misleading Reviews:
Some product reviews might be intentionally misleading,
containing fake positive or negative sentiment to influence
purchasing decisions. Detecting such fraudulent reviews
requires more advanced techniques beyond basic sentiment
analysis.
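
Of the drawbacks listed above, the noise problem is the one most directly addressed by preprocessing. The following is a minimal sketch of a cleaning step, assuming English reviews; the function name `clean_review` and the specific rules are illustrative choices, not the method of the existing system. It does not address sarcasm, product changes, or fake reviews.

```python
import re

def clean_review(text):
    """Normalize a raw review and return a list of lowercase tokens."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)         # drop HTML tags such as <br/>
    text = re.sub(r"https?://\S+", " ", text)    # drop URLs
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)   # "sooooo" -> "soo"
    # Drop punctuation and any other symbol outside a-z, 0-9, apostrophe.
    # Note: this also removes non-ASCII characters in this sketch.
    text = re.sub(r"[^a-z0-9'\s]", " ", text)
    return text.split()

print(clean_review("Sooooo GOOD!!! <br/> Visit http://example.com now"))
# ['soo', 'good', 'visit', 'now']
```

Spelling correction is deliberately left out, since it usually needs a dictionary or a dedicated library.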

Proposed System:
• The proposed work makes a comparative analysis of various
learning algorithms and identifies the best-fitted model.
• The proposed work analyzes the metric values obtained by
the machine learning and deep learning models.
• The proposed system performs feature selection and
extraction, which are not present in the existing system.
• The proposed work covers classification algorithms such as
SVM, Random Forest, Multinomial NB, Logistic Regression,
Gaussian NB, and Naive Bayes, together with deep learning
models such as an ANN classifier and LSTM.
• Information gain, Gini index, and gain ratio are some
important tools for selecting the best feature to split on.
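
The three split criteria named in the last point can be computed as in the sketch below. It uses the standard textbook definitions (entropy in bits, Gini impurity, and gain ratio as information gain divided by split information); the binary "contains the word" features and the four labels are hypothetical and only serve to show the arithmetic.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_labels(feature, labels):
    """Group the labels by the value of the feature."""
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    return list(groups.values())

def information_gain(feature, labels):
    """Entropy of the labels minus the weighted entropy after the split."""
    n = len(labels)
    groups = split_labels(feature, labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups)
    return entropy(labels) - remainder

def gain_ratio(feature, labels):
    """Information gain normalized by the entropy of the feature itself."""
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0  # the feature has a single value and cannot split
    return information_gain(feature, labels) / split_info

def weighted_gini(feature, labels):
    """Weighted Gini impurity of the groups produced by the split."""
    n = len(labels)
    return sum(len(g) / n * gini(g) for g in split_labels(feature, labels))

# Hypothetical data: four reviews and two binary word-presence features.
labels = ["pos", "pos", "neg", "neg"]
contains_great = [1, 1, 0, 0]   # separates the classes perfectly
contains_the = [1, 0, 1, 0]     # carries no information about the class

print(information_gain(contains_great, labels))  # 1.0
print(information_gain(contains_the, labels))    # 0.0
```

A feature with higher information gain (or lower weighted Gini impurity) is the better candidate to split on, so `contains_great` would be preferred here.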

Advantages of Proposed System:

• Automation and Efficiency:
Once trained and deployed, deep learning models
can automate sentiment analysis at scale, allowing for
efficient processing and analysis of a large
volume of reviews.

• Capturing Complex Patterns:
Deep learning models, especially those based on
neural networks like Transformers, have the capacity to
capture intricate and complex linguistic patterns in text
data. They can automatically learn hierarchical features
and relationships between words, enabling better
understanding of sentiment expressions.

• Contextual Understanding:
Deep learning models excel at capturing context
and understanding the relationships between words in a
sentence. This is crucial for accurately interpreting
sentiment in Amazon reviews, where context plays a
significant role.

• Handling Long Sequences:
Many Amazon reviews can be lengthy,
containing multiple sentences or paragraphs. Deep
learning models like Transformers are designed to handle
long sequences of text, making them well-suited for
processing these types of reviews.
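
Sequence models such as LSTMs are usually fed integer token IDs of a fixed length, so long reviews are truncated and short ones padded. The sketch below illustrates that preparation step under assumed conventions: ID 0 for padding, ID 1 for unknown words, and the hypothetical helper names `build_vocab` and `encode`. None of these come from the proposed system itself.

```python
PAD_ID = 0  # assumed ID for padding positions
UNK_ID = 1  # assumed ID for words missing from the vocabulary

def build_vocab(token_lists):
    """Assign an integer ID to every token seen in the training reviews."""
    vocab = {"<pad>": PAD_ID, "<unk>": UNK_ID}
    for tokens in token_lists:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab

def encode(tokens, vocab, max_len):
    """Map tokens to IDs, truncate to max_len, then right-pad with PAD_ID."""
    ids = [vocab.get(token, UNK_ID) for token in tokens][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))

vocab = build_vocab([["good", "product"], ["bad", "product"]])
print(encode(["good", "phone"], vocab, 4))  # [2, 1, 0, 0]
print(encode(["bad", "product", "good", "product", "bad"], vocab, 3))  # [4, 3, 2]
```

The choice of `max_len` trades coverage of long reviews against memory and training time.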

Hardware Requirements:
• Processor: A multi-core processor with a clock speed of
at least 2.5 GHz is recommended to handle multiple
requests efficiently.
• Memory: At least 8 GB of RAM is recommended to store
the machine learning models and handle multiple
requests.
• Storage: At least 500 GB of storage is recommended to
store the review data.

Software Requirements:
• Operating System: The system can run on any
operating system that supports the required software.
• Programming Language: The system can be
developed using programming languages like Python, Java,
or any other language with machine learning libraries.
• Development Environment: An integrated
development environment (IDE) such as PyCharm or Visual
Studio Code can be used to develop the system.
• Network: A high-speed internet connection is
recommended to handle requests from multiple users
simultaneously.

Problem Statement:
• This model should be able to handle evolving
language and trends, which means it should be
able to identify and classify emojis,
cultural references, etc.
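
One simple way to let a text classifier see emojis is to replace them with word-like tokens before tokenization. The sketch below assumes a small hand-made lookup table; `EMOJI_TOKENS` and `replace_emojis` are hypothetical names, and a real system would need a far larger mapping or a dedicated library.

```python
# Hypothetical lookup table; only three emojis are covered in this sketch.
EMOJI_TOKENS = {
    "\U0001F600": "emoji_happy",      # grinning face
    "\U0001F621": "emoji_angry",      # pouting face
    "\U0001F44D": "emoji_thumbs_up",  # thumbs up
}

def replace_emojis(text):
    """Replace each known emoji with a plain-text token."""
    for emoji, token in EMOJI_TOKENS.items():
        text = text.replace(emoji, f" {token} ")
    # Collapse the extra whitespace introduced by the replacements.
    return " ".join(text.split())

print(replace_emojis("Love it\U0001F44D\U0001F44D"))
# Love it emoji_thumbs_up emoji_thumbs_up
```

After this step the emoji tokens behave like ordinary words, so a classifier can learn their sentiment from labeled reviews.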

Conclusion:
In this project on sentiment analysis of Amazon product
reviews, we examined machine learning algorithms closely
and came across many limitations and drawbacks of
techniques such as SVM, Random Forest, Logistic
Regression, Gaussian NB, and Naive Bayes. To overcome
these limitations and to improve the results of sentiment
analysis on reviews, we propose using deep learning
techniques such as an ANN (Artificial Neural Network)
classifier, LSTM, and BERT.

