USING
MACHINE LEARNING AND DEEP LEARNING
Abstract:
As online marketplaces have grown popular over the past decades, online sellers
and merchants ask their purchasers to share their opinions about the products
they have bought. As a result, millions of reviews are generated daily, which
makes it difficult for a potential consumer to decide whether to buy a product.
Analyzing this enormous number of opinions is also hard and time-consuming for
product manufacturers. With today's machine learning methods, going through
thousands of reviews becomes much easier if a model is used to classify the
polarity of those reviews and learn from them. This thesis considers the problem
of classifying reviews by their overall sentiment (positive, negative, or neutral).
To conduct the study, different supervised machine learning techniques (Support
Vector Machine, Naive Bayes, Decision Tree, Random Forest, and Logistic
Regression) were applied to a product review dataset from Amazon. Their
accuracies were then compared.
Existing System:
Existing works reached a prediction accuracy of
about 80%, which is inadequate.
In the existing system, the Naive Bayes, Random
Forest, and SVM algorithms were used. The
accuracy of a product is determined by the
number of reviews it receives.
Drawbacks/Limitations/Disadvantages of Existing
System:
Noise and Irrelevant Features:
Amazon reviews often contain noisy data, such as
spelling errors, typos, or irrelevant content. Models might focus on
these noise factors rather than meaningful sentiment-bearing words.
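One way to reduce this kind of noise is to normalize each review before feature extraction. The sketch below is illustrative only: the function name `clean_review` and the specific rules (HTML tags, URLs, stretched letters, punctuation) are assumptions, not steps taken from the existing system. It also removes emojis, so the final filter would need to be relaxed if emojis are to be kept as sentiment signals.

```python
import re

def clean_review(text: str) -> str:
    """Lowercase a review and strip common noise (hypothetical rule set)."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)        # drop HTML tags such as <br/>
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)  # squeeze "sooooo" to "soo"
    text = re.sub(r"[^a-z0-9'\s]", " ", text)   # drop punctuation and symbols
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace

print(clean_review("Sooooo GOOD!!! <br/> Visit http://x.com now"))
# prints: soo good visit now
```

Spelling correction is not attempted here; it would need a dictionary or a dedicated library.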
Misleading Reviews:
Some product reviews might be intentionally misleading,
containing fake positive or negative sentiment to influence
purchasing decisions. Detecting such fraudulent reviews
requires more advanced techniques beyond basic sentiment
analysis.
Proposed System:
The proposed work makes a comparative analysis of various
learning algorithms and identifies the best-fitted model.
It reports this analysis in terms of the metric values
obtained by the machine learning and deep learning models.
In the proposed system, we perform feature selection
and extraction, which are not present in the existing system.
The proposed work covers classification algorithms
such as SVM, Random Forest, Multinomial NB,
Logistic Regression, Gaussian NB, and Naive Bayes, as well as
deep learning models such as an ANN classifier and LSTM.
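A comparison of this kind could be sketched with scikit-learn as follows. This is a minimal illustration, not the project's actual pipeline: the twelve toy reviews, the TF-IDF features, the 3-fold cross-validation, and the helper name `compare_models` are all assumptions, and Gaussian NB and the deep learning models are left out to keep the sketch short (Gaussian NB would need dense input).

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data; replace with the real Amazon review dataset.
reviews = [
    "great product works perfectly", "love it excellent quality",
    "very happy with this purchase", "amazing value highly recommend",
    "terrible product broke quickly", "awful quality waste of money",
    "very disappointed do not buy", "bad item stopped working",
    "it is okay nothing special", "average product does the job",
    "fine for the price", "neither good nor bad",
]
labels = ["positive"] * 4 + ["negative"] * 4 + ["neutral"] * 4

models = {
    "SVM": LinearSVC(),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Multinomial NB": MultinomialNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

def compare_models(texts, y, cv=3):
    """Return the mean cross-validated accuracy for each classifier."""
    scores = {}
    for name, clf in models.items():
        # Each model gets its own TF-IDF step so folds do not leak vocabulary.
        pipe = make_pipeline(TfidfVectorizer(), clf)
        scores[name] = cross_val_score(pipe, texts, y, cv=cv,
                                       scoring="accuracy").mean()
    return scores

results = compare_models(reviews, labels)
for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.2f}")
```

Accuracies on a toy set this small are not meaningful; the point is only the shape of the comparison loop.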
Information gain, Gini index, and gain ratio are
common criteria for selecting the best feature to split on
in tree-based models.
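The three criteria can be computed by hand for a single categorical feature. The sketch below uses the standard textbook definitions; the function names and the toy feature ("review contains the word great") are assumptions made for illustration.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def gini(values):
    """Gini impurity of a list of discrete values."""
    n = len(values)
    return 1.0 - sum((c / n) ** 2 for c in Counter(values).values())

def split_scores(feature, labels):
    """Score a split of `labels` by the values of a categorical `feature`."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    weighted_entropy = sum(len(g) / n * entropy(g) for g in groups.values())
    weighted_gini = sum(len(g) / n * gini(g) for g in groups.values())
    info_gain = entropy(labels) - weighted_entropy
    split_info = entropy(feature)  # entropy of the partition itself
    gain_ratio = info_gain / split_info if split_info > 0 else 0.0
    return {"information_gain": info_gain,
            "gini_index": weighted_gini,
            "gain_ratio": gain_ratio}

# Hypothetical feature: 1 if the review contains "great", else 0.
contains_great = [1, 1, 1, 0, 0, 0]
sentiment = ["pos", "pos", "pos", "neg", "neg", "neg"]
print(split_scores(contains_great, sentiment))
```

Here the feature separates the classes perfectly, so the information gain and gain ratio are both 1 and the weighted Gini index of the split is 0. A higher gain (or a lower Gini index) marks a better feature to split on.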
Contextual Understanding:
Deep learning models excel at capturing context
and understanding the relationships between words in a
sentence. This is crucial for accurately interpreting
sentiment in Amazon reviews, where context plays a
significant role.
Hardware Requirements:
• Processor: A multi-core processor with a clock speed of
at least 2.5 GHz is recommended to handle multiple
requests efficiently.
• Memory: At least 8 GB of RAM is recommended to store
the machine learning models and handle multiple
requests.
• Storage: At least 500 GB of storage is recommended to
store the review data.
Software Requirements:
• Operating System: The system can run on any
operating system that supports the required software.
• Programming Language: The system can be
developed using programming languages like Python, Java,
or any other language with machine learning libraries.
• Development Environment: An integrated
development environment (IDE) such as PyCharm or Visual
Studio Code can be used to develop the system.
• Network: A high-speed internet connection is
recommended to handle requests from multiple users
simultaneously.
Problem Statement:
• The model should be able to handle evolving
language and trends, which means it should be
able to identify and classify emojis,
cultural references, etc.
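As a first step toward emoji handling, a tokenizer can keep each emoji as its own token instead of discarding it. The sketch below is an assumption-laden illustration: the code point ranges in `EMOJI` are rough and do not cover every emoji, and the name `tokenize` is hypothetical.

```python
import re

# Rough emoji ranges (an assumption; not a complete list of emoji code points).
EMOJI = "[\U0001F300-\U0001FAFF\u2600-\u27BF]"
# A token is either one emoji or a run of letters, digits, and apostrophes.
TOKEN_RE = re.compile(rf"{EMOJI}|[a-z0-9']+")

def tokenize(text):
    """Split a review into word tokens and single-emoji tokens."""
    return TOKEN_RE.findall(text.lower())

print(tokenize("Love it 😍👍 works great!!"))
# prints: ['love', 'it', '😍', '👍', 'works', 'great']
```

The resulting tokens can be fed to the same feature extraction step as ordinary words, so a classifier can learn that an emoji carries sentiment.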
Conclusion:
After closely examining sentiment analysis of Amazon
product reviews using machine learning algorithms,
we came across many limitations and
drawbacks of machine learning techniques
such as SVM, Random Forest, Logistic Regression, Gaussian NB,
and Naive Bayes. To overcome these limitations and
drawbacks and to improve the results of sentiment
analysis on reviews, we propose using deep learning
techniques such as an ANN (Artificial Neural Network)
classifier, LSTM, and BERT.