2023 Ijsem-147259
KEYWORDS:
Fake review detection; methodologies; machine learning; web scraping; information retrieval.
1. INTRODUCTION
The rise of digital technology has made it easy for people to purchase products online. With the
help of Web 2.0, users can now share their experiences and opinions about their purchases. These
reviews are important because they help customers make informed decisions and provide
feedback to the organizations that sell the goods. Due to the abundance of reviews, however,
it has become difficult for organizations to identify the information they need and the overall sentiment
of consumers regarding their products. Through opinion mining, they can analyze the data and
find out whether any of the reviews are fake or irrelevant [1].
A fake product review is a review that is not genuine and is written with the intention of promoting
a product or service, or of damaging the reputation of a competitor. Fake reviews can be difficult
to detect, as they are often written to appear legitimate and may include fake profiles or misleading
information. Fake product reviews can be harmful to businesses and consumers. For businesses,
they can distort ratings and reviews and mislead consumers about the quality of a product or
service, damaging the credibility and trustworthiness of the business. For consumers, they can lead
to confusion and frustration when they purchase a product based on fake reviews, and can also
harm their trust in the review process[2].
It's important for businesses and review platforms to have measures in place to identify and
eliminate fake product reviews. This can include using keyword filtering, checking for suspicious
patterns in the reviews, using machine learning techniques, and checking the reputation of the
reviewer. It's also important to continually monitor and update these measures to ensure that they
are effective at detecting fake reviews.
Spam review identification is important for businesses and consumers as it can prevent misleading
information and protect both consumers and businesses from false or negative reviews. There are
several ways to identify fake reviews, such as keyword filtering, checking for suspicious patterns,
using machine learning, checking the reputation of the reviewer and using CAPTCHAs or other
verification methods. Implementing multiple approaches and continually monitoring and updating
methods can ensure that fake reviews are effectively identified and eliminated. However, spam
review identification also has its drawbacks, such as the risk of flagging legitimate reviews as
spam, the resource-intensive process of review and classification, and the difficulty of identifying
spam reviews accurately, which can lead to a high number of false positives or missed spam
reviews [3]–[5].
1.2 Current Issues in spam review identification using ML and DL
There are several current issues in spam review identification using machine learning and deep
learning techniques [3]–[5]. Machine learning algorithms can face several challenges when applied
to spam review identification, including data imbalance, lack of diverse data, lack of
explainability, adversarial attacks, and evolving spam techniques. Data imbalance occurs when spam
reviews make up a small minority of the overall dataset, which can affect the performance of
machine learning algorithms. A lack of diverse data can make it difficult for the algorithm to
generalize well to new, unseen data. Some machine learning models are not easily interpretable,
making it difficult to understand how the model is making decisions and to identify potential
biases. Machine learning models can also be vulnerable to adversarial attacks, in which an attacker
manipulates the input data to cause the model to make incorrect predictions. Finally, spammers
are constantly finding new ways to evade detection, making it difficult to develop effective
machine learning models for spam review identification.
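One common mitigation for the data imbalance described above is to weight each class inversely to its frequency during training. The sketch below only illustrates that idea and is not taken from the works cited here; the function name `balanced_class_weights` and the toy label counts are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Toy labels (assumption): 1 = spam (minority), 0 = genuine (majority).
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)
```

With these weights, a misclassified spam review contributes more to the training loss than a misclassified genuine review, which counteracts the tendency of a classifier to favor the majority class.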
1.3 Various techniques used for text review classification
There are several techniques that can be used to identify fake reviews, which are reviews that are
not genuine and are written with the intention of misleading or manipulating others [17] – [19].
Some of these techniques include:
• Sentiment analysis: This involves analyzing the sentiment or emotion expressed in a review
to determine whether it is genuine or fake. Reviews that are written in a highly positive or
negative manner may be more likely to be fake.
• Textual analysis: This involves analyzing the content of the review to identify patterns or
characteristics that are commonly found in fake reviews. For example, reviews that use
repetitive or generic language, or that contain a high number of spelling and grammar
errors, may be more likely to be fake.
• Reviewer analysis: This involves analyzing the characteristics of the reviewer, such as the
number of reviews they have written, the types of products they have reviewed, and the
consistency of their ratings. Reviewers who have written a large number of reviews in a
short period of time, or who have given consistently high or low ratings, may be more
likely to be fake.
• Network analysis: This involves analyzing the relationships between reviewers and the
products they have reviewed, in order to identify patterns of fake reviews. For example,
reviewers who have reviewed a large number of products from the same manufacturer or
seller may be more likely to be fake.
• Machine learning: This involves training a machine learning model on a labeled dataset of
genuine and fake reviews, in order to learn the characteristics of each type of review. The
model can then be used to classify new reviews as genuine or fake.
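As an illustration of the reviewer analysis technique listed above, the following sketch flags reviewers who post many reviews within a short period. The thresholds (`max_reviews`, `window`), the function name `burst_reviewers`, and the sample data are hypothetical and would need tuning on real data.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def burst_reviewers(reviews, max_reviews=3, window=timedelta(days=1)):
    """Return reviewer ids that posted more than `max_reviews` within `window`."""
    by_reviewer = defaultdict(list)
    for reviewer_id, timestamp in reviews:
        by_reviewer[reviewer_id].append(timestamp)

    flagged = set()
    for reviewer_id, stamps in by_reviewer.items():
        stamps.sort()
        start = 0
        for end in range(len(stamps)):
            # Move the left edge until the window spans at most `window`.
            while stamps[end] - stamps[start] > window:
                start += 1
            if end - start + 1 > max_reviews:
                flagged.add(reviewer_id)
                break
    return flagged


# Toy data (assumption): r1 posts five reviews within four hours,
# r2 posts three reviews spread over three weeks.
reviews = [("r1", datetime(2023, 1, 1, hour)) for hour in range(5)]
reviews += [("r2", datetime(2023, 1, day)) for day in (1, 10, 20)]
flagged = burst_reviewers(reviews)
print(flagged)
```

A signal like this is usually combined with the other techniques in the list rather than used alone, since prolific genuine reviewers also exist.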
2. RELATED WORK
Muhammad Fayaz et al. [6] created an ensemble model that combines the predictions of KNN, RF,
and MLP classifiers to label product reviews as either spam or non-spam. In their evaluation,
the ensemble achieved a higher classification accuracy than the individual models, identifying
fake reviews with an accuracy of 88.13%.
Shwet Mani et al. [7] proposed a two-phase approach. In the first phase, simple n-gram features
were used with three classification algorithms, namely SVM, Random Forest, and Naive
Bayes. The most accurate of these was Naive Bayes, which achieved 87.12% accuracy.
In the second phase, two different ensemble techniques were used to improve accuracy;
the stacking ensemble model performed better,
with an accuracy of 87.68%.
To protect the interests of consumers and e-commerce sites, Minu Susan Jacob et al. [8] argue
that a system should be developed to identify and remove fake reviews. Their paper develops a
framework that detects unfair reviews on Amazon using sentiment analysis. The proposed
method is applied to a set of consumer reviews collected from Amazon.
Deepika Vachane et al. [9] present a framework that combines sentiment analysis
and LSA to find spam in product reviews. The proposed system uses the NetSpam algorithm to
analyze and detect various types of spam.
G. M. Shahariar et al. [10] developed deep learning methods for
detecting spam reviews, including the Multi-Layer Perceptron, the Convolutional Neural
Network, and the RNN. They also applied traditional classifiers such as K-Nearest Neighbor
and the Support Vector Machine to the same task for comparison.
Komal Dhingra et al. [11] used four fuzzy input variables to identify reviewers suspected of
being spammers. They generated 81 fuzzy rules and evaluated them
using the FSL algorithm. Because review data exhibits the volume, velocity, and variety
that characterize big data, they used Hadoop for analysis and storage, and reported an accuracy
of 80.71% for the proposed algorithm on a sample Amazon reviews dataset.
Soheil Jamshidi et al. [12] developed a method to identify explicitly incentivized reviews (EIRs)
and collected a few datasets. They show that the characteristics of EIRs and normal reviews differ,
and discuss how the ban on such reviews on Amazon affected their prevalence.
Their results revealed that sellers' promotional campaigns influenced the number
of reviews submitted for sample products.
Sakshi Shringi et al. [13] present a hybrid GWOK algorithm that combines the basic grey wolf
optimizer (GWO) with the k-means clustering method to identify spam reviews. Their results
show that the proposed algorithm surpasses the techniques it was compared against, with accuracies
of 80.43% on synthetic spam reviews, 64.74% on movie reviews, and 75.01% on the Yelp
dataset.
P. Bhuvaneshwari et al. [14] present a deep learning framework for identifying
spam reviews. The model, known as CNN-BiLSTM, learns a document-level representation
of the content by calculating the weightage of the words in each sentence. The CNN
component learns to recognize sentence structure, and the model then combines the
sentence features with contextual information to identify spam reviews.
3. PROPOSED SYSTEM
Hybrid machine learning is a term used to describe machine learning approaches that combine
multiple different techniques or algorithms. This can be useful when different types of techniques
are complementary or when combining approaches can lead to improved performance on a
particular task. Hybrid machine learning approaches that combine deep learning with other
techniques are often used to solve complex problems where deep learning alone may not be
sufficient. For example, a hybrid approach might use a deep learning model to extract features
from data and then use a more traditional machine learning algorithm to make a prediction based
on those features. This can be particularly useful in tasks such as language translation or image
recognition, where deep learning has proven to be very effective but may not capture all of the
necessary information on its own. There are many different ways that deep learning and other
techniques can be combined in a hybrid approach, and the specific combination used will depend
on the specific problem being solved and the available data. Here we propose hybrids of CNN
+ NB, CNN + RF, and CNN + SVM, and compare their output with existing classifiers to assess
their effectiveness.
3.1 System Architecture
The proposed method is designed for review classification. It consists of three steps:
pre-processing, feature selection, and evaluation. In the first step,
the input product reviews pass through several pre-processing operations, including
stop-word removal, stemming (reducing words to their roots), pruning, and term weighting. The goal of
this step is to turn the review text into a form that classification algorithms can process. In the
second step, features are extracted from the pre-processed dataset. In the third and
final step, a supervised learning algorithm is evaluated on the selected features.
Figure 2 shows these stages. The important steps involved in spam detection are described below.
Figure 2. Proposed System Architecture
C. Sentiment Analysis
Sentiment analysis is a technique used to identify the sentiment or emotion expressed in a piece of
text, such as a review. It is often used in the context of fake review identification to help identify
reviews that are biased or misleading.
There are several approaches to sentiment analysis, including rule-based approaches, which rely
on hand-crafted rules and heuristics to identify sentiment, and machine learning approaches, which
use supervised or unsupervised learning algorithms to learn patterns in the data and make
predictions about the sentiment of a given text.
In the context of fake review identification, sentiment analysis can be used to identify reviews that
are overly positive or negative, as fake reviews are often written with the intention of misleading
the reader. For example, a fake positive review might use very positive language and give the
product a high rating, while a fake negative review might use negative language and give the
product a low rating.
To use sentiment analysis in fake review identification, you would first need to collect a dataset of
reviews, both fake and genuine. You would then need to pre-process the data by converting the
text of each review into a numerical representation, such as a bag-of-words model or a term
frequency matrix. You would then apply a sentiment analysis technique to the data, such as a
machine learning classifier, to predict the sentiment of each review.
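The steps above can be sketched as a small scikit-learn pipeline. The training sentences and labels are invented for illustration, and the bag-of-words plus Multinomial Naive Bayes combination is one possible choice among many, not necessarily the configuration used in this work.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented labeled examples (assumption): 1 = positive, 0 = negative sentiment.
train_texts = [
    "love this phone excellent battery",
    "excellent quality and great price",
    "great product love it",
    "awful quality broke in a day",
    "terrible battery awful screen",
    "broke quickly terrible product",
]
train_labels = [1, 1, 1, 0, 0, 0]

# CountVectorizer builds the bag-of-words matrix; MultinomialNB classifies it.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_texts, train_labels)

pred = model.predict(["excellent quality love it", "awful screen broke quickly"])
print(pred)
```

The predicted sentiment can then be compared with the star rating or used as one input feature for the spam classifier.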
The specific approach and model used will depend on the requirements and constraints of
the task. It is also important to consider other factors that might affect the performance of the
model, such as the size and quality of the training data, the choice of hyperparameters, and the
ability to handle class imbalance in the data.
3.1.3 Train-Test Split
The train-test split is used to estimate how well the ML, DL, and hybrid algorithms perform when they
make predictions on unseen data. It is a quick and easy way to compare the predictions
of our machine learning, deep learning, and hybrid models against the known labels.
In the split used here, 80% of the data is placed in the training set and 20% in the
test set. The dataset must be divided into these two sets before the model can be evaluated.
The training set, whose labels are known, is used to fit the model.
The second set, the test set, is held out and used only to evaluate
the model's predictions.
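A minimal sketch of the 80%/20% split with scikit-learn follows. The feature matrix is random stand-in data, and `stratify` and `random_state` are choices made for this illustration. Note that `test_size=0.2` is passed explicitly, since `train_test_split` holds out 25% when no size is given.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data (assumption): 100 reviews, 6 features each, 20% labeled spam.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
y = np.array([0] * 80 + [1] * 20)

# stratify=y keeps the spam/genuine ratio the same in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
print(X_train.shape, X_test.shape)
```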
3.1.4 ML / DL Algorithms
The data we have gives us many options. Our work uses the Amazon
Product Review Dataset. For the implementation, we used scikit-learn and
TensorFlow. The following algorithms are used: Support
Vector Machine (SVM), Neural Network (NN), and Convolutional Neural Network (CNN). A detailed
description of these models is presented in the Background (existing classifiers) chapter.
In machine learning and deep learning, a classifier is a model that is trained to predict the class or
category of an input sample. Classifiers are used in a wide range of applications, including spam
filtering, image classification, and natural language processing[22]–[24]. There are many different
types of classifiers, including linear classifiers, support vector machines (SVMs), decision trees,
and neural network-based classifiers such as convolutional neural networks (CNNs) and recurrent
neural networks (RNNs). The choice of classifier depends on the nature of the input data and the
task at hand.
• Linear classifiers, such as logistic regression, are based on the assumption that the input
data is linearly separable, meaning that it can be separated into different classes by a linear
boundary. These classifiers are simple and efficient, but they may not be suitable for more
complex datasets.
• SVMs are a type of linear classifier that seek to find the hyperplane in the feature space
that maximally separates the different classes. They are effective for high-dimensional
datasets and can handle non-linear boundaries by using kernel functions to transform the
data into a higher-dimensional space.
• Decision trees are a type of classifier that use a tree structure to make predictions based on
the values of the input features. Each node in the tree represents a decision based on a
feature, and the branches represent the possible outcomes of that decision. Decision trees
are simple to understand and interpret, but they may not be as accurate as some other
classifiers.
• Neural network-based classifiers, such as CNNs and RNNs, are capable of learning
complex relationships in the data and can achieve state-of-the-art results on many tasks.
However, they require a large amount of labeled data and computational resources to train.
A. SVM
Support Vector Machines (SVMs) are a type of supervised machine learning algorithm that can be
used for classification or regression tasks. The goal of an SVM is to find the hyperplane in a high-
dimensional space that maximally separates the positive and negative examples. The distance
between the hyperplane and the nearest data points is known as the margin. SVMs try to maximize
the margin between the two classes of data points.
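For linearly separable data with labels $y_i \in \{-1, +1\}$, the margin maximization described above is usually written as the following optimization problem. The symbols $\mathbf{w}$ (normal vector of the hyperplane), $b$ (offset), and $\mathbf{x}_i$ (feature vector of sample $i$) are notation introduced here for illustration.

```latex
\min_{\mathbf{w},\, b} \ \frac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i \left( \mathbf{w}^{\top}\mathbf{x}_i + b \right) \ge 1,
\qquad i = 1, \dots, n
```

Under these constraints the distance from the hyperplane to the nearest data points is $1 / \lVert \mathbf{w} \rVert$, so minimizing $\lVert \mathbf{w} \rVert^{2}$ is equivalent to maximizing the margin.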
SVMs are particularly useful in cases where the number of dimensions is much greater than the
number of samples, as they tend to be more robust than other algorithms in such situations. They
can also handle cases where the data is non-linearly separable by using the so-called "kernel trick"
to transform the data into a higher-dimensional space where it becomes linearly separable. In
summary, SVMs are a powerful tool for classification and regression tasks and are particularly
effective in high-dimensional spaces and when the data is non-linearly separable.
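The effect of the kernel trick can be seen on a toy dataset that no straight line can separate. The sketch below uses scikit-learn's `make_circles` as stand-in data (two concentric rings, not review data) and compares a linear kernel with an RBF kernel; the dataset parameters are arbitrary choices for this illustration.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: the classes are not linearly separable in 2D.
X, y = make_circles(n_samples=200, noise=0.05, factor=0.4, random_state=0)

# Training accuracy with a linear decision boundary.
linear_acc = SVC(kernel="linear").fit(X, y).score(X, y)

# Training accuracy with an RBF kernel, which separates the rings implicitly
# in a higher-dimensional feature space.
rbf_acc = SVC(kernel="rbf").fit(X, y).score(X, y)

print(f"linear: {linear_acc:.2f}, rbf: {rbf_acc:.2f}")
```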
The basic idea behind SVMs is to find a hyperplane that maximally separates the different classes
(e.g. real vs fake reviews) in the feature space. Once the hyperplane is found, new samples can be
easily classified by checking on which side of the hyperplane they fall. SVMs have been shown to
perform well in text classification tasks and have been used in various studies for fake review
detection. It is important to note that an SVM alone may not be the best solution for this problem;
it is often used in combination with other techniques such as feature engineering, natural language
processing, and deep learning [21], [25].
B. Neural Network
A neural network is a type of machine learning model that is inspired by the structure and function
of the brain. It consists of layers of interconnected "neurons," which process and transmit
information. Each neuron receives input from other neurons, performs a computation on that input,
and produces an output that is transmitted to other neurons in the next layer. Neural networks are
particularly useful for tasks that involve pattern recognition or data that is difficult to process using
a traditional, rule-based approach. They have been successful at tasks such as image and speech
recognition, natural language processing, and even playing games like chess.
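The computation performed by one layer of neurons can be sketched in a few lines of NumPy. The weights, biases, and input values below are arbitrary numbers chosen for illustration; in a trained network they are learned from data.

```python
import numpy as np


def dense_layer(x, weights, bias):
    """Each neuron: weighted sum of its inputs plus a bias, followed by ReLU."""
    return np.maximum(0.0, x @ weights + bias)


def sigmoid(z):
    """Squash a score into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Arbitrary example values (assumption): 3 inputs, 2 hidden neurons, 1 output.
x = np.array([1.0, -2.0, 0.5])
W1 = np.array([[0.5, -0.5], [-0.25, 0.3], [1.0, 0.8]])
b1 = np.array([0.1, 0.0])
W2 = np.array([[1.0], [2.0]])
b2 = np.array([-0.6])

hidden = dense_layer(x, W1, b1)        # output of the hidden layer
output = sigmoid(hidden @ W2 + b2)     # output neuron, read as a probability
print(hidden, output)
```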
There are many different types of neural networks, including feedforward networks, convolutional
neural networks, recurrent neural networks, and autoencoders. The specific architecture of a neural
network depends on the task it is being used for. Recurrent neural networks, for example, can process
sequential data such as text by maintaining an internal state that stores information about previous inputs. These
architectures have been used in many studies for fake review detection and have been shown to
achieve high performance. However, it is important to note that fake review detection is a complex
task and NNs alone may not be the best solution; they are often used in combination
with other techniques such as feature engineering and natural language processing [26].
C. CNN
CNN (Convolutional Neural Network) is a type of artificial neural network that is particularly
effective at recognizing patterns and features in images. It is a deep learning algorithm that is
commonly used in image and video recognition tasks, such as object classification and face
recognition. CNNs are composed of a series of layers that process input data and extract relevant
features. The first layers of a CNN typically consist of convolutional layers, which apply a series
of filters to the input data to identify patterns and features. These filters are called kernels or
weights, and they are learned through training the network on a labeled dataset. After the
convolutional layers, the network typically includes pooling layers, which downsample the data
and reduce the dimensionality of the feature maps. This helps to reduce the complexity of the
model and improve its ability to generalize to new data.
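The convolution and pooling operations described above can be illustrated in one dimension with NumPy. The kernel values here are fixed by hand for illustration, whereas in a CNN they are learned during training; the function names `conv1d` and `max_pool1d` are hypothetical helpers.

```python
import numpy as np


def conv1d(signal, kernel):
    """Slide the kernel over the signal and take a dot product at each position."""
    k = len(kernel)
    return np.array([signal[i:i + k] @ kernel for i in range(len(signal) - k + 1)])


def max_pool1d(feature_map, size):
    """Non-overlapping max pooling; values that do not fill a window are dropped."""
    n = len(feature_map) // size
    return feature_map[:n * size].reshape(n, size).max(axis=1)


signal = np.array([1.0, 2.0, 0.0, -1.0, 3.0, 1.0, 0.0])
kernel = np.array([1.0, 0.0, -1.0])  # hand-picked filter (assumption)

feature_map = conv1d(signal, kernel)   # length 7 - 3 + 1 = 5
pooled = max_pool1d(feature_map, 2)    # downsampled to length 2
print(feature_map, pooled)
```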
The final layers of a CNN are typically fully-connected layers, which process the extracted features
and make predictions based on them. The output of the fully-connected layers is typically a
probability distribution over a set of classes, indicating the likelihood that the input belongs to each
class. CNNs are widely used in a variety of applications, including image classification, object
detection, and facial recognition. They have been successful in achieving state-of-the-art results
on many tasks and have become a key tool in the field of computer vision.
Convolutional Neural Networks (CNNs) are a type of neural network that can be used for image
and text classification tasks, including identifying fake reviews. The main characteristic of CNNs
is the use of convolutional layers, which are designed to identify local patterns in the input data by
applying a set of filters. These filters "slide" over the input data and extract local features:
edges, shapes, and textures in images, or short word sequences (n-gram-like patterns) in text.
These features are then passed through several layers of the neural
network, where they are combined and transformed in a hierarchical manner to learn more complex
representations of the input data. In the case of fake review detection, CNNs can be trained to learn
the patterns in the text that are indicative of fake reviews, such as specific words or phrases that
are commonly used in fake reviews. CNNs have been used in many studies for fake review
detection and have been shown to achieve high performance in comparison to other methods.
However, it is important to note that fake review detection is a complex task, and a CNN alone may
not be the best solution; it is often used in combination with other techniques
such as feature engineering and natural language processing [27].
3.1.5 Proposed Combined Machine Learning and Deep Learning Approach
Sentiment analysis, also known as opinion mining, is the use of natural language processing, text
analysis, and computational linguistics to identify and extract subjective information from source
materials. The goal of sentiment analysis is to determine the attitudes, opinions, and emotions of
a speaker or writer with respect to some topic or the overall contextual polarity of a document.
A. CNN + RF / CNN +SVM / CNN + NB
One popular approach to classify reviews is to use a combination of convolutional neural networks
(CNNs) and random forests (RFs) / Support Vector Machine (SVM) / Naïve Bayes (NB)[28]–[30].
CNNs are a type of neural network that are often used for image recognition and have been adapted
for use in natural language processing tasks like sentiment analysis. The main advantage of using
a CNN for sentiment analysis is that it is able to learn the local patterns and features within a
sentence that are most indicative of the sentiment. Random forests, on the other hand, are a type
of decision tree-based ensemble learning method. They are particularly well suited for sentiment
analysis due to their ability to handle large datasets and their robustness to overfitting.
When used in combination, CNNs and RFs can complement each other's strengths to achieve
improved performance on sentiment analysis tasks. The CNN can be used to extract features from
the text, such as n-grams or word embeddings, which are then fed into the RF for classification.
This can be done by using the CNN to train a set of word embeddings for the text, which are then
used as input to the RF classifier. The output of the CNN can also be used as input to the RF as
feature representation of the text.
Support vector machines (SVMs) are a type of supervised learning algorithm that can be used for
classification and regression tasks. One of the main advantages of using an SVM for sentiment
analysis is that it is able to perform well with high-dimensional data, such as text, by using a
technique called kernel trick.
Naive Bayes classifiers, on the other hand, are a family of simple probabilistic classifiers based on
applying Bayes' theorem with strong (naive) independence assumptions between the features. One
of the main advantages of a Naive Bayes classifier is its simplicity: it is easy to implement and
computationally efficient. It is also particularly well suited to text classification problems and can
be trained on small datasets.
Here we combine the CNN with RF / SVM / NB by stacking them. In this approach, the
CNN is used as a feature extractor, and the output of the CNN is passed as input to the RF / SVM
/ NB classifier. This allows the RF / SVM / NB classifier to take advantage of the features learned
by the CNN and improve its classification accuracy.
In conclusion, combining CNN and RF / SVM / NB models
can be a powerful tool for review classification because it draws on the strengths of both
models: the CNN's ability to extract features, and the ability of RF / SVM / NB to handle large datasets
and resist overfitting. This could lead to improved performance over using either model alone.
Figure 3 below shows the combined approach of CNN with the other ML algorithms.
Figure 3. Combining DL (CNN) with ML (RF /SVM / NB) Models
To improve performance, the basic CNN algorithm is modified by removing several layers and
replacing the final layers with machine learning (SVM / RF / NB) algorithms, as shown in Figure 3. It
has one convolution layer, two dense layers, one max-pooling layer, and one flatten layer. This
model has only 11,036 parameters, which makes it lighter than the existing CNN model. The model has an input
shape of (6 × 1), where the features are extracted using NLP techniques and sentiment analysis. The
output is a binary classification indicating whether the review is spam or not.
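The following sketch shows one way such a hybrid could be assembled with TensorFlow (Keras) and scikit-learn. It is an illustration under stated assumptions, not the exact model of this work: the filter count, kernel size, dense width, training settings, and the synthetic data are all guesses, so its parameter count will not match the figure quoted above. A sigmoid output layer is used only to train the CNN; it is then set aside and the first dense layer's activations are fed to RF, SVM, and NB.

```python
import numpy as np
import tensorflow as tf
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic stand-in for six NLP/sentiment features per review (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6, 1)).astype("float32")
y = (X[:, :, 0].sum(axis=1) > 0).astype("int32")
X_train, X_test = X[:160], X[160:]
y_train, y_test = y[:160], y[160:]

# Light CNN: one convolution, one max-pooling, one flatten, two dense layers.
inputs = tf.keras.Input(shape=(6, 1))
x = tf.keras.layers.Conv1D(32, kernel_size=3, padding="same", activation="relu")(inputs)
x = tf.keras.layers.MaxPooling1D(pool_size=2)(x)
x = tf.keras.layers.Flatten()(x)
features = tf.keras.layers.Dense(16, activation="relu")(x)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(features)

cnn = tf.keras.Model(inputs, outputs)
cnn.compile(optimizer="adam", loss="binary_crossentropy")
cnn.fit(X_train, y_train, epochs=5, batch_size=32, verbose=0)

# Reuse the trained layers up to the first dense layer as a feature extractor.
extractor = tf.keras.Model(inputs, features)
F_train = extractor.predict(X_train, verbose=0)
F_test = extractor.predict(X_test, verbose=0)

# Replace the CNN's final layer with classical classifiers.
classifiers = {
    "CNN+RF": RandomForestClassifier(n_estimators=50, random_state=0),
    "CNN+SVM": SVC(kernel="rbf"),
    "CNN+NB": GaussianNB(),
}
results = {}
for name, clf in classifiers.items():
    clf.fit(F_train, y_train)
    results[name] = clf.score(F_test, y_test)
print(results)
```

Because the data here is synthetic, the printed accuracies say nothing about performance on real reviews; the sketch only shows how the feature hand-off between the two model families works.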
The model is built with the scikit-learn ("sklearn") and TensorFlow frameworks and developed in the
PyCharm IDE. F1-score,
precision, recall, and accuracy are used to assess the performance of the proposed models.
Figure 4. Web scraping
Confusion Matrix: A confusion matrix is a table that is used to evaluate the performance of a
classification model. It is a common tool in machine learning and data analysis, and is particularly
useful for understanding the performance of a model on a classification task. The confusion matrix
is constructed by comparing the predicted classes of the model with the true classes of the test
data. The rows of the matrix represent the predicted classes, and the columns represent the true
classes. The confusion matrix typically includes the following four elements:
• True positives (TP): spam reviews correctly classified as spam.
• False positives (FP): genuine reviews incorrectly classified as spam.
• True negatives (TN): genuine reviews correctly classified as genuine.
• False negatives (FN): spam reviews incorrectly classified as genuine.
The confusion matrix can be used to calculate several evaluation metrics, including accuracy,
precision, recall, and F1 score. These metrics can be useful for understanding the strengths and
weaknesses of the model and for comparing the performance of different models.
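The four evaluation metrics follow directly from the confusion matrix counts. The sketch below uses the standard definitions; the function name `classification_metrics` and the example counts are hypothetical and are not results from this work.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, and F1 from confusion matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Invented counts for 100 test reviews (assumption), with spam as the positive class.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

Precision answers "of the reviews flagged as spam, how many were spam?", while recall answers "of the spam reviews, how many were flagged?"; the F1 score is their harmonic mean.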
The proposed work used Naïve Bayes (NB), Support Vector Machine (SVM), Random Forest (RF),
Neural Network (NN), Convolutional Neural Network (CNN), and hybrid (CNN+RF, CNN+SVM,
CNN+NB) classifiers to detect whether an Amazon product review is spam or not. TF-IDF,
NLP processing, and sentiment analysis were applied before the machine
learning, deep learning, and hybrid classifiers. Table 2 (A, B) gives the performance measures
(in percentage) obtained for spam detection after applying the feature extraction techniques to the
product review dataset downloaded from Kaggle. Figures 6 and 7 show the performance
parameter comparison graphs for spam review detection on the Kaggle dataset.
Table 2 (A). Performance comparison of various algorithms on the spam review dataset downloaded from Kaggle (train-test split)
[Bar chart: performance parameters (Accuracy, Precision, Recall, F1-Score) on the x-axis, percentage (0–100) on the y-axis, for SVM, NB, RF, NN, CNN, CNN+NB, CNN+RF, and CNN+SVM]
Figure 6. Performance parameters comparison graph for spam review detection on the Kaggle dataset with the train-test split (80%-20%) technique
Table 2 (B). Performance comparison of various algorithms on the spam review dataset downloaded from Kaggle (10-fold validation)
[Bar chart: performance parameters (Accuracy, Precision, Recall, F1-Score) on the x-axis, percentage (0–100) on the y-axis, for SVM, NB, RF, NN, CNN, CNN+NB, CNN+RF, and CNN+SVM]
Figure 7. Performance parameters comparison graph for spam review detection on the Kaggle dataset with the 10-fold cross-validation technique
Table 3 (A, B) gives the performance measures (in percentage) obtained for spam detection
after applying the feature extraction techniques to the dataset of Amazon product reviews
collected from the Amazon website using web scraping. Figures 8 and 9 show the
performance parameter comparison graphs for spam review detection on the Amazon review dataset.
Table 3 (A). Performance comparison of various algorithms on the spam review dataset downloaded from the Amazon website using web scraping (train-test split)
[Bar chart: performance parameters (Accuracy, Precision, Recall, F1-Score) on the x-axis, percentage (0–100) on the y-axis, for SVM, NB, RF, NN, CNN, CNN+NB, CNN+RF, and CNN+SVM]
Figure 8. Performance parameters comparison graph for spam review detection on the Amazon product review dataset collected using web scraping, with the train-test split (80%-20%) technique
Table 3 (B). Performance comparison of various algorithms on the spam review dataset downloaded from the Amazon website using web scraping (10-fold validation)
[Bar chart: performance parameters (Accuracy, Precision, Recall, F1-Score) on the x-axis, percentage on the y-axis, per algorithm]
Figure 9. Performance parameters comparison graph for spam review detection on the Amazon product review dataset (web scraping)
Figure 10. ROC Curve comparison of algorithms (a) NB (b) RF (c) SVM (d) NN (e) CNN (f) CNN + NB (g) CNN +
SVM (h) CNN + RF
Figure 10 shows the ROC curve comparison of the algorithms on the Amazon product spam
review dataset. The graph shows that the deep learning algorithm and
the combined deep learning approaches produce better ROC curves than the traditional
machine learning approaches.
4.5 Discussion
This section discusses the algorithms used for spam detection. An attempt has been made
to provide researchers with a comparative analysis of different spam review detection methods and
their reported accuracy against our proposed hybrid classification technique. We used two
datasets: (1) a product review dataset downloaded from Kaggle, and (2) an Amazon product review dataset
collected from the Amazon website using web scraping. We also used the K-fold
validation technique on both datasets. The accuracy of the different supervised-learning-based approaches
is presented in the tables above. The results show that the proposed combined ML-DL algorithms outperform
traditional machine learning and deep learning algorithms in terms of accuracy, precision, recall,
and F1-score.
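The K-fold validation mentioned above can be sketched with scikit-learn as follows. The data is synthetic, and the linear SVM is a stand-in classifier for illustration; the fold count of 10 matches the validation setting reported in Tables 2 (B) and 3 (B).

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 200 reviews with 6 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Each of the 10 folds serves once as the test set while the other 9 train the model.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(SVC(kernel="linear"), X, y, cv=cv, scoring="accuracy")

print(scores.mean(), scores.std())
```

Averaging over folds gives a more stable estimate than a single train-test split, at the cost of training the model ten times.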
REFERENCES
[1] S. N. Alsubari et al., “Data analytics for the identification of fake reviews using supervised learning,”
Comput. Mater. Contin., vol. 70, no. 2, pp. 3189–3204, 2022, doi: 10.32604/cmc.2022.019625.
[2] I. Amin and M. Kumar Dubey, “An overview of soft computing techniques on Review Spam Detection,”
Proc. 2021 2nd Int. Conf. Intell. Eng. Manag. ICIEM 2021, pp. 91–96, 2021, doi:
10.1109/ICIEM51511.2021.9445280.
[3] A. Mewada and R. K. Dewang, “A comprehensive survey of various methods in opinion spam detection,”
Multimed. Tools Appl., 2022, doi: 10.1007/s11042-022-13702-5.
[4] S. Saumya and J. P. Singh, “Spam review detection using LSTM autoencoder: an unsupervised approach,”
Electron. Commer. Res., vol. 22, no. 1, pp. 113–133, 2022, doi: 10.1007/s10660-020-09413-4.
[5] V. Gupta, A. Aggarwal, and T. Chakraborty, “Detecting and Characterizing Extremist Reviewer Groups in
Online Product Reviews,” IEEE Trans. Comput. Soc. Syst., vol. 7, no. 3, pp. 741–750, 2020, doi:
10.1109/TCSS.2020.2988098.
[6] M. Fayaz, A. Khan, J. U. Rahman, A. Alharbi, M. I. Uddin, and B. Alouffi, “Ensemble machine learning
model for classification of spam product reviews,” Complexity, vol. 2020, 2020, doi: 10.1155/2020/8857570.
[7] S. Mani, S. Kumari, A. Jain, and P. Kumar, Spam review detection using ensemble machine learning, vol.
10935 LNAI. Springer International Publishing, 2018.
[8] M. S. Jacob, S. Rajendran, V. Michael Mario, K. T. Sai, and D. Logesh, “Fake Product Review Detection
and Removal Using Opinion Mining Through Machine Learning,” Proc. Int. Conf. Artif. Intell. Smart Grid Smart
City Appl., pp. 587–601, 2020, doi: 10.1007/978-3-030-24051-6_55.
[9] D. V. Et. al., “Online Products Fake Reviews Detection System Using Machine Learning,” Turkish J.
Comput. Math. Educ., vol. 12, no. 1S, pp. 29–39, 2021, doi: 10.17762/turcomat.v12i1s.1548.
[10] G. M. Shahariar, S. Biswas, F. Omar, F. M. Shah, and S. Binte Hassan, “Spam Review Detection Using Deep
Learning,” 2019 IEEE 10th Annu. Inf. Technol. Electron. Mob. Commun. Conf. IEMCON 2019, pp. 27–33, 2019,
doi: 10.1109/IEMCON.2019.8936148.
[11] K. Dhingra and S. K. Yadav, “Spam analysis of big reviews dataset using Fuzzy Ranking Evaluation
Algorithm and Hadoop,” Int. J. Mach. Learn. Cybern., vol. 10, no. 8, pp. 2143–2162, 2019, doi:
10.1007/s13042-017-0768-3.
[12] S. Jamshidi, R. Rejaie, and J. Li, “Characterizing the dynamics and evolution of incentivized online reviews
on Amazon,” Soc. Netw. Anal. Min., vol. 9, no. 1, pp. 1–15, 2019, doi: 10.1007/s13278-019-0563-0.
[13] S. Shringi and H. Sharma, “Detection of spam reviews using hybrid grey wolf optimizer clustering method,”
Multimed. Tools Appl., vol. 81, no. 27, pp. 38623–38641, 2022, doi: 10.1007/s11042-022-12848-6.
[14] P. Bhuvaneshwari, A. N. Rao, and Y. H. Robinson, “Spam review detection using self attention based CNN
and bi-directional LSTM,” Multimed. Tools Appl., vol. 80, no. 12, pp. 18107–18124, 2021, doi:
10.1007/s11042-021-10602-y.
[15] N. Jain, A. Kumar, S. Singh, C. Singh, and S. Tripathi, Deceptive Reviews Detection Using Deep Learning
Techniques, vol. 11608 LNCS. Springer International Publishing, 2019.
[16] N. Ilakiyaselvan, S. K. J, and S. Verma, “Fraudulent reviews detection using machine
learning algorithm,” vol. 7, no. 15, pp. 1635–1645, 2020.
[17] S. Ahmed and F. Muhammad, “Using Boosting Approaches to Detect Spam Reviews,” 1st Int. Conf. Adv.
Sci. Eng. Robot. Technol. 2019, ICASERT 2019, vol. 2019, no. Icasert, 2019, doi: 10.1109/ICASERT.2019.8934467.
[18] G. Bathla, P. Singh, R. K. Singh, E. Cambria, and R. Tiwari, “Intelligent fake reviews detection based on
aspect extraction and analysis using deep learning,” Neural Comput. Appl., vol. 34, no. 22, pp. 20213–20229, 2022,
doi: 10.1007/s00521-022-07531-8.
[19] R. K. Dewang and A. K. Singh, “State-of-art approaches for review spammer detection: a survey,” J. Intell.
Inf. Syst., vol. 50, no. 2, pp. 231–264, 2018, doi: 10.1007/s10844-017-0454-7.
[20] S. K. Chauhan, A. Goel, P. Goel, A. Chauhan, and M. K. Gurve, “Research on Product Review Analysis and
Spam Review Detection,” pp. 2–5, 2017.
[21] E. Suganya and S. Vijayarani, Sentiment Analysis for Scraping of Product Reviews from Multiple Web Pages
Using Machine Learning Algorithms. Springer International Publishing, 2020.
[22] A. Ghourabi and M. A. Mahmood, “A Hybrid CNN-LSTM Model for SMS Spam Detection in Arabic and
English Messages,” pp. 1–16, 2020, doi: 10.3390/fi12090156.
[23] S. P. Rajamohana, “An Effective Hybrid Cuckoo Search with Harmony Search for Review Spam Detection,”
pp. 978–981, 2017.
[24] N. Kumari, A. Yadav, and P. K. Jana, “Task offloading in fog computing: A survey of algorithms and
optimization techniques,” Comput. Networks, vol. 214, no. June, p. 109137, 2022, doi:
10.1016/j.comnet.2022.109137.
[25] M. Dolly Nithisha, B. Divya Sri, P. Lekhya Sahithi, and M. Suneetha, Unfair Review Detection on Amazon
Reviews Using Sentiment Analysis, vol. 853. Springer Singapore, 2022.
[26] A. P. Rodrigues et al., “Real-Time Twitter Spam Detection and Sentiment Analysis using Machine Learning
and Deep Learning Techniques,” Comput. Intell. Neurosci., vol. 2022, 2022, doi: 10.1155/2022/5211949.
[27] S. Girgis and M. Gadallah, “Deep learning algorithms for detecting fake news in online text,” pp. 93–97, 2018.
[28] J. Abdul, O. Subhani, and I. Varlamis, “Fake news detection: A hybrid CNN-RNN based deep learning
approach,” Int. J. Inf. Manag. Data Insights, vol. 1, no. December 2020, 2021, doi:
10.1016/j.jjimei.2020.100007.
[29] Y. Jian, X. Chen, and H. W. B, Deep Neural Networks with Hybrid Feature Fusion Method. Springer
International Publishing, 2022.
[30] S. Lin, “Fake Reviews Detection with Hybrid Features Using Time-Sequential Deep Learning Model,” pp. 3–5.