Received: 27 April 2023 Revised: 7 July 2023 Accepted: 20 July 2023

DOI: 10.1002/ail2.86

LETTER

Deep aspect extraction and classification for opinion mining in e-commerce applications using convolutional neural network feature extraction followed by long short term memory attention model

Kamal Sharbatian 1 | Mohammad Hossein Moattar 2

1 Department of Computer Engineering, Urmia University of Technology, Urmia, Iran
2 Department of Computer Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran

Correspondence
Mohammad Hossein Moattar, Department of Computer Engineering, Mashhad Branch, Islamic Azad University, Mashhad, Iran.
Email: moattar@mshdiau.ac.ir

Abstract
Users of e-commerce websites review different aspects of a product in the comment section. In this research, an approach is proposed for opinion aspect extraction and recognition in selling systems. We have used the users' opinions from the Digikala website (www.Digikala.com), which is an Iranian e-commerce company. In this research, a language-independent framework is proposed that is adjustable to other languages. In this regard, after necessary text processing and preparation steps, the existence of an aspect in an opinion is determined using deep learning algorithms. The proposed model combines Convolutional Neural Network (CNN) and long-short-term memory (LSTM) deep learning approaches. CNN is one of the best algorithms for extracting latent features from data. On the other hand, LSTM can detect latent temporal relationships among different words in a text due to its memory ability and attention model. The approach is evaluated on six classes of opinion aspects. Based on the experiments, the proposed model's accuracy, precision, and recall are 70%, 60%, and 85%, respectively. The proposed model was compared in terms of the above criteria with CNN, Naive Bayes, and SVM algorithms and showed satisfying performance.

KEYWORDS
aspect classification, convolutional neural network, deep learning, long short-term
memory, opinion mining

INTRODUCTION

Many vendors sell their products via the internet because of the popularity of social networks. An inseparable part of such websites is the “users' comments” section, in which users freely share the characteristics of the products they have bought with other users. Users also read others' comments about products before buying them, so that they purchase with awareness of the products' different aspects. One of the main challenges for sellers is finding their products' advantages and disadvantages in the shortest time and at the least cost by processing the users' comments, and finding out the future needs of the market. A comment processing system has many advantages for producers, sellers, and buyers.1,2

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
© 2023 The Authors. Applied AI Letters published by John Wiley & Sons Ltd.
Applied AI Letters. 2023;4:e86. wileyonlinelibrary.com/journal/ail2

Opinion mining is a set of activities, including the detection, extraction, and categorization of different subjects in text data related to the users' opinions.3 Opinion mining can be considered a set of processes consisting of natural language processing, text mining, and information retrieval.4 The four general categories considered for mining opinions are document level, sentence level, aspect level, and concept level.4,5 Hence, extracting different aspects of a product can be considered a subset of opinion mining.5
Different opinion mining research fields are as follows6: Polarity classification determines whether a text or a sentence about a product is positive, negative, or neutral. Entity extraction evaluates and extracts entities related to an opinion, like opinion holders and opinion sources. Aspect extraction is defined as a set of activities to extract various aspects of a product expressed in the text.5 The goal of this process is to support more conscious buying of a product and to use the consumers' feedback to enhance the product's quality.5
Opinion mining algorithms are categorized into two groups: machine learning and lexicon-based.7,8 In lexicon-based methods, a set of words with positive and negative labels is predetermined in the vocabulary domain. The next step is inserting their synonyms and antonyms with proper labels into the domain. This process continues until no new word is added to the domain.7 An existing issue is that one sense of a word can be positive while another is negative.9,10 At the sentence level, the semantic label of a sentence can be evaluated from the numbers of positive and negative words relative to all words in the sentence. At the document level, the general label of a text is determined by evaluating the labels of its sentences.9 Although these algorithms are not too time-consuming,9 their precision is lower than that of other methods.11
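
The sentence-level and document-level rules described above can be illustrated with a minimal sketch. The toy lexicon, the function names, and the majority-vote rule at the document level are assumptions made for illustration only; the cited works may score sentences and documents differently.

```python
from typing import List

# Hypothetical toy lexicon; real systems grow it via synonyms and antonyms.
POSITIVE = {"good", "great", "fast"}
NEGATIVE = {"bad", "slow", "weak"}


def sentence_label(sentence: str) -> str:
    """Label a sentence from its counts of positive and negative words."""
    words = sentence.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


def document_label(sentences: List[str]) -> str:
    """Label a document by majority vote over sentence labels (assumed rule)."""
    labels = [sentence_label(s) for s in sentences]
    pos, neg = labels.count("positive"), labels.count("negative")
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"
```

Because the lookup is a plain set membership test, such a method is cheap to run, which matches the low processing time noted above; its weakness is that a word's label is fixed regardless of context.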
Another category of opinion mining algorithms is machine learning, including supervised and unsupervised learning. The best-known algorithms of this category in opinion mining are Decision Trees,12 Naïve Bayes,13 Stochastic Gradient Descent,14 Support Vector Machines,15 Maximum Entropy,16 Random Forest,17 and Multinomial Naïve Bayes.9 With unsupervised learning algorithms, however, labeled data is not required.18,19
One of the main challenges of the mentioned methods is that the generated structure is less efficient than previous
methods in other languages.20 Neural Networks and Deep Learning have been considered in natural language
processing in recent years.21 The objective of deep learning methods is to simplify complex relationships among
data,22–26 and their reported results are better than those of other traditional algorithms.23
There are few studies on opinion mining in the Persian language. The authors in Zobeidi et al.27 merge CNN and Bi-LSTM to classify the polarity of sentences into five classes. In Basiri et al.,28 users' opinions are clustered at the word level into positive and negative clusters. The researchers in Bagheri et al.29 separate the users' opinions into positive and negative using the Naive Bayes classification algorithm. The effects of different preprocessing methods related to the Persian language for opinion mining are studied in Asgarian et al.30
In this research, the product's aspects are extracted from users' comments in the Persian language. Two deep neural networks, namely CNN and LSTM, are combined. Based on previous research, the performance of this combination may be better than using each model separately.31 Since the input of neural networks should be a numeric vector, text data is converted to vectors using Continuous Bag of Words (CBOW). The results are compared with those of Naive Bayes and Support Vector Machine classifiers and are satisfying. The rest of this article is organized as follows: Section 2 includes a literature review of text mining and deep learning. The proposed model is explained in Section 3, including a detailed description of preprocessing as a critical issue. Section 4 explains the details of the implementation, dataset, and evaluation results. Finally, Section 5 concludes the paper and contains some suggestions for the future.

RELATED WORKS

A framework called CD-ALPHN is proposed in Marcacini et al.32 They use a heterogeneous network, including labeled aspects, unlabeled aspects, and linguistic features. The network aims to obtain aspects of the data on which labelling is performed. Then, they extract the aspects using a transductive learning algorithm. The main objective of Luo et al.33 is to extract the most important aspects of a product with the least semantic overlap. First, they detect and gather a set of aspects. Then, they determine the semantic overlap of the aspects. Finally, the most prominent aspects of a product are presented as output. Their results are compared with those of six other structures, showing that their structure performs better than the others.
A framework is presented in Rana et al.34 to extract a product's implicit aspects. They use sequential rules to determine the repetitions of two or more words, and the Google search engine is used to determine the similarity of the words. Tweets related to the battery, screen, and camera of smartphones are processed in Rathan et al.35 with particular attention to emojis, because most users express their opinions and satisfaction using emojis.
In Yang et al.,36 it is argued that the meaning of an aspect differs across product or service categories. For example, the word “Terrible” about horror movies shows a positive aspect, while in other movie genres it may have a negative aspect. Ontology and relationships among the words are used for opinion mining in Alfrjani et al. and Salas-Zarate et al.37,38 A heuristic pattern is used in Asghar et al.39 for aspect extraction in English texts. Summarizing the users' opinions is performed in Konjengbam et al.40 using ontology relationships among different aspects of a product and considering an aspect of a product as a subset of another aspect. When searching for an aspect of a product, all opinions related to the aspect and its subsets are presented.
The authors in Yang et al.41 explain that there may be no opinion about some of a product's aspects; in other words, there is a cold start problem. They solve the problem by considering that each aspect may be a subset of another aspect. The resulting tree structure helps to show the parent-level opinions when there is no opinion about an aspect. A multichannel CNN framework is used in Da'u et al.42,43 They use word embeddings and POS tags to extract a product's aspects for recommender systems. Summarization of users' comments is performed in Hong et al.44 to increase the speed and precision of users' decisions to buy a product. They use LSTM for summarizing the comments.
The researchers in Huang et al.45 use tree data structures and LSTM to improve the precision of the detection and presentation of phrases and sentences. They embed syntactic information such as POS tags in a tree data structure. Processing and extracting the products' different aspects are performed in Jiang et al.,46 considering that users express their future needs in their comments. Their goal is to adjust a policy for future products based on the varying requirements of users. They use a fuzzy neural network to determine the users' future needs.
Onan47 indicates that a bidirectional convolutional recurrent neural network architecture with group-wise mechanisms can outperform the state-of-the-art results for sentiment analysis. The approach presented in Onan48 leverages the strengths of BERT, allowing for better performance on the classification task. The framework is evaluated on several benchmark datasets and compared to state-of-the-art methods, demonstrating significant improvements in classification accuracy. Onan49 proposed a novel text augmentation framework that leverages Semantic Role Labelling and Ant Colony Optimization techniques to generate additional training data for natural language processing models. In Onan et al.,50 a hybrid pruning scheme is proposed based on clustering and randomized search for text sentiment classification.
In Deshmukh et al.,16 using TF-IDF, the aspects of products in unlabeled datasets are obtained. The significant point in this research is the use of the POS tag of each word. The authors in Zobeidi et al.27 gather the users' comments about mobile phones and digital cameras. They use CBOW to represent the opinion text as a numerical matrix and CNN to extract the features. The output of the CNN is used as the input of a Bi-LSTM for classification. The Arabic language users' opinions on Twitter are processed in Abdelminaam et al.51 They divide the users' opinions into positive, negative, and neutral groups and use three datasets of Arabic tweets. Two groups of classification algorithms are used: Deep Learning (LSTM, GRU, and mLSTM) and Machine Learning (Naive Bayes, K-Nearest Neighbor, Logistic Regression, Support Vector Machine, Decision Tree, and Random Forest). Their results indicated better performance of the deep learning methods than the machine learning approaches.
The students' opinions on distance learning websites were studied in Onan et al.52 They gathered more than 93,000 opinions from the coursetalk.com website. Each opinion's score was between 0 and 5 and was used for labelling the opinions (positive or negative). They evaluated and compared different machine learning and deep learning algorithms. The results showed that the deep learning methods were better than other methods, and the LSTM method had the highest performance. Moreover, the GloVe results were better than those of the word2vec and fastText methods. In Onan,53 the performance of different word embedding schemes with several weighting functions has been evaluated in conjunction with conventional deep neural network architectures. In Onan,54 the RNN with attention mechanism in conjunction with GloVe word embedding scheme-based representation has obtained a classification accuracy of 98.29%. In Onan,55 five metaheuristic clustering algorithms are evaluated, and the proposed multiple classifier system outperforms the conventional classification algorithms, ensemble learning, and ensemble pruning methods. In Onan,56 the different feature selection methods for sentiment classification are described. Experimental evaluations indicate that an aggregation model is an efficient method and that it outperforms individual filter-based feature selection methods on sentiment classification. In Onan,57 the researchers presented an extensive comparative analysis of different feature engineering schemes and five different learners in conjunction with ensemble learning methods.
The method in Sagnika et al.58 combines CNN, LSTM, and Attention Network to classify the sentences into Subjective and Objective groups. In Basiri et al.,59 the authors have provided a lexicon-based method using machine learning methods in the Persian language. Today, Opinion Mining is utilized in many fields, such as medical,60,61 determining future strategies of an organization,62 advertisement,63 social network analysis,64 policy,65 training,66 recommender systems,43 terrorism detection,67 sarcasm detection,68,69 education,51 economy,70 and news.71

PROPOSED METHOD

Preprocessing

In this research, deep learning is used for classification. Preprocessing is one of the most influential stages in text mining. Many preprocessing stages are the same across languages, such as removing stop words, POS tagging, stemming, and spell-checking. Here, the Google search engine is used for spell-checking: the words of each opinion are submitted to Google as queries one by one. Since the Google search engine has a “Did You Mean” feature, it corrects misspelt words. This seems a proper approach to spell-checking in languages without efficient spell-checking tools. The preprocessing implementation stages are explained in detail in the experimental results section.
Neural network input should be numerical. Thus, the text data is converted into vectors. To achieve this aim, Word2Vec is used as one of the most valuable tools for natural language processing. This algorithm was first proposed by Mikolov et al.72 There are two models for determining the vector of a word, called CBOW and Skip-gram. In both models, the vector of each word is determined based on the word's meaning and the relationship between the word and its surroundings.27 CBOW is used in this research to determine the vector of each word. After the vector of each word is determined, the matrix of each sentence is created based on the word sequence. Hence, a two-dimensional matrix is generated for each comment by putting together its words' vectors. The number of words differs from comment to comment, so the first dimensions of the comments' matrices differ. To solve this problem, zero padding is applied to the matrices whose first dimension is less than the considered length.73
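
The zero-padding step can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and parameters are hypothetical, and truncating comments longer than the target length is an added assumption, since the text only describes padding.

```python
from typing import List

Vector = List[float]


def pad_comment(word_vectors: List[Vector], target_len: int, dim: int) -> List[Vector]:
    """Return a target_len x dim matrix for one comment.

    Short comments are padded with all-zero rows; long comments are
    truncated (the truncation is an assumption of this sketch).
    """
    matrix = [list(v) for v in word_vectors[:target_len]]
    matrix.extend([0.0] * dim for _ in range(target_len - len(matrix)))
    return matrix
```

For example, a two-word comment with two-dimensional vectors padded to four rows gains two rows of zeros, so every comment matrix passed to the CNN has the same shape.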
In this research, the comments are first annotated with their aspects by a group of annotators. For this purpose, the different aspects of a product are determined first. Then, the annotators read the comments and determine the aspects that are expressed in each comment. The notable point is that annotators only determine at a glance whether an aspect of the product is expressed or not; they do not determine which sentence or word of the comment specifies which aspect.

The classification architecture

The proposed method includes three layers: CNN, LSTM, and densely connected neural networks. CNN is a multilayer neural network and the primary method for feature extraction in deep learning.73,74 The network is used to extract text data features that result in improved classification precision.27,74 Based on Du et al.75 and Zhou et al.,76 CNN is a natural model that the human mind uses for accurate study.
Because of the vanishing and exploding gradient problems in Recurrent Neural Networks (RNNs), LSTM is used, which maintains critical information for a long time.31 LSTM is considered a network of modules. Each module consists of three parts: an input gate, an output gate, and a forget gate. The goal of these parts is to maintain and update the memory status.77 They should maintain helpful information for a long time and discard unimportant information. CNNs, in turn, are very effective in reducing the problem's dimensions. Moreover, these networks can use different filters to extract various data features. Each filter has its own kernel, which makes it easy to extract relations among different words. It is not required to define explicit rules for the words' roles in a sentence to extract these relations. Indeed, a CNN is an ideal tool for extracting local features of data. Implementing the kernel requires only the inner product of two vectors; hence, the computational load of this method is very low. The max-pooling process then extracts the most significant feature.
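
To make the kernel and max-pooling behavior concrete, the following minimal sketch applies a single filter to a comment matrix in plain Python. It assumes no padding, stride 1, non-overlapping pooling windows, and no bias or activation function; these choices, like the function names, are illustrative and are not stated in the paper.

```python
from typing import List

Matrix = List[List[float]]


def conv1d(comment: Matrix, kernel: Matrix) -> List[float]:
    """Slide one kernel over the word axis; each output is an inner product."""
    k = len(kernel)
    out = []
    for start in range(len(comment) - k + 1):
        window = comment[start:start + k]
        out.append(sum(x * w
                       for row, krow in zip(window, kernel)
                       for x, w in zip(row, krow)))
    return out


def max_pool(features: List[float], size: int) -> List[float]:
    """Keep the largest value in each non-overlapping window of length size."""
    return [max(features[i:i + size])
            for i in range(0, len(features) - size + 1, size)]
```

With several filters, `conv1d` would be called once per kernel, giving one feature sequence per filter; pooling then shortens each sequence while keeping its strongest responses.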

Words in different sentences can affect each other, and a sentence may be longer than the kernel size. Hence, the impacts of different words on each other in a text should be investigated. The LSTM neural network architecture allows the words' semantic effects in a text to be investigated quickly. Since LSTM can maintain time dependence over a long interval, it is usable for many complex problems.78 The LSTM neural network forgets unnecessary data and memorizes the essential information, much like the human brain, which forgets unessential words and memorizes the keywords when studying a text. Using an LSTM neural network, the keywords' effects on each other in a text are evaluated quickly.
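
For reference, the gate mechanism that lets an LSTM keep or discard information is commonly written as below. This is the standard textbook formulation and not necessarily the exact variant used in this work. The symbols are introduced here for illustration: $x_t$ is the input at step $t$, $h_t$ the hidden state, $c_t$ the cell (memory) state, $W$, $U$, and $b$ are learned parameters, $\sigma$ is the logistic sigmoid, and $\odot$ is the element-wise product.

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

The forget gate $f_t$ scales down the parts of the previous memory $c_{t-1}$ that are no longer needed, while the input gate $i_t$ controls how much of the new candidate $\tilde{c}_t$ is stored.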
The proposed model is presented in Figure 1. It shows that the input data is first transformed into vector space using the Word2Vec algorithm. The vector data is given to the CNN in the second step, where different filters extract the text's features. The output of the CNN is used as input for the LSTM neural network. The dependence among the words in a text, even a long text, can be evaluated using LSTM. The LSTM output is given to a fully connected neural network to evaluate the existence of an aspect in a comment. The implementation details are explained in the experimental results section.

EXPERIMENTAL RESULTS

In this section, the stages of data gathering and preprocessing are explained in detail. Then the details of the proposed model are presented. Finally, the proposed model is evaluated and compared with naive Bayes and SVM classification algorithms.25,26,28,35,45,61

Dataset

In this research, the data is gathered from the Persian Digikala website. On the website, there are 10 different categories of products for sale. Each category has some subcategories. For example, from the “Digital Products” category, the “Mobile Phone” subcategory is selected and the associated comments are gathered. The gathered data is the users' comments from 15 July 2020 to 25 July 2020 related to 1,432,855 mobile phones. The total number of comments is 4,974,018, and 1,270,009 products are without comments. The largest number of comments about a single product is 7415, and the largest number of words in a comment is 2792.

FIGURE 1 The proposed architecture for aspect recognition in text (e.g., this mobile phone has little memory).

The proposed model is implemented on a system with a Core-i7 processor and 8GB RAM. The crawling program is written in the C# programming language and is multi-threaded: five threads are used, and the time interval between two requests of a thread is 10 s. The retrieved results are saved as HTML pages, and the text data is then extracted from them using HTML Agility Pack. The Python 3 programming language is used for implementing the neural network and the deep learning algorithms.
The Digikala website considers six aspects for mobile phones: (1) purchase value rather than cost, (2) facilities and capabilities, (3) ease of use, (4) design and appearance, (5) construction quality, and (6) innovation. The aspects determined for a product should not overlap. To this end, 270,000 comments have been investigated, and it is determined which aspects of the product are expressed in each comment. A comment may express none of the product's aspects, or all aspects of the product may be criticized.
Table 1 shows the details of the comments related to each aspect. The #Review column shows the number of comments related to the considered aspect. The Max and Min columns show the maximum and minimum numbers of words in a comment. Finally, the Avg column presents the average number of words in a comment.

Preprocessing steps

Preprocessing is very effective for generating results. Twelve steps are considered for preprocessing in this research.
Because users are free to write their comments, many users use Emojis to express their opinions. First, all Emojis are
removed from the text. In the second step, all Latin letters are written in lowercase, and the numbers are removed from
the text in the third step.
In the fourth step, punctuation is removed from the text. In the fifth step, words with consecutively repeated letters are corrected. Sometimes, users repeat a letter of a word many times in their comments to emphasize that word. For example, the word “Good” may be written as “Goodddddddd” to show that a feature is very good, although there is no such word in the dictionary. Here, if a letter in a word is repeated in this way, the word is considered consecutive and is corrected.
In the sixth step, all additional spaces are removed from the text so that there is only one space between words. Users often do not put a space between Persian and Latin words. Since this research is performed in the Persian language, in the seventh step, non-Persian words are removed from the comments.
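
Steps two to six can be sketched with a few regular expressions. This is a hedged illustration, not the authors' implementation: the punctuation list is an assumed, incomplete set, and the rule of collapsing a letter repeated three or more times into a single letter is one possible reading of the fifth step (it would, for instance, also shorten a word that is elongated in the middle). Emoji removal (step one) and non-Persian word removal (step seven) are omitted.

```python
import re
import string

# ASCII punctuation plus a few common Persian marks (an assumed, incomplete list).
PUNCTUATION = string.punctuation + "،؛؟«»"
PUNCT_TABLE = str.maketrans({ch: " " for ch in PUNCTUATION})


def clean_comment(text: str) -> str:
    """Apply simplified versions of preprocessing steps 2 to 6 to one comment."""
    text = text.lower()                               # step 2: lowercase Latin letters
    text = re.sub(r"\d+", " ", text)                  # step 3: drop numbers
    text = text.translate(PUNCT_TABLE)                # step 4: drop punctuation
    text = re.sub(r"([^\W\d_])\1{2,}", r"\1", text)   # step 5: collapse letters repeated 3+ times
    return re.sub(r"\s+", " ", text).strip()          # step 6: single spaces only
```

Under these assumptions, an input such as "Goodddddddd   phone!!! 100%" is reduced to "good phone".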
Since numerous comments include misspellings, an essential step is to spell-check the comments. Some researchers propose methods for misspelling correction in English words, but not for all languages. If a misspelt word is entered in the Google search engine, Google suggests the corrected word by showing “Did You Mean?” To exploit this, the Google search engine is utilized, and using Google-related APIs, all words are given to the search engine as queries. This eighth step can be considered the most important one in the preprocessing process.
In the ninth step, all stop words are removed from the comments. In the tenth step, all verbs in the comment text are removed because all aspects of a product depend on nouns; verbs do not help in detecting a product's aspects. In the eleventh step, word stems are extracted and utilized. Persian stemming is done using https://github.com/ICTRC/Parsivar. In the final step, the text data is put into vector space using CBOW, with a word vector length of 300. The resulting data is then used as the CNN's input. The 12 steps of preprocessing are shown in Figure 2.

TABLE 1 The specifications of the dataset.

Aspect                             #Review   Max    Min   Avg
Purchase value rather than cost    61,886    5124   2     136
Facilities and capabilities        73,324    5124   2     130
Ease of use                        24,168    4077   2     145
Design and appearance              52,374    4077   2     142
Construction quality               78,104    5124   2     135
Innovation                         20,310    4077   2     162

Parameter setup

In this research, six different aspects are considered. The modelling is performed for each aspect separately to reduce complexity and improve performance; that is, the proposed model is run once per aspect. Therefore, the model's output is a single binary value: 0 means the aspect does not exist in the text (i.e., negative), and 1 means it exists (i.e., positive). Another notable point is that the classes of aspects are highly imbalanced, which affects the performance of the approach. Thus, in the preparation of the training data for each aspect's model, equal numbers of positive and negative samples are used. Also, 10-fold cross-validation is used as the evaluation method in this paper.
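
One simple way to obtain equal numbers of positive and negative samples is random undersampling of the larger class, sketched below. The paper does not say how the balance was achieved, so the undersampling strategy, the function name, and the fixed seed are assumptions of this sketch.

```python
import random
from typing import List, Sequence, Tuple


def balance_binary(samples: Sequence[str], labels: Sequence[int],
                   seed: int = 0) -> Tuple[List[str], List[int]]:
    """Undersample the larger class so that labels 1 and 0 are equally frequent."""
    rng = random.Random(seed)
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    n = min(len(pos), len(neg))
    pairs = ([(s, 1) for s in rng.sample(pos, n)]
             + [(s, 0) for s in rng.sample(neg, n)])
    rng.shuffle(pairs)  # avoid all positives preceding all negatives
    return [s for s, _ in pairs], [y for _, y in pairs]
```

The balanced set produced for one aspect would then be split into ten folds for cross-validation, with the procedure repeated independently for each of the six aspects.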
Determining the size of the kernel is required for CNN implementation. The kernel size considered in this research is 9, meaning that each word can be related to eight neighboring words. The number of filters considered for the CNN is 256. The max-pooling process is performed with a window size of 4. Its output is the LSTM network's input. The number of LSTM units is 100. Finally, the LSTM network's output is the input of the fully connected neural network, whose output determines whether the input comment includes the specific aspect or not.
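
The effect of these parameters on the data shape can be traced with a short calculation. The sketch below assumes a convolution without padding and with stride 1, pooling with stride equal to the window size, and an LSTM that returns only its final state; none of these details is stated in the paper, and the function names are hypothetical.

```python
def conv_output_len(n_words: int, kernel_size: int = 9) -> int:
    """Length after a 1D convolution with no padding and stride 1 (assumed)."""
    return n_words - kernel_size + 1


def pool_output_len(n_steps: int, window: int = 4) -> int:
    """Length after non-overlapping max-pooling (stride equal to window, assumed)."""
    return n_steps // window


def layer_shapes(n_words: int, embed_dim: int = 300, filters: int = 256,
                 lstm_units: int = 100) -> dict:
    """Shape of the data after each layer for a comment of n_words words."""
    conv_len = conv_output_len(n_words)
    pool_len = pool_output_len(conv_len)
    return {
        "input": (n_words, embed_dim),   # CBOW vectors of length 300
        "conv": (conv_len, filters),     # 256 filters, kernel size 9
        "pool": (pool_len, filters),     # max-pooling window 4
        "lstm": (lstm_units,),           # 100 LSTM units, final state only
        "dense": (1,),                   # binary output: aspect present or not
    }
```

For a hypothetical input of 136 words, the convolution yields 136 - 9 + 1 = 128 steps and the pooling reduces them to 128 / 4 = 32 steps, so the LSTM processes a sequence of 32 feature vectors of size 256.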

FIGURE 2 Preprocessing steps.



RESULTS

The results of the proposed CNN + LSTM model are compared with naive Bayes and SVM classification algorithms.
Input data in naive Bayesian and SVM algorithms are put in vector space using TF-IDF and CBOW methods. Therefore,
the proposed model's results are compared with TF-IDF + SVM, TF-IDF + NB, CBOW + SVM, and CBOW + NB
models. The considered performance criteria for the model are accuracy, precision, recall, and F-score. The following
figures show the evaluation results of the mentioned models.
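
For clarity, the four criteria can be computed from the counts of true positives (tp), false positives (fp), true negatives (tn), and false negatives (fn) of one aspect's binary classifier. The sketch below uses the standard definitions and assumes that the F-score is the balanced F1 measure, which the text does not state explicitly.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }
```

Note that recall can be high while precision is lower when a model labels many comments as positive, which is why the criteria are reported separately.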
Since there must be only one vector for each text, the inputs of the CBOW + SVM and CBOW + NB methods are the average of the word vectors of each comment. In all the figures, the Average value shows the classification model's average performance over all aspects of the product for the considered criterion.
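
Averaging the word vectors of a comment into a single input vector can be sketched as follows; the function name is hypothetical, and raising an error for an empty comment is a choice made for this illustration.

```python
from typing import List


def mean_vector(word_vectors: List[List[float]]) -> List[float]:
    """Average the word vectors of one comment, dimension by dimension."""
    if not word_vectors:
        raise ValueError("comment has no word vectors")
    n = len(word_vectors)
    return [sum(column) / n for column in zip(*word_vectors)]
```

Averaging discards word order, which may partly explain why the CBOW-based baselines behave differently from the CNN + LSTM model that consumes the full sequence.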
Accuracy measures the proportion of samples that the algorithm categorizes correctly. Figure 3 shows that the accuracies of the CBOW + SVM and CBOW + NB methods are very low. It means that the performance of these two methods is not good and that combining Word2Vec with the NB and SVM algorithms does not give the desired results. Figure 3 also presents the accuracy of the TFIDF + SVM and TFIDF + NB methods. The TFIDF + SVM classification method is more accurate than TFIDF + NB for all aspects. The highest accuracy belongs to TFIDF + SVM: it is 74% for the “purchase value rather than cost” aspect, while the accuracy obtained for this aspect by the proposed method is 72.93%.
As Figure 3 shows, the accuracies of the proposed CNN + LSTM method for the other aspects are higher than those of the TFIDF + SVM method. Based on the figure, the average accuracy of the proposed CNN + LSTM model is higher than that of the other classification methods. Another notable point is that the accuracy difference of the proposed CNN + LSTM model across aspects is less than 5%, while this difference is close to 20% for TFIDF + SVM and near 12% for TFIDF + NB. It means that the proposed CNN + LSTM approach is more consistent than the other models.
Another criterion for evaluation is precision. As shown in Figure 4, the precision values of the CBOW + SVM and CBOW + NB algorithms are acceptable. However, the TFIDF + SVM and TFIDF + NB classification algorithms perform worse than the proposed CNN + LSTM model. The results of the TFIDF + SVM algorithm are better than those of TFIDF + NB. Figure 4 shows that the precision of the proposed CNN + LSTM model is at least 75%. The difference between the average precision of the proposed CNN + LSTM model and that of the other classification algorithms is at least 10%.

FIGURE 3 Accuracy of the proposed approach versus other classification models.



FIGURE 4 Precision of the proposed approach versus other classification models.

FIGURE 5 Recall of the proposed approach versus other classification models.

Another investigated criterion is recall, which focuses on the truly positive samples. As Figure 5 shows, the CBOW +
SVM and CBOW + NB values are better than those of TFIDF + SVM and TFIDF + NB. Notably, the minimum recall of the
proposed CNN + LSTM model is 83%, meaning that the algorithm labeled at least 83% of the positive samples correctly
for every aspect. Figure 5 also shows that the recall values of CBOW + SVM, CBOW + NB, TFIDF + NB, and TFIDF + SVM
vary across aspects, while the maximum difference in recall across aspects for the proposed CNN + LSTM model is 5%.

FIGURE 6 F-score of the proposed approach versus other classification models.

The last criterion considered is the F-score, which combines recall and precision. Figure 6 shows that the F-score of
the proposed CNN + LSTM model is better than that of the other classification algorithms considered. The minimum
F-score of the proposed CNN + LSTM model is 75%, and its maximum difference across all the investigated aspects
is 10%.
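For readers who want to reproduce the three criteria for a single aspect, the sketch below derives precision, recall, and F-score from confusion counts. It assumes the standard F1 definition (the harmonic mean of precision and recall); the function name `precision_recall_f1` and the counts passed to it are hypothetical and are not taken from the experiments.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 from true/false positive and false negative counts."""
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of truly positive samples that were labeled positive.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall (assumed definition).
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for one aspect class, for illustration only.
p, r, f = precision_recall_f1(tp=3, fp=1, fn=3)
```

Because the harmonic mean is pulled toward the smaller of the two inputs, a method with high recall but low precision still receives a modest F-score.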
On all the studied criteria, the proposed CNN + LSTM model gives better results than the other methods. Moreover, its
average over all criteria is more favorable, indicating that the method can be used to detect different aspects of a
product in a comment with high confidence.

CONCLUSION AND FUTURE WORK

This research proposed a model for aspect recognition in opinion mining. Based on the results, the proposed model is
more effective than previously proposed approaches such as SVM and NB. Moreover, deep learning models remove the
dependence on a specific language. The results show that CNN and LSTM are very effective for detecting a product's
aspects at the document level. In this research, a large volume of data remained unlabeled because labeling and
annotation are time-consuming. These data could be used in a semi-supervised approach, which is suggested for future
work. A further study could also automatically construct aspect lexicons based on the results of the proposed model.
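One common way to exploit unlabeled comments, offered here only as an illustration of the semi-supervised direction and not as the authors' plan, is pseudo-labeling: a trained classifier scores the unlabeled samples, and only confident predictions are added to the training set. The function name `select_pseudo_labels`, the 0.9 threshold, and the probabilities below are all assumptions.

```python
from typing import List, Sequence, Tuple


def select_pseudo_labels(
    probabilities: Sequence[Sequence[float]],
    threshold: float = 0.9,
) -> List[Tuple[int, int]]:
    """Return (sample_index, predicted_class) pairs whose top probability meets the threshold."""
    selected: List[Tuple[int, int]] = []
    for idx, probs in enumerate(probabilities):
        if not probs:
            continue  # skip samples without a prediction
        # Index of the most probable class for this sample.
        best_class = max(range(len(probs)), key=probs.__getitem__)
        if probs[best_class] >= threshold:
            selected.append((idx, best_class))
    return selected


# Hypothetical class probabilities for three unlabeled comments
# (aspect absent, aspect present); illustrative values only.
unlabeled_probs = [
    [0.95, 0.05],
    [0.60, 0.40],
    [0.08, 0.92],
]
pseudo = select_pseudo_labels(unlabeled_probs, threshold=0.9)
```

In this example the second comment is left unlabeled because neither class reaches the threshold; in practice the threshold trades the amount of added data against label noise.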

ACKNOWLEDGMENTS
The authors declare that they have no known competing financial interests or personal relationships that could have
appeared to influence the work reported in this article.

DATA AVAILABILITY STATEMENT


The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable
request.

ORCID
Mohammad Hossein Moattar https://orcid.org/0000-0002-8968-6744

REFERENCES
1. Guellil I, Boukhalfa K. Social big data mining: A survey focused on opinion mining and sentiments analysis. Paper presented at: 2015
12th International Symposium on Programming and Systems (ISPS), IEEE. 2015.
2. Keyvanpour M, Karimi Zandian Z, Heidarypanah M. OMLML: a helpful opinion mining method based on lexicon and machine learning
in social networks. Soc Netw Anal Mining. 2020;10(1):1-17.
3. Riaz S, Fatima M, Kamran M, Nisar MW. Opinion mining on large scale data using sentiment analysis and k-means clustering. Cluster
Computing. 2019;22(3):7149-7164.
4. Hemmatian F, Sohrabi MK. A survey on classification techniques for opinion mining and sentiment analysis. Artif Intell Rev. 2019;52(3):
1495-1545.
5. Wang R, Zhou D, Jiang M, Si J, Yang Y. A survey on opinion mining: from stance to product aspect. IEEE Access. 2019;7:41101-41124.
6. Yue L, Chen W, Li X, Zuo W, Yin M. A survey of sentiment analysis in social media. Knowledge Inform Syst. 2019;60(2):617-663.
7. Pradhan VM, Vala J, Balani P. A survey on sentiment analysis algorithms for opinion mining. Int J Comput Appl. 2016;133(9):7-11.
8. Soong H-C, Jalil NBA, Ayyasamy RK, Akbar R. The essential of sentiment analysis and opinion mining in social media: introduction
and survey of the recent approaches and techniques. Paper presented at: 2019 IEEE 9th Symposium on Computer Applications & Indus-
trial Electronics (ISCAIE), IEEE. 2019.
9. Isabelle G, Maharani W, Asror I. Analysis on opinion mining using combining lexicon-based method and multinomial Naïve Bayes.
Paper presented at: 2018 International Conference on Industrial Enterprise and System Engineering (ICoIESE 2018). 2018.
10. Ding X, Liu B, Yu PS. A Holistic Lexicon-Based Approach to Opinion Mining. Paper presented at: Proceedings of the 2008 International
Conference on Web Search and Data Mining. 2008.
11. Vaitheeswaran G, Arockiam L. Combining lexicon and machine learning method to enhance the accuracy of sentiment analysis on big
data. Int J Comput Sci Inform Technol. 2016;7(1):306-311.
12. Gupta S, Jain S, Gupta S, et al. Opinion mining for hotel rating through reviews using decision tree classification method. Int J Adv Res
Comput Sci. 2018;9(2):180-184.
13. Hasan KA, Sabuj MS, Afrin Z. Opinion Mining Using Naive Bayes. Paper presented at: 2015 IEEE International WIE Conference on
Electrical and Computer Engineering (WIECON-ECE), IEEE. 2015.
14. Minab SS, Jalali M, Moattar MH. Online analysis of sentiment on Twitter. Paper presented at: 2015 International Congress on Technol-
ogy, Communication and Knowledge (ICTCK), pp. 359-365. 2015.
15. Ali F, Kwak K-S, Kim Y-G. Opinion mining based on fuzzy domain ontology and support vector machine: A proposal to automate online
review classification. Appl Soft Comput. 2016;47:235-250.
16. Deshmukh JS, Tripathy AK. Entropy based classifier for cross-domain opinion mining. Appl Comput Inform. 2018;14(1):55-64.
17. Onan A, Korukoglu S, Bulut H. Ensemble of keyword extraction methods and classifiers in text classification. Expert Syst Appl. 2016;57:
232-247.
18. Zhao Y, Qin B, Liu T. Clustering product aspects using two effective aspect relations for opinion mining. In: Sun M, Liu Y, Zhao J, eds.
Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data (NLP-NABD CCL 2014).
Lecture Notes in Computer Science. Vol 8801. Springer; 2014:120-130.
19. Babu AG, Kumari SS, Kamakshaiah K. An experimental analysis of clustering sentiments for opinion mining. Paper presented at: Pro-
ceedings of the 2017 International Conference on Machine Learning and Soft Computing. 2017.
20. Minab SS, Jalali M, Moattar MH. A new sentiment classification method based on hybrid classification in twitter. Paper presented at:
2015 International Congress on Technology, Communication and Knowledge (ICTCK), pp. 295-298. 2015.
21. Mohamad Beigi O, Moattar MH. Automatic construction of domain-specific sentiment lexicon for unsupervised domain adaptation and
sentiment classification. Knowl Based Syst. 2021;213:106423.
22. Gupta N, Agrawal R. Application and techniques of opinion mining. In: Bhattacharyya S, Snašel V, Gupta D, Khanna A, eds. Hybrid
Computational Intelligence. Elsevier; 2020:1-23.
23. Poria S, Cambria E, Gelbukh A. Aspect extraction for opinion mining with a deep convolutional neural network. Knowl Based Syst.
2016;108:42-49.
24. Akhtar MS, Kumar A, Ekbal A, et al. A hybrid deep learning architecture for sentiment analysis. Paper presented at: Proceedings of
COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers. 2016.
25. Dahou A, Xiong S, Zhou J, Haddoud MH, Duan P. Word embeddings and convolutional neural network for Arabic sentiment classifica-
tion. Paper presented at: Proceedings of COLING 2016, the 26th international conference on computational linguistics: Technical papers.
2016.
26. Joshi A, Prabhu A, Shrivastava M, Varma V. Towards sub-word level compositions for sentiment analysis of Hindi-English code mixed
text. Paper presented at: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical
Papers. 2016.
27. Zobeidi S, Naderan M, Alavi SE. Opinion mining in Persian language using a hybrid feature extraction approach based on convolutional
neural network. Multimed Tools Appl. 2019;78(22):32357-32378.

28. Basiri ME, Naghsh-Nilchi AR, Ghassem-Aghaee N. A framework for sentiment analysis in Persian. Open Trans Inform Process. 2014;
1(3):1-14.
29. Bagheri A, Saraee M, de Jong F. Sentiment classification in Persian: introducing a mutual information-based method for feature selec-
tion. Paper presented at: 2013 21st Iranian Conference on Electrical Engineering (ICEE), IEEE. 2013.
30. Asgarian E, Kahani M, Sharifi S. The impact of sentiment features on the sentiment polarity classification in Persian reviews. Cogn Com-
put. 2018;10(1):117-135.
31. Zhang J, Li Y, Tian J, Li T. LSTM-CNN hybrid model for text classification. Paper presented at: 2018 IEEE 3rd Advanced Information
Technology, Electronic and Automation Control Conference (IAEAC), IEEE. 2018.
32. Marcacini RM, Rossi RG, Matsuno IP, Rezende SO. Cross-domain aspect extraction for sentiment analysis: A transductive learning
approach. Decis Support Syst. 2018;114:70-80.
33. Luo Z, Huang S, Zhu KQ. Knowledge empowered prominent aspect extraction from product reviews. Inform Process Manag. 2019;56(3):
408-423.
34. Rana TA, Cheah Y-N, Rana TJAI. Multi-level knowledge-based approach for implicit aspect identification. Appl Intell. 2020;50(12):4616-
4630.
35. Rathan M, Hulipalled VR, Venugopal KR, Patnaik LM. Consumer insight mining: aspect based twitter opinion mining of mobile phone
reviews. Appl Soft Comput. 2018;68:765-773.
36. Yang H-L, Lin Q-F. Opinion mining for multiple types of emotion-embedded products/services through evolutionary strategy. Expert Syst
Appl. 2018;99:44-55.
37. Alfrjani R, Osman T, Cosma G. A hybrid semantic knowledgebase-machine learning approach for opinion mining. Data Knowl Eng.
2019;121:88-108.
38. Salas-Zarate MdP, Valencia-García R, Ruiz-Martínez A, Colomo-Palacios R. Feature-based opinion mining in financial news: an
ontology-driven approach. J Inform Sci. 2017;43(4):458-479.
39. Asghar MZ, Khan A, Zahra SR, Ahmad S, Kundi FM. Aspect-based opinion mining framework using heuristic patterns. Cluster Comput.
2019;22(3):7181-7199.
40. Konjengbam A, Dewangan N, Kumar N, Singh M. Aspect ontology based review exploration. 2018;30:62-71.
41. Yang Y, Chen C, Qiu M, Bao F. Aspect Extraction from Product Reviews Using Category Hierarchy Information. Paper presented at:
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers.
2017.
42. Da'u A, Salim M, Rabiu I, Osman A. Weighted aspect-based opinion mining using deep learning for recommender system. Expert Syst
Appl. 2020;140:112871.
43. Da'u A, Salim M, Rabiu I, Osman A. Recommendation system exploiting aspect-based opinion mining with deep learning method.
Inform Sci. 2020;512:1279-1292.
44. Hong M, Wang H. Research on customer opinion summarization using topic mining and deep neural network. Math Comput Simula-
tion. 2021;185:88-114.
45. Huang M, Qian Q, Zhu X. Encoding syntactic knowledge in neural networks for sentiment classification. ACM Trans Inform Syst. 2017;
35(3):1-27.
46. Jiang H, Kwong CK, Okudan Kremer GE, Park WY. Dynamic modelling of customer preferences for product design using DENFIS and
opinion mining. Adv Eng Inform. 2019;42:100969.
47. Onan A. Bidirectional convolutional recurrent neural network architecture with group-wise enhancement mechanism for text sentiment
classification. J King Saud Univ Comput Inform Sci. 2022;34(5):2098-2117.
48. Onan A. Hierarchical graph-based text classification framework with contextual node embedding and BERT-based dynamic fusion.
J King Saud Univ Comput Inform Sci. 2023;35:101610.
49. Onan A. SRL-ACO: A text augmentation framework based on semantic role labeling and ant Colony optimization. J King Saud Univ
Comput Inform Sci. 2023;35:101611.
50. Onan A, Korukoglu S, Bulut H. A hybrid ensemble pruning approach based on consensus clustering and multi-objective evolutionary
algorithm for sentiment classification. Inf Process Manag. 2017;53(4):814-833.
51. Abdelminaam DS, Neggaz N, Gomaa IAE, Ismail FH, Elsawy AA. ArabicDialects: an efficient framework for Arabic dialects opinion
mining on twitter using optimized deep neural networks. IEEE Access. 2021;9:97079-97099.
52. Onan A. Sentiment analysis on massive open online course evaluations: a text mining and deep learning approach. Comput Appl Eng
Educ. 2021;29(3):572-589.
53. Onan A. Sentiment analysis on product reviews based on weighted word embeddings and deep neural networks. Concurr Comput Pract
Exp. 2021;33(23):e5909.
54. Onan A. Mining opinions from instructor evaluation reviews: a deep learning approach. Comput Appl Eng Educ. 2020;28(1):117-138.
55. Onan A. Biomedical text categorization based on ensemble pruning and optimized topic modelling. Comput Math Methods Med. 2018;
2018:2497471.
56. Onan A, Korukoglu S. A feature selection model based on genetic rank aggregation for text sentiment classification. J Inform Sci. 2017;
43(1):25-38.
57. Onan A. An ensemble scheme based on language function analysis and feature engineering for text genre classification. J Inform Sci.
2018;44(1):28-47.

58. Sagnika S, Mishra BSP, Meher SK. An attention-based CNN-LSTM model for subjectivity detection in opinion-mining. Neural Comput
Appl. 2021;33(24):17425-17438.
59. Basiri ME, Kabiri A. HOMPer: A new hybrid system for opinion mining in the Persian language. J Inform Sci. 2020;46(1):101-117.
60. Gopalakrishnan V, Ramaswamy C. Patient opinion mining to analyze drugs satisfaction using supervised learning. J Appl Res Technol.
2017;15(4):311-319.
61. Basiri ME, Abdar M, Cifci MA, Nemati S, Acharya UR. A novel method for sentiment classification of drug reviews using fusion of deep
and machine learning techniques. Knowl Based Syst. 2020;198:105949.
62. Yoon B, Jeong Y, Lee K, Lee S. A systematic approach to prioritizing R&D projects based on customer-perceived value using opinion
mining. Technovation. 2020;98:102164.
63. Sanchez-Núñez P, Cobo MJ, de las Heras-Pedrosa C, Pelaez JI, Herrera-Viedma E. Opinion mining, sentiment analysis and emotion
understanding in advertising: A bibliometric analysis. IEEE Access. 2020;8:134563-134576.
64. Hou Z, Cui F, Meng Y, Lian T, Yu C. Opinion mining from online travel reviews: A comparative analysis of Chinese major OTAs using
semantic association analysis. Tourism Manag. 2019;74:276-289.
65. Bose R, Dey RK, Roy S, Sarddar D. Analyzing political sentiment using twitter data. In: Satapathy S, Joshi A, eds. Information and Com-
munication Technology for Intelligent Systems. Springer; 2019:427-436.
66. Kastrati Z, Arifaj B, Lubishtani A, Gashi F, Nishliu E. Aspect-based opinion Mining of Students' reviews on online courses. Paper pres-
ented at: Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence. 2020.
67. Azizan SA, Aziz IA. Terrorism detection based on sentiment analysis using machine learning. J Eng Appl Sci. 2017;12(3):691-698.
68. Onan A. Topic-enriched word embeddings for sarcasm identification. Paper presented at: Proceedings of 8th Computer Science on-Line
Conference on Software Engineering Methods in Intelligent Algorithms, Vol. 18, Springer. 2019.
69. Onan A, Toçoglu MA. A term weighted neural language model and stacked bidirectional LSTM based framework for sarcasm identifica-
tion. IEEE Access. 2021;9:7701-7722.
70. Xing FZ, Cambria E, Welsch RE. Intelligent asset allocation via market sentiment views. IEEE Comput Intell Mag. 2018;13(4):25-34.
71. Balahur A, Steinberger R, Kabadjov M, et al. Sentiment analysis in the news. Paper presented at: Proceedings of the 7th International
Conference on Language Resources and Evaluation (LREC'2010), pp. 2216-2220. Valletta, Malta, 19-21 May 2010. 2013.
72. Mikolov T, Sutskever I, Chen K, Corrado G, Dean J. Distributed representations of words and phrases and their compositionality. Part of
Advances in Neural Information Processing Systems (NIPS 2013). 2013.
73. Ajit A, Acharya K, Samanta A. A review of convolutional neural networks. Paper presented at: 2020 International Conference on Emerg-
ing Trends in Information Technology and Engineering (ic-ETITE), IEEE. 2020.
74. She X, Zhang D. Text classification based on hybrid CNN-LSTM hybrid model. Paper presented at: 2018 11th International Symposium
on Computational Intelligence and Design (ISCID), IEEE. 2018.
75. Du J, Gui J, Xu R, He Y. A convolutional attention model for text classification. Paper presented at: National CCF Conference on Natu-
ral Language Processing and Chinese Computing, Springer. 2017.
76. Zhou C, Sun C, Liu Z, Lau FCM. A C-LSTM Neural Network for Text Classification. Cornell University; 2015.
77. Bai X. Text classification based on LSTM and attention. Paper presented at: 2018 Thirteenth International Conference on Digital Infor-
mation Management (ICDIM), IEEE. 2018.
78. Greff K, Srivastava RK, Koutnik J, Steunebrink BR, Schmidhuber J. LSTM: A search space odyssey. IEEE Trans Neural Netw Learn Syst.
2016;28(10):2222-2232.

How to cite this article: Sharbatian K, Moattar MH. Deep aspect extraction and classification for opinion
mining in e-commerce applications using convolutional neural network feature extraction followed by long short
term memory attention model. Applied AI Letters. 2023;4(3):e86. doi:10.1002/ail2.86
