
Proceedings of the Sixth International Conference on Intelligent Computing and Control Systems (ICICCS 2022)

IEEE Xplore Part Number: CFP22K74-ART; ISBN: 978-1-6654-1035-9

Comparative Analysis of Rotten Tomatoes Movie Reviews using Sentiment Analysis
2022 6th International Conference on Intelligent Computing and Control Systems (ICICCS) | 978-1-6654-1035-9/22/$31.00 ©2022 IEEE | DOI: 10.1109/ICICCS53718.2022.9788287

Kanishk Soni (2K18/SE/071), Dept. of Software Engineering, Delhi Technological University, New Delhi, India, kanishksoni_2k18se071@dtu.ac.in
Palak Yadav (2K18/SE/092), Dept. of Software Engineering, Delhi Technological University, New Delhi, India, palakyadav_2k18se092@dtu.ac.in
Rahul, Dept. of Software Engineering, Delhi Technological University, New Delhi, India, rahul@dtu.ac.in

Abstract— For many people, deciding on a movie night begins with a search of online review sites. Consumer information has been redefined by social media platforms. Regarding movies, one such platform is Rotten Tomatoes, which this project focuses on. The goal is to test various machine learning methods for predicting the sentiment of unseen reviews using an upgraded corpus with more information about the opinions for various sub-categories. The Amazon Mechanical Turk-annotated corpus of reviews of movies listed on Rotten Tomatoes is used, considerably optimised and annotated with a fine-grained emotion attribute. This work tests whether tighter sentiment annotations for each root and span of the parsed training set can aid in determining the general sentiment or opinion of unseen phrases.

Keywords- Naive Bayes, Sentiment Analysis, Support Vector Machine, Rotten Tomatoes, Deep Learning, Movie Reviews, Neural Network, Natural Language Processing, Machine Learning

I. INTRODUCTION

Sometimes it is necessary to comprehend the point of view of a creator or author on a subject instead of considering the subject itself. Sentiment analysis, a part of data mining technology, does exactly this and is becoming a highly researched topic. It is used to collect users' feedback on a particular subject, and it is equally necessary to understand that feedback by extracting insights from it. This can be achieved using NLP.

Today, in an ever-growing competitive world where it is essential to keep pace with competitors, businesses and organizations are diving deep into applications of NLP so that they can use its various features to benefit themselves and their business. Their main motto is to understand customer feedback, views and opinions on services and products and use them effectively to achieve their objectives. Users post comments and reviews on social media for any type of service or product they use, and those views can be classified as positive, negative or neutral.

Sometimes it is necessary to examine the traits of movies. In that case, features are extracted and the meaning or sentiment hidden inside them is investigated.

We can see various areas where opinion mining is employed, such as analysis of movies, commercially used products, various kinds of services, news and so on.

Subtasks include:
a. Subjectivity analysis: This is one of the subtasks of sentiment analysis. Here, after examining the text, we classify it as objective or subjective. Objective means that the meaning hidden inside the text is neutral, whereas subjective text is inclined towards the positive or negative side to a certain degree.
b. Polarity analysis: This method is applied to text classified as subjective. Here we have two sub-categories: the text can be positive or negative, which should be decided after considering the polarity of the meaning concealed in the text.
c. Degree of polarity: Sentiment analysis, as the name suggests, is the investigation and tracking of the mood of customers towards a product, service or topic by collecting their feedback. The polarity degree expresses the text's degree of inclination towards the negative or positive side. Sentiment analysis is used here to identify the attention a movie gained from customers and the impact it had on the audience.

With the growing prevalence of social media platforms such as Facebook and review sites such as Zomato or Rotten Tomatoes, it is essential to be able to automatically interpret huge volumes of emotionally biased information. Sentiment analysis, which classifies subjective human assessments or opinions using natural language processing and machine learning techniques, is rapidly gaining popularity as an approach to analysing large corpora for a variety of applications.

In this work, we gather subjective human opinions on subsets of excerpts, down to the sentiments of individual words. Incorporating the sentiment of individual words into learning algorithms would give us a better understanding of how the opinion of a whole example is built from its constituents. We expect that finer-grained sentiment analysis will further improve accuracy on novel models.
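The subjectivity and polarity subtasks can be sketched with a toy lexicon-based scorer. The word lists, the threshold, and the degree formula below are hypothetical, for illustration only; real systems learn these from data:

```python
# Toy lexicon-based subjectivity/polarity scorer (illustrative only).
POSITIVE = {"great", "brilliant", "enjoyable", "best"}
NEGATIVE = {"boring", "awful", "worst", "dull"}

def analyse(text):
    words = text.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos == neg == 0:
        return ("objective", None, 0.0)                    # subtask (a)
    polarity = "positive" if pos >= neg else "negative"    # subtask (b)
    degree = abs(pos - neg) / len(words)                   # subtask (c)
    return ("subjective", polarity, degree)

print(analyse("The plot was dull and the acting awful"))
# ('subjective', 'negative', 0.25)
```

A learned classifier would replace the hand-written lexicon, but the three-step structure (subjectivity, polarity, degree) stays the same.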

978-1-6654-1035-9/22/$31.00 ©2022 IEEE 1494


II. RELATED WORK

TABLE I. SUMMARY OF PREVIOUS WORK

Research on sentiment analysis of film reviews from IMDB has been conducted with Information Gain (IG) feature screening, NB, K-Nearest Neighbor (KNN), SVM and Random Forest. That research stated that using IG with a larger threshold value can help deliver better performance. However, as the threshold value is increased, the number of items left to process in SVM decreases, making it more difficult for SVM to construct a classification model.

Using a feature-extraction strategy, several features are extracted with Bag of Words, which includes measures like TF-IDF and Term Frequency. In the mentioned work, lexicon-based feature extraction is used to extract the following features from the user reviews: Positive Word Count, Positive Connotation Count, Negative Word Count, and Negative Connotation Count. Supervised learning algorithms such as Naïve Bayes, Maximum Entropy, SVM and KNN are utilized.

Recognizing the semantics, or even the meaning, of the available content using ML algorithms is quite a difficult undertaking. Lexicon features have been used to extract the evaluations conveyed in the text; sarcasm detection is one of the significant benefits of choosing lexicon features. Another research work [13] focused on the emoticons/emojis as well as the slang present in the text to recognize the emotion. Grasping the opinion of the available text, or its polarity, helps in recognizing the overall polarity of the text.

In another work, both ML and lexicon approaches were utilized to perform opinion analysis on Twitter data. Special lexicon features, for example emoticons, n-grams, punctuation, the number of elongated words, and so on, were used. Using the above-mentioned features increased the overall accuracy. The primary benefit of using lexicon features is that they capture the meaning or semantics expressed in reviews, thus contributing to effective classification.
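As an illustration of such lexicon features, the sketch below counts emoticons and elongated words and extracts n-grams from a tweet. The emoticon list and the elongation rule (any letter repeated three or more times) are simplified assumptions, not the cited works' exact definitions:

```python
import re

EMOTICONS = {":)", ":(", ":D", ";)"}  # simplified, assumed list

def lexicon_features(text, n=2):
    tokens = text.split()
    emoticon_count = sum(t in EMOTICONS for t in tokens)
    # "Elongated" here means any character repeated 3+ times, e.g. "sooo".
    elongated_count = sum(bool(re.search(r"(\w)\1{2,}", t)) for t in tokens)
    # Word n-grams over the token sequence (bigrams by default).
    ngrams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return {"emoticons": emoticon_count,
            "elongated": elongated_count,
            "ngrams": ngrams}

print(lexicon_features("sooo good :)"))
# {'emoticons': 1, 'elongated': 1, 'ngrams': ['sooo good', 'good :)']}
```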

Minhoe Hur et al. [15] established the basis for predicting box-office revenue based on the review sentiments of a film. They used viewer perceptions as input variables in addition to other indicators, along with three AI-based methods, to obtain a quasi-relationship between the revenue indicators and the films.


III. CORPUS OF REVIEWS OF A MOVIE

We complete our tasks using an upgraded version of a previously published movie-critic database and compare the results to those of the original data.

A. Defining the Dataset
Audience reviews of movies are often written and posted on online platforms like IMDb and Rotten Tomatoes. Since these are very popular and impactful across social media platforms, we have chosen the dataset from Rotten Tomatoes.

B. Polarity of the Sentence
The database consists of samples of reviews given by users on the Rotten Tomatoes website. To generate a better-quality version of the dataset, we started by removing any remaining HTML elements and restoring the authentic capitalization of the words. The Stanford Parser was used to parse the cleansed sentences into binarized trees, and sentiment ratings were then collected for each span of the binarized trees.

C. Collection of Sentiments
We utilized Amazon Mechanical Turk (AMT) to gather subjective human opinion ratings. The phrase associated with each span of the parsed trees was presented to at least three human judges, who were asked to assign a sentiment score on a scale of 1 (most negative) to 25 (most positive). The ratings for each phrase were averaged and normalized to the range 0 to 1 to replace Rotten Tomatoes' binary sentence polarity as the ground truth.

Fig.1 Steps to building a classifier

The steps to build a classifier, as shown in figure 1, are described below.

Preprocessing - Data preprocessing is the method of decluttering the data, removing unnecessary and insignificant elements. During this process, the portion of the data that is actually useful and meaningful is separated out; the actual processing, that is, training and testing, takes place on this portion. The output is a more reasonable form of the original dataset, usable and structured instead of unstructured. The nature, type and noise present in the data must be checked before the actual mining techniques are applied.

1) Stopwords are words or character sequences which are easy and quick to dispose of from a text. Stop words such as "an" and "the" contribute nothing to the comprehension and meaning of the text and are insignificant for the data mining and analysis process. We then convert uppercase letters to lowercase to further ease our efforts and bring the whole dataset into consistent form. We likewise dispose of characters that aren't helpful, aren't in the public space, or aren't applicable.
2) Applying a scheme to text is a process known as annotation. There are mainly two parts to annotation: part-of-speech tagging and structure markup.
3) Schemes consist of terms, and these terms can be mapped or translated. Lemmatization and stemming are types of standardization mainly used for linguistic reduction.
4) Feature analysis on a dataset can be done through manipulation, generalization and statistical probing.

The three main components of pre-processing are:
 Tokenization
 Normalization
 Noise removal

These are shown in figure 2 below.

Fig.2 Main components of Preprocessing of Text
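A minimal sketch of the preprocessing steps (noise removal, case folding, stop-word removal) could look like this; the tiny stop-word list is an assumption for illustration:

```python
import re

STOPWORDS = {"a", "an", "the", "is", "and", "of"}  # tiny assumed list

def preprocess(review):
    # Noise removal: strip leftover HTML tags and non-alphanumeric characters.
    review = re.sub(r"<[^>]+>", " ", review)
    review = re.sub(r"[^a-zA-Z0-9\s]", " ", review)
    # Case folding: bring the whole text to lowercase.
    tokens = review.lower().split()
    # Stop-word removal: drop words that carry no sentiment.
    return [t for t in tokens if t not in STOPWORDS]

print(preprocess("<b>The</b> movie is a MASTERPIECE!"))
# ['movie', 'masterpiece']
```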


D. Tokenization

Trimming large character sequences or strings into small pieces, called tokens, is known as tokenization, and it applies at all levels of splitting: a large volume of text can be tokenized into sentences, and long sentences can be tokenized into words. Whatever preprocessing methods are to be applied to a portion of text are applied only after it has been tokenized. Tokenization also goes by two other names, text segmentation and lexical analysis. Nowadays, transforming a large volume of text into larger units such as paragraphs or sentences is also referred to as segmentation, which leaves tokenization to mean breaking text down into words, the smallest unit of text.

E. Normalization

Normalization of text has to be performed before any further processing. Tasks that are part of normalization include: transforming the text into one case throughout, either uppercase or lowercase; removing punctuation; converting all numbers into their English word form, so that 1 becomes one; and many more. After normalization, all the words in our data are on a uniform level and we can proceed with further work on all the data in a uniform way. Three distinct steps make up normalization:

1) Stemming: Removal of affixes from a word is called stemming. After stemming we get the word stem, rid of all suffixes, prefixes, infixes, circumfixes, etc. For example, "running" is converted to "run" after stemming is performed on it.

2) Lemmatization: A word's lemma forms the canonical form of that word. Lemmatization captures that form and reduces the word to its simplest form. It works like stemming but is used when the word is not canonically in its simplest form. For example, stemming won't work on "worst", but lemmatization handles it easily and converts it into "bad".

3) Miscellaneous: So far we have seen that lemmatization and stemming are very important practices for text preprocessing. They are based on intricate rules of grammar and aren't just simple transformations without set norms; the grammatical norms serve as the rules for stemming and lemmatization. Beyond these, there are other steps which can be followed to bring all the words of a dataset to the same level before proceeding with other text-mining processes. These consist of basic techniques of removal, substitution, changing the case of letters, and trimming extra white space, and they also become an important part of the process.

F. Noise Removal

In our framework, noise removal is followed by a substitution task. The two preprocessing tasks discussed above, tokenization and normalization, can be applied to any kind of text data; noise removal, by contrast, does not apply to all types of text but is task-specific.

Dataset Statistics - We divided our dataset into training, cross-validation, and test sets in a 7:1:2 ratio, which we used for the majority of this paper. Later on, we ran our jobs through 10-fold cross-validation to see how they compared to other published results.

An examination of the typical split sets is shown here to verify that the current test findings are not distorted by an unequal split of positive and negative terms in each subset of data. The fraction of data in the dataset scored as perfectly neutral, neither positive nor negative (a mean of 0.5 on the 0-to-1 scale), which is rounded up to a positive rating for the Boolean prediction tasks, is shown in figure 3 and figure 4.

Figure 3: Distribution of the normalised AMT sentiment values of the sentence-level nodes. Note the distribution's bimodal shape, which suggests that most sentences are either positive or negative.
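The stemming and lemmatization steps described above can be sketched as follows; the suffix rules and the tiny lemma dictionary are simplified assumptions, not a full Porter stemmer or WordNet lemmatizer:

```python
# Simplified stemmer: strip a common suffix, then undo a doubled consonant.
def stem(word):
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    if len(word) >= 2 and word[-1] == word[-2]:  # "runn" -> "run"
        word = word[:-1]
    return word

# Tiny lemma dictionary for irregular forms that stemming cannot handle.
LEMMAS = {"worst": "bad", "worse": "bad", "better": "good"}

def lemmatize(word):
    return LEMMAS.get(word, stem(word))

print(stem("running"))     # run
print(lemmatize("worst"))  # bad
```

The dictionary lookup mirrors why lemmatization handles "worst" while plain suffix stripping cannot.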


Figure 4: Distribution of the normalised AMT sentiment ratings over all phrases in the sample. The neutral score has a high peak, indicating a large volume of phrases with no significant directional sentiment.

The dataset taken for training, validation and testing consists of 10215 sentences, containing 211245 unique phrases, of which 21437 are unigrams.

The training set has been assigned 7215 sentences, composed of 153969 unique phrases.

The cross-validation set consists of 1000 sentences with an almost balanced split of negative and positive; it contains 25284 unique phrases.

The test set contains around 2000 sentences, also with a balanced positive and negative split, and 47483 unique phrases in all.

The distribution of evaluations in the dataset, categorised as positive, negative and neutral for all groups, is presented in figure 5 below.

Figure 5: The proportion of positive/(neutral)/negative evaluations in the dataset is represented by these bars. The proportion of sentence-level pos/neg ratings is shown in bars 1-3 of each set, while the same ratings determined for all-spans are presented in bars 4-5 of each group. The initial Rotten Tomatoes labelling is used for the first bar in each group, while the rounded AMT labelling is used for the rest. The 2nd and 4th bars in the respective sets also show the percentage of data scored as perfectly neutral (mean = 0.5) that has been rounded off to a positive opinion in our studies. The 3rd and 5th bars of every set reflect the negative and positive splits recomputed after all neutral data was deleted from each set.

IV. METHODOLOGY

Sentiment Analysis - Sentiment analysis is a methodology used to decide whether a chunk of text is positive, negative or neutral. In text analysis, natural language processing (NLP) and machine learning (ML) techniques are combined to assign sentiment scores to the topics, categories or entities within a phrase. Sentiment analysis focuses on the polarity of a text (positive, negative, neutral), but it also goes beyond polarity to recognize specific sentiments and feelings (angry, happy, sad, and so on), urgency (urgent, not urgent) and even intentions (interested or not interested).

The sentiment at the root of an entire excerpt will be referred to as the root sentiment in the rest of this work, and the sentiments of the subspans will be referred to as all-spans.
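The root/all-spans distinction can be pictured with a small binarized tree in which every span carries a normalized sentiment score; the tree and the scores below are invented for illustration:

```python
# Each node: (score in [0, 1], left_child, right_child); leaves have no children.
# Hypothetical binarized tree for the phrase "not very good".
tree = (0.3,                       # root sentiment of the whole excerpt
        (0.5, None, None),         # "not"
        (0.8,                      # "very good"
         (0.5, None, None),        # "very"
         (0.9, None, None)))       # "good"

def all_span_scores(node):
    """Collect the sentiment score of every span in the tree (the 'all-spans')."""
    score, left, right = node
    scores = [score]
    for child in (left, right):
        if child is not None:
            scores.extend(all_span_scores(child))
    return scores

print(tree[0])                # root sentiment: 0.3
print(all_span_scores(tree))  # [0.3, 0.5, 0.8, 0.5, 0.9]
```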
The methods that we have used for training and
evaluation are described below:

A. Multinomial Naive Bayes (MNB)

Nowadays the amount of data present on the internet has increased exponentially, and it has become necessary to classify it in order to generate the desired results. Classification of documents is a necessity; some of the categories into which we can categorize them are politics, technology, genres, sports, etc. Here Multinomial Naive Bayes comes into the picture: it is widely used in such applications. The classifier uses the occurrence or frequency of the words present in the document as its predictors or features.
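A minimal multinomial Naive Bayes classifier over word frequencies, written from scratch with Laplace smoothing, is sketched below; the four training sentences are invented stand-ins, where a real run would use the review dataset:

```python
import math
from collections import Counter

class MultinomialNB:
    def fit(self, docs, labels):
        self.classes = set(labels)
        # Log prior: fraction of training documents in each class.
        self.priors = {c: math.log(labels.count(c) / len(labels)) for c in self.classes}
        # Per-class word-frequency counts (the multinomial features).
        self.counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.counts[label].update(doc.lower().split())
        self.vocab = {w for c in self.counts.values() for w in c}
        return self

    def predict(self, doc):
        def log_prob(c):
            total = sum(self.counts[c].values()) + len(self.vocab)  # Laplace smoothing
            return self.priors[c] + sum(
                math.log((self.counts[c][w] + 1) / total)
                for w in doc.lower().split())
        return max(self.classes, key=log_prob)

model = MultinomialNB().fit(
    ["a wonderful heartfelt film", "great acting great story",
     "a boring dull mess", "terrible plot and dull acting"],
    ["pos", "pos", "neg", "neg"])
print(model.predict("wonderful story"))  # pos
print(model.predict("dull plot"))        # neg
```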


B. Support Vector Machine (SVM)

"Support Vector Machine" is a machine learning model that classifies given data into pre-defined categories, and thus it is a supervised learning method. The classification predictions made via SVM can be used for various purposes and artificial-intelligence products, though most of the time its usage is in categorisation problems. While implementing SVM and performing its calculations, every data item provided is treated as a point in an n-dimensional space, n being the number of attributes or elements within the given dataset. The points representing the data are plotted there, each with a specific direction. As it is a classification algorithm, we look for the hyperplanes that separate the whole dataset into two or more classes in the overall picture.

C. Deep Learning

Machine learning has a branch called deep learning. Unlike typical machine learning algorithms, which have a fixed ability to learn regardless of the amount of data they collect, deep learning systems may increase their performance with additional data: the computer equivalent of more experience. Machines can be put to work on specialized tasks such as text categorization once they have accumulated enough knowledge through deep learning.

Deep learning is a method that makes decisions and implements them for multiple applications and kinds of reasoning, and it can even produce and develop numerous computerized services, products and applications. Most importantly, it does not require any human interference or command to decide or perform a service that requires insight and thinking.

The ratings determined during the training process of this dataset impact the whole scale, and the model is computed around approximately the same. Therefore, another technique is to scale using computed probabilities and use these intervals for each rating from the training set.

V. RESULT & ANALYSIS

After performing sentiment analysis using the mentioned algorithms for unigram and bigram models, the accuracy obtained is presented in the table below.

TABLE II. ACCURACY RESULTS OF MNB AND SVM MODELS

Algorithm   Accuracy (%)
MNB - uni   74.9
MNB - bi    76.1
SVM - uni   73.0
SVM - bi    74.7

After performing the 10-fold cross-validation, the results obtained are as follows:

TABLE III. RESULT OF 10-FOLD CROSS VALIDATION

#        SVM - uni   MNB - uni
1        73.1        75.9
2        72.6        76.1
3        73.2        75.4
4        73.8        75.2
5        74.1        76.3
6        73.6        75.9
7        73.2        74.5
8        72.9        73.8
9        73.8        74.2
10       74.3        74.9
Average  73.4        75.2

In the above results, we observed that this model was slightly biased towards predicting more positives than negatives. This was caused by the rounding off of the ratings in the initial arrangement from 0 to 1, producing an unequal balance in the polarity of the dataset by effectively converting the neutral data into positive as well.

In the next step, we made further improvements and optimized the Support Vector Machine model for the sentiment analysis of Rotten Tomatoes reviews. This was done by varying the regularization parameter of the model. For the best model, giving better accuracy and correct results, the regularization parameter was found to be C = 0.08 using the L2-loss and L2-regularization. The percentages found for TP, TN, FP and FN were as follows:

TABLE IV. RESULT SHOWING TP, TN, FP AND FN

           True    False
Positive   39.6%   13.7%
Negative   32.4%   11.3%
Accuracy   77.5%
Precision  74.7%
Recall     79.2%
F-Ratio    76.8%
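Accuracy, precision, recall and the F-ratio are derived from TP/TN/FP/FN confusion-matrix counts in the standard way; the sketch below demonstrates the formulas on invented counts, not the paper's exact percentages:

```python
def metrics(tp, tn, fp, fn):
    # Standard definitions over confusion-matrix counts.
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical confusion-matrix counts for a two-class sentiment model.
acc, prec, rec, f1 = metrics(tp=40, tn=30, fp=20, fn=10)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
# accuracy=0.70 precision=0.67 recall=0.80 f1=0.73
```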


Figure 6: Plot of TP, TN, FP & FN

As per the results obtained after implementing the Deep Learning technique on this dataset, the accuracy turned out to be lower than that of the two other models tested before, at 58.6%. We also ran the model with various different values of its parameters and other factors, but it did not perform well in comparison to the Multinomial Naïve Bayes and Support Vector Machine models.

REFERENCES
[1] Cambria, E., Schuller, B., Xia, Y. and Havasi, C., 2013. New Avenues in Opinion Mining and Sentiment Analysis. IEEE Intelligent Systems, 28(2).
[2] Cvijickj, I.P. and Michahelles, F., 2011. Understanding Social Media Marketing: A Case Study on Topics, Categories and Sentiment on a Facebook Brand Page. MindTrek '11: Proceedings of the 15th International Academic MindTrek Conference: Envisioning Future Media Environments, Finland, pp. 175-182.
[3] Younis, E.M.G., 2015. Sentiment Analysis and Text Mining for Social Media Microblogs Using Open Source Tools: An Empirical Study. International Journal of Computer Applications, 112(5).
[4] He, W., Zha, S. and Li, L., 2013. Social Media Competitive Analysis and Text Mining: A Case Study in The Pizza Industry. International Journal of Information Management, 33(3), pp. 464-472.
[5] Suppala, K. and Narasinga Rao, 2019. Sentiment Analysis Using Naïve Bayes Classifier. 8 June 2019.
[6] Gautam, G. and Yadav, D., 2014. Sentiment analysis of Twitter data using machine learning approaches and semantic analysis.
[7] Kharde, V.A. and Sonawane, S., 2016. Sentiment Analysis of Twitter Data: A Survey of Techniques.
[8] Pang, B. and Lee, L., 2005. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Annual Meeting of the Association for Computational Linguistics (Vol. 43, p. 115).
[9] Kaur, H., Mangat, V. and Nidhi, 2017. A survey of sentiment analysis techniques. 2017 International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), pp. 921-925, doi: 10.1109/I-SMAC.2017.8058315.
[10] Rahul, Raj, V. and Monika, 2019. Sentiment Analysis on Product Reviews. 2019 International Conference on Computing, Communication, and Intelligent Systems (ICCCIS), pp. 5-9, doi: 10.1109/ICCCIS48478.2019.8974527.
[11] Daeli, N.O.F. and Adiwijaya, A., 2020. Sentiment Analysis on Movie Reviews using Information Gain and K-Nearest Neighbor. Journal of Data Science and Its Applications, 3(1), pp. 1-7.
[12] Rahul, Kajla, H., Hooda, J. and Saini, G., 2020. Classification of Online Toxic Comments Using Machine Learning Algorithms. 2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS), pp. 1119-1123, doi: 10.1109/ICICCS48265.2020.9120939.
[13] Prasad, A.G., Sanjana, S., Bhat, S.M. and Harish, B.S., 2017. Sentiment analysis for sarcasm detection on streaming short text data. In 2nd International Conference on Knowledge Engineering and Applications (ICKEA), pp. 1-5. IEEE.
[14] Kolchyna, O., Souza, T.T.P., Treleaven, P. and Aste, T., 2015. Twitter sentiment analysis: Lexicon method, machine learning method and their combination. arXiv preprint arXiv:1507.00955.
[15] Hur, M., Seoul National University, 2016. Box-office forecasting based on sentiments of movie reviews and Independent subspace method. Information Sciences.
[16] Naseem, S., Mahmood, T., Asif, M., Rashid, J., Umair, M. and Shah, M., 2021. Survey on Sentiment Analysis of User Reviews. 2021 International Conference on Innovative Computing (ICIC), pp. 1-6, doi: 10.1109/ICIC53490.2021.9693029.
[17] Sudhir, P. and Suresh, V.D., 2021. Comparative study of various approaches, applications and classifiers for sentiment analysis. Global Transitions Proceedings, 2(2), pp. 205-2.
