NAME: OLATUNJI FATAI ABIODUN

MATRIC NUMBER: 20001571


PROJECT TOPIC: APPLICATION OF LONG SHORT-TERM MEMORY
FOR SENTIMENT ANALYSIS OF COVID-19 TWEETS

CHAPTER TWO
2.1 Introduction

Sentiment analysis is the automated process of identifying and classifying subjective information in text data. This might be an opinion, a judgment, or a feeling about a particular topic or product feature.

The most common type of sentiment analysis is 'polarity detection', which involves classifying statements as positive, negative or neutral.

Sentiment analysis uses Natural Language Processing (NLP) to make sense of human language, and machine learning to automatically deliver accurate results.
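
As a minimal illustration of polarity detection, the sketch below uses the open-source TextBlob library (the choice of library is an assumption of this sketch; any polarity scorer could be substituted) to score short statements and map each score to a polarity label:

from textblob import TextBlob  # assumed installed: pip install textblob

statements = [
    "I love how quickly the vaccines were developed!",
    "The lockdown rules are confusing and frustrating.",
    "New cases were reported in three cities today.",
]

for text in statements:
    # TextBlob returns a polarity score in [-1.0, 1.0]
    polarity = TextBlob(text).sentiment.polarity
    if polarity > 0:
        label = "positive"
    elif polarity < 0:
        label = "negative"
    else:
        label = "neutral"
    print(f"{label:>8}: {text}")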

Natural Language Processing (NLP) is a field of Artificial Intelligence (AI) that makes human language intelligible to machines. NLP combines the power of linguistics and computer science to study the rules and structure of language, and create intelligent systems (run on machine learning and NLP algorithms) capable of understanding, analyzing, and extracting meaning from text and speech.

Natural Language Processing (NLP) allows machines to break down and interpret human language. It's at the core of tools we use every day – from translation software, chatbots, spam filters, and search engines, to grammar correction software, voice assistants, and social media monitoring tools.

NLP is used to understand the structure and meaning of human language by analyzing different aspects like syntax, semantics, pragmatics, and morphology. Then, computer science transforms this linguistic knowledge into rule-based, machine learning algorithms that can solve specific problems and perform desired tasks.
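
As a small, hedged sketch of this linguistic analysis, the snippet below uses the NLTK library (assumed installed, with its tokenizer and tagger models downloaded) to split a sentence into tokens and label each token's part of speech, one basic form of syntactic analysis:

import nltk

# One-time downloads of the tokenizer and part-of-speech tagger models.
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

sentence = "NLP allows machines to break down and interpret human language."
tokens = nltk.word_tokenize(sentence)  # split the text into word tokens
tagged = nltk.pos_tag(tokens)          # attach a part-of-speech tag to each token
print(tagged)  # e.g. [('NLP', 'NNP'), ('allows', 'VBZ'), ...]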

Machine learning (ML) is a branch of artificial intelligence (AI) that enables computers to self-learn and improve over time without being explicitly programmed. In short, machine learning algorithms are able to detect and learn from patterns in data and make their own predictions.

In traditional programming, someone writes a series of instructions so that a computer can transform input data into a desired output. Instructions are mostly based on an IF-THEN structure: when certain conditions are met, the program executes a specific action.

Machine learning, on the other hand, is an automated process that enables machines to solve problems and take actions based on past observations.
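
The contrast can be made concrete in a few lines of Python. The first function below encodes hand-written IF-THEN rules; the second approach lets scikit-learn's logistic regression learn a decision rule from a tiny, purely hypothetical set of past observations:

# Traditional programming: explicit IF-THEN rules written by a person.
def rule_based_label(text: str) -> str:
    if "good" in text or "love" in text:
        return "positive"
    return "negative"

# Machine learning: the model infers its own rules from labeled examples.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["good service", "love this product", "awful delay", "terrible support"]
labels = ["positive", "positive", "negative", "negative"]

model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(texts, labels)  # learn patterns from past observations
print(rule_based_label("good phone"))             # -> positive
print(model.predict(["what a good experience"]))  # -> ['positive']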

Long Short-Term Memory (LSTM) networks are a specific type of Recurrent Neural Network (RNN) that is very effective in dealing with long sequence data and learning long-term dependencies. Long short-term memory (LSTM) units or blocks are part of a recurrent neural network structure. Recurrent neural networks are made to utilize certain types of artificial memory processes that can help these artificial intelligence programs to more effectively imitate human thought.

The recurrent neural network uses long short-term memory blocks to provide context for the way the program receives inputs and creates outputs. The long short-term memory block is a complex unit with various components such as weighted inputs, activation functions, inputs from previous blocks and eventual outputs.

The unit is called a long short-term memory block because the program is using a structure founded on short-term memory processes to create longer-term memory. These systems are often used, for example, in natural language processing. The recurrent neural network uses the long short-term memory blocks to take a particular word or phoneme, and evaluate it in the context of others in a string, where memory can be useful in sorting and categorizing these types of inputs.

In general, LSTM is a well-established and widely used building block in modern recurrent neural networks.
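
As a hedged sketch of how LSTM blocks are used for tweet polarity in practice, the Keras model below embeds token ids, passes the sequence through an LSTM layer, and ends with a sigmoid output; the vocabulary size, sequence length and unit counts are illustrative assumptions, not values fixed by this project:

import tensorflow as tf
from tensorflow.keras import layers

VOCAB_SIZE = 10_000  # distinct tokens kept after tokenization (assumed)
MAX_LEN = 50         # tokens per padded tweet (assumed)

model = tf.keras.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    layers.Embedding(VOCAB_SIZE, 64),       # token id -> 64-dimensional vector
    layers.LSTM(64),                        # memory blocks read the sequence in order
    layers.Dense(1, activation="sigmoid"),  # probability the tweet is positive
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()

Training such a model on labeled tweets is then a single model.fit(...) call over padded token sequences.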

Twitter boasts 330 million monthly active users (Ying Lin, 2020), which allows businesses to reach a broad audience and connect with customers without intermediaries. On the downside, there's so much information that it's hard for brands to quickly detect negative social mentions that could harm their business.

2.2 Review of Related Work

2.2.1 Sentiment Analysis and Opinion Mining from Social Media

Due to the huge growth of social media on the web, opinions extracted from these media are used by individuals and organizations for decision making. Each site contains a large amount of opinionated text, which makes it challenging for the user to read and extract information (G. U. Vasanthakumar, et al., 2016). This problem can be overcome by using sentiment analysis techniques. The main objective of sentiment analysis is to mine the sentiments and opinions expressed in user-generated reviews and classify them into different polarities. The output is the data annotated with sentiment labels. Machine learning techniques are widely used for sentiment classification (N. Godbole, M. Srinivasaiah, and S. Skiena, 2016). For a specific domain D, sentiment data consist of pairs Xi and Yi, denoting that data Xi has polarity Yi. If the overall sentiment expressed in Xi is positive, then Yi is +1, else -1. Labelled sentiment data is thus a pair {Xi, Yi} of sentiment text and its corresponding sentiment polarity. If Xi is not assigned any polarity Yi, then it is unlabelled sentiment data. In supervised sentiment classification methods, classifiers are trained using labeled data from a particular domain. Semi-supervised classification methods combine unlabeled data with a few labeled training examples to construct the classifier (S. Li, C.-R. Huang, G. Zhou, and S. Y. M. Lee, 2010).
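
In code, this notation simply pairs each text Xi with a polarity Yi; the toy lists below (the example texts are hypothetical) show labeled and unlabeled sentiment data side by side:

# Labeled sentiment data: pairs {Xi, Yi} with Yi = +1 (positive) or -1 (negative).
labeled_data = [
    ("The staff were helpful and kind", +1),
    ("Worst experience I have ever had", -1),
]

# Unlabeled sentiment data: texts Xi with no polarity Yi assigned yet.
unlabeled_data = [
    "The parcel arrived on Tuesday",
    "They changed the return policy again",
]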

Applications: A variety of information, in the form of news blogs, tweets, etc., is available on social media about different products. Sentiment analysis can summarize that data and give a score that represents the opinion it expresses, which customers can use depending on their needs. There are a number of applications of sentiment analysis and opinion mining, in areas such as finance, politics, business and public actions. In the business domain, sentiment analysis is used to detect customers' interest in a product. In the political domain, sentiment analysis is used to get clarity on a politician's position, and opinion mining is also used to gauge public interest in rules newly introduced by the government.

Motivation: The current trend is to look for opinions and sentiments in the product reviews that are available at large scale on social media. Before making a decision, we tend to look at the sentiment analysis results of the opinions given by different users, which helps any customer form an opinion on that product. As the data are available at large scale, it is a laborious process to look into all the user opinions; hence, sentiment analysis is required. The main objective of sentiment analysis is to classify the sentiment into different categories. Fig 2.1 shows the overall architecture of sentiment analysis.

Document level, sentence level and aspect level are the different levels of sentiment classification. Classifying each document into positive or negative classes is called document-level sentiment classification. While expressing the sentiment of a document, this type of classifier assumes that the document contains the opinion of the user about a single object. Aspect-level sentiment analysis classifies the opinions in a document assuming that opinions are expressed about different aspects in the document.

Sentiment classifiers designed using data from one domain may not work with high accuracy if they are used to classify data from a different domain. One of the main reasons is that the sentiment words of one domain can differ from those of another. Thus, domain adaptation is required to bridge the gaps between domains. The domain used to train the classifier is called the source domain, and the domain to which the trained classifier is applied is called the target domain. The advantage of this method is that little or no labeled data of the target domain is needed, where labeled data is costly and it is infeasible to manually label the reviews for each domain type. This type of classification is called cross-domain sentiment classification. Heterogeneous domain adaptation is required when domains of different dimensions are input to the topic-adaptive sentiment classifier.


Fig 2.1: Architecture of Sentiment Analysis

Sentiment classifiers can be broadly classified into machine learning based and lexicon based. Machine learning approaches use machine learning algorithms, which can work in supervised, semi-supervised or unsupervised learning settings. Supervised learning methods give more accurate results compared to semi-supervised and unsupervised learning methods, but they require labeled data, which is expensive and time-consuming to obtain. A semi-supervised approach is EasyAdapt++ (EA++), which builds on EasyAdapt (EA); whereas EA requires labeled data from both the source and target domains, EA++ also uses unlabeled data from the target domain, which results in superior performance, theoretically and empirically, over EA, and hence it can be efficiently used for preprocessing (J. Jiang and C. Zhai, 2007). The lexicon-based approach utilizes a sentiment lexicon to analyze the sentiments in a review, and can use a dictionary or a corpus to classify the sentiment words. Due to the shortage of labeled data, a single classifier can be designed to classify reviews from different domains. But a classifier designed to classify data from one domain may not work efficiently on another domain. This is due to domain-specific words, which are different for every domain.

Support vector machines and Naive Bayes classifiers are the most important classifiers in the machine learning approach. Support vector machines classify data by finding hyperplanes that separate it into different classes. The Naive Bayes classifier is a probabilistic classifier based on Bayes' theorem and a strong independence assumption between the features.
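
The sketch below places both classifiers side by side using scikit-learn, with LinearSVC as the hyperplane-finding SVM and MultinomialNB as the probabilistic Naive Bayes model; the four training reviews are hypothetical stand-ins for a real corpus:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

reviews = ["excellent phone, superb battery", "camera is fantastic",
           "battery died in a day, awful", "screen cracked, poor build"]
polarity = [+1, +1, -1, -1]

svm = make_pipeline(TfidfVectorizer(), LinearSVC())    # separating hyperplane
nb = make_pipeline(TfidfVectorizer(), MultinomialNB()) # Bayes' theorem + independence

svm.fit(reviews, polarity)
nb.fit(reviews, polarity)
print(svm.predict(["superb camera"]), nb.predict(["awful battery"]))

Because both models here are trained on phone reviews only, their vocabulary is domain-specific, which is exactly why such a classifier may transfer poorly to, say, hotel reviews.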

2.2.2 Sentiment Analysis as a Service: A Social Media-Based Sentiment Analysis Framework

Social media platforms, i.e., social information services such as Facebook, Twitter, etc., have emerged as a source of free public data (A. Dingli, et al., 2015). During any emergency event, a large number of users rapidly generate and share data using social information services. Thus, these users can be interpreted as human sensors, i.e., social sensors (S. Takeshi, et al., 2010). The data generated by social sensors has two beneficial features:

a. It is composed of the subjective information (e.g., sentiments and opinions) of social sensors.

b. It contains the spatio-temporal information of social sensors.

Sentiment analysis facilitates extracting and understanding human dynamics such as behaviors, trends, attitudes and emotions from this subjective information (J. Guerrero, et al., 2015). In addition, the spatio-temporal information in the social sensors' data provides a promising opportunity to gain insights into human activities based on geographical locations (M. Hwang, et al., 2013). Thus, combining both features can help in understanding sentiments and emotions across various geo-locations.

Despite the several benefits of social information services, they come with some serious challenges. Social information services often contain a lot of noise, i.e., irrelevant and unnecessary data. Moreover, there are diverse types of social information services available online. These services provide various features and impose different limitations (e.g., text length) on data sharing. As a result, social information services have diverse data characteristics such as size, quality, etc. Thus, various types of social information services require different mechanisms (e.g., tools and algorithms) for extracting useful information. Although there are several online tools available for sentiment analysis, they focus only on general-purpose search and analysis. Moreover, many online tools are dedicated to a single information service. Thus, end users may need to use multiple tools in an ad-hoc manner. Using various tools is time consuming and provides inconsistent views of the social sensors' data (S. Wan and C. Paris, 2014).

In the paper, 'Sentiment Analysis as a Service' (SAaaS) is proposed as a framework that abstracts sentiments from multiple social information services, analyses them and transforms them into useful information, and delivers the result as a service. The authors classify social information services by using various properties of the social sensors' data. SAaaS uses this classification to dynamically compose services for noise removal, geo-tagging (e.g., location extraction) and sentiment extraction. Finally, the results are presented in various formats, i.e., maps and charts.

SAaaS uses a generic information composition approach to compose the social sensors' data as a service from multiple sources for sentiment analysis. Traditional approaches do not consider the different types and characteristics of social information services for sentiment analysis. On the contrary, SAaaS takes into account different properties such as data size, type, etc., and dynamically composes appropriate services for sentiment analysis. In particular, the authors focus on the domain of disease surveillance. However, the framework is not limited to disease surveillance and can be applied to other domains where sentiment analysis is applicable. The main contributions of this work are as follows:


● A service framework that exploits the spatio-temporal properties of social information services to monitor epidemic outbreaks via sentiment analysis. The framework includes a new service model for composite and component services for sentiment analysis.

● A classification of social information services based on the social sensors' data properties, and a classification-driven service composition mechanism to compose services for sentiment analysis.

● A new service quality model to evaluate the quality of social information services.

2.3 Social Media and Crisis Events

During a crisis, whether natural or man-made, people tend to spend relatively more time on social media than normal. As the crisis unfolds, social media platforms such as Facebook and Twitter become an active source of information (Imran M, et al., 2015), because these platforms break the news faster than official news channels and emergency response agencies (Imran M, et al., 2020). During such events, people usually carry on informal conversations by sharing their safety status, querying about their loved ones' safety status, and reporting ground-level scenarios of the event (Imran M, et al., 2015). This continuous creation of conversations on such public platforms leads to the accumulation of a large amount of socially generated data, ranging from hundreds of thousands to millions of items (Kalyanam J, et al., 2016). With proper planning and implementation, social media data can be analyzed and processed to extract situational information that can be further used to derive actionable intelligence for an effective response to the crisis. The situational information can be extremely beneficial for first responders and decision-makers in developing strategies that provide a more efficient response to the crisis.

In recent times, the most used social media platforms for informal communication have been Facebook, Twitter, Reddit, etc. Amongst these, Twitter, the microblogging platform, has a well-documented Application Programming Interface (API) for accessing the data (tweets) available on its platform. Therefore, it has become a primary source of information for researchers working in the Social Computing domain. Earlier works have shown that tweets related to a specific crisis can provide better insights about the event. In the past, millions of tweets specific to crisis events such as the Nepal Earthquake, India Floods, Pakistan Floods, Palestine Conflict, Flight MH370, etc., have been collected and made available (Imran M, et al., 2016). Such Twitter data have been used in designing machine learning models for classifying unseen tweets into various categories such as community needs, volunteering efforts, loss of lives, and infrastructure damage. The classified tweet corpora can be

a. trimmed or summarized and sent to the relevant department for further analysis,

b. used for sketching alert-level heat maps based on the location information contained within the tweet metadata or the tweet body.

Similarly, Twitter data can also be used for identifying the flow of fake news. If misinformation and unverified rumors are identified before they spread across everyone's news feed, they can be flagged as spam or taken down.

Further, in-depth textual analyses of Twitter data can help

a. discover how positively or negatively a geographical region is expressing itself about a crisis,

b. understand the dissemination processes of information throughout a crisis.


2.4 Novel Coronavirus (COVID-19)

As of July 17, 2020, the number of novel coronavirus (COVID-19) cases across the world had reached more than thirteen million, and the death toll had crossed half a million (Worldometer, 2020). States and countries worldwide are trying their best to contain the spread of the virus by initiating lockdowns, and even curfews in some regions. As people are bound to work from home, social distancing has become the new normal. With the increase in the number of cases, the pandemic's seriousness has made people more active in expressing themselves on social media. Multiple terms specific to the pandemic have been trending on social media for months now. Therefore, Twitter data can prove to be a valuable resource for researchers working in the thematic areas of Social Computing, including but not limited to sentiment analysis, topic modeling, behavioral analysis, fact-checking and analytical visualization.

Large-scale datasets are required to train machine learning models or perform any kind of analysis. The knowledge extracted from small datasets and region-specific datasets cannot be generalized because of limitations in the number of tweets and geographical coverage. Therefore, this paper introduces a large-scale COVID-19-specific English-language tweets dataset, hereinafter termed the COV19 Tweets Dataset. As of July 17, 2020, the dataset has more than 310 million tweets and is available at IEEE DataPort (Lamsal R, 2020). The dataset gets a new release every day. The dataset's geo version, the GeoCOV19Tweets Dataset, is also made available (Lamsal R, 2020). As per the stats reported by the IEEE platform, the datasets (Lamsal R, 2020) have been accessed over 74.5k times, collectively, worldwide.


2.5 Artificial Intelligence

Artificial intelligence enables computers and machines to mimic the perception, learning, problem-solving, and decision-making capabilities of the human mind.

In computer science, the term artificial intelligence (AI) refers to any human-like intelligence exhibited by a computer, robot, or other machine. In popular usage, artificial intelligence refers to the ability of a computer or machine to mimic the capabilities of the human mind—learning from examples and experience, recognizing objects, understanding and responding to language, making decisions, solving problems—and combining these and other capabilities to perform functions a human might perform, such as greeting a hotel guest or driving a car.

After decades of being relegated to science fiction, today AI is part of our everyday lives. The surge in AI development is made possible by the sudden availability of large amounts of data and the corresponding development and wide availability of computer systems that can process all that data faster and more accurately than humans can. AI is completing our words as we type them, providing driving directions when we ask, vacuuming our floors, and recommending what we should buy or binge-watch next. And it's driving applications—such as medical image analysis—that help skilled professionals do important work faster and with greater success.

As common as artificial intelligence is today, understanding AI and AI terminology can be difficult because many of the terms are used interchangeably; and while they are actually interchangeable in some cases, they aren't in other cases. What's the difference between artificial intelligence and machine learning? Between machine learning and deep learning? Between speech recognition and natural language processing? Between weak AI and strong AI? This section will try to help sort through these and other terms and explain the basics of how AI works.

2.5.1 Artificial Intelligence, Machine Learning, and Deep Learning

The easiest way to understand the relationship between artificial intelligence (AI), machine learning, and deep learning is as follows:

Think of artificial intelligence as the entire universe of computing technology that exhibits anything remotely resembling human intelligence. AI systems can include anything from an expert system—a problem-solving application that makes decisions based on complex rules or if/then logic—to something like the equivalent of the fictional Pixar character Wall-E, a computer that develops the intelligence, free will, and emotions of a human being.

Machine learning is a subset of AI applications that learns by itself. It actually reprograms itself, as it digests more data, to perform the specific task it's designed to perform with increasingly greater accuracy.

Deep learning is a subset of machine learning applications that teaches itself to perform a specific task with increasingly greater accuracy, without human intervention.


Fig 2.2: Artificial Intelligence

2.5.2 Machine Learning

Machine learning applications (also called machine learning models) are based on a neural network, which is a network of algorithmic calculations that attempts to mimic the perception and thought process of the human brain. At its most basic, a neural network consists of the following:

a. An input layer, where data enters the network.

b. At least one hidden layer, where machine learning algorithms process the inputs and apply weights, biases, and thresholds to them.

c. An output layer, where the various conclusions, in which the network has varying degrees of confidence, emerge.

Fig 2.3: Basic Neural Network
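
A minimal sketch of this three-part structure in Keras (the layer sizes are illustrative assumptions, not values prescribed anywhere in this chapter):

import tensorflow as tf
from tensorflow.keras import layers

network = tf.keras.Sequential([
    layers.Input(shape=(4,)),               # input layer: data enters the network
    layers.Dense(8, activation="relu"),     # hidden layer: weights, biases, thresholds
    layers.Dense(3, activation="softmax"),  # output layer: a confidence per class
])
network.compile(optimizer="adam", loss="categorical_crossentropy")
network.summary()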

Machine learning models that aren't deep learning models are based on artificial neural networks with just one hidden layer. These models are fed labeled data, i.e., data enhanced with tags that identify its features in a way that helps the model identify and understand the data. They are capable of supervised learning (i.e., learning from labeled data under human supervision), such as periodic adjustment of the algorithms in the model.

2.5.3 Deep Learning

Deep learning models are based on deep neural networks—neural networks with multiple hidden layers, each of which further refines the conclusions of the previous layer. This movement of calculations through the hidden layers to the output layer is called forward propagation. Another process, called backpropagation, identifies errors in calculations, assigns them weights, and pushes them back to previous layers to refine or train the model.

Fig 2.4: Deep Neural Network
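
The hedged sketch below stacks several hidden layers and trains on random placeholder data; calling fit() runs forward propagation to compute outputs and backpropagation to push errors back through the layers. All shapes and sizes here are illustrative assumptions:

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

x = np.random.rand(256, 20).astype("float32")  # 256 fake samples, 20 features each
y = np.random.randint(0, 2, size=(256, 1))     # fake binary labels

deep_net = tf.keras.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(64, activation="relu"),    # hidden layer 1
    layers.Dense(32, activation="relu"),    # hidden layer 2: refines layer 1
    layers.Dense(16, activation="relu"),    # hidden layer 3: refines layer 2
    layers.Dense(1, activation="sigmoid"),  # output layer
])
deep_net.compile(optimizer="adam", loss="binary_crossentropy")
deep_net.fit(x, y, epochs=2, batch_size=32, verbose=0)  # forward pass + backprop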

While some deep learning models work with labeled data, many can work with unlabeled data—and lots of it. Deep learning models are also capable of unsupervised learning—detecting features and patterns in data with the barest minimum of human supervision.

A simple illustration of the difference between deep learning and other machine learning is the difference between Apple's Siri or Amazon's Alexa (which recognize your voice commands without training) and the voice-to-type applications of a decade ago, which required users to "train" the program (and label the data) by speaking scores of words to the system before use. But deep learning models power far more sophisticated applications, including image recognition systems that can identify everyday objects more quickly and accurately than humans.
