Bsc. Project Chapter 1

NAME: OLATUNJI FATAI ABIODUN
MATRIC NUMBER: 20001571

PROJECT TOPIC: APPLICATION OF LONG SHORT-TERM MEMORY
FOR SENTIMENT ANALYSIS OF COVID-19 TWEETS
1.1 Introduction
Coronaviruses are a large family of viruses that are known to cause illness ranging from the
common cold to more severe diseases such as Middle East Respiratory Syndrome (MERS) and
Severe Acute Respiratory Syndrome (SARS).
COVID-19, the disease caused by SARS-CoV-2, is currently spreading rapidly worldwide and
causing millions of infections and deaths among the human population (Liu, et al., 2020).
SARS-CoV-2 was first detected in China in late 2019 at a seafood market and
eventually infected millions of people (Velavan & Meyer, 2020). The situation reports of the
World Health Organisation (WHO) indicate that the number of confirmed cases
exceeds 44 million and the number of deaths exceeds 1,175,000 worldwide (WHO, 2020), with
about 28,940,373 recoveries. Accurate insights into COVID-19 can only be obtained when the
pandemic ends as literature and statistics are proliferating, and keeping data updated is nearly
impossible (Hamzah, et al., 2020). On 28 February 2020, the WHO launched emergency
protocols in all medical and public health systems because of the severity and risks of
COVID-19 (Epidemiol, 2020). COVID-19 is not the first global pandemic. Several different
viruses and pandemics, including Ebola (McMullan, 2020), MERS-CoV and SARS, have occurred
in the past. Medical doctors and medical researchers have earnestly dealt with these pandemics,
and their efforts have not been in vain (Elder, Johnston, Wallis, & Crilly, 2020). Nevertheless,
with current trends in technology, computer science in particular has
contributed to medical decision-making, for example in managing infectious
diseases and outbreaks (Bhat, et al., 2020; Soliman, Tabak, & Sciences, 2020). Historical data
are utilised in the process, and an increase in the availability of data enables researchers to
reach better decisions and conclusions (Pan, et al., 2020). Social media platforms are a current
and practical source of such data, and they make more data available
than ever before. These data serve as the basis for conducting opinion mining and
sentiment analysis.
Sentiment analysis is the automated process of analyzing text data and classifying it as
positive, negative, or neutral. Applying sentiment analysis tools to opinions in Twitter data
can help companies understand how people are talking about their brand.
A Long Short-Term Memory (LSTM) network is a specific type of Recurrent Neural Network (RNN) that
is very effective at handling long sequence data and learning long-term dependencies.
Twitter has 330 million monthly active users (Ying Lin, 2020), which allows businesses to
reach a broad audience and connect with customers without intermediaries. On the downside,
the volume of information is so large that it is hard for brands to quickly detect negative social mentions
that could harm their business.
This is why social listening, which involves monitoring conversations on social media platforms,
has become a key strategy in social media marketing.
Listening to customers on Twitter allows companies to understand their audience, keep track of
what is being said about their brand and their competitors, and discover new trends in the
industry.
Social and statistical studies have shown that these applications influence human behaviours,
given the users’ length of time spent on them, which ranges from hours per week to daily use
(Statista, 2019). Despite the large volume of data on these social media platforms, their content
may have contradictory effects, ranging from negative to positive psychological influence on people’s
lives (Crawford, 2009). People who are
addicted to social media are likely to express and share opinions and ideas across these platforms
(Jansen, Sobel, & Cook, 2010). Consequently, turning these opinions and posts into assets is
highly valuable. A tweet or Facebook post may attract millions of likes
and retweets, but such massive interaction does not necessarily reflect the post’s importance or the
emotions of the users who engage with it. This is because of many factors, such as the nature of
posts, including negation and irony (Ji, Chun, Wei, Geller, & Mining, 2015); happiness and
sadness (K. Ali, et al., 2017); anger (Ji, et al., 2015); positive and negative (Zarrad, Jaloud, &
Alsmadi, 2014); concern, surprise, disgust or confusion (Ji, Chun, & Geller, 2016); and the
massive numbers of tweets (Gayo-Avello, et al., 2013).
1.2 Statement of Problem
Computer technologies provide profound opportunities to fight infectious disease outbreaks
(Eysenbach, 2003; Goldschmidt, 2020) and have a remarkable role, especially in sentiment
analysis for social media (Singh, Singh, & Bhatia, 2018), where they make it possible to
analyse public sentiment at scale. Various research articles have indicated that
many outbreaks and pandemics could have been promptly controlled if experts had considered social
media data (Singh, et al., 2018). Therefore, the application of long short-term memory to sentiment
analysis of COVID-19 tweets is important in light of recent events. COVID-19 remains a
controversial global topic in social media (Pastor, 2020). This study aims to examine the role
of sentiment analysis in the COVID-19 outbreak and other previous infectious diseases via a
systematic review protocol covering previous research efforts over a span of
10 years.
1.3 Aim and Objectives

The aim of this research work is to apply long short-term memory to sentiment analysis of
COVID-19 tweets in order to improve public information dissemination practices. The objectives
of this research work are to:
a. Gather tweets about COVID-19
b. Convert raw tweets into valuable information by using sentiment
analysis
c. Determine people’s sentiments towards COVID-19 by analyzing tweets related to
the COVID-19 pandemic
1.4 Significance of the Project
This research work is significant because:
a. It will help to determine people’s perspectives on COVID-19
b. It will help enlighten members of the public about COVID-19
c. It will provide live updates about COVID-19
1.5 Methodology
Data collection (here, of tweets) is the foundation of this research work. The experiment starts with
large-scale tweet collection. Feature extraction breaks short phrases or texts into words that
represent distinct categorical properties. Text will be converted to numeric features
through binary encoding. In this scheme, we will create a vocabulary from each distinct
word in the complete data corpus. For each tweet, the output of this mechanism will be
a binary vector of size n, where n is the total number of words in the
cumulative vocabulary. Initially, all entries in the vector will be 0. If a word in the
vocabulary occurs in the given tweet, then the vector element at that word’s position is set to 1.
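
The binary encoding scheme described above can be illustrated with a minimal sketch. The function names (build_vocabulary, binary_encode), the lowercase whitespace tokenisation and the two sample tweets are assumptions made for illustration only; the project does not specify how tweets are tokenised or cleaned.

```python
def build_vocabulary(tweets):
    """Map each distinct word in the corpus to a position in the vector.

    Assumption: words are obtained by lowercasing and splitting on whitespace.
    """
    vocab = {}
    for tweet in tweets:
        for word in tweet.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def binary_encode(tweet, vocab):
    """Return a binary vector of size n, where n is the vocabulary size."""
    vector = [0] * len(vocab)          # initially, all entries are 0
    for word in tweet.lower().split():
        index = vocab.get(word)
        if index is not None:          # words outside the vocabulary are ignored
            vector[index] = 1          # set to 1 regardless of how often the word occurs
    return vector


# Hypothetical sample tweets, used only to show the shape of the output.
tweets = ["stay home stay safe", "vaccine news is good news"]
vocab = build_vocabulary(tweets)
vectors = [binary_encode(tweet, vocab) for tweet in tweets]
```

Because the encoding records only presence or absence, the repeated word "stay" in the first sample tweet still contributes a single 1 to its vector.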
Next, the whole corpus will be split in a 70:30 ratio into training and testing sets.
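
A 70:30 split of this kind could be sketched as follows. Shuffling before the split, the fixed seed and the function name split_70_30 are assumptions added for illustration; the text states only the ratio.

```python
import random


def split_70_30(items, seed=42):
    """Split a corpus into 70% training and 30% testing portions.

    Assumption: items are shuffled with a fixed seed so the split is reproducible.
    """
    shuffled = list(items)                 # copy, so the original corpus is untouched
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 7 // 10          # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


# Hypothetical placeholder corpus of ten tweets.
corpus = [f"tweet {i}" for i in range(10)]
train, test = split_70_30(corpus)
```

Every tweet ends up in exactly one of the two portions, so the test set stays unseen during training.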
