
Sustainable Cities and Society 71 (2021) 102993

Contents lists available at ScienceDirect

Sustainable Cities and Society


journal homepage: www.elsevier.com/locate/scs

SNS Big Data Analysis Framework for COVID-19 Outbreak Prediction in Smart Healthy City
Abir EL Azzaoui, Sushil Kumar Singh, Jong Hyuk Park *
Department of Computer Science and Engineering, Seoul National University of Science and Technology, (SeoulTech), Seoul 01811, Republic of Korea

ARTICLE INFO

Keywords: COVID-19; Smart Healthy City; Big Data Analysis; SNS; NLP

ABSTRACT

Nowadays, the world is experiencing a pandemic crisis due to the spread of COVID-19, a novel coronavirus disease. The contamination rate and death cases are increasing rapidly. Simultaneously, people are no longer relying on traditional news channels to inform themselves about the epidemic situation. Instead, smart city citizens are relying more on Social Network Service (SNS) to follow the latest news and information regarding the outbreak, share their opinions, and express their feelings and symptoms. In this paper, we propose an SNS Big Data Analysis Framework for COVID-19 Outbreak Prediction in Smart Sustainable Healthy City, in which the Twitter platform is adopted. Over 10,000 tweets were collected over two months; 38% of the users are aged between 18 and 29, while 26% are between 30 and 49 years old. 56% of them are male and 44% are female. The geospatial location is the USA, and the language used is English. Natural Language Processing (NLP) is deployed to filter the tweets. Results demonstrated an outbreak cluster predicted seven days earlier than the confirmed cases with an indicator of 0.989. Analyzing data from SNS platforms enabled the prediction of future outbreaks several days earlier, which can help reduce the infection rate in a smart sustainable healthy city environment.

* Corresponding author.
E-mail addresses: abir.el@seoultech.ac.kr (A. EL Azzaoui), sushil.sngh001007@seoultech.ac.kr (S.K. Singh), jhpark1@seoultech.ac.kr (J.H. Park).

https://doi.org/10.1016/j.scs.2021.102993
Received 26 November 2020; Received in revised form 8 March 2021; Accepted 2 May 2021
Available online 7 May 2021
2210-6707/© 2021 Elsevier Ltd. All rights reserved.

1. Introduction

By the end of 2019, the Wuhan Municipal Health Commission in China reported a cluster of patients with pneumonia symptoms [1]. A few days later, the World Health Organization (WHO) recorded the first virus outbreak. Its flagship technical report named the novel virus COVID-19 (Corona Virus Disease 2019). Since then, global media and healthcare organizations have raced to publish and deliver information and guidance to the public. On January 13, 2020, Thailand officially confirmed one COVID-19 case, the first infection outside China. Other countries, such as South Korea and Japan, were affected as well. On March 11, 2020, WHO declared a COVID-19 pandemic, calling for a public quarantine, raising the virus countermeasures to the highest levels, and shutting down several international airports to control the outbreak, considering the alarming levels of spread and severity.

With the continuous spread of the virus across the globe, the concept of Smart Healthy Cities has come under scrutiny. A Smart Sustainable City (SSC) is an approach that uses the progress of Information and Communication Technology (ICT) and other innovative means to serve and improve the Quality of Life (QoL) of its citizens [2], all while assuring the availability of services for current and future generations. Enhancing the QoL is only possible with advanced transportation, governance, finance, education, and health services. Along the same lines, the United for Smart Sustainable Cities (U4SSC) initiative [3] reported the healthcare system as one of the crucial Key Performance Indicators (KPI) for a successful SSC. The noticeable progress of healthcare services and technologies, named Smart Healthcare, contributes directly to the improvement of smart cities in general. The better the services provided by smart healthcare, the better life is in a smart city. Accordingly, smart sustainable cities are developing into Sustainable Smart Healthy Cities (SSHC). Researchers and SSHC citizens were waiting to see which measures would be taken to cope with the sudden virus outbreak and how it would affect the sustainability and development of their smart cities. Singapore, as an example of an SSHC, deployed state-led technologies such as the TraceTogether application and the SafeEntry system for digital surveillance during the outbreak of the virus [4], which seemingly contributed to controlling the fast propagation of the virus. Nevertheless, the COVID-19 epidemic has affected some characteristics of SSHC and exposed some of its weaknesses [5]. Due to the large population of SSHC, the human proximity narrowed down the social distancing
measures. Moreover, SSHC also faced the problem of micro-mobility spaces [6–8]; in order to reduce the infection rate and keep up with the sustainability demand, smart cities need to provide more space for walking and reduce the use of public transportation.

At the same time, following the advancement of smart cities in the area of 5G networks, people are more connected via their smart devices, including smartphones, tablets, computers, and smart TVs. The study in [9] reported that the average time an adult spends on a smartphone is 232 minutes per day. Similarly, 90% of adolescents in Norway were found to use SNS platforms such as Facebook, Twitter, and Instagram daily [10]. People are exposed to a vast amount of information and news regarding their interests, making SNS platforms the main source of news, particularly during a pandemic, due to social distancing measures and quarantine [11]. Multiple Social Networking Services reported an increase in usage. As one of the most used social media platforms, Facebook reported a 50% increase in overall messaging when WHO announced the pandemic status [12]. WhatsApp, too, disclosed a 40% increase in the same period. Overall, usage of social media has increased by more than 9% during the COVID-19 pandemic [13]; this corresponds to 321 million new users, with the worldwide total at 3.80 billion users. People are no longer relying on classic media sources such as government news or local healthcare providers to inform themselves during a pandemic such as the current COVID-19. Instead, they tend to trust social media platforms and messenger services more, where they can express their opinion and follow the current status from moment to moment. For instance, online headlines and hashtags during the month of January mostly dealt with fear and prejudice against the Chinese people [14].

Online platforms were the main source of information, as people usually do not watch television news or read newspapers, leading to a COVID-19 infodemic. The term infodemic was defined by Zarocostas et al. [15] as the high demand for timely, trustworthy information about a novel virus. Fig. 1 charts the number of mentions of the hashtag (a type of metadata tag used on social networks) “#coronavirus” on the Twitter platform from January (the beginning of the outbreak) until March (the peak of the outbreak). We noticed that the search rate for the keyword “coronavirus” was even higher than the number of hashtag mentions. This means that Twitter users were actively reading the news and viewing content regarding COVID-19 to keep themselves informed.

Fig. 1. Coronavirus hashtag’s report on Twitter.

SNS platforms such as Twitter contain valuable data and information. Users, for instance, may disclose their symptoms online and request advice from other users. They may also mention their locations, sharing various types of recent information and news regarding the virus outbreak. SNS users produce extensive spatial and temporal data. These data are the main pillar for infodemiology scientists. The term infodemiology refers to the novel discipline that combines information science and epidemiology to address pressing concerns for public health and policy decisions [16–18]. With an increasing number of people turning to social media for information and to express their sentiments, fears, and opinions during this epidemic [19], SNS platforms function as a convenient source of information [43], yet only a few systematic studies have been conducted using data generated by social media.

A smart and fully sustainable healthy city commits to using ICT and similar technologies to improve the QoL of its citizens and enhance its services while maintaining the required sustainability. Thus, predicting future clusters of the virus outbreak and preventing fake news from propagating further are critical actions to reduce the burdens faced by SSHC. To this end, and to exploit the openly shared SNS data, we propose an SNS big data analysis framework related to COVID-19 that uses the Twitter API as a platform, Python, and NLP methods to predict potential future cases and virus outbreak hotspots based on the users’ openly shared data, including location and symptoms. SNS users provide regular updates about their health status and concerns regarding the virus, including their locations such as home, work, or school, and frequently visited places. Moreover, we analyze the information shared on SNS to detect false information and suppress its spread, as it is considered one of the main and direct reasons virus transmission increases. Fake news is drowning out the voices of official healthcare providers and government advisers on COVID-19, making it extremely difficult to deliver the right instructions [20]. Researchers from the University of East Anglia modeled the effect of fake news and misinformation on the propagation of infectious diseases, and results showed that reducing misleading advice and misinformation by just 10% noticeably reduced people’s risky behaviors that directly cause the virus to spread. Other misleading information, such as news of fake lockdowns, directly contributes to public panic. We have witnessed several cases around the world where people rushed to buy food supplies following fake news of a lockdown, breaking social distancing rules and leading to further virus propagation [45].

Very limited research has been conducted on SNS data for crises like COVID-19. Besides, existing research focuses on either sentiment analysis, fake news detection, or virus outbreak prediction. However, with this proposed study, we strongly believe that big data analysis of SNS information is a crucial application for SSHC during an epidemic to control the situation, manage the circulated and shared information, prevent misinformation from spreading
further, and predict future potential outbreak locations, hotspots, and patients.

The following are the main contributions of our study:

• We proposed an SNS big data analysis framework based on users’ openly shared information, wherein we analyze the keywords and sentences related to COVID-19 that users tweeted openly from public accounts. This investigation stimulated and improved government and healthcare providers’ perception of people’s opinions as well as their tendency to stick to social distancing and follow the rules given by the World Health Organization.
• Predicting COVID-19 outbreaks and detecting future potential patient clusters by analyzing the tweets based on the keywords database: These keywords include basic COVID-19 symptoms such as dry cough and high body temperature. SNS users tend to share their symptoms as well as those of their companions to get advice from other users or simply to express themselves. We accumulated a database of public tweets over two months to predict the whereabouts of a possible outbreak.
• Detecting fake news and managing the infodemic by comparing the tweets to fact-checking websites to detect the circulated fake news and calculate the propagation coefficient: Fake news contributes directly to the virus spread. Thus, we believe that infodemic management is mandatory to control the epidemic in SSHC.

The rest of this paper is organized as follows: in the second section, we present a summary of other existing studies; Section 3 presents our proposed framework overview and explains in detail our flow diagram; the fourth section contains the numerical results and analysis based on the Twitter case study; finally, we conclude this work in the fifth section.

2. Related works

The recent advancement of smart cities and 5G technologies contributes directly to the increase of SNS and mobile users. Since the Quality of Experience guaranteed by the 5G network is relatively high, users in the SSHC rely more and more on their phones, instead of classic news channels, to stay informed daily. Taking the Twitter platform as an example, Monthly Active Users (MAU) in Japan numbered over 21 million, and they communicate and share information via Twitter [21]. An average of 6000 tweets per second was recorded in May 2020, and 200 billion tweets per year [22]. People rely more on SNS platforms to inform themselves about the pandemic. Nonetheless, very few studies have investigated and analyzed the news and information broadcast on social media to understand the propagation of the virus. In this section, we discuss some of the works that have explored social networks to derive the outbreak patterns and people’s willingness to contribute to the virus control [23].

2.1. Seminal contribution

To the best of our knowledge, only five studies have considered analyzing social network data and news to understand the propagation of misinformation and its correlation with the virus outbreak. For instance, Yoo et al. [24] investigated the effects of SNS communication during an epidemic such as Middle East Respiratory Syndrome (MERS) and argued that it can predict preventive behavioral intentions in South Korea. The authors examined the theoretical expression and reception effect model using data collected from a nationally representative panel survey. Results showed that public health organizations need to adapt SNS monitoring tools to better understand diverse social groups’ opinions and requests in order to adjust their messages accordingly. This research proposed the management of SNS platforms such as Twitter for outbreak cluster control; however, the authors did not consider the propagation of fake news. Moreover, the proposed method cannot make causal inferences regarding the relationship among key variables. Shahi et al. [25] presented a multilingual cross-domain dataset of 5182 fact-checked news articles for COVID-19. Their proposal included a classifier for automatic fake news detection. The authors collected articles and news related to COVID-19 from two sources - “Poynter” and “Snopes” - and performed explanatory analysis on them. Their articles are well covered and linked to social media accounts including Twitter, Facebook, Reddit, YouTube, and Instagram. The proposed machine learning classifier recorded an F1-score of 0.76. This result helps in the initial screening of the propagation of fake news and misinformation during the pandemic period; nonetheless, this work did not provide a method for managing the Twitter infodemic and considered human annotation categories for only three languages.

Following the same concept of detecting fake news on social media, and in order to gain early insight into social opinion during a global pandemic, Shahi et al. [26] conducted an exploratory study on the propagation and authors of misinformation on Twitter. They divided the false claims they found into two categories - false news and partially false news - and found that verified accounts, including celebrities and organizations, are also involved in creating or spreading misinformation by retweeting it. The results of this mapping study showed the huge gap in the current scientific coverage of the topic and covered the most shared fake news, yet the dataset used excluded less viral misinformation. Similarly, Massaad et al. [27] designed social media data analytics to describe and analyze the volume, content, and geospatial distribution of tweets associated with telehealth during the COVID-19 pandemic. They queried Twitter public data to access tweets and analyzed them using Natural Language Processing (NLP) and unsupervised learning methods. The study was conducted in one country; results showed that such a platform must be used to evaluate the needs of societies and to support the healthcare response during a pandemic. Yet, fake news was not taken into consideration, and the database used for analyzing the results included all the information from Twitter. Likewise, Pui et al. [28] discussed how the content of tweets about COVID-19 can help understand the virus outbreak based on sentiment analysis; however, this work lacks developed results. Jia et al. [29], in turn, emphasized the usefulness of the geographic information system to fight this epidemic. The authors discussed how the national intelligent syndromic surveillance system could identify early risk by detecting patients’ symptoms and locations.

Other researchers focused on the big data analysis of social media for various motivations but did not deal with the COVID-19 topic. Nonetheless, we included their works in the related work section as they helped build the main pillar of data collection and analysis on social media platforms. For example, Htet et al. [30] conducted social media data analysis using a maximum entropy classifier on a big data processing framework. They considered Twitter as the platform and retrieved the health condition, education status, and state of business using data mining techniques, and a maximum entropy classifier was used to perform sentiment analysis on the tweets, yet the Hadoop MR method used has some drawbacks in the case of the transaction between input and output. Following the same concept, Barbosa et al. [31] presented a robust sentiment detection method for biased and noisy data, using Twitter as a platform. The authors leveraged sources of noisy labels as training data. The results showed lower error rates, as an effective polarity classifier was produced even when only a small amount of data was provided; nonetheless, sentences with antagonistic sentiments cannot be analyzed.

2.2. Key consideration for COVID-19 prediction model

In this section, we discuss similar contributions and related work, compare related research studies (Table 1), and present our main key consideration for our proposed SNS-based COVID-19 outbreak prediction in a smart healthy city.

Table 1
Related work comparison

| Research work | Year | Twitter | Other Platform | Method/Software/Hardware | SNS Monitoring | Sentiment Detection | Fake News Detection | Limitations |
|---|---|---|---|---|---|---|---|---|
| Yoo et al. [24] | 2016 | Yes | No | Structural equation modeling; MPlus V6.1 | Yes | No | No | Cannot make causal inferences regarding the relationship among the key variables. |
| Shahi et al. [25] | 2020 | Yes | No | NLP; Python | No | No | Yes | Human-annotated categories only for three languages. |
| Shahi et al. [26] | 2020 | Yes | No | Python | No | No | Yes | The dataset excludes less viral misinformation. |
| Massaad et al. [27] | 2020 | Yes | No | Google Colab; Python | No | Yes | No | The data were collected in only one country. |
| Pui et al. [28] | 2020 | Yes | No | Google Colab; Python | No | Yes | No | The results were not developed. |
| Htet et al. [30] | 2018 | Yes | No | HBase; Raspberry Pi | No | Yes | No | The Hadoop MR method used has some drawbacks in the case of the transaction between input and output. |
| Barbosa et al. [31] | 2010 | Yes | No | Support vector machines | No | Yes | No | Sentences with antagonistic sentiments cannot be analyzed. |
| Our contribution | 2020 | Yes | No | Google Colab; Python | Yes | Yes | Yes | Tested on only one SNS platform. |

Based on the abovementioned related works, we could say that authors focused more on fake news detection and public sentiment determination. In our approach, however, we tried to
detect and predict future outbreak hotspots based on openly published public posts on Twitter. In our approach, we considered sentiment detection, understanding of people’s fear, and fake news detection as well as two other considerations. Our main key considerations can be described as follows:

• Sustainability: In an SSHC environment, authorities can make use of the available connectivity and online data to control the outbreak of any infectious disease such as COVID-19. Users of SNS tend to share valuable data and information online; such data, when correctly filtered and analyzed, aid in better understanding the virus propagation patterns and help contain it. We focused on using this approach to propose a sustainable way to predict the virus outbreak so that the authorities can make accurate and meaningful decisions. Detecting the virus outbreak cluster will directly impact the management of the pandemic in SSHC, thus maintaining the required level of QoL while improving the healthcare service.
• Security and privacy: Other approaches do not consider data security and the privacy of users, as they use the raw data provided by the SNS platforms’ API without filtering it. In order to comply with data protection agreements in different regions, our proposed approach analyzes only the data openly shared by users on Twitter. Moreover, we consider the privacy of users in this study by adding a filter functionality using NLP methods before forwarding the data to be processed; in the filter phase, we omit the user name and SNS identification. This step allowed us to analyze the data anonymously and retrieve knowledge without violating the users’ privacy.
• Availability: This approach is based on the data collected from openly shared tweets on Twitter. Using the API provided by Twitter, we can collect data from various users in multiple locations and from diverse backgrounds, which generates a large database of information.
• Integrity: Data integrity refers to the assurance and maintenance of the accuracy of data over its life cycle [46–47]. In this approach, we filtered the data thoroughly before proceeding to the analysis phase, which guarantees accurate and meticulous data for our prediction model.

3. Proposed SNS big data analysis framework

Very few infodemiology studies have applied network analyses as a method of pandemic management. To this end, we proposed a Big Data analysis framework based on information shared openly on the SNS platform to track, understand, and predict future virus outbreaks as well as manage the infodemic and prevent fake news from spreading. This framework includes four layers: 1) User Layer, 2) Fog Layer, 3) Cloud Layer, and 4) Application Layer. The overview of the framework is depicted in Fig. 2 below.

Fig. 2. SNS Big Data Analysis Framework Overview.

3.1. General overview

The proposed framework contains four main layers, and it is designed to analyze data from SNS platforms based on the information shared by the users:

User layer: The first layer of our proposed framework is the user layer. SNS users in this layer use their smartphones, computers, and/or tablets to surf the platforms. Users can consume data - which means being a mere receiver of information - participate in the creation of new data, and/or share existing data. Each user utilizes one or multiple SNS platforms such as Facebook, Instagram, Twitter, YouTube, and so on to:

• Receive information regarding the virus: Users utilize the platform to receive, search, and read news and information regarding COVID-19. The data could reach the user in the form of posts, messages, videos, and news articles shared by other users. Users are capable of searching on SNS platforms using keywords or hashtags to retrieve their desired information. Taking Twitter as an example, in 2019, the platform recorded 330 million monthly active users, 40% of whom use the service daily. Note, however, that Twitter announced that over 500 million people access its platform without logging into an account [32]. These numbers suggest that people are using Twitter as receivers, and that they are capable of getting the information they need without participating.
• Share information regarding the virus: Users can share any post or article they want on their profile. This action requires having an active account and helps in understanding the user’s opinion later in the analysis phase.
• Create content and information regarding the virus: SNS platforms give users the power to express themselves and their thoughts virtually. Users contribute to creating new information and data directly by posting their viewpoints, sharing their knowledge and experiences, and expressing their feelings regarding the epidemic situation. Such data, after being analyzed, strengthen our understanding of how people react during a serious pandemic, including their willingness to stick to the authorities’ instructions in order to control the virus outbreak. Moreover, checking if the shared information contains any
misleading information is a critical point before people start to reshare and believe it.
• Update their health status and concerns: SNS users tend to share their concerns about their health status and conditions online. The authors in [42] surveyed 1,040 US adults and found that roughly 33% are using SNS platforms such as Twitter to share medical questions, obtain health information, and track their symptoms. 80% of participants aged 18 to 24 said that they are more likely to share information about their health condition and symptoms on social media. Analyzing these data can help healthcare providers and authorities predict potential future COVID-19 outbreak hotspots and make fast decisions to stop it from spreading further. We will explain in the next section how this functionality is crucial for virus control.
• Share their locations and visited places: Another valuable piece of information that users tend to share in their SNS accounts is their location. When they post a picture or a text post, they are most likely to mention the place where they are. Facebook, Instagram, and Snapchat mainly suggest locations for people upon uploading in their accounts. Moreover, users also indicate the other people they are with at the time of posting. This information can directly help in tracing the virus spread, as we will discuss in the following section.
• Express their opinion and anger: Opinion, sentiment, and fear can be retrieved from the contextual posts of the user as well as his/her likes of other posts and articles.
• Socialize with other people: During the quarantine imposed by most countries around the world, schools and workplaces have shifted online and no longer require physical presence, thus leading people to spend more time at home. They turn to social media in order to socialize and keep in touch with their loved ones and friends.

All of this information is constantly being uploaded to the fog layer where social networking platforms run their databases.

Fog Layer: The fog layer is where the social media platforms, such as Facebook, Instagram, Twitter, YouTube, and chat applications such as WhatsApp and Facebook Messenger, reside and run their databases. The data uploaded by the users are stored in fog databases. Some of the SNS platforms offer an open API to access their database content for research purposes, whereas for others the data need to be collected manually in order to study them. Thus, in this paper, we work with the open Twitter API and make use of its database to retrieve tweets related to COVID-19 and analyze them.

Cloud Layer: Two main functionalities run at the cloud layer: Filtering and Analyzing. In the filtering phase, the data are cleaned; this includes taking out hashtags and links, lowercasing all the sentences, and other functionalities that we will explain in detail in the next section. Subsequently, the cleaned data are forwarded to the analyzing phase, where they are processed and analyzed in order to retrieve knowledge and valuable information. Details about this process are discussed in the next section. The results are forwarded to the Application layer.

Application Layer: In the last layer, the results of the analyzing phase and the knowledge gleaned are transmitted to the authorities in order to make use of the results retrieved from the cloud layer [37–41]. Based on the results received, authorities such as public healthcare providers and
governments should be able to make accurate decisions regarding the virus outbreak and infodemiology. The results help the authorities predict future outbreak waves based on users’ shared symptoms and locations, control the information shared on SNS platforms in order to track fake news and stop it from spreading further, and understand people’s fear and opinion regarding the epidemic in order to maintain a sustainable, healthy smart city.

The proposed framework and solution are not only suitable for the COVID-19 epidemic but should also be used to monitor future virus outbreaks and manage the situation and public fear before it is too late.

3.2. Structural design with methodological flow

Fig. 3 presents the methodological flow of our proposed framework and aids in better understanding it. We took Twitter as the SNS platform as it provides researchers with a free API to access all the openly shared tweets. The flow starts with a user who decides to share a piece of information on Twitter - this information could be a simple opinion, a location (dinner at a certain place or at home with a certain person), symptoms (users who share online that their symptoms are similar to COVID-19 symptoms before getting checked), and/or general information about the virus that could be created or shared from another resource such as web news pages or other tweets. All of these data are crawled, collected, and shared with the cloud server, where two main functions are performed: preprocessing and postprocessing.

Fig. 3. Process Flow of the Proposed Framework.

Preprocessing: Preprocessing methods include filtering the data communicated from the fog layer; this step consumes roughly 85% of knowledge discovery time [33]. We followed several steps as follows:

a) Keywords extraction: Since the platform we were working on is Twitter, we selected a set of the most used hashtags and collected our tweets based on them. These hashtags include: #coronavirus, #wuhanvirus, #COVID-19, #ncov, #lockdown, #2019ncov, and #corona. These hashtags are the most used on Twitter regarding the COVID-19 topic. Algorithm 1 shows the process of selecting tweets based on hashtags; we started by connecting to the Twitter open API, created a table where we put all the hashtags, and retrieved all the tweets where those hashtags are mentioned.
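
The hashtag-based selection described in the keywords extraction step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' Algorithm 1: the function name `select_tweets` is hypothetical, the tweets are assumed to be already available as plain strings in memory, and no Twitter API call is shown.

```python
import re

# Hashtags listed in the text for collecting COVID-19 tweets (stored lowercase).
HASHTAGS = {"#coronavirus", "#wuhanvirus", "#covid-19", "#ncov",
            "#lockdown", "#2019ncov", "#corona"}

# Matches one hashtag token; the hyphen is allowed so "#COVID-19" stays whole.
HASHTAG_RE = re.compile(r"#[\w-]+")

def select_tweets(tweets, hashtags=HASHTAGS):
    """Return the tweets that mention at least one tracked hashtag.

    Hypothetical helper: matching is case-insensitive and exact per token,
    so "#coronation" does not match "#corona".
    """
    selected = []
    for text in tweets:
        found = {tag.lower().rstrip("-") for tag in HASHTAG_RE.findall(text)}
        if found & hashtags:
            selected.append(text)
    return selected
```

Exact token matching is a design choice of this sketch; a substring match would be simpler but would also pick up unrelated hashtags that merely start with a tracked one.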


b) Tokenization: Tokenization is the process wherein words are transformed from normal strings into meaningful terms in order to facilitate the analysis phase [34]. The tokenization action removes any special characters, numbers, and symbols from the text. These characters are called tokens. Algorithm 1 depicts the tokenization process. After establishing the connection and collecting the tweets based on the given hashtags, we created a table and associated the special characters, such as the “RT” mention, which refers to a retweeted tweet, the “@” symbol, which is used to tag a user, the “#” symbol, which is used to highlight keywords, and numbers. The function Replace is responsible for putting a space in place of those special characters.

c) Lemmatization and clustering: In linguistics, lemmatization is the process of grouping together the inflected forms of a word so that they can be analyzed as a single item and identified by the word’s lemma, or dictionary form. In data analysis science, this method can be referred to as clustering. A lot of tweets are similar, and this may create data noise. Thus, clustering tweets based on their similarities saves us more time in the post-analysis phase. The clustering algorithm is based on a machine learning concept called Term Frequency - Inverse Document Frequency (TF-IDF). The TF-IDF is a matrix where a Tokenized Tweets table resulting from Algorithm 1 is referred to as t and the words within it are referred to as w. The term frequency is the ratio of the count of the current word to the total count of words in the Tokenized Tweets table. For the frequency of word $w_i$, $n_w$ is the total number of occurrences of $w_i$ in the Tokenized Tweets table. The sum of $n_k$ is the total number of all words in the Tokenized Tweets table. Equation 1 explains this step:

$$TF(w, t) = \frac{n_w}{\sum_k n_k} \tag{1}$$

Inverse Document Frequency is the logarithm of the ratio of the number of all tweets in the Tokenized Tweets table to the number of tweets that contain the term $w_i$. Equation 2 depicts this phase:

$$IDF(w, T) = \log \frac{|T|}{|\{t_i \in T \mid w \in t_i\}|} \tag{2}$$

TF-IDF is the product of $TF(w, t)$ and $IDF(w, T)$, as shown in Equation 3:

$$TF\text{-}IDF = TF(w, t) \cdot IDF(w, T) \tag{3}$$

Clustering and organizing all the tweets is a necessary step to expedite the post-analysis process.
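The tokenization step and Equations 1 to 3 can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions rather than the authors' Algorithm 1: the function names `tokenize` and `tf_idf` are hypothetical, the term frequency is computed over the whole Tokenized Tweets table as the text describes, and the natural logarithm is assumed for the IDF.

```python
import math
import re
from collections import Counter


def tokenize(tweet):
    """Replace 'RT', '@', '#', digits and other symbols with spaces, then split."""
    text = re.sub(r"\bRT\b", " ", tweet)        # retweet marker
    text = re.sub(r"[^A-Za-z\s]", " ", text)    # '@', '#', numbers, punctuation
    return text.lower().split()


def tf_idf(tokenized_tweets):
    """Score every word of the table with TF (Eq. 1) times IDF (Eq. 2)."""
    # n_w: occurrences of each word over the whole table.
    counts = Counter(w for tweet in tokenized_tweets for w in tweet)
    total_words = sum(counts.values())          # sum over k of n_k
    n_tweets = len(tokenized_tweets)            # |T|
    # Number of tweets that contain each word at least once.
    doc_freq = Counter(w for tweet in tokenized_tweets for w in set(tweet))
    scores = {}
    for word, n_w in counts.items():
        tf = n_w / total_words                       # Eq. 1
        idf = math.log(n_tweets / doc_freq[word])    # Eq. 2, natural log assumed
        scores[word] = tf * idf                      # Eq. 3
    return scores
```

With this definition, a word that appears in every tweet receives a score of zero, which is the usual reason TF-IDF suppresses terms that do not help separate tweets into clusters.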


Table 2
Subjectivity and Polarity Metric

               Subjectivity               Polarity
Scale          0            1             −1          1
Explanation    Objective    Subjective    Negative    Positive

d) Feature extraction: Subsequent to the keyword extraction, tokenization, and lemmatization phases, the remaining necessary step is feature extraction, in which the main characteristics of each tweet are extracted. The results fuel the post-processing stage. These features include user location, symptoms mentioned in the tweets, subjectivity, and polarity. Starting with the user location, we used the location provided by the user in the location field of the tweet metadata. Aside from the city or country, we focused more on precise locations such as restaurant name, workplace, school name, and so on. Algorithm 2 presents the pseudo-code used to retrieve the user location. We used the Tokenization Table obtained from Algorithm 1, fetched all the qualified tweets from it, and checked whether the location field is provided. If so, we created a table with all the users and their respective locations. This table serves to predict possible future outbreak cases. As for the subjectivity and polarity analysis, it is a method that provides a clear understanding of the user’s attitude, opinion, and sentiment using NLP techniques. As part of sentiment analysis using NLP, the subjectivity analysis classifies texts as opinionated or non-opinionated [35]. Adjectives, adverbs, and certain verbs and nouns were used as indicators of a subjective opinion. Polarity analysis, on the other hand, was performed after the subjectivity analysis in order to determine whether the opinion is positive or negative.

As with the subjectivity analysis, we used Python TextBlob for NLP to extract the polarity result. The results varied between -1 and 1, where negative scores represent a negative opinion and positive scores denote a positive opinion. A score of 0 suggests a neutral point of view (Table 2).

Algorithm 3 presents the pseudo-code of the abovementioned steps, whereas Fig. 4 shows an example of 100 tweets retrieved using the previously mentioned methods. This method supports detecting fake news as well. Fake news items mainly share the same characteristics: they tend to be more negative and show a very subjective score. Based on that, we retrieved the suspected tweets and compared them with fake news databases provided by multiple researchers in order to delete them from our database if the similarity is high; tweets from the same user will not be used in the future. Fig. 5 presents the detailed methodological flowchart of the proposed framework.

Fig. 4. Example of Subjectivity and Polarity results.

Fig. 5. Detailed Methodology Flowchart.
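The feature extraction step can be sketched as follows. The sketch assumes that each tweet is a dictionary whose subjectivity (0 to 1) and polarity (−1 to 1) scores have already been computed on the scales of Table 2, for example with TextBlob as the text states. The field names, the function names, the sample values, and both thresholds are hypothetical; the paper does not give the cut-off values it used to flag suspected fake news.

```python
def build_location_table(tweets):
    """Map each user to the location field of the tweet metadata, if provided."""
    table = {}
    for tweet in tweets:
        location = (tweet.get("location") or "").strip()
        if location:  # skip tweets whose location field is missing or empty
            table[tweet["user"]] = location
    return table


def polarity_label(polarity):
    """Translate a polarity score into the labels of Table 2 (0 is neutral)."""
    if polarity < 0:
        return "negative"
    if polarity > 0:
        return "positive"
    return "neutral"


def flag_suspected_fake(tweets, min_subjectivity=0.8, max_polarity=-0.3):
    """Return tweets that are both very subjective and negative.

    Both thresholds are illustrative assumptions, not values from the paper.
    Flagged tweets would still need to be compared with a fake news database.
    """
    return [
        t for t in tweets
        if t["subjectivity"] >= min_subjectivity and t["polarity"] <= max_polarity
    ]
```

Keeping the flagging rule separate from the database comparison makes it easy to tune the thresholds without touching the rest of the pipeline.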


Fig. 6. Categories of USA population in COVID-19 case.

Fig. 7. Prediction results compared with confirmed results.

Fig. 8. Median Prediction Factor.


Fig. 9. Predicted and Confirmed COVID-19 Cases Based on Days.

4. Analysis and discussion

After gathering the data from Twitter, filtering it, and organizing it
based on sentiment analysis, in this section we apply deep learning
to the data in order to extract knowledge and build a possible
prediction of the virus outbreak based on people’s locations and shared
symptoms and questions. This method gives us an overview
of potential future COVID-19 cases and allows fast action to prevent
further spread.

4.1. Numerical results

In order to test the feasibility of our proposed model, we simulated it on a 64-bit i7 computer. The data for this prediction simulation were collected over a two-month period from August 1, 2020 to September 30, 2020. Data collection was done using Python for Twitter on Google Colab. In order to simplify the task, we took the United States of America as the location from which we retrieved and analyzed the data. The total number of tweets collected is 10000. 38% of the users are between 18 and 29 years old, while 26% are between 30 and 49 years old; 56% are male and 44% are female. We used these data to feed the deep learning algorithm in order to extract a prediction from it. Afterward, we compared the results with the actual COVID-19 cases in the USA. For this purpose, we used the SIR (Susceptible, Infected, Recovered) model, in which we divided the data collected and filtered from Twitter into three categories based on the symptoms and confirmations shared by the users; these categories are shown in Fig. 6.

Fig. 10. Comparative result analysis.

Table 3
Comparative result analysis based on different models.

Model Name      Model Description                                                  Prediction Results
Logistic        $P = \frac{A}{1 + e^{\left((4\mu) \cdot \frac{L - x}{A}\right) + 2}}$    0.999
Linear          $P = Ax - B$                                                       0.557
Logarithmic     $P = A + B\log(x)$                                                 0.289
Quadratic       $P = A + Bx + Cx^2$                                                0.88
Cubic           $P = A + Bx + Cx^2 + Dx^3$                                         0.982
Compound        $P = AB^x$                                                         0.977
Power           $P = Ax^B$                                                         0.702
Exponential     $P = Ax^B$                                                         0.977
SNS Analysis    $P = I_0 e^{\alpha - \beta}$                                       0.989

The model is based on the assumption that the COVID-19 virus is transmitted through direct or indirect contact from the infected category to susceptible people [36]. In our proposal, we used as the data set the filtered information retrieved from Twitter accounts based in the USA. Infected cases who previously shared their locations and the people they have met are potential virus transmitters to susceptible cases. After dividing the data into three categories, we analyzed it based on the movement and contact of the infected cases, removed the death and recovered cases from the calculation, and excluded newborn cases from this simulation as the period is relatively short.

In order to calculate the progress of individuals in the susceptible category as a time series, we used the following differential equation:


Table 4
Comparison based on key considerations.

Research work      Year   Twitter   Method/Software/Hardware                    Sustainability   Security   Availability   Integrity
[24]               2016   Yes       Structural equation modeling; MPlus V6.1    No               No         Yes            Yes
[25]               2020   Yes       NLP; Python                                 Yes              No         Yes            Yes
[26]               2020   Yes       Python                                      No               Yes        Yes            No
[27]               2020   Yes       Google Colab; Python                        No               No         Yes            Yes
[28]               2020   Yes       Google Colab; Python                        Yes              Yes        No             No
[30]               2018   Yes       HBase; Raspberry Pi                         Yes              No         Yes            Yes
[31]               2010   Yes       Support vector machines                     Yes              Yes        No             Yes
Our Contribution   2020   Yes       Google Colab; Python                        Yes              Yes        Yes            Yes
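The SIR dynamics of Equations 4 to 6 and the median prediction factor of Equation 8 in Section 4.1 can be explored numerically with a short Python sketch. It is only an illustration under stated assumptions: the forward Euler scheme, the step size, the function names, and every parameter value used when calling it are hypothetical and are not the settings of the reported simulation. S, I, and R are treated as fractions of the population, so their sum stays equal to 1.

```python
from statistics import median


def simulate_sir(alpha, beta, s0, i0, days, steps_per_day=100):
    """Integrate dS/dt = -aSI, dI/dt = aSI - bI, dR/dt = bI with forward Euler.

    Returns one (S, I, R) tuple per day, starting with the initial state.
    """
    dt = 1.0 / steps_per_day
    s, i, r = s0, i0, 1.0 - s0 - i0
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infected = alpha * s * i * dt   # flow from S to I (Eq. 4)
            new_removed = beta * i * dt         # flow from I to R (Eq. 6)
            s -= new_infected
            i += new_infected - new_removed     # net change of I (Eq. 5)
            r += new_removed
        history.append((s, i, r))
    return history


def prediction_factors(predicted, confirmed):
    """Daily factor M = P / C (Eq. 8) for paired predicted and confirmed counts."""
    return [p / c for p, c in zip(predicted, confirmed)]


def median_prediction_factor(predicted, confirmed):
    """Median of the daily P / C factors."""
    return median(prediction_factors(predicted, confirmed))
```

While S stays close to 1, Equation 5 reduces to dI/dt ≈ (α − β)I, so the infected fraction grows roughly exponentially whenever α exceeds β; calling `simulate_sir(0.3, 0.1, 0.999, 0.001, 30)` with these made-up rates shows that early growth. A median factor close to 1 means the predicted counts track the confirmed ones.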

$$\frac{dS}{dt} = -\alpha S I \tag{4}$$

S and I refer to Susceptible and Infected cases, respectively. α is the daily reproduction rate, which regulates the contact between susceptible and infectious cases. In the early stages of the outbreak, the value of I is negligible, and the value of S is approximated to be equal to 1. As the virus outbreak progresses, the value of I becomes larger, whereas the value of S gradually declines. Thus, the increase $dI/dt$ becomes linear, and the infected category can be calculated as follows:

$$\frac{dI}{dt} = \alpha S I - \beta I \tag{5}$$

where β represents the daily rate regulator of new infections based on the quantification of the number of infected cases in the transmission. Moreover, the removed cases R, which represent the category of people who have been cured of the virus or who died, can be calculated as follows:

$$\frac{dR}{dt} = \beta I \tag{6}$$

Based on the equations above, the prediction of a future potential outbreak using Twitter data can be calculated as follows:

$$I(t) \approx I_0 e^{\alpha - \beta} \tag{7}$$

We ran a simulation on the data previously retrieved and filtered from Twitter and compared our predicted results with the confirmed results in the USA between 01/08/2020 and 31/08/2020. The regression model shows an accuracy of 0.81. Fig. 7 shows the prediction results.

In order to evaluate the accuracy of the proposed model, we calculated the median prediction factor. Results showed that the factor varies between 0.8 and 1.2 as depicted in Fig. 8, presenting a fair model for predicting future outbreak cases.

The median factor can be calculated as follows:

$$M = \frac{P}{C} \tag{8}$$

where P refers to the predicted cases and C to the cases confirmed by legal authorities.

Big data SNS analysis can provide us with a strong tool to manage and control a pandemic such as COVID-19; people in the areas of smart city and IoT rely more on their smartphones and connected devices to inform themselves about everything, including a serious global pandemic. The data shared on SNS platforms is very meaningful as it can inform us about the user’s condition, help us understand the user’s sentiments, and predict future outbreak hotspots based on the user’s movements, locations, and symptoms. Moreover, based on the sentiment analysis performed on the SNS platform, we can detect fake news and information, as they mainly share the same vocabulary and grammatical characteristics. Detecting the suspected fake information and comparing it with the fake news databases provided by various researchers supports our filtering system in deleting those data and information from the analysis phase.

Although we only provided one example of SNS big data analysis based on only one SNS platform in this work, we strongly believe that it is the main pillar for controlling the current COVID-19 outbreak as well as future pandemics in smart cities. Researchers should make use of the openly shared data on SNS platforms and focus more on improving SNS big data analysis methods for a better understanding of users’ fear and sentiments and to provide oversight and manage serious pandemics. Our future research focal point is an automated fake news and information detection tool on SNS platforms, which will support this proposed model and provide a better data set to analyze.

4.2. Comparative analysis and discussion

Based on the above-mentioned methods and the results of the median prediction factor, we retrieve the confirmed COVID-19 cases of the last 7 days and compare them with the predicted cases of the last 14 days. Results depicted in Fig. 9 show that our proposed method can predict the next COVID-19 outbreak and hotspots approximately 7 days earlier than the government authorities. The dots represent the outbreak cluster. Following the SNS big data analysis method, authorities can benefit from an earlier prediction of COVID-19 and similar crisis situations and hotspot outbreaks, which supports virus control and provides a better understanding of people’s opinions and their tendency to follow the government guidance.

We also compared our results with other state-of-the-art outcomes. Although those papers do not use SNS information as a database to calculate and predict the virus outbreak, they use other methods based on machine learning models powered by the Grey Wolf Optimization (GWO) method, including linear, logarithmic, quadratic, cubic, compound, power, exponential, and logistic, as depicted and explained in [35]. The authors calculated the prediction results of machine learning models for USA COVID-19 cases fitted by GWO; we compare those results with our model in Table 3 and depict them visually in Fig. 10.

Based on the above-mentioned results, our model achieves a 0.989 prediction result, as depicted in Table 3, compared to other machine learning methods, which makes it suitable for COVID-19 outbreak prediction [44] and control by the authorities. Moreover, we compared the previously mentioned related works based on four key considerations: sustainability, security, availability, and integrity. Unlike other works that used Twitter as a database for COVID-19 and similar crisis outbreak prediction, our model covers the four key considerations, as depicted in Table 4.

SNS big data analysis is a critical solution that calls for more


development in order to be applied to SSHC. Using this proposed framework, we can predict future potential COVID-19 cases and outbreak clusters seven days before the confirmed cases are announced, allowing governments to apply national or regional lockdowns a few days in advance and control the virus propagation. As far as we know at the time of writing, and compared with other research, ours is the only work that combines sentiment analysis, fake news detection, deletion of fake news as well as users’ identity from the analyzed database, and prediction of the virus outbreak before the confirmed cases. The results of this research proved to be adaptable to the SSHC environment, as the model achieves a 0.989 prediction result, which contributes directly to the maintenance and sustainability of the services provided by smart cities to their citizens, thus improving the QoL and KPI of SSHC. However, we have only simulated our proposed scheme on Twitter API data. Moreover, we have analyzed tweets from one geospatial location and in one language (English). In our future work, we aim to address these limitations by analyzing data from multiple SNS platforms such as Facebook and YouTube and including more linguistic and geospatial options.

5. Conclusion

To avoid the calamity of COVID-19 during the global pandemic outbreak, an SSHC needs fast and accurate SNS big data analysis management to handle a pandemic such as COVID-19. Motivated by the advancement of these technologies and the rapid development of Social Networking Service (SNS), this paper proposed an infodemiology study to predict the epidemic outbreak and track its spread across a global shared framework. We considered Twitter as a case study for our framework as it provides researchers with an API to collect various openly shared tweets. We collected tweets from USA users during a one-month period and analyzed them using preprocessing methods including keyword extraction, redundancy removal, tokenization, lemmatization, and feature extraction. The proposed framework predicted the outbreak seven days earlier with a 0.989 indicator result, thus enhancing the control of this pandemic situation; it aids in reducing the infection cases by predicting potential virus carriers several days earlier and supports understanding of users’ fear and sentiment and their tendency to follow authorities’ regulations. SNS big data are a tool that should be used in future research as they provide us with an oversight of the situation, help control a serious pandemic such as COVID-19, and help maintain the KPI in SSHC by improving the healthcare services provided to its citizens. Future research will cover the limitations of this work and address the dilemma of analyzing multiple SNS platforms across different linguistic and geospatial options.

Declaration of Competing Interest

The authors report no declarations of interest.

Acknowledgments

This work was supported by the National Research Foundation of Korea (NRF) with a grant funded by the Korea government (NRF-2019R1A2B5B01070416).

References

[1] World Health Organization. (2020). Archived: WHO Timeline- COVID-19 (Last Accessed 22 February 2021) https://www.who.int/news-room/detail/27-04-2020-who-timeline---covid-19.
[2] Park, J. H., Rathore, S., Singh, S. K., Salim, M. M., Azzaoui, A. E. L., Kim, T. W., Pan, Y., & Park, J. H. (2021). A Comprehensive Survey on Core Technologies and Services for 5G Security: Taxonomies, Issues, and Solutions. Human-centric Computing and Information Sciences, 11(3). https://doi.org/10.1109/HCIS.2021.11.003
[3] Smiciklas, John, Prokop, Gundula, Stano, Pawel, & Sang, Ziqin (2021). United for Smart Sustainable Cities (U4SSC) initiative report. Collection Methodology for Key Performance Indicators for Smart Sustainable Cities, CBD, ECLAC, FAO, ITU, UNDP, UNECA, UNECE, UNESCO, UN Environment, UNEP-FI, UNFCCC, UN-Habitat, UNIDO, UNU-EGOV, UN-Women and WMO.
[4] Das, D., & Zhang, J. J. (2020). Pandemic in a smart city: Singapore’s COVID-19 management through technology & society. Urban Geography, 1–9. https://doi.org/10.1080/02723638.2020.1807168
[5] Megahed, N. A., & Ghoneim, E. M. (2020). Antivirus-built environment: Lessons learned from Covid-19 pandemic. Sustainable Cities and Society, 61, Article 102350. https://doi.org/10.1016/j.scs.2020.102350
[6] Badshah, A., Ghani, A., Qureshi, M. A., & Shamshirband, S. (2019). Smart security framework for educational institutions using internet of things (IoT). Computers, Materials & Continua, 61(1), 81–101. https://doi.org/10.32604/cmc.2019.06288
[7] Singh, S. K., Pan, Y., & Park, J. H. (2021). OTS Scheme Based Secure Architecture for Energy-Efficient IoT in Edge Infrastructure. Computers, Materials & Continua, 66(3), 2905–2922.
[8] Jha, S., Nkenyereye, L., Joshi, G. P., & Yang, E. (2020). Mitigating and Monitoring Smart City Using Internet of Things. Computers, Materials & Continua, 65(2), 1059–1079. https://doi.org/10.32604/cmc.2020.011754
[9] Ellis, D. A., Davidson, B. I., Shaw, H., & Geyer, K. (2019). Do smartphone usage scales predict behavior? International Journal of Human-Computer Studies, 130, 86–92. https://doi.org/10.1016/j.ijhcs.2019.05.004
[10] Brunborg, G. S., & Andreas, J. B. (2019). Increase in time spent on social media is associated with modest increase in depression, conduct problems, and episodic heavy drinking. Journal of Adolescence, 74, 201–209. https://doi.org/10.1016/j.adolescence.2019.06.013
[11] Impact of Covid-19 pandemic on social media, (Last Accessed 2 March 2021) https://en.wikipedia.org/wiki/Impact_of_the_COVID19_pandemic_on_social_media#Increase_in_usage.
[12] Taylor, D. (2020). COVID-19: Social media use goes up as country stays indoors. Victoria News (Last Accessed 28 February 2021) https://www.vicnews.com/news/covid-19-social-media-use-goes-up-as-country-stays-indoors/.
[13] Kemp, S. (2020). Digital 2020: 3.8 Billion people use social media (Last Accessed 28 February 2021) https://wearesocial.com/digital-2020.
[14] Chung, R. Y. N., & Li, M. M. (2020). Anti-Chinese sentiment during the 2019-nCoV outbreak. The Lancet, 395(10225), 686–687. https://doi.org/10.1016/S0140-6736(20)30358-5
[15] Zarocostas, J. (2020). How to fight an infodemic. The Lancet, 395(10225), 676. https://doi.org/10.1016/S0140-6736(20)30461-X
[16] Eysenbach, G. (2009). Infodemiology and infoveillance: framework for an emerging set of public health informatics methods to analyze search, communication and publication behavior on the Internet. Journal of Medical Internet Research, 11(1), e11. https://doi.org/10.2196/jmir.1157
[17] Horvitz, E., & Mulligan, D. (2015). Data, privacy, and the greater good. Science, 349(6245), 253–255. https://doi.org/10.1126/science.aac4520
[18] Hu, Z., Yang, Z., Li, Q., Zhang, A., & Huang, Y. (2020). Infodemiological study on COVID-19 epidemic and COVID-19 infodemic. https://doi.org/10.21203/rs.3.rs-18591/v1
[19] Park, H. W., Park, S., & Chong, M. (2020). Conversations and medical news frames on twitter: Infodemiological study on covid-19 in south korea. Journal of Medical Internet Research, 22(5), Article e18897. https://doi.org/10.2196/18897
[20] Oxford Analytica. (2020). Misinformation will undermine coronavirus responses. Emerald Expert Briefings. https://doi.org/10.1108/OXAN-DB250989 (oxan-db).
[21] Sasaki, Y., Kawai, D., & Kitamura, S. (2015). The anatomy of tweet overload: How number of tweets received, number of friends, and egocentric network density affect perceived information overload. Telematics and Informatics, 32(4), 853–861. https://doi.org/10.1016/j.tele.2015.04.008
[22] Sayce, D. (2020). The Number of Tweets per day in 2020 (Last Accessed 28 February 2021) https://www.dsayce.com/social-media/tweets-day/.
[23] Sun, C., & Zhai, Z. (2020). The efficacy of social distance and ventilation effectiveness in preventing COVID-19 transmission. Sustainable Cities and Society, 62, Article 102390. https://doi.org/10.1016/j.scs.2020.102390
[24] Yoo, W. H., Choi, D. H., & Park, K. H. (2016). The effects of SNS communication: How expressing and receiving information predict MERS-preventive behavioral intentions in South Korea. Computers in Human Behavior, 62, 34–43. https://doi.org/10.1016/j.chb.2016.03.058. ISSN 0747-5632.
[25] Shahi, G. K., & Nandini, D. (2020). FakeCovid–A Multilingual Cross-domain Fact Check News Dataset for COVID-19. arXiv preprint arXiv:2006.11343. https://doi.org/10.36190/2020.14
[26] Shahi, G. K., Dirkson, A., & Majchrzak, T. A. (2020). An exploratory study of covid-19 misinformation on twitter. arXiv preprint arXiv:2005.05710. https://doi.org/10.36190/2020.14
[27] Massaad, E., & Cherfan, P. (2020). Social media data analytics on telehealth during the COVID-19 pandemic. Cureus, 12(4). https://doi.org/10.7759/cureus.7838
[28] Arianto, D., & Pui, N. (2020). Social media analysis: Utilization of social media data for research on COVID-19, 2020.
[29] Jia, P., & Yang, S. (2020). Publisher Correction: China needs a national intelligent syndromic surveillance system. Nature Medicine, 26(7), 1149. https://doi.org/10.1038/s41591-020-0977-2
[30] Htet, H., & Myint, Y. (2018). Social Media (Twitter) Data Analysis using Maximum Entropy Classifier on Big Data Processing Framework (Case Study: Analysis of Health Condition, Education Status, State of Business). Journal of Pharmacognosy and Phytochemistry, 7, 695–700.
[31] Barbosa, L., & Feng, J. (2010). Robust sentiment detection on twitter from biased and noisy data. August. Coling 2010: Posters (pp. 36–44).
[32] Lin, Y. (2021). Twitter Statistics: 10 Twitter Statistics You Need to Know in 2021 (Last Accessed 26 February 2021) https://www.oberlo.com/blog/twitter-statistics.


[33] Al-Khafaji, H. K., & Habeeb, A. T. (2017). Efficient algorithms for preprocessing and stemming of tweets in a sentiment analysis system. Journal of Computer Engineering, 19(3), 44–50. https://doi.org/10.9790/0661-1903024450
[34] Singh, V., & Saini, B. (2014). An Effective tokenization algorithm for information retrieval systems. Haryana, India: Department of Computer Engineering, National Institute of Technology Kurukshetra. https://doi.org/10.5121/csit.2014.4910
[35] Kharde, V., & Sonawane, P. (2016). Sentiment analysis of twitter data: a survey of techniques. arXiv preprint arXiv:1601.06971. https://doi.org/10.5120/ijca2016908625
[36] Ardabili, Sina F., MOSAVI Amir, Ghamisi, Pedram, Ferdinand, Filip, Koczy, Annamaria R. Varkonyi, Reuter, Uwe, Rabczuk, Timon, & Atkinson, Peter M. (2020). medRxiv. https://doi.org/10.1101/2020.04.17.20070094, 04.17.20070094.
[37] Lee, Y., Rathore, S., Park, J. H., & Park, J. H. (2020). A blockchain-based smart home gateway architecture for preventing data forgery. Human-centric Computing and Information Sciences, 10(1), 1–14. https://doi.org/10.1186/s13673-020-0214-5
[38] Jo, J. H., Sharma, P. K., Sicato, J. C. S., & Park, J. H. (2019). Emerging technologies for sustainable smart city network security: Issues, challenges, and countermeasures. Journal of Information Processing Systems, 15(4), 765–784. https://doi.org/10.3745/JIPS.03.0124
[39] Singh, S. K., Jeong, Y. S., & Park, J. H. (2020). A deep learning-based IoT-oriented infrastructure for secure smart city. Sustainable Cities and Society, 60, Article 102252. https://doi.org/10.1016/j.scs.2020.102252
[40] Park, J. S., & Park, J. H. (2020). Future Trends of IoT, 5G Mobile Networks, and AI: Challenges, Opportunities, and Solutions. Journal of Information Processing Systems, 16(4), 743–749. https://doi.org/10.3745/JIPS.03.0146
[41] Park, J. H., Salim, M. M., Jo, J. H., Sicato, J. C. S., Rathore, S., & Park, J. H. (2019). CIoT-Net: a scalable cognitive IoT based smart city network architecture. Human-centric Computing and Information Sciences, 9(1), 1–20. https://doi.org/10.1186/s13673-019-0190-9
[42] Anderson, K., Smita, L., & Garrett, D. (2012). Social Media “Likes” Healthcare: From Marketing to Social Business (Last Accessed 26 February 2021) (pp. 1–40). PwC Health Research Institute https://adindex.ru/files2/access/2013_06/99606_tpc-health-care-social-media-report.pdf.
[43] Roser, M., Ritchie, H., Ortiz-Ospina, E., & Hasell, J. (2020). Coronavirus pandemic (COVID-19). Our world in data [Online Resource] https://ourworldindata.org/coronavirus.
[44] Ardabili, S. F., Mosavi, A., Ghamisi, P., Ferdinand, F., Varkonyi-Koczy, A. R., Reuter, U., … Atkinson, P. M. (2020). Covid-19 outbreak prediction with machine learning. Algorithms, 13(10), 249. https://doi.org/10.1101/2020.04.17.20070094
[45] How fake news has exploited COVID-19, https://www.pwc.co.uk/issues/crisis-and-resilience/covid-19/how-fake-news-has-exploited-covid19-cyber.html.
[46] Cha, J., Singh, S. K., Kim, T. W., & Park, J. H. Blockchain-empowered cloud architecture based on secret sharing for smart city. Journal of Information Security and Applications, 57, 102686. https://doi.org/10.1016/j.jisa.2020.102686.
[47] Salim, M. M., Rathore, S., & Park, J. H. (2019). Distributed denial of service attacks and its defenses in IoT: a survey. The Journal of Supercomputing, 1–44. https://doi.org/10.1007/s11227-019-02945-z
