Professional Documents
Culture Documents
Abstract The delivery of unbiased news articles in the healthcare sector is one
of the prominent problems in the fight against unvaccinated individuals as Covid-19
causes great skepticism among many groups of people. Companies such as Facebook
have already integrated AI models for context to content that is rated by third-party
fact-checkers to detect misinformation (“Fonctionnement du programme de vérifi-
cation tierce de Facebook,” Fonctionnement du programme de vérification tierce de
Facebook. https://www.facebook.com/journalismproject/programs/third-party-fact-
checking/how-it-works?locale = fr_FR (accessed Sep. 06, 2021)). In this paper we
use Natural Language Processing (NLP) and Sentiment Analysis to derive the content
of news articles, an API is integrated to gather news articles from various sources
using the newsapi (“News API – Search News and Blog Articles on the Web,” News
API. https://newsapi.org (accessed Oct. 28, 2021)). Applying VADER, Text Blob,
and Flair rule-based lexicons, we create a hybrid approach from the lexicons and
combine each method, we then present a novel Fuzzy-Logic Lexicon Mamdani Rule-
Base Multi Inference System (FLLMRBMIF) that can generate a final sentiment from
each output of polarity, we then classify into a positive, neutral and negative result.
The results demonstrate that it is possible to integrate a tool to classify the sources
in real-time allowing more insightful information on biased news stories.
J. N. Morden
School of Computer Science and Informatics, De Montfort University, Leicester, UK
e-mail: P2433071@alumni365.dmu.ac.uk
A. S. Khuman (B)
School of Computer Science and Informatics, Institute of Artificial Intelligence (IAI), Leicester,
UK
e-mail: arjab.khuman@dmu.ac.uk
A. Fasanmade · M. Muhammad
School of Computer Science and Informatics, Cyber Technology Institute (CTI), Leicester, UK
e-mail: adebamigbe.fasanmade@dmu.ac.uk
M. Muhammad
e-mail: musa.muhammad@coventry.ac.uk
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 215
T. Chen et al. (eds.), Artificial Intelligence in Healthcare, Brain Informatics and Health,
https://doi.org/10.1007/978-981-19-5272-2_11
216 J. N. Morden et al.
1 Introduction
Over the last two decades, a sizable anti-vaccination movement has developed,
resulting in a decrease in herd immunity among the general population. The fact
that websites represent a range of concerns regarding vaccine safety and varying
degrees of distrust in medicine has been noted before in this context. These kinds of
websites rely heavily on emotional appeal to convey their message, and they have
previously been recognized as a source of concern for getting reliable information
[3]. Previous research has identified several reasons why people in other countries are
hesitant to be immunized against COVID-19, including concerns about efficacy and
safety, prior exposure to COVID-19, concerns about the vaccine’s novelty, religious
beliefs, a lack of trust, a dislike for needles, a belief that COVID-19 is not a risk, a
preference for natural immunity, and being infected. [4–7]. On the other hand, as a
result of the proliferation of viral social media news stories, celebrities have emerged
as a major source of information. When the pandemic of COVID-19 started, it precip-
itated one of the most divisive vaccination projects in modern history. The World
Health Organization labeled the COVID-19 issue an “infodemic” as early as February
2020, stressing the need of improving accurate communications and building local
confidence in public health institutions in order to fight the pandemic. The critical
role of tailored messaging and delivery strategies in increasing vaccination coverage
has been shown [8, 9]. Whether local governments make it easy and enjoyable for
residents to get vaccinated at local bars, or partner with trusted and community-based
organizations on roll-out and messaging campaigns, this report outlines a variety of
ways sentiment data can assist local policymakers in assessing and improving their
roll-out and messaging strategies to increase resident vaccination rates.
The word sentiment has at least two different meanings in common English. It is
often used to refer to (a) an emotion or anything emotionally charged, as well as
(b) a particular subjective perspective. Sentiment Analysis commonly referred to
as opinion-mining is a collection of methods for identifying opinion, sentiment,
and subjectivity in text [3–5]. Sentiment analysis is usually regarded a branch of
computational linguistics and, to a lesser degree, computer science, although it is
most closely related to computational linguistics. Social science, on the other hand,
views sentiment analysis as a discipline, not a method. Information extraction is
an aspect of computational and information science that deals with compressing,
summarizing, and making inferences from collections of texts [6–8].
Sentiment analysis in a computer sense gained popularity in the early 2000s within
computational linguistics [9]. Research on subjectivity in the wider artificial intelli-
gence paradigm goes back to Lotfi Zadeh’s 1965 explanation of Fuzzy Logic, which
dates to the 1960s [10] He believed that conventional computer logic could not deal
A Fuzzy Logic Approach to a Hybrid Lexicon-Based Sentiment Analysis … 217
The objective of this study is to develop a model that can analyse the content of
news articles from numerous sources around the globe, moreover, to be able to
analyse how biased a news article is compared to another. This model would then
form the basis of a computer application which an individual or organisation could
use in confidence, furthermore, to develop a timely, and strong representation of
perceptions on healthcare news articles, rather than focusing on scaremongering
media. The objective of this paper is to analyse the sentiment of the given discourse
using Fuzzy logic. The identification and characterization of sentiment as positive,
negative and neutral are tackled in this paper. The proposed framework can be seen
below in Fig. 1.
Each sentiment analysis model uses its own custom rule-base or framework to
determine what sentiment tag is given, these tags are; positive, negative or neutral.
An output of the final sentiment from each sentiment analysis model is fed into fuzzy
218 J. N. Morden et al.
Rule Base
Data Pre
Fuzzification DeFuzzification Post Processing
Processing
Inference Engine
News API
Sentiment Analysis
Vadar
TextBlob
Flair
Let us assume the inputs are crisp measurements from measuring equipment, rather
than linguistic. A pre-processor, conditions the measurements before they enter the
controller. Examples of pre-processing are quantization in connection with sampling
or rounding to integers; A quantizer converts an inbound measurement to fit it to a
discrete universe.
• normalization or scaling onto a particular, standard range;
• filtering in order to remove noise;
• averaging to obtain long-term or short-term tendencies;
• a combination of several measurements to obtain key indicators; and
• differentiation and integration, or their approximations in discrete time.
A Fuzzy Logic Approach to a Hybrid Lexicon-Based Sentiment Analysis … 219
Data gathered from API’s require preprocessing to remove noise before the opinion
mining process can be performed. These news articles usually contain mistakes in
spelling, grammar, use of non-dictionary words such as abbreviations or acronyms of
common terms, mistakes in punctuation, incorrect capitalization etc., furthermore,
URL cleaning due to special characters need to be removed. These mistakes in
the dataset make preprocessing step are essential. In this stage tasks such as spell
error correction using a standard word processor, sentence boundary selection and
repetitive punctuation conflation is done. Preprocessing steps generate sentences
which can be parsed automatically by the linguistic parser. Moreover, each source
from the API will use the coronavirus keyword, thus, the only articles featured will
be surrounding that topic.
This is a pragmatic method to text analysis that does not need training or the use of
machine learning models. This technique produces a set of guidelines by which the
text is classified as positive/negative/neutral. Additionally, these rules are referred to
as lexicons. As a result, the rule-based method is referred to as the Lexicon-based
approach. We utilized the Vadar, Text Blob, and Flair lexicons since they are often
used for sentiment analysis.
(a) Vadar
The Vadar compound score is calculated by adding the valence ratings of each word
in the lexicon, adjusting them according to the criteria, and then normalising them
to fall between −1 (the most extreme negative) and +1 (the most extreme positive)
(most extreme positive). This is the most relevant metric if a single unidimensional
assessment of a sentence’s emotion is required. As outlined below in Fig. 2.
(b) TextBlob
TextBlob returns a sentence’s polarity and subjectivity. Polarity falls between [−1,
1], where −1 represents a negative sentiment and 1 represents a positive one.
Negative expressions invert the polarity. TextBlob has semantic markers that facil-
itate detailed analysis. For instance, emoticons, the exclamation mark, and emojis.
Subjectivity falls between 0 and 1. Subjectivity measures the proportion of personal
opinion and factual information in a writing. Greater subjectivity indicates that the
content includes more personal opinion than objective facts. Using the same state-
ment aforementioned previously we can see the TextBlob score represented below
in Fig. 3.
220 J. N. Morden et al.
(c) Flair
label. Flair outputs both the emotion label (positive/negative) and the prediction’s
polarity (how strongly polarised the statement is, between 0 and 1). See Fig. 4.
7 Fuzzification
The first block inside the FLLMRBMIF controller is fuzzification interface, which
is a lookup in the membership functions to derive the membership values. The
fuzzification block of FLLMRBMIF therefore evaluates the input measurements
according to the conjectures of the rules. Each congecture produces a membership
grade expressing the degree of fulfilment of the premise.
We consider the following important factors and use them to determine the
fuzzification interface for the controller.
(1) the specific membership functions—in our case three membership functions are
considered
(2) the input universe of discourse for number of fuzzy sets defined—the input
universe of discourse is [−1,1].
The Sentiment analysis score is computed with the consensus value that will need
to be calculated. The compound score is the sum of positive, negative & neutral scores
which is then normalized between −1(most extreme negative) and +1 (most extreme
positive). The more Compound score closer to +1, the higher the positivity of the
text. This is the most useful metric if you want a single unidimensional measure of
sentiment for a given sentence. An example of the compound score (cs) is computed
using the following equation.
x
x=√ (1)
x2 +α
222 J. N. Morden et al.
W N ,I
N or m I,N = , (3)
Sour ces N
where Sour ces N = the number of sources for news source N. Essentially, we can
define the fuzzification mapping can be denoted as F:I → A.
9 Functionality Block
This section describes the rules base, membership functions and inference engine.
10 Rule-Base
A rule allows for several variables both in the premise and the conclusion. A controller
can therefore be multi-input–multi-output (MIMO) or single-input–single-output
(SISO). The typical SISO controller regulates a control signal according to an error
signal. A controller may apply the error, the change in error, and the integral error,
but we will still call it SISO control, because the inputs are based on a single feedback
loop. This section assumes that the control objective is to regulate a plant output
around a prescribed setpoint (reference) using a SISO controller.
A Fuzzy Logic Approach to a Hybrid Lexicon-Based Sentiment Analysis … 223
A linguistic controller contains rules in the if–then format, but they can appear in
other formats. Matlab’s Fuzzy Logic Toolbox presents the rules to the end-user in a
format like the one in Fig. 5.
11 Membership Functions
The final step in creating a fuzzy logic model is to transform the linguistic variables
into membership functions that describe the variable in a fuzzy way. Each member-
ship function represents a linguistic variable and defines a fuzzy set that is associated
with a particular parameter. Once the fuzzy set is defined, the next step is to define the
if–then rules that describe how the fuzzy sets interact. Fuzzy operators are introduced
to create connections between the fuzzy sets, namely max (which represents and) and
min (which represents or). On completion of the connections, all the propositions
are aggregated into the final fuzzy set. This fuzzy set releases a fuzzy number, and
this is transformed into a crisp value, through different techniques. See Figs. 6 and 7.
224 J. N. Morden et al.
12 Multi-inference System
Figure 8 is a graphical construction of the inference, where each of the nine rows
represents one rule. Consider, for instance, the first row: if the error is negative
(row 1, column 1) and the change in error is negative (row 1, column 2) then the
control action should be negative big (row 1, column 3). The chart corresponds to
the rule base (3.2). Since the controller combines the error and the change in error,
the controller is a fuzzy version of a proportional-derivative (PD) controller.
A Fuzzy Logic Approach to a Hybrid Lexicon-Based Sentiment Analysis … 225
Fig. 8 PD controller
The instances of the error and the change in error are indicated by the vertical
lines through the first and second columns of the chart. For each rule, the inference
engine looks up the membership value where the vertical line intersects a membership
function.
13 Defuzzification
The resulting fuzzy set µA must be converted to a single number in order to form
a control signal to the plant. This is defuzzification. In Fig. 8 the defuzzified control
signal is the x-coordinate marked by the red thick vertical line overlayed on the
combined output set. Several defuzzification methods exist.
14 Cogs
The crisp control value uCOG is the abscissa of the centre of gravity of the fuzzy set.
For discrete sets, its name is centre of gravity for singletons, COGS. The expression
is the membership weighted average of the elements of the set. For continuous sets,
replace summations by integrals and call it COG. The method is much used, although
its computational complexity is relatively high.
226 J. N. Morden et al.
15 Boa
The bisector of area method, BOA, finds the abscissa x of the vertical line that
partitions the area under the membership function into two areas of equal size. For
discrete sets, uBOA is the abscissa xj that minimizes. Its computational complexity
is relatively high. There may be several solutions xj.
16 Mom
An intuitive approach is to choose the point of the universe with the highest member-
ship. Several such points may exist, and it is common practice to take the mean of
maxima (MOM).
The fuzzy system, for example, must choose either negative or positive to decide on
the appropriateness of a news item; thus, the defuzzifier must choose one (Negative)
or the other (Positive), not something in between. These defuzzification methods
are indifferent to the shape of the fuzzy set, but the computational complexity is
relatively small.
Each source will now contain an output between extremely negative and extremely
positive, the problem is that there may not be the same amounted of sources therefore
if we compute each extremely negative source from the BBC-News it may differ
from the independent news source, therefore each sentiment is then computed and
normalised by dividing each sentiment from each source by the total number of
sources in the dataset.
Let hgt(A) = supμ A (x) be the highest membership degree of A and let x ∈ A :
x∈A
μ A (x) = hgt(x) be the set of domain elements with degree of membership equal to
hgt(A). Then the first of Maxima is given by lnf {x ∈ A : μ A (x) = hgt(x)} and the
x∈A
last of maxima is given by Sup{x ∈ A : μ A (x) = hgt(x)}. Middle of maxima MoM
x∈A
is very similar to first of maxima or last of maxima instead of determining y to be
the first or last from all values when S has the maximal membership degree. This
method takes the average of these two values formally it is given as:
17 Post Processing
The final output will be a value of each news source that has combined all three
methodologies in a fair and balanced way, therefore, a value of news source will
then be further analysed in the Fuzzy Logic Framework to compute a sentiment that
handles the subjectivity of the value that is between −1 and 1.
A Fuzzy Logic Approach to a Hybrid Lexicon-Based Sentiment Analysis … 227
⎧
⎨ Positive, y ≥ 0.45
FBiasssness = Negative, y ≤ −0.45 (5)
⎩
Neutral, otherwise
18 Conclusion
References