
Fuzzy Logic and Genetic Algorithm Based Text Classification for Twitter

Dr. Yossra Hussain Ali, Dr. Nuha Jameel Ibrahim, Mohammed Abdul Jaleel
Computer Science Department, University of Technology, Baghdad, Iraq
Abstract
Social media are modern web-based applications for communication and interaction between people through audio, written, and video messages. These platforms build and sustain living communities around the world, and people share their interests and activities through them. Twitter is a social media site where people communicate through tweets, a service that lets users keep in touch through the exchange of quick, frequent messages. People publish tweets on their profiles and send them to their followers to express their thoughts and opinions about events in the world, so it is important to study and categorize these tweets. This article applies fuzzy logic and a genetic algorithm to the problem of text classification. The inputs of the classification system are a set of features extracted from a tweet, and its output is a classification decision: the degree of correlation of the tweet to an appointed event, i.e. whether it is relevant or irrelevant to that event. The results are compared with a keyword search method and a fuzzy logic based method using the incremental rate and the correction rate. For the incremental rate, the proposed system extracts more tweets than these methods; on dataset 1 it extracts 160 tweets, while the other two methods extract 98 and 141. The correction rate of the proposed system is 98.75%, against 97.9% and 95.7% for the other methods.

Keywords: Social media; Text classification; Fuzzy logic; Genetic algorithm

1. Introduction
Social media is a convenient space for people to express their opinions on certain events and to communicate with each other. Tweets contain features related to users' thoughts about those events, and it is important to categorize and select them. Tweets also contain data that reduce the usefulness of what is extracted and degrade the classification process [1].

Text classification is a central problem for various applications such as sentiment analysis, smart replies, and spam identification. It has been studied widely in recent years, and different techniques have been used to tackle it. Text classification aims to assign records to one or more categories [2]. Most strategies rely on representing a text as a vector that contains the frequency of each word in the content. Several strategies are used for text classification, for instance Naïve Bayes, keyword search, and fuzzy logic [3].

Fuzzy logic is a strategy for handling indeterminate data and manages ambiguity well. It approximates human thought and is close to natural language [4]. Computing with words is a vital element of fuzzy logic, so fuzzy logic is a good strategy for linguistic problems. Given the non-crisp and unstructured nature of natural languages such as English, fuzzy logic can therefore be used for text classification [5].
A genetic algorithm is a search method modeled on the mechanics of natural selection rather than on simulated reasoning. It imitates the evolutionary processes that occur in nature and uses operators such as mutation, selection, recombination, and crossover of chromosomes as its components of adaptation. The most commonly implemented genetic operators are crossover, mutation, and reproduction [6].

In this paper, we propose to classify tweets from Twitter data. A set of features is extracted from each tweet; these features are the inputs of a classification procedure based on fuzzy logic and a genetic algorithm that classifies the tweet according to its relevance.
2. Related Works
The problem of text classification is the objective of many research articles; below we review some of these works. Caragea et al., 2014 [7] performed a sentiment analysis of Twitter posts at the time Hurricane Sandy happened and visualized those sentiments on a map with the hurricane at its center. They suggested a sentiment classification approach that uses the SentiStrength algorithm in combination with SVMs and a Naïve Bayesian classifier. Salari et al., 2014 [8] proposed a classification model using kNN algorithms, genetic algorithms, and ANNs. The aim is to obtain the optimal array of properties: first, feature-ranking approaches such as Fisher's discriminant ratio and class separability criteria are used to prioritize the features; the resulting arrays of top-ranked properties are then used as the initial population of a genetic algorithm that produces optimal feature arrays. Spielhofer et al., 2016 [9] suggested that the problems of irrelevant data elimination and noise removal are similar to filtering e-mail spam, and trained a Naïve Bayesian classifier for relevant data detection. Jiang et al., 2016 [10] introduced an enhanced approach, referred to as deep feature weighting Naïve Bayes, that uses Maximum Likelihood Estimation to calculate the prior and conditional probabilities. Sathe et al., 2017 [11] suggested a sentiment classification algorithm using fuzzy logic combined with a neural network.

3. Proposed Work
The proposed work passes through several phases.

3.1 Data Collection and Preprocessing
Basic data are collected from Twitter using the Twitter Application Programming Interface (API). The data cover the period from 10/27/2012 to 11/7/2012, with the Hurricane Sandy event as the case study. A set of 1000 tweets taken from the initial data is chosen as training data; it is used to extract the more than 50 most frequently used words and to derive additional fuzzy rules used in the inference phase. Each tweet is labeled with one or zero to indicate its relevance. In addition, each tweet has an accumulated score in the interval from 0 to 15, and four score ranges describe its level of relevance: irrelevance L1 [0, 5), low relevance L2 [5, 9), moderate relevance L3 [9, 12), and high relevance L4 [12, 16]. A set of 1002 tweets is taken from the initial data as testing data.
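As a small illustration, the score ranges above map directly onto a labeling helper. This is a minimal Python sketch under our own naming; the paper does not prescribe an implementation:

def relevance_level(score):
    """Map an accumulated tweet score in [0, 15] to a relevance level."""
    if 0 <= score < 5:
        return "L1 (irrelevance)"
    if 5 <= score < 9:
        return "L2 (low relevance)"
    if 9 <= score < 12:
        return "L3 (moderate relevance)"
    if 12 <= score <= 16:
        return "L4 (high relevance)"
    raise ValueError("score outside the expected range")

print(relevance_level(2.0))    # L1 (irrelevance)
print(relevance_level(13.5))   # L4 (high relevance)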
Preprocessing is applied to the data in order to obtain satisfactory results. A tweet text is at most 140 characters long, and tweets contain data that are not useful for categorizing text, such as URLs (global web page addresses), labels, numbers, and stop words, for example 'Hurricane Sandy! #Hurricane (Bonnier) http://twittter.com'. It is important to remove these additions, or to normalize them, so that they do not affect the classification process. This step includes many internal operations, such as hashtag handling, URL removal, special character removal, removal of other additions, tokenization, stop word removal, stemming, lemmatization, and Part of Speech (POS) tagging. Tweets are the input of the preprocessing step, and its output is a sequence of important words that is used in the feature extraction step.
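The preprocessing operations listed above can be sketched as follows. This is a minimal, illustrative Python sketch: the tiny stop-word list and the crude suffix stripper stand in for the full stop-word list, stemmer, lemmatizer and POS tagger used in the actual system, and all names are ours:

import re

STOP_WORDS = {"in", "and", "the", "a", "an", "of", "to", "is", "are"}   # tiny illustrative list

def crude_stem(word):
    # Very rough stand-in for a real stemmer (e.g. Porter's).
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def preprocess(tweet):
    text = re.sub(r"http\S+", " ", tweet)                 # remove URLs
    text = re.sub(r"#(\w+)", r"\1", text)                 # keep the hashtag word, drop '#'
    text = re.sub(r"[^A-Za-z\s]", " ", text)              # drop numbers and special characters
    tokens = [t.lower() for t in text.split()]            # tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]   # stop word removal
    return [crude_stem(t) for t in tokens]                # (crude) stemming

print(preprocess("1454 Fires Reported In #NewJersey and #NewYork City http://twittter.com"))

The output of this step feeds the feature extraction phase described next.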
3.2 Feature Extraction Phase
Text feature extraction is the process of converting the list of tokens into a feature vector. Table (1) gives an example of feature extraction; list D contains the more than 50 most frequently used words.

Table (1). Example of feature extraction
Tweet: 1454 Fires Reported In #NewJersey and #NewYork City
Tweet after preprocessing: [fire, report, new, jersey, new, york, city]
Feature vector of the tweet: [('new', 2, 0.2857), ('fire', 1, 0.1428), ('report', 1, 0.1428), ('jersey', 1, 0.1428), ('york', 1, 0.1428), ('city', 1, 0.1428), 5, 3, 1, 0, 2.0, 7, 2, 0.2857, 0.2857]
Components of the feature vector: [('word', word's count in the tweet, word's score in the tweet (Gj)), number of Hurricane Sandy words (Sw), words used most frequently in list D (Z1), words used moderately in list D (Z2), words used less commonly in list D (Z3), score of the tweet (Kj), length of the tweet (Nj), number of frequently used words in the tweet (Mj), weight of the tweet (Wj), weight of the frequently used words in the tweet (Xj)]
Word counts in the tweet: new = 2, fire = 1, report = 1, jersey = 1, york = 1, city = 1
Gj = max(1 <= i <= n) Si = 0.2857
Sw = 5
Z1 (words used most frequently in list D) = 3
Z2 (words used moderately in list D) = 1
Z3 (words used less commonly in list D) = 0
Kj = sum of Si = 2.0
Nj (number of words) = 2 + 1 + 1 + 1 + 1 + 1 = 7
Mj (number of tweet words that also appear in list D) = 2
Wj = Kj / Nj = 2.0 / 7 = 0.2857
Xj = Mj / Nj = 2 / 7 = 0.2857
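The features of Table (1) can be computed along the following lines. This is a hedged Python sketch: the per-word scores, the three frequency bands of list D and the Hurricane Sandy word list below are hypothetical placeholders (the real ones come from the training data), and the pattern-count feature V of Table (2) is omitted:

from collections import Counter

# Hypothetical placeholders; the real lists and scores are learned from the training data.
WORD_SCORES = {"fire": 0.5, "report": 0.3, "new": 0.4, "jersey": 0.2, "york": 0.2, "city": 0.4}
LIST_D = {"fire": "high", "new": "high", "report": "moderate", "storm": "high"}  # frequency band in list D
SANDY_WORDS = {"fire", "flood", "storm", "power", "damage"}

def extract_features(tokens):
    counts = Counter(tokens)
    word_scores = [WORD_SCORES.get(w, 0.0) * c for w, c in counts.items()]  # per-word scores Si
    nj = sum(counts.values())                                   # Nj: length of the tweet in words
    kj = sum(word_scores)                                       # Kj: score of the tweet
    gj = max(word_scores) if word_scores else 0.0               # Gj: largest word score in the tweet
    sw = sum(c for w, c in counts.items() if w in SANDY_WORDS)  # Sw: Hurricane Sandy words
    z1 = sum(1 for w in counts if LIST_D.get(w) == "high")      # Z1: most frequently used list D words
    z2 = sum(1 for w in counts if LIST_D.get(w) == "moderate")  # Z2: moderately used list D words
    z3 = sum(1 for w in counts if LIST_D.get(w) == "low")       # Z3: less commonly used list D words
    mj = sum(1 for w in counts if w in LIST_D)                  # Mj: tweet words found in list D
    wj = kj / nj if nj else 0.0                                 # Wj: weight of the tweet
    xj = mj / nj if nj else 0.0                                 # Xj: weight of the frequently used words
    return {"Gj": gj, "Sw": sw, "Z1": z1, "Z2": z2, "Z3": z3,
            "Kj": kj, "Nj": nj, "Mj": mj, "Wj": wj, "Xj": xj}

print(extract_features(["fire", "report", "new", "jersey", "new", "york", "city"]))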
3.3 Classification Phase
In the classification phase, the feature vector contains eleven values for every tweet. The eleven features are used as the input and pass through three steps: a fuzzification process, an inference process, and a defuzzification process. Algorithm (1) describes these steps.

Algorithm (1): Classification phase
Input: Predefined classified training data; a feature vector of eleven values for each tweet.
Output: Classification decision.
Step 1: Generate fuzzy rules from the predefined classified training data.
Step 2: Fuzzification process
  2.1 Select the membership functions.
  2.2 For each value in the feature vector, compute the degree of membership using the membership functions.
  2.3 Map the crisp (real) input to a fuzzy set.
  2.4 Generate new membership degrees using a genetic algorithm.
Step 3: Inference process
  3.1 Write a set of IF-THEN fuzzy rules.
  3.2 Make the decision based on these fuzzy rules together with the fuzzy rules extracted in Step 1.
Step 4: Defuzzification process
  4.1 Select a defuzzification function.
  4.2 Transform the fuzzy result into a real value.
Step 5: Output the classification decision and the real value of the result.
3.3.1 Fuzzification process
The first step is fuzzification, which converts the real (crisp) inputs into fuzzy sets. The degree of membership is calculated for each element of the feature vector using membership functions. For each input and output variable, three or more linguistic values are defined, for instance low, moderate, and high.

In this work, the triangular membership function is used to calculate the membership degree of each linguistic variable because it is accurate and widely used. Table (2) shows the parameters of the eleven inputs and of the output.

Table (2). Inputs and outputs parameters
Gj, largest score of a word in a tweet, range 0-1: Very Low 0-0.36; Low 0.16-0.46; Moderate 0.26-0.56; High 0.5-0.75; Very High 0.65-1
Kj, score of a tweet, range 0-20: Very Low 0-2.5; Low 2-7; Moderate 4-10; High 7-15; Very High 10-20
Nj, length of a tweet, range 0-20: Low 0-7; Moderate 5-14; High 12-20
Mj, number of frequently used words in a tweet, range 0-10: Low 0-3; Moderate 2-7; High 4-10
Wj, weight of a tweet, range 0-1: Very Low 0-2.26; Low 0.2-0.4; Moderate 0.3-0.6; High 0.55-0.8; Very High 0.7-1
Xj, weight of frequently used words in a tweet, range 0-1: Low 0-0.12; Moderate 0.06-0.23; High 0.16-1
V, number of patterns in a tweet, range 0-10: Low 0-4; Moderate 3-7; High 6-10
Z1, words used most frequently in list D, range 0-20: Low 0-2; Moderate 1-5; High 4-20
Z2, words used moderately in list D, range 0-20: Low 0-2; Moderate 1-5; High 4-20
Z3, words used less commonly in list D, range 0-20: Low 0-2; Moderate 1-5; High 4-20
SW, number of words not found in list D but belonging to the Sandy words, range 0-20: Low 0-2; Moderate 1-5; High 4-20
R (output), range 0-100: Irrelevance/DK 0-40; Low Relevance 30-65; Moderate Relevance 50-85; High Relevance 75-100

A set of linguistic values and their parameters is specified for each feature. For instance, five linguistic values and their parameters are defined for the tweet score feature (Kj): very low [0, 2.5], low [2, 7], moderate [4, 10], high [7, 15], and very high [10, 20]. The membership function of each feature is then evaluated with the triangular membership function. For the example tweet of Table (1), the membership degree for the number of Hurricane Sandy words (Sw = 5) is computed with triangle(x; a, b, c) = max(min((x - a)/(b - a), (c - x)/(c - b)), 0):

triangle(5; 1, 5, 5) = max(min((5 - 1)/(5 - 1), (5 - 5)/(5 - 5)), 0) = max(min(1, 0), 0) = max(0, 0) = 0

So the membership degree is 0, and the linguistic value of the number of Hurricane Sandy words (Sw) is moderate.
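The triangular function used above can be written directly in code. In this minimal sketch we treat a degenerate side (b = a or c = b) as a shoulder with full membership, which is a common convention; the worked example instead evaluates the degenerate term (5 - 5)/(5 - 5) literally as 0, so the two do not coincide on that corner case:

def triangle(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set with feet a, c and peak b."""
    left = (x - a) / (b - a) if b > a else (1.0 if x >= a else 0.0)
    right = (c - x) / (c - b) if c > b else (1.0 if x <= c else 0.0)
    return max(0.0, min(left, right))

# Example: membership of a tweet score Kj = 6 in a "moderate" set spanning
# [4, 10], with an assumed peak at 7 (Table (2) lists only the interval).
print(round(triangle(6, 4, 7, 10), 3))   # 0.667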
After the feature extraction step, the membership degrees of every value in the feature vector are computed in the fuzzification step using the triangular membership function, and the linguistic variable of each value is determined for use in the inference process. In the same way, the membership degrees and linguistic values are calculated for the remaining feature values. The linguistic values of the features are applied to the fuzzy rules in the inference phase to obtain the final classification decision. Table (3) shows the final linguistic value and membership degree of each linguistic variable.

Table (3). Linguistic values for each linguistic variable
Linguistic variable: Sw, Z1, Z2, Z3, Kj, Nj, Mj, Xj
Linguistic value: moderate, moderate, moderate, low, low, low, moderate, high
Membership degree: 0, 0.666, 1, 0, 1, 1, 1, 1

3.3.2 Genetic algorithm (GA)
Fuzzification tuning has been added to the system using a genetic algorithm, which derives and tunes new membership degrees for the fuzzy variables in order to improve the performance of the classification system. Fuzzy logic is a useful method for classification systems, but it is difficult for human experts to determine the fuzzy sets in the fuzzification step. Performance can be enhanced by learning the membership functions instead of obtaining them from human observation: the membership functions are derived automatically, which yields new membership degrees derived from different situations.

The original membership degrees are used to produce the first chromosome, and the initial population is generated from this first chromosome by repeated application of the mutation and crossover operators. In each generation, the fitness function is evaluated for every new chromosome using the fuzzy membership functions that the chromosome represents. The most fit chromosomes are retained for the next generation; parents are then selected from the current generation, and new chromosomes are generated from these parents using mutation and crossover.

1. Data Structures
The fuzzy linguistic variables are defined by 3 or 5 fuzzy sets, according to the range of the selected feature. The standard sets [low, moderate, high] or [very low, low, moderate, high, very high] are used as the membership functions of these fuzzy sets.

A gene is a set of 6 or 10 parameters that determine the standard membership functions of one fuzzy linguistic variable, as shown in Figure (1); each membership function has two parameters. A chromosome is a set of these genes, and each gene represents a different fuzzy linguistic variable, as shown in Figure (2).

Figure (1). Gene data structure: the parameter pairs (x, y), (y, z) and (x, z) define the low, moderate and high membership functions of one variable.
Figure (2). Structure of a chromosome: one gene per fuzzy linguistic variable.
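The gene and chromosome structures of Figures (1) and (2), together with the single-point crossover and the per-gene mutation described under "GA operators" below, can be sketched as follows. The encoding details (lists of parameter pairs, the mutation scale) are our assumptions; only the crossover probability of 0.6 and the mutation probability of 0.0333 come from the paper:

import copy
import random

random.seed(0)
CROSSOVER_RATE = 0.6     # fixed crossover probability (from the paper)
MUTATION_RATE = 0.0333   # fixed mutation probability (from the paper)

# A gene: the parameter pairs of one variable's membership functions (Figure (1)).
# A chromosome: one gene per fuzzy linguistic variable (Figure (2)).
# Two illustrative chromosomes covering two variables (e.g. Nj and Mj):
chromosome_a = [[[0, 7], [5, 14], [12, 20]], [[0, 3], [2, 7], [4, 10]]]
chromosome_b = [[[0, 6], [4, 15], [11, 20]], [[0, 4], [3, 8], [5, 10]]]

def crossover(parent1, parent2):
    """Single-point crossover at a randomly chosen gene boundary."""
    if random.random() > CROSSOVER_RATE or len(parent1) < 2:
        return copy.deepcopy(parent1), copy.deepcopy(parent2)
    point = random.randrange(1, len(parent1))
    return (copy.deepcopy(parent1[:point] + parent2[point:]),
            copy.deepcopy(parent2[:point] + parent1[point:]))

def mutate(chromosome, scale=0.1):
    """When a gene mutates, all of its parameters change by a proportion of their current value."""
    for gene in chromosome:
        if random.random() < MUTATION_RATE:
            for pair in gene:
                pair[:] = [value * (1 + random.uniform(-scale, scale)) for value in pair]
    return chromosome

child1, child2 = crossover(chromosome_a, chromosome_b)
print(mutate(child1))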
2. The fitness function
The fitness function in this work depends on the similarity of rule sets. Similarity is defined on two fuzzy rule sets and on two fuzzy association rules. An association rule is described by the expression

X → Y, c, s ... [1]

where X and Y are parameters of the membership function, the variable "c" is the confidence of the rule, and the variable "s" is its support. The support of the rule is the proportion of parameters in which X and Y appear together:

s = n' / n ... [2]

where n' is the number of parameters that contain both X and Y and n is the total number of parameters. The confidence of the rule is the proportion of the parameters containing X that also contain Y:

c = n' / n'' ... [3]

where n'' is the number of parameters that contain X. A genetic algorithm is used to find the association rules, using two parameters called the minimum support threshold and the minimum confidence threshold: if s and c are above these thresholds, the rule is extracted from the data.

Given two association rules Rule1: X → Y, c, s and Rule2: X' → Y', c', s', if Y = Y' and X = X', then the similarity between Rule1 and Rule2 is

similarity(Rule1, Rule2) = max(0, 1 - max(|c - c'| / c, |s - s'| / s)) ... [4]

otherwise similarity(Rule1, Rule2) = 0. The similarity between two rule sets S1 and S2 is

Similarity(S1, S2) = s / (|S1| * |S2|) ... [5]

where |S1| and |S2| are the total numbers of rules in S1 and S2, and

s = sum over rule pairs of similarity(R1, R2) ... [6]

Three and five different sets of linguistic variables, such as Nj, Mj and Xj, are used for testing the genetic algorithm. The data set is partitioned into two sets: the first is called the normal data and the second the reference data. The fitness function of [14] is employed in this step:

F = 2*Srn - Sra1 - Sra2 ... [7]

where Srn is the similarity between the normal set and the reference set, and Sra1 and Sra2 are the similarities between abnormal set 1 and the reference set and between abnormal set 2 and the reference set.

3. GA operators
The single-point crossover strategy is used, with the crossover point selected randomly. The fixed mutation probability is 0.0333 and the fixed crossover probability is 0.6. When the mutation process occurs, all the parameters in a gene change by a proportion of their current values.

3.3.3 Inference
After fuzzification, the inference process based on the fuzzy rules is applied. A fuzzy rule is a collection of linguistic values, one for each feature, together with a classification decision. In this work, fuzzy rules written manually and fuzzy rules extracted from the previously classified training data are both used for decision-making. The vector containing the linguistic values of a tweet is matched against all the fuzzy rules to obtain the final decision through the inference process. Some of these rules are defined as follows:
1) If I: high AND S: very high or high AND Z1: high → R: high relevance.
2) If S: high AND Z3: moderate AND L3: low → R: moderate relevance.
3) If L: moderate OR E: low AND G: low → R: low relevance.
4) If M: high AND S: very low AND Z1, Z2, Z3 = zero AND SW: low → R: irrelevance/DK.

In accordance with the above rules, a tweet has a high relevance degree to Hurricane Sandy if its frequently used words have high degrees and the number of its words that belong to list D is high. A tweet has a moderate relevance degree if its words have high values, its length is low, and the number of its words in list D is moderate. A tweet has a low relevance degree if the weight of the tweet is low, the weight of its frequently used words is low, and the number of important words from list D is low. A tweet is categorized as irrelevant if no important words associated with Hurricane Sandy are found in it. When the rate of words from list D is high and the number of words belonging to Hurricane Sandy in the training data is not among the most important words, it is considered moderate; in this case the tweet is highly relevant to Hurricane Sandy.
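The rule matching described in this section can be sketched as a simple lookup over rule conditions. The encoding below is ours, and only rules 1) and 4) are transcribed, purely for illustration:

# Each rule: a set of allowed linguistic values per feature, plus a decision.
RULES = [
    ({"I": {"high"}, "S": {"very high", "high"}, "Z1": {"high"}}, "high relevance"),
    ({"M": {"high"}, "S": {"very low"}, "Z1": {"zero"}, "Z2": {"zero"},
      "Z3": {"zero"}, "SW": {"low"}}, "irrelevance/DK"),
]

def infer(linguistic_values, rules, default="irrelevance/DK"):
    """Return the decision of the first rule whose conditions all match."""
    for conditions, decision in rules:
        if all(linguistic_values.get(feature) in allowed
               for feature, allowed in conditions.items()):
            return decision
    return default

tweet = {"I": "high", "S": "high", "Z1": "high", "M": "low",
         "Z2": "low", "Z3": "zero", "SW": "moderate"}
print(infer(tweet, RULES))   # high relevance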
3.3.4 Defuzzification
Defuzzification is the process of generating a quantifiable result in real logic: it converts the fuzzy results into a real value based on the fuzzy sets and the corresponding membership degrees. There is a set of defuzzification functions, such as the centroid, the Center of Sums method (COS), the bisector, the Mean of the Maxima (MOM), the First of Maxima (FOM), and the Smallest of the Maxima (SOM). In this work, the centroid function is used in the defuzzification step, based on equation [8]:

X* = (sum for i = 1..n of xi * µC(xi)) / (sum for i = 1..n of µC(xi)) ... [8]

For the example tweet, with the crisp values and membership degrees of Table (3):

X* = ((5 * 0) + (3 * 0.666) + (1 * 1) + (0 * 0) + (2 * 1) + (7 * 1) + (2 * 1) + (0.285 * 0) + (0.285 * 1)) / (0 + 0.666 + 1 + 0 + 1 + 1 + 1 + 0 + 1) = 14.2837 / 5.666 = 2.52095
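The centroid computation above amounts to a membership-weighted average of the crisp values, as in the following sketch (the small numerical difference from 2.52095 is rounding in the hand calculation):

def centroid(values, memberships):
    """Weighted average of the crisp values, weighted by membership degree."""
    num = sum(x * mu for x, mu in zip(values, memberships))
    den = sum(memberships)
    return num / den if den else 0.0

values      = [5, 3, 1, 0, 2.0, 7, 2, 0.285, 0.285]   # Sw, Z1, Z2, Z3, Kj, Nj, Mj, Wj, Xj
memberships = [0, 0.666, 1, 0, 1, 1, 1, 0, 1]
print(round(centroid(values, memberships), 5))         # 2.52083, vs. 2.52095 in the hand calculation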
4. Experimental Results
The results are compared with a keyword search method and a fuzzy logic based method.

4.1 Comparison with the keyword search method
Researchers [15] utilized the keyword search method to extract related tweets from the primary dataset. The advantage of the keyword search method is that it is exceptionally precise, efficient, and straightforward, because the extracted tweets are highly relevant; its shortcoming is that it is unable to extract enough relevant tweets. Table 4 compares the proposed system with the keyword search method using the correction rate and the incremental rate. The correction rate is

α = (A / B) * 100 % ... [9]

where B is the number of related tweets extracted by a technique and A is the number of tweets in B that are categorized correctly; A is ascertained by manual checking by the authors and is confirmed against the training data. Likewise, the incremental rate (λ) describes how much more data the proposed system is able to exploit than the keyword search method; it is defined as

λ = ((Af - Ak) / Ak) * 100 % ... [10]

where Af is the value computed for this system and Ak the value computed for the keyword search method. The comparison between the keyword search method and the proposed system based on the correction rate and the incremental rate is shown in Table 4.

TABLE 4. Comparison between the proposed system and the keyword search method
Data set 1: keyword search B = 98, A = 96, α = 97.9%; proposed system B = 160, A = 158, α = 98.75%; λ = 64.58%
Data set 2: keyword search B = 103, A = 103, α = 100%; proposed system B = 184, A = 183, α = 99.45%; λ = 77.66%
Data set 3: keyword search B = 86, A = 86, α = 100%; proposed system B = 147, A = 146, α = 99.31%; λ = 69.76%
Data set 4: keyword search B = 93, A = 92, α = 98.9%; proposed system B = 154, A = 152, α = 98.70%; λ = 65.21%
Data set 5: keyword search B = 99, A = 99, α = 100%; proposed system B = 138, A = 136, α = 98.55%; λ = 37.37%
(B: extracted tweets; A: correctly classified tweets in B.)
Through manual audit and trial of the results, all the tweets extracted by the keyword search method also appear in this work; that is, the results of the keyword search method are a subset of the results of this work. In Table 4, the values of λ demonstrate that this work successfully retrieves additional tweets beyond the keyword search method.
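The two rates of equations [9] and [10] can be reproduced directly from the table entries. The following sketch uses the data set 1 figures, treating Af and Ak as the correctly classified counts, which matches the λ column of Table 4:

def correction_rate(correct, extracted):
    return correct / extracted * 100               # α = A / B * 100 %

def incremental_rate(a_proposed, a_keyword):
    return (a_proposed - a_keyword) / a_keyword * 100   # λ = (Af - Ak) / Ak * 100 %

print(round(correction_rate(158, 160), 2))    # 98.75  (proposed system, data set 1)
print(round(correction_rate(96, 98), 2))      # 97.96  (keyword search, data set 1)
print(round(incremental_rate(158, 96), 2))    # 64.58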
4.2 Comparison with a fuzzy logic based method
Table 5 shows the difference between the proposed work and the fuzzy logic method for text classification [12].

TABLE 5. Results of the fuzzy logic based method and the proposed system
No. 1: fuzzy logic based method B = 141, A = 135, α = 95.7%; proposed system B = 160, A = 158, α = 98.75%; λ = 17.03%
No. 2: fuzzy logic based method B = 161, A = 157, α = 97.7%; proposed system B = 184, A = 183, α = 99.45%; λ = 26.56%
No. 3: fuzzy logic based method B = 128, A = 126, α = 98.4%; proposed system B = 147, A = 146, α = 99.31%; λ = 15.87%
No. 4: fuzzy logic based method B = 137, A = 132, α = 96.3%; proposed system B = 154, A = 152, α = 98.70%; λ = 15.15%
No. 5: fuzzy logic based method B = 122, A = 118, α = 96.7%; proposed system B = 138, A = 136, α = 98.55%; λ = 15.25%
(B: extracted tweets; A: correctly classified tweets in B.)
In summary, this work is able to extract more tweets than the fuzzy logic method and the keyword search method. Considering the incremental rate, the proposed work is stronger than both the fuzzy logic based method and the keyword search method. Considering the correction rate, the keyword search method performs slightly better than the fuzzy logic method, while the proposed work is better than the fuzzy logic method and close to the keyword search method. Considering both criteria, the proposed work is the one to choose when relevant tweets are scarce and highly required for the analysis step, because it ensures both a high correction rate and a larger quantity of tweets that are relevant and classified accurately. The proposed work is therefore superior to the fuzzy logic based method.

5. Conclusion
In this research, a method for text classification of Twitter data based on fuzzy logic and a genetic algorithm is proposed. Using a training data set and a test data set, eleven features are obtained from every tweet as inputs to the classification procedure. The work is compared with two methods: a keyword search method and a fuzzy logic method for text classification. The results demonstrate that this work is more suitable for classifying tweets as relevant or irrelevant than the fuzzy logic method and the keyword search method. In addition, by contrasting the commonly used defuzzification functions, the centroid function is found to be more productive and powerful than the other functions.
REFERENCES
[1] Nahon Karine and Crowston Kevin, "Introduction to the Digital and Social Media Track", 49th Hawaii International Conference on System Sciences (HICSS), IEEE, USA, 2016.
[2] Charu C. Aggarwal, "Data Classification: Algorithms and Applications", Taylor & Francis Group, 2015.
[3] Rajni Jindal, Ruchika Malhotra, and Abha Jain, "Techniques for text classification: Literature review and current trends", Webology, Volume 12, Number 2, December 2015.
[4] B. Kosko, "Fuzzy Thinking: The New Science of Fuzzy Logic", International Journal of General Systems, Hyperion, New York, June 1994.
[5] L. A. Zadeh, "Fuzzy logic = computing with words", IEEE, May 1996.
[6] Xinjie Yu, "Introduction to evolutionary algorithms", Computers and Industrial Engineering (CIE), IEEE, Japan, July 2010.
[7] C. Caragea, A. Squicciarini, S. Stehle, K. Neppalli, and A. Tapia, "Mapping moods: geo-mapped sentiment analysis during hurricane Sandy", International Conference on Information Systems for Crisis Response and Management (ISCRAM), pp. 642-651, May 2014.
[8] Salari N., Shohaimi S., Najafi F., Nallappan M., and Karishnarajah, "A Novel Hybrid Classification Model of Genetic Algorithms, Modified k-Nearest Neighbor and Developed Backpropagation Neural Network", PLOS ONE, November 2014.
[9] T. Spielhofer, R. Greenlaw, D. Markham, and A. Hahne, "Data mining Twitter during the UK floods: Investigating the potential use of social media in emergency management", 3rd International Conference on Information and Communication Technologies for Disaster Management (ICT-DM), Vienna, Austria, pp. 1-6, December 2016.
[10] Q. Jiang, W. Wang, X. Han, S. Zhang, X. Wang, and C. Wang, "Deep feature weighting in Naive Bayes for Chinese text classification", International Conference on Cloud Computing and Intelligence Systems (CCIS), Beijing, China, August 2016.
[11] J. B. Sathe and M. P. Mali, "A hybrid Sentiment Classification method using Neural Network and Fuzzy Logic", IEEE, India, January 2017.
[12] KeYuan Wu, MengChu Zhou, Xiaoyu Sean Lu, and Li Huang, "A Fuzzy Logic-Based Text Classification Method for Social Media Data", International Conference on Systems, IEEE, October 2017.
[13] A. Kasun, M. Manic, and R. Hruska, "Optimal stop word selection for text mining in critical infrastructure domain", Resilience Week (RWS), Philadelphia, pp. 1-6, August 2015.
[14] Wengdong Wang and Susan M. Bridges, "Genetic Algorithm Optimization of Membership Functions for Mining Fuzzy Association Rules", International Joint Conference on Information Systems, Fuzzy Theory and Technology, Atlantic City, N.J., March 2, 2000.
[15] H. Dong, M. Halem, and S. Zhou, "Social media data analytics applied to hurricane sandy", International Conference on Social Computing (SocialCom), IEEE, September 2013.
