Review Paper

Lecturer: BARLIN, S.T., M.Eng., Ph.D.
Name: Muhammad Yudana Aditya Pratama
Student ID (NIM): 03051382227109
INTRODUCTION
Twitter is one of the most popular social media platforms in the world, and Indonesia ranks
fifth in number of users. Every day the Twitter servers receive large amounts of tweet data,
which can be mined for particular purposes. One of these is promoting a product or service,
as in the research conducted by James Luke and Suharjito.
For large-scale data, the search process needs to be fast, so the data must be grouped first.
Naive Bayes is a learning algorithm for classification that is computationally efficient and
accurate, especially for high-dimensional and large data sets. For this reason, the study
aims to demonstrate the ability of the Naive Bayes classifier to classify tweets containing
information about a product or service. The case study was conducted at PT Bobobobo,
Jakarta.
METHOD
Data mining is the process of discovering knowledge from very large data sets. Text mining
is a branch of data mining that analyzes text data written in natural language and extracts
information that is useful for a particular purpose.
The algorithm used in this study is the Naïve Bayes classifier (NBC). The Naïve Bayes
classifier is a machine learning method that uses the probability and statistics put forward
by the English scientist Thomas Bayes, that is, it predicts future probabilities based on past
experience.
The classification process is based on the following equation:
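In its standard textbook form, the Naive Bayes decision rule for a tweet made up of the words w_1, ..., w_n can be written as below. The notation (c for a class, w_i for a word) is an assumption chosen for illustration and may differ from the notation used in the study.

```latex
\[
P(c \mid w_1, \dots, w_n) \;\propto\; P(c) \prod_{i=1}^{n} P(w_i \mid c),
\qquad
\hat{c} = \arg\max_{c \in \{\text{product},\ \text{service}\}} P(c) \prod_{i=1}^{n} P(w_i \mid c)
\]
```

Here P(c) is the share of training tweets in class c, and P(w_i | c) is the relative frequency of word w_i among the training tweets of that class.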
The research process is shown in Figure 1.

Figure 1 Research Framework
Data:
The data used in this study are Indonesian-language tweets from the Jakarta area, collected
from June 2013 to February 2014. The data are divided into two categories (classes):
products and services.
Product Category:
"Looking at bags, shoes, headscarves, clothes, I want to buy everything"
"Looking for a batik dress: if you get what you like, it's a bit expensive in the pocket
or the dress...a pair with a shirt..."
Service Category:
"Yoga class after hours is quite flexible, cyn…"
"vacation to paris, traveling mahameru, comparative study of wear and tear"
"Looking for a cheap hotel near the center of Surabaya? any suggestions?"
Tweet classification with NBC
After the tweet data are collected, they are used as the training dataset and grouped by
class (product or service). The next step is classification with the Naïve Bayes Classifier
(NBC) algorithm, whose accuracy is then measured. The tweet classification process is
shown as a flowchart in Figure 2.
Figure 2 Classification flowchart using NBC
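The classification step can be illustrated with a minimal sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing. This is not the study's implementation: the function names, the whitespace tokenization, the smoothing choice, and the toy training tweets are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nbc(labeled_tweets):
    """Count tweets per class and word frequencies per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_tweets:
        words = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log posterior score."""
    class_counts, word_counts, vocabulary = model
    total_tweets = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in class_counts.items():
        score = math.log(count / total_tweets)  # log prior P(c)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing (an assumption) so unseen words do not zero the product
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for illustration only
training = [
    ("want to buy shoes and clothes", "product"),
    ("looking for a batik dress", "product"),
    ("yoga class after hours", "service"),
    ("looking for a cheap hotel", "service"),
]
model = train_nbc(training)
print(classify("buy a new dress", model))
print(classify("cheap yoga class", model))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together.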
Word and Tweet Trend Selection
After the tweets are classified into product and service classes, the next step is to select
popular (trend) words and compare them with the tweets that will be posted automatically
as promotions. The selection and promotion process is shown in Figure 3.
Figure 3 Automatic selection and promotion flowchart
The steps for selecting trend words and the tweets to be used as promotional material are
as follows:
1. Tokenization
The input string is split into the words that compose it, and each word separated from the
tweet becomes a token. A word is considered valid if it consists of 3-25 letters and is not a
link or URL.
2. Stopword removal
Stopwords are words that have no effect on the classification process, so they are filtered
out. The results of this process are stored in the database.
3. Trend words
A database query retrieves 5 trend words from the product and service categories.
4. Promotional tweets
Promotional tweets are written by the Twitter admin. Each tweet is given a promotional
grace period, keywords, and a match score with the trend words. The tweets are then stored
in the database.
5. Automatic tweets
A database query looks for promotional tweets that match the trend words. If every
criterion matches, the system tweets automatically.
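Tokenization, stopword filtering, trend word selection, and matching can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the study's implementation: the study queries a database, while this sketch uses a Counter. The stopword list, the function names, the punctuation handling, and the definition of the match score (share of a promotional tweet's keywords that are trend words) are all hypothetical.

```python
from collections import Counter

# Hypothetical stopword list; the study's actual list is not given
STOPWORDS = {"the", "and", "for", "with", "this", "that"}


def tokenize(tweet):
    """Split a tweet into tokens, keeping words of 3-25 letters that are not URLs."""
    tokens = []
    for raw in tweet.lower().split():
        if raw.startswith(("http://", "https://", "www.")):
            continue  # links are not valid words
        word = raw.strip(".,!?:;\"'()")
        if word.isalpha() and 3 <= len(word) <= 25:
            tokens.append(word)
    return tokens


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [t for t in tokens if t not in STOPWORDS]


def trend_words(tweets, top_n=5):
    """Return the top_n most frequent non-stopword tokens across the tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(remove_stopwords(tokenize(tweet)))
    return [word for word, _ in counts.most_common(top_n)]


def match_score(keywords, trends):
    """Share of keywords that are trend words (assumed definition)."""
    if not keywords:
        return 0.0
    return sum(1 for k in keywords if k in trends) / len(keywords)


# Toy tweets, invented for illustration only
tweets = [
    "Looking for cheap shoes http://example.com",
    "New shoes and a batik dress!",
    "Cheap batik shoes for sale",
]
trends = trend_words(tweets)
print(trends)
print(match_score(["shoes", "hotel"], trends))
```

A promotional tweet would then be posted only when its match score and its promotion period satisfy whatever thresholds the admin has set. Those thresholds are not specified in the text, so they are left out of the sketch.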
EVALUATION
This research has two stages, as follows:

Analysis of the accuracy of the Naïve Bayes Classifier algorithm
To measure the accuracy of the Naïve Bayes classifier algorithm, the tweet data are arranged
into three variations of training and test data. Every product and service tweet is tested.
The distribution of training data and test data is shown in Table 1.
Table 1 Composition and variation of training data
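The idea of testing several training and test compositions can be sketched as below. The training percentages used here (50, 70, 90) are hypothetical placeholders, since the actual compositions are those listed in Table 1. The function names and the toy labels are also assumptions.

```python
def split_data(samples, train_percent):
    """Split samples into (training, test) lists by an integer percentage."""
    cut = len(samples) * train_percent // 100
    return samples[:cut], samples[cut:]


def accuracy(predicted, actual):
    """Percentage of predictions that match the true labels."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * correct / len(actual)


# Hypothetical compositions, for illustration only
samples = list(range(10))
for train_percent in (50, 70, 90):
    train, test = split_data(samples, train_percent)
    print(train_percent, len(train), len(test))

# Toy labels, invented for illustration: 3 of 4 predictions are correct
predicted = ["product", "service", "product", "service"]
actual = ["product", "service", "service", "service"]
print(accuracy(predicted, actual))
```

In practice the samples would be shuffled before splitting so that both classes appear in the training and test parts.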
Analysis of increasing follower engagement
Follower engagement performance is calculated with the following equation:
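The sketch below assumes a common definition of engagement rate: interactions (retweets plus mentions) divided by follower count, expressed as a percentage. This definition, the function names, and the numbers are assumptions for illustration and may differ from the equation used in the study.

```python
def engagement_rate(retweets, mentions, followers):
    """Interactions per follower as a percentage (assumed definition)."""
    if followers <= 0:
        raise ValueError("followers must be positive")
    return 100.0 * (retweets + mentions) / followers


def percent_increase(before, after):
    """Relative change from before to after, as a percentage."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return 100.0 * (after - before) / before


# Invented numbers, for illustration only
print(engagement_rate(retweets=30, mentions=20, followers=1000))
print(percent_increase(before=50, after=110))
```

The second helper shows how a before-and-after comparison of any single metric, such as mentions, can be expressed as a percentage increase.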
RESULTS AND DISCUSSION
Tweet classification results using NBC
The training data variations were applied to the product category tweets, with the results
shown in Table 2.
Table 2 NBC results using product tweet data
Applying the same training data variations to the service category tweets gives the results
shown in Table 3.
Table 3 NBC results using service tweet data
The training data variations were also applied to the combination of product and service
category tweets, with the results shown in Table 4.
Table 4 NBC results (combination of product and service tweet data)
The test results above show that the NBC algorithm achieves a fairly good level of
accuracy.
Tweet promotion automation results
The results of the experiments conducted from September 2014 to February 2015 are shown
in Figure 4.
Figure 4 Twitter engagement for @_bobobobo_

From the figure above, averages were calculated by combining the retweets, mentions, and
follower gains before and after implementation. The results are shown in Table 5.
Table 5 Increased follower engagement
The table above shows that increases in retweets and mentions strongly affect the increase
in follower engagement. A comparison of follower engagement before and after
implementation is shown in Figure 5.
Figure 5 Levels of engagement before and after implementation
This study gave positive results in increasing follower engagement.
CONCLUSION
The conclusions of this study are:
1. The NBC algorithm has a high level of accuracy in the classification process: 90.31% on
product category test data, 80.91% on service category test data, and 83.51% on the
combination of the two.
2. Filtering out stopwords makes it possible to determine trend words from a collection of
product and service category tweets.
3. Activity on Twitter increased during this study: tweets rose by 39%, mentions by 120%,
and new followers by 69%. Retweets and mentions have an impact on follower engagement
results.
4. The tweet engagement rate after this study was fairly high, with a highest value of 17.44%
and a lowest of 4.72%. By comparison, the earlier figures were a highest of 3.80% and a
lowest of 2.90%.
5. Using Twitter as a promotional medium gives quite satisfactory results. Before tweeting,
the trend words (trending topics) among followers can be analyzed, which leads to a good
response from followers.
