Professional Documents
Culture Documents
115
CRPIT Volume 147 - Computer Science 2014
aware that their children were bullied on social media of privacy, threats and exclusion - being excluded from
(Gottfried 2012). In Australia, it is about 87%, Poland is the society. Besides, law enforcement in Indonesia is still
around 83%, Sweden is about 85%, the United States is weak. So, the victims rarely report such incidents to the
about 82%, and Germany is around 81% (Gottfried authorities, such as the police or their parents.
2012). Moreover, the majority of citizens in each of the This work uses Rapid Miner 1 software in order to analyse
24 countries agreed that cyber bullying needs special the Indonesian bullying words from Twitter. There are
attention from parents, particularly Japan (91%) and several processes before analysing the Indonesian
Indonesia (89%) (Gottfried 2012). bullying words: first, importing data from the reprository
Kaman (2007) has conducted a survey about cyber in Rapid Miner; second, filtering words to clean up
bullying across 40 countries including Indonesia in 2005- unstructure sentences; third, using FP-Growth and
2006. The result is that Indonesia took third place after association rules techniques.
Japan and South Korea. This indicates that cyber bullying The data is collected using Twitter adder 2. Moreover, in
in Indonesia is a major problem needing urgent attention. the process of filtering words, we created a stem
Given the rapid growth of cyber bullying in Indonesia, Indonesian bullying dictionary to detect some bullying
identifying bullying patterns in social media is an words in Indonesian Twitter posts. Jakarta and Surabaya
important research task to understand what kind of are two big cities in Indonesia which have become
bullying patterns occur. Hence, detecting Indonesian objects in this research because some people who sent
bullying patterns in Twitter should be analysed more bullying messages in Twitter come from Jakarta and
deeply to know the strong relationship between Surabaya. Our research has identified Indonesian
Indonesian bullying words. This study will contribute to bullying patterns in both cities. We found the trend of
knowledge about human rights violations in social media Indonesian bullying patterns in Jakarta and Surabaya to
in Indonesia. be quite similar.
Analysing Indonesian bullying words in Twitter is an Additionally, the results of this research are an important
interesting issue to explore. The Indonesian bullying contribution to the Indonesian government, Non-
words are unique compared to other countries, because Government Organizations and Indonesian society about
they are always related to animals, psychology, disability, human rights violations through the Internet.
and attitude. For example, “kamu gila, perilaku kamu This paper is organized as follows. Section 2 describes
seperti anjing, bangsat “(You are crazy, you act like a the related work of this paper. In section 3, we describe
dog, you rascal). Other than that, there are several FP-Growth and Association Rule mining which will be
bullying words combinations – “ bangsat, anjing, gila “ used to analyse the research problem. In section 4,
(rascal, dog, crazy). So, the perpetrator technically uses implementation of text mining, FP-Growth, and
two or more words to bully. Association Rule to solve the research problems will be
Even though some previous research has been conducted detailed. Analysing the relationship among words will
on bullying studies, such as to explore the variety of discover new bullying patterns. The last section is the
characteristics of bullying in social media, e.g., Conclusion section.
(Campbell 2005), identifying the relationship between
Indonesian bullying words using text mining technique 2 Related Work
has not been investigated as far as we know. Hence, this Some previous research has discussed cyber bullying in
research has proposed to analyse Indonesian bullying social media. Chen et al. (2012) have conducted research
words in social media. This research uses text mining to detect offensive language in social media. The
techniques as a tool to extract the words and to mine data technique that has been used to identify offensive
in databases. language is the Lexical Syntactic Feature (LSF) approach.
Bullying in social media may influence people to commit The LSF framework has been successful in detecting
harassment and violence, whether physical or some offensive content in social media, which has
psychological (Kowalski, Limber & Agatston 2008): for achieved precision of 98.24%, and recall of 94.34% and
example, when the perpetrator arranges to meet with the also succeeds in detecting users who sent offensive
victims and they possibly accomplish violence acts messages, achieving precession of 77.9%, and recall of
towards the victims. There have been a few cyber 77.8% (Chen et al. 2012). A similar technique to detect
bullying cases in Indonesia that have influenced physical offensive content in social media is a lexicon-based
violence. These cases are located in Yogyakarta in which approach which can block and filter the offensive words
mostly young high school females are the victims. and sentences in social media (Popescu & Etzioni 2007;
Bantul, Gunung Kidul, Kulonprogo, Sleman, Klaten, Taboada et al. 2011). The lexicon-based technique has
Magelang, and Purworejo in Central Java also have a been successful in detecting sentiment terms in social
similar cyber bullying phenomenon (Octaviany & media.
Waskita 2012). The advantage of learning Indonesian Identifying context words in Twitter using the machine
bullying words is to recognise and to identify the words learning technique has been shown in previous research.
when they appear in social media . Then, we may be able Pak and Paroubek (2010) constructed a simple binary
to seek earlier help and prevent any violence from taking
place.
1
In Indonesia, cyber bullying occurs continuously, which Rapid Miner software is available at http://rapid-
i.com/content/view/26/201/
includes harassment, denigration, impersonation, invasion 2
Twitter adder is available at http://www.tweetadder.com/download
116
Proceedings of the Thirty-Seventh Australasian Computer Science Conference (ACSC 2014), Auckland, New Zealand
classifier using n-gram and POS (Part-of-Speech) features Bullying words Indonesia English
to classify tweets on the basis of positive, negative and Bullying words - Bangsat - Rascal
related to animals - Anjing - Dog
neutral sentiment. Their approach has similar techniques
- Babi - Pig
to the unigram, bigrams and POS tags approach
- Monyet - Monkey
introduced by Go, Bhayani and Huang (2009). Go, - Kunyuk - Monkey or stupid
Bhayani and Huang have analysed the distribution of
certain POS tags between positive and negative posts.
Recently, Maynard, Bontcheva and Rout (2012) Bullying words - Goblok - Stupid
related to stupidity - Idiot - Idiot
conducted research to detect negative opinions in social and Psychology
- Geblek - Fool
media using the rule-based approach. Their research has - Gila - Mental disorder
identified negative opinions, which contain sentiment - Tolol - Stupid
sentences, with 86% precision and 71% recall, and also - Sarap - Crazy
identified sentences where the accuracy of the polarity - Udik - Rube
- Kampungan - Hick
(positive or negative) was 66%.
Bullying words - Buta - Blind
Another approach to detecting cyber bullying is a related to disabled - Budek - Deaf
language-based method introduced by Reynolds, persons
- Jelek - Ugly
Kontostathis and Edwards (2011).Their research has General Bullying - Setan - Satan
successfully identified 78.5% of message posts from words - Iblis - Devil
“Form spring” which contain cyber bullying by recording - Keparat - Assh..*
the percentage of curse and insult word posts. - Gembel - Poor
- Brengsek - Bastard
- Sompret - Jeepers
3 Collecting Indonesian Bullying Words from
- Bajingan - Scoundrel /
Indonesian Twitter Posts Bastard
The collected data are the important part in conducting Bullying words - Bejad - Depraved action
related to attitude
research. Much of the work required to collect data from
Twitter will be spent obtaining and preparing the data. Table 1: The Classification of Indonesian
Choosing the proper approach will depend on the ultimate Bullying Words
purpose of the analysis. There is Twitter Adder software
development which can capture text in social media. The Overall, the total number of comments downloaded are
reader should be aware that these examples of Indonesian around 14000 tweets, which specifically contain
bullying words may contain offensive and often abusive Indonesian bullying words, and the whole tweet messages
language. that had been analysed. These data were recorded for
about a week. The data are saved in Microsoft Excel
3.1 Data Collection from Indonesian Twitter format and will be recorded in Table 2.
The first step in collecting data is download tweets text in After recording data in Table 2, the next step is
Twitter using Twitter Adder. This step generally needs a preparation of data to be analysed using data mining
key word to capture posting tweet text from Twitter techniques. This work will analyse posting messages on
automatically. The classification of Indonesian bullying Twitter. From Table 2, the object to be analysed is the
words which were posted on Twitter will be represented last tweet attribute.
in Table 1.
Tweet Location Follower Friend Last Tweet Last
In our research, to collect data we used the Indonesian ID tweet
date
language because we would like to mine Indonesian
36409 Bandung 302 222 Avanya andaiy, 4/06/20
bullying words which occur in Twitter. Furthermore, 4716 avanya mohamad 13
DFDM: "sarap 14:35
bullying on the Internet is also a part of human rights (crazy)!! addnan_ch:
violations. To understand what Indonesian bullying Oh anjing (dog)
goblog (Stupid)
words mean, we try to translate some Indonesian bullying fakyu siah monyet
(monkey), setan
words into English in Table 1. (Satan), babi (pig)
alas bangsat (rascal)
We accessed Twitter’s public timeline to search tweets shit damn aaaaahhhh
an
containing Indonesian bullying words to suit our interest.
We removed all facts that did not express Indonesian 50039 Bekasi 148 179 baju lu beresin gak 4/06/20
8381 anjing 13
bullying messages like news and objective phrases from lu,bangsat,monyet,ta 14:17
i,sialan, gua bilangin
collected data. mama. gini nihh lagi
emosi,semua gua
lempar ke ade
117
CRPIT Volume 147 - Computer Science 2014
118
Proceedings of the Thirty-Seventh Australasian Computer Science Conference (ACSC 2014), Auckland, New Zealand
The stem Indonesian bullying dictionary has the purpose 3.4 Indonesian Bullying Words in Data Set
to filter some words which may have the same meaning In association rule mining, the data which come from
as Indonesian bullying words. The stem Indonesian tweets will be represented as alphabet in the data set. The
bullying dictionary process will transform some purpose of representing Indonesian bullying words in the
unstructured Indonesian bullying words into more alphabet is to calculate how often the words occur in the
structured words because people sent tweets by typing database and to compute how strong the relationship is
incomplete words. For example, bngt and anj will be between words. Representing data in a data set of
replaced to become bangsat and anjing. association rules will be described below.
To replace some unstructured words in tweets, we create In association rule TID is a set of transactions. In this
an Indonesian bullying dictionary by typing some process, TID represents a set of all tweets posts in a
Indonesian bullying words in Microsoft Word and save it database of Indonesian cyber bullying words. I =
in text format. For example: anjing:anj.*; (a,b,d,c,d,e,…,z) is a set of items which contain
bangsat:bngt.*. Typing anjing:anj.* means that all words Indonesian bullying words. Let A be a set of items of
by typing anj or anji will be replaced to be anjing. Indonesian bullying words. When A ⊆ T , an implication
from A ⇒ B can be called an association rule,
where A ⊆ I , B ⊆ I , and A B ≠ ∅. In order to
obtain a strong relationship between items in an
association rule, all itemsets should satisfy the minimum
support threshold and minimum confidence threshold as
prerequisites in association rule mining. For calculating
minimum support and minimum confidence, this work
uses the formula given by Han et al. (2012).
Support ( A ⇒ B ) = Ρ( A B )
Confidence
( A ⇒ B ) = Ρ(B / A) = ( support ( A B)) = ( support_ count ( A B))
( support ( A)) ( support_ count ( A))
Item Represent
a Bangsat (rascal)
Figure 2: Creating Stem Indonesian Bullying b Anjing (dog)
Dictionary c Babi (pig)
d Monyet (monkey)
Figure 2 shows the stem Indonesian bullying dictionary in e Kunyuk (monkey)
text format. The stem dictionary has the function of
f Bajingan (scoundrel/bastrad)
replacing unstructured Indonesian bullying words with
structured words. The resulting stems Indonesian bullying Table 4: Illustrate Set of Item in Data Set
dictionary is shown in Table 3.
The result of the processing document operator is data Table 4 shows the illustrated Indonesian bullying words
matrix which is shown in Appendix 1. The data show represented in an alphabet. For example: the letter a
some Indonesian bullying words which occur in some represents “bangsat”, b represents “anjing”, etc. The
tweets. For example, “bangsat” (rascal) term occurs in concept association rule is market basket analysis; hence
tweets 1,2,3,…, etc.. The 0 means the word not appearing all Indonesian words should be represented in the
in the tweets, and 1,2,3 mean how many times the word alphabet. The purpose of representing Indonesian
appears in the tweets. Appendix 1 shows the data matrix bullying words with the alphabet is to get itemsets
after generating some operators in the processing patterns in the Indonesian Bullying database.
document.
The next operator in the processing document is Tweet ID Last Tweet
numerical-to-binominal. If the value of an attribute is 1 a, b, c, d, g, n
between the specified minimal and maximal values, it is 2 a, b, d
false, otherwise true. The result of generating this 3 a, b, k
operator is shown in appendix 2. 4 a, b, c, d
Data from the matrix will be analysed using data FP- 5 a, b, m, t
Growth and Association Rule. The data will change into
letters representing the real data. The reason to change Table 5: Example Data in Data Set
the real data to letters is that the computation of data Table 5 shows some example data recorded in the
mining will read data in letters or numbers. Table 3 database. This means the itemsets that occur in
describes representing the real data as letters or numbers. transactional data are tweets which contain bangsat (a),
anjing (b), babi (c), monyet (d), goblok (g), and sarap (n).
119
CRPIT Volume 147 - Computer Science 2014
4 Mining Indonesian Bullying Patterns on 1. To get support count for each item, scan the data
Indonesian Twitter Post set firstly to find frequent itemsets. Then,
abandon the items which are not frequent and
Data preparation, collection data from Twitter are the
then sort the frequent items in decreasing order.
main work in this research. The purpose of analysing
message from Twitter is to identify Indonesian bullying 2. To create FP-Tree, scan every transaction from
words patterns and trends. To achieve this aim, we will the data set. Every transaction will be read as
make an effort to identify pattern words using text mining follows:
technique. a. Create a new path if the transaction is a
The process of mining Indonesian bullying patterns in unique transaction form and set the counter
Twitter can be seen in figure 3. for each node to be 1
b. Add the common itemsets node counters if
the transaction shares a common prefix
Collecting data Twitter adder Export to Excel itemsets then and create new nodes if
needed.
3. Continue from first to second steps until each
Rapid Miner
transaction has been mapped onto the tree.
Import data from Excel Numeric to Binominal Using the Rapid miner software, the data set will be
to Repository generated and calculated automatically based on FP-
Growth method. After being generated automatically, the
Retrieve Data from FP-Growth frequent itemsets have been found, shown in table 6.
Repository
Support Item1 Item 2 Item 3
Count
Import data to Association Rule 0.330 bangsat (rascal)
Repository
0.323 anjing (dog)
0.070 anjing_bangsat
Processing data (filter Result 0.267 bangsat anjing
tokenize, transform case, 0.054 bangsat babi (pig)
stop word, stem Dictionary, 0.070 bangsat anjing_bangsat
and n-Grams)
0.081 anjing babi
0.070 anjing anjing_bangsat
Figure 3. The process analysing Indonesian bullying 0.052 bangsat anjing babi
words from Indonesian Twitter 0.070 bangsat anjing anjing_bang
sat
120
Proceedings of the Thirty-Seventh Australasian Computer Science Conference (ACSC 2014), Auckland, New Zealand
the association rule is applied to mine text, the text field Hence, this work will try a comparison between two
would become the transaction and the words themselves regions in Indonesia which make a high percentage
would become the items. contribution to bullying in Indonesian Twitter. This
Using the Rapid miner software, the transaction of work will analyse the cities of Jakarta and Surabaya
itemsets is generated and calculated automatically and the because both cities are among the biggest cities of
result is shown in table 7. Table 7 shows that the frequent Indonesia. Many Indonesian people who come from rural
itemsets from transaction in database have been found. areas move to Jakarta and Surabaya. Therefore, Jakarta
“Bangsat” and “anjing” are dominant in frequent and Surabaya have a lot of communities which tend to
itemsets. Meanwhile, “iblis” (devil) and “setan” (satan) interact with each other.
have taken the second place after both “bangsat” and To examine the Indonesian bullying pattern in Jakarta and
“anjing” words. “Bangsat” and “babi” have support count Surabaya regions, both FP-Growth and Association Rule
0.052 and confidence 0.970 with imply “anjing” words. will be used. The purpose of this is to compare two
That means that “bangsat” and “babi” words have strong locations which contribute Indonesian bullying words on
relation with “anjing” words. Furthermore, “iblis” word Twitter. The comparison between Jakarta and Surabaya
has support count 0.056 and confidence 0.986 to “setan” will be described clearly in this work.
word. This means that “iblis” also a strong relation with The first technique to identify the Indonesian bullying
“setan”. pattern in Jakarta and Surabaya is FP-Growth. Using the
rapid miner software, the result after generating FP-
Premises Conclusion Support Confidence Growth algorithm is as follows:
Count
bangsat (rascal), babi
Anjing (dog) 0.05 0.97 Comparison Support Count FP-Growth between
(pig)
Jakarta and Surabaya
iblis (devil) setan (satan) 0.05 0.98
anjing_bangsat bangsat 0.07 1.0 1
anjing_bangsat anjing 0.07 1.0 0.9
0.8
Support Count
121
CRPIT Volume 147 - Computer Science 2014
The differences in bullying patterns between tweeter in and”bangsat”. This means that the strong rule of the
Jakarta and Surabaya are as follows: Indonesian bullying words in Surabaya is “bajingan” and
1. There are other Indonesian bullying patterns in “bangsat”; “setan” and “bangsat”; “anjing”, ‘setan”
Jakarta such as “babi”, which has support count and”bangsat”.
at around 0.25 and “monyet”, which has support Words with similar calculation support count and
count at about 0.15. confidence for Indonesian bullying words between
2. Others Indonesian bullying patterns in Surabaya Jakarta and Surabaya after generating association rule are
are “goblok”, “jancuk”, and “setan”. “Goblok” the “setan” “bangsat”; “anjing “, “monyet” and “bangsat”
has support count at about 0.1, “jancuk” has words. This means that the words have strong rule.
support count at around 0.25, and “setan” has When we look at the result, the trend of Indonesian
support count at about 0.18. bullying words is “bangsat” and “anjing”. This trend
Another technique that will be used is Association Rule. illustrates that bullying through technology media has
The Association rule technique will identify how strong become a new phenomenon in modern life. People can
the relationship is between Indonesian bullying words. send messages easily with anonymity to embarrass
someone they do not like.
Jakarta Surabaya Moreover, all the trends of Indonesian bullying patterns
Premises Conclusion Support Confidence Support Confidence both in Jakarta and Surabaya have important information
Count Count
for people who are interested in cyber bullying, especially
anjing bangsat
(dog) (rascal)
- - 0.576 0.807 in Indonesia. The trend of Indonesian bullying patterns
bajingan bangsat also supports information for the Indonesian government
(scoundrel - - 0.102 0.922 to develop software for detecting offensive messages
)
which contain Indonesian bullying words on Twitter. This
bangsat anjing 0.246 0.803 - -
result shows the real data Indonesian cyber bullying on
babi (pig) anjing 0.218 0.808 - -
Twitter, which can support the Indonesian government in
setan anjing
(satan)
0.077 0.852 - - enforcing Indonesian communication law.
monyet bangsat 0.171 1 - - Additionally, the trend of Indonesian bullying patterns on
setan bangsat 0.090 1 0.168 0.926 Twitter can be associated with social anxiety. Perpetrators
babi bangsat, send offensive messages on Twitter which can affect the
0.218 0.808 - -
anjing victims whereby they feel unsafe, become unwell and
bangsat, anjing
0.218 0.808 - - anxious. In this situation, the victims are experiencing a
babi
social anxiety. The social anxiety is a potential cause of
setan bangsat,
anjing
0.077 0.852 - - the victims feeling shyness, anxiety disorders, other
bangsat, anjing emotional and temperamental factors. The victims feel
0.077 0.852 - -
setan, discomfort and fear when they interact with other people
babi, anjing in social media involving the perpetrators judging and
monyet 0.073 0.885 - -
(monkey)
evaluating them.
babi, bangsat, The social anxiety can be related to the self-esteem of the
0.073 0.885 - -
monyet, anjing victims. The perpetrator will evaluate the victims based
anjing, bangsat on the qualities they possess. The victims with low self-
0.218 1 - -
babi,
esteem tend to focus on their own weaknesses rather than
anjing, bangsat
monyet,
0.119 1 0.051 0.805 focusing on their strengths. This will have a huge impact
anjing, bangsat on their psychological well-being and their actions, thus
- - 0.113 0.991
setan, leading to disorders like social anxiety.
Table 8: Comparison between Jakarta and Surabaya The trend of Indonesian bullying patterns visualises that
after Generating Association Rule the perpetrators have a social attitude problem in which
they always feel superior to the victims. They also invite
The results of the comparison after generating association their friends and other followers to be involved in the
rule between Jakarta and Surabaya are shown in table 8. bullying.
In Jakarta, the combination two and three Indonesian
bullying words that have great support count and 5 Conclusion
confidence calculation are “monyet (monkey)” and In this study, we investigate the Indonesian bullying
“bangsat (rascal)”; “setan (satan)” and “bangsat”; “anjing pattern which exists in Indonesian Twitter posts.
(dog)”, “babi (pig)”, and “bangsat”; “anjing”, “monyet” Specifically, we propose the stem Indonesian dictionary
and “bangsat”. This means that the strong rule of the to identify Indonesian bullying words in Twitter Posts,
Indonesian bullying words in Jakarta are “monyet” and and further, to analyse the relationship between
“bangsat”; “setan” and “bangsat”; “anjing”, “babi”, and Indonesian bullying words. Our research has several
“bangsat”; “anjing”, “monyet” and “bangsat”. contributions. First, we practically conceptualize the
When we compare this to Surabaya, the combination two Indonesian lexicon bullying words and further detect
and three Indonesian bullying words which have great Indonesian bullying word. Second, we analysed the
support count and confidence calculation are “bajingan” Indonesian bullying words using Association Rule and
and “bangsat”; “setan” and “bangsat”; “anjing”, ‘setan” FP-Growth to find trends and patterns of Indonesian
122
Proceedings of the Thirty-Seventh Australasian Computer Science Conference (ACSC 2014), Auckland, New Zealand
bullying words in Indonesian Twitter. Experimental Hosking, W. (2013): 'Plug in judges: cyber bullying call
results show that the stem Indonesian dictionary in Rapid to arms', Herald Sun, 13 july, News 17.
Miner satisfies the identification of some Indonesian Hui, JY. (2010): 'The Internet in Indonesia: development
bullying words in Indonesian Twitter posts. Moreover, and impact of radical websites', Studies in Conflict &
this research used Association Rule and FP-Growth Terrorism, vol. 33, no. 2, pp. 171-91.
techniques in Rapid Miner Software to find frequent
Kaman, C. (2007): What country has the most bullies?,
Indonesian bullying patterns which were transformed into
Latitude News, viewed 29/4/2013 2013,
itemsets. The results after generating the data from
http://www.latitudenews.com/story/what-country-has-
repositories have shown that the trend in Indonesian
the-most-bullies/.
bullying patterns which occurred in Indonesian Twitter
posts after being generated were “bangsat”, “anjing”, Kowalski, R, Limber, S & Agatston, P. (2008): 'Cyber
“iblis”, and “setan”. Furthermore, the result of finding Bully: Bullying in the Digital Age', Australia:
Indonesian bullying patterns in Jakarta and Surabaya is Blackwell.
quite similar. Both “bangsat” and “anjing” terms are the Li, Min., Sun, Xiaoxun., Wang, Hua., Zhang, Yanchun.,
most popular used by tweeters who live in Jakarta and and Zhang, Ji. (2011): Privacy-aware access control
Surabaya. with trust management in web service. World Wide
In addition, after generating data from the repository, Web 14, 4 (July 2011), 407-430.
both techniques, FP-Growth and Association Rule, has DOI=10.1007/s11280-011-0114-8
similar results. It means both techniques actually have http://dx.doi.org/10.1007/s11280-011-0114-8
powerful calculation able to find frequent itemsets which Maruli, Ae. (2010): 'Hasil Survey Terbaru Jumlah Pulau
appeared in the database. Indonesia', Antara News.
In conclusion, this research will mine deeply the Maynard, D, Bontcheva, K & Rout, D 2012, 'Challenges
Indonesian bullying words using clustering techniques to in developing opinion mining tools for social media',
cluster and Naïve Bayes to classify some Indonesian Proceedings of@ NLP can u tag#
bullying words. In the future, this research will be user_generated_content.
developed to detect and to predict what kind of Octaviany, K & Waskita, D. (2012): 'Fenomena
Indonesian bullying patterns will occur on Twitter. Cyberbullying Facebook Pelajar Yogya: Tindakan
Cyberbullying ternyata Seperti Fenomena Gunung
6 References Es', Viva News, 27/06/2012, Technology.
Campbell, MA. (2005): 'Cyber bullying: An old problem Pak, A & Paroubek, P. (2010): 'Twitter as a Corpus for
in a new guise?', Australian Journal of Guidance and Sentiment Analysis and Opinion Mining', in LREC.
Counselling, vol. 15, no. 1, pp. 68-76.
Popescu, A-M & Etzioni, O. (2007): 'Extracting product
Chen, Y., Zhou, Y., Zhu, S., & Xu, H. (2012) : 'Detecting features and opinions from reviews', in Natural
Offensive Language in Social Media to Protect language processing and text mining, Springer, pp. 9-
Adolescent Online Safety', in Privacy, Security, Risk 28.
and Trust (PASSAT), 2012 International Conference
on and 2012 International Confernece on Social Reynolds, K., Kontostathis, A. & Edwards, L. (2011):
Computing (SocialCom), pp. 71-80. 'Using machine learning to detect cyberbullying', in
Machine Learning and Applications and Workshops
Commission, AHR (2013): Cyberbullying, Human rights (ICMLA), 2011 10th International Conference on, vol.
and bystanders, Australian Human Rights Commision 2, pp. 241-4.
viewed 29/4/2013 2013,
https://bullying.humanrights.gov.au/cyberbullying- Sun, Xiaoxun., Wang, Hua., Li, Jiuyong., and Zhang,
human-rights-and-bystanders-0. Yanchun. (2012): Satisfying Privacy Requirements
Before Data Anonymization. Comput. J. 55, 4 (April
Go, A., Bhayani, R., & Huang, L. (2009): 'Twitter 2012), 422-437. DOI=10.1093/comjnl/bxr028
sentiment classification using distant supervision', http://dx.doi.org/10.1093/comjnl/bxr028
CS224N Project Report, Stanford, pp. 1-12.
Taboada, M., Brooke, J., Tofiloski, M., Voll, K., & Stede,
Gottfried, K. (2012): One in Ten (12%) Parents Online, M. (2011): 'Lexicon-based methods for sentiment
Around the World Say Their Child Has Been analysis', Computational linguistics, vol. 37, no. 2, pp.
Cyberbullied, 24% Say They Know of a Child Who 267-307.
Has Experienced Same in Their Community, Ipsos,
viewed 1/4/2013 2013, http://www.ipsoshk.com/wp- Telkom. (2012) : Indonesia Menuju 100 Juta Pengguna
content/uploads/2012/04/Cyberbullying-factum- Social Media. Telkom, viewed 29/4/2013 2013,
AP.pdf. http://www.telkomsolution.com/news/it-
solution/indonesia-menuju-100-juta-pengguna-social-
Graphs.net. (2012): Bullying Statistics 2012, graphs.net, media.
viewed 1/5/2013 2013,
http://www.graphs.net/201209/bullying-statistics- Wang, H., Zhang, Y., Cao, J. (2009) : Effective
2012.html. collaboration with information sharing in virtual
universities, IEEE Transactions on Knowledge and
Han, J., Kamber, M., & Pei, J. (2012): Data mining: Data Engineering, Vol. 21, No. 6, pages: 840-853,
concepts and techniques, 3rd edn, Elsevier, June.
Amsterdam.
123
CRPIT Volume 147 - Computer Science 2014
Appendix 1
Data Matrix
Appendix 2
Numeric to Binominal
Appendix 3
Association Rule in Graph
124