
The Impact of NLP on Turkish Sentiment Analysis

E. Yıldırım, F. S. Çetin, G. Eryiğit, T. Temel
İTÜ Computer Eng., Informatic Faculty

Abstract

Sentiment analysis on English texts is a highly popular and well studied topic. On the other hand, the research in this field for morphologically rich languages is still in its infancy. Turkish is an agglutinative language with a very rich morphological structure. For the first time in the literature, this paper investigates and reports the impact of the natural language preprocessing layers on the sentiment analysis of Turkish social media texts. The experiments show that the sentiment analysis performance may be improved by nearly 5 percentage points, yielding a success ratio of 78.83% on the used data set.

1 Introduction

Sentiment analysis has become a very popular research area because of the need to track and manage population tendency. Many companies today work on this area in order to meet customer expectations and demands.

Social microblogging platforms (e.g. Twitter and Facebook) offer an opportunity to get a huge amount of easily accessible and processable data. Users of microblogging platforms write about their personal lives and their own opinions about political cases, economic changes, companies and their products.

With the emergence of social media platforms, sentiment analysis studies have shifted from document level analysis (Bruce and Wiebe, 1999; Wiebe et al., 1999; Wiebe, 2000) towards sentence or phrase level analysis (Morinaga et al., 2002; Yi et al., 2003; Kim and Hovy, 2004; Yu and Hatzivassiloglou, 2003; Wilson et al., 2005). Recent years showed that syntactic and/or semantic analysis outperforms baseline sentiment analysis methods in many areas such as aspect-based and comparative opinion mining (Hu and Liu, 2004; Liu, 2012; Balahur et al., 2014). In order to reach this level of analysis, many other natural language preprocessing stages are required, e.g. tokenization, normalization and part-of-speech tagging.

TÜRKİYE BİLİŞİM VAKFI BİLGİSAYAR BİLİMLERİ VE MÜHENDİSLİĞİ DERGİSİ (2014, Vol. 7, No. 1)
As in all other natural language processing (NLP) problems, the most widely studied language for sentiment analysis is English. However, studies for morphologically rich languages are not mature yet. Abdul-Mageed et al. (2014) used a supervised, two-stage classification approach employing morphological, dialectal and genre specific features besides basic ones for a morphologically rich language, Arabic. Jang and Shin (2010) propose an approach for agglutinative languages and test their method on Korean short movie reviews and news articles. Wiegand et al. (2010) investigate the impact of negation in sentiment analysis of German.

In the literature, it has been shown several times that Turkish, due to its highly inflectional and derivational structure, poses many different problems for different NLP tasks when compared to morphologically poor languages. Because of this property, previous NLP research on the Turkish language pioneered the studies for many similar languages. On the other hand, sentiment analysis studies for Turkish are very preliminary; although there exist a couple of studies on sentiment classification of movie reviews, political news and fairytales (Vural et al., 2013; Kaya et al., 2013; Boynukalin, 2012; Seker and Al-naami, 2013), there exist very few studies on sentiment analysis of social media posts (Çetin and Amasyalı, 2013a; 2013b).

With the emergence of new tools dealing with automatic language processing of social media texts (Eryiğit, 2014), it is now becoming possible to integrate them into higher level applications, i.e. sentiment analysis in our case. But the following issues still remain open questions:

1. the impact of each NLP layer on sentiment analysis.
2. the information (e.g. stems, main POS tags, inflectional features) to use from the outputs of the beneficial layers.

In this paper, for the first time in the literature, we investigate and report the impact of the preprocessing layers (namely, tokenization, normalization, morphological analysis and disambiguation) on the sentiment analysis of Turkish social media texts. In order to show the maximum sentiment analysis performance to be achieved with flawless NLP tools, we used a hand-annotated sentiment corpus with gold-standard linguistic features.

2 Turkish

Turkish is an agglutinative language where each stem may be inflected by multiple suffixes. Every new suffix concatenation may change the meaning of the word or redefine its syntactic role within the sentence.

This feature of Turkish leads to relatively long words (having a higher number of characters when compared to other languages). As an ordinary example of this situation, the Turkish word "yapabilirmişcesine" can be translated into English as "as if he/she is able to do". In addition, the example shows that the same English statement is expressed with a lower word count (smaller mean sentence length) on the Turkish side. Therefore, semantic analysis of Turkish social media texts runs a higher risk of being defeated by the erroneous writings within this informal domain. The various problems observed in Turkish tweets are presented in detail in Torunoğlu and Eryiğit (2014); these are mainly missing vowels and diacritics, the usage of emoticons, slang words, emo-style writings, spoken accents and a high occurrence of spelling errors. The lower word count within a sentence leads to strict dependencies between words in Turkish, and a single misspelled word can ruin the understandability of the whole sentence. This indicates that the normalization preprocessing stage is more important for Turkish than for English.

The task performed by POS tagging in other languages is performed in two stages for Turkish: morphological analysis and morphological disambiguation. Morphological analysis of a single word can produce several possible analyses regardless of the context in the sentence. However, only one of them is correct in its context. The correct analysis can be selected by a morphological disambiguation process on the morphological analysis results. Linguistic information about the word and its possible relations with other words in the sentence can be extracted from the correct analysis.

3 The Used Data Set

For this study, we collected a Turkish Twitter sentiment corpus mainly from the telecommunication domain. The data is retrieved from the Twitter API by querying a predetermined list of keywords. The time frame of the collected data was between May 10th, 2012 and July 7th, 2013. We cleaned the corpus of non-Turkish tweets through a language identifier based on a "Language Detection Library for Java"1. For the manual annotation of our corpus, we used TURKSENT (Eryiğit et al., 2013) - a sentiment annotation tool which allows us to annotate the corpus on the following layers: general and target based sentiment, text normalization, morphology and syntax. For this study, we used only the general sentiment, the normalization and the morphological annotation layers of the tool.

Since the sentiment annotations depend on subjective decisions of the human annotators, we applied an inter-annotator agreement filter to increase the confidence level of our sentiment annotations. Our final data set consists of 12790 tweets that are manually normalized, morphologically analyzed and classified into 3 sentiments (3541 positive, 4249 negative and 5000 neutral) agreed on by two human annotators.
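The inter-annotator agreement filter of Section 3 can be illustrated with a minimal sketch. It assumes that a tweet is kept only when the two annotators assign the same general sentiment label, which is how the description above reads; the record layout `(text, label_a, label_b)`, the function name `agreement_filter` and the toy tweets are hypothetical and not taken from the corpus.

```python
from collections import Counter


def agreement_filter(annotations):
    """Keep only tweets on which both annotators chose the same label.

    `annotations` is a list of (tweet_text, label_a, label_b) tuples; this
    record layout is an assumption made for illustration.
    """
    return [(text, a) for text, a, b in annotations if a == b]


# Hypothetical toy annotations, not taken from the corpus.
raw = [
    ("iyi hizmet", "positive", "positive"),
    ("hat çekmiyor", "negative", "negative"),
    ("fatura geldi", "neutral", "positive"),  # disagreement: dropped
]
agreed = agreement_filter(raw)
class_counts = Counter(label for _, label in agreed)
print(class_counts)
```

Counting the surviving labels, as in the last two lines, gives the kind of per-class distribution reported for the final data set.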

4 Feature Extraction Methods

In this study, we treat the sentiment detection of a tweet as a multi-class classification problem. We used support vector machines (SVM) to classify the tweets into one of three classes (positive, negative, neutral). When we extract unigrams from all collected data without preprocessing and feature filtering, we get 97472 unique features. This number of features is extremely large for machine learning algorithms, because more features mean more training time and more resources. In addition to time and resource constraints, irrelevant features may also ruin the stability of the trained model. Since feature extraction is an indispensable stage of machine learning algorithms, we applied an extraction method utilizing Inverse Document Frequency (IDF). While Term Frequency is easier and simpler than IDF, it is not convenient if there are lots of recurring parts of texts, which is the case for our study. Tweets are treated as single documents while calculating the document frequencies in IDF. After the calculation of the IDF values of all unigram features, we filter them according to our proposed filtering algorithm, MinClosestTh, given below.

MinClosestTh: A small IDF value indicates a characteristic feature for a given class. But, in order for a feature to be discriminative between different classes, the difference between its IDF values should be bigger than a given threshold. In other words, a feature having similar IDF values for two classes does not help for the discrimination of these classes. For example, a stop word or a keyword which is used to retrieve data from the Twitter API will have similar small IDF values for all classes. In the light of these observations, after testing with several feature extraction methods, we found that MinClosestTh performed the best. In this approach (Equation 1), we find the difference between the smallest and the second smallest IDF value of a feature among all classes. The features falling outside of this threshold are removed from the feature set.

$$ |\mathrm{minIDF} - \mathrm{medianIDF}| > \mathrm{threshold} \qquad (1) $$

Figure 1 shows the histogram of the |minIDF - medianIDF| difference distributions. One should notice that a determined threshold value will also determine the number of features to be used in the experiments: all the words falling into the bins greater than the threshold value will be included in the feature set.
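The MinClosestTh filter described in this section can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the per-class IDF uses add-one smoothing so that a term missing from a class still has a finite value (the text does not say how such terms are handled), and the function names `class_idf` and `min_closest_th`, the toy corpus and the threshold of 0.5 are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def class_idf(tweets_by_class):
    """Return {term: {class: idf}} with each tweet treated as one document.

    Assumption: add-one smoothing, idf = log((N_c + 1) / (df_c + 1)), so a
    term that never occurs in a class still gets a finite (large) IDF value.
    """
    idf = defaultdict(dict)
    vocab = {tok for docs in tweets_by_class.values() for doc in docs for tok in doc}
    for label, docs in tweets_by_class.items():
        df = Counter(tok for doc in docs for tok in set(doc))
        for term in vocab:
            idf[term][label] = math.log((len(docs) + 1) / (df[term] + 1))
    return idf


def min_closest_th(idf, threshold):
    """Keep terms whose two smallest per-class IDF values differ by more than threshold."""
    kept = set()
    for term, per_class in idf.items():
        smallest, second = sorted(per_class.values())[:2]
        if abs(smallest - second) > threshold:
            kept.add(term)
    return kept


# Hypothetical, already tokenized toy corpus; "hat" plays the role of a
# retrieval keyword that occurs equally often in every class.
corpus = {
    "positive": [["hat", "iyi"], ["iyi", "hizmet"], ["hat", "iyi"]],
    "negative": [["hat", "kötü"], ["kötü", "hizmet"], ["hat", "kötü"]],
    "neutral": [["hat", "fatura"], ["hizmet", "fatura"], ["hat", "fatura"]],
}
features = min_closest_th(class_idf(corpus), threshold=0.5)
print(sorted(features))
```

In this toy corpus the words that are evenly spread over the classes ("hat", "hizmet") have identical IDF values everywhere and are filtered out, while the class-specific words survive. With three classes the second smallest value is also the median, which is why Equation 1 writes it as medianIDF.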

1 It is available on https://code.google.com/p/language-detection/

In order to select a good threshold value for further experiments, we investigate the sentiment analysis performance with different threshold values (…). These are given in the small line chart in Figure 1. As seen from the figure, the maximum f-measure is achieved at …. F-measure is not the only metric to select the optimum threshold, since the total feature count should also be considered. For example, the number of features in the feature set is … when the threshold is chosen as …, and … when it is …. Although the difference between f-measures is not dramatic, the lesser number of features is preferable. We selected … for further experiments since, as seen from the figure, the performance drops consistently without any important difference in feature counts.

Figure 1: |minIDF - medianIDF| Histogram and Related Performance

5 Natural Language Preprocessing Layers

Turkish is an agglutinative language and stems can theoretically be transformed into an unlimited number of variations with derivational affixes. Moreover, all these different variations of a word may not make a difference on the sentiment classification of tweets. Therefore, we want to polarize features which have a similar impact on sentiment to the same pole, and make the difference between poles explicit. We applied mainly three different NLP preprocessing layers (explained in previous sections) to transform features from their original versions to the desired representations. Below we give the information extracted from the output of these layers.

Normalization: We used the normalized forms of the words before extracting the features. For instance, "tşkkrler" is normalized as "teşekkürler" (thanks).

Stemming: Stems of words have more general coverage than surface forms. To map different surface forms of a word onto one simple token, we used stemming by deleting all inflectional groups and tags from its correct morphological analysis. For instance, "uzmanlar" (specialists), "uzmanlığı" (his/her specialty) and "uzmanlık" (specialty) are derived from the same stem "uzman" (specialist). All three forms are turned into their stem "uzman".

Negation: As stated in Wiegand et al. (2010), the detection of negation needs extra treatment in morphologically rich languages, where the negation may be realized within the word with an affixation rather than as a separate individual word. The case holds very frequently for Turkish; that is why our motivation in this section is to model the negation for sentiment analysis.

2 Due to space constraints, we only provide here our best model.
3 Since we have only three classes, the second smallest IDF is represented as medianIDF in Equation 1.

Model#  Model Name                          Avg. F-measure  Accuracy  Feature#
1       no_normalization–no_preprocessing   73.38           73.72     78025
2       normalization                       78.05           78.28     39788
3       normalization-stem                  78.35           78.63     17855
4       normalization-stem-neg              78.83           79.09     18493
5       normalization-stem-neg-adj          77.93           78.27     23613

Table 1: Sentiment Analysis Experiments Results

Negative indicators - such as the inflectional tags at the output of morphological analysis: "+Neg", "+WithoutHavingDoneSo" (as in "regardless of" or "without stopping") - have the power to turn the meaning of words into the opposite. For instance, "çekmiyor" (meaning "there is no signal" in the telco domain) has a morphological analysis such as "çek+Verb+Neg+Prog1+A3sg", where the stem "çek" translates literally as "to pull" in English. If a feature is extracted from this word, we represent it as "çek+Neg". In addition, the negation word "değil" (meaning "not" in English) has the same negative effect on preceding words. We put a negation tag on a word if it contains negative indicators, or has "değil" as its successor. For instance, "iyi değil" (not good) is represented as "iyi+Neg". Furthermore, we added a negation tag to an adjective if its successor is a negative verb: "Net göremiyorum." (I can't see clearly.) is transformed to "Net+Neg gör+Neg". When a word received a double negation tag because of these conditions, we removed all the negation tags belonging to this word. For example, "sessiz değil" (not silent; the "siz" suffix corresponds to "less", as in "noiseless") is converted to "ses", not to "ses+Neg+Neg".

Using adjectives. We made an extra effort for adjectives in this research, because of the general belief that adjectives have a more direct impact on sentiment analysis than other word types. We added adjectives to the feature set without exposing them to filtering by the feature extraction methods defined previously. Although we applied the other NLP preprocessing methods to adjectives just like any other word type, we also used the surface form of adjectives as an additional feature instead of using only the preprocessed versions. For example, we represent the adjective "tatsız" (tasteless) with two different features, "tat+Neg" (taste+Neg) and "tatsız".

6 Experiments and Discussions

In all of our experiments, we used SVM with a linear kernel. In order to increase the confidence level of the sentiment analysis results, we applied 10-fold cross-validation. The results are presented in terms of the macro average of all iterations in Table 1.

We tested 5 different NLP preprocessing models, where each of them adds a new processing layer on top of the previous one.
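The stemming, negation and adjective layers of Section 5, which are stacked on top of normalization in Models #3 to #5, can be sketched as one small function. This is an illustration rather than the authors' code. It assumes the input is a list of (normalized surface form, disambiguated analysis) pairs in the "stem+Tag+Tag" notation shown above. Treating the "Without" tag as a negative indicator, dropping "değil" as a feature of its own, and the example analysis strings are assumptions; the names `parse` and `layered_features` are hypothetical.

```python
NEG_TAGS = {"Neg", "WithoutHavingDoneSo", "Without"}  # "Without" is an assumption
NEGATION_WORD = "değil"


def parse(analysis):
    """Split e.g. 'çek+Verb+Neg+Prog1+A3sg' into ('çek', ['Verb', 'Neg', ...])."""
    stem, *tags = analysis.replace("^DB", "").split("+")
    return stem, tags


def layered_features(tokens, use_adjectives=False):
    """Map (normalized surface, disambiguated analysis) pairs to features."""
    parsed = [parse(analysis) for _, analysis in tokens]
    features = []
    for i, ((surface, _), (stem, tags)) in enumerate(zip(tokens, parsed)):
        if stem == NEGATION_WORD:
            continue  # assumption: "değil" only marks the word before it
        negations = sum(tag in NEG_TAGS for tag in tags)
        if i + 1 < len(parsed):
            next_stem, next_tags = parsed[i + 1]
            if next_stem == NEGATION_WORD:
                negations += 1
            elif "Adj" in tags and "Verb" in next_tags and NEG_TAGS & set(next_tags):
                negations += 1
        # Exactly one indicator negates the stem; a double negation cancels out.
        features.append(stem + "+Neg" if negations == 1 else stem)
        if use_adjectives and "Adj" in tags:
            features.append(surface)  # Model #5: keep the adjective's surface form
    return features


# Illustrative analyses written by hand in the paper's notation.
print(layered_features([("çekmiyor", "çek+Verb+Neg+Prog1+A3sg")]))
print(layered_features([("net", "net+Adj"),
                        ("göremiyorum", "gör+Verb+Able+Neg+Prog1+A1sg")]))
```

Calling the function with `use_adjectives=True` corresponds to the last model in Table 1, where an adjective contributes both its processed form and its surface form.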

The first line of the table (no_normalization–no_preprocessing) presents our baseline model. This test is performed on the original version of the data set, in other words without applying any preprocessing during the selection of the feature set. The further experiments are evaluated with respect to their preceding experiments, and the performance improvement of the best model is reported with respect to this baseline.

Table 1 shows that the normalization stage (Model #2) contributes to the sentiment analysis, and increases the overall success by about 5 percentage points. On the other hand, although the addition of stemming (Model #3) results in a slight improvement on top of Model #2, this improvement is not proven to be statistically significant according to McNemar's test. Despite this, Model #3 is considered very valuable since the total number of features is reduced by more than half (39788 → 17855). As a result, the lesser number of features gives us the ability to train our classifier using less time and fewer resources, as we mentioned in Section 4. This opens the possibility of adding more valuable training data to our machine learning algorithm, especially for active learning experiments.

Our final two experiments (Model #4 and Model #5) deal with the addition of some morphological features into sentiment analysis (detailed in Section 5). Although we observed a slight improvement in the results with the addition of negation (Model #4), this improvement is again not statistically significant, whereas it also increases the total number of selected features. A similar case holds for Model #5, again with no statistical significance, but this time with a small decrease.

As the conclusion, in this study, we showed that normalization is an indispensable stage for sentiment analysis, whereas stemming is also very valuable for further studies (e.g. active learning). However, our tested model for the addition of morphological information into the system does not seem well fitted for this domain. Nevertheless, we may not conclude that morphological information such as negation has no impact on sentiment analysis. We rather sense that we need to do further research on the inclusion of morphological features, such as using them as separate features instead of the approach defined here (the concatenation Stem+Neg).

7 Conclusion and Future Work

Feature extraction methods allow us to decrease the training time of classifiers, and they also have a positive impact on the sentiment analysis success rate. We achieved a higher sentiment analysis success rate with a smaller number of features. In addition, we showed how normalization improves sentiment analysis on Turkish social media posts. With the normalization preprocessing, we increased the success rate of sentiment analysis from … to …, which is the … relative improvement.

By the addition of morphological features we saw a slight improvement from 78.05% to 78.83%, which is not statistically significant according to McNemar's test. However, stemming, which is the first morphological feature that we applied, dramatically reduced the number of features, with the advantage of being able to train models with more data. For our future studies, we will work on developing automatic NLP tools to make use of morphological information. Thereby, we want to build an environment for further linguistic analysis, such as syntax and semantics. We expect to increase sentiment analysis success by such deep analyses of language.

8 Acknowledgments

closed for blind review

References

Muhammad Abdul-Mageed, Mona Diab, and Sandra Kübler. 2014. SAMAR: Subjectivity and sentiment analysis for Arabic social media. Computer Speech & Language, 28(1):20–37.

Alexandra Balahur, Rada Mihalcea, and Andrés Montoyo. 2014. Computational approaches to subjectivity and sentiment analysis: Present and envisaged methods and applications. Computer Speech & Language, 28(1):1–6.

Zeynep Boynukalin. 2012. Emotion analysis of Turkish texts by using machine learning methods. Ms.

Rebecca F Bruce and Janyce M Wiebe. 1999. Recognizing subjectivity: a case study in manual tagging. Natural Language Engineering, 5(2):187–205.

Mahmut Çetin and M Fatih Amasyali. 2013a. Active learning for Turkish sentiment analysis. In Innovations in Intelligent Systems and Applications (INISTA), 2013 IEEE International Symposium on, pages 1–4. IEEE.

Gülşen Eryiğit, Fatih Samet Çetin, Meltem Yanik, Tanel Temel, and Ilyas Çiçekli. 2013. Turksent: A sentiment annotation tool for social media. In Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, pages 131–134, Sofia, Bulgaria, August. Association for Computational Linguistics.

Gülşen Eryiğit. 2014. ITU Turkish NLP web service. In Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL), Gothenburg, Sweden, April. Association for Computational Linguistics.

Minqing Hu and Bing Liu. 2004. Mining and summarizing customer reviews. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '04, pages 168–177, New York, NY, USA. ACM.

Hayeon Jang and Hyopil Shin. 2010. Language-specific sentiment analysis in morphologically rich languages. In Proceedings of the 23rd International Conference on Computational Linguistics: Posters, COLING '10, pages 498–506, Stroudsburg, PA, USA. Association for Computational Linguistics.

Mesut Kaya, Guven Fidan, and I Hakkı Toroslu. 2013. Transfer learning using twitter data for improving sentiment classification of Turkish political news. In Information Sciences and Systems 2013, pages 139–148. Springer.

Soo-Min Kim and Eduard Hovy. 2004. Determining the sentiment of opinions. In Proceedings of the 20th international conference on Computational Linguistics, page 1367. Association for Computational Linguistics.

Bing Liu. 2012. Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies, 5(1):1–167.

Satoshi Morinaga, Kenji Yamanishi, Kenji Tateishi, and Toshikazu Fukushima. 2002. Mining product reputations on the web. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 341–349. ACM.

Sadi Evren Seker and Khaled Al-naami. 2013. Sentimental analysis on Turkish blogs via ensemble classifier. In Proceedings Of The 2013 International Conference On Data Mining. DMIN.

Dilara Torunoğlu and Gülşen Eryiğit. 2014. A cascaded approach for social media text normalization of Turkish. In 5th Workshop on Language Analysis for Social Media (LASM) at EACL, Gothenburg, Sweden, April. Association for Computational Linguistics.

A Gural Vural, B Barla Cambazoglu, Pinar Senkul, and Z Ozge Tokgoz. 2013. A framework for sentiment analysis in Turkish: Application to polarity detection of movie reviews in Turkish. In Computer and Information Sciences III, pages 437–445. Springer.

Janyce Wiebe. 2000. Learning subjective adjectives from corpora. In AAAI/IAAI, pages 735–740.

Janyce M Wiebe, Rebecca F Bruce, and Thomas P O'Hara. 1999. Development and use of a gold-standard data set for subjectivity classifications. In Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics, pages 246–253. Association for Computational Linguistics.

Michael Wiegand, Alexandra Balahur, Benjamin Roth, Dietrich Klakow, and Andrés Montoyo. 2010. A survey on the role of negation in sentiment analysis. In Proceedings of the Workshop on Negation and Speculation in Natural Language Processing, NeSp-NLP '10, pages 60–68, Stroudsburg, PA, USA. Association for Computational Linguistics.

Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. 2005. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, HLT '05, pages 347–354, Stroudsburg, PA, USA. Association for Computational Linguistics.

Jeonghee Yi, Tetsuya Nasukawa, Razvan Bunescu, and Wayne Niblack. 2003. Sentiment analyzer: Extracting sentiments about a given topic using natural language processing techniques. In Data Mining, 2003. ICDM 2003. Third IEEE International Conference on, pages 427–434. IEEE.

Hong Yu and Vasileios Hatzivassiloglou. 2003. Towards answering opinion questions: Separating facts from opinions and identifying the polarity of opinion sentences. In Proceedings of the 2003 conference on Empirical methods in natural language processing, pages 129–136. Association for Computational Linguistics.
