
Ideogram Based Sentiment Analysis in Japanese Text

Tyler Thornblade
Introduction
Many papers apply similar techniques across different languages
Two papers in this class introduced a novel technique: assign sentiment at the character (sub-word) level
Opinion Extraction, Summarization and Tracking in News and Blog Corpora. Ku, L., Liang, Y. & Chen, H. AAAI 2006 Symposium. (Presented by John)
Experiments with sentimental dictionary based classifier and CRF model. Huang, R., Sun, L. & Pan, L. Sixth NTCIR Workshop, 2007. (Presented by Cem)
Why are ideograms different?
Unlike phonetic characters, ideograms have
innate meanings
Are they sentiment bearing?
Example: 気
Spirit
Mind
Air
Mood
Note that ideograms seldom have just one meaning; it is more typical for one to have a synset or a group of related synsets
Ku et al.
Create a sentiment dictionary
General Inquirer, Chinese Network Sentiment
Dictionary
Expanded dictionary using thesauri
 Tong2yi4ci2ci2lin2 (Mei et al. 1982)
 Academia Sinica Bilingual Ontological Wordnet
(Huang et al. 2008)
Performance of Ku et al. & Huang et al.
Results were fair but not impressive
Neither paper outlined results at the word
level
Hypothesis
These techniques will not be as effective in
Japanese as in Chinese
Why?
Bag-of-words type approach ignores
compositional understanding
Japanese uses phonetic script (kana) in addition to ideograms
A Short Background on Japanese
Although linguistically unrelated, Chinese
and Japanese both use Chinese characters
extensively
Many multi-character compounds in
Japanese are borrowings
Writing systems
Chinese characters (Kanji)
Script (Hiragana, Katakana)
 Words that mix characters with script (okurigana)
 Words that are entirely script (kana)
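The distinction between character-only, mixed, and script-only words can be sketched with Unicode block checks. This is a minimal illustration, not part of the original experiment: the function names `script_of` and `word_type` are hypothetical, and only the basic Hiragana, Katakana, and CJK Unified Ideographs blocks are covered.

```python
def script_of(ch: str) -> str:
    """Classify one character by Unicode block (basic ranges only)."""
    cp = ord(ch)
    if 0x4E00 <= cp <= 0x9FFF:   # CJK Unified Ideographs (kanji)
        return "kanji"
    if 0x3040 <= cp <= 0x309F:   # Hiragana
        return "hiragana"
    if 0x30A0 <= cp <= 0x30FF:   # Katakana
        return "katakana"
    return "other"


def word_type(word: str) -> str:
    """Label a word as kanji-only, kana-only, mixed, or other."""
    scripts = {script_of(ch) for ch in word}
    has_kanji = "kanji" in scripts
    has_kana = bool(scripts & {"hiragana", "katakana"})
    if has_kanji and has_kana:
        return "mixed"       # e.g. a kanji stem followed by okurigana
    if has_kanji:
        return "kanji-only"
    if has_kana:
        return "kana-only"   # no ideograms to score at the character level
    return "other"


for w in ["無用", "美しい", "きれい", "テレビ"]:
    print(w, word_type(w))
```

A filter like this would let a character-level scorer separate out kana-only words, which carry no ideograms to score.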
Japanese Compound Composition
Five classes
1. Both characters have the same meaning.
2. The characters have opposite meanings.
3. The top character modifies the bottom
character.
4. The bottom character is the target, direct
object, or complement of the top character.
5. The top character negates (“flips”) the
meaning of the bottom character.
The first two are fine; the last three could present a problem for the Ku et al. approach
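To see why class 5 is a problem for character averaging, consider a toy sketch. Everything here is an assumption for illustration: the character scores are invented, the `NEGATORS` set is a guessed list of negating prefixes, and "top character" is taken to mean the first character of a horizontally written compound.

```python
# Hypothetical character scores in [-1, 1]; values are invented for illustration.
CHAR_SCORE = {"用": 0.4, "無": -0.1, "不": -0.2, "便": 0.5, "利": 0.6}

# Assumed list of characters that negate what follows (class 5 compounds).
NEGATORS = {"無", "不", "非", "未"}


def average_score(word: str) -> float:
    """Bag-of-characters score: mean of the character scores."""
    return sum(CHAR_SCORE.get(ch, 0.0) for ch in word) / len(word)


def negation_aware_score(word: str) -> float:
    """Flip the sign of the remainder when the first character is a negator."""
    if len(word) >= 2 and word[0] in NEGATORS:
        return -average_score(word[1:])
    return average_score(word)


print(average_score("無用"))         # positive: averaging ignores the negation
print(negation_aware_score("無用"))  # negative: the negator flips the remainder
```

With these invented scores, plain averaging gives 無用 ("useless") a positive score because the mildly negative 無 is outweighed by 用, while the negation-aware variant gives it a negative one.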
Experiment
Start with the sentiment dictionary of Kaji and Kitsuregawa (2007) (presented by Tyler) => 10,000 words
Clean to remove bigrams and trigrams => 2386 words
Apply Ku et al.
Generate sentiment scores for the 954
Chinese characters
Generate sentiment scores for the words in
the dictionary
Ignore magnitude and score result by
comparing sign of Kaji & Kitsuregawa to sign
of program output
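The steps above can be sketched as follows. This is a rough reconstruction under stated assumptions, not the code used in the experiment: character scores follow the normalized-frequency form that Ku et al. are understood to use (the relative frequency of a character in positive words minus its relative frequency in negative words, divided by their sum), a word score is the mean of its character scores, and the toy lexicon and all function names are hypothetical.

```python
from collections import Counter


def char_scores(lexicon):
    """Derive character scores in [-1, 1] from a word -> signed score lexicon."""
    pos, neg = Counter(), Counter()
    for word, score in lexicon.items():
        target = pos if score > 0 else neg if score < 0 else None
        if target is not None:
            target.update(word)  # counts every character occurrence in the word
    total_pos, total_neg = sum(pos.values()), sum(neg.values())
    scores = {}
    for ch in set(pos) | set(neg):
        p = pos[ch] / total_pos if total_pos else 0.0  # share among positive chars
        n = neg[ch] / total_neg if total_neg else 0.0  # share among negative chars
        scores[ch] = (p - n) / (p + n)
    return scores


def word_score(word, scores):
    """Mean score of the characters that have a score; 0.0 if none do."""
    known = [scores[ch] for ch in word if ch in scores]
    return sum(known) / len(known) if known else 0.0


def sign(x):
    return (x > 0) - (x < 0)


def sign_agreement(lexicon, scores):
    """Fraction of words whose predicted sign matches the dictionary sign."""
    hits = sum(sign(word_score(w, scores)) == sign(g) for w, g in lexicon.items())
    return hits / len(lexicon)


# Toy lexicon (invented values); train and test on the same data, as in the slides.
toy = {"良好": 1.0, "最良": 0.8, "不良": -0.9, "不安": -0.7}
scores = char_scores(toy)
print(scores["良"], sign_agreement(toy, scores))
```

Magnitudes are discarded in `sign_agreement`, matching the scoring rule above: only the sign of the program output is compared with the sign in the source dictionary.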
Caveats
This is a proof of concept; there was
insufficient time (and resources) to develop
a new sentiment dictionary and/or perform
an annotation study
Train and test on the same data
Results not comparable to other systems
Should interpret as an upper bound on
performance of this method
 We start with essentially perfect knowledge of the
sentiment value of words
 Our results should be near optimal for this method
Results

Oh no! Weren’t we expecting poor results?
Detailed results for characters
Error Analysis
20% of the errors were selected for detailed
analysis
50 false positives
50 false negatives
These were further pruned so that only multi-character compounds were considered
Error Analysis, False Positives
33% of errors explained by lack of compositional knowledge
6.7% class 5
27% class 3
Error Analysis, False Negatives
54.8% of errors explained by lack of compositional knowledge
3.2% class 5
51.6% due to “的”
Other errors
Script characters
We can’t analyze words entirely made up of
script
 34.7% of all errors were due to this
Words that mix script with characters may
introduce additional noise
Problems with source data
After cleaning, the dictionary still contained 4-5% bigrams
Some data from Kaji & Kitsuregawa is
unintuitive
 E.g. 無用 and 不用, both of which mean “useless” yet
received high positive sentiment scores and showed
Evaluation of lexicon
Pulled a list of 500 adjective phrases randomly selected from the Web
After removing parse errors and duplicates,
405 unique phrases
No overlap with development set
Balance: 158 positive, 150 negative, 97
neutral
 Based on human annotation
 Two annotators, Kappa 0.73
Baseline: Turney 2002, co-occurrence in a
window
Turney used “excellent” and “poor”; they use 最高 (“best”) and 最低 (“worst”)
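The Turney-style baseline can be sketched as a semantic-orientation score: the log ratio of how often a phrase co-occurs with the positive seed versus the negative seed, corrected for how common each seed is. This is a minimal sketch under assumptions: the text is assumed to be already segmented into tokens, the window size and the 0.01 smoothing constant are illustrative defaults, and `near_count` and `so_pmi` are hypothetical names.

```python
import math


def near_count(tokens, phrase, seed, window=10):
    """Count occurrences of `phrase` with `seed` within `window` tokens."""
    seed_positions = [i for i, t in enumerate(tokens) if t == seed]
    return sum(
        1
        for i, t in enumerate(tokens)
        if t == phrase and any(abs(i - j) <= window for j in seed_positions)
    )


def so_pmi(near_pos, near_neg, hits_pos, hits_neg, smoothing=0.01):
    """Semantic orientation: positive if the phrase leans toward the positive seed.

    near_pos / near_neg: co-occurrence counts with the positive / negative seed.
    hits_pos / hits_neg: total counts of the seeds themselves (must be > 0).
    smoothing: small constant (assumed value) that avoids division by zero.
    """
    return math.log2(
        ((near_pos + smoothing) * hits_neg) / ((near_neg + smoothing) * hits_pos)
    )


# Toy pre-segmented sentence; 最高 and 最低 are the seeds named in the slides.
tokens = ["この", "映画", "は", "最高", "で", "面白い"]
pos_near = near_count(tokens, "面白い", "最高", window=3)
neg_near = near_count(tokens, "面白い", "最低", window=3)
print(so_pmi(pos_near, neg_near, hits_pos=1, hits_neg=1))
```

A positive value classifies the phrase as positive and a negative value as negative; with real data the counts would come from a large corpus rather than one sentence.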
Conclusions
Overall: results were good. As a proof of
concept, this provides support for additional
work in this area.
Hypothesis was accurate in that
approximately 60% of the errors were
explainable in terms of missing linguistic
knowledge
Next steps
Perform a more rigorous study of this nature
Use Kaji & Kitsuregawa dictionary and do an
annotation study to show the true
performance of this approach
Create a better sentiment dictionary and do
the same
 Kobayashi’s Evaldic might be one resource
Apply compositional features
Unclear if lexical data of this nature is
available
Apply word-based techniques to script
characters
Questions