Proceedings of the 30th Hangul and Korean Information Processing Conference (2018)

Korean sentiment analysis using multi-channel CNN


Min Kim†, Jeunghyun ByunO,‡, Chunghee Lee‡, Yeonsoo Lee‡

Stanford University†, NCSoft Corp.‡


tomas76@stanford.edu, {jhbyun, forever73, yeonsoo}@ncsoft.com

Multi-channel CNN for Korean Sentiment Analysis


Min Kim† , Jeunghyun ByunO,‡, Chunghee Lee‡ , Yeonsoo Lee‡
Stanford University† , NCSOFT Corp.‡

Abstract

This paper proposes a multi-channel CNN that classifies the sentiment of Korean sentences by passing their morphemes, syllables, and graphemes through different convolutional layers simultaneously. For colloquial sentences containing typos, features that a morpheme-based CNN cannot extract can instead be extracted from syllables or graphemes. Although morpheme-based CNNs are widely used in Korean sentiment analysis, the multi-channel CNN model in this paper classifies the sentiment of sentences more accurately by considering morphemes, syllables, and graphemes at the same time. The proposed model classified sentence sentiment about 4.8% more accurately on baseball comment data and about 1.3% more accurately on movie review data than a morpheme-based CNN.

Keywords: Multi-channel CNN, Sentiment Analysis, Text Classification, CNN

1. Introduction

Sentiment analysis is a natural language analysis technology that identifies the polarity of the emotions contained in text. Sentiment analysis is needed to automatically classify the vast number of comments, tweets, and product reviews posted on the Internet and to extract the necessary information; it is used to identify user opinions or to predict election results [1]. Traditional machine learning methods such as Naive Bayes and logistic regression were widely used in earlier sentiment analysis, but recently deep learning-based techniques have recorded high performance in many fields [2]. Although interest in opinion mining and sentiment analysis technology is growing, most sentiment analysis studies focus on English [3]. Since every language has different grammatical rules, and Korean and English differ grammatically, inaccurate results may be produced in Korean sentiment analysis if the techniques and models used for English are applied as-is. High performance can be expected only when sentiment analysis uses models and techniques suited to the linguistic characteristics of Korean [4].

In this paper, we propose a multi-channel CNN (Convolutional Neural Network) model that is effective for sentiment classification of Korean sentences. The proposed model uses three different convolutional layers to receive morphemes, syllables, and graphemes as input. Online text in particular contains many abbreviations and spelling errors, which can cause substantial information loss at the morpheme level. By extracting feature vectors from graphemes and syllables, information lost to typos and grammatical errors can be recovered at the grapheme level, and compound words and abbreviations can be handled at the syllable level. Moreover, for new words not seen in training, features that cannot be extracted from morphemes can be extracted from graphemes and syllables. By taking the feature vectors found from morphemes, graphemes, and syllables all into account, the model in this paper can classify the sentiment of a sentence more accurately than existing CNN models, which we confirmed on two different online data sets.

2. Related research

There are studies that have extended Y. Kim's CNN-based text classification research [5] and applied it to Korean [6, 7]. Y. Kim proposed a word-based CNN in [5], but in Korean it suffers from a serious OOV (out-of-vocabulary) problem: because Korean is an agglutinative language, countless word forms can be created. Morphemes or syllables are therefore often used as input instead of words. Morphemes, the smallest meaningful units in Korean, are actively used in Korean text classification research. In [6], Naver movie reviews were segmented into morphemes using Konlpy's Twitter morpheme analyzer to train word2vec, and the resulting vectors were used as input to a CNN. The overall model structure is similar to [5], and the model classified Naver movie review sentiment about 8% more accurately than a Naive Bayes model. [7] proposed a syllable-based CNN model for Korean that can make predictions even for new, unseen words. A syllable-based model that has learned 'Google' or 'Galaxy Note' can make similar predictions for compound words and abbreviations derived from them, such as 'Google God', improving on the OOV problem of morpheme-based CNN [7]. Sentiment analysis studies using multi-channel CNN, which form the basis of this paper, include [8, 9, 10]. In the study of [8],


it was confirmed that a multi-channel CNN, which uses words and characters simultaneously, is superior to a word-based or character-based CNN when classifying spoken English sentences. In [9], using word embeddings such as Word2vec, GloVe, and syntactic embeddings simultaneously through a multi-channel CNN improved sentiment classification performance compared to using a single word embedding. More recently, [10] classified the sentiment of movie reviews by using the graphemes and morphemes of sentences simultaneously in Korean sentiment analysis. However, [10] did not consider the syllables of sentences, and its experiments covered only the Naver movie review data, which among online comments is relatively standardized and has long sentences. In contrast, this paper not only improves sentiment classification performance through a multi-channel model that uses all of morphemes, graphemes, and syllables, but also conducts experiments on two data sets with different characteristics. In our experiments, the final model recorded about 2.4% higher performance on the baseball comment data and about 1.12% higher performance on the Naver movie review data than a multi-channel model that does not consider syllables. In addition, through experiments on CNNs combining several Korean subword units, we present an optimized combination of Korean sentence input units for CNN.

3. Korean sentence classification using multi-channel CNN

The multi-channel CNN model proposed in this paper takes morphemes, syllables, and graphemes as input, as shown in Figure 1. It is a variation of Y. Kim's CNN model [5], which takes only one subword level as input, extended to accept morphemes, syllables, and graphemes at the same time. This allows documents to be classified by considering multiple subword levels rather than only one, using feature vectors extracted from morphemes, syllables, and graphemes simultaneously. As shown in Figure 1, the model has three input channels, and one sentence is divided into morphemes, syllables, and graphemes, which serve as the input to the three channels. For example, the sentence "엘지 화이팅!" ("Go LG!") divided into morphemes becomes ["엘지", "화이팅", "!"], which goes into the first channel. Divided into syllables, it becomes ["엘", "지", "<space>", "화", "이", "팅", "!"], which is the input to the second channel; a <space> token is inserted between words. Likewise, divided into graphemes, it becomes ["ㅇ", "ㅔ", "ㄹ", "<eoc>", "ㅈ", "ㅣ", …], which is the input to the last channel; an <eoc> token is inserted between syllables. The morpheme-based and syllable-based channels limit the maximum sentence length to 50 tokens: longer sentences are truncated and shorter sentences are padded. The grapheme-based channel limits the maximum length to 150 tokens, with the same truncation and padding. In each channel, an embedding of the input is learned through an embedding layer. We experimented with the pre-trained Korean FastText embeddings provided by Facebook Research, but saw no significant performance improvement, so this model uses embeddings randomly initialized to 300 dimensions. The output of the embedding layer is fed to a convolution layer consisting of three different filter window sizes. The outputs of the convolution layer are concatenated into one vector through max-pooling, and sentences are classified through a softmax layer.
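As a concrete illustration, the syllable and grapheme channel inputs described above can be generated with the standard Unicode arithmetic for precomposed Hangul syllables. This is a sketch under the paper's <space>/<eoc> token conventions, not the authors' preprocessing code; the morpheme channel is omitted because it requires a morpheme analyzer.

```python
# Sketch of the syllable and grapheme channel inputs (not the authors' code).
CHO = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")        # 19 leading consonants
JUNG = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")   # 21 vowels
JONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 27 trailing consonants (+ none)

def syllable_channel(sentence):
    """Split a sentence into syllables, inserting <space> between words."""
    return ["<space>" if ch == " " else ch for ch in sentence]

def grapheme_channel(sentence):
    """Decompose Hangul syllables into graphemes, with <eoc> after each
    syllable and <space> between words (simplified: <eoc> also appears
    before <space> and sentence-finally)."""
    tokens = []
    for ch in sentence:
        if ch == " ":
            tokens.append("<space>")
        elif "가" <= ch <= "힣":  # precomposed Hangul syllable block
            code = ord(ch) - ord("가")
            lead, rest = divmod(code, 21 * 28)
            vowel, tail = divmod(rest, 28)
            tokens += [CHO[lead], JUNG[vowel]] + ([JONG[tail]] if tail else [])
            tokens.append("<eoc>")
        else:
            tokens.append(ch)  # punctuation, digits, Latin letters, ...
    return tokens

print(syllable_channel("엘지 화이팅!"))      # ['엘', '지', '<space>', '화', '이', '팅', '!']
print(grapheme_channel("엘지 화이팅!")[:6])  # ['ㅇ', 'ㅔ', 'ㄹ', '<eoc>', 'ㅈ', 'ㅣ']
```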

Figure 1: Multi-channel CNN model structure

4. Data and experimental methods

4.1 Data

Table 1: Data set statistics

                                        Baseball comments   Movie reviews
  Average sentence length (characters)  25.2                35.24
  Number of sentiment classes           3                   2
  Training set size                     15846               150000
  Test set size                         1761                50000
  Morpheme vocabulary size              17313               94747
  Syllable vocabulary size              1777                3006
  Grapheme vocabulary size              156                 162
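The tokens of each channel are index-encoded against a vocabulary built from the training data (the vocabulary sizes are those in Table 1) and truncated or padded to the channel's maximum length, 50 tokens for the morpheme and syllable channels and 150 for graphemes. A minimal sketch follows; the <pad> and <unk> token names are hypothetical, since the paper does not name them:

```python
# Vocabulary construction and fixed-length encoding for one channel.
# <pad>/<unk> are assumed token names, not taken from the paper.
PAD, UNK = "<pad>", "<unk>"

def build_vocab(tokenized_sentences):
    """Assign an integer index to every token seen in training."""
    vocab = {PAD: 0, UNK: 1}
    for sent in tokenized_sentences:
        for tok in sent:
            vocab.setdefault(tok, len(vocab))
    return vocab

def encode(tokens, vocab, max_len):
    """Index-encode, then truncate or right-pad to max_len."""
    idx = [vocab.get(t, vocab[UNK]) for t in tokens[:max_len]]
    return idx + [vocab[PAD]] * (max_len - len(idx))

train = [["엘지", "화이팅", "!"], ["기아", "이긴다", "!"]]
vocab = build_vocab(train)
print(encode(["엘지", "이긴다", "처음보는말", "!"], vocab, 6))
# [2, 6, 1, 4, 0, 0]  (unseen token maps to <unk>, tail is <pad>)
```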


Table 2: Baseball comment sentiment examples

  Sentence                                                            Sentiment
  "I'm 100% sure KIA will win"                                        Positive
  "Just get some rest after the season"                               Negative
  "Good job"                                                          Neutral
  "Well done, Hyunsik!"                                               Positive
  "I paid more and bought it for 210,000 won. Even if you
  wait 3 hours, you won't be able to get a ticket."                   Negative
  "The injured area is also unique"                                   Neutral

Baseball comments: A baseball article comment sentiment corpus was used as experimental data. This data set was produced in-house, and each comment is tagged as 'positive', 'negative', or 'neutral'; the three labels are evenly distributed in the training and test sets in a 1:1:1 ratio. As the examples in Table 2 show, the sentences are colloquial, often contain spacing and grammatical errors, and tend to be short. "Well done, Hyunsik!" is tagged as positive, while the similar "Good job" is tagged as neutral. Expressions such as "You are good" can be used positively, but in some situations they are used sarcastically, negatively, or neutrally, which makes sentiment analysis difficult. In addition, many sentences have an ambiguous sentiment, such as "The injured area is also unique", where even a person would struggle to judge between neutral and negative.

Table 3: Movie review sentiment examples

  Sentence                                                            Sentiment
  "Oh, the dubbing.. the voice is really annoying"                    Negative
  "It was so much fun, so I recommend watching it"                    Positive
  "It's a prison story.. honestly, it's not fun.. Adjust the rating"  Negative
  "One of the few movies that is fun even though there is no action"  Positive

Movie reviews: Models were trained and tested on the publicly available Naver Movie Review Corpus v1.0 (https://github.com/e9t/nsmc). There are two sentiment tags, positive and negative, evenly distributed between the training and test sets. During the preprocessing of this corpus, reviews with ratings from 1 to 4 were tagged as negative and reviews with ratings from 9 to 10 as positive, so the sentiment classes are relatively clear-cut. Compared to the baseball comment data, the sentences are longer and the training set is much larger. In addition, because it is an open corpus, performance comparisons with other studies are possible.

4.2 Experimental models

Word-based CNN (baseline): Y. Kim's CNN model [5] was used. Sentences were split into word (eojeol) units and used as input to the CNN.

Morpheme-based CNN: Same as the word-based model, but morphemes rather than words were used as input. An in-house syllable-level morpheme analyzer using a Bidirectional LSTM and CRF was used [11]. In experiments with several morpheme analyzers, the in-house analyzer distinguished the sentiment of the baseball comment data about 2% more accurately than morpheme-based CNNs using Konlpy's Twitter or Kkma analyzers.

Syllable-based CNN: Same as the models above, but syllables were used as input. During preprocessing, a <space> token was inserted between words.

Grapheme-based CNN: Same as the models above, but graphemes were used as input. During preprocessing, when decomposing a sentence into graphemes, <eoc> tokens were inserted between syllables and <space> tokens between words.

Multi-channel CNN (morpheme + syllable + grapheme): The final model proposed in this paper, a multi-channel CNN that uses all of morphemes, syllables, and graphemes. Preprocessing for each subword unit is the same as in the models above. In addition to this final model, multi-channel CNNs consisting of each pair of morphemes, syllables, and graphemes were also tested for comparison.

4.3 Experimental environment and model parameters

Table 4: Final parameters for each model

  Parameter         Morpheme-based  Syllable-based  Grapheme-based  Multi-channel
                    CNN             CNN             CNN             CNN
  Feature map size  100             300             300             100, 300, 300
                                                                    (morpheme, syllable, grapheme)
  L2 constraint     0               0               0               1
  Batch size        128             30              30              30
  Learning rate     1e-5            1e-4            1e-4            1e-5

The parameters of the experimental models were determined by grid search on a validation set of the baseball comment data. The parameter ranges used for the grid search were:

  Feature map size: [50, 100, 300]
  L2 constraint: [0, 0.1, 1, 3]
  Batch size: [30, 60, 128, 256]
  Learning rate: [1e-5, 1e-4, 1e-3, 1e-2]

The values in Table 4 are the final parameters found by the grid search. The number of epochs was set to 100 with early stopping. Dropout was set to 0.5 in all models, and the Adam optimizer was used. In all models, three filter window sizes were used: 3, 4, and 5.
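For illustration, the forward pass of the model in Section 3 with the Table 4 feature-map sizes (100, 300, 300) and filter windows 3, 4, and 5 can be sketched in NumPy. This is a simplified forward pass with random weights, not the authors' trained implementation; it omits dropout and the L2 constraint, which matter only during training.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_maxpool(emb, filter_sizes, n_filters):
    """1-D convolution over the token axis for several window sizes,
    each followed by ReLU and max-over-time pooling (Kim, 2014)."""
    pooled = []
    for h in filter_sizes:
        W = rng.normal(scale=0.1, size=(h * emb.shape[1], n_filters))
        feats = np.stack([emb[i:i + h].ravel() @ W          # slide a window of h tokens
                          for i in range(emb.shape[0] - h + 1)])
        pooled.append(np.maximum(feats, 0).max(axis=0))      # ReLU + max pooling
    return np.concatenate(pooled)

def multi_channel_forward(channels, n_classes=3, filter_sizes=(3, 4, 5)):
    """channels: one (token_indices, vocab_size, n_filters) triple per
    channel (morpheme, syllable, grapheme)."""
    feats = []
    for idx, vocab_size, n_filters in channels:
        E = rng.normal(scale=0.1, size=(vocab_size, 300))    # 300-dim embeddings
        feats.append(conv_maxpool(E[idx], filter_sizes, n_filters))
    feat = np.concatenate(feats)                             # concatenate the channels
    logits = feat @ rng.normal(scale=0.1, size=(feat.size, n_classes))
    p = np.exp(logits - logits.max())
    return p / p.sum()                                       # softmax over classes

# Toy input: 50-token morpheme/syllable channels, 150-token grapheme channel,
# with the baseball-data vocabulary sizes from Table 1.
probs = multi_channel_forward([(rng.integers(0, 17313, 50), 17313, 100),
                               (rng.integers(0, 1777, 50), 1777, 300),
                               (rng.integers(0, 156, 150), 156, 300)])
print(probs.shape, round(probs.sum(), 6))  # (3,) 1.0
```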


4.4 Evaluation criteria

Since the sentiment classes are evenly distributed in both the baseball comment data and the movie review data, accuracy was used to evaluate overall model performance. In addition, the F1 score for each sentiment class is included in the experimental results to show per-class classification performance.
Table 5: Sentiment analysis experiment results

                                                        Baseball comments                        Movie reviews
  Model                                                 Accuracy  Pos (F1)  Neu (F1)  Neg (F1)   Accuracy  Pos (F1)  Neg (F1)
  CNN (word), baseline                                  45.03%    55.85%    32.41%    39.10%     79.17%    80.68%    77.40%
  CNN (morpheme)                                        62.12%    67.56%    57.86%    61.67%     85.02%    85.14%    84.90%
  CNN (syllable)                                        59.51%    66.03%    49.48%    62.77%     84.92%    84.83%    85.01%
  CNN (grapheme)                                        59.23%    63.62%    53.95%    61.33%     84.40%    84.18%    84.63%
  Multi-channel CNN (syllable + grapheme)               60.25%    66.43%    50.48%    63.64%     84.96%    85.02%    84.90%
  Multi-channel CNN (morpheme + syllable)               63.14%    69.94%    57.79%    62.48%     85.33%    85.18%    85.48%
  Multi-channel CNN (morpheme + grapheme)               64.56%    70.08%    58.31%    65.37%     85.15%    85.38%    84.92%
  Multi-channel CNN (morpheme + syllable + grapheme)    66.95%    71.46%    60.88%    68.91%     86.27%    86.42%    86.11%

Table 6: Sentence classification examples

  Sentence                                                  Gold      Multi-channel CNN     CNN         CNN         CNN
                                                            label     (morph+syll+graph)    (morpheme)  (syllable)  (grapheme)
  "I'm looking forward to it"                               Positive  Positive              Neutral     Positive    Positive
  "Even though our team has lost in a row, the
  players are in a great mood…"                             Positive  Positive              Negative    Positive    Positive
  "This is the best scammer of all time,,,"                 Negative  Negative              Neutral     Negative    Positive
  "There are no bad comments… You must have had a
  successful life as expected"                              Positive  Positive              Positive    Negative    Negative
  "For domestic use"                                        Negative  Neutral               Neutral     Neutral     Neutral
  "Suarez in soccer"                                        Negative  Neutral               Neutral     Positive    Positive

5. Experimental results

The sentiment analysis results for each model are summarized in Table 5. The baseline word-based CNN performs worse than the other models. The multi-channel CNN proposed in this paper, which uses morphemes, syllables, and graphemes simultaneously, classified sentence sentiment with the highest accuracy on both the baseball comment data and the movie review data. Because the multi-channel CNN uses syllables and graphemes in addition to morphemes, it resolves much of the OOV (out-of-vocabulary) problem that is a major weakness of morpheme-based CNN. In particular, the classification examples confirm that the multi-channel CNN is effective on colloquial sentences and sentences containing typos.

Comparing the per-class F1 scores, the final multi-channel CNN improves significantly over the morpheme-based CNN on negative sentiment. This appears in both the baseball comment data and the movie review data, and we believe it is because negative comments contain more colloquial and compound expressions that are difficult to decompose into morphemes than other comments do.

Because morphemes are the smallest meaningful unit, the morpheme-based CNN classified sentiment more accurately than the syllable-based or grapheme-based CNN. However, there are also cases where the syllable-based or grapheme-based CNN correctly classified sentences that the morpheme-based CNN did not. Table 6 shows examples of baseball comment sentences as classified by each model. "I'm looking forward to it" is written with a typo in the original Korean, so its morpheme decomposition carries little meaning. The syllable-based and grapheme-based CNNs, however, can extract feature vectors even from sentences containing typos, and correctly classified the comment as positive by associating it with the correctly spelled form.

In "Even though our team has lost in a row, the players are in a great mood…", the negative-sentiment morpheme "lost in a row" led the morpheme-based CNN to classify the sentence as negative. The expression for "great mood" is too colloquial to be decomposed properly into morphemes; had it been, the sentence could have been classified as positive. The syllable-based and grapheme-based CNNs relate that expression to "good" or "like" and correctly classify the sentence as positive. The multi-channel CNN, which uses morphemes, syllables, and graphemes at the same time, also classified this sentence accurately.

The multi-channel CNN that uses all three of morphemes, syllables, and graphemes shows higher performance than the multi-channel CNNs that use only morphemes and syllables or only morphemes and graphemes: when three subword levels are used simultaneously, the sentiment of a sentence can be determined more accurately because more diverse feature vectors are considered. "This is the best scammer of all time,,," was misclassified by the morpheme-based and grapheme-based CNNs, which could not extract sentiment feature vectors from it, but the syllable-based CNN, by considering features between syllables, correctly classified the sentence as negative. Conversely, features that the syllable-based CNN cannot extract can be extracted by the morpheme-based and grapheme-based CNNs, which is why sentiment classification performance is highest when all three subword levels are used.

In the baseball comments, implicit or ambiguous sentences such as "For domestic use" are difficult to classify accurately without knowing the context. Because the experimental models often classify such sentences as neutral, their performance on the neutral class is generally lower than on the positive or negative classes. On the movie review data, which has no neutral class, the final multi-channel CNN model classifies sentiment with an accuracy of up to 86.27%. The performance improvement from the multi-channel CNN is larger on the baseball comment data than on the movie review data. The movie review data has long sentences and a large training set, so the sentiment of most sentences can be classified from morphemes alone, whereas the baseball comment data contains many short colloquial sentences, including neutral ones, that are difficult to classify using only morphemes. By verifying the model on two data sets with these different characteristics, we confirmed that the model presented in this paper is effective for sentiment classification of online data.

6. Conclusion

In this paper, through several experiments, we proposed a multi-channel CNN based on morphemes, graphemes, and syllables that is effective for sentiment analysis of colloquial Korean. We confirmed that feature vectors extracted from syllables and graphemes can be used to complement morpheme-based feature vectors, alleviating the out-of-vocabulary (OOV) problem of morpheme-based CNN. By showing higher sentiment classification accuracy than the commonly used morpheme-based and syllable-based CNNs, the model was confirmed to have the potential to be used in other studies classifying colloquial Korean.

References

[1] H. Lee and S. Lee, Implementation of sentiment analysis and sentiment information attachment system, KIPS Tr. Software and Data Eng., Vol. 5, No. 8, pp. 377-384, 2016.
[2] …, pp. 513-520, 2011.
[3] M. Bautin, L. Vijayarenu, and S. Skiena, International Sentiment Analysis for News and Blogs, ICWSM, 2008.
[4] H. Jang and H. Shin, Language-Specific Sentiment Analysis in Morphologically Rich Languages, COLING '10: Proceedings of the 23rd International Conference on Computational Linguistics: Posters, pp. 498-506, 2010.
[5] Y. Kim, Convolutional Neural Networks for Sentence Classification, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), pp. 1746-1751, 2014.
[6] W. Kim and K. Park, Design of Hangul text sentiment classifier using convolutional neural network, Proceedings of the 2017 Korea Computer Science Conference of the Korean Institute of Information Scientists and Engineers, pp. 642-644, 2017.
[7] S. Choi et al., A Syllable-based Technique for Word Embeddings of Korean Words, Proceedings of the First Workshop on Subword and Character Level Models in NLP, pp. 36-40, 2017.
[8] J. Park and P. Fung, One-step and Two-step Classification for Abusive Language Detection on Twitter, Proceedings of the First Workshop on Abusive Language Online, pp. 41-45, 2017.
[9] Y. Zhang, S. Roller, and B. Wallace, MGNC-CNN: A Simple Approach to Exploiting Multiple Word Embeddings for Sentence Classification, Proceedings of NAACL-HLT 2016, pp. 1522-1527, 2016.
[10] K. Mo et al., Text Classification based on Convolutional Neural Network with Word and Character Level, Journal of the Korean Institute of Industrial Engineers, Online, 2018.
[11] H. Kim et al., Syllable-level morpheme analyzer using part-of-speech distribution and Bidirectional LSTM CRFs, Proceedings of the 28th Hangul and Korean Information Processing Conference, 2016.
