Sentiment analysis in Tableau with R Bora Beran

9/29/14, 10:58 PM

Bora Beran
Worst blog article you ever saw? Well, my next one will
be better.

Sentiment analysis in Tableau with R by Bora Beran


With the increasing amount of user content on the web, text analytics is gaining more mainstream
adoption. Sentiment analysis, keyword and named entity extraction are the most common tasks
since they allow quickly classifying, filtering and turning text into easily consumable metrics. What
do the customers like most about your product, what do they not like? Do people perceive your
brand more positively or negatively compared to last year?
With the new R integration feature in Tableau 8.1 it is very easy to add this functionality to your
dashboards. There are currently two packages in R that can be used for this purpose: sentiment
(http://cran.r-project.org/src/contrib/Archive/sentiment/) and qdap
(http://cran.r-project.org/web/packages/qdap/index.html). In this post we will use sentiment
(http://cran.r-project.org/src/contrib/Archive/sentiment/). This package requires the tm
(http://cran.r-project.org/web/packages/tm/index.html) and Rstem (http://www.omegahat.org/Rstem)
packages, so first you'll need to install those. You can do this by typing the commands below into
your R console (or RStudio if that's the IDE of your choice).
It may be difficult to find the right versions of Rstem and sentiment. If you already have these
packages you can skip to the next step. Before you run the workbook please make sure you load the
packages either in the calculation by adding library(sentiment); before the classify_ functions or
in Rserve config as covered in my previous blog post about Logistic Regression.
install.packages("tm")
download.file("http://cran.cnr.berkeley.edu/src/contrib/Archive/Rstem/Rstem_0.4-1.tar.gz", "Rstem_0.4-1.tar.gz")
install.packages("Rstem_0.4-1.tar.gz", repos=NULL, type="source")
Once you have tm (http://cran.r-project.org/web/packages/tm/index.html) and Rstem
(http://www.omegahat.org/Rstem) installed, here is how you can download and install the
sentiment package (http://cran.r-project.org/src/contrib/Archive/sentiment/).
download.file("http://cran.r-project.org/src/contrib/Archive/sentiment/sentiment_0.2.tar.gz", "sentiment.tar.gz")
install.packages("sentiment.tar.gz", repos=NULL, type="source")
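Since source installs can fail quietly, it may be worth confirming that all three packages are visible to the R installation that Rserve will use. The following is only a minimal sketch in base R; the helper name missing_packages is my own invention for illustration and is not part of any package.

```r
# Hypothetical helper: return the names of packages that are NOT installed
# in the current R library paths.
missing_packages <- function(pkgs) {
  installed <- rownames(utils::installed.packages())
  setdiff(pkgs, installed)
}

# The three packages this post relies on.
required <- c("tm", "Rstem", "sentiment")

still_missing <- missing_packages(required)
if (length(still_missing) == 0) {
  message("All required packages are installed.")
} else {
  message("Still missing: ", paste(still_missing, collapse = ", "))
}
```

If anything is reported as missing, rerun the corresponding install command above before opening the workbook.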
Let's take the first stab by using the classify_polarity function. The Comment Text column contains
reviews for a hypothetical product. We are using our calculated field Sentiment for both text and
color coding as it returns one of three classifications: negative, neutral and positive.

You will notice that the results are not perfect. The second row from the bottom is in fact a negative
comment about delayed delivery but is classified as a positive comment. More on that later. Now let's
have a look at what the calculated field looks like.

As you can see the R script is very simple. We are calling the function and retrieving the column
corresponding to best_fit. Another method in this package is classify_emotion, which classifies text
into emotions such as anger, joy and fear. The function call is very similar but we retrieve a different
column from the results this time. The results are again imperfect: the two lines that are associated
with the emotion fear look especially far off. But how does this work and how can it be made better?

(http://boraberan.files.wordpress.com/2013/12/image3.png)
Sentiment analysis techniques can be classified into two high-level categories:
1. Lexicon based: This technique relies on dictionaries of words annotated with their orientation,
described as polarity and strength (e.g. negative and strong), based on which a polarity score for
the text is calculated. This method gives high-precision results as long as the lexicon used has good
coverage of the words encountered in the text being analyzed.
2. Learning based: These techniques require training a classifier with examples of known polarity,
presented as text classified into positive, negative and neutral classes.
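To make the lexicon-based idea concrete, here is a deliberately simplified sketch in base R. It is not the algorithm used by the sentiment package; the tiny lexicon, the weights (2 for strong, 1 for weak) and the function name score_polarity are all made up for illustration.

```r
# Toy lexicon: each word has a polarity and a strength (illustrative values only).
toy_lexicon <- data.frame(
  word     = c("great", "love", "good", "bad", "terrible", "scam"),
  strength = c("strong", "strong", "weak", "weak", "strong", "strong"),
  polarity = c("positive", "positive", "positive", "negative", "negative", "negative"),
  stringsAsFactors = FALSE
)

# Hypothetical scorer: sum the signed weights of the lexicon words found in the text.
score_polarity <- function(text, lexicon = toy_lexicon) {
  # Lowercase, then split on anything that is not a letter or an apostrophe.
  tokens <- strsplit(tolower(text), "[^a-z']+")[[1]]
  # Keep only lexicon rows for tokens that appear in the lexicon (0 = no match).
  hits <- lexicon[match(tokens, lexicon$word, nomatch = 0), ]
  weight <- ifelse(hits$strength == "strong", 2, 1)
  sign   <- ifelse(hits$polarity == "positive", 1, -1)
  score  <- sum(weight * sign)
  if (score > 0) "positive" else if (score < 0) "negative" else "neutral"
}

score_polarity("Great product, I love it")  # "positive"
score_polarity("What a scam")               # "negative"
score_polarity("It arrived on Tuesday")     # "neutral"
```

If the row for "scam" were removed from the toy lexicon, the second call would come back as neutral, which is exactly the kind of coverage problem a lexicon-based method is sensitive to.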
R's sentiment package follows a lexicon-based approach, hence we were able to get right into the
action, given it comes with a lexicon (http://people.cs.pitt.edu/~wiebe/pubs/papers/emnlp05polarity.pdf)
for English. In your R package library under the \sentiment\data folder you can find the lexicon as a
file named subjectivity.csv.gz.
The text that was incorrectly classified as having positive polarity is the following: "Took 4 weeks to
receive it even though I paid for 2 day delivery. What a scam." If you open the file, as you probably
suspected, you will find out that "scam" is not a word in the lexicon. Let's add the following line to the
file,
scam,strongsubj,negative
then save, zip the file, restart Rserve and refresh our workbook.
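The same edit can also be scripted from R instead of unzipping by hand. The sketch below is an assumption about one reasonable way to do it, not something the sentiment package provides: add_lexicon_entry is a hypothetical helper, and it assumes every line of the file starts with the word followed by a comma, as in the line shown above. It is demonstrated on a temporary file; for the real lexicon you would pass the path to subjectivity.csv.gz in your library (for example the result of system.file("data", "subjectivity.csv.gz", package = "sentiment")) and keep a backup copy first.

```r
# Hypothetical helper: append one "word,strength,polarity" line to a gzipped lexicon.
add_lexicon_entry <- function(path, word, strength, polarity) {
  con <- gzfile(path, open = "rt")
  lines <- readLines(con)
  close(con)

  # Skip the write if the word is already present as the first field.
  existing <- sub(",.*$", "", lines)
  if (word %in% existing) return(invisible(FALSE))

  lines <- c(lines, paste(word, strength, polarity, sep = ","))
  con <- gzfile(path, open = "wt")
  writeLines(lines, con)
  close(con)
  invisible(TRUE)
}

# Demonstration on a throwaway file rather than the installed lexicon.
demo_path <- tempfile(fileext = ".csv.gz")
con <- gzfile(demo_path, open = "wt")
writeLines(c("good,weaksubj,positive", "awful,strongsubj,negative"), con)
close(con)

add_lexicon_entry(demo_path, "scam", "strongsubj", "negative")

demo_con <- gzfile(demo_path, open = "rt")
print(readLines(demo_con))
close(demo_con)
```

As with the manual route, Rserve still needs a restart afterwards so that the package picks up the changed lexicon.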

(http://boraberan.files.wordpress.com/2013/12/image4.png)
Now, you can see that the text is classified correctly as expressing negative sentiment. When using
lexicon-based systems, adding new words to the lexicon or using a completely new lexicon are
potential paths to follow if you are not getting good results. Incorrect classifications are more likely
if slang, jargon and colloquial words are being used in the text you're analyzing since these are not
covered extensively in common lexica.
You can download the workbook containing the example HERE (http://sdrv.ms/1ckyO5P).
Happy Holidays!

This entry was posted in R, Visualization.

11 comments on Sentiment analysis in Tableau with R


Sean Otto says:
February 19, 2014 at 10:13 am
Trying to replicate what you have done and not being completely experienced with R and R
scripts, I get the following error message in Tableau: Error in eval(expr, envir, enclos): could not
find function "classify_polarity"
I'm assuming it refers to your comment about adding the sentiment library, but is that added in
the Tableau formula? Not exactly sure from your message where to add it.
Bora Beran says:
February 19, 2014 at 11:17 am
Hi Sean,
Were you able to install all the libraries successfully? If the answer is yes, the most likely
culprit is that the library isn't loaded. You could verify this by adding the library reference to
your function call.
SCRIPT_STR('library(sentiment);polarity_data =
classify_polarity(.arg1,algorithm="bayes",verbose=TRUE)[,4]',ATTR([Call Script]))
Let me know if this solves the issue.

I wrote about how to take advantage of the Rserve configuration file to preload packages and
other objects here, which you may find useful:
http://boraberan.wordpress.com/2013/12/16/logistic-regression-in-tableau-using-r/
If you do this, Rserve will load the package only once on start instead of evaluating the
library(sentiment) command every time you refresh your view in Tableau. It shortens your
code in R and also gives you better performance. I will add a pointer from this article to that
one and a note that the example assumes the libraries are pre-loaded in Rserve configuration
to avoid future confusion.
~ Bora
Praveen Koppolu says:
March 12, 2014 at 9:53 am
I'm trying to replicate the above and got the error
Error in base::parse(text = .cmd) : <text>:1:71: unexpected input
1: library(sentiment);polarity_data = classify_polarity(.arg1,algorithm=
^
Please help me.
Bora Beran says:
March 25, 2014 at 10:36 pm
I can't tell much from the snippet. This sort of error commonly happens when the line is a
continuation and there is something missing like a trailing comma on the previous line. The
other likely cause is ASCII vs. UTF. It could be that you're using a different kind of quote or
there is some other non-ASCII character in there. If you are editing in a tool like Word or
copy-pasting from a browser etc., it is likely that you get the wrong character even though it
looks like the right character on the surface.
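A quick way to check a script for such characters, sketched here in base R on the assumption that you can paste the script text into an R console: the helper find_non_ascii is hypothetical (not part of any package) and simply reports the position of every character outside the ASCII range.

```r
# Hypothetical helper: positions and values of non-ASCII characters in a string.
find_non_ascii <- function(script) {
  chars <- strsplit(script, "")[[1]]
  codes <- vapply(chars, utf8ToInt, integer(1), USE.NAMES = FALSE)
  bad <- which(codes > 127)
  data.frame(position = bad, char = chars[bad], stringsAsFactors = FALSE)
}

# \u201c and \u201d are the curly quotes that word processors tend to insert.
pasted <- "classify_polarity(.arg1,algorithm=\u201cbayes\u201d)"
find_non_ascii(pasted)

# The same call typed with plain ASCII quotes reports nothing.
typed <- "classify_polarity(.arg1,algorithm=\"bayes\")"
nrow(find_non_ascii(typed))
```

Any row reported for the pasted text marks a character to retype by hand in the Tableau calculation editor.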
kurrabac says:
March 25, 2014 at 3:05 pm
There is a simpler way of doing Twitter text sentiment analysis in R. Try this package.
https://github.com/okugami/sentiment140/blob/master/README.md
Hein says:
May 16, 2014 at 4:15 am

I installed R v3.0.2 and tried to install this package. I got an error message that it is not
available for this version of R.
Shubho Ray says:
May 5, 2014 at 9:24 pm
Hi,
This is a very interesting article which prompted me to recreate it with my own response data.
The only problem is, I'm unable to add any extra lines to the subjectivity.csv file that you had
mentioned.
Can you point out what the reason could be? If my query is not clear, do let me know what
extra information you require.
Bora Beran says:
May 11, 2014 at 12:32 pm
Hi Shubho,
What were the steps you used? Unzip, edit, save, zip? If the file you saved didn't work,
potential issues could be related to encoding, using a different newline character etc. If your changes
seem to be ignored, it could be because the package is already loaded, in which case loading it
again (if this is done in your Rserve config, restarting Rserve) could be the solution.
Shubho Ray says:
May 28, 2014 at 1:32 am
Sorry for the delayed response.
The steps I used were exactly the same as the ones you mentioned, but I didn't understand what
exactly you meant by the different new line issue.
As for reloading Rserve, I did that too, every time I made any changes to the file.
So, how should I proceed from this point?
Matthew Loxton says:
May 8, 2014 at 2:17 pm
Looks like R Sentiment has been discontinued. In the interim, do you have any other suggestions?
Bora Beran says:
May 11, 2014 at 11:34 am
You can still get the sentiment package from Omegahat.org. I picked sentiment since it is the
most straightforward to use.
qdap is another package that can be used for sentiment analysis and it is still on CRAN. You
can also write your own function to do it or give sentiment140 a shot, as suggested in the
link above.

http://boraberan.wordpress.com/2013/12/24/sentiment-analysis-in-tableau-with-r/
