
Tone Analyzer Project Report

I. Introduction

The Tone Analyzer uses linguistic analysis to detect emotional, social, and language tones in written text. The service can assess tone at both the document level and the sentence level. Businesses can use it to analyze and improve their client conversations, to learn the tone of their clients' communications, and to reply suitably to each customer.

In my opinion, the service could be used in scenarios such as the following. A logistics firm wants to know what its customers are saying about its goods on Twitter; it can collect tweets that mention its username or hashtags and run them through the service to obtain this information. Someone who wants to send an email without giving the wrong impression by sounding angry can paste the message into the service and examine its tone before sending it. A corporation that wants to know how well its client service department has handled phone calls can transcribe the calls and submit them to the service to receive tone insights and understand what people are saying.

Figure 1: Server-less Backend Illustration of Watson Tone Analyzer


Although emotion analytics is still a small market, many early-stage startups are focusing on studying client emotions in order to provide better user experiences. Journalism, political commentary, and NLP research all stand to benefit from this kind of study. Few academic projects or news sites, however, have used natural language processing (NLP) algorithms to perform emotion and sentiment analysis on a hot political topic. The project's limitations are determined by the platform and tools employed [1].

For example, spam generated by chatbot programs may appear in YouTube comments, which introduces noise into the data and skews the results. Big data research is more efficient, intelligible, and usable for examining data and drawing logical insights than traditional research approaches such as questionnaires, focus groups, and surveys. I aim to initiate a more quantitative research effort within the Emerson community, which will serve as a springboard for my future interests in big data, emotion artificial intelligence, and data visualization [3].

II. Problem Statement

Emotions and moods have a direct impact on a person's daily actions. To help family or friends lead a better life, it is important to recognize and address the negative feelings they may be facing. Studies have found that social networking activity is a good indicator of a person's mood. A user's mood is frequently reflected in their social content, such as tweets, blogs, articles, and status updates. In extreme cases, timely analysis of a user's social media activity can be used to improve their mood and even save a person's life. As a result, it is important to assess the social media health of friends and family regularly in order to take appropriate action.

III. Nonfunctional and Functional Requirements

Tone analysis is a useful tool for determining your clients' tone, and it can be applied in a variety of situations to improve how your company operates.

• Contact Center Quality Improvement

Automated tone analysis helps you evaluate the quality of your call center in a fraction of the time that human monitoring requires. It can examine your call center representatives' tone on a regular basis, helping you provide high-quality client interactions.

• Client Service

By evaluating all client service contacts, tone analysis can show what your clients are thinking during support calls, which helps you determine what you need to do to keep them satisfied.

• Product Improvement

Running tone analysis on calls from clients who phone your organization to discuss your products provides a valuable feedback loop. You can use this input to improve your products for the market.

• Product Sales

Tone analysis can help your sales staff become more effective by examining how they present your products to buyers. The feedback can lead directly to better sales calls and, in turn, to increased sales [2].

IV. Implementation and Demonstration

Watson, developed by IBM, is one of the most sophisticated artificial intelligence systems available today. Watson won a world-famous round of Jeopardy on January 14, 2011, demonstrating that AI can compete with today's greatest minds. Fortunately, using Watson does not require one to be a Jeopardy champion or a mathematician. IBM has made available a comprehensive API that lets people from all around the world integrate their apps with Watson and begin using its data analysis tools.

Because of its versatility, JavaScript is my preferred language, although other languages can connect to IBM's Watson toolkit just as well. In this tutorial we will build a tone analyzer. To begin, go to IBM's website and create an account. For security reasons, the service requires an API key; to obtain the key and region, go to https://www.ibm.com/watson/services/tone-analyzer/ and follow the directions. To use the tool you need three credentials: a version, a URL, and an API key. The version and URL may change depending on when and where you use the tool. This project uses version '2017-09-21' and the URL 'https://gateway.watsonplatform.net/tone-analyzer/a'.
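As a minimal sketch of how the three credentials fit together, the Python snippet below assembles a request description without sending it. The `/v3/tone` path, the `apikey` basic-auth username, the placeholder base URL, and the helper name `build_tone_request` are assumptions made for illustration and are not taken from this report; the base URL is passed in rather than hard-coded because it varies by region.

```python
from urllib.parse import urlencode


def build_tone_request(api_key, version, base_url, text):
    """Describe a tone request without sending it (hypothetical helper)."""
    # Assumption: the general purpose endpoint lives under /v3/tone.
    query = urlencode({"version": version})
    return {
        "method": "POST",
        "url": f"{base_url.rstrip('/')}/v3/tone?{query}",
        # Assumption: API-key instances use basic auth with user "apikey".
        "auth": ("apikey", api_key),
        "headers": {"Content-Type": "application/json"},
        "json": {"text": text},
    }


request = build_tone_request(
    api_key="YOUR_API_KEY",                            # placeholder, not a real key
    version="2017-09-21",                              # version used in this project
    base_url="https://example.invalid/tone-analyzer",  # placeholder URL
    text="Thank you for the quick reply!",
)
print(request["url"])
```

The resulting dictionary could be handed to any HTTP client; keeping the construction separate from the network call makes the credentials easy to inspect and test.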

The IBM Watson Tone Analyzer is a simple natural language processing tool that can help organizations improve their customers' experiences. This cloud NLP software can be used to evaluate customer posts for emotion and tone, anticipate consumer behavior based on writing, track customer care calls and chat interactions, and help chatbots detect and adapt to client conversations. To simplify and optimize agent-customer interactions, Tone Analyzer supports social listening, chatbot integration, and improved customer service monitoring.

Using the API, you can combine IBM Watson Tone Analyzer with incoming and outgoing calls, as well as with your chatbot, so that it can identify customer tone and deliver conversation that responds to changes in how the customer communicates.

The IBM Watson Tone Analyzer can also be integrated into your own apps to enhance consumer interactions. Depending on your company's needs, IBM Watson Tone Analyzer offers three monthly pricing options. The Lite and Standard subscriptions include a predetermined number of API calls per month, with the option to increase the number of calls depending on the volume of requests. The Premium package adds extra layers of security to help safeguard critical customer information.

The service offers two endpoints:

General purpose endpoint

Use the Tone Analyzer general purpose endpoint to evaluate short online content, such as tweets and emails, or longer documents, such as blog entries and articles.

Customer engagement endpoint

Use the Tone Analyzer customer engagement endpoint to monitor client service and support conversations. When a conversation becomes tense, you can escalate it, or you can look for ways to improve client care scripts, dialog methods, and client journeys. The endpoint accepts JSON input. See Using the Customer Engagement Endpoint for more details about the endpoint and the tones it produces. The endpoint analyzes conversations between clients and client support agents: it measures client satisfaction and concerns, helps evaluate agent performance, and lets you track how the engagement progresses [4].
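The JSON input for a support conversation can be sketched as follows. The `utterances`, `user`, and `text` field names and the helper `build_conversation_payload` are assumptions about the request shape, so they should be checked against the service documentation before use.

```python
import json


def build_conversation_payload(turns):
    """Turn (speaker, text) pairs into a JSON body (field names assumed)."""
    utterances = [{"user": speaker, "text": text} for speaker, text in turns]
    return json.dumps({"utterances": utterances}, ensure_ascii=False)


payload = build_conversation_payload([
    ("customer", "My order still has not arrived."),
    ("agent", "I am sorry about the delay. Let me check it for you."),
])
print(payload)
```

Keeping each turn tagged with its speaker is what would allow tone to be tracked separately for the client and the agent as the conversation progresses.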

Use Cases

Use cases of the service are as follows:

i) Social listening and audience monitoring:

ii) Personalized marketing:

iii) Chat bots:

Allow an automated agent to recognize consumer tones and craft appropriate responses based on the tones detected. For example, the agent might say "I'm sorry you're angry about this difficulty" to someone who is upset, or "I'm glad you're happy with our service" to someone who is pleased.

As a case study, IBM examined the client support forums of a software firm that serves a variety of sectors. The company participates in these forums on a regular basis, and users can give Kudos to responses that they deem helpful.
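The chat bot use case, in which an automated agent picks a reply based on the detected tone, can be sketched as follows. The tone labels, the 0.5 threshold, the canned replies, and the function `choose_reply` are hypothetical and only illustrate the idea.

```python
# Hypothetical canned replies keyed by tone label; the labels are assumptions.
REPLIES = {
    "sadness": "I'm sorry this has been upsetting.",
    "anger": "I'm sorry you're angry about this difficulty.",
    "joy": "I'm glad you're happy with our service.",
}
DEFAULT_REPLY = "Thank you for your message."


def choose_reply(tone_scores, threshold=0.5):
    """Pick a reply for the strongest tone at or above the threshold."""
    if not tone_scores:
        return DEFAULT_REPLY
    tone, score = max(tone_scores.items(), key=lambda item: item[1])
    if score < threshold:
        return DEFAULT_REPLY
    return REPLIES.get(tone, DEFAULT_REPLY)


print(choose_reply({"anger": 0.82, "joy": 0.05}))
```

Falling back to a neutral reply when no tone is strong enough avoids responding emotionally to a message whose tone is ambiguous.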

Goals

The goal was to use the tone of the question and of the response to predict customer satisfaction. IBM assumed that a response receiving Kudos indicated that the user was satisfied.

Actions

• Sampled the most recent 1,000 threads from multiple boards, ensuring that the number of responses with and without Kudos was equal.

• Examined the questions as well as the responses.

• Used a variety of classifiers, including random forest, Bayes, and Support Vector Machine (SVM), to predict whether a response would receive Kudos.
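The balancing step in the list above, where responses with and without Kudos appear in equal numbers, might look like the sketch below. The record layout, the toy data, and the function `balanced_sample` are invented for illustration; the report does not describe how IBM actually drew its sample.

```python
import random


def balanced_sample(responses, per_class, seed=0):
    """Draw the same number of responses with and without Kudos."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    with_kudos = [r for r in responses if r["kudos"]]
    without_kudos = [r for r in responses if not r["kudos"]]
    if min(len(with_kudos), len(without_kudos)) < per_class:
        raise ValueError("not enough responses in one of the classes")
    return rng.sample(with_kudos, per_class) + rng.sample(without_kudos, per_class)


# Toy data standing in for forum responses (invented for illustration).
responses = [{"id": i, "kudos": i % 3 == 0} for i in range(30)]
sample = balanced_sample(responses, per_class=5)
print(sum(r["kudos"] for r in sample), len(sample))
```

Balancing the classes before training keeps a classifier from scoring well simply by predicting the more common outcome.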

V. Solution Architecture

Figure 2: Tone Analyzer Architecture

Hypotheses

Each hypothesis will be based on the most valid information I gather, and more detailed labels on each dataset will provide the best support for it. For example, if within the 3,000 comments on the inauguration video people with a higher education background are labeled as more disappointed than those who are less educated, that would indicate a correlation between education and sentiment. If the labeled emotion is too small to be considered statistically significant, the hypothesis is rejected and the null hypothesis is retained. The assumptions are as follows:

• The extent to which people felt aggressive about the immigration issue is contingent on their citizenship and the language in which they post.

• The emotion fluctuations should show peaks around the most popular (most frequent and most cited) keywords from the speech.

• Within the 3,000 comments, opinion clusters should form that vary by party affiliation and attitudes on each side.

• The text analysis and emotion charts should show a common, prevailing negative response.

Research questions

After examining a few related studies of YouTube comments, I raised the following questions:

• What are the typical emotional reactions to each YouTube video?

• Which issues do people react to most?

• How did non-English-speaking people react compared to U.S. citizens who posted in English?

• What are the keywords and factors that trigger discussions on YouTube videos?

• How long after the video came out did it trigger the most discussion (either negative or positive)?

• How does the sentiment in the comments compare with the sentiment of the speech text itself?

• Which emotion are people showing the most?

Methodology

Data collection

The videos came from major television networks’ YouTube channels. I used an online web scraper called “ytcomments”, developed by Philip Klostermann, to collect the comments. Variables include userID, comments, timestamp, replies, and likes. The raw Excel files were sorted by date, and only the most recent 3,000 comments were selected. After duplicates and empty cells were deleted, the Excel data was converted to a JSON file and passed through IBM’s Alchemy Language to be labeled with categorical emotions.
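The cleaning steps described above (sort by date, keep the most recent 3,000 comments, drop duplicates and empty cells, convert to JSON) could be automated roughly as follows. This is a sketch under assumptions: the project did these steps in Excel, the rows are assumed to be already loaded as dictionaries with ISO-8601 `timestamp` strings, and `prepare_comments` is a hypothetical name.

```python
import json


def prepare_comments(rows, limit=3000):
    """Keep the most recent unique, non-empty comments and return JSON."""
    # ISO-8601 timestamps sort correctly as plain strings.
    newest_first = sorted(rows, key=lambda r: r["timestamp"], reverse=True)
    seen = set()
    cleaned = []
    for row in newest_first[:limit]:
        text = (row.get("comments") or "").strip()
        if not text or text in seen:  # drop empty cells and duplicates
            continue
        seen.add(text)
        cleaned.append({**row, "comments": text})
    return json.dumps(cleaned, ensure_ascii=False)


rows = [
    {"userID": "a", "comments": "Great speech", "timestamp": "2017-01-21T10:00:00"},
    {"userID": "b", "comments": "Great speech", "timestamp": "2017-01-22T09:00:00"},
    {"userID": "c", "comments": "  ", "timestamp": "2017-01-23T08:00:00"},
    {"userID": "d", "comments": "Disappointing", "timestamp": "2017-01-20T12:00:00"},
]
result = json.loads(prepare_comments(rows))
print([row["userID"] for row in result])
```

Passing `ensure_ascii=False` keeps non-English comments readable in the JSON output instead of turning them into escape sequences.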

The emotion score for each category ranges from 0 to 1. Sum and average scores were calculated for visualization. Meanwhile, Semantria, a tool for sentiment score analysis, processed the three sets of data and returned files with detected entities, themes, languages, and a sentiment score (calculated from emotion scores) for each comment. The sentiment scores were then run through JMP (a statistical tool by SAS) to produce a logistic fit curve showing their distribution. One limitation is that Semantria can read only 100 queries with up to 1,500 characters of query text each, so longer comments were filtered out.
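The sum and average of the per-category emotion scores can be computed as in the sketch below. The input layout (one dictionary of emotion scores per comment) and the function `summarize_emotions` are assumptions; the report does not specify how the labeled data was stored.

```python
from collections import defaultdict


def summarize_emotions(scored_comments):
    """Return {emotion: (sum, average)} for scores between 0 and 1."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for scores in scored_comments:
        for emotion, score in scores.items():
            if not 0.0 <= score <= 1.0:
                raise ValueError(f"score out of range for {emotion}: {score}")
            totals[emotion] += score
            counts[emotion] += 1
    return {e: (totals[e], totals[e] / counts[e]) for e in totals}


summary = summarize_emotions([
    {"anger": 0.5, "joy": 0.25},
    {"anger": 0.25, "joy": 0.75},
])
print(summary)
```

The range check guards against rows in which a score column was shifted or corrupted during the Excel to JSON conversion.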

The number of queries for the inauguration, immigration, and Congress speeches was 2,849, 2,862, and 2,899, respectively. Netlytic then drew word-over-time graphs and a cloud of popular keywords, and it carried the data into a network analysis consisting of name network and chain network diagrams. After seeing the peaks in each visualization, I looked back at that day to see what exactly had happened. Name network diagrams give a view of opinion clusters: people who share similar views on a topic are grouped into one cluster. I was then able to find these comments manually and draw conclusions from featured comments.

Metrics and measurements

Text analysis

Wikipedia defines text analysis as “a set of linguistic, statistical, and machine learning techniques that model and structure the information content of textual sources for business intelligence, exploratory data analysis, research, or investigation”. The higher-level goals of this method are to investigate language usage, entity recognition and extraction, and visualization. Programming skills are helpful but not required, because the tools are the means, not the end. The text-mining toolset in this project includes JMP (word frequencies, word cloud), Bitext (keyword extraction), Netlytic (network and chain analysis), and IBM Watson Analytics, Gephi, and Tableau for visualization.

Network analysis

Netlytic approaches this task by building two types of social networks: i) a name network and ii) a chain (reply-to) network. In these networks, each node represents a name, a person, or a comment, and the ties between nodes represent the relationships between two entities: the shorter the tie, the closer the relationship. Nodes form clusters, and the clusters indicate where opinions lean and whose comments attract the most interaction.

Semantria for sentiment analysis

Semantria categorizes results into sentiment polarity and sentiment scores. Polarity can be "negative", "neutral", or "positive". Scores come in two types, component sentiment and document sentiment, depending on the target entities. Components are themes, topics, and entities. Scores range from -10 to 10.

Common limitations

Several commercial and academic tools, such as those from IBM Analytics, SAS, Oracle, SenticNet, and Luminoso, track public viewpoints on a large scale by offering graphical summaries of trends and opinions. Yet most commercial off-the-shelf (COTS) tools are limited to a polarity evaluation or a mood classification, according to Erik Cambria, a scholar at Nanyang Technological University. Each research platform reduces human emotions to a limited set of categories, and such methods rely mainly on the parts of a text in which emotional states are explicitly expressed. Hence, they cannot capture opinions and sentiments that are subtle or implicit in YouTube comments.

YouTube: Chatbots are artificial conversational entities, that is, computer programs that conduct conversations and post comments. A classic example of the malicious use of chatbots was Yahoo Messenger’s inability to deal with the spam unleashed by its bots. As Arun Uday pointed out in a TechCrunch article, “In fact, so bad was the problem that it forced Yahoo to contrive the now familiar captcha code to prevent bots from automatically entering chat rooms” (Uday, 35). Although chatbots have inspired academic researchers to develop techniques for distinguishing bots from humans, there is no record of any program detecting them with 100% accuracy.

IBM: The latest version of Tone Analyzer was released in July 2016. Although human emotions extend far beyond a handful of categories, I use the five basic emotions (anger, disgust, fear, joy, and sadness) as a starting point. IBM Bluemix free accounts can process up to 1,000 queries per day.

Excel to JSON: When YouTube comments contain emojis or foreign-language characters, JSON files can store them, but Excel cannot convert them to Unicode. This issue increased the number of comments filtered out by Semantria and Netlytic. In this project, I read the Excel files manually and deleted the invalid characters and emojis. A Python cleanup script could solve the issue faster, but it requires programming knowledge that researchers may not have.
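A cleanup script of the kind mentioned above might look like the following sketch. It was not part of the project; the choice of Unicode categories to drop and the function `strip_unsupported` are assumptions, and the approach removes most emoji while keeping letters from any language.

```python
import unicodedata

# Categories removed: other symbols (most emoji), controls, surrogates,
# private-use and unassigned code points. This choice is an assumption.
DROP_CATEGORIES = {"So", "Cc", "Cs", "Co", "Cn"}
# Emoji joiners and variation selectors that would otherwise be left behind.
DROP_CHARS = {"\u200d", "\ufe0e", "\ufe0f"}


def strip_unsupported(text):
    """Remove emoji and invalid characters but keep letters of any language."""
    kept = []
    for ch in text:
        if ch in DROP_CHARS or unicodedata.category(ch) in DROP_CATEGORIES:
            kept.append(" ")  # replace with a space so words do not merge
        else:
            kept.append(ch)
    return " ".join("".join(kept).split())  # collapse repeated whitespace


print(strip_unsupported("Great speech \U0001F600!"))
```

Because the filter works on character categories rather than a fixed list of emoji, accented and non-Latin text passes through unchanged, which avoids discarding the non-English comments the research questions depend on.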

Semantria: As mentioned earlier, Semantria has the following input limits:

● Configurations/Profiles - 10 configurations/profiles allowed with up to 50 characters

per configuration/profile

● Blacklist - 100 items of up to 50 characters allowed


● Queries - 100 queries allowed with up to 50 characters title and up to 1500 characters

query text.

● Categories - 100 categories allowed with up to 50 characters titles and up to 10 samples

per category (sample can be up to 50 characters as well)

● Entities - 1000 entities allowed with up to 50 characters per title and entity type.

● Sentiment-bearing Phrases - 1000 phrases allowed with up to 50 characters per phrase

● Maximum size of document ID - 36 characters

● Allowed document size - 2048 characters per document
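The query limits listed above can be applied before submission, as in this sketch. Grouping the remaining comments into batches of 100 is an assumption about how the limit might be handled in practice; the report states only that longer comments were filtered out. The names `batch_queries`, `MAX_QUERIES`, and `MAX_QUERY_CHARS` are hypothetical.

```python
MAX_QUERIES = 100        # queries allowed, per the limits listed above
MAX_QUERY_CHARS = 1500   # characters allowed in each query text


def batch_queries(comments):
    """Drop over-long comments, then group the rest into batches of 100."""
    short_enough = [c for c in comments if len(c) <= MAX_QUERY_CHARS]
    return [
        short_enough[i:i + MAX_QUERIES]
        for i in range(0, len(short_enough), MAX_QUERIES)
    ]


comments = ["ok"] * 250 + ["x" * 1501]
batches = batch_queries(comments)
print([len(batch) for batch in batches])
```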

Using Watson for knowledge management

Alchemy Language is a collection of APIs that provide natural language processing through text analysis (Watson, n.d.). These APIs make it possible to impose structure on unstructured text. At the time of writing, they provide a number of potential knowledge management capabilities, including the following:

• Entity extraction (cloud computing, United States);

• Sentiment analysis (documents are given a score and a rating of positive or negative);

• Emotion analysis (documents are categorized for the amounts of different emotions);

• Keyword extraction (keywords are extracted);

• Concept tagging (different concepts are tagged and rated for relevance);

• Relation extraction (relationships between companies or companies and capabilities);

• Taxonomy classification (categorization);

• Author extraction (author information, if present, is extracted);

• Language detection (for example English);

• Text extraction.

Watson provides additional knowledge management capabilities. Rather than having a person track who authored a document, what its keywords might be, or how a particular knowledge contribution should be catalogued in a taxonomy, the Watson system provides those capabilities itself. In particular, Watson provides a number of capabilities designed to facilitate content management, so a number of classic knowledge management librarian tasks appear to be automated. Watson also expands on those librarian capabilities. For example, the “emotion analysis” service discussed above analyzes a sample of text to detect anger, disgust, fear, joy, and sadness, and the results can be used as described above or for other purposes.

Authentication

The Tone Analyzer API uses basic authentication. You access it by supplying the username and password specified in the service credentials of the service instance that you want to use. After you have created a Tone Analyzer instance, go to its dashboard page and click Service Credentials in the left-hand navigation to see the instance's username and password.
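For reference, HTTP basic authentication sends the username and password joined by a colon and Base64-encoded in an `Authorization` header. The sketch below builds that header value; the credentials shown are placeholders and `basic_auth_header` is a hypothetical helper name.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


print(basic_auth_header("user", "pass"))  # prints "Basic dXNlcjpwYXNz"
```

Base64 is an encoding rather than encryption, so these credentials are only protected when the request is sent over HTTPS.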

VI. Components

• iOS 8.0+

• Xcode 9.0+

• Swift 3.2+ or Swift 4.0+

Procedure
As an alternative to the steps below, you can create this project as a starter kit on IBM Cloud, which automatically provisions the required services and injects the service credentials into a custom fork of this pattern. Otherwise, follow these steps:

i. Install developer tools

ii. Install dependencies

iii. Create a Tone Analyzer service instance

iv. Run

Prerequisites

• Sign up for an account at IBM Cloud

• Download the IBM Cloud CLI

• Create an instance of the Tone Analyzer service and note down the service credentials:

    • Go to the Tone Analyzer page in the IBM Cloud Catalog

    • Log in to your IBM Cloud account

    • Click Create

    • Click Show to view the service credentials

    • Copy the apikey value, or the username and password values if your service instance does not provide an apikey

    • Copy the url

Configuring the application

• In the application folder, copy the .env.example file to a new file called .env:

    cp .env.example .env

• Open the .env file and add the service credentials that you obtained in the previous step. If your service instance uses username and password credentials, add the TONE_ANALYZER_USERNAME and TONE_ANALYZER_PASSWORD variables to the .env file.
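To illustrate what the .env file contains and how an application might read it, the sketch below parses KEY=VALUE lines in Python. The values are placeholders, and `parse_env` is a hypothetical helper written for this example; the actual project may load the file with a library instead.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


example = """
# placeholder values, not real credentials
TONE_ANALYZER_USERNAME=my-username
TONE_ANALYZER_PASSWORD=my-password
"""
config = parse_env(example)
print(sorted(config))
```

Keeping credentials in .env rather than in source code means they can stay out of version control.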

VII. Individual Contribution

I was the team leader, and I spent a significant amount of time advising other team members on specific project tasks, such as implementation, to help the group attain its objectives. Throughout the project, I demonstrated leadership in a variety of project areas. Each team member was given a section of the assignment to complete. Once everyone's contributions were gathered, the paper was proofread for formatting, grammar, and punctuation mistakes, uploaded for evaluation and approval by the group, and then posted as the final product.

VIII. Conclusion

The Tone Analyzer uses linguistic analysis to detect emotional, social, and language tones. The service can evaluate the tone of a whole document as well as of a single sentence. It can be used to discover how others perceive your written communications and, as a result, to improve the tone of your conversations. Businesses may use the service to study and improve their client conversations, as well as to learn the tone of their clients' interactions and respond to them accordingly.

You provide the service with JSON, HTML, or plain text input containing your written content. The service accepts up to 128 kilobytes of text, which is approximately one thousand sentences, and returns JSON data describing the tone of your input. These insights can help you improve the perception and effectiveness of your communications by ensuring that your writing has the tone and style you want for your target audience.
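Because the size limit is stated in kilobytes rather than characters, it is safest to measure the encoded text before submitting it. The sketch below assumes UTF-8 encoding and treats one kilobyte as 1,024 bytes; both are assumptions, as is the helper name `fits_size_limit`.

```python
MAX_INPUT_BYTES = 128 * 1024  # 128 kilobytes, the limit stated above


def fits_size_limit(text):
    """Check the UTF-8 encoded size, since the limit is in bytes."""
    return len(text.encode("utf-8")) <= MAX_INPUT_BYTES


print(fits_size_limit("A short message."))
```

Non-ASCII characters occupy more than one byte in UTF-8, so a text can exceed the limit even when its character count appears to be under it.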

REFERENCES

1. Wood, Laura. "Worldwide $1.71 Billion Emotion Analytics Market 2016-2022: Drivers, Opportunities, Trends, and Forecasts - Key Players are Microsoft, IBM, Retinad VR, Neuromore, Imotions, Kairos, Affectiva & Eyris." Nasdaq GlobeNewswire. 13 Jan. 2017. Web. 16 Feb. 2017.

2. Cobb, Michael. "Measuring Emotion in a Volatile Election." Possible. 07 Nov. 2016. Web. 30 Jan. 2017.

3. McDuff, Daniel, Rana El Kaliouby, Evan Kodra, and Rosalind Picard. "Measuring Voter’s Candidate Preference Based on Affective Responses to Election Debates." Affective Computing. 02 Sept. 2013. Web. 30 Jan. 2017.

4. Heubl, Ben. "How to apply face recognition API technology to data journalism with R and python." Data dico. 20 Oct. 2016. Web. 30 Jan. 2017.

5. Brader, T. "Striking a responsive chord: How political ads motivate and persuade voters by appealing to emotions." American Journal of Political Science, vol. 49, no. 2, pp. 388–405, 2005. 30 Jan. 2017.

6. Dahlke, Dan. "Election 2016: Analyzing Real-Time Twitter Sentiment with MemSQL Pipelines." MemSQL. 18 Oct. 2016. Web. 31 Jan. 2017.
