DEBUNKING NATURAL LANGUAGE PROCESSING
AN INSIDE LOOK AT THE ENGINE THAT POWERS
TEXT ANALYTICS AT CLARABRIDGE
A CLARABRIDGE EBOOK
Table of Contents
Introduction
Conclusion
About the Author
About Clarabridge
Introduction
When I started working at Clarabridge six years ago, natural language processing (NLP) and text analytics were very technical-sounding terms. My family and peers were baffled when I tried to explain the field that I worked in. Since then, though, the term NLP has experienced a rebirth as marketers have tried to capitalize on the perceived sexiness and sophistication of related technologies.

While it still preserves a very specific meaning in academia and research organizations, NLP has been watered down in the customer experience (CX) space and often just refers to lists of words and topics. This change in semantics is unsurprising. As NLP toolkits like Stanford NLP, NLTK, spaCy, IBM’s Watson and Google’s SyntaxNet become more consumable, they are quickly divorced from their technical origins. One no longer needs to be a software developer or a computational linguist to use NLP technologies. These toolkits “blackbox” a lot of their complexity so that users can focus on the outputs, not on the algorithms. However, as a result, many users leverage only a small fraction of these systems’ capabilities by focusing on outputs that are easy to consume (e.g., word clouds). In an attempt to recapture the complexities of NLP, some individuals have embraced a related academic term, natural language understanding (NLU), to refer to computational approaches to understanding meaning behind language, instead of just abstract patterns.

Let’s be honest: Just grouping data into topics or categories isn’t enough anymore. For a while, it was exciting and novel to be able to categorize your data into topics with a few keywords and phrases.
© Clarabridge. All rights reserved.
PART 1
UNDERSTANDING THE MEANING OF WORDS
employees and second, the livelihood of every organization is tied to its customers. Without a doubt, people are the essential cogs in the business workflow; we designed our Person detection functionality to cater directly to this keystone.

Clarabridge offers three attributes that populate automatically when you load data that assist in identifying individuals: Person, Phone Number and Email Address. These attributes provide immediate value as you can determine which of your employees are mentioned in your data as well as identify which customers are seeking engagement by providing their contact information.

Prior to the release of these features, identifying these kinds of attributes was extremely difficult in Clarabridge. Users would add the top male and female names from the most recent census to a category node. But, as you can imagine, names like Mark, Sue, Will, Bill, April, May, June, Ray, Angel, Guy, Bob, Virginia, Rose, Ruby, Grace, Dawn, Amber, Joy, Terry, Penny, Kay, Violet, Daisy and Barb caused a real headache! Similarly, there was no simple way to query for all of the different variations in the structure of phone numbers or to get a list of all email addresses mentioned. The power of a mature NLP engine is that it can use techniques that go far beyond simple keyword matches to find these kinds of entities. Our approach, which combines both lists of known names and language-specific linguistic rules, yields higher precision and higher recall of detected names and contact information.
How are these semantic attributes
best leveraged? Here are popular
use cases for these features:
2. Customer Engagement
Quickly identify customers who are seeking
engagement with your brand by finding all
records that contain contact information
using our Phone Number and Email Address
attributes. Customers may or may not be
explicit when soliciting a response. Picking
up on subtle cues like the presence of a
phone number or email address in text and
then proactively contacting the customer
is sure to make a positive impression.
3. Influencer Identification
Not only will Clarabridge pick up on the names of your employees, it will also find names of other individuals in the data. For example, it’s not uncommon to see names of performers, politicians, business executives and other celebrities bubble up in reports for Person. It can be incredibly insightful to discover these external influences on your customers’ perceptions. When analyzing Las Vegas casino reviews, we found lots of negative mentions of Elton John. Turns out that he had been canceling lots of shows and customers were not pleased!

The above-mentioned use cases have led to valuable insights for our customers, and we’re excited to see how they continue to discover more and expand our use case library.
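
To make the idea of combining a list of known names with linguistic rules a little more concrete, here is a minimal Python sketch. It is not Clarabridge's implementation: the tiny name list, the two regular expressions and the capitalization rule are all assumptions chosen for illustration, and the phone pattern only covers common North American formats.

```python
import re

# Hypothetical mini-lexicon; a real system would use a far larger name list.
KNOWN_FIRST_NAMES = {"mark", "sue", "will", "may", "rose", "grace", "joy"}

# Illustrative patterns only; real-world phone and email formats vary much more.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
PHONE_RE = re.compile(r"(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")
TOKEN_RE = re.compile(r"[A-Za-z']+|[.!?]")


def find_emails(text):
    """Return every substring that looks like an email address."""
    return EMAIL_RE.findall(text)


def find_phones(text):
    """Return every substring that looks like a 10-digit phone number."""
    return [m.group().strip() for m in PHONE_RE.finditer(text)]


def find_person_names(text):
    """Keep a known name only when simple context rules support it.

    Assumed rules: the token must be capitalized, and if it starts a
    sentence it must be followed by another capitalized word (a likely
    surname). This filters out "Will you..." and lowercase "joy".
    """
    tokens = TOKEN_RE.findall(text)
    names = []
    for i, tok in enumerate(tokens):
        if tok.lower() not in KNOWN_FIRST_NAMES or not tok[0].isupper():
            continue
        sentence_initial = i == 0 or tokens[i - 1] in ".!?"
        next_tok = tokens[i + 1] if i + 1 < len(tokens) else ""
        if not sentence_initial or next_tok[:1].isupper():
            names.append(tok)
    return names
```

Even this toy rule shows the trade-off described above: it stops treating "Will you call me?" as a person, but it would still mistake the month in "in May" for a name, which is why production systems layer many more language-specific rules on top of the list.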
Detecting Events
For many people, September 19 is a remarrrrrkable occasion for which pirate memes and puns abound. National Talk Like a Pirate Day, invented in 1995 as an inside joke between two friends after a contentious racquetball game, has become an internationally acclaimed parodic sensation. (September 19 also happens to be my birthday.)

Events bookmark our perception of time and give us a shared cultural experience. Specific events also help keep certain businesses afloat. Many retailers make more money between Black Friday and Christmas than they do the entire rest of the year.

events take place on specific days, some span weeks or months, and each comes with its own traditions and customs.

For those in the customer experience space, understanding consumer behaviors in relation to specific events can be extremely valuable information as it can power development, marketing and commercial decisions. In addition to detecting the named entities described in previous sections, Clarabridge automatically tags events of all kinds. Our dictionary of events is geographically diverse and includes entries for all of the event sub-types mentioned in the previous paragraph, including Diwali, Halloween, Chinese New Year, Easter, Shabbat, The Olympics, The Oscars, Black History Month, back-to-school, weddings,
How are these event attributes best leveraged? Here are popular use cases for this feature:

1. Seasonal Trends
Track mentions of events in conjunction with discussions of sales and promotions (Independence Day Sale, Black Friday, Cyber Monday) to determine which ones are generating buzz. Filter down these mentions to those discussing availability to determine if demand is outpacing supply at critical times of the year.

2. Gift-Giving Opportunities
Determine for which occasions products are being purchased by individuals for themselves or as gifts for others. Many products and brands are favored for certain milestones. Mentions of weddings, engagements, baby showers, graduations and so on may help highlight how best to market and price items to target specific buyers.

3. Product Uses
Discover how products are being used by analyzing them in conjunction with

These examples have led to valuable insights for our customers, and we’re excited to see how they continue to discover more uses for this feature.
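
For readers curious what dictionary-based event tagging might look like, the following Python sketch maps a handful of surface forms to canonical event labels. The dictionary entries, label names and the `tag_events` function are hypothetical; the product's actual dictionary is much larger and its matching logic is not described in this ebook.

```python
import re

# Hypothetical mini-dictionary: surface form -> canonical event label.
EVENT_DICTIONARY = {
    "black friday": "Black Friday",
    "cyber monday": "Cyber Monday",
    "diwali": "Diwali",
    "chinese new year": "Chinese New Year",
    "lunar new year": "Chinese New Year",
    "back-to-school": "Back-to-School",
    "back to school": "Back-to-School",
    "wedding": "Wedding",
    "weddings": "Wedding",
}

# Try longer phrases first so multi-word entries win over shorter ones.
_ALTERNATIVES = "|".join(
    re.escape(form) for form in sorted(EVENT_DICTIONARY, key=len, reverse=True)
)
_EVENT_RE = re.compile(r"(?<!\w)(" + _ALTERNATIVES + r")(?!\w)", re.IGNORECASE)


def tag_events(text):
    """Return the canonical label of every dictionary event found in text."""
    return [EVENT_DICTIONARY[m.group(1).lower()] for m in _EVENT_RE.finditer(text)]
```

Mapping several surface forms to one label is what lets a report count "back to school" and "back-to-school" as the same event.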
1 Emoticons, a portmanteau of “emotional icons,” consist of punctuation marks, letters or numbers that are used to express an emotional
state, such as :) or :(.
2 Emojis are pictographs of faces, characters, objects, actions and symbols, such as or .
DEBUNKING NATURAL LANGUAGE PROCESSING | PART 1
comments about the FIFA World Cup when suddenly we found ourselves confronted by a fierce herd of goats: . Our brief confusion subsided as we realized the use of the emoji revealed which player each country believed was deserving of the Greatest Of All Time (G.O.A.T.) moniker! Similarly, you might look for to determine which products socially hip customers find attractive or exemplary, as in “lit,” or where they are in full agreement with a new idea or campaign. Keep your eyes peeled for innovative uses of emojis; you never know what you’ll find!

In Clarabridge’s NLP engine, emojis and emoticons are treated the same as words. Variations on common emojis or emoticons are mapped to standard tokens. For example, :) and :D both map to :). These tokens can be tuned to contain sentiment. The meaning of most emoticons and emojis is context-sensitive. However, there is a small subset of unambiguous emojis and emoticons to which Clarabridge assigns out-of-the-box positive or negative sentiment values.

Analysts can choose to analyze emojis in much the same way that they would approach words. Here are the top use cases for emoticons/emojis:

1. Emoji Cloud
Create a word cloud of just emojis to explore how customers are reacting to your brand. Emotional faces and other symbols can quickly give you a sense of brand perception and may even clue you in to product uses.

2. Correlation with Products and Services
Drill down on specific products, services and experiences to see which emojis are used in conjunction with each one. Or, drill down on a specific emoji to see which products, services and experiences are being discussed. Understanding reactions to these offerings can assist in designing, marketing and promoting your products and services.
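
The mapping of emoticon variants to standard tokens, with sentiment attached to the unambiguous ones, can be sketched in a few lines of Python. Only the ":) and :D both map to :)" example comes from the text above; the other variants, the sentiment values and the function names are assumptions made for illustration.

```python
# Hypothetical variant table: every variant maps to one standard token.
EMOTICON_VARIANTS = {
    ":)": ":)", ":-)": ":)", ":D": ":)", "=)": ":)",
    ":(": ":(", ":-(": ":(", ":'(": ":(",
}

# Assumed out-of-the-box sentiment for the unambiguous standard tokens.
TOKEN_SENTIMENT = {":)": 1, ":(": -1}


def normalize_tokens(text):
    """Split on whitespace and replace emoticon variants with standard tokens."""
    return [EMOTICON_VARIANTS.get(tok, tok) for tok in text.split()]


def emoticon_sentiment(text):
    """Sum the sentiment carried by the normalized emoticon tokens."""
    return sum(TOKEN_SENTIMENT.get(tok, 0) for tok in normalize_tokens(text))
```

Because the emoticons become ordinary tokens after normalization, they can be counted, clouded and correlated exactly like words.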
1. Isolation of Suggestions
and/or Requests
Let your customers guide you to success
by isolating their suggestions/requests
in your dataset. Analyze the top topics of
suggestions to understand where your
business could improve; track KPIs for
this customer group to determine how
any product or service deficiencies are
affecting their loyalty. Look at topics of
requests (questions) to determine where
you may need to amp up your messaging
or clarify content on your website.
2. Noise Filtering
Hearing the customer’s voice can be challenging when it’s swallowed up in a sea of non-actionable messages. Use our non-actionable sentence types to filter out the content that you don’t want to see. This step helps to part the seas, so to speak, allowing you to hear the feedback that you were missing.

3. Route Cries for Help or Threats of Churn
Don’t miss out on an opportunity to help your customer! When customers ask for a return contact, leave their phone number or email address in their response, or threaten to leave your business, they are pleading for assistance. Use the Cries for Help or Churn sentence types to automatically trigger an alert and a case within our Case Management system so that it gets routed to the agent who can best help.
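
As a rough illustration of how sentence types could drive routing, the Python sketch below labels each sentence with a few hand-written patterns and flags the ones that should raise an alert. Everything in it is hypothetical: the patterns, the type names `churn`, `cry_for_help` and `suggestion`, and the `route` function stand in for the product's real sentence-type detection and Case Management integration, which are not documented here.

```python
import re

# Hypothetical patterns; a production engine would use far richer rules.
SENTENCE_TYPE_PATTERNS = {
    "churn": re.compile(
        r"\b(cancel(?:ing)? my (?:account|subscription|service)"
        r"|switch(?:ing)? to|tak(?:e|ing) my business elsewhere)\b",
        re.IGNORECASE,
    ),
    "cry_for_help": re.compile(
        r"\b(please (?:call|contact|email) me|call me back|someone (?:please )?help)\b",
        re.IGNORECASE,
    ),
    "suggestion": re.compile(
        r"\b(you should|it would be (?:nice|great) if|i wish)\b", re.IGNORECASE
    ),
}

# Sentence types that should open a case for an agent.
ALERT_TYPES = {"churn", "cry_for_help"}


def classify_sentence(sentence):
    """Return the sorted list of sentence types whose pattern matches."""
    return sorted(
        name for name, pattern in SENTENCE_TYPE_PATTERNS.items()
        if pattern.search(sentence)
    )


def route(feedback):
    """Return (sentence, types) pairs for sentences that warrant an alert."""
    sentences = re.split(r"(?<=[.!?])\s+", feedback.strip())
    alerts = []
    for sentence in sentences:
        types = classify_sentence(sentence)
        if ALERT_TYPES.intersection(types):
            alerts.append((sentence, types))
    return alerts
```

Classifying at the sentence level, rather than the document level, means one neutral sentence does not hide the plea for help that follows it.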
Detecting Spam
Waitress:
Well, there's egg and bacon
Egg, sausage and bacon
Egg and spam
Egg, bacon and spam
Egg, bacon, sausage and spam
Spam, bacon, sausage and spam
Spam, egg, spam, spam, bacon and spam
Spam, sausage, spam, spam, spam, bacon, spam, tomato and spam
Spam, spam, spam, egg and spam
Spam, spam, spam, spam, spam, spam, baked beans, spam, spam, spam and spam

Ah, Monty Python at its finest. And, recently, an apt description of my email inbox and my voicemail. Spam, Hawaii’s favorite processed meat and Western culture’s favorite protein to mock, has an impressive résumé. Hormel offers 15 flavors of Spam and has sold over 8 billion cans across 44 countries since its introduction in 1937. But, Spam’s influence isn’t just culinary; it has also had an understated influence on our technological world. Inspired by the Monty Python sketch, certain abusive users of Bulletin Board Systems and Multi User Dungeons would repeat “spam” a massive number of times to scroll other previous messages off the screen. Soon, the term became a moniker for the unwanted junk we find on the Internet that obfuscates the content that we actually want to see.
Not all spam, though, is created equal. By classifying the types of inorganic messages that appear frequently in customer feedback data, we can gain a better picture of how the market views a brand or product. The types of spam present in each dataset may vary. On social media, auto-generated messages in content such as news headlines, reviews and articles abound. In email sources, job requests or solicitations for corporate sponsorship may get in the way. Analyzing the language used in the inorganic messages bears its own utility. When these spam documents mention a brand or product, they may reveal how customers and potential customers use, advertise and perceive that brand or product.

documents upon ingesting or to retain them for analysis. Analysts can also customize this feature by training the algorithm on their own data and with custom sub-types. With the Clarabridge Content Type Detection, users can go beyond basic topic and sentiment analysis. Considering the integrity of a message provides a different dimension and a unique analytical lens for many stakeholders that would be missed or misinterpreted if all documents were viewed the same way or were viewed purely by their independent words. Just the same as how “egg and spam” is not the same as “spam, spam, spam, egg and spam.” Bon appétit.
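
To show what training a content-type detector on your own data with custom sub-types could look like in principle, here is a small Naive Bayes classifier in plain Python. Naive Bayes is my stand-in technique; this ebook does not say which algorithm Clarabridge uses. The class name, the three labels and the six training sentences are all invented for the example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled examples; real training sets would be far larger.
TRAINING_DOCS = [
    ("great service but the wait was long", "organic"),
    ("my order arrived late and damaged", "organic"),
    ("click here to win a free gift card", "advertisement"),
    ("free followers click this link now", "advertisement"),
    ("we are hiring apply for this job today", "job_posting"),
    ("send your resume to apply for the job", "job_posting"),
]


class NaiveBayesContentTyper:
    """Multinomial Naive Bayes with add-one smoothing over whitespace tokens."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def train(self, labeled_docs):
        for text, label in labeled_docs:
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)

    def predict(self, text):
        tokens = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.doc_counts[label] / total_docs)
            label_total = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok] + 1
                score += math.log(count / (label_total + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

Adding a custom sub-type in this sketch is just a matter of supplying examples with a new label and retraining.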
Translation
Bad translations are ubiquitous
in our globalized world. There
are seemingly infinite listicles
showcasing very comical
mistranslations. My favorite
service, though, might be
TranslationParty, which takes your
sentence and translates it back
and forth through Google Translate
between English and Japanese
until it reaches equilibrium. The
result is inevitably a bastardized
but hysterical interpretation
of your original sentence.
However, neither human nor machine translation is perfect. Both are vulnerable to misinterpretation. Machine translation excels at translating certain words, especially nouns. Verbs and adjectives, on the other hand, are more difficult to translate automatically; they often have subtleties that are lost in translation. While it’s not so hard to translate a pronoun, a preposition or a conjunction in isolation, machine translation can end up completely reorganizing them throughout the sentence. Add idioms, hyperbole, sarcasm and named entities to a sentence and we have significantly complicated the task for both a human and a machine. While analyzing a translated text, a human can make inferences about the meaning even when the grammar or vocabulary is flawed. However, when we ask an NLP engine to interpret an imperfect translation, we are asking for trouble.

Analyzing text through computational means is also a very difficult task. It’s taken the industry many decades to produce suitable, scalable systems that work on well-formatted, natively written text. As we’ve established, translating text often warps keywords and grammatical elements, which makes natural language processing even more challenging. Attempts at feeding poorly translated text through an NLP engine may result in failures to identify negation, modification, sentiment, associated words and topics. The subtleties of meaning, emotion and intent are inevitably lost, jeopardizing your ability to relate to your customer. While NLP engines (including Clarabridge’s) can be used to analyze both native and translated customer feedback, the output from text in its original language will always be superior in quality and accuracy.

There certainly are cases where translating text is an essential component of business. Marketers must translate web content to target new markets. News agencies must translate headlines in international environments. This pattern does not apply to customer experience management or customer experience analytics. In the CX ecosystem, native-language text reigns supreme. You should make every effort to preserve your customers’ intended messages in their truest form. By doing so, you improve your ability to understand the messages fully and to offer customers the most empathetic service possible.

A comment I often hear is “But Ellen, I don’t have anyone on my team that speaks [insert your exotic language here]!”

That is a fair point. It’s better to understand your customers partially than to ignore them completely for not speaking your lingua franca. In this situation, I encourage you to translate the keywords in your topic categories rather than translating the text itself. This procedure is most
Sarcasm
When I tell people that I work on text analytics products, their first question is often “But, how do you handle sarcasm?” My typical response, “About as well as a human, which is to say not very well at all,” is equally sincere and sarcastic, a poetic homage to the difficulty of the linguistic problem.

Sarcasm is mired in a complex web of deep cultural knowledge, emotional sensitivity and individual awareness. Some cultures (looking at you, my British friends) find comfort in the dryness of the humor, whereas others find it distasteful or even disrespectful. Sarcasm can be subtle or obtuse; flattering or insulting; funny or offensive. In sarcastic expressions, the words used are only a small piece of the puzzle. Tone matters, body language matters, context matters. Given the high degree of tangible and intangible awareness necessary for sarcasm to succeed, it is perhaps unsurprising that non-native speakers struggle to master both delivery and understanding of sarcasm in learned languages.

All of these components make sarcasm an extremely difficult linguistic problem to study. Linguists have classified sarcasm into specific sub-types including irony, satire, passive aggression and flattery. They’ve determined that your brain works differently when processing sarcastic comments in comparison to sincere ones. Others have identified the facial tics that betray an otherwise earnest face. Most of this research, though, is conducted through individual face-to-face analysis. Conducting broad analyses of sarcasm through text analytics is nearly impossible. Text-only communication lacks tonal and visual cues, making it highly susceptible to misunderstanding and misinterpretation. Other clues for correct interpretation of a comment may be baked directly into the medium and shared among some of its participants, but these clues may be imperceptible or opaque to outsiders.

An NLP engine is not a native speaker of any human language. It understands the rules or the patterns that its human overlords have programmed into it, but it lacks any linguistic intuition. It can understand words used, relationships between those words, and maybe even emotion and intent, but it fails when meaning transcends content. It may understand its context or its purpose but will undoubtedly fail when other niche cultural or societal knowledge is injected into a witty retort. As humans, we could all listen to or read the same passage and walk away with different understandings of its intent. An NLP engine fares about the same. In some situations, it will interpret an ironic comment correctly; in other cases, it will completely miss the mark and produce the exact opposite of the sentiment that a native speaker would expect.

The Clarabridge NLP engine errs on the side of sincerity, but, given the flexibility in our sentiment engine, users have the power to customize rules to support common sarcastic phrases that appear in their dataset. I’ve found success in customizing rules for two specific types of sarcasm.

vice versa) such as in the sentences “I love sitting in traffic” or “going to the dentist is the best.” Clarabridge understands word associations, parts of speech and sentiment and allows users to leverage this word-level metadata in the construction of sentiment rules. Users could construct a rule that negated every positive verb associated with “dentist” or “traffic” or “[insert emotionally charged word from your industry here].”

2. Social media posts are now often suffixed with #sarcasm, #sarcastic or #not to aid a reader in interpretation of an otherwise ambiguous post. Users can tune sentiment based on specific hashtags and the positions of these hashtags within posts.
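
Here is a minimal Python sketch of the two kinds of rules described above. It is a simplification built on stated assumptions: the toy sentiment lexicon is invented, "associated with" is approximated by a trigger word appearing in the same sentence, only a hashtag in final position counts, and none of the names reflect the actual rule syntax of the Clarabridge sentiment engine.

```python
import re

# Hypothetical toy lexicon and rule inputs.
POSITIVE_WORDS = {"love": 1, "best": 1, "great": 1}
SARCASM_TRIGGERS = {"traffic", "dentist"}
SARCASM_HASHTAGS = {"#sarcasm", "#sarcastic", "#not"}


def score_sentence(sentence):
    """Score a sentence, flipping positive sentiment when a sarcasm rule fires."""
    tokens = re.findall(r"#?\w+", sentence.lower())
    score = sum(POSITIVE_WORDS.get(tok, 0) for tok in tokens)

    # Rule 1 (assumed): positive words near an emotionally charged trigger
    # word are read as sarcastic. "Near" here means "in the same sentence".
    flip = any(tok in SARCASM_TRIGGERS for tok in tokens)

    # Rule 2 (assumed): a sarcasm hashtag in final position flips the post.
    if tokens and tokens[-1] in SARCASM_HASHTAGS:
        flip = True

    return -score if flip and score > 0 else score
```

With these rules, “I love sitting in traffic” scores -1 while “I love this phone” still scores 1. The sketch will of course misfire on a sincere “I love my dentist,” which is exactly why such rules are best tuned to the phrases that actually appear in your own dataset.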
Conclusion
With increasing pressure on customer experience teams to deliver value and customer-centric solutions, there’s no time to meander around looking for insights.

Topic reports lack the oomph needed to truly change a business for the better. By analyzing the intents, emotions and effort expressed in feedback data, analysts can better understand their customers’ needs and wishes and cut through all of the noise with a few swift clicks. It’s no longer sufficient just to know what your customers are talking about; you need to also understand how and why they’re talking about it.

If you intend to make your customers the centerpiece of your customer experience strategy, you must also make them the key part of your customer experience analytics. Understanding your customers, just like deeply understanding your family, your friends, your boss, your co-workers, or your partner, is really about going beneath the surface by discovering what makes them tick. Each person or group of people is much more than a list of topics or a set of sentiment scores. They are nuanced individuals who are constantly evolving with new ideas, opinions and connections. It is only after we truly digest the factors that influence their experiences and their perspectives that we can empathetically and effectively design solutions that cater to their needs and preferences. A customer experience strategy that matches solutions to true needs will be much more intuitive and successful than one that takes surface-level assumptions and offers hasty answers. You owe it to yourself and your organization to spend more time getting to know your customers. After all, you’re a customer too, aren’t you?

As you embark on this mission, I encourage you to think carefully about the tools that are in your toolbox. If you endeavor to truly understand customer feedback, you need to leverage tools that go beyond
About Clarabridge
Clarabridge helps the world’s leading brands understand the true voice of their customers. Companies leverage Clarabridge’s omni-channel capabilities to listen to all customer interactions and conversations, analyze this data via a best-in-class AI-powered text analytics engine, and get actionable insights to make mission-critical decisions.

ADDITIONAL STRENGTHS INCLUDE:
• A world-class analytics platform with beautiful, interactive and easily customizable dashboards that can be tailored for every role in the organization, advanced predictive algorithms and sophisticated case management workflows
• A patented, best-in-class Natural Language Processing (NLP) platform specifically designed for Customer Experience Analytics that combines the latest AI and Machine learning technologies and offers accurate and nuanced topic, emotions, effort and intent detection