Customer Insights Analyst Task

Aim:
Label 300 reviews with tags that will help this e-commerce business measure customer
experience performance and pinpoint improvements.

Background:
At SentiSum, we work in partnership with some of the most globally recognised brands,
helping their Customer Experience teams understand and act on improvements flagged
by customers in conversations and survey responses. Machine learning models that can
understand the nuances of human conversation are a must for our insights to make a
real impact in these organisations.

To train these models, we have to label texts with the appropriate topics that pick out
the information our customers will be interested in. Our models will then recognise this
crucial information, predicting the topics you teach them across 100,000s of comments.

Task:
1. Label 300 comments with the relevant topics, as shown in the example below.
2. Give topics and their definitions with examples in a separate sheet, as shown in
the example below.
3. Send the links/documents to kirsty@sentisum.com

Tips:
1. This task must be completed manually. This is not a data science job posting; it
is for manual labelling and data annotation.
2. Most importantly, think like a customer experience manager or a head of
customer support: what would they want to learn from these customer comments
to help them measure performance and pinpoint areas of improvement?
3. Be specific: knowing that booking is an issue is not enough, because a customer
experience manager, for example, will not know what about booking to improve.
4. Cover as much of the data as possible, but keep the topics relevant (remember
no. 1!). It's important that your topics cover as much of the information that
comes up as possible for accurate results, but the topics should also add value
to a business.
5. Use easy-to-understand, unambiguous names. One of our main aims is to reduce
the time our customers need to understand an insight. That's why really clear
topic names are important: customers should not need to read through the
comments themselves to understand an insight.
6. Spend time on data exploration before going straight into labelling.
7. Look beyond the words people use to describe their experience, and consider
the meaning behind those words.
8. There should be clear distinctions between different topics. If different topics are
based on very similar language in comments, the model will confuse these topics
when predicting.

Example:
Here is an example of the task to be completed, with a different dataset (do not pay
attention to these topics or try to replicate them, unless they are also present in the
task dataset). This is how the 300 comments should be labelled.
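As a purely hypothetical illustration (this comment and these topics are invented, not
taken from the example or task datasets), a labelled comment might look like:

Comment: "I was charged twice and no one has replied to my email for a week."
Topics: Double Charge; Slow Email Response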

Here is an example of how you should set out the topics, definitions and examples in
another sheet.
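Again purely as a hypothetical illustration (this topic is invented, not taken from the
example sheet), a row in the topics sheet might look like:

Topic: Double Charge
Definition: The customer reports being billed more than once for the same order.
Example: "I was charged twice for my order."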
See the data here; this file is view-only, so you will need to copy the dataset to another
file for labelling:
https://docs.google.com/spreadsheets/d/1cgrGVX-RXUHsjGoqhhQJUxZWKTmZsc1VnVtpXG15JvE/edit#gid=0
