
Guideline for Question-Answering Annotation

In each task, you will see one product, one question about the product, plus 5 candidates. The 5 candidates come in one of
the following 2 formats:

● Semi-structured attributes, e.g., compatible_devices: { value:"im1, im3, rim300, im-1, im-3, im3r, js2, js3, mse" }
● Unstructured text, e.g., the supco kit is a universal replacement for most ge/sears/kenmore top freezer refrigerators.

The unstructured text is extracted from multiple sources on the Amazon product page. Each candidate is a one-sentence
snippet taken from the product description, a user review, or a community answer.
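
As a rough illustration, the two candidate formats could be told apart programmatically. The sketch below assumes that a semi-structured attribute always looks like a name, a colon, and a body in curly braces, as in the examples above. The function name is_semi_structured and the pattern are hypothetical and not part of the annotation tool.

```python
import re

# Assumed shape of a semi-structured attribute: name, colon, {...} body.
# This pattern is an illustration only, inferred from the guideline's examples.
ATTRIBUTE_PATTERN = re.compile(r"^\s*[A-Za-z_]\w*\s*:\s*\{.*\}\s*$", re.DOTALL)


def is_semi_structured(candidate: str) -> bool:
    """Return True if the candidate looks like `name: { ... }`."""
    return ATTRIBUTE_PATTERN.match(candidate) is not None


if __name__ == "__main__":
    print(is_semi_structured('platform: { value:windows }'))
    print(is_semi_structured("the cushion is light"))
```

Anything that does not match the pattern is treated as unstructured text, so an incomplete sentence such as "It is compatible with:" is not mistaken for an attribute.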

Task
For each of the 5 candidates, your task is to judge how much the candidate helps answer the question. You will choose
from the following 4 labels:

1. fully answering. The candidate contains all the information you need to draw the answer. It may take an extra
inference step to get the answer, but the candidate must contain enough information to reach it.
2. relevant but not fully answering. The candidate contains useful information that helps you learn more and
narrow down the range of the answer, but not enough to get the exact answer.
3. requires additional context to interpret. The candidate is not self-contained, and you cannot make a decision
without more context. Since the candidate is one snippet extracted from the product description, a user review, or
a community answer, you might need its surrounding sentences to interpret it. This option applies only to
unstructured text, not to semi-structured attributes.
4. irrelevant. The candidate provides no relevant information at all, and you learn nothing useful about the
question after reading it.

Note: If a candidate uses an uncertain expression, e.g., "it probably does not need the battery", "it is approximately 12” long",
or "it is likely to have a doll with it", then you should mark it as label 2, even if it seems to fully answer the question.
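
The uncertainty rule in this note can be sketched as a small check. The list of hedge words below is an assumption drawn from the examples in this guideline and is not exhaustive; the names HEDGE_PATTERN and apply_uncertainty_rule are hypothetical. The annotator's own judgment still decides whether an expression is uncertain.

```python
import re

# Assumed, non-exhaustive list of uncertain words taken from the examples.
HEDGE_PATTERN = re.compile(
    r"\b(probably|approximately|likely|seems?|somewhat)\b", re.IGNORECASE
)

FULLY_ANSWERING = 1
RELEVANT_NOT_FULLY_ANSWERING = 2


def apply_uncertainty_rule(label: int, candidate: str) -> int:
    """Downgrade label 1 to label 2 when the candidate is hedged."""
    if label == FULLY_ANSWERING and HEDGE_PATTERN.search(candidate):
        return RELEVANT_NOT_FULLY_ANSWERING
    return label


if __name__ == "__main__":
    print(apply_uncertainty_rule(1, "it has approximately 100 pages."))
```

Only label 1 is affected: a hedged candidate that was already judged as label 2, 3, or 4 keeps its label.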

Workflow
The following is an illustration of the workflow.
For one candidate:

● If it is a semi-structured attribute, select from labels 1, 2 and 4.
● If it is not a semi-structured attribute, decide whether it requires additional context to interpret. If yes, then select
label 3. Otherwise, select from labels 1, 2 and 4.
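
The workflow can also be written as a short function that returns the labels an annotator may choose from. This is only a sketch: the Label enum and the allowed_labels function are hypothetical names, and whether a candidate needs context remains the annotator's call, passed in here as a flag.

```python
from enum import IntEnum


class Label(IntEnum):
    FULLY_ANSWERING = 1
    RELEVANT_NOT_FULLY_ANSWERING = 2
    REQUIRES_CONTEXT = 3
    IRRELEVANT = 4


def allowed_labels(is_semi_structured: bool, needs_context: bool = False) -> set:
    """Return the labels permitted by the workflow for one candidate."""
    content_labels = {
        Label.FULLY_ANSWERING,
        Label.RELEVANT_NOT_FULLY_ANSWERING,
        Label.IRRELEVANT,
    }
    if is_semi_structured:
        # Label 3 never applies to semi-structured attributes.
        return content_labels
    if needs_context:
        return {Label.REQUIRES_CONTEXT}
    return content_labels


if __name__ == "__main__":
    print(sorted(allowed_labels(is_semi_structured=False, needs_context=True)))
```

A check like this could be used to reject an annotation that assigns label 3 to a semi-structured attribute.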

Examples
Here we show examples of the 4 labels:

fully answering

Fully answering means the candidate contains enough information to let you draw the answer. The criteria of fully
answering should NOT be over strict. You can be lenient with the word choice, as long as the candidate conveys the proper
meaning. For example:

-question: is it an awesome gift for my girlfriend?
-candidate: it is a nice valentine gift for your partner.

You shouldn’t worry about the difference between “awesome” and “nice”. The candidate is enough to tell you it’s a good gift
for your girlfriend and should therefore be judged as “fully answering”.

One more example:

-question: is it comfortable to sleep on for a 6” tall man?
-candidate: It is comfortable to lie down for tall people.

You shouldn’t be overly strict about whether 6” counts as “tall”, whether “lie down” is equivalent to “sleep on”,
etc. Use your common sense: if your immediate impression after reading the candidate is that it provides the needed
information, you should NOT overthink other ways of interpreting the candidate.

The following are more examples:

| question | candidate | comment |
| --- | --- | --- |
| is it comfortable to sleep on for a 6” tall man? | My 6” son can fit it with no problem | |
| Does it work with canon mp 495? | works with all cameras smaller than 3" | It requires additional knowledge to get the answer, but the candidate contains enough information. |
| is there a fax capability? | i could not find the fax capability described anywhere. | |
| what is the model number for the toner/ cartridge this unit uses? | model_number: { value:"fax2840" } | |
| Is it good for my 4-year-old son? | The toy is for kids 5 years old and up | |
| does the battery last long? | well we have had them for almost 2 weeks and one has never been turned off and is still going strong. | The candidate contains full information to tell you "the battery lasts long". |

relevant but not fully answering

Relevant but not fully answering means the candidate contains relevant information that is not enough to answer the
question, or that it could fully answer the question but the information is uncertain. “Relevant” means it provides useful
information that helps you learn more about the question or narrow down the scope of the answer.

For example:
-question: Is it good for my 3-year-old kid?
-candidate: my 5-year-old son likes it.

It cannot fully tell you whether a 3-year-old will like it, but knowing that a 5-year-old likes it is helpful information. It helps you
narrow down the range of the answer: you know it is for kids and not adults, but you cannot be sure it suits a 3-year-old.

The following are more examples:


| question | candidate | comment |
| --- | --- | --- |
| is it comfortable to sleep on for a 6” tall man? | My 6” son seems to be happy with it | The son seems to be happy, but it is not certain; this is an uncertain expression. |
| Does it work with canon mp 495? | works with most cameras smaller than 3" | It only says "works with most cameras smaller than 3"", which tells you useful information but cannot fully answer the question. |
| how heavy is the cushion? | the cushion is light | It tells you the cushion is light, but not the exact weight. |
| how many pages does it have? | it has approximately 100 pages. | The candidate has the uncertain word "approximately". Even if it contains the exact information, it should be judged as relevant but not fully answering. |
| what material is it made? | its material is somewhat cotton like | It tells you about the material but uses the uncertain expression "somewhat cotton like". |
| what is the average age for this product? | my youngest granddaughter is 2.5 yrs old and is able to maneuver ezy roller. | The candidate does not tell you the exact average age, but it tells you one age the product is suitable for. It is useful but cannot fully answer the question. |

requires additional context to interpret

Requires additional context to interpret means the sentence itself is incomplete or not self-contained (e.g., it refers to
previous sentences) and you need additional context to interpret it.

Again you should NOT be over strict on it. You should only select this option when the candidate is fully unclear to you
unless more context is provided.

For example:

-question: is there a fax capability?
-candidate: i could not find the fax capability described anywhere.

This should be judged as “fully answering” because it implies “there is no fax capability”. You should not overthink whether the
candidate might be a snippet describing something other than this product. Recall that all the candidates are snippets from
the product description, user reviews, or community question-answering. You can assume they all describe the provided
product. Use your common sense: if your immediate impression after reading the candidate is that it does not require
additional context to interpret, do NOT select this option.

| question | candidate | comment |
| --- | --- | --- |
| is it comfortable to sleep on for a 6” tall man? | Yes, it does. | The sentence is not self-contained; we don't know what it is about. It could be an answer to a completely different question. |
| is this mattress comfortable enough to be slept in every day? | its more of a giant pillow than a bed but yes. | Same problem. We don't know what the "yes" refers to. |
| How tall is this product? | 13” | We don't know whether the 13” is the height, width, or length; there is no additional context. |
| is the one you are selling version 2? | the one i bought from tisinc99 was. | We don't know what "was" refers to. |
| what material is its back made of? | pure cotton | We don't know whether the pure cotton refers to the back. |
| Is it compatible with ios? | It is compatible with: | The sentence is incomplete and needs more context to help you decide. |

irrelevant

“Irrelevant” means the candidate provides no useful information about the question. Imagine you are the customer who
asked the question: you should select this option only when you cannot find any useful information in
the candidate.

The following are some examples:

| question | candidate | comment |
| --- | --- | --- |
| can you play with two players? | the classic game of lovecraftian horror | |
| how big of a doll can fit in the crib? | dolls not included. | |
| how do i request a sample? | i don’t know. | |
| can it work with 220 and 120 ? | platform: { value:windows } | |
| can i get it in tan? | decided to get these black ones for our f150 and am glad i did | |
