Snohomish Guidelines

Overview

We are developing a visual search feature for commerce. This feature will need to be
able to recognize objects in a variety of poses and contexts. To achieve this, we need
a dataset with many images of the same item.

We need annotation to exclude images that do not feature the product (e.g., fabric
swatches, size diagrams, overly zoomed-in images, or images of
other color variations), and to draw bounding boxes around all instances of a
product in an image.

Product Image annotation


The Product Image annotation task presents an annotator with product metadata,
including:
• Product name (retailer provided)
• Product Category (retailer provided or model predicted)
• Description (retailer provided)
as well as several images the retailer / brand has provided for that product.

The annotator must identify which images, if any, are qualified matches (defined
below) for the product description.

The product metadata is intended to disambiguate what the product is supposed to
be, in case it is unclear from the images. For example, if the product name includes
a color and the images show another color, the images are not considered a match.

Same-product images will be grouped together so that an annotator doesn't have to
re-read product metadata for each image.

Data source

Product metadata and images will come from Catalog data provided by brands and
retailers. We will use images, as well as names, descriptions, categories (when
available), and color attributes (when available) provided by the retailers. Most of
the catalog data will be in English; however, you may see some jobs where the
metadata is in another language.

We will group images for the same product together so that annotators do not have
to read metadata for each product more than once.

Annotation Task

We will use Halo to implement our annotation flow.

A set of images will be presented that a retailer has indicated represent the same
product. The annotator must make two judgements for each photo:

• Is this a “qualified” image of this product?
• Where is this product in the image?

Both of these judgements are reflected in a single annotation: drawing a box around
only “qualified” matching images of the product (or rejecting the whole product).
What makes an image a “qualified” match is defined below. Images that do not
match, or which are not qualified matches, should not have boxes.

When images are duplicated, only one of them should be boxed (usually the first,
unless a later copy is of noticeably higher quality). This includes the same image
cropped differently, since the region within the bounding boxes will be the same, or
very similar.
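
The duplicate rule above can be sketched as a small selection function. This is only an illustration under stated assumptions: the per-image `qualities` scores and the `noticeable_margin` threshold are hypothetical stand-ins for an annotator's judgement of "noticeably higher quality", and are not part of the annotation tool.

```python
def pick_from_duplicates(qualities, noticeable_margin=0.2):
    """Return the index of the single copy to box within one duplicate group.

    `qualities` holds one hypothetical quality score per copy, in the order
    the copies are presented. The first copy is kept unless a later copy
    beats the current choice by more than `noticeable_margin` (an assumed
    threshold, not a value from the guidelines).
    """
    if not qualities:
        raise ValueError("a duplicate group needs at least one image")
    best = 0
    for i in range(1, len(qualities)):
        if qualities[i] > qualities[best] + noticeable_margin:
            best = i
    return best
```

With these assumptions, small quality differences leave the first copy selected, and only a clear improvement moves the box to a later copy.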
Images are labeled in same-product groups, so that annotators don't have to re-read
the description and context for each image. The first product image (usually the most
accurate) will be presented first for labeling, to provide a visual hint as to what the
product is.

Sometimes the color of a product is not indicated in the name, description or color
attribute. (Even when it is, the color name is often obscure; it is not necessary to
search the internet to resolve color names.) When the color can't easily be determined,
use the first image to indicate the color. If the first image is a fabric sample or color
panel indicating a green color, and other images are of non-green products, assume
that the product is a green variation of the pictured product. If no pictures match the
indicated or assumed color, then do not draw boxes around any images.

Qualified Product Image Matches

A “qualified” product image satisfies the following conditions:


• The image matches the product name and description.
o This includes visual attributes such as color, which may be specified
either in the name or description, or in a separate color attribute.
o It is not necessary to hunt for minor discrepancies, but an image of a
“red 3-drawer chest” should be red and have 3 drawers.
o Some products have multiple possible colors labeled in the “color”
attribute (e.g., Color: “brown, black” with a black chair pictured). If a
product image matches only one of a set of colors listed, then assume
the product record refers to only the matching color, and ignore other
colors listed.
• The image shows the product itself, and not the product's original packaging
(or manual / accessories) that users would discard.
o For example: if furniture is shown in the box it ships in, which users would
discard, we do not box it. Likewise, if a phone is shown in packaging that
gets discarded, we do not box it.
o An exception is products that permanently live in a package / box
that is not discarded and is essentially part of the product,
such as:
▪ Board games (that live in a box permanently)
▪ CDs (that live in their case)
▪ DVDs (that live in their case)
▪ Shampoo (the bottle is packaging, but also part
of the product)
• The image is not a kit of several products (e.g., a nail kit, or a subscription
box). We can reject jobs that are of kits that are made up of many different
products.
• The image is not a color square, fabric swatch or material sample.
• The image is not a zoom showing only a very small part of the product.
o We consider ‘not a zoom’ to be where ~40% or more of the image is
visible.
o If less than ~40% is visible, do not draw a bounding box.
• The image does not have text overlaid near the product (e.g., within where a
bounding box would go).
o This includes watermarks, promotional text (e.g., “sale”), or overlaid
measurements. These are all fine if they are far enough from the
product that they would not be captured in a bounding box.
o Text that is part of the product (e.g., an image label, or a piece of art
with text) is ok, as long as it's not a zoomed closeup of a label.
• The image is not a duplicate (including zooms and resizes) of another
qualified image (i.e. label just one image in a set of duplicates).
• The region of the image where the product is located is not identical to
(or a zoomed-in version of) the product region in another image. In other words,
if the part of an image where the product is shown is identical to that of another
image (or a zoomed-in version of it), we annotate only one of the two images.
o In the example below the two images have different overlays (the one
on the left has text while the one on the right has numbers) but the
part of the image where the product is present is identical, so
we only annotate one image, and not both.
• Enough of the product (80% of the product) is visible in the image in order to
recognize it.
o For example, if a dinner plate is hidden below other tableware to such
an extent that you're not sure if it's the same plate, do not box it.

When a product is shown more than once in an image, draw boxes around
each “qualified” instance within the photo.
• If the product is a set of two chairs, and the image has the same chair image
copied twice, then just box one of the chairs, since the second is a duplicate
(and therefore is not “qualified”).
• If the product is a single chair, but an image shows a collage of multiple
variations, and the correct variation cannot be inferred, then do not box
anything. If the variation (e.g., color) can be inferred from the description /
attributes, then box only the matching variation, if present.
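
The qualification conditions above can be read as a checklist in which any single failure disqualifies the image. The sketch below illustrates that reading; it is not part of the annotation tool. Every field name is hypothetical, the two thresholds are taken from the "~40%" and "80%" figures in the text, and treating exactly 40% as acceptable is an assumption.

```python
from dataclasses import dataclass

ZOOM_MIN_VISIBLE = 0.40     # the "~40%" figure from the zoom rule
PRODUCT_MIN_VISIBLE = 0.80  # the "80% of the product" visibility figure


@dataclass
class ImageJudgement:
    """An annotator's answers for one image; all field names are hypothetical."""
    matches_metadata: bool = True            # name, description and color agree
    only_discardable_packaging: bool = False
    is_kit: bool = False
    is_swatch_or_sample: bool = False
    zoom_visible_fraction: float = 1.0       # fraction shown, per the zoom rule
    text_overlaps_product: bool = False
    duplicates_qualified_image: bool = False
    product_visible_fraction: float = 1.0    # fraction not hidden by other objects


def disqualifying_reasons(j):
    """Return every condition the image fails; an empty list means qualified."""
    checks = [
        (not j.matches_metadata, "does not match name / description / color"),
        (j.only_discardable_packaging, "shows discardable packaging"),
        (j.is_kit, "is a kit of several products"),
        (j.is_swatch_or_sample, "is a color square, swatch or sample"),
        (j.zoom_visible_fraction < ZOOM_MIN_VISIBLE, "is zoomed in too far"),
        (j.text_overlaps_product, "has text overlaid near the product"),
        (j.duplicates_qualified_image, "duplicates another qualified image"),
        (j.product_visible_fraction < PRODUCT_MIN_VISIBLE, "product is too hidden"),
    ]
    return [reason for failed, reason in checks if failed]


def is_qualified(j):
    return not disqualifying_reasons(j)
```

For example, `disqualifying_reasons(ImageJudgement(is_kit=True))` lists only the kit condition, and an image with no failed condition is one where a box should be drawn.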

Special cases when the product name does not identify a singular product:

• When the product name is not in English or another language the rater speaks,
annotators can reject with “The product can't be inferred from the metadata
and images”.
• If the product name refers to a service (e.g., “Washing machine repair”,
“Kayak rental”), then we do not annotate the product related to that service,
as the product is not what the images are about.
o For example, if a product description is about “car repair”, no visible cars
should be boxed and the job should be rejected.
• If the product name identifies a product and that product is present in many
colors, although the product description did not indicate a color, reject with
“The product can't be inferred from the metadata and images.”
o For example, if the product description says “mug” and in the images
there is respectively a green, blue, yellow and red mug that are
identical except for their color, we do not annotate, but reject.

Rejection Reasons

There are some situations where an entire product should be rejected, and no boxes
drawn:
• No image matches the product description / color / variation.
• The product is a set of multiple items that are not identical (e.g., a table and
chairs, but not two identical chairs).
• The product can't be inferred from the metadata and images.
o For example, images of a chair that's available in multiple variations,
and the first few images contain multiple variations in each image.
• No image is “qualified”, as per the conditions defined above.

If you're confident that a product should be rejected, but none of the other rejection
reasons are appropriate, there is an “other” rejection code. We'll review these and
possibly revise our task to clarify how these should be treated.
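
As a rough illustration of how the listed rejection reasons relate to the per-image judgements, the sketch below picks a product-level outcome from a few inputs. The function name, its parameters, and the order in which the reasons are checked are all assumptions; the guidelines do not define a precedence. The “other” code is left out because it is a manual fallback.

```python
from enum import Enum
from typing import Optional, Sequence


class Rejection(Enum):
    NO_MATCH = "No image matches the product description / color / variation"
    NON_IDENTICAL_SET = "The product is a set of multiple items that are not identical"
    CANNOT_INFER = "The product can't be inferred from the metadata and images"
    NONE_QUALIFIED = "No image is qualified"


def product_rejection(
    non_identical_set: bool,
    inferable: bool,
    matches: Sequence[bool],
    qualified: Sequence[bool],
) -> Optional[Rejection]:
    """Return a rejection reason, or None when at least one image can be boxed.

    `matches` and `qualified` hold one flag per image in the product group.
    The check order below is an assumed precedence, not a documented one.
    """
    if non_identical_set:
        return Rejection.NON_IDENTICAL_SET
    if not inferable:
        return Rejection.CANNOT_INFER
    if not any(matches):
        return Rejection.NO_MATCH
    if not any(qualified):
        return Rejection.NONE_QUALIFIED
    return None
```

A group where some image matches and is qualified returns `None`, meaning the annotator proceeds to draw boxes rather than reject.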

Box Accuracy

Unlike many bounding box tasks, it is not necessary to use the zoom widget for
pixel-perfect bounding boxes. We expect boxes to be within a few pixels of where
the object starts, at the original image zoom size. It's more important to identify
which images should be boxed at all (to exclude non-qualified images including
zooms and variation mismatches) than to annotate perfect boxes.
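
One way to make "within a few pixels" concrete is to compare each edge of a drawn box against a reference box. The sketch below does that; the `Box` type, the function name, and the default tolerance of 5 pixels are assumptions chosen for illustration, since the guidelines do not give an exact number.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Pixel coordinates at the original image zoom size."""
    left: int
    top: int
    right: int
    bottom: int


def within_tolerance(drawn: Box, reference: Box, tolerance_px: int = 5) -> bool:
    """True when every edge of `drawn` is within `tolerance_px` of `reference`.

    The default of 5 pixels is an assumed reading of "a few pixels".
    """
    return all(abs(d - r) <= tolerance_px for d, r in zip(drawn, reference))
```

Under this reading, a box that is off by two or three pixels on each side passes, while a box that misses one edge by ten pixels does not.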

Here are some model examples of a fully correct bounding box, drawn tightly
around each extreme side of the respective product:

Product Name: Denim Blue Levi’s Jeans High Waisted

(Note that here the correct bounding box goes all the way to the waist, since we can
see the jeans through the shirt.)

Product Name: Modern-Style Bench

Product Name: INTAX Mini Instant Film

Product Name: Rosemary Essential Toner



Product Name: Headband


Product Name: Polka dot blue headband baby girls


Product Name: Pearl blue face mask



Product Name: Nars foundation



Product Name: White Plate



Product Name: Chanel Purse Tan

