
Stonecoal v3.3 Guidelines
Last updated 2023-06-09

I. Rater instructions
Note: In these instructions, "document" refers to any text, image, or video that you may be
evaluating.

Context: What are ‘restricted topics’?


Global regulations and internal Meta policy outline a range of topics that should be restricted for ads use. In
this context, ‘restricted’ refers to content related to sensitive topics and regulated goods that may require
specific protection such as restricted use in ads.

Objective
The labeling task is to determine if the core purpose of a document falls into one or more restricted categories
and categorize the document into the best-fit category(s) that apply.

Use all of the information provided within the document to determine the core purpose of the document and
each correct category that describes the core purpose.

Overview
• The restricted categories are organized in a hierarchy, with some sub-categories appearing as
“children” of “parent” categories.
o Category = Parent Category
o Sub-category = Child category
• The parent will not be selectable in the SRT UI, and the user will see the child categories for that parent
when expanding each parent category using the arrow. Note: some categories consist of only one
term and do not have child categories. These single-term categories will be selectable.

The user will be presented with one document at a time and asked to select the most specific, but still
inclusive, child category or multiple child categories for each document.

CONFIDENTIAL TO APPEN – DO NOT FORWARD


• Okay to multi-select across parent categories
o Multiple parent categories may apply to one document; in this case, select the most specific
child category for each parent category that relates to the document
• Only select multiple child categories of one parent in very specific cases. Some examples:
o Weight loss drinks and pills should be labeled with Wellness > Vitamins & supplements AND
Wellness > Weight loss
o Some medications for specific medical conditions should be labeled with Medical conditions >
Illness & injury AND Medical conditions > Prescription & OTC drugs
o Some services for both children and adolescents should be labeled with Age > Children AND
Age > Adolescents
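
The selection rules above can be sketched as a small check. This is only an illustrative sketch, not part of the SRT tool: the `HIERARCHY` excerpt, the `ALLOWED_SIBLING_PAIRS` set, and the function name `check_selection` are assumptions built from the examples in this section, and treating "Faith & spiritual belief" as a single-term category is also an assumption.

```python
# Hypothetical excerpt of the hierarchy (parent -> child categories).
# A parent with no children stands for a single-term, selectable category.
HIERARCHY = {
    "Wellness": ["Vitamins & supplements", "Weight loss", "Fitness & self-care"],
    "Age": ["Children", "Adolescents"],
    "Faith & spiritual belief": [],  # assumed single-term category
}

# Sibling pairs that the examples above allow to be selected together.
ALLOWED_SIBLING_PAIRS = {
    frozenset({"Vitamins & supplements", "Weight loss"}),
    frozenset({"Children", "Adolescents"}),
}


def check_selection(selection):
    """Return warnings for a list of (parent, child) pairs.

    Use child=None when selecting a single-term category.
    """
    warnings = []
    by_parent = {}
    for parent, child in selection:
        children = HIERARCHY.get(parent)
        if children is None:
            warnings.append(f"unknown category: {parent}")
        elif children and child is None:
            warnings.append(f"{parent} is a parent and is not selectable")
        elif child is not None and child not in children:
            warnings.append(f"{child} is not a child of {parent}")
        else:
            by_parent.setdefault(parent, []).append(child)
    # Several children of one parent are expected only in very specific cases.
    for parent, chosen in by_parent.items():
        if len(chosen) > 1 and frozenset(chosen) not in ALLOWED_SIBLING_PAIRS:
            warnings.append(
                f"multiple children of {parent}: confirm a specific case applies"
            )
    return warnings
```

Selecting across parents produces no warning, while two children of the same parent are flagged unless they match one of the listed pairs.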



Rating Materials Overview
SRT UI:
The document will be presented on the left side of your screen, along with any additional information about it
(URL/link, advertiser name, images or videos, caption with text, scroll function, more button).

Error Buttons:
There are four types of issues you might come across when evaluating documents: Page Load Error, Missing
Content, Wrong Language, and Sensitive.

The following table describes what each of these errors might look like, as well as what you should do if you
come across them while categorizing.

Issue: Page Load Error
Description: The document fails to load any information - your screen is blank and you do not see the SRT
interface as shown in the "How to Categorize a Document" section.
Action: Select "Page Won't Load"
Notes: An example of this would be when the SRT widget fails to load.

Issue: Missing Context
Description: The document does not provide enough context for you to decide a category. For example, a login
page or a blacked out screen.
Action: Select "Missing Context"
Notes: If you do not understand the content that is provided, perform a brief side search, if necessary, using a
search engine to gain a better understanding; however, not understanding the provided content is not a
scenario of Missing Context and the job can still be properly categorized.

Issue: Wrong Language
Description: Some or all of the document is in a language that differs from the designated language of the
queue, preventing you from understanding its core purpose and correctly assigning a category.
Action: Select "Wrong Language"
Notes: Choose "Wrong Language" when you are shown content in a language that differs from the designated
language of the queue you are categorizing in, leaving you unable to determine the core focus of the document.
Do not use if a mixed language document contains enough of the designated language to categorize accurately.
Context is key. For mixed language jobs in which there might be a single word such as 'transfer' or 'crime',
there may be no way to determine the overall theme of the job.
For jobs that have all text or audio in a language other than the targeted language except for a single
recognizable entry such as "5%" or "90%", it is likely that Wrong Language is appropriate as well unless the
rater is able to determine the "aboutness" of the job with just the single recognizable entry. This may be an
infrequent situation.

Issue: Sensitive
Description: The document contains material which is illegal and/or presenting an immediate danger or threat.
Action: Select "Sensitive"
Notes: Escalate to your manager immediately and report the Job ID if the document contains:
1. Child Exploitation and/or Child Nudity
2. Self-Injury/Suicidal
3. A Time-sensitive/Credible Threat

Rating Instructions
1. Evaluate all components of the document, including watching the full play time of videos.
2. Determine the core purpose of the document or video.
3. Do a side search if necessary: some documents may have an unclear purpose, product or service and
further context may be needed. Please do a side search in this case. Wikipedia is a good place to start,
but sometimes a broader search will be necessary.
4. Determine whether you are able to make a decision on this job
a. If yes, move forward to select the most relevant restricted category(s) - child categories
b. If not, select the most appropriate error category (see Error category definitions above in
Materials)
5. Determine if the core purpose falls into any restricted category(s)
a. Yes, categorize the restricted category(s) to the best-fit child-category/child-categories (select
all relevant categories)
b. No, select the “None Apply” option. This option should only be selected if the user has
thoroughly evaluated all images, text, and/or hashtags and found that the document does not
relate to any restricted categories.
c. IMPORTANT: the user must select at least one child category OR “None Apply” per job. If
neither None Apply nor child categories are selected, then the job will not be counted.
6. Indicate if you used the text, image/video, or text and image/video in the document to categorize
into child category or categories. If you did not select a child category and selected “None Apply”,
select N/A.

7. Instructions for handling multiple media in a single job:

A. Check the radio button below the media (Image and Video) and make sure that it is always at the
leftmost position. (See image above)
B. Tag all SCD content in the 1st media before proceeding to the next. If an SCD Attributes Category found
in the 2nd media was already tagged in the 1st media, proceed to other SCD taggings if available.
C. Repeat step B for the succeeding media and rate accordingly.

8. Select submit to move on to the next job
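
The submission rules in steps 4 to 6 can be summarized as a small pre-submit check. This is a hedged sketch only: `validate_job` and its parameters are hypothetical names, the error button labels are taken from the error table above, and treating “None Apply” as mutually exclusive with child categories is an assumption rather than something the steps state outright.

```python
ERROR_BUTTONS = {"Page Won't Load", "Missing Context", "Wrong Language", "Sensitive"}
CATEGORY_EVIDENCE = {"text", "image/video", "text and image/video"}


def validate_job(error=None, categories=(), none_apply=False, evidence=None):
    """Return a list of problems; an empty list means the job can be submitted."""
    # Step 4b: an error button replaces categorization entirely.
    if error is not None:
        return [] if error in ERROR_BUTTONS else [f"unknown error button: {error}"]

    problems = []
    # Step 5c: at least one child category OR None Apply is required.
    if not categories and not none_apply:
        problems.append("select at least one child category or None Apply")
    # Assumption: None Apply and child categories are not selected together.
    if categories and none_apply:
        problems.append("None Apply cannot be combined with child categories")
    # Step 6: evidence is N/A for None Apply, otherwise text and/or image/video.
    if none_apply and evidence != "N/A":
        problems.append("evidence must be N/A when None Apply is selected")
    if categories and evidence not in CATEGORY_EVIDENCE:
        problems.append("indicate whether text, image/video, or both were used")
    return problems
```

For example, `validate_job(categories=["Age > Children"], evidence="text")` returns an empty list, while calling `validate_job()` with nothing selected reports that the job would not be counted.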



Important Tips
The full document should be reviewed and considered when applying labels, including:
• all images in a series
• full video play time
• any text and/or captions
Determine what the core purpose of the document is and use this to apply restricted category labels, as
appropriate.

In some documents, a possible restricted category topic can be mentioned, but is not the core purpose. In
these cases, only apply labels related to the main topic of the document.

On the right side of the screen, there will be the interactive section where the task is completed and
submitted. If the content that is provided is not clear or there is confusion, perform a brief side search, if
necessary, using a search engine or Wikipedia to gain a better understanding.



Important Tips Examples
1. Please label according to the core purpose of the document or video, NOT based on if a restricted
category is simply present anywhere in the document. For example:
a. A video of a person talking about ‘best summer reads', where one book deals with LGBTQIA+
topics should not be considered related to restricted categories because the core purpose of
the video is not related to sexual orientation.
b. A Nike shoe advertisement showing an athlete with a prosthetic leg should not be
considered related to restricted categories because the prosthetic leg is not the core purpose
of the document; the Nike shoe for sale is the core purpose.
c. A webpage for a podcast episode about life in New York listing dating as one topic for
discussion among many should not be considered related to restricted categories, because
dating is not the core purpose of the document; ‘life in New York’ is the core purpose.
2. The core purpose of a document can be associated with more than one category. Select all the
relevant categories. For example:
a. Abortion topics relate to both health and political beliefs
b. Jewish culture is both faith and ethnic group-affiliated
c. Ayurveda and Traditional Chinese Medicine relate to both faith and health.
3. Please side search if you are unsure about the content of a document.
4. “None apply” is a valid label - some documents will not relate to restricted data. However, please be
thoughtful about using it. It is important to capture all examples of restricted topics!

Avoid Over-Labeling
Please ensure that the labeling does not include more labels than necessary to categorize the document.

Examples:



Example 1
DO Categorize To: Faith & spiritual belief
DO NOT Categorize To (over-labeling): Separation & divorce, Infidelity, Dating
Reasoning: The ad is for an astrologer who focuses on various relationship issues. No need to enumerate
every relationship kind.

Example 2
DO Categorize To: Fitness & self-care
DO NOT Categorize To (over-labeling): Fitness & self-care AND Preventative care & health programs
Reasoning: In this case, the ad relates to non-medical + healthy lifestyle choice, which should point only to
Fitness & self-care.



Standard side search process
1. Log in to your Facebook & Instagram accounts
2. Use Google Chrome as default browser
3. Use only Google search as search engine
4. Set the locale/region settings to United States
5. Search only on the 1st page of search results

Enhanced Side Search Instructions

Google – Side Search Setup


1. Go to www.google.com
2. Click Settings on the lower right side of the page
3. Select Search Settings



4. Select Region Settings

5. Click United States and then Save



Google – Side Searching Template (Defining unfamiliar words)

Sample JID: 462141719071817 (IG Posts)


Wedding of the year. . Don't forget to Lock Your date!! Only RM200!! Only RM200!! . For inquiry on date &
price Please contact or DM this page

#wedding #weddingideas #malayweddingphotographer #eventphotographer #eventphotography


#pakejphotomurah #pakejphotography #muslimah #pakejphotographer #pakejkahwin #videowedding #video
#videoweddingmalaysia #videokahwin #videokahwinmurah #videokahwinmalaysia #eventvideographer
#promo #promotion #malayphotographer #photographerselangor #photographerkualalumpur #tunang #nikah
#postwedding #prewedding #preweddingphoto #preweddingshoot #preweddingphotography.

Sample JID: 5352382484872666 (Reels/Stories)


● The text overlay on the image calls out Mahadev, an important Hindu god.
● This job should be labeled “Faith & Spirituality”.

Important Note for Reels/Stories: For overlay text in media, side search a maximum of 3 unfamiliar
words.

Approach 1:
Keyword + “Meaning”
e.g., Muslimah Meaning



Approach 2:
Keyword + “Definition”

Approach 3:
Keyword + “Slang Meaning”
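
The three approaches above amount to appending a fixed suffix to the unfamiliar word. A minimal sketch, assuming a hypothetical helper named `side_search_queries` (the guidelines do not define any tooling for this):

```python
# Suffixes from Approach 1, Approach 2, and Approach 3, in that order.
SEARCH_SUFFIXES = ("Meaning", "Definition", "Slang Meaning")


def side_search_queries(keyword):
    """Build the search strings to try, in order, for one unfamiliar word."""
    keyword = keyword.strip()
    return [f"{keyword} {suffix}" for suffix in SEARCH_SUFFIXES]
```

For the example keyword above, `side_search_queries("Muslimah")` yields "Muslimah Meaning", "Muslimah Definition", and "Muslimah Slang Meaning".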



Reviewing captions with mixed English or non-English words

If a document does not provide enough context for a category to be selected (such as a login page or a
blacked-out screen), select ‘Missing Content’. If you simply do not understand the content provided,
perform a brief side search using search engine results to gain a better understanding. Not
understanding the provided content is NOT a scenario of Missing Content, and the job can still be properly
categorized.

How do we handle captions with mixed English and non-English words?

1. Review the English captions and try to get enough context of the job.
2. Review non-English words in the caption (for IG posts) or in the text overlay in media (for Stories/Reels)
using the Side-search template

a. Review only the first few non-English words:

■ If IG posts caption, review the first 5 non-English words

■ If Reels/Stories’ overlay text in media, review the first 3 non-English words

b. If you can understand the context using these words, proceed with labeling

c. If you cannot establish a context, there’s no need to check the definition of the rest of the non-English
captions but you have to make a decision based on the below scenarios:

i. Apply SCD labels or NA based on the available context and English words from the overlay text in the media.

ii. Lean on labeling wrong language if the entire text is non-English (if the designated market language is
English).

Note:

1. The 5-word benchmark for captions may be reduced to 3 words if review time is affected.

2. For Reels/Stories with multiple images (such as slideshows or clips in videos) with text overlay, only apply
the Side-search template to the first frame/image.

3. Usernames can still be used as additional context. However, if a username is the only context and is mixed
with a foreign language, still lean on labeling Wrong Language.
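
The word limits in this procedure can be sketched as follows. The function name `words_to_review`, the content type keys, and the `time_constrained` flag are hypothetical; the sketch assumes the rater has already picked out the non-English words in reading order.

```python
# Benchmarks from the steps above: 5 words for IG post captions,
# 3 words for overlay text in Reels/Stories media.
WORD_LIMITS = {"ig_post_caption": 5, "reels_stories_overlay": 3}


def words_to_review(non_english_words, content_type, time_constrained=False):
    """Return the non-English words that should be side searched."""
    limit = WORD_LIMITS[content_type]
    # Note 1: the caption benchmark drops to 3 words if review time is affected.
    if content_type == "ig_post_caption" and time_constrained:
        limit = 3
    return list(non_english_words[:limit])
```

If these words do not establish a context, no further words are looked up and the rater falls back to the scenarios in step c.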



II. Restricted topics definitions
Overview: v3.1 changes
V3 additions are categorically distinct from the v1 and v2 labels. These labels are almost exclusively related to
user identifiable information or personally identifiable information and NOT related to the core purpose of ads
documents. This is reflected in the definitions, which emphasize ‘user-specific data’ not ‘content, products and
services.’

V3.1 has removed several categories. The frequently used categories Couples & relationships, Married, and
Single are removed. The less common categories Grandparents, Family name, Screen name, and Title are also
removed.

Content types
For each of the restricted categories, examples are provided to show whether a document falls within the
category based on the types of content it represents. The content could feature the following types of examples:
• Products and services related to the category
• Businesses, organizations, associations, & entertainment (including establishments, places)
• Education & schooling associated with the category
• Attributes related to the category
• Culture, topics & causes related to the category
• Job titles or professions within organizations affiliated with the category
• Public figures associated with the category

Definitions
1. Age: Identifies content explicitly restricted to a specific age-group.
• Children: content explicitly related to (serving or purchased for) babies or children under 13. For
example:
o Image, video, or posts (including any captions) in which a child or children are mostly the core
purpose, or primary focus of the content.
▪ Note: Where a post is focused on parental status, but also shows an image of a child,
please co-label with Children and Parents.
o Services explicitly for babies or children (e.g., youth sports leagues, elementary schools,
pediatric dentists)
o Products explicitly for babies or children (e.g., children’s literature, children and baby clothing,
baby equipment, children’s toys), NOT toys enjoyed or collected by a range of age groups
o NOTE: this label includes products and services for babies, infants, or toddlers
• Adolescent: content explicitly related to (serving or purchased for) adolescents aged 13-17 (inclusive).
For example:
o Image, video, or posts (including any captions) in which teens or what appear to be older
children are mostly the core purpose, or primary focus of the content. This must also be
supported by captions and hashtags (e.g., #teens, #U16s, etc.)
o Services explicitly for adolescents (e.g., middle and high schools, high school football, teen
counseling services)
o Products explicitly for adolescents (e.g., teen literature or magazines)
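
The age boundaries above (children under 13, adolescents aged 13-17 inclusive) can be illustrated for content that explicitly states a target age range. This is a sketch under assumptions: `age_labels` is a hypothetical helper, and it applies only when the document itself states the ages served.

```python
def age_labels(min_age, max_age):
    """Return the Age sub-categories covered by an explicit target age range."""
    if min_age > max_age:
        raise ValueError("min_age must not exceed max_age")
    labels = []
    # Children: babies or children under 13.
    if min_age < 13:
        labels.append("Children")
    # Adolescent: ages 13-17 inclusive; the range only needs to overlap it.
    if min_age <= 17 and max_age >= 13:
        labels.append("Adolescent")
    return labels
```

A service stated to be for ages 6 to 16 returns both labels, matching the earlier example of services for both children and adolescents, while a range of 18 and over returns no Age label.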



2. Appearance: Identifies content related to an individual’s appearance or genetically-
inherited physical traits.
• Body size: user-specific data indicating an individual’s specific weight or body size, including user-
provided adjectives (e.g., curvy, fat, skinny, etc.)
o Products related to an individual’s specific body size (e.g., plus-size clothing store, big & tall
men’s clothing store)
o Services related to a specific body size (e.g., plus-size dating services)

NOTE: ‘Body size’ does NOT apply to:


o Content related to weight loss (use ‘Weight loss’ label)
o Products specifying available clothing sizes (e.g., ‘Toddler sizes 1T-4T’ or ‘Women’s sizes 2-16’)
o Content featuring a model or individuals with body types that might be perceived as outside of
the norm for advertising, without specific mention of products or services for individuals of a
specified weight or body size

• Skin color: user-specific data, content, products, or services indicating an individual’s natural skin color,
including user-provided adjectives (e.g., skin lighteners, “fair skin,” “melanin-rich skin,” “olive skin” and
products or services directed at a specific skin color)
• Fingerprints: user-specific data indicating an individual's unique fingerprint pattern data
• Hair color: user-specific data indicating an individual's unique natural hair color
• Height: user-specific data indicating an individual's unique height, including user-provided adjectives
(e.g., tall, short, petite, etc.)
• Iris scan: user-specific data gathered through mathematical pattern-recognition techniques on video
images of one or both irises to reveal individually unique iris patterns

3. Faith & spiritual belief: identifies content closely related to faith, religion, and religious
belief, including the practiced absence of such belief and deeply held philosophical beliefs.
For example:
• Content related to membership or affiliation with a specific faith-based or philosophical group
o user-declared faith or religion
o businesses, organizations, or entertainment related to a religious affiliation (e.g., church,
mosque, temple, Jewish Community Center, Christian rock)
o products or services related to a religious affiliation (e.g., Scripture finder websites, prayer
journals, Christian booksellers)
o culture, topics, or causes related to a religious affiliation (e.g., Christian homeschooling, Islamic
fashion, film or literature with an explicitly religious purpose)
o education related to a religious affiliation (e.g., Yale Divinity School, Yeshiva University)
o job titles related to a religious affiliation (e.g., pastor, minister, guru, rabbi)
o public figures related to a religious affiliation (e.g., Jesus, Joel Osteen, Allah, Sai Baba)
o attributes related to a religious affiliation (e.g., atheism, Christianity, Sikhism)
o holidays based on religious belief, even if secularized (e.g., Diwali, Easter, Ramadan, Hanukkah)
▪ DO NOT TAG content related to non-religious holidays (e.g., Halloween, Valentine's Day)
▪ Does not include holidays referenced for sale or promotion purposes (e.g., Easter sale,
Christmas discounts, etc.)
• Example: "Shop our Christmas deals!"



• Content, products, and services related to philosophical, spiritual, occult, or metaphysical beliefs
o user-declared spiritual beliefs
o businesses, organizations, or entertainment related to spiritual beliefs (e.g., meditation centers,
healing circles, tarot readings, love or money spells)
o products related to spiritual beliefs (e.g., healing crystals, tarot decks, occult products, etc.)
o health or healing practices closely related to philosophical or religious beliefs (e.g., Ayurveda,
Traditional Chinese Medicine, Reiki)
NOTE:
o Does not include fictional film or literary content about religion (e.g., Noah, The Prince of
Egypt, The DaVinci Code)

4. Family relationships: Identifies content related to an individual’s relationship status or
status as a parent/caregiver.
• Caregiver: content related to an individual’s status as a caregiver: someone engaged in the unpaid,
non-career work of providing day-to-day care for another adult, regardless of family relationship.
• Marital & relationship status: content related to an individual’s relationship status.
o Civil union: user-declared status as a partner in a civil union, or legally recognized arrangement
similar to marriage indicative of same-sex partnership
o Domestic partnership: user-declared status as a member of a domestic partnership, or legally
recognized partnership arrangement giving limited legal rights to partners
o Separation & divorce: content related to a change in marital status through divorce or
separation. For example:
▪ Products, services, or organizations related to divorce or separation (e.g., divorce/family
law firms)
▪ user-declared status as separated or divorced
o Infidelity: content related to a personal history of infidelity. For example:
▪ Infidelity therapy/counseling, The Ashley Madison Agency
▪ user-declared content about having committed infidelity
o Widow(er): content related to having experienced the death of a spouse or partner. For
example:
▪ user-declared status as a widow or widower
▪ organizations or services related to widow(er) status (e.g., National Widower’s
Organization, Widows Helping Widows)
▪ products related to widow(er) status (e.g., support/self-help literature related to the
loss of a spouse)
• Parental status: content explicitly related to parenthood, adoptive parenting and/or foster parenting.
o Adoption or foster care: user-declared status as adoptive or foster parents, in the process of
adoption or becoming foster parents, having given up a child to adoptive or foster care
▪ services related to adoption (e.g., Center for Adoption Support and Education)
▪ keywords related to user-specific adoption or foster care situations (e.g., ‘Foster
parent’)
o Parents: content explicitly related to an individual’s status as a parent or guardian of a child. For
example:
▪ user-declared status as parents or caregivers of a child/children, supported by keywords
related to parental status (As examples, raters might use words like "mom," "mama,"
"mums," "step parent," "my baby," "my little girl," "my son," "my daughter," etc. as
supporting keywords. There will certainly be other examples.)
▪ entertainment or media related to parental status (e.g., Parents magazine, Your Modern
Family blog, The Modern Dads Podcast)
▪ organizations related to parental status (e.g., Homeschool Association of California)
▪ Excludes: content for which parenthood is only implied (e.g., baby products, children’s
clothing, diaper delivery services, etc.)

5. Financial identifiers: Identifies user-specific data indicating an individual's financial tools
and services
• Bank account data: user-specific data indicating an individual's unique bank account identifiers,
including account number, routing number, IBAN number, or credentials allowing access to a user
account.
• Credit card data: user-specific data indicating an individual's unique credit card identifiers, including
card or account number, CVC number, or credentials allowing access to a user account.

6. Gender: Identifies content closely related to gender identity, transgender or nonbinary
identity, or intersex status
• Intersex: content related to personal status as intersex: an individual with variable sex characteristics
that do not align with either typically male or female. For example:
o user-declared status as intersex
o support services or organizations (e.g., interACT, Intersex Society of North America)
• Nonbinary: content related to personal status as nonbinary and/or gender identities
outside the male-female gender binary. For example:
o user-declared status as nonbinary
o support services or organizations (e.g., Autistic Women & Nonbinary Network, Non-binary
Union L.A. [NBULA])
• Transgender: content related to personal status as trans or transgender: gender identity or expression
that differs from the sex assigned at birth. For example:
o user-declared status as trans or transgender
o topics or causes related to transgender identity (e.g., trans visibility, transphobia)
o attributes related to transgender identity (e.g., “trans,” “transitioning,” “mtf”)
• Men: content explicitly related to men and the male sex. For example:
o services exclusively for men (e.g., men’s spiritual groups, men’s support groups, etc.)
o services related explicitly to male sex characteristics (e.g., beard care)
o products related explicitly to male sex characteristics (e.g., athletic cups)
o DOES NOT INCLUDE
▪ clothing or accessories of any kind, even that may be traditionally worn by those with a
male gender identity (e.g., tuxedos, cufflinks, watches, wallets, etc.)
▪ content that mentions ‘men’ or ‘guys’ (e.g., shower gel for ‘guys’ or ‘men’s deodorant’).
Label reserved for products & services related to specifically male body characteristics.
• Women: Identifies content explicitly related to women and the female sex. For example:
o services exclusively for women (e.g., women’s prayer groups, women’s career networking, etc.)
o services related explicitly to female sex characteristics (e.g., women’s health, gynecology,
breast augmentation)



o products related explicitly to female sex characteristics (e.g., feminine hygiene or menstruation
products)
o DOES NOT INCLUDE
▪ clothing or accessories of any kind, even that may be traditionally worn by those with a
female gender identity (e.g., handbags, watches, jewelry, dresses, bras, etc.)
▪ content that mentions ‘women’ or ‘ladies’ (e.g., shower gel for ‘ladies’ or ‘women’s
razor’). Label reserved for products & services related to specifically female body
characteristics.

7. Health: Identifies content relating to physical and mental health conditions, treatments,
organizations
• Mental health
o Self-harm & suicide prevention: content related to suicide or other types of self-harm,
including user experiences, treatment programs, specialists, or awareness campaigns.
• General health care
o Preventative care & health programs: content, products and services related to medical
health generally, but not tied to a specific medical condition or non-medical healthy lifestyle
choices. Includes:
▪ general medical topics and practitioners (e.g., pharmacies, family practitioners,
pediatricians, preventative or general dentistry, primary care, lung health)
▪ health insurance and health administration (Medicare, appointment services)
▪ cosmetic surgery (breast augmentation, face lifts, medical esthetic services)
▪ Excludes: Medical education (e.g., nursing programs, certification courses), job postings
for health care employment, medical supplies and educational materials (e.g., CPR
mannequins, dental supplies for professional use)
o Sexual & reproductive health care: content related to general sexual & reproductive health
care excluding medical conditions (e.g., gynecology, prostate examinations, birth control, etc.)
o Vaccines & vaccine status: content related to vaccines and vaccination, including vaccination
services and clinics, vaccination awareness campaigns, user-declared vaccination opinions or
status, etc.
▪ Note: User vaccination opinions should also be labeled with Politics > Political opinion
• Health data: Identifies user-specific data indicating unique personal health data
o Blood type: user-specific data indicating unique personal classification of blood type
o Genetic data: user-specific data indicating unique personal genetic data, including genetic
mapping, declared genetic disorders, DNA sequencing, chromosomal data, or genetic markers.
Note: may be co-labeled with Ancestry in the case of user-specific genetic/DNA data related to
ancestry
o Personal health information: user-specific personal health measurements, biosignals (e.g., data
from electrocardiogram or electroencephalogram), biological cycle data (e.g., sleep data,
ovulation data), reading & level data, including health & fitness data from programs, devices,
software, and wearables (e.g., heart rate/bpm; calories burned, oxygen levels, blood glucose
levels, etc.)
• Medical condition
o Accessibility settings: user-specific data indicating unique personal control settings intended to
increase access to technology, as related to physical or neurological ability



o Addiction: content related to current or past addiction to a substance (e.g., alcohol, nicotine,
narcotics) or activity (e.g., gambling). For example:
▪ user-declared status as an addict, recovering from addiction, or involved in supporting
someone suffering from addiction
▪ products and services related to addiction treatment and support for former addicts
(e.g., Al-Anon, smoking cessation programs)
▪ topics and causes related to addiction (e.g., gambling addiction, addiction as a disease)
▪ Excludes: playful or colloquial uses of 'addict' or 'addiction' (e.g., chocolate addict,
#crochetaddict, "I'm addicted to my morning smoothies!")
o Body dysmorphia & eating disorder: content related to current or past body image
vulnerabilities or disordered eating. For example:
▪ user-declared status as suffering or having suffered from body dysmorphia or eating
disorder
▪ services relating to eating disorder treatment (e.g., National Eating Disorders
Association; Eating Disorders Helpline)
▪ content promoting eating disorder (e.g., pro-ana/pro-mia; ‘thinspo’)
o Disability: Identifies content related explicitly to disability or specific disabilities, either mental
or physical. For example:
▪ user-declared status as having a disability
▪ attributes related to specific disabilities (e.g., “disabled,” “people with disabilities,”
“learning disabilities”)
▪ products and services related to specific disabilities (e.g., Paralympics)
▪ topics and causes related to disabilities (e.g., Disability Pride Month)
Note: if the words “disabled” or “disability” are not used, there should be a clear link
between the product/service and disability status (e.g., Paralympics)
o Illness & injury: content related to specific health conditions (both physical and mental health
conditions), injuries, treatments for health conditions other than pregnancy/childbirth,
addiction, disability, or eating disorder. For example:
▪ user-declared illness or injury
▪ names of specific health conditions (e.g., “cancer,” “asthma,” “HIV/AIDS,” “ADHD,”
“diabetes”)
▪ services and products tied to specific health conditions (e.g., chemotherapy, Beltone
hearing aids, Certified diabetes educator, glasses/lenses, wheelchairs)
• Note: words such as “treatment,” “therapy,” “care,” or “medicine/medical”
often require this label
▪ information on injury, disease, or disease risk, including medical history, medical
opinions, diagnosis and clinical treatment
▪ medical services related to a medical condition (e.g., cardiothoracic surgeon, DaVita
dialysis centers), excluding preventative health or wellness services
▪ advocacy groups and support groups for survivors (e.g., American Cancer Society, Find a
Cure for Cystic Fibrosis, Alzheimer’s Association of America)
o Prescription or over-the-counter (OTC) medication: medications that are intended for the treatment of
specific medical conditions and that may or may not require a prescription. For example:
▪ user-declared use of prescription or OTC medication
▪ products and services related to medications (e.g., Nyquil, acetaminophen, Ambien,
insulin needles, pill cases, herbal remedies)
▪ topics related to medications (e.g., PrEP, medical cannabis)
o Pregnancy & childbirth: Identifies content, products, and services related to pregnancy, birth,
and postpartum experiences, including abortion.
▪ Abortion: Identifies content, products, or services related to the termination of a
pregnancy. For example:
• user-declared experience with terminating a pregnancy or discussion of abortion
• abortion service providers or resources
• prescription or OTC medications related to terminating a pregnancy
▪ Pregnancy, birth & postpartum: Identifies content, products, and services related to
pregnancy and childbirth. Includes:
• user-declared status as pregnant or postpartum
• medical services related to pregnancy and childbirth (e.g., fertility therapy,
obstetrics, lactation specialists)
• products related to pregnancy and childbirth (e.g., pregnancy tests, breast milk
pumps, belly balm, nursing pillows, etc.)
• services related to pregnancy and childbirth (e.g., birthing classes, doulas,
midwifery, etc.)
• topics related to pregnancy and childbirth (e.g., natural childbirth, home birth,
breast feeding, etc.)
Note: When applying this label, do not co-label with gender, medical condition, or
age.
• Wellness
o Fitness & self-care: Identifies content related to general, non-medical health maintenance and
healthy lifestyle. For example:
▪ products related to healthy lifestyle (e.g., fitness trackers, hydration trackers, free
weights)
▪ services related to healthy lifestyle (e.g., gym memberships, yoga classes, hot stone
massage, reiki)
▪ Does NOT include bath and beauty treatments (e.g., shower gels, hand creams, facial
products or treatments, etc.)
o Personal fitness information: user-declared data indicating unique personal calculated or
achievement-related fitness data (e.g., 'ran a 5k,' 'bench press personal best,' etc.)
▪ EXCLUDES raw data or measurements (e.g., heart rate/bpm, oxygen levels, blood
glucose levels)
o Vitamins & supplements: Identifies pills, capsules, powders, or liquids intended to supplement
diet and provide nutrients, not for treating a specific medical condition. For example:
▪ user-declared use of vitamins or supplements
▪ products and services related to vitamins/supplements such as Airborne, creatine
products, multivitamins, collagen protein powders, herbal supplements to ‘support’
behaviors or body parts (e.g., ‘Turmeric joint support’ or ‘Focus’ supplements)
o Weight loss: Identifies content intended to promote weight loss. For example:
▪ user-declared content about losing weight, discussion of prior weight loss, or weight loss
progress
▪ topics related to weight loss (e.g., keto diet, calorie counting, intermittent fasting)
▪ products and services related to weight loss (e.g., liposuction, macro counting/tracking
apps, Noom, meal-replacement shakes and bars)
▪ words and phrases such as ‘slimming,’ ‘fat loss,’ ‘burn fat,’ ‘melt away fat,’ and ‘stay trim’ are
clues that an ad relates to weight loss.
▪ Does NOT include fitness equipment (e.g., fitness trackers, free weights, yoga mats),
which is not necessarily weight loss-related; use the ‘Fitness & self-care’ label.
8. Identity: Identifies user-specific data related to individual identity.
• Email: user-specific data indicating a unique personal email address; includes both private email
addresses and email content.
• Personal government identification: user-specific data indicating a unique personal government-issued
identification number, including passport, driver's license, birth certificate, social security,
visa/immigration, or other government-issued identifier.
• Phone number: user-specific data indicating unique personal number assigned to a telephone line for a
specific phone or set of phones.
9. Cultural background: Identifies content related to inherited identity traits, such as
ethnicity or race, caste, tribe, etc.
• Ancestry: content related to an individual’s ancestors or family tree (e.g., ancestry websites, 23andMe
genetic mapping services, user-declared use of an ancestry website or ancestry mapping results, etc.)
Note: may be co-labeled with Genetic data when related to user-specific genetic/DNA information about
ancestry
• Ethnicity: content related to membership in an ethnic group(s) or ethnicity, caste, tribe, or other
inherited group membership *not* a national identity. For example:
o User-declared ethnicity or race, including tribe or caste affiliation, generally signaled by
attributes or keywords related to ethnic origin (e.g., “First Nations,” “Uighur,” “Filipino/a,”
“Roma,” “Kurdish,” “Latino,” “African American,” “Alaska Natives,” “Shawnee Tribe of Indians
of Oklahoma,” “upper caste,” “low-caste”)
• Attributes related to national identity should not receive this label (e.g., American,
Canadian)
o educational institutions explicitly related to ethnicity (e.g., Historically Black Colleges and
Universities such as Howard University, Native American Tribal Colleges, and tribal land-grant
colleges such as Sitting Bull College)
o organizations and entertainment related to ethnicity (e.g., NAACP, BET, South Asian Medical
Students Association, Bureau of Indian Affairs)
o topics and causes related to ethnic origin (e.g., African-American culture, Punjabi cinema,
Tagalog, Chicano studies, Dalit studies, Black Lives Matter, Scheduled Tribes [India], Scheduled
Castes)
o products and services that cater directly to or self-identify for race or ethnicity (e.g., Black Girl
Sunscreen, black hair products)
Important notes on Ethnicity:
Ethnicity is generally an inherited status: ethnic groups often continue to speak related languages and
may share a similar gene pool. Caste is related to an inherited social identity based on a caste system
(e.g., “upper caste,” “low-caste,” Dalit, Scheduled Castes). Tribe is related to an inherited tribal
affiliation or community attachment (e.g., “Alaska Natives,” “Shawnee Tribe of Indians of Oklahoma,”
Scheduled Tribes [India]).

Topics related to ethnic, ethnolinguistic, or ethnoreligious groups should be labeled with this category.

Topics related to non-ethnic cultures (such as national cultures) should not. Topics related to racial
identity should be labeled with Ethnicity.

NOT examples of ethnicity:


• Documents (ads, webpages, images or videos) featuring images of individuals of a specific race or
ethnicity, and for which the core purpose is NOT related to race or ethnicity
• Organizations (e.g., churches, nightclubs) that may seem to serve a specific ethnicity based only on
ad/page images
• Advocacy group for civil liberties issues, including issues of racial equality, but not focused exclusively
on issues affecting a specific racial or ethnic group (e.g., American Civil Liberties Union)
• Public figures like political candidates or religious figures who identify as a specific race or ethnicity,
but who are not otherwise related to content related to membership in a racial or ethnic group.
10. Language: Identifies content related to an individual’s native or preferred language.
• Non-native speakers: content that relates to a native speaker of a minority or non-official language.
For example:
o user-declared status as a non-native speaker
o services linked to non-native speaker status (e.g., “English as a second language” classes in an
English-speaking country)
o attributes linked to non-native speaker status (e.g., native Mixtec speakers in Spanish-dominant
Mexico)
NOTE: To determine whether or not a document is about non-native speakers, it may be necessary to
do a side search. An English proficiency test in English-dominant New Zealand would be labeled,
whereas an English proficiency test in Thailand would not.
11. Location: Identifies user-specific data indicating an individual's location.
• Address: user-specific data indicating an individual's location in terms of a building, apartment, or
other structure or a plot of land; contains building or plot number, street name, postal code, etc.
• GPS location: user-specific data indicating an individual's location in the form of numeric latitude and
longitude coordinates.
12. Military: Identifies content related to the military, an individual’s affiliation with a
branch of the military, or veteran status.
• Military affiliation: Identifies content related to an affiliation with a government’s military or branch of
the military other than veteran status, such as current active military or military families. For example:
o user-declared status as affiliated with a branch of the military or part of a military family
o education, job titles, or job postings related to military affiliation (e.g., military
academies/colleges such as West Point Academy or Air University Islamabad, military
intelligence analyst)
o organizations related to military affiliation (e.g., National Military Family Association)
• Military topics: Identifies content about military topics or hobbies, but not tied to an individual’s
veteran status or active military affiliation. For example:
o topics or culture related to military topics (e.g., Anzac Day, military vehicles, historic military
weapons)
o non-fiction entertainment related to military topics (e.g., American Heroes Channel)
NOTE: Does not include fictional or video game content depicting war or the military (e.g., Call of
Duty, Full Metal Jacket)
• Military supplies & survivalism: Identifies content, products, or services closely related to the military,
but unaffiliated with a branch of a government’s military. For example:
o services related to military supplies & survivalism (e.g., military-style training groups)
o products related to military supplies & survivalism (e.g., military surplus gear like boots,
rucksacks, poncho blankets, tactical subscription boxes)
Note: any military-style weapons should also be tagged with ‘Weapons’
• Veteran status: Identifies content related to an individual’s status as a veteran/former military
member. For example:
o user-declared status as a veteran
o organizations related to veteran status (e.g., veteran advocacy organizations like Hirepurpose,
government agencies serving veterans such as the US Department of Veterans Affairs)
13. Nationality: Identifies content, services, or products related to an individual’s national
identity or citizenship status, unrelated to race or ethnicity.
• Citizenship & immigration status: Identifies content related to an individual’s immigration or
citizenship status, regardless of location. For example:
o user-declared citizenship or immigration status
o topics or causes related to immigration status (e.g., DACA or DREAM Act)
o products or services related to immigration status (e.g., permanent residence or work visa
information/assistance)
o services related to citizenship status (e.g., preparation for national civil service examinations
that only permit citizen applicants)
• National origin: Identifies content related to original citizenship of a specific nation or membership of a
national origin community, unrelated to race or ethnicity. For example:
o user-declared national origin
o products or services related to national origin (e.g., services directed at specific expatriate
communities, such as “American expats in Dubai”)
o culture or topics related to national origin (e.g., regional festivals like the Guelaguetza in
Mexico).
o organizations related to national origin (e.g., Daughters of the American Revolution (DAR),
Association of Physicians of Indian Origin)
NOTE: The mention of a specific country (e.g., ‘Made in America’ or ‘German’ beer) does not
necessarily mean that the content should be categorized with a Nationality label.
• Refugee status: Identifies content related to displacement from a home region or country due to
natural disaster, war, or persecution. For example:
o user-declared status as a refugee
o attributes related to refugee status (e.g., “Syrian refugees,” “California fire refugee”)
o organizations related to refugee status (e.g., refugee advocacy or resettlement groups)

14. Politics: content related to political affiliation or belief, voting activity and/or related to
political issues or government services
• Government services: content related to non-political government agencies and services. For example:
o agencies such as the Department of Motor Vehicles or local Parks & Recreation

Note: Does NOT include broad political locations such as municipalities, counties, cities, states, or
countries
• Political affiliation: content related to specific political parties, politicians, political philosophies or
other content affiliated with a political party or political movement. For example:
o User-declared political affiliation, including intent to vote for a particular candidate
o job titles related to political affiliation, even if unstated (e.g., speech writer, congressional
staffer/aide, lobbyist)
o topics related to political affiliation (e.g., Liberal, Conservative news media)
o public figures or job titles related to political affiliation (e.g., Donald Trump, Ilhan Omar, RNC
staffer)
o organizations related to political affiliation (e.g., Democratic Socialists of America, Green Party,
Labour Party, BJP)

Side research note: News sources with an explicit political leaning, state-owned news agencies, and
any news organizations affiliated with state propaganda should receive this category. When news
sources are closely tied to political affiliation, Wikipedia searches will often note the “political
alignment” of a news source in structured data on the right-hand side or in a content header. Any time
a political alignment is clearly stated, the topic should receive this category.

• Political candidate: content promoting a specific political candidate for public office
o Ads could be for local political positions (e.g., city council/councilman, mayor, alderman) or
state/national positions (e.g., representatives, assemblymen, senators, governors, presidents,
etc.)
o Where a document shows a politician as a part of a larger political movement, unrelated to a
specific election, Political affiliation is the correct categorization.
• Political issues: Social or political issues subject to political debate or lobbying, but without statement
of a specific stance or explicit political affiliation. Any topic related to a political or social issue that
might be subject of lobbying, a key issue in an election, or the subject of community organizing should
receive this category, if not tied to a specific politician or political party (which would indicate political
affiliation) or position (which would indicate political opinion). For example:
o topics such as “gun control,” “Israeli-Palestinian conflict”, “Environmentalism,” “criminal-justice
reform,” or “animal rights”
o documentary film or non-fiction books related to a political issue without statement of a
specific stance
• Political opinion: Political opinions or beliefs not related to specific political parties or membership.
References to “activism” and “political leaning” might indicate a topic is related to political opinion. In
general, “pro-xx” or “anti-xx” are strong indicators of political opinion. For example:
o user-declared opinions on a political topic (e.g., reproductive rights, gun control, immigration,
etc.)
o topics such as “pro-gun control,” “anti-regulation,” “Support Israel,” or “pro-Second
Amendment rights”
o documentary film or non-fiction books with a clear political opinion (e.g., No Safe
Spaces, Citizenfour)
o DOES NOT include firearm ownership, training, or hunting that is not explicitly about a political
opinion related to gun control or gun rights
• Voting activity: Content related to voting, voting choices, or beliefs about voting rights. For example:
o user-declared status as having voted or promoting voting (NOTE: may be co-labeled with
political affiliation or opinion)
o topics or attributes such as “get out the vote” or “I voted!”
NOTE: This category does not include fictional film or literary content about politics (e.g., All the President’s
Men, Vice)

15. Regulated products & services: Identifies content related to legally restricted or
culturally restricted products/services and/or products/services linked to damaging health
consequences.
• Alcoholic beverages: Identifies content related to beverages containing alcohol. For example:
o Products and services related to alcoholic beverages (e.g., Barefoot Pinot Grigio, Hendrick’s Gin,
wine clubs)
o Non-alcoholic beverages intended to mimic alcoholic beverages (non-alcoholic spirits, non-
alcoholic wine and beer, cocktail mixes, ‘virgin’ cocktails)
o Topics related to alcoholic beverages (e.g., oenology, distilling, designated driver)
o Businesses or entertainment related to alcoholic beverages (e.g., bars, wineries, brewpubs)
Excludes: incidental images of alcohol, such as in the background of an image, that are not part of
the core purpose
• High sugar, fat, and salt foods: Identifies advertised products (i.e., with commercial intent, not user
posts) related to processed foods linked to detrimental health outcomes, generally characterized by
high sugar, fat, and/or salt content.

These specifically include:


o Candy (e.g., Starburst, Haribo candies, Lemonheads)
o Chocolate candy (e.g., Kit-Kat, Cadbury’s Dairy Milk, Twix).
o Soft drinks of any kind, including soda (Mountain Dew, Fanta, etc.), sports drinks (e.g.,
PowerAde, Gatorade), flavored or enhanced waters (e.g., Vitamin Water, Hint), energy drinks
(e.g., Monster Energy), bottled ready-to-drink teas and coffees (e.g., bottled Starbucks
Frappuccino, etc.), and boba tea drinks.
Note: Excludes instant coffee, freshly prepared espresso or coffee drinks, and hot chocolate mixes.

o Ice cream, including milkshakes, frozen coffee drinks (e.g., Frappuccino, frappé), gelato, frozen
yogurt, and popsicles (e.g., Klondike bars, Ben & Jerry’s, McFlurry)
o Savory snacks, such as chips/crisps, crackers, or rice snacks (e.g., Pringles, Ritz crackers, Hot
Cheetos).
o Sweet snacks, such as prepared cookies, prepared cakes, pastries, sweetened breakfast cereals
and toaster pastries (e.g., Oreo, croissants, Froot Loops, Pop Tarts, bakery treats, etc.). Excludes
mixes and ingredients.
o Instant & fast foods, such as instant/powdered soups and noodles (e.g., Cup-o-Soup, instant
ramen) and pre-prepared (i.e., not homemade, ingredients, or recipes) fast food pizza,
hamburgers/hot dogs, nuggets, and/or fries.

Does not include:


o home cooking, recipes, and ingredients
o jobs in the food service industry
o clothing or accessories with fast food or soda logos

• Gambling & simulated gambling: any form of in-person or virtual gaming, betting for money (e.g.,
online slots, bookmaking and sports betting, casinos), or products linked to gambling (e.g., poker chips)
• Controlled substances: Identifies content related to prohibited or controlled substances, including
marijuana, opioids, etc.
• Nightlife: age-restricted night clubs, bars, cabarets, etc. (e.g., Karaoke clubs, “night club,” jazz or blues
clubs). Content should be explicitly about venues targeting adults.
o Excludes restaurants also offering live music.
• Tobacco & smoking: Identifies content related to tobacco products and smoking paraphernalia,
including cigars, cigarettes, chewing tobacco, pipes, and vaping products.
• Weapons: any type of real or toy weapon (e.g., swords, hunting knives, firearms and accessories -
vests, holsters, etc., fireworks, realistic toy firearms - including paintball guns and accessories, realistic
cosplay weapons), pointed or blade-based sporting goods (darts, archery arrows, fencing épées)
Excludes: unrealistic weapon-like toys (e.g., Nerf guns, unrealistic water guns like Super
Soakers, foam swords, etc.), kitchenware (e.g., paring knife, etc.), digital arts and video games

16. Sex life: Identifies content related to expressions of sexual practices, sexual activity and
dating.
• Adult products & services:
o user-declared or provided content related to sexual activity, nudity, etc.
o products and services related to sexual practices (e.g., sex work, sex toys, adult film or
magazines)
o culture or topics related to sexuality (e.g., Kama Sutra, human sexuality, sexuality studies,
gender studies, polyamory)
o public figures related to sexual expression or sex life (e.g., adult film actors)
• Dating: content related to dating. For example:
o dating services or apps
o dating or relationship advice
o Excludes reality or game shows about dating (e.g., Love Island, Married at First Sight, The
Bachelor, etc.)
o Excludes user declarations of partnership (e.g., "I went on a date last night!" or "Took my gf on
a romantic picnic")
• Sexual orientation: Identifies content related to patterns of sexual and/or romantic attraction. For
example:
o user-declared sexual orientation
o organizations or associations related to sexual orientation (e.g., The Trevor Project, GLAAD,
American Institute of Bisexuality)
o attributes related to sexual orientation (e.g., “gay,” “bisexual,” “asexual,” “heterosexual”)
o topics and causes related to sexual orientation (e.g., gay pride, marriage equality)
o public figures related to LGBTQIA+ activism (e.g., Harvey Milk), but *not* public figures who
identify as LGBTQIA+ without taking an explicitly activist role.
o documentary or non-fiction materials related to sexual orientation, but NOT fictional film or
literature featuring characters or plot related to sexual orientation.
• Sexual partners: user-declared information related to sexual partners, including identity, gender, etc.
17. Socioeconomic Status: Identifies content related to socioeconomic status, especially as it
relates to vulnerability based on factors related to income/assets, education level, and
employment.
• Education level
o Adult basic education & literacy: Identifies content related to the development of basic skills
and literacy in adult learners who may have experienced a disruption in primary or secondary
education.
▪ user-declared content about adult basic education or adult literacy
▪ high school dropout credit recovery services
▪ organizations and associations related to adult basic education or adult literacy (e.g.,
California Adult Education Program, literacy centers)
▪ products and services related to adult basic education or adult literacy (e.g., General
Educational Development [GED] preparation services)
▪ Does NOT include ESL courses or other enrichment/recreation courses for adults
▪ Does not refer to university, community college, or vocational education
o Post-secondary education: Content related to post-secondary education level (e.g., university
admissions, professional development courses, vocational colleges, college fraternities or
sororities, law schools)
▪ Includes college entrance examinations (e.g., Vestibular, SAT/ACT, etc.), even if
marketed to students who have not yet started college.
o Primary & secondary education: Content explicitly about pre-school/kindergarten through
high school education, specifically promotion or recognition of specific schools, tutoring
services, or educational methods.
o Excludes: fictional school content, unclear content possibly related to education (e.g., a
child writing at a table who may or may not be doing homework)
Note: requires a co-label with Age (Children, Adolescent, or both) and possibly Parents, if
parenthood is explicitly mentioned.
• Employment
o Labor union: Identifies content closely related to labor union or trade union membership. For
example:
▪ user-declared status as a member of a labor union
▪ businesses and associations related to union affiliation (e.g., American Federation of
Teachers, Alliance Police nationale, British Medical Association, Free Workers’ Union of
Germany)
▪ public figures related to union affiliation (e.g., César Chávez, Randi Weingarten, names
of union leaders)
▪ topics related to unions (e.g., workers’ rights, collective bargaining, unionization, union
dues, Right to Work)
▪ products related to union membership (e.g., union-branded merchandise)
o Note: label should not be applied to any recruitment or employment-related ads
• Financial information
o Low-income: Identifies content related to low-income or low-asset status (including debt)
and/or need for support from government or charitable sources. For example:
▪ user-declared status as low-income
▪ topics related to low-income status: short-sales, home foreclosure, food stamps
▪ products and services related to low-income status (e.g., debt-consolidation services,
payday loan services, Medicaid)
▪ organizations and associations related to low-income status (e.g., Head Start programs,
student loans)
o Credit & loans
▪ Credit & loan products & services: money lending, loans, credit services, and other
credit-based financial instruments.
For example: Credit cards, loan offers, mortgage & refinancing products
▪ Credit history: content, products & services related to credit history. For example:
• user-specific data indicating an individual's record of responsible repayment of
debts
• credit reports or transactions from a number of sources, including banks, credit
card companies, collection agencies, and governments
• credit score or credit reporting services
▪ Credit score: user-specific data indicating that user’s current or previous credit score
▪ Loan records: user-specific data indicating records related to a loan which was obtained by
the person or company
o Income bracket: user-specific data indicating an individual's personal income range
o Insurance: user-specific data indicating an individual's auto, home, life or other property-
related insurance data, such as policy number.
o Tax returns: user-specific data indicating an individual's or couple's tax information, including
images of completed filing forms, amount owed, amount reimbursed, or other tax-related
status.
o Retirement & pension: content, products or services related to retirement savings advice or
services, or Social Security information. For example:
▪ user-declared status as retired (any age)
▪ retirement savings workshops, pension portals or tutorials, financial instruments related
to retirement savings
18. Tragedy & hardship: Identifies content related to personal vulnerability based on past
criminal activity, loss, or victimhood.
Note: No labels in this category should be applied to news items.
• Criminal record: Identifies content related to an individual’s criminal history, criminal activity, or
criminal record. For example:
o user-declared criminal history, activity, or record
o products or services related to criminal charges (e.g., bail bondsman, Hope for Prisoners)
o topics related to criminal activity or history (e.g., drug culture)
o attributes related to conviction history (e.g., “ex-offender,” “felon”)
Note: This label does NOT apply to criminal proceedings (e.g., crimes, arrests, trials, convictions, etc.)
present in news articles.
• Personal loss: Identifies content related to experiencing or having experienced personal hardship
related to loss (e.g., death or illness of a loved one, loss of home or shelter, displacement from natural
disaster). For example:
o user-declared information about personal loss
o services related to personal loss (e.g., grief counseling services, FEMA, homeless services, pet
cremation services)
• Personal safety & security: Identifies content related to an individual’s personal or home safety and
security. For example:
o products related to personal or home safety & security (e.g., home alarm systems, home
surveillance cameras, etc.)
o services related to personal or home safety & security (e.g., disaster insurance products,
neighborhood watch programs or apps, cybersecurity software, etc.)
• Victim status: Identifies content related to an individual’s personal history or experience as a victim of
a crime. For example:
o user-declared status as a victim of crime
o attributes related to victim status (e.g., “sexual assault survivor”)
o services related to victim status (e.g., victim support groups or organizations)
• Violence: content depicting or promoting violent crime or hate, including culturally recognizable acts
of violence or terrorism.
Note: content containing a credible threat of violence should be immediately escalated

Examples
# Document Categorize To: Reasoning
1 Categorize To: 1) Health > General health care > Sexual & reproductive health care;
2) Health > Medical condition > Prescription & OTC drugs
Reasoning: The core purpose of this ad is to promote a contraceptive pill delivery service. This
relates to the SCD label 'Sexual & reproductive health care' because it is focused on sexual wellness.
It should receive a secondary label of 'Prescription & OTC drugs' because the ad is for a drug
delivery service.

2 Categorize To: Age > Children
Reasoning: The core purpose of this ad is to promote play mats. This relates to the SCD label
'Children' because it is intended for/bought for babies or kids.

3 Categorize To: Regulated products & services > Gambling
Reasoning: The core purpose of this ad is to promote a casino. This relates to the SCD label
'Gambling' because the ad clearly promotes gambling at a casino destination.

4 Categorize To: Family relationships > Parents
Reasoning: This pixel clearly shows parenting topics and the word ‘mums.’

5 Categorize To: 1) Gender > Women;
2) Health > General health care > Preventative care & health programs
Reasoning: The core purpose of this ad is to promote a breast cancer insurance product. This relates
to the SCD label 'Preventative care & health programs' because insurance is considered a health
program and does not necessarily indicate an illness. This ad should also be labeled with 'Women'
because it specifically addresses women.

6 Categorize To: Health > Medical condition > Illness & injury
Reasoning: The core purpose of this ad is to promote a lung cancer support service. This relates to
the SCD label 'Illness & injury' because the other children of Medical condition (Addiction, Body
dysmorphia, Pregnancy & Childbirth, and Prescription/OTC) do not apply. Therefore, 'Illness &
injury' should be used.

7 Categorize To: Health > Wellness > Fitness & self-care
Reasoning: The core purpose of this ad is to promote an activity tracking smartwatch. This relates
to the SCD label 'Fitness & self-care' because it has to do with non-medical wellness products. It
mentions fertility and menstrual periods, but these are not the core purpose of the ad.

8 Categorize To: Health > General health care > Preventative care & health programs
Reasoning: The core purpose of this ad is to promote a nursing education program. This relates to
the SCD label 'Preventative care & health programs' because health education is a general health
topic, not a medical condition or a form of wellness.

9 Categorize To: 1) Cultural background > Ethnicity; 2) Faith & spiritual beliefs
Reasoning: The core purpose of this ad is to promote travel to Israel on an Aliyah journey. This
relates to the SCD labels 'Ethnicity' and 'Faith & spiritual belief' because the ad specifically
focuses on Jewish diaspora community members traveling to Israel.

10 Categorize To: Military > Military supplies & survivalism
Reasoning: The core purpose of this ad is to promote a private military training course. This
relates to the SCD label 'Military supplies & survivalism' because the ad is not affiliated with a
specific branch of the military or a hobbyist military topic.

11 Military > Veteran status
The core purpose of this ad is to promote a veterans’ benefits program. This relates to the SCD label
‘Veteran status.’

12 Politics > Political affiliation
The core purpose of this ad is to promote the Scottish National Party (SNP) and its specific position on
Brexit. This relates to the SCD label 'Political affiliation' because it is about one political party.

13 Regulated products & services > High fat, sugar, and salt foods
The core purpose of this ad is to promote a candy store. This relates to the SCD label 'High fat, sugar,
and salt foods' because candy is one of the food types on this list.

14 Regulated products & services > Alcoholic beverages
The core purpose of this ad is to promote the partnership between Heineken and the Champions League.
This relates to the SCD label 'Alcoholic beverages' because it promotes drinking beer.

15 Faith & spiritual belief
The core purpose of this ad is to promote a religious campaign sponsored by a Catholic diocese. This
relates to the SCD label 'Faith & spiritual belief' because it has to do with a specific religious
denomination.

16 Faith & spiritual belief
The core purpose of this ad is to promote the services of a psychic medium. This relates to the SCD label
'Faith & spiritual belief' because psychic ability is a spiritual belief, even if not necessarily attached
to a specific religious tradition.

17 Health > Wellness > Weight loss
Health > Wellness & prevention > Vitamins & supplements
The core purpose of this ad is to promote a colon cleansing drink that claims weight loss benefits. This
relates to the SCD labels 'Weight loss' because the product claims weight loss benefits and 'Vitamins and
supplements' because the full ad shows a colon cleansing supplement in powder and drink form.

18 Health > General health care > Preventative care & health programs
The core purpose of this ad is to promote a work-safety related ergonomic posture cushion. This relates to
the SCD label 'Preventative care & health programs' because, as the guidelines mention, occupational
safety materials fall under this category.

19 Health > Medical condition > Illness & injury
The core purpose of this ad is to promote a treatment for thinning hair. This relates to the SCD label
'Illness & injury' because even though we may consider hair loss or hair thinning a cosmetic issue, it is
treated as a health problem with a treatment.

20 Health > Medical condition > Illness & injury
The core purpose of this ad is to promote the services of an eye care provider. This relates to the SCD
label 'Illness & injury' because unlike other general medical providers (family practitioners, dentists,
pediatricians), optometrists and ophthalmologists generally focus more on eye conditions (prescriptions,
reading glasses) than on general preventative medicine.

21 Nationality > Citizenship & immigration status
The core purpose of this ad is to promote a podcast episode about work visas in Japan. This relates to the
SCD label 'Citizenship & immigration status' because it is about information related to changing a visa
status. While the text mentions dating, the core purpose of this ad is not related to dating.

22 Politics > Political candidate
The core purpose of this ad is to promote a candidate for election or re-election. This candidate is
identified as the 'Leader of Canada's Conservatives' and the ad is sponsored and paid for by the
Conservative Party of Canada. By asking the public to 'place your faith in me,' he is asking for their vote.

Note: even though 'mental health' is mentioned, this ad does not need any Health labels because health is
not the core purpose of the ad.

23 Politics > Political candidate
The core purpose of this ad is to promote a candidate for election or re-election. This candidate, Barrett
Reed, is running for a seat on the Santa Barbara City Council, a local government position. The phrase
'will you join me' is a way of asking people to vote for this candidate.

Note: American political candidates often use images of their own family or other families to relate to
voters. Even though there is an image of a family, no Family status labels are needed because family is
not the core purpose of this ad.

24 Military > Military affiliation
Age > Children
This image shows a man in a military uniform holding an infant. The caption reads 'Us again!'

We clearly see that military affiliation is an attribute of the image. However, we can't exactly tell what
the relationship between the man and the child is, so we can't really infer 'Parents' from this. The
caption tells us that the core purpose is about both of them ('us'). Therefore, 'Children' is the most
appropriate secondary label.

Decision based on Image/Video + Text

25 Family relationships > Parental status > Parents
Age > Children
This image also shows a man and an infant. When we read the caption, we see that the core purpose of this
post is to celebrate him on Father's Day. There is also an image of a child.

Decision based on Image/Video + Text

III. Document type guidelines
Advertisements
What is an Ad?
An ad (advertisement) is a commercial communication from a seller to a potential buyer about a product or
service.

Example of a Facebook Ad Example of an Instagram Ad Example of an Instagram Story Ad

Ad elements examples:
Note: Not all ads will follow the same consistent format.

Facebook Ads

1. The Facebook page running the advertisement
2. Text describing the ad and what it offers
3. Media and other attachments (images, videos, etc.) that promote the ad.
4. A call-to-action where readers are incentivized to follow up on the ad

Instagram Ads

1. The Instagram profile running the advertisement
2. Media and other attachments (images, videos, etc.) that promote the ad.
3. A call-to-action where readers are incentivized to follow up on the ad
4. Text describing the ad and what it offers

[optional] “scroll image” arrow for when more than one image is present in the ad and “more” button to
expand long captions

Instagram Story Ads

1. The Instagram profile running the advertisement
2. Media and other attachments (images, videos, etc.) that promote the ad.
3. A call-to-action where readers are incentivized to follow up on the ad

Special considerations for ads


• Evaluate all components of the advertisement.
• If the ad has a video, watch the FULL video (important information for labeling may appear at the end).
• Take hashtags into consideration to determine the ad’s core purpose.

Webpages
What is a Webpage?
A webpage is a single instance document of information viewed on a website, viewable by anyone connected
to the internet who has a web browser.

Examples of Webpages

Special consideration for webpages


• Close pop-ups: be sure to x-out any pop ups that cover the webpage you are labeling. You are not
labeling the pop-up itself, but the content behind the pop-up message
• Use the website for context:
o As a general rule, only label the specific landing page provided in the job.
However, sometimes additional context can be drawn from the broader website. Use the
website name and purpose as clues to help determine the correct label, not as an absolute rule.
For example:
▪ Website name: noticing that a landing page is from a website called ‘Christian
Books’ might help you understand whether or not the product on the landing page
relates to the ‘Faith & spiritual belief’ label
▪ Website purpose: if all the content on the website ‘Fit & Fun’ is weight-loss related,
this may indicate that the landing page you are labeling relates to the ‘Weight loss’
label.
o Do not click to other pages within the website to find additional content to label. For instance,
if the linked page shows a dress for sale, but another page within the website has medicines for
sale, only label the linked page.

• Some webpages show search results. To the best of your ability, label the page for the searched
product.
o Search query and breadcrumbs: If ‘protein powder’ is the search query, but the search results
are a mix of products, label for ‘protein powder’ (Health > Wellness > Vitamins & supplements)

o URL: Check the URL for specifics about the product or service search. In this case, we can see
that protein powder is the focus of the search

• Some webpages show professional profiles. To the best of your ability, label the page for the main
purpose of the professional service offered by the professional profile or contact. For example:
o A listing of orthopedic surgeons in a geographic area should be labeled with ‘Illness & injury’
because the main purpose of these professional listings is promoting services treating a specific
medical issue.
o A profile of a gynecologist, showing their photo, education, professional experience, and
specialties should be labeled with ‘Sexual & reproductive health’ because the main purpose is
promoting services relating to this specific branch of health care.
• News and research content
News and research content should be classified similarly to other webpage content. However, because
webpage content has such a huge range of news articles and research papers, here are some specific
guidelines for these scenarios.
o Health-related news: Health news and research can range from clear, consumer-facing advice
(“7 Ways to Lower Your Blood Pressure”) to highly-specific professional research (“Factor Levels
with Platelet Count in Colorectal Cancer: Clinical Evidence?”). Regardless of the type or style,
label health news for its main purpose and topic. For example:
▪ If it is about a medical condition, label with Illness & injury, even if the article seems
highly specific or research-based.
▪ If an article is about fitness, label as Fitness & self-care
o Political news: Like medical news and research, political news can range from clear political
opinions to complex scholarly research. To the best of your ability, evaluate the headline and
topic of the news article to determine if it has to do with
▪ a political issue (neutral or unbiased discussion)
▪ a political opinion (taking a side, but not explicitly discussing political party membership
or specific left-wing/right-wing affiliation)
▪ political affiliation (specifically related to one political party or faction)
o Crime & tragedy news: News about crime is not considered sensitive. News stories discussing
crimes committed by specific individuals do not relate to the sensitive category of ‘Criminal
record.’

Facebook Pages
What are Facebook Pages?
Pages are places on Facebook where artists, public figures, businesses, services, brands, organizations, and
nonprofits can connect with their fans, customers, and followers.

Examples of Facebook Pages

Special considerations for Facebook pages
• Consider only the core purpose of the Facebook page.
o You can ask yourself the following questions:
▪ What type of business or organization is this?
▪ What product(s) are being promoted by the page owners?
o You can look at the following information:
▪ Structured data in side bars: Home, About, Posts
▪ The provided description
▪ Posts made by the page owner/sponsor
• Verify the About section. It can offer clues, but it is sometimes misapplied (e.g., a fitness studio
marketing itself as ‘Physical therapy’ or a fire department labeling itself as ‘Nonprofit organization’).
• Do not label the page based on comments or posts made by other users who are not the page
owners/sponsors.

Keywords
Some users will be asked to evaluate a keyword using the sensitive category labels.

• Read the keyword and decide whether it relates to any of the sensitive category labels. Be sure to
review the labels, as necessary.
• None apply label: use when the meaning of the keyword is clear, but it does not relate to any sensitive
category. For example:
o Keyword: Fun
o Keyword: Industry
• This keyword is not useful label: use when
a. there could be multiple meanings of a word that could affect its sensitivity. For example:
i. Keyword: Afghan (blanket? person from Afghanistan? hound?)
ii. Keyword: Diet (eating patterns? weight loss plan?)
iii. Keyword: Separation (relationship status? separation anxiety?)
iv. Keyword: Dates (fruit? dating?)
b. the context or meaning of the keyword is completely unclear and no assessment can be made.
For example:

Example of user interface:

Solution: Here, the keyword, ‘trending,’ is very general. It doesn’t have a clear connection with any sensitive
category in any context.

In this case, the user should mark “None apply”

Apps
What is an app?
Apps will appear as app store pages promoting a specific app product.

Examples of apps
Sometimes an app might not be available in your location. You’ll use other information in the UI to label the
app.

App Workflow:
1. Click through any provided URLs.
2. Note the app category.
3. Read the text description of the app.
4. Decide whether this app relates to any sensitive categories.
5. Indicate if you were able to make a decision on the job. Try to make a decision with the information
available, even if the URL is broken or there is a blend of languages in the URL or text
description. ONLY use an error if one of the following issues arises AND there is not enough
information to label the job.
a. Document did not load: URL is broken or invalid (e.g., does not direct to iTunes store or Google
Play) AND there is not enough information from the app category and text description to label
the job.
b. Missing content: URL leads to an iTunes or Google Play page, but app is not available in your
area or no longer exists AND there is not enough information from the app category and text
description to label the job.
c. Wrong language: you cannot understand either the English text or the other language in the UI
and the URL AND this keeps you from labeling the job.
6. If you were able to make a decision, indicate whether you reached your decision based on
a. Text from UI (app category + text description)
b. URLs
c. Text from UI and URLs
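
The error rule in step 5 can be sketched as a small decision function. This is only an illustration: the function name `app_job_error`, its parameters, and the `url_status` values are hypothetical and are not part of the labeling tool. It assumes the rater has already judged whether the available information is enough to label the job.

```python
from typing import Optional


def app_job_error(url_status: str, enough_info: bool,
                  language_understood: bool = True) -> Optional[str]:
    """Return the error to report for an app job, or None if no error applies.

    url_status is a hypothetical value:
      "ok"          - the URL loads a usable app store page
      "broken"      - the URL is broken or invalid
      "unavailable" - the store page loads, but the app is not available
                      in the rater's area or no longer exists
    enough_info says whether the job can still be labeled from what is available.
    """
    # An error is only valid when the job cannot be labeled at all.
    if enough_info:
        return None
    if not language_understood:
        return "Wrong language"
    if url_status == "broken":
        return "Document did not load"
    if url_status == "unavailable":
        return "Missing content"
    # None of the listed error conditions applies.
    return None
```

For example, a broken URL with a clear app category and description gives `app_job_error("broken", True)`, which returns `None`: the job should be labeled rather than marked as an error.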

Special considerations when labeling apps:


• Always click on the links AND notice the app category AND read the text description to gain a full
understanding of the app’s purpose.
• Some URLs might be broken or unavailable in your area.
o Label the job if there is enough information in the app category and text description. Do not use
the ‘Document did not load’ or ‘Missing content’ errors unless there is not enough information
to label the job. (see above)
• Many app descriptions and URLs contain blended languages, generally English alongside another
language.
o Label the job if there is sufficient information for you to understand, regardless of the language
combination. Do not use the ‘Wrong language’ error unless you are unable to understand
either language AND you don’t have enough information to label the job.

Sample app workflow


1. Click through the URLs: in this case, both URLs work and give images and text to help determine the
app’s main purpose.

2. Note the app category. In this case, we see that the category is ‘Medical.’ This is a clue that we will use
‘Preventative care & health programs’ and/or ‘Medical conditions.’
3. Read the text description: in this case, we see that this app relates to managing medications - it does
not point to a specific illness, injury, or other medical condition, but it does relate to medications.
4. Make a label decision: correct label: Health > Medical conditions > Prescription/OTC drugs
5. Indicate if you were able to reach a decision on this job: correct response: Yes
6. Indicate how you reached your decision: in this case, the correct selection would be Text from UI and
URLs because we were able to gather information about the app from both links, the app category,
and the text description.

Concepts
This type of job asks you to evaluate whether a word or phrase, and sometimes a description as well, relates
to any sensitive data category. The concept words or phrases are a piece of descriptive data attached to an
image or video. You will not see the image or video.

Concept workflow

1. Read the concept word or phrase
2. Read the description, if one is provided
3. Decide whether this concept relates to any sensitive categories
4. Indicate if you were able to make a decision on the job. If no description is available, make a decision
using only the concept.
a. Use the Missing Content Name error ONLY if the UI is missing both concept and description.
5. If you were able to make a decision, indicate whether you reached your decision based on
a. Concept name
b. Concept description
c. Concept name and Concept description

Special considerations when labeling concepts

1. Concepts can be simple words/phrases (e.g., canoe_trip or Italian soda)
2. Concepts can be hashtags, displayed in the following format: raw_hashtag_“concept”
(e.g., raw_hashtag_ascidian or raw_hashtag_crashbarrier). You will label based on the final “concept”
portion (e.g., ascidian or crashbarrier).
3. Some concepts do not have a description. If you do not know what the concept is or means, use a side
search. For example:
a. icrc (no description): side search shows that ICRC is the International Committee of the Red
Cross
b. frieza (no description): side search shows that ‘Frieza’ is a character in Dragonball Z.
4. Bias toward using a sensitive label where there is too little context or where a side search returns
conflicting definitions.
a. “Diet” - without context or description should be labeled as ‘Weight loss’
b. “Bramble” - side search returns a cocktail type and a plant type. Label as ‘Alcoholic beverages.’
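
The hashtag rule in item 2 can be sketched in a few lines of Python. This is a minimal illustration that assumes every hashtag concept starts with the literal prefix `raw_hashtag_` shown in the examples above; the helper name `concept_portion` is hypothetical.

```python
RAW_HASHTAG_PREFIX = "raw_hashtag_"


def concept_portion(concept: str) -> str:
    """Return the part of a concept name that should be labeled.

    Hashtag concepts drop the "raw_hashtag_" prefix; other concepts are
    returned unchanged.
    """
    if concept.startswith(RAW_HASHTAG_PREFIX):
        return concept[len(RAW_HASHTAG_PREFIX):]
    return concept
```

With this helper, `concept_portion("raw_hashtag_crashbarrier")` gives `"crashbarrier"`, while a plain concept such as `"canoe_trip"` is left as it is.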

Sample decisions

1. Concept: Italian soda

Description: An Italian soda is a soft drink made from carbonated water and simple syrup usually flavored
Decision: High fat, sugar, and salt foods

2. Concept: grasshopper

Description: Any of numerous orthopteran insects chiefly of the suborder Caelifera characteristically having
long powerful hind legs ...
Decision: None apply

3. Concept: beer_mug
Description: none
Decision: Alcoholic beverages (Rationale: no side search necessary - this is clearly related to a sensitive data
category)

4. Concept: fios

Description: none
Decision: None apply (Rationale: a side search shows that “Verizon Fios, also marketed as Fios by Verizon, is a
bundled Internet access, telephone, and television service that operates over a fiber-optic communications
network with over 6.5 million customers in nine U.S. states.”)

5. Concept: raw_hashtag_ascidian

Description: none
Decision: None apply (Rationale: a side search for the concept portion of the hashtag, ‘ascidian’ shows
that “Ascidiacea, commonly known as the ascidians, tunicates, and sea squirts, is a polyphyletic class in the
subphylum Tunicata of sac-like marine invertebrate filter feeders.”)

Example of user interface

Facebook Page Posts

What are Facebook Page Posts?


Pages are places on Facebook where artists, public figures, businesses, services, brands, organizations, and
nonprofits can connect with their fans, customers, and followers. Fans, customers, followers, and the business
itself can post content to the Facebook Page.

Content can be text, image, or video. In this case, you will be categorizing Facebook Page Posts with text only.

Any personal information (e.g., names, phone numbers, etc.) will be removed from the post in the following
format: <redacted_data_type>

Example of Facebook Page Post

Special considerations for Facebook Page Posts

• Consider only the core purpose of the Facebook page post. What kind of business, product, or service
is this post intended to promote?
• In isolating the core purpose of the post, avoid labeling extra text that does not specifically relate to
the core purpose (e.g., labeling all duties in a job post for a nursing position)

Example Page Post jobs


Page Post Decision Rationale
Decision: Regulated products & services > Alcoholic beverages; Regulated products & services > Nightlife
Rationale: The core purpose of this post is to promote karaoke night at a pub. This combines both the sale
of alcoholic beverages and nightlife.

Decision: Regulated products & services > High fat, sugar & salt food
Rationale: The core purpose of this post is to promote a pizza restaurant. Because this post is focused
only on pizza (and not other foods), it should be labeled with HFSS.

Decision: None apply
Rationale: The core purpose of this post is to promote a jewelry sale. Even though the post mentions
'ladies,' no gender label should be used, per guidelines for clothing and accessories.

Decision: Health > General health care > Preventative care & health programs
Rationale: The core purpose of this post is to promote a job opening in the health care industry. It should
be labeled with Preventative care & health programs.

Decision: None apply
Rationale: This post promotes a pizza restaurant, but also other food and catering. Because the focus isn't
exclusively on pizza (or another HFSS food), this should not be labeled with HFSS.

Decision: Faith & spiritual belief
Rationale: This post promotes a church service. It should be labeled with Faith & spiritual belief.

__________________________________________________________________________________________

Image/Video Guidelines
Document type description
The images and videos you will be labeling come from user generated content on Meta platforms. They will generally
be accompanied by a user-provided caption.

Approach
In evaluating the image or video documents, you should ask yourself two central questions:
1. What is the core purpose of this image or video?
2. Does the user reveal any sensitive information about themselves in the image or video? Are restricted
products or services shown?
There may be tangential mentions of restricted topics, but if they do not indicate a core purpose about a restricted
topic they should not be labeled.
Considerations:
• Evaluate each job fully by watching the full video, examining all images in a series, and reading the
full caption.
• Any text and image/video should be evaluated together. Restricted topics could appear in either.
• Any user-provided emojis may be considered for context.

Workflow
1. Evaluate the job by examining all images and reading the full caption text.

2. Indicate whether or not you can reach a decision on the job. See above for full description of
possible error responses.

3. If yes, assign all sensitive categories that apply to the job.

For this job, we have plenty of information to make a decision. The images show a range of family pictures from
several decades. The core purpose of the text is clearly to mourn and remember a loved one. The caption reveals that
the writer has suffered the death of a beloved family member.

The restricted topic Tragedy & hardship > Personal loss is the correct label.

While several types of family relationship are mentioned, we don’t exactly know the nature of the family relationship,
so no ‘Family relationships’ category should be selected.
4. Indicate how you reached your decision.

In this job, the image provided a small amount of context, but the core purpose was revealed in the caption. ‘Text’
would be the most appropriate selection to report what part of the document was most helpful in categorization.

_____________________________________________________________________________________________________________

3PD Guidelines

Document type description


The jobs in the 3PD workflow are made up of three distinct parts:

• URL path: string identifying the web resource


Example: m.snapdeal.com/product/x/660338254268

• JSON: key/value pairs separated by a colon; valid values in the key/value pair will generally be strings,
numbers, boolean, or ‘null’
Example: {content_type : vehicle,
content_category : suv_midsize,
year : 2012}

• Query: a series of query parameters


Example: sd_shorturl = 1; utm_medium = wa_transactional_growth
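
To make the URL path and Query parts concrete, the sketch below splits the two example strings above into pieces that can be scanned for keywords. It is only an illustration: the helper names are hypothetical, and it assumes the Query is shown as `key = value` pairs separated by semicolons, exactly as in the example. It is not a description of how the labeling tool itself processes the data.

```python
from typing import Dict, List


def url_path_segments(url_path: str) -> List[str]:
    """Split a URL path on '/' and drop empty pieces."""
    return [part for part in url_path.split("/") if part]


def query_parameters(query: str) -> Dict[str, str]:
    """Parse 'key = value; key = value' text into a dictionary."""
    params: Dict[str, str] = {}
    for pair in query.split(";"):
        if "=" not in pair:
            continue  # skip fragments that are not key/value pairs
        key, value = pair.split("=", 1)
        params[key.strip()] = value.strip()
    return params


segments = url_path_segments("m.snapdeal.com/product/x/660338254268")
params = query_parameters("sd_shorturl = 1; utm_medium = wa_transactional_growth")
```

Here `segments` holds the host and each path piece as separate strings, and `params` maps each query key to its value, so that each piece can be checked on its own for a restricted-topic keyword.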

Approach

Objective
Evaluate each input for clear signals that the data is related to restricted topic categories.

Considerations
To evaluate each of the three inputs, you should ask yourself:
1. Are there keywords embedded in the URL path, JSON data, or Query that indicate that this data is
related to a restricted topic category?

2. Is there enough context to make a clear judgment about a related restricted category?

There may be considerable noise and extra text around relevant keywords or values. Be sure to evaluate each input
type carefully.

Workflow

General Guidelines

There are two high-level activities for handling the 3PD data. Specifically:

1. The widget has the entire sample (url+query+json) selected by default, and raters should select all
categories related to the entire sample.
2. Raters should also select specific segments in URL, Query, or JSON that are related to the restricted
categories. For this part, the segments selected should be as specific as possible.

Important Additional Guidelines for 3PD


1. In many of our jobs, JSON or Query data (similar to the example below) do not exist. Empty inputs
are possible and valid. Raters only need to label whichever inputs are non-empty. For the example
below, raters should select not sensitive rather than page not loaded.

2. The “Entire Sample” segment is intended to be a required segment. If there are also specific
keywords, raters should first assign a category to the entire sample and then select specific
keywords, as necessary. The entire sample might be multilabel, so a keyword might have different
assigned categories than those for the entire sample. For instance, the sample may be related to
health and politics, but a specific keyword could be health only. In other words, raters are expected
to assign potentially more than one category to the entire sample.

Important URL Considerations:


1. Do not create a segment or label assignment for the entire URL. Select only the relevant part or parts.

2. When selecting keywords from the URL, please do not include irrelevant punctuation such as
double quotes or semicolons.
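
As an illustration of the punctuation rule above, the following sketch trims double quotes, semicolons, and surrounding whitespace from a selected segment. The helper name `clean_segment` is hypothetical, and the set of characters removed is an assumption based only on the two examples named in the rule.

```python
def clean_segment(segment: str) -> str:
    """Strip surrounding whitespace, double quotes, and semicolons from a segment."""
    return segment.strip(' \t";')
```

Only the ends of the segment are trimmed, so punctuation inside a keyword (for example, the hyphen in `weight-loss`) is left untouched.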

Before labeling
Do an initial evaluation of the job by examining all three input types (URL, JSON, QUERY)

URL PATH

JSON

QUERY

Decide whether or not you can make a decision on the job. See above for full description of possible error
responses.

• For this job, we have plenty of information to make a decision. The URL, JSON, and QUERY are in
English, and there is enough text to give context to the data.

Labeling
If you feel that you are able to make a decision on the job, assign all restricted topic categories that apply to the job.

If necessary, do a side search to understand the url path. In this case, a search shows that snapdeal.com is an India-
based e-commerce site.

Assigning a label
1. Highlight any segments of the inputs that indicate that this data is related to a restricted topic
category.

2. Highlighted segments will appear in the central top panel.


a. Clicking Add Segment will create a pill showing your selected segment.

b. Clicking the x on the pill will delete the added segment.

3. Assign the appropriate restricted categories to the segment(s) you have selected.
a. In this case, “content_ids” is ONLY AN EXAMPLE to show the workflow.

b. Based on the URL and the JSON key/value pairs, we can clearly see that this data is related to a
used car sale, which is not a restricted category. No segments would need to be selected for
this job.

4. In the upper right panel, indicate whether you were able to make a decision on the job and what
input helped you make a decision on the job. Submit.

_______________________________________________________________________________________________

Facebook Comments Guidelines

Document type description

The jobs in the Facebook Comments section are made up of comments entered by Facebook members.

Approach

Objective

Evaluate each comment for clear signals that the data is related to restricted topic categories.

Considerations

To evaluate the job, ask yourself: is there enough context to make a clear judgment about a related restricted
category?

There may be considerable noise and extra text around relevant keywords or values. Be sure to evaluate each
comment carefully.

Workflow

General Guidelines:

For these Facebook comments documents, raters will not be calling out specific keywords but rather making a call on
the entire comment.

Decide whether or not you can make a decision on the job.

Considerations

If you feel that you are able to make a decision on the job, assign all restricted topic categories that apply to the job.

In the upper right panel, indicate whether you were able to make a decision on the job and what input helped you
make a decision on the job.
Submit.

The following two screen shots provide a general idea of the Comments work area.

_______________________________________________________________________________________________
