3 Guidelines
Last updated 2023-06-09
I. Rater instructions
Note: In these instructions, “document” refers to any text, image, or video that you may be
evaluating.
Objective
The labeling task is to determine if the core purpose of a document falls into one or more restricted categories
and categorize the document into the best-fit category(s) that apply.
Use all of the information provided within the document to determine the core purpose of the document and
each correct category that describes the core purpose.
Overview
• The restricted categories are organized in a hierarchy, with some sub-categories appearing as
“children” of “parent” categories.
o Category = Parent Category
o Sub-category = Child category
• The parent will not be selectable in the SRT UI, and the user will see the child categories for that parent
when expanding each parent category using the arrow. Note: some categories consist of only one
term and do not have child categories. These single-term categories will be selectable.
The user will be presented with one document at a time and asked to select the most specific, but still
inclusive, child category or multiple child categories for each document.
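The selection rule described in this overview can be sketched in code. This is a minimal illustration in Python, not the SRT tool's actual data model: the `HIERARCHY` dictionary, the function name, and the single-term placeholder entry are assumptions made for the example. The two parent/child groupings are taken from the Definitions in this document and are deliberately partial.

```python
# Hypothetical, partial model of the category hierarchy.
# A parent maps to its child categories; a single-term category maps to an empty list.
HIERARCHY = {
    "Regulated products & services": ["Alcoholic beverages", "Weapons"],
    "Sex life": ["Dating", "Sexual orientation"],
    "Example single-term category": [],  # placeholder name, not a real category
}


def selectable_categories(hierarchy):
    """Return the labels a rater could select: children, plus single-term categories."""
    selectable = []
    for parent, children in hierarchy.items():
        if children:
            # Parents with children are not selectable; only their children are.
            selectable.extend(children)
        else:
            # Single-term categories have no children and are selectable themselves.
            selectable.append(parent)
    return selectable
```

Under these assumptions, "Sex life" never appears in the result, while its children and the single-term placeholder do.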
Error Buttons:
There are four types of issues you might come across when evaluating documents: Page Load Error, Missing
Content, Wrong Language, and Sensitive.
The following table describes what each of these errors might look like, as well as what you should do if you
come across them while categorizing.
Rating Instructions
1. Evaluate all components of the document, including watching the full play time of videos.
2. Determine the core purpose of the document or video.
3. Do a side search if necessary: some documents may have an unclear purpose, product or service and
further context may be needed. Please do a side search in this case. Wikipedia is a good place to start,
but sometimes a broader search will be necessary.
4. Indicate whether you are able to make a decision on this job
a. If yes, move forward to select the most relevant restricted category(s) - child categories
b. If not, select the most appropriate error category (see Error category definitions above in
Materials)
5. Determine if the core purpose falls into any restricted category(s)
a. Yes, categorize the restricted category(s) to the best-fit child-category/child-categories (select
all relevant categories)
b. No, select the “None Apply” option. This option should only be selected if the user has
thoroughly evaluated all images, text, and/or hashtags and found that the document does not
relate to any restricted categories.
CONFIDENTIAL TO APPEN – DO NOT FORWARD
c. IMPORTANT: the user must select at least one child category OR “None Apply” per job. If
neither “None Apply” nor a child category is selected, the job will not be counted.
6. Indicate if you used the text, image/video, or text and image/video in the document to categorize
into child category or categories. If you did not select a child category and selected “None Apply”,
select N/A.
A. Check the radio button below the media (image and video) and make sure that it is always at the
leftmost position. (See image above)
B. Tag all SCD content in the 1st media, before proceeding to the next. If the first media already contains
the SCD Attributes Category in the 2nd media, proceed to other SCD taggings if available.
C. Repeat step B for the succeeding media and rate accordingly.
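The submission rules in steps 5 and 6 of the Rating Instructions can be sketched as a small check. This is an illustrative Python sketch only: the function name, the `basis` strings, and the message texts are assumptions, and the rule that “None Apply” is not combined with child categories is an inference from the instructions rather than something they state outright. The error-category path in step 4b is not modeled.

```python
NONE_APPLY = "None Apply"


def validate_job(selected_labels, basis):
    """Check one job's answers against steps 5 and 6.

    selected_labels: list of chosen child categories, or [NONE_APPLY].
    basis: "text", "image/video", "text and image/video", or "N/A".
    Returns a list of problems; an empty list means the job can be counted.
    """
    problems = []
    if not selected_labels:
        # Step 5c: a job with neither a child category nor None Apply is not counted.
        problems.append("Select at least one child category or None Apply.")
    if NONE_APPLY in selected_labels and len(selected_labels) > 1:
        # Assumption: None Apply is treated as exclusive of child categories.
        problems.append("None Apply cannot be combined with child categories.")
    if selected_labels == [NONE_APPLY] and basis != "N/A":
        # Step 6: with None Apply, the basis is N/A.
        problems.append("With None Apply, the basis must be N/A.")
    return problems
```

For example, `validate_job(["Alcoholic beverages"], "text")` returns an empty list, while `validate_job([], "N/A")` reports the missing selection.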
In some documents, a possible restricted category topic can be mentioned, but is not the core purpose. In
these cases, only apply labels related to the main topic of the document.
On the right side of the screen, there will be the interactive section where the task is completed and
submitted. If the content that is provided is not clear or there is confusion, perform a brief side search, if
necessary, using a search engine or Wikipedia to gain a better understanding.
Avoid Over-Labeling
Please ensure that the labeling does not include more labels than necessary to categorize the document.
Examples:
Important Note for Reels/Stories: For overlay texts in media, side search up to a maximum of 3 unfamiliar
words
Approach 1:
Keyword + “Meaning”
e.g., Muslimah Meaning
Approach 3:
Keyword + “Slang Meaning”
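The two search patterns listed above, together with the limit of 3 unfamiliar words, can be sketched as follows. This is a hypothetical helper written in Python for illustration; the function and constant names are assumptions, and only Approach 1 and Approach 3 are covered because those are the only patterns given here.

```python
MAX_UNFAMILIAR_WORDS = 3  # limit stated in the note for Reels/Stories overlay text


def side_search_queries(unfamiliar_words):
    """Build search strings for the first few unfamiliar overlay words."""
    queries = []
    for word in unfamiliar_words[:MAX_UNFAMILIAR_WORDS]:
        queries.append(f"{word} Meaning")        # Approach 1
        queries.append(f"{word} Slang Meaning")  # Approach 3
    return queries
```

Calling `side_search_queries(["Muslimah"])` yields the two query strings for that word; any words beyond the third are ignored.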
If a document does not provide enough context for a category to be selected (such as a login page or a
blacked-out screen), select ‘Missing Content’. If you simply do not understand the content provided,
perform a brief side search using search engine results to gain a better understanding. Not
understanding the provided content is NOT a scenario of Missing Content, and the job should still be
categorized properly.
1. Review the English captions and try to get enough context of the job.
2. Review non-English words in the caption for IG posts, or in the text overlay in media for Stories/Reels, using the
Side-search template
b. If you can understand the context using these words, proceed with labeling
c. If you cannot establish a context, there’s no need to check the definition of the rest of the non-English
captions but you have to make a decision based on the below scenarios:
i. Apply SCD labels or NA based on the available context and English words from the overlay text in the media.
ii. Lean toward labeling Wrong Language if the entire text is non-English (if the designated market language is
English).
Note:
1. The 5-word benchmark for captions could be reduced to 3 words if the review time is affected.
2. For reels/stories with multiple images, such as slideshows or clips in videos with text overlay, apply the
Side-search template only in the first frame/image.
3. Usernames can still be used as additional context. However, if a username stands on its own and is mixed with a
foreign language, we will still lean toward labeling Wrong Language.
V3.1 has removed several categories. Frequently used categories: Couples & relationships, Married and Single
are removed. Less common categories: Grandparents, Family name, Screen name, and Title are also removed.
Content types
For each of the restricted categories, examples are provided to show whether a document falls within the category
based on the type of content it represents. The content could feature the following types of examples:
• Products and services related to the category
• Businesses, organizations, associations, & entertainment (including establishments, places)
• Education & schooling associated with the category
• Attributes related to the category
• Culture, topics & causes related to the category
• Job titles or professions within organizations affiliated with the category
• Public figures associated with the category
Definitions
1. Age: Identifies content explicitly restricted to a specific age-group.
• Children: content explicitly related to (serving or purchased for) babies or children under 13. For
example:
o Image, video, or posts (including any captions) in which a child or children are mostly the core
purpose, or primary focus of the content.
▪ Note: Where a post is focused on parental status, but also shows an image of a child,
please co-label with Children and Parents.
o Services explicitly for babies or children (e.g., youth sports leagues, elementary schools,
pediatric dentists)
o Products explicitly for babies or children (e.g., children’s literature, children and baby clothing,
baby equipment, children’s toys), NOT toys enjoyed or collected by a range of age groups
o NOTE: this label includes products and services for babies, infants, or toddlers
• Adolescent: content explicitly related to (serving or purchased for) adolescents aged 13-17 (inclusive).
For example:
o Image, video, or posts (including any captions) in which teens or what appear to be older
children are mostly the core purpose, or primary focus of the content. This must also be
supported by captions and hashtags (e.g., #teens, #U16s, etc.)
o Services explicitly for adolescents (e.g., middle and high schools, high school football, teen
counseling services)
o Products explicitly for adolescents (e.g., teen literature or magazines)
• Skin color: user-specific data, content, products, or services indicating an individual’s natural skin color,
including user-provided adjectives (e.g., skin lighteners, “fair skin,” “melanin-rich skin,” “olive skin” and
products or services directed at a specific skin color)
• Fingerprints: user-specific data indicating an individual's unique fingerprint pattern data
• Hair color: user-specific data indicating an individual's unique natural hair color
• Height: user-specific data indicating an individual's unique height, including user-provided adjectives
(e.g., tall, short, petite, etc.)
• Iris scan: user-specific data gathered through mathematical pattern-recognition techniques on video
images of one or both irises to reveal individually unique iris patterns
3. Faith & spiritual belief: identifies content closely related to faith, religion, and religious
belief, including the practiced absence of such belief and deeply held philosophical beliefs.
For example:
• Content related to membership or affiliation with a specific faith-based or philosophical group
o user-declared faith or religion
o businesses, organizations, or entertainment related to a religious affiliation (e.g., church,
mosque, temple, Jewish Community Center, Christian rock)
o products or services related to a religious affiliation (e.g., Scripture finder websites, prayer
journals, Christian booksellers)
o culture, topics, or causes related to a religious affiliation (e.g., Christian homeschooling, Islamic
fashion, film or literature with an explicitly religious purpose)
o education related to a religious affiliation (e.g., Yale Divinity School, Yeshiva University)
o job titles related to a religious affiliation (e.g., pastor, minister, guru, rabbi)
o public figures related to a religious affiliation (e.g., Jesus, Joel Osteen, Allah, Sai Baba)
o attributes related to a religious affiliation (e.g., atheism, Christianity, Sikhism)
o holidays based on religious belief, even if secularized (e.g., Diwali, Easter, Ramadan, Hannukah)
▪ DO NOT TAG content related to non-religious holidays (e.g., Halloween, Valentine's Day)
▪ Does not include holidays referenced for sale or promotion purposes (e.g., Easter sale,
Christmas discounts, etc.)
• Example: "Shop our Christmas deals!"
7. Health: Identifies content relating to physical and mental health conditions, treatments, and
organizations.
• Mental health
o Self-harm & suicide prevention: content related to suicide or other types of self-harm,
including user experiences, treatment programs, specialists, or awareness campaigns.
• General health care
o Preventative care & health programs: content, products and services related to medical
health generally, but not tied to a specific medical condition or non-medical healthy lifestyle
choices. Includes:
▪ general medical topics and practitioners (e.g., pharmacies, family practitioners,
pediatricians, preventative or general dentistry, primary care, lung health)
▪ health insurance and health administration (Medicare, appointment services)
▪ cosmetic surgery (breast augmentation, face lifts, medical esthetic services)
▪ Excludes: Medical education (e.g., nursing programs, certification courses), job postings
for health care employment, medical supplies and educational materials (e.g., CPR
mannequins, dental supplies for professional use)
o Sexual & reproductive health care: content related to general sexual & reproductive health
care excluding medical conditions (e.g., gynecology, prostate examinations, birth control, etc.)
o Vaccines & vaccine status: content related to vaccines and vaccination, including vaccination
services and clinics, vaccination awareness campaigns, user-declared vaccination opinions or
status, etc.
▪ Note: User vaccination opinions should also be labeled with Politics > Political opinion
• Health data: Identifies user-specific data indicating unique personal health data
o Blood type: user-specific data indicating unique personal classification of blood type
o Genetic data: user-specific data indicating unique personal genetic data, including genetic
mapping, declared genetic disorders, DNA sequencing, chromosomal data, or genetic markers.
Note: may be co-labeled with Ancestry in the case of user-specific genetic/DNA data related to
ancestry
o Personal health information: user-specific personal health measurements, biosignals (e.g., data
from electrocardiogram or electroencephalogram), biological cycle data (e.g., sleep data,
ovulation data), reading & level data, including health & fitness data from programs, devices,
software, and wearables (e.g., heart rate/bpm; calories burned, oxygen levels, blood glucose
levels, etc.)
• Medical condition
o Accessibility settings: user-specific data indicating unique personal control settings intended to
increase access to technology, as related to physical or neurological ability
Topics related to non-ethnic cultures (such as national cultures) should not be labeled with Ethnicity. Topics related
to racial identity should be labeled with Ethnicity.
14. Politics: content related to political affiliation or belief, voting activity and/or related to
political issues or government services
• Government services: content related to non-political government agencies and services. For example:
o agencies such as the Department of Motor Vehicles or local Parks & Recreation
Side research note: News sources with an explicit political leaning, state-owned news agencies, and
any news organizations affiliated with state propaganda should receive this category. When news
sources are closely tied to political affiliation, Wikipedia searches will often note the “political
alignment” of a news source in structured data on the right-hand side or in a content header. Any time
a political alignment is clearly stated, the topic should receive this category.
• Political candidate: content promoting a specific political candidate for public office
o Ads could be for local political positions (e.g., city council/councilman, mayor, alderman) or
state/national positions (e.g., representatives, assemblymen, senators, governors, presidents,
etc.)
o Where a document shows a politician as a part of a larger political movement, unrelated to a
specific election, Political affiliation is the correct categorization.
• Political issues: Social or political issues subject to political debate or lobbying, but without statement
of a specific stance or explicit political affiliation. Any topic related to a political or social issue that
might be subject of lobbying, a key issue in an election, or the subject of community organizing should
receive this category, if not tied to a specific politician or political party (which would indicate political
affiliation) or position (which would indicate political opinion). For example:
o topics such as “gun control,” “Israeli-Palestinian conflict”, “Environmentalism,” “criminal-justice
reform,” or “animal rights”
o documentary film or non-fiction books related to a political issue without statement of a
specific stance
• Political opinion: Political opinions or beliefs not related to specific political parties or membership.
References to “activism” and “political leaning” might indicate a topic is related to political opinion. In
general, “pro-xx” or “anti-xx” are strong indicators of political opinion. For example:
o user-declared opinions on a political topic (e.g., reproductive rights, gun control, immigration,
etc.)
o topics such as “pro-gun control,” “anti-regulation,” “Support Israel,” or “pro-Second
Amendment rights”
o documentary film or non-fiction books with a clear political opinion (e.g., No Safe
Spaces, Citizenfour)
o DOES NOT include firearm ownership, training, or hunting that is not explicitly about a political
opinion related to gun control or gun rights
• Voting activity: Content related to voting, voting choices, or beliefs about voting rights. For example:
o user-declared status as having voted or promoting voting (NOTE: may be co-labeled with
political affiliation or opinion)
o topics or attributes such as “get out the vote” or “I voted!”
NOTE: This category does not include fictional film or literary content about politics (e.g., All the President’s
Men, Vice)
15. Regulated products & services: Identifies content related to legally restricted or
culturally restricted products/services and/or products/services linked to damaging health
consequences.
• Alcoholic beverages: Identifies content related to beverages containing alcohol. For example:
o Products and services related to alcoholic beverages (e.g., Barefoot Pinot Grigio, Hendrick’s Gin,
wine clubs)
o Non-alcoholic beverages intended to mimic alcoholic beverages (non-alcoholic spirits, non-
alcoholic wine and beer, cocktail mixes, ‘virgin’ cocktails)
o Topics related to alcoholic beverages (e.g., oenology, distilling, designated driver)
o Businesses or entertainment related to alcoholic beverages (e.g., bars, wineries, brewpubs)
Excludes: incidental images of alcohol, such as in the background of an image, that are not part of
the core purpose
• High sugar, fat, and salt foods: Identifies advertised products (e.g., with commercial intent, not user
posts) related to processed foods linked to detrimental health outcomes, generally characterized by
high sugar, fat, and/or salt content.
o Ice cream, including milkshakes, frozen coffee drinks (e.g., Frappuccino, frappé), gelato, frozen
yogurt, and popsicles (e.g., Klondike bars, Ben & Jerry’s, McFlurry)
o Savory snacks, such as chips/crisps, crackers, or rice snacks (e.g., Pringles, Ritz crackers, Hot
Cheetos).
o Sweet snacks, such as prepared cookies, prepared cakes, pastries, sweetened breakfast cereals
and toaster pastries (e.g., Oreo, croissants, Froot Loops, Pop Tarts, bakery treats, etc.). Excludes
mixes and ingredients.
o Instant & fast foods, such as instant/powdered soups and noodles (e.g., Cup-o-Soup, instant
ramen), pre-prepared (i.e., not homemade, ingredients, or recipes) fast food pizza,
hamburgers/hot dogs, nuggets, and/or fries.
• Gambling & simulated gambling: any form of in-person or virtual gaming, betting for money (e.g.,
online slots, bookmaking and sports betting, casinos), or products linked to gambling (e.g., poker chips)
• Controlled substances: Identifies content related to prohibited or controlled substances, including
marijuana, opioids, etc.
• Nightlife: age-restricted night clubs, bars, cabarets, etc. (e.g., Karaoke clubs, “night club,” jazz or blues
clubs) Content should be explicitly about venues targeting adults.
o Excludes restaurants also offering live music.
• Tobacco & smoking: Identifies content related to tobacco products and smoking paraphernalia,
including cigars, cigarettes, chewing tobacco, pipes, and vaping products.
• Weapons: any type of real or toy weapon (e.g., swords, hunting knives, firearms and accessories -
vests, holsters, etc., fireworks, realistic toy firearms - including paintball guns and accessories, realistic
cosplay weapons), pointed or blade-based sporting goods (darts, archery arrows, fencing épées)
Excludes: unrealistic weapon-like toys (e.g., Nerf guns, unrealistic water guns like Super
Soakers, foam swords, etc.), kitchenware (e.g., paring knife, etc.), digital arts and video games
16. Sex life: Identifies content related to expressions of sexual practices, sexual activity and
dating.
• Adult products & services:
o user-declared or provided content related to sexual activity, nudity, etc.
o products and services related to sexual practices (e.g., sex work, sex toys, adult film or
magazines)
o culture or topics related to sexuality (e.g., Kama Sutra, human sexuality, sexuality studies,
gender studies, polyamory)
o public figures related to sexual expression or sex life (e.g., adult film actors)
• Dating: content related to dating. For example:
o dating services or apps
o dating or relationship advice
o Excludes reality or game shows about dating (e.g., Love Island, Married at First Sight, The
Bachelor, etc.)
o Excludes user declarations of partnership (e.g., "I went on a date last night!" or "Took my gf on
a romantic picnic")
• Sexual orientation: Identifies content related to patterns of sexual and/or romantic attraction. For
example:
o user-declared sexual orientation
o organizations or associations related to sexual orientation (e.g., The Trevor Project, GLAAD,
American Institute of Bisexuality)
o attributes related to sexual orientation (e.g., “gay,” “bisexual,” “asexual,” “heterosexual”)
o topics and causes related to sexual orientation (e.g., gay pride, marriage equality)
o public figures related to LGBTQIA+ activism (e.g., Harvey Milk), but *not* public figures who
identify as LGBTQIA+ without taking an explicitly activist role.
o documentary or non-fiction materials related to sexual orientation, but NOT fictional film or
literature featuring characters or plot related to sexual orientation.
• Sexual partners: user-declared information related to sexual partners, including identity, gender, etc.
17. Socioeconomic Status: Identifies content related to socioeconomic status, especially as it
relates to vulnerability based on factors related to income/assets, education level, and
employment.
• Education level
o Adult basic education & literacy: Identifies content related to the development of basic skills
and literacy in adult learners who may have experienced a disruption in primary or secondary
education.
▪ user-declared content about adult basic education or adult literacy
▪ high school dropout credit recovery services
▪ organizations and associations related to adult basic education or adult literacy (e.g.,
California Adult Education Program, literacy centers)
▪ products and services related to adult basic education or adult literacy (e.g., General
Educational Development [GED] preparation services)
▪ Does NOT include ESL courses or other enrichment/recreation courses for adults
▪ Does not refer to university, community college, or vocational education
o Post-secondary education: Content related to post-secondary education level (e.g., university
admissions, professional development courses, vocational colleges, college fraternities or
sororities, law schools)
▪ Includes college entrance examinations (e.g., Vestibular, SAT/ACT, etc.), even if
marketed to students who have not yet started college.
o Primary & secondary education: Content explicitly about pre-school/kindergarten through
high school education, specifically promotion or recognition of specific schools, tutoring
services, or educational methods.
o Excludes: fictional school content, unclear content possibly related to education (e.g., a
child writing at a table who may or may not be doing homework)
Note: requires a co-label with Age (Children, Adolescent, or both) and possibly Parents, if
parenthood is explicitly mentioned.
• Employment
o Labor union: Identifies content closely related to labor union or trade union membership. For
example:
▪ user-declared status as a member of a labor union
▪ businesses and associations related to union affiliation (e.g., American Federation of
Teachers, Alliance Police nationale, British Medical Association, Free Workers’ Union of
Germany)
▪ public figures related to union affiliation (e.g., César Chávez, Randi Weingarten, names
of union leaders)
▪ topics related to unions (e.g., workers’ rights, collective bargaining, unionization, union
dues, Right to Work)
▪ products related to union membership (e.g., union-branded merchandise)
o Note: label should not be applied to any recruitment or employment-related ads
• Financial information
o Low-income: Identifies content related to low-income or low-asset status (including debt)
and/or need for support from government or charitable sources. For example:
▪ user-declared status as low-income
▪ topics related to low-income status: short-sales, home foreclosure, food stamps
Ad elements examples:
Note: Not all ads will follow the same consistent format.
[optional] “scroll image” arrow for when more than one image is present in the ad and “more” button to
expand long captions
Examples of Webpages
o URL: Check the URL for specifics about the product or service search. In this case, we can see
that protein powder is the focus of the search
• Some webpages show professional profiles. To the best of your ability, label the page for the main
purpose of the professional service offered by the professional profile or contact. For example:
o A listing of orthopedic surgeons in a geographic area should be labeled with ‘Illness & injury’
because the main purpose of these professional listings is promoting services treating a specific
medical issue.
o A profile of a gynecologist, showing their photo, education, professional experience, and
specialties should be labeled with ‘Sexual & reproductive health’ because the main purpose is
promoting services relating to this specific branch of health care.
• News and research content
News and research content should be classified similarly to other webpage content. However, because
webpage content has such a huge range of news articles and research papers, here are some specific
guidelines for these scenarios.
o Health-related news: Health news and research can range from clear, consumer-facing advice
(“7 Ways to Lower Your Blood Pressure”) to highly-specific professional research (“Factor Levels
with Platelet Count in Colorectal Cancer: Clinical Evidence?”). Regardless of the type or style,
label health news for its main purpose and topic. For example:
▪ If it is about a medical condition, label with Illness & injury, even if the article seems
highly specific or research-based.
▪ If an article is about fitness, label as Fitness & self-care
o Political news: Like medical news and research, political news can range from clear political
opinions to complex scholarly research. To the best of your ability, evaluate the headline and
topic of the news article to determine if it has to do with
▪ a political issue (neutral or unbiased discussion)
▪ a political opinion (taking a side, but not explicitly discussing political party membership
or specific left-wing/right-wing affiliation)
▪ political affiliation (specifically related to one political party or faction)
o Crime & tragedy news: News about crime is not considered sensitive. News stories discussing
crimes committed by specific individuals do not relate to the sensitive category of ‘Criminal
record.’
Keywords
Some users will be asked to evaluate a keyword using the sensitive category labels.
• Read the keyword and decide whether it relates to any of the sensitive category labels. Be sure to
review the labels, as necessary.
• None apply label: use when the meaning of the keyword is clear, but it does not relate to any sensitive
category. For example:
o Keyword: Fun
o Keyword: Industry
• This keyword is not useful label: use when
a. there could be multiple meanings of a word that could affect its sensitivity. For example:
i. Keyword: Afghan (blanket? person from Afghanistan? hound?)
ii. Keyword: Diet (eating patterns? weight loss plan?)
iii. Keyword: Separation (relationship status? separation anxiety?)
iv. Keyword: Dates (fruit? dating?)
b. the context or meaning of the keyword is completely unclear and no assessment can be made.
For example:
Solution: Here, the keyword, ‘trending,’ is very general. It doesn’t have a clear connection with any sensitive
category in any context.
Examples of apps
Sometimes an app might not be available in your location. You’ll use other information in the UI to label the
app.
5. Make a label decision: correct label: Health > Medical conditions > Prescription/OTC drugs
6. Indicate if you were able to reach a decision on this job: correct response: Yes
7. Indicate how you reached your decision: in this case, the correct selection would be Text from UI and
URLs because we were able to gather information about the app from both links, the app category,
and the text description.
Concept workflow
Sample decisions
Description: An Italian soda is a soft drink made from carbonated water and simple syrup usually flavored
Decision: High sugar, fat, and salt foods
2. Concept: grasshopper
3. Concept: beer_mug
Description: none
Decision: Alcoholic beverages (Rationale: no side search necessary - this is clearly related to a sensitive data
category)
4. Concept: fios
Description: none
Decision: None apply (Rationale: a side search shows that “Verizon Fios, also marketed as Fios by Verizon, is a
bundled Internet access, telephone, and television service that operates over a fiber-optic communications
network with over 6.5 million customers in nine U.S. states.”)
5. Concept: raw_hashtag_ascidian
Description: none
Decision: None apply (Rationale: a side search for the concept portion of the hashtag, ‘ascidian’ shows
that “Ascidiacea, commonly known as the ascidians, tunicates, and sea squirts, is a polyphyletic class in the
subphylum Tunicata of sac-like marine invertebrate filter feeders.”)
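Preparing a concept for a side search, as in samples 3 and 5 above, can be sketched in Python. This is a hypothetical helper: the function name is an assumption, and treating underscores as spaces is inferred from the “beer_mug” sample rather than stated in the guidelines.

```python
HASHTAG_PREFIX = "raw_hashtag_"  # prefix seen in sample 5 above


def concept_search_term(concept):
    """Turn a concept identifier into a plain search phrase."""
    if concept.startswith(HASHTAG_PREFIX):
        # Keep only the concept portion of the hashtag.
        concept = concept[len(HASHTAG_PREFIX):]
    # Assumption: underscores stand in for spaces, as in "beer_mug".
    return concept.replace("_", " ")
```

With this sketch, “raw_hashtag_ascidian” becomes “ascidian” and “beer_mug” becomes “beer mug”; a plain concept such as “fios” is returned unchanged.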
Content can be text, image, or video. In this case, you will be categorizing Facebook Page Posts with text only.
Any personal information (e.g., names, phone numbers, etc.) will be removed from the post in the following
format: <redacted_data_type>
• Consider only the core purpose of the Facebook page post. What kind of business, product, or service
is this post intended to promote?
• In isolating the core purpose of the post, avoid labeling extra text that does not specifically relate to
the core purpose (e.g., labeling all duties in a job post for a nursing position)
__________________________________________________________________________________________
Approach
In evaluating the image or video documents, you should ask yourself two central questions:
1. What is the core purpose of this image or video?
2. Does the user reveal any sensitive information about themself in the image or video? Are restricted
products or services shown?
There may be tangential mentions of restricted topics, but if they do not indicate a core purpose about a restricted
topic they should not be labeled.
Considerations:
• Evaluate each job fully by watching the full video, examining all images in a series, and reading the
full caption.
• Any text and image/video should be evaluated together. Restricted topics could appear in either.
• Any user-provided emojis may be considered for context.
Workflow
1. Evaluate the job by examining all images and reading the full caption text.
For this job, we have plenty of information to make a decision. The images show a range of family pictures from
several decades. The core purpose of the text is clearly to mourn and remember a loved one. The caption reveals that
the writer has suffered the death of a beloved family member.
The restricted topic Tragedy & hardship > Personal loss is the correct label.
While several types of family relationship are mentioned, we don’t exactly know the nature of the family relationship,
so no ‘Family relationships’ category should be selected.
1. Indicate how you reached your decision.
_____________________________________________________________________________________________________________
3PD Guidelines
• JSON: key/value pairs separated by a colon; valid values in the key/value pair will generally be strings,
numbers, booleans, or ‘null’
Example: {"content_type": "vehicle",
"content_category": "suv_midsize",
"year": 2012}
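A minimal sketch of reading key/value pairs from data shaped like the example above, assuming the payload arrives as strictly valid JSON text (keys and string values in double quotes). The function name `key_value_pairs` is an assumption made for illustration; `json.loads` is the Python standard-library parser.

```python
import json

# The example payload written as strictly valid JSON text.
payload = '{"content_type": "vehicle", "content_category": "suv_midsize", "year": 2012}'


def key_value_pairs(json_text):
    """Parse JSON text and return (key, value, value type name) tuples."""
    data = json.loads(json_text)
    return [(key, value, type(value).__name__) for key, value in data.items()]


pairs = key_value_pairs(payload)
```

Listing the pairs this way makes it easy to scan each key and value for keywords; here the values are two strings and one number, none of which points to a restricted category.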
Approach
Objective
Evaluate each input for clear signals that the data is related to restricted topic categories.
Considerations
To evaluate each of the three inputs, you should ask yourself:
1. Are there keywords embedded in the URL path, JSON data, or Query that indicate that this data is
related to a restricted topic category?
2. Is there enough context to make a clear judgment about a related restricted category?
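To look for embedded keywords, it can help to break a URL into its path segments and query parameters first. The sketch below uses Python's standard `urllib.parse` module; the function name and the sample URL are hypothetical and invented for illustration only.

```python
from urllib.parse import urlsplit, parse_qs


def url_keywords(url):
    """Split a URL into path segments and query parameters for inspection."""
    parts = urlsplit(url)
    path_segments = [segment for segment in parts.path.split("/") if segment]
    query_params = parse_qs(parts.query)  # maps each key to a list of values
    return path_segments, query_params


# Hypothetical URL, invented for illustration only.
segments, params = url_keywords("https://shop.example.com/used-cars/suv?make=toyota&year=2012")
```

Each path segment and each query value can then be checked against the restricted topic categories one at a time.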
Workflow
General Guidelines
There are two high-level activities for handling the 3PD data. Specifically:
1. The widget has the entire sample (url+query+json) selected by default, and raters should select all
categories related to the entire sample.
2. Raters should also select specific segments in URL, Query, or JSON that are related to the restricted
categories. For this part, the segments selected should be as specific as possible.
• The “Entire Sample” segment is intended to be a required segment. If there are also specific
keywords, raters should first assign a category to the entire sample and then select specific
keywords, as necessary. The entire sample might be multilabel, so a keyword might have different
assigned categories than those for the entire sample. For instance, the sample may be related to
health and politics, but a specific keyword could be health only. In other words, raters may need
to assign more than one category to the entire sample.
• When selecting keywords from the URL, please do not include irrelevant punctuation such as
double quotes or semicolons.
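The relationship between the required “Entire Sample” labels and the optional keyword segments can be sketched as a small data structure. This is an illustrative Python sketch, not the widget's real format: the class, field, and function names are assumptions, and the keyword “vaccine” is a hypothetical selection.

```python
from dataclasses import dataclass, field


def clean_segment(text):
    """Trim surrounding whitespace, double quotes, and semicolons from a selection."""
    return text.strip(' \t";')


@dataclass
class SampleAnnotation:
    # Categories for the required "Entire Sample" segment (may be more than one).
    entire_sample: list
    # Optional specific segments: cleaned keyword -> categories for that keyword.
    segments: dict = field(default_factory=dict)

    def add_segment(self, raw_text, categories):
        self.segments[clean_segment(raw_text)] = list(categories)


# The entire sample is multilabel, while the keyword carries only one of the labels.
annotation = SampleAnnotation(
    entire_sample=["Vaccines & vaccine status", "Political opinion"]
)
annotation.add_segment('"vaccine";', ["Vaccines & vaccine status"])
```

Keeping the entire-sample labels separate from the per-keyword labels mirrors the point above that a keyword can carry fewer categories than the sample as a whole.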
Before labeling
Do an initial evaluation of the job by examining all three input types (URL, JSON, QUERY)
URL PATH
QUERY
• For this job, we have plenty of information to make a decision. The URL, JSON, and QUERY are in
English, and there is enough text to give context to the data.
Labeling
If you feel that you are able to make a decision on the job, assign all restricted topic categories that apply to the job.
If necessary, do a side search to understand the url path. In this case, a search shows that snapdeal.com is an India-
based e-commerce site.
Assigning a label
1. Highlight any segments of the inputs that indicate that this data is related to a restricted topic
category.
3. Assign the appropriate restricted categories to the segment(s) you have selected.
a. In this case, “content_ids” is ONLY AN EXAMPLE to show the workflow.
b. Based on the URL and the JSON key/value pairs, we can clearly see that this data is related to a
used car sale, which is not a restricted category. No segments would need to be selected for
this job.
_______________________________________________________________________________________________
The jobs in the Facebook Comments section are made up of comments entered by Facebook members.
Approach
Objective
Evaluate each comment for clear signals that the data is related to restricted topic categories.
Considerations
To evaluate the job, ask yourself whether there is enough context to make a clear judgment about a related restricted
category.
There may be considerable noise and extra text around relevant keywords or values. Be sure to evaluate each
comment carefully.
Workflow
General Guidelines:
For these Facebook comments documents, raters will not be calling out specific keywords but rather making a call on
the entire comment.
Considerations
If you feel that you are able to make a decision on the job, assign all restricted topic categories that apply to the job.
In the upper right panel, indicate whether you were able to make a decision on the job and what input helped you
make a decision on the job.
Submit.
The following two screen shots provide a general idea of the Comments work area.
_______________________________________________________________________________________________