Guidelines: Location Relevance

Updated 2023-08-30
If you have read previous versions of these guidelines, you should read Section 6 (“Version History”) to
learn what has changed. That way, you will not have to re-read the entire guideline. Most recent updates
are highlighted.

Training examples are in English for cross-language consistency and standardization. However, spam
score will also be based on HITs in your language / country.

Warning: a tiny number of queries in this HIT app may have adult content. When you see such queries,
we ask that you classify them as “Cannot Judge”. If you are not an adult, and/or are not willing to judge
such queries, you should stop now.

1. HIT app overview


In this HIT app, we will show you a user’s internet search query. Location words in the query will be
highlighted. On the upper right hand side, a map will show user location information and any selected
results. If the map does not load automatically, doing a map “pan” or “zoom” will usually fix it.

N.B. HIT app is best viewed with browser set to low zoom level, e.g. 50%.

The query is shown on the top left. Please use the research page to learn more about it. By default, the
highlighted location words in the query will be automatically entered into the research pane for
whichever search engine you prefer (you can change your preference in the drop-down menu). If the
research pane says it cannot show results due to “unusual traffic”, try a different search engine for a
while (this may be temporary).

Classified as Microsoft Confidential


You can also select the option to “[S]earch entire query (not just highlighted tokens)”. Finally, you can
also modify the query directly in the research pane or in the full size windows.

N.B. This is an early version of this HIT app. If you experience any bugs (such as inability to click a button
or see results), they can usually be resolved by refreshing the web page. We will make sure any such bugs
are fixed ASAP. Please feel free to report them.

The HIT app will also show a list of results further below. You should indicate any results that are an
Excellent match to the locations and the search intent inherent in the query. Clicking a result’s Excellent
button will add it to the map.

Results most relevant to the highlighted words are usually in the Primary list. However, sometimes the
best results will be in the Secondary list. If you find Excellent results in the Primary list, you will not need
to look in the Secondary list. In fact, you can only select results from one of the lists. But to do well in
this HIT app, you’ll need to consider both lists for some queries.

Choose the best result(s), or indicate that there were No Excellent Results. Then give a reason for your
judgment (options will depend on your judgment), and enter an optional comment. Or, if you cannot
judge the result for any reason (e.g. broken HIT app, in query or results, no location terms in the query,
extreme adult themes, or locations that are not actually places), select Cannot Judge and enter a
mandatory explanatory comment. Comments should be in English (as much as possible).

Do NOT select “Cannot Judge” if the result is in a foreign language. Please use the “Translate” research
option to translate the results to your preferred language. If you need to translate multiple location
results, it may be easier to set your default search engine to “Translate” under “Auto load research page
in right pane” and then utilize the search buttons next to each location result.

For example, the query “Weather in Seattle WA” indicates the user is looking for weather results for the
“Seattle” area. If a result of “Seattle, King County, Washington, USA” is shown below, you should click its
Excellent button. In the options below, choose “No Problems”, and click Submit.

If, however, the only results shown are “Bellevue, King County, Washington, USA”, “King County,
Washington, USA”, “Washington, USA”, and “Seattle, Zapopan, Jalisco, Mexico”, then click No Excellent
Results. In the options below, indicate the reason for your judgment (e.g. Irrelevant Results), provide an
optional comment, and click Submit.

- Step 1: Understand the location words in the query.
- Step 2: Research the query and results.
- Step 3: Indicate which results in Primary/Secondary lists are most relevant and have correct
geo-chains (see definition below).
- Step 4: Indicate if there were any problems with the query or results, or if there were no
problems at all. Please leave a comment (optional unless reporting Cannot Judge).
- Step 5: Click Submit to enter your judgment.

Research in this HIT app is optimized for keyboard usage. Future versions of the HIT app will be fully
optimized for keyboard usage for efficiency. There are also buttons for special cases: No Excellent Results
and Cannot Judge. Selecting either of these will reset the selection of results. You must either select
results, or choose No Excellent Results, or choose Cannot Judge.

In Step 4, you will be asked to indicate any Problems that you experienced. The choices will depend on
whether you selected one or more results, or if you chose No Excellent Results or Cannot Judge.
- No problems: you saw no problems in query or results.
- No relevant results: None of the listed results are relevant to the query location words.
- Incomplete geo-chain: Geo-chain of one or more results is missing required information. Do not
select if the list of results / geo-chains is incomplete. For that, choose No relevant results.
- Incorrect Entity Type: the entity type of one or more results is incorrect.
- Spelling errors in geo-chain: Spelling errors in the geo-chain of one or more results.
- Incorrect words selected in query: the words highlighted in query are incorrect or incomplete.
- Spelling error in the highlighted location in query: Please ignore non-highlighted parts of query
to consider only highlighted location words for spelling errors.

Finally, please provide a comment to explain your rating and click Submit to complete the HIT.

2. Definitions
Geo-entity
A specific location which can be uniquely described by a geo-chain, a geo-entity type, and either a single
latitude / longitude point or a larger geographic area. A geo-entity could be small (e.g. a single address, a
building, a neighborhood), or very large (an ocean or a continent). For example, <Empire State Building,
20 W 34th St, New York, NY 10001, USA>, <King County, Washington, USA, North America>, and <Pacific
Ocean> are all valid geo-entities.

Geo-chain
Geo-entities have hierarchical relationships with other geo-entities (e.g. city → state → country). These
relationships represent “belongs-to”, or parent-child, relationships. These form multiple connections
between the most specific and the least specific parts, and thus are known as geo-chains. Unlike what is
used in a postal address, geo-chains often include rich detail, including multiple administrative levels
(e.g. district, county / prefecture / region, state/province, etc.), and continent (e.g. South America, Asia).
N.B. Australia is both a country and a continent, so both may appear.
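
As a rough illustration of the two definitions above, a geo-entity can be modeled as a record that holds its geo-chain (most specific part first), its entity type, and a coordinate. The Python sketch below is not part of the HIT app: the class name GeoEntity, its fields, and the helper is_ancestor_of are hypothetical names chosen for illustration. It assumes that one entity "contains" another when its geo-chain is a proper suffix of the other's geo-chain.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class GeoEntity:
    """Hypothetical record for one location result (illustration only)."""
    chain: Tuple[str, ...]                      # geo-chain, most specific part first
    entity_type: str                            # e.g. "Populated Place", "Admin_2"
    lat_long: Optional[Tuple[float, float]] = None


def is_ancestor_of(parent: GeoEntity, child: GeoEntity) -> bool:
    """True if parent's chain is a proper suffix of child's chain."""
    n = len(parent.chain)
    return n < len(child.chain) and child.chain[-n:] == parent.chain


seattle = GeoEntity(("Seattle", "King County", "Washington", "USA"), "Populated Place")
bellevue = GeoEntity(("Bellevue", "King County", "Washington", "USA"), "Populated Place")
king_county = GeoEntity(("King County", "Washington", "USA"), "Admin_2")
```

Under this assumption, King County is an ancestor of both Seattle and Bellevue, while Seattle and Bellevue are siblings with no parent-child relationship between them.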

Geo-entity type
Describes the kind of Geo-entity.

Entity Type | Description / Example Geo-chains
Street Address | Example: 123 Main Street, Seattle, Washington, United States, North America, etc.
Neighborhood | Refers to a neighborhood or borough. Formal boundaries may not exist, because they usually grow organically and may not be organized by any government. Example: Hollywood, Los Angeles, Los Angeles County, California, United States, North America.
Populated Place | Refers to a city, town, village, or other populated place. Usually has formal boundaries as an official entity organized by its local government. Example: Los Angeles, Los Angeles County, California, United States, North America.
Admin_3 | Smallest level of administrative subdivision. Child of Admin_2. Example: Jinhae-gu, Changwon, South Gyeongsang, South Korea, Asia.
Admin_2 | Larger level of administrative subdivision than Admin_3, but smaller than Admin_1. Usually refers to Districts, Counties, or Boroughs. Examples: King County, Washington, United States, North America, or Greater Vancouver, British Columbia, Canada, North America.
Admin_1 | Largest level of administrative subdivision, but smaller than Country. Parent of Admin_2. Usually refers to either a state or province. Examples: Washington, United States, North America or England, United Kingdom, Europe.
Postcode | Examples: NN1 3BQ, Northampton, Northamptonshire, England, United Kingdom, Europe, or 98004, Bellevue, King County, Washington, United States, North America, etc.
Country | Germany, Japan, USA, Australia (but also a continent), etc.
Continent | Asia, Europe, Antarctica, Australia (but also a country), etc.
POI | Point of Interest. Well known location, landmark, popular monument, university, airport, popular building, etc. which is commonly known to the public. Also: parks, national forests, islands, lakes, rivers, and other natural features. Examples: Empire State Building, Oxford University, Great Wall of China, Gatwick Airport, Central Park (NY), Yellowstone National Park, Vancouver Island, Lake Superior, Nile River, Sahara Desert, Amazon Rainforest.
Other | Oceans, regions. Examples: Pacific Ocean, Middle East, Central Asia, etc.

Point of Interest
Entity with important cultural, historical, tourist, or other major public interest. Includes well-known
locations such as landmarks, monuments, airports, tourist sites, and other places commonly known to
the public. Examples: Eiffel Tower, Harvard University, Geneva Airport, Taj Mahal, Roman Coliseum,
Louvre Museum, Grand Central Station, Wembley Stadium, Six Flags Amusement Park, Rockefeller
Center.

In this HIT app, Point of Interest also includes Nature AREAS such as parks (national, state, city, regional
including man-made parks e.g. Central Park in New York City), nature preserves, wildlife refuges, and
forests (National Forest, State Forest, etc.), and other Natural geographic or geological features
including mountains, volcanoes, rivers, lakes, oceans, islands, canyons, gorges, and deserts.

Dominant search intent
The overwhelmingly primary meaning of a location word, which can often be determined with research.
Be sure to consider the full query as context. It usually reflects popularity and/or population, but could
also reflect cultural, historical, economic, or other importance. A location word put into a search engine
that returns only one location reveals overwhelming dominant search intent for that word. This is true
for world-famous cities like Paris, France, as well as for smaller cities like Denver, Colorado. It is also true
for other areas, such as neighborhoods or POIs. For example, a query for Qishan / 旗山 could be for a
district, city, national park, mountain, railway station, or others. Research shows that dominant search
intent is for Qishan District (旗山区) (aka Cishan District) in Taiwan. A query for “Australia” could be for
either the Country or the Continent, but dominant search intent is for the Country.

User Location
The location (latitude, longitude) of the user at the time the user made the query. It can be used to
determine the distance from the user to different possible results. Many user locations will also have a
circle around them indicating their accuracy. The smaller the circle is, the higher the accuracy, and the
more precisely it can be used to disambiguate locations.
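
The HIT app already displays a User Distance for each result, so judges do not need to compute it. For readers curious how such a distance could be derived from two latitude / longitude points, here is a minimal Python sketch using the haversine great-circle formula. The function names are hypothetical, the Earth radius is a rounded mean value, and nothing here is claimed to be the method the HIT app itself uses.

```python
import math


def distance_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres (haversine)."""
    earth_radius_km = 6371.0  # mean Earth radius; an approximation
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def rank_by_distance(user, results):
    """Sort (name, lat, lon) results from nearest to farthest from the user."""
    return sorted(results, key=lambda r: distance_km(user[0], user[1], r[1], r[2]))
```

A ranking like this only orders candidates by proximity. How much weight proximity deserves still depends on the accuracy circle and on dominant search intent.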

Ambiguous Location
A location that is not uniquely defined, often because it is incompletely described. For example, a
location of “Springfield USA” could be <Springfield, OR, USA>, <Springfield, MA, USA>, <Springfield, IL,
USA>, or possibly even <Springfield Township, MI, USA>, among many, many others.

Unambiguous Location
A location that is uniquely defined, often because it is completely described. For example, a location of
“Springfield, MA” could only be for <Springfield, MA, USA>. The location of “Tlaquepaque” is not fully
described, but there is only one Tlaquepaque in the world: <Tlaquepaque, Jalisco, Mexico>. Thus, it is also
unambiguous.

3. Judging results

3.1 Validate query and listed results


1. BEFORE selecting any results as Excellent, make sure you can judge the query.
   a) If the query has illicit or adult themes, select Cannot Judge.
   b) If there are multiple location words that ARE NOT RELATED (i.e. no parent-child
      relationship), select Cannot Judge.
      i. “San Francisco and Los Angeles”: multiple locations → Cannot Judge.
      ii. “Is Canada bigger than the USA?”: multiple locations → Cannot Judge.
      iii. “Restaurants in San Francisco, CA”: NOT multiple locations → Judge as normal.
   c) If the query has no valid location terms (in spite of highlighted words), select Cannot Judge.
   d) If research shows, from the top ranked search results, that the location word is part of a
      business name, person name, or some other entity, e.g. “Kentucky Fried Chicken”,
      select Cannot Judge.
   e) Do not penalize a query for spelling mistakes, or use of aliases, abbreviations, or alternate
      names. (Spelling mistakes should however be reported in Step 4.)
   f) Ignore the street address (street name / number) portion of any query. Focus on the larger
      locations in the query. For example, in query “123 Main Street, San Francisco”, you should
      focus only on the city portion of “San Francisco”. For query “123 Main Street 94107”, you
      should focus only on the ZIP/postal code portion of “94107”. For query “123 Main Street”,
      there is no larger location, so select Cannot Judge.
2. BEFORE selecting any results as Excellent, try to understand the search intent as expressed in
the query text. Use common sense. Users make mistakes, can be confused, and are not always
clear or correct when typing search queries (and some search queries are spoken, leading to
more errors). Therefore, try to understand the user’s true intent.
3. BEFORE selecting any results as Excellent, research to see if the best possible results are listed in
the Primary/Secondary result lists. If intent is unclear, check “[S]earch entire query (not just
highlighted tokens)” (or press “S” key). Consider user location, dominant search intent, and
geo-entity type. Does research reveal that none of the results in the Primary/Secondary results lists
are truly Excellent? If so, select No Excellent Results button.

3.2 Determine intended location


The user’s intended location for a query is the location result that is the most relevant match to the
user’s intent for the query.

Always start by looking at the research results. By default, the highlighted location words in the query
will be automatically entered into the research pane for whichever search engine you prefer (you can
change your preference in the drop-down menu). If intent is unclear, check “[S]earch entire query (not
just highlighted tokens)” (or press “S” key). Finally, you can also modify the query directly in the
research pane or in the full size windows.

The intended location is the most specific location entity in the query, within reason. For query “SOHO
New York”, intended location is the Neighborhood of SOHO in the city of New York, and not the larger
city. But for query “Hudson Furniture in SOHO New York”, the intended location is still SOHO in New
York City, not the more specific furniture store (a store is not considered a “location” in this HIT app,
unless it is very famous and is thus a POI).

Also, some “spatial words” (e.g. in, at, near etc.) may be highlighted to help with context setting. These
words are fine in the query but they are NOT expected to be present in results. So while judging, please
don’t penalize results for not having spatial words or for the query having these words.

If the query is for a town and a postal code, the most specific part will be the one that is contained by
the other (in some cases, the postal code is larger than the town, yet sometimes the opposite is true).

If the location is unambiguous, the intended location is almost always very clear. If it is ambiguous, you
may need to consider dominant search intent, user location (indicating distance of result to user), geo-
entity type, and perhaps the context of the query.

For example, the location word “Paris” almost always refers to <Paris, France> and not to <Paris, Texas,
USA>. A search for the available location word, e.g. “Paris”, on a search engine will reveal overwhelming
dominant search intent for the former.

However, if the user location was in or very near Paris, Texas at the time of the query, then “Paris” is
more likely to refer to Paris, Texas. Even so, Paris, France has overwhelming dominant search intent,
so a query for “Paris” when the user is in or near Paris, Texas could still be for either <Paris, Texas,
USA> or <Paris, France>. Both would be valid.

This is also true for less famous locations than Paris. Research shows that dominant search intent for
location word “Denver” is for <Denver, Colorado, USA>. But if user location is in or near <Denver,
Pennsylvania, USA> (especially if the UL accuracy circle is small), then both would be valid. Same if user
location is in or near <Denver, Iowa, USA>.

Be sure to consider context, including other information in the query. For example, consider query “Phat
Phil's BBQ Paris”. The location word is Paris. Normally, Paris, France has globally dominant intent.
However, “Phat Phil's BBQ” is a restaurant in Paris, Texas that has no equivalent in Paris, France.
Therefore, regardless of user location, the dominant intent of “Paris” is for Paris, Texas.

When dominant search intent does not exist, user location can still be used to disambiguate between
ambiguous locations, especially if it is much closer to one than others (and especially if the UL accuracy
circle is small). As described above in the definition of Ambiguous Location, the location word
“Springfield” might refer to <Springfield, OR, USA>, <Springfield, MA, USA>, <Springfield, IL, USA>, or
many other places by that name. But if the user location is in Oregon, it most likely refers to
<Springfield, OR, USA>. If it is roughly equally close to more than one, then it does not disambiguate
between those entities, though it may rule out others. Please select equally Excellent results when there
is no dominant intent and user location doesn’t help disambiguate.

Similarly, for users in the USA or Canada, location word “London” probably refers to <London, UK>,
<London, Ontario, Canada>, or <London, KY, USA>. If user location is in or near London, Ontario, then
<London, Ontario, Canada> is the most relevant result (especially if the UL accuracy circle is small). If
user location is in New York City, then user location does not disambiguate as well. Although <London,
Ontario, Canada> and <London, KY> would still be closer to the user, global dominant search intent for
<London, UK> is more important since the alternatives are not very close.
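
The interplay between dominant search intent and user location described above can be sketched as a small decision function. This is only an illustration in Python under stated assumptions: the function name, the 50-mile cutoff for "in or very near", and the 1.5x ratio for "roughly equally close" are all hypothetical. The guidelines give no numeric thresholds, and judges should also weigh the accuracy circle and the query context.

```python
from typing import Dict, List, Optional


def plausible_locations(
    distances_miles: Dict[str, float],
    dominant: Optional[str] = None,
    near_miles: float = 50.0,   # assumed cutoff for "in or very near"
    tie_ratio: float = 1.5,     # assumed cutoff for "roughly equally close"
) -> List[str]:
    """Return the candidates a judge might treat as the intended location."""
    if dominant is not None:
        # The dominant location is always valid; a candidate that the user
        # is in or very near is valid as well.
        nearby = [c for c, d in distances_miles.items()
                  if d <= near_miles and c != dominant]
        return [dominant] + sorted(nearby)
    # No dominant intent: keep the closest candidate plus any candidate
    # that is roughly as close, since distance cannot separate them.
    closest = min(distances_miles.values())
    limit = max(closest, 1.0) * tie_ratio
    return sorted(c for c, d in distances_miles.items() if d <= limit)
```

For a user 25 miles from Paris, Texas, with Paris, France as the dominant location, the function returns both. With no dominant location and one candidate far closer than the rest, it returns only that candidate.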

Some results may seem nearly identical, though they have different geo-entity types. In general, favor
Populated Place over other geo-entity types. For example, if location word is “New Rochelle NY” and
results include a POI (train station) and a Populated Place (city), pick only the Populated Place.

 | Result #1 | Result #2
Geo-chain | New Rochelle, Westchester County, New York | New Rochelle, New Rochelle, Westchester County, New York
EntityType | Populated Place | POI

However, in many cases, you should NOT favor Populated Place over other geo-entity types. Instead,
you may need to consider a combination of dominant search intent, user location, geo-entity type,
context of the query, or other factors to determine the likely intended location. In other cases, there
may simply be multiple reasonable interpretations, even when weighing all these aspects.

For example, in query for “Ohio”, dominant search intent would be for Admin_1 (USA state of Ohio)
rather than the small town of Ohio – a Populated Place which is located within the Admin_1 state of
Ohio.

Populated Place or POI (physical location)?


Many Populated Place entities have the same name as their physical/geologic location, especially for
islands (POI).

For example, a query for “Key West” could be either for the island of Key West (POI) or the city of Key
West (Populated Place), but the overwhelming popular intent would be for the city. This can be
confirmed by looking at the search results, which prominently feature the city.

An interesting variation: “Manhattan” refers formally to the island of Manhattan, but popular use of
that name refers to the most famous borough of New York City. Therefore, intended location would be
the Populated Place (New York) rather than the POI (island of Manhattan).

By contrast, a query for “Long Island” overwhelmingly refers to the large island of Long Island (POI) in
New York state and not to Long Island City (Populated Place), which is a relatively small part of the
island.

Query “manzanillo limon” could be a search for:


- Gandoca Manzanillo National Wildlife Refuge (POI) in Limón Province, Costa Rica.
- Village of Manzanillo (Populated Place) next to the Refuge
- The nearby geological feature (POI) named Punta Manzanillo.
Considering research results, the village is the dominant search intent. Therefore, intended location
would be the village of Manzanillo (Populated Place).

Country or POI (physical location)?

You may also need to weigh various factors when deciding intended location for queries that could be
answered with either a Country or an island (POI), for example Iceland, Greenland, and Madagascar.
Unless dominant search intent indicates otherwise, you should choose the Country.

Populated Place, Country, or POI (physical location)?


Consider the extreme case of Singapore, which is a set of 63 islands, a city, and a nation / country. Most
queries would be for the city (Populated Place), but it depends on context. If query is for a hotel, a
business, or a tourist attraction, then intended location is the city (Populated Place). If query is for the
currency, then intended location is for the Country (which controls it); if for the Prime Minister,
intended location is for the Country (national government). But for query “Singapore biggest island”,
intended location would be the physical island (POI) of Pulau Ujong within Singapore.

3.3 Consider geo-chain quality


Once you know the intended location, you may still need to judge the quality of the best result’s geo-
chain, or between multiple geo-chains.

If the entity type is not listed correctly, the geo-chain should NOT be marked as EXCELLENT. For
example:
- Washington, United States, North America: The entity type should be Admin_1. If not, it should NOT
be selected as the EXCELLENT location.
- Madhapur, Hyderabad, Andhra Pradesh, India, Asia: The entity type should be Populated Place.
If not, it should NOT be selected as the EXCELLENT location.

Some of these geo-chains may look very similar. In such cases, zoom in to learn more about the results
to determine which one is most relevant to the location words highlighted in the query.

Consider these two results and their geo-chains that appear to be the same:
 | Result #1 | Result #2
Geo-chain | Los Angeles County, California, United States, North America | Los Angeles, Los Angeles County, California, United States, North America
EntityType | Admin_2 | Populated Place
Lat-Long | 34.35885, -118.21705 | 34.05349, -118.24532

On closer inspection, one is an “Admin_2” (District / County), while the other is a Populated Place (City)
contained within the given Admin_2.

Since an Admin_2 geo-entity is slightly less specific than a Populated Place, judges should
understand the query context and base their selection on the location intent expressed in the query.
For example, if the query were “realtors Los Angeles”, the correct result would be the Populated Place
(Result #2). But for the query “realtors Los Angeles County”, the Admin_2 geo-entity (Result #1)
would be correct.

Sometimes, a geo-chain may have duplicate / redundant information. These results should NOT be
selected as Excellent. For example:
- <Seattle, Seattle, Seattle, WA, USA>. Seattle is the name of a city (Seattle, WA), but there
are no counties or states called Seattle, so it is redundant information.
- On the other hand, take the case of <New York, New York, USA>. In this case, “New York” is the
name of both the city and the state. The first “New York” thus refers to the city and the
second “New York” refers to the state. Therefore, it is NOT redundant information.
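
Spotting repeated names in a geo-chain can be sketched with a simple counter. The Python helper below (a hypothetical name, not part of the HIT app) only surfaces names that occur more than once. It cannot decide whether a repeat is redundant: as the two cases above show, that still requires research into whether the name genuinely exists at more than one level.

```python
from collections import Counter
from typing import Dict, Sequence


def repeated_names(chain: Sequence[str]) -> Dict[str, int]:
    """Names that occur more than once in a geo-chain, with their counts."""
    counts = Counter(name.strip().lower() for name in chain)
    return {name: n for name, n in counts.items() if n > 1}
```

For <Seattle, Seattle, Seattle, WA, USA> this reports "seattle" three times, and for <New York, New York, USA> it reports "new york" twice. Only the first turns out to be redundant once the levels are checked.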

Many times you may need to consider a combination of the cases above, using entity types and query
context to make the correct decision. For example, consider the query “bernat pop yarn dubai” with
results as a Populated Place (Dubai) and its parent administrative entity (Dubai, the Emirate). Query
context suggests that user is searching to buy a product within a city, and NOT within the Emirate /
Admin_1. Hence correct answer is the Populated Place “Dubai, Dubai, United Arab Emirates, Middle
East, Asia”.

 | Result #1 | Result #2
Geo-chain | Dubai, Dubai, United Arab Emirates, Middle East, Asia | Dubai, United Arab Emirates, Middle East, Asia
EntityType | Populated Place | Admin_1

If none of the listed results have correct geo-chains, then do not select any as Excellent. Instead, choose
No Excellent Results.

3.4 Summary of an Excellent result


To be chosen as Excellent:

1. Location result must be highly relevant to the highlighted location terms in the query.
2. There is no better location result than this result (either in the list or in research).
3. There can be multiple location results rated as “Excellent” for a query.

4. If web searches on Bing, Google, and Wikipedia return references only to one geo-entity for all
their top ranked results, and other geo-entities are never or very rarely referenced, then that
should be the ONLY “Excellent” result.
5. If location terms are ambiguous, select the result closest to the user location (especially if the
UL accuracy circle is small). However, if a location exists with dominant search intent (especially
for a globally significant location), then that result should be selected instead (or as well).
6. Verify the entity type for the Excellent geo-chain(s). Any mismatch of entity type should NOT be
selected.
7. The geo-chain must be correct (complete, correctly spelled, and no false details).

3.5 Make judgment

Select as EXCELLENT the result(s) that:

- Are the most relevant, considering:
   o Text match to query
   o User location (if applicable)
   o Dominant search intent of query
   o Geo-entity type match to intent
- Have the correct geo-entity type
- Have a correct geo-chain

- If multiple results are equally EXCELLENT and relevant, and their geo-chains are correct, then
you should select ALL of those.

- If multiple results are equally EXCELLENT and relevant, but only some of their geo-chains are
correct, then you should select only those results that are both relevant and with correct
geo-chains.

- But if no results are shown, or there are not any EXCELLENT results with correct geo-chains, do
NOT select any result. Instead choose No Excellent Results.
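
The selection rule in this section can be summarized as a filter over the listed results. The Python sketch below assumes the judge has already recorded three findings per result. The record type JudgedResult, its field names, and the function make_judgment are hypothetical and exist only to illustrate the rule.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class JudgedResult:
    """Hypothetical record of a judge's findings for one listed result."""
    geo_chain: str
    most_relevant: bool         # text match, user location, intent, type match
    entity_type_correct: bool
    geo_chain_correct: bool     # complete, correctly spelled, no false details


def make_judgment(results: List[JudgedResult]) -> Union[List[str], str]:
    """Return the geo-chains to mark Excellent, or "No Excellent Results"."""
    excellent = [r.geo_chain for r in results
                 if r.most_relevant and r.entity_type_correct and r.geo_chain_correct]
    return excellent if excellent else "No Excellent Results"
```

Every result that passes all three checks is returned, which matches the instruction to select ALL equally Excellent results. An empty list of results, or a list where nothing passes, yields No Excellent Results.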

4. Examples
Example 1. To be or not to be
Query: to be or not to be
User Location: N/A
Judgment: Cannot Judge
This query clearly does not contain a location intent. The highlighted location term “or” could be an
abbreviation of “Oregon”, but not in this context. There are no location terms in this query. Therefore,
select Cannot Judge.

Example 2. www.hyderabad-house.com
Query: www hyderabad house com
User Location: N/A
Judgment: Cannot Judge
This query is looking for the website of Hyderabad House, which is a chain of food outlets in India. The
user has not specified “where” s/he wants to search. The query does contain the word “Hyderabad”, but
it is part of the URL and not a location of the user’s intent. This query clearly does not contain a location
intent. Therefore, select Cannot Judge.

Example 3. Seattle’s Best Coffee
Query: Seattle’s Best Coffee
User Location: Any
Judgment: Cannot Judge
This query is looking for the coffee chain “Seattle’s Best Coffee”. The user has not specified “where”
s/he wants to search. The query does contain the word “Seattle”, but it is part of the name of the
well-known coffee shop and not a location of the user’s intent. Therefore, select Cannot Judge.

Example 4. Boots UK
Query: Boots UK
User Location: Oberursel, Hessen, Germany
Location Results:
Entity Chain | Entity Type | User Distance
Vereinigtes Königreich | Country | 1300 km
Uttarakhand, Indien, Asien | Admin_1 | 8157 km
Boot Cove, Lubec, Maine, USA | POI | 3413 km
Step 4 Problems: No problems
The user is looking for “Boots” chain stores in the United Kingdom (UK). The user is in Oberursel,
Germany, and should expect a single result: the country of United Kingdom. For this example, please
pretend you are judging this as part of an English-speaking judge pool (even if you are actually in a
different language judge pool). As an English-speaking judge, you would be able to read the query
because it is in English. However, you would need to translate these results because they are in a foreign
language (German).

Example 5. Shopping in Tokyo
Query: Shopping in Tkyo
User Location: Any
Location Results:
Entity Chain | Entity Type | User Distance
Tokyo, Tokyo, Japan, Asia | Populated Place | XX
Tokyo, Japan, Asia | Admin_1 | XX
Tokyo Internation Airport, Japan, Asia | POI | XX
Step 4 Problems: Spelling errors in query, Spelling errors in geo-chain
The user is looking for shopping areas in Tokyo (misspelled). Based on the highlighted location term, the
geo-chain corresponding to the city of Tokyo will be displayed, which is within the Japanese prefecture
also named Tokyo. The judge should study the geo-entity chains for correctness and completeness and
then choose the first one as Excellent. Step 4 Problems: Because there was a spelling error in the query,
select Spelling errors in query. Because there is a spelling error in one result (“Internation” should be
“International”), select Spelling errors in geo-chain.

Example 6. Map of Jaipur
Query: Map of Jaipur
User Location: Any
Location Results:
Entity Chain | Entity Type | User Distance
Jaipur, Jaipur, Rajasthan, India, Asia | Populated Place | XX
Jaipur, Rajasthan, India, Asia | Admin_2 | XX
Jaipur, Puruliya, West Bengal, India, Asia | Populated Place | XX
Step 4 Problems: No Problems
In this query, the user is looking for the map of a specific location: Jaipur. Based on the highlighted
location term, you will see a list of probable geo-chains for “Jaipur”. First research to learn that Jaipur,
Rajasthan is a very famous city in India. Select the first result ONLY, which has globally dominant search
intent of Populated Place. The second result is for its parent administrative area and hence not relevant,
and the third result is nowhere near as well known. Step 4 Problems: No Problems.

Example 7. Pizza in Redmond
Query: Pizza in Redmond
User Location: New York, NY, USA
Location Results:
Entity Chain | Entity Type | User Distance
Redmond, King, Washington, USA, North America | Populated Place | 2846.38 miles
Redmond, Deschutes, Oregon, USA, North America | Populated Place | 2795.01 miles
Redmond, Mason, West Virginia, USA, North America | Populated Place | 480.51 miles
Step 4 Problems: No Problems
In this query, the user is looking for a pizza place in Redmond. Based on the highlighted location term,
multiple geo-entity chains appear. Redmond, WA, USA has dominant search intent. The other options
are not very close to the user location, so UL cannot help disambiguate. Thus, “Redmond, King,
Washington, USA, North America” should be selected based on dominant search intent. Step 4
Problems: No Problems.

Example 8. Pizza in Redmond OR


Query Pizza in Redmond OR
User Location Any
Location Results Entity Chain Entity Type User Distance
Redmond, Deschutes, Oregon, USA, North America Populated Place XX
Step 4 Problems No Problems
Based on the highlighted location term, the corresponding geo-entity chain will appear. Unlike Example
#7, the intent is not ambiguous about which Redmond the user wants. The judge should evaluate the
result / geochain for correctness and completeness. In this case, there is only one location, which is the
correct result. Step 4 Problems: No Problems.

Example 9. Restaurants in Paris


Query Restaurants in Paris
User Location Hugo, TX
Location Results Entity Chain Entity Type User Distance
Paris, Paris, Ile-de-France, France, Europe Populated Place 4578 miles
Paris Pike, Lexington, Kentucky, USA, North America Street Address 754 miles
Paris, Tennessee, USA, North America Populated Place 515 miles
Paris, Lamar, Texas, USA, North America Populated Place 25 miles
Step 4 Problems No Problems
In this query, the user is looking for restaurants in a location named “Paris”. Based on the highlighted
location term, multiple geo-entity chains appear. Paris, Texas, USA is very close to the user. The other
Paris (in France), though popular, is very far from the user. It is very probable that the user is in fact
looking for restaurant recommendations in the location that is close to them. However, Paris, France
has extremely dominant search intent. Hence, select both “Paris, Lamar, Texas, USA, North America”
and “Paris, Paris, Ile-de-France, France, Europe”. Do not select the other results, which are neither close
to the user location nor do they have dominant search intent. Step 4 Problems: No Problems.

Example 10. Lake Washington, Seattle


Query Lake Washington, Seattle
User Location Any
Location Results Entity Chain Entity Type User Distance
L Washington, USA, North America POI XX
Judgment No Excellent Results
Step 4 Problems Incomplete geo-chain, Spelling errors in geochain
The user is looking for Lake Washington near Seattle. Research reveals that Lake Washington is a lake in King
County, Washington, near Seattle. Since the lake is not entirely within Seattle, “Seattle” is a
reasonable location word, but it does not precisely describe the intended location. The result shown is
excellent in terms of relevance. But it is NOT Excellent because of problems with the geochain. Research
shows that the ideal result should have been “Lake Washington, King County, Washington, USA, North
America”. Therefore, select No Excellent Results. Step 4 Problems: select the “Incomplete geo-chain”
option in the final popup dialog. Also, note that the official name for the location is “Lake Washington”,
and not “L Washington”. Recognized abbreviations and aliases are allowed, but “L” is neither. Therefore,
also select “Spelling Errors in Geochain” option. Then click Submit.
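
The two Step 4 checks in Example 10 can be sketched in code. The Python snippet below is only an illustration and is not part of the HIT app: the function name chain_problems is hypothetical, and the ideal chain is assumed to come from the judge's own research. As a simplification, it treats any mismatch in the first element of the chain as a spelling error, so it does not model the rule that recognized abbreviations and aliases are allowed.

```python
from typing import List


def chain_problems(shown: str, ideal: str) -> List[str]:
    """Compare a displayed geo-chain with the researched ideal chain.

    Hypothetical helper: returns the Step 4 problems suggested by a
    simple level-by-level comparison of the two comma-separated chains.
    """
    shown_levels = [part.strip() for part in shown.split(",")]
    ideal_levels = [part.strip() for part in ideal.split(",")]

    problems = []
    # Any parent level of the ideal chain that the shown chain lacks
    # means the shown chain is incomplete.
    if any(parent not in shown_levels[1:] for parent in ideal_levels[1:]):
        problems.append("Incomplete geo-chain")
    # Simplification: a differing entity name counts as a spelling error.
    if shown_levels[0] != ideal_levels[0]:
        problems.append("Spelling errors in geo-chain")
    return problems


shown_chain = "L Washington, USA, North America"
ideal_chain = "Lake Washington, King County, Washington, USA, North America"
print(chain_problems(shown_chain, ideal_chain))
```

For the chains in Example 10, the comparison flags both problems, which matches the two options the judge is asked to select.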

Example 11. Surrey, Vancouver, CA


Query Surrey, Vancouver, CA
User Location Any
Judgment No Excellent Results
Step 4 Problems No relevant results
User is looking for the city of “Surrey” in the “Greater Vancouver Regional District” in “Canada”. Based
on the highlighted location term, no results are returned. Select No Excellent Results, and click the “No
relevant results” option. Step 4 Problems: No relevant results.

Example 12. Directions from seattle to portland


Query Directions from seattle to Portland
User Location Any
Judgment Cannot Judge
The user is looking for directions between TWO different locations. Considering multiple locations is out
of scope for this HIT app. Select Cannot Judge.

Example 13. 14200 NE 62nd St Redmond WA


Query 14200 NE 62nd St Redmond WA
User Location Any
Location Results Entity Chain Entity Type User Distance
Redmond, King, Washington, USA, North America Populated Place XX
Step 4 Problems No problems
The user is looking for a specific address in Redmond. Marking specific addresses (street number, street
name, street type) is out of scope for this HIT app, but you should mark the results matching the
highlighted portion of the query, which is the portion of the geochain above street level. The result is
correct, and so is the geochain. Step 4 Problems: No problems.

Example 14. WA Italian Restaurants Redmond


Query WA Italian Restaurants Redmond
User Location Any
Location Results Entity Chain Entity Type User Distance
Redmond, King, Washington, USA, North America Populated Place XX
Redmond, Deschutes, Oregon, USA, North America Populated Place XX
Redmond, Mason, West Virginia, USA, North America Populated Place XX
Step 4 Problems No problems
The user is looking for Italian restaurants. There are two non-contiguous parts to the location of the user’s
intent: “WA” and “Redmond”. Still, the intent is clear: Redmond, WA. Evaluate the results and their
geo-entity chains for correctness and completeness. Only the result and geochain for Redmond, WA are
correct, so select only that one as an Excellent result. Step 4 Problems: No problems.

Example 15. Bellevue High School, WA


Query Bellevue High School, WA
User Location Any
Location Results Entity Chain Entity Type User Distance
Washington, USA, North America Admin_1 XX
Step 4 Problems No problems
In this example, there are two words which can describe a location: Bellevue and WA (Washington
State). However, in the context of this query, Bellevue is part of the name of the entity that the user is
trying to find. The only word that describes the user’s location intent is WA. It is a coincidence that the
entity of interest, “Bellevue High School”, is in Bellevue, which is a location in WA. The word
Bellevue does not represent that location. Based on the highlighted location term, the correct result (for
Washington State) is shown. Check the geo-entity chain for completeness and correctness. Step 4
Problems: No problems.

Example 16. burnsville courage center


Query burnsville courage center
User Location Lexington, KY
Location Results Entity Chain Entity Type User Distance
Burnsville, VA Admin_1 264 miles
Burnsville, NC Admin_1 190 miles
Burnsville, MN Admin_1 650 miles
Step 4 Problems No Problems
The user is looking for “courage center” in Burnsville. A few possible Burnsvilles are returned, and the
User Location shows the first two results (in VA and NC) are much closer than the third one (in MN).
However, there is a “Courage Center” in Burnsville, MN, and none in (or near) either of the closer towns
in VA or NC. This shows the dominant search intent / intended result is in Burnsville, MN, despite
the greater distance. Select only that result, and check the geo-entity chain for completeness and
correctness. Step 4 Problems: No Problems.

Example 16. Empire State Building New York


Query Empire State Building New York
User Location Any
Location Results Entity Chain Entity Type User Distance
New York, NY, USA, North America Populated Place XX
New York, USA, North America Admin_1 XX
Empire State Building, New York, NY, USA, North America POI XX
Step 4 Problems No Problems
The user is looking for a famous POI (Point of Interest) in New York City. The result list has some
irrelevant results showing the city of New York and the State of New York, but neither is specific enough.
Only the POI result is relevant. Check the geo-entity chain for completeness and correctness. Step 4
Problems: No Problems.

Example 17. Blount county farmers co-op


Query Blount county farmers co-op
User Location Commerce, Colorado, United States
Location Results Entity Chain Entity Type User Distance
Blount County, Tennessee, United States, North America Admin_2 110 miles
Blount County, Alabama, United States, North America Admin_2 196 miles
Step 4 Problems No Problems
The user is looking for a type of business in "Blount county". Both results are for counties of the same
name but in different states. However, both are relatively close to user location, and research shows
that both have Farm Co-Op businesses, hence there isn’t a dominant intent for either of them, and
query context doesn’t help disambiguate. Therefore, the correct answer is to select both Excellent
results. Step 4 Problems: No Problems.
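
The disambiguation reasoning used in Examples 7, 9, 16 and 17 can be summarized as a rough sketch. The Python code below is an illustration only, not the HIT app's logic: the names Candidate, select_excellent and NEAR_MILES are hypothetical, and the 250-mile cutoff is an assumption made for the sketch (these guidelines give no numeric threshold for "close to the user"). The dominant and matches_context flags stand for conclusions the judge reaches through research.

```python
from dataclasses import dataclass
from typing import List, Optional

NEAR_MILES = 250.0  # hypothetical cutoff; the guidelines state no number


@dataclass
class Candidate:
    chain: str
    miles: Optional[float] = None   # None when User Location is "Any"
    dominant: bool = False          # research shows dominant search intent
    matches_context: bool = False   # query context points to this place


def select_excellent(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates a judge would likely mark Excellent."""
    # Query context (e.g. a named business found in only one town)
    # overrides both distance and popularity, as in Example 16.
    by_context = [c for c in candidates if c.matches_context]
    if by_context:
        return by_context
    # Otherwise keep every candidate that is dominant or near the user,
    # which can yield several results (Examples 9 and 17).
    return [
        c for c in candidates
        if c.dominant or (c.miles is not None and c.miles <= NEAR_MILES)
    ]


example_9 = [
    Candidate("Paris, Paris, Ile-de-France, France, Europe", 4578, dominant=True),
    Candidate("Paris Pike, Lexington, Kentucky, USA, North America", 754),
    Candidate("Paris, Tennessee, USA, North America", 515),
    Candidate("Paris, Lamar, Texas, USA, North America", 25),
]
for c in select_excellent(example_9):
    print(c.chain)
```

Run on the Example 9 candidates, the sketch keeps Paris, France (dominant intent) and Paris, Texas (close to the user) and drops the other two. Real judgments still depend on research and on checking each geo-chain for correctness and completeness.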

5. Final notes
1. Judges in ZH HIT apps may encounter queries from different dialects. If they are in a dialect that
you cannot understand, consider skipping the HIT. If you are certain that people in the ZH judge
group would not be able to judge the HIT, then please try to translate the results to your
language.

2. Chinese / Japanese often use the same characters. A query in "Chinese" may give Japanese
results due to dominant intent. If you are unable to read or understand the results, please try to
translate the results to your language. For example, the following queries may present problems
for judges in China but not for judges in Japan: 木 須 意思, line 壁紙, 13 夜, 市立 南越 中学校.

THANK YOU!

Thank you for reading this far. You should now be ready to complete training and take the qualification
test. It is good practice to keep these guidelines always open during judgment in case you need to
double check the rules.

6. Version history
If you have read previous versions of these guidelines, you should pay close attention to the changes
listed below. That way, you will not have to re-read the entire guidelines. However, be sure to refer to
the relevant sections, especially for new examples.

2023-08-30:
- Section 1 (“HIT app overview”)
o Do not mark results in foreign language as CANNOT JUDGE. Instead, try to translate
these results and judge normally.

2021-02-09:
- Training examples are in English for cross-language consistency and standardization. However,
spam score will also be based on HITs in your language / country.
- Section 1 (“HIT app overview”)
o What to do if map does not load automatically
o What to do if research pane displays “unusual traffic” error
o Option to “[S]earch entire query (not just highlighted tokens)“.
- Section 2 (“Definitions”):
o Defined words are now listed in table of contents (easier to find)
o Definition of geo-chain now describes its inclusion of rich detail (i.e. unlike what is used
in a postal address).
o Updated definitions of Neighborhood and Populated Place (contrasting formal
boundaries and organization by local government).
o Updated Geo-entity type table in Section 2 (Definitions): [TODO add link to table]
▪ POI: added natural features
▪ Other: added “regions”. More examples.
▪ Added “Australia” (as both Country and Continent)
o Added example of Australia to definition of Dominant Search Intent: A query for
“Australia” could be for either the country or the continent, but dominant search intent
is for the country.
- Section 3.1 (“Validate query and listed results”):
o Subsection 1(e): changed “Location in Entity” to example of “Kentucky Fried Chicken”:
Does research show the top ranked search results show the location word is
part of a business name, person name, or some other entity, e.g. “Kentucky
Fried Chicken”? If so, select Cannot Judge.
o Subsection 1(g): Ignore street address (street name / number) portion of any query.
Instead, focus on the larger location areas in query; if there are none, select Cannot
Judge.
- Section 3.2 (“Determine intended location”):
o Increased emphasis on starting with research, including searching the full query or
modifying the query text yourself.
o Strengthened wording of when NOT to favor Populated Place (i.e. city/town) over other
geo-entity types.
o Added section “Populated Place or POI (physical location)?”
o Added section “Country or POI (physical location)?”
o Added section “Populated Place, Country, or POI (physical location)?” for Singapore
- Section 3.5 (“Make judgment”): Formatted to clarify correct judgment options. As appropriate,
select one result, multiple results, or choose “No Excellent Results”.

2020-12-22:
- Clarified in Section 1 that during Step 4 of the HIT app, the ‘spelling errors’ checkbox is to be marked only
when highlighted location words in the query have errors, i.e. please ignore non-highlighted words.
- Updated Section 3.2 to emphasize that multiple excellent results should be selected for
ambiguous locations where there is no dominant intent and user location does not help disambiguate.
Added example 17 in Section 4 to further explain.
- Updated Section 3.2 to emphasize that the Populated Place should be preferred over POI (Point
of Interest) when both exist with same name. (e.g. New Rochelle NY)
- Updated Section 3.2 to clarify that spatial words (e.g. in, at, near) could be highlighted in query
but they are NOT expected to be present in result. Hence, please do not penalize results for omitting them.
- Updated Section 3.3 to emphasize that for child and parent results having same name, query
context should be used to identify excellent result between them (e.g. Dubai).

2020-11-26:
- Don’t penalize for query misspellings, abbreviations, aliases or alternative names. See Section
3.1(f).
- Try to understand user’s true search intent as expressed in the query text. Users make mistakes,
and can be confused. Try to understand their true meaning. See Section 3.2.
- Intended location is the most specific location in the query (within reason). See Section 3.2
(Determine intended location).
- Consider full query text for context. See example of “Phat Phil’s BBQ Paris” in Section 3.2
(Determine intended location).
- Dominant search intent: added example of query for Qishan / 旗山. See definition of Dominant
Search Intent in Section 2.
- Updated Geo-entity type list and descriptions. See definition of Geo-entity type in Section 2.
- Foreign Language: strengthened language to choose Cannot Judge if result geo-chains are in a
different language group AND not in English. Applies to both query and results. See Section
3.1(a).
- Section 4 (Examples): improved appearance of tables, added “Step 4 Problems”, and added /
modified examples, especially: #4, #10, #15, #16.
- Added advice to view at reduced zoom level (e.g. 50%). See 1st paragraph of Section 1 (HIT app
overview).
- Section 5 (Final notes) has special instructions for Chinese-speaking judges (ZH).
