
Search Result Preference

Search Engine Relevance

The goal of this task is to compare two results from a search engine.

Your task is to compare two results returned by a search engine following 5 steps:
1. Understand the Search Query and the User’s Intent – shortcut key is <ENTER>
2. Understand each Search Result – shortcut keys are [ ]
3. Choose the comparison Preference – shortcut keys are ← → and <ENTER>
4. Select the Reason(s) why you made the judgment – shortcut keys are ↓ ↑ and <SPACE>
5. Submit your judgment – shortcut key is <ENTER>

Use the ← → ↑ ↓ keys to choose different options.


Use <ENTER> and <SPACE> to make your selections.
Installing the UHRS Extension
IMPORTANT: This HitApp requires the UHRS Extension. This extension helps with loading pages
inside the app which improves your experience, speeds up judging, and leads to better labels.

Please install the extension from the Chrome Web Store before starting this task. If you do not
install the extension, you’ll see an error (see below) which also contains a link to the download.

This installation will take no more than a minute.

This is the error you will see if you do not have the extension installed.

To install, click this link and then press the “Add to Chrome” button as shown below.
Step #1: Understand the intent of the Query
You will see a query issued to a search engine and the location of the user.
Try to understand the user’s intent and what they were hoping to find.

Search Query
Carefully read the search query. Pick out the key terms, phrases or concepts which are important to the user
issuing the query.

QUERY: Ariana grande hairstyles


KEY TERMS:
• Ariana grande
• Hairstyles
Intent Description
For many queries we asked another judge to think about the most likely intent of the query and what
useful results would look like. If this information is available, it will be displayed with the query. Please
consider this information carefully as it may provide useful hints and make judging the task easier.

Location
It’s common for a query to require results which depend on the user’s location. Think about whether
location would matter. Most people issuing queries like “museum” are probably looking for museums in
their general area, not ones across the country or far away.

Spelling
It’s also common for a query to be misspelled. There are thousands of spelling variations for queries and
it is important to think about what the user meant to type.

If the misspelling is obvious, please ignore it and judge the page as if the query was spelled correctly.
But please be careful: some queries might be researching misspellings, or they might be about
product/brand names which might be misspelled on purpose. When not sure, use the Google/Bing
search links to see how the engines would interpret the query.

QUERY: facebook
MISSPELLINGS:
• facbook
• faebook
• facboko

Research
To assist in researching the query, open Bing or Google to see what their search results pages look like.
There is also a link showing the user’s location on a map. (A map can also be accessed by clicking on the
location name.)

SHORTCUTS
[g] = Google
[b] = Bing
[m] = view location on map
[ENTER] = progress to next step
Step #2: Understand each Search Result
Two Search Results will be shown for the QUERY and LOCATION.
Understand each well enough to be able to make a comparison.
Opening/Viewing the Web Page
Many of the search results have web pages which will appear in the “frame”. However, even great
sites like Google Mail commonly prevent their web pages from loading in our HitApp.

The web pages should load in the “frame”. If they don’t – press the “Open Page” button.

Many sites (even great sites) block the embedding of the web page within our HitApp. When
this happens, you must open the search results by (1) clicking on the search result or (2)
pressing the “Open page” button or (3) using the shortcuts. Remember, this is what a real
user would see if they clicked the search result.

Sometimes the embedded web page will try to take the focus from our HitApp. This is normal
and expected. You might occasionally need to click inside the HitApp window to bring focus to
it before the hotkeys start working. If you hit any accessibility issues, please let us know.

Aspects of Relevance
There are many aspects of relevance which are important. The common ones are shown below.
Page Loading
When users click a search result, it should load the web page with the expected content in a
reasonable amount of time. However, it is quite common that a result no longer exists or that the website
has taken down or moved the content (404 Error). Other times, the browser might indicate that a site is NOT
SECURE or suspicious. There are many other sites which require passwords or user accounts that users
might not have to access general information. However, those password-protected sites (like banks or
personal information) are commonly what users are trying to access.

In general – if a page cannot be loaded, it cannot satisfy users.

Meet Needs
The most important aspect of relevance is MEETING THE NEEDS of the user issuing the query. If a page
doesn’t have enough information about the query, then no user will be satisfied by the result. Some pages
have very superficial content which won’t satisfy users either because the content is incorrect or because
there isn’t enough meaningful content.

Some common scenarios:

• NAVIGATIONAL – queries like “facebook” or “facebook.com” are looking for the Facebook home
page. Be aware that sometimes a query that looks like a URL, like “searchrelevance.com/judge”,
might point to a page that does not actually exist.
• INFORMATIONAL – queries like “mahjong rules” might be looking for information about the game
and its rules. A good result will clearly show the rules of the game
• MISSING CONCEPT – in the query “how to install matra mfs 100 on windows 10” if a search result
fails to include “matra mfs 100” then it is missing a key concept in the query.
• USABILITY – if a page makes it difficult to access the content or information, then that means it is
less likely to meet the needs of users. Sites that have too many popup windows, ads, or require
stepping through lots of pages are less usable than sites that show information directly like
Wikipedia.
• WRONG LANGUAGE – If a page is in a language other than that of the query, then the page cannot
meet the needs of the user. However, if the query contains terms or phrases in another language
(Spanish, for example), a page in that language might be ok since the user has demonstrated that
they understand some of it.

When in doubt – prefer pages with enough content that’s easily accessible.

Authority
The second most important aspect of relevance is AUTHORITY or TRUST. Even if a page shows the
information clearly, if a site cannot be trusted then a search engine won’t want to present those results.
Some pages will try to deceive users into clicking an ad or downloading some malware onto their
machine. Be on the lookout for pages whose primary purpose seems like it is trying to fool users.

This is difficult to evaluate, but be aware that sites which have lower authority often exhibit the following:

• TOO MANY ADS – If a lot of ads appear at the top of the page, the side of the page, in between
paragraphs of the text, or as popups, it’s an indication that this site is trying aggressively to profit
from people who happen to land there. This indicates a lower level of trust.
• SPAMMING THE DOMAIN – A domain like “homegoodsonline.com” might be trying to trick users
who are looking for “homegoods.com”. This is a very common technique for spammers trying to fool
users.
• ORIGINALITY – Another common approach from low-authority sites is to steal/copy the information
from a legitimate site. Be on the lookout for content which doesn’t look like it came from this site.
• MISINFORMATION – A relatively recent phenomenon is the proliferation of sites which actively try
to mislead users either for political or financial reasons. Be aware that it’s very common for sites to
give false or misleading information about potentially controversial topics.
• POOR GRAMMAR – Because spammers will try to automatically generate lots of fake pages, it’s
quite common for a document to contain misspellings or grammatical mistakes, or to not make
sense when carefully read.

When in doubt – prefer trustworthy domains from major companies or governmental organizations.

Freshness
Some pages might have STALE information either because the page hasn’t been updated in a while or
because the page was about an old topic. For example, if somebody issues the query “Manchester United”
they are probably not looking for a page about an old match from a couple of years ago. Instead, they would
either be looking for the home page (which is always fresh/current, called EVERGREEN) or looking for the
most recent match results or news.

When in doubt – prefer information that is more recent/current or evergreen.

Location
Some pages, like news sites, might be more appropriate to show closer to the user’s location for a query like
“news”. Other pages, like Facebook’s home page, might always be appropriate to show for a query like
“facebook” and are UNIVERSAL. Be aware that sometimes the results should be different for different users
in different locations.

When in doubt – prefer pages that are applicable to people near that location or universal.
Step #3: Search Result Satisfaction Judgment

Select how well each search result MEETS THE NEEDS of the query intent.

After you have understood the QUERY INTENT and explored both SEARCH RESULTS along with the
various aspects of relevance (MEET NEEDS, AUTHORITY, FRESHNESS, LOCATION), you will make an
assessment of how well each search result MEETS THE NEEDS of the query intent.

MEETS
The page fully or almost fully meets the needs of users issuing the query from the specified location. For
{facebook} – the official site, the Wikipedia page about Facebook, and Facebook’s Twitter page would all be “Meets”.

SOMEWHAT
The result page would meet the expectations of only a few users. Either not many users would look for
this information, or the page has several issues or flaws and would not fully satisfy the user’s
expectations.

DOES NOT
The result page does not meet the expectations of even a few users. The result page may be unrelated to the
query. The result page might not load, be unusable, or have little content satisfying users.

QUERY: IFlagger virtual school
RESULT: https://iflagler.org
LABEL: Meets
EXPLANATION: The result page meets the user’s expectations as it is the homepage of the Iflagler School.

QUERY: New York Times
RESULT: https://www.usnews.com/nyt
LABEL: Somewhat meets
EXPLANATION: The result page somewhat meets as it provides headlines, but most users expect the
official website, not an aggregator site.

QUERY: Craigslist orange county fl
RESULT: https://orangecounty.craigslist.org/
LABEL: Does Not Meet
EXPLANATION: The result page is the wrong page. Users are looking for Orange County in FLORIDA,
not Orange County in CALIFORNIA.
Step #4: Pairwise Judgment
Choose how much one result will satisfy the user more than the other.
Sometimes there isn’t a preference. Other times, neither satisfies.

After you have understood the QUERY INTENT and explored both SEARCH RESULTS along with the
various aspects of relevance (MEET NEEDS, AUTHORITY, FRESHNESS, LOCATION), you will make a
preference judgment comparing the two results. This is the most important step in the task.

Use the ← and → keys to switch preferences.


Use <ENTER> to select it.
Helping you choose the preference
It can sometimes be difficult to decide how to use the preference scale. Here’s a methodology we’ve
developed which goes from the easiest decision (neither satisfies) to an easy decision (one is much
better) to a more difficult decision (similar) and finally deciding between the preference levels. You
might come up with your own decision-making process that works better for you.
NEITHER (FAIL TO SATISFY)
A result will FAIL TO SATISFY users for any of the following reasons:

• WON’T LOAD – the page doesn’t exist, is not secure, or is behind a login that most users won’t have
• DOESN’T MEET NEEDS – the page doesn’t have enough information about the query to be useful.
• LOW AUTHORITY – the page is from a source that is trying to fool users or has too many ads
• STALE – the information is too out-of-date and therefore no longer useful
• WRONG LANGUAGE – the page is in a language which most users won’t understand
• WRONG LOCATION – the page is intended for people in a different location
• POOR USABILITY – the information is hard to find or difficult to access
• ADULT – the content or theme is pornographic
• OFFENSIVE – there is hate or offensive speech in the document
• OTHER – there are potentially lots of other reasons why a page shouldn’t be shown

When both pages fail to satisfy users – select NEITHER


This should be an easy decision

MUCH BETTER
A result can be MUCH BETTER for a variety of reasons:
• It is the NAVIGATIONAL result people are directly looking for
• It MEETS THE NEEDS of users much better than the other result
• It has much higher AUTHORITY than the other result
• It is showing CURRENT information while the other has STALE information
• It is more appropriate for the user’s LOCATION
• It is much more USABLE due to a better presentation, layout, or design
• The other result FAILS TO SATISFY (as described above)

Use MUCH BETTER when one result is clearly superior to the other across most aspects.
This should also be a fairly easy decision

SIMILAR
Some results being compared will have similar amounts of information or both will be lacking, but in
different aspects. One result might MEET THE NEEDS better while having slightly lower AUTHORITY, but
not enough to affect users. Another time a site might have more INFORMATION but be a little bit STALE
or out-of-date.
If you are having a hard time deciding which result is better – use SIMILAR.
This is a little more difficult.

SLIGHTLY BETTER vs BETTER


When one result is better than the other (unable to select SIMILAR) but it is not quite so obvious (unable
to select MUCH BETTER) it is necessary to make tradeoffs between the various aspects of relevance.
Consider the QUERY INTENT and what the user is looking for. In general, preference is for sites with
higher AUTHORITY as long as they provide enough information to MEET THE NEEDS of the users. It
can be difficult to determine what tradeoff to make.

Choose SLIGHTLY BETTER if you are leaning towards SIMILAR.


Choose BETTER if you are leaning towards MUCH BETTER.
This is a harder decision.
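
The order of decisions described above (NEITHER first, then MUCH BETTER, then SIMILAR, then SLIGHTLY BETTER vs BETTER) can be sketched as a small function. This is only an illustration and not part of the HitApp: the `Assessment` fields and the `choose_preference` name are hypothetical, and the judgments they hold (whether a result fails, which side is better, which way you are leaning) are still made by you.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Assessment:
    """Hypothetical summary of a judge's own conclusions about two results."""
    left_fails: bool                  # left result FAILS TO SATISFY
    right_fails: bool                 # right result FAILS TO SATISFY
    better_side: Optional[str]        # "left", "right", or None if you cannot decide
    clearly_superior: bool = False    # one result is clearly superior across most aspects
    leaning: str = "similar"          # "similar" or "much_better" when in between


def choose_preference(a: Assessment) -> str:
    # Easiest decision: both results fail to satisfy.
    if a.left_fails and a.right_fails:
        return "Neither"
    # If only one result fails, the other is MUCH BETTER.
    if a.left_fails != a.right_fails:
        winner = "Right" if a.left_fails else "Left"
        return f"{winner} is much better"
    # Hard time deciding which result is better: use SIMILAR.
    if a.better_side is None:
        return "Similar"
    side = a.better_side.capitalize()
    if a.clearly_superior:
        return f"{side} is much better"
    # Harder decision: pick the level you are leaning towards.
    if a.leaning == "similar":
        return f"{side} is slightly better"
    return f"{side} is better"
```

For example, `choose_preference(Assessment(False, False, "left", leaning="much_better"))` returns "Left is better".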

Making tradeoffs among Aspects


It is common to be comparing two results in which one result will have better INFORMATION while the other
might be FRESHER. Other times, a result will come from a better AUTHORITY (like a governmental agency)
while the other has more specific information about the query yet comes from a questionable source.

In general – we prefer results which sufficiently MEET THE NEEDS while
coming from higher AUTHORITY sites that users can TRUST. Implied in
MEET THE NEEDS is that the page is sufficiently CURRENT and
appropriate for the user’s LOCATION and LANGUAGE.

Simple Shortcuts
[→] = Move right
[←] = Move left
[ENTER] = Make selection

Direct Shortcuts
[1] = Left is much better
[2] = Left is better
[3] = Left is slightly better
[4] = Neutral / Similar
[5] = Right is slightly better
[6] = Right is better
[7] = Right is much better
[0] = Neither, both results are bad
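
As a quick reference, the direct shortcuts listed above can be written as a lookup table. This is a minimal sketch for illustration only; the names `DIRECT_SHORTCUTS` and `preference_for_key` are hypothetical and are not part of the HitApp.

```python
# Direct shortcut keys and the preference each one selects,
# copied from the list above.
DIRECT_SHORTCUTS = {
    "1": "Left is much better",
    "2": "Left is better",
    "3": "Left is slightly better",
    "4": "Neutral / Similar",
    "5": "Right is slightly better",
    "6": "Right is better",
    "7": "Right is much better",
    "0": "Neither, both results are bad",
}


def preference_for_key(key):
    """Return the preference for a shortcut key, or None if the key is not a shortcut."""
    return DIRECT_SHORTCUTS.get(key)
```

Keys 1 to 7 run from the far left of the scale to the far right, with 4 in the middle, while 0 sits outside the scale for the NEITHER case.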

Step #5: Select primary reason(s) for your selections
Tell us the primary reason why you preferred one result.
You can choose multiple reasons in some cases
As part of the decision-making process, you will notice cases where one result has a lot more information
than the other. Other times you will notice that one of the results has stale or out-of-date information. For
medical and financial queries, things like the site authority matter. After all, you wouldn’t trust your health
or money with any random site.

In this step, we want you to mark the reason(s) why you chose the preference or marked Neither.

Use the ↓ and ↑ keys to move


Use <SPACE> to select/deselect a reason
Step #6: Submitting your judgment
After verifying your preference and reason, submit your judgments

After you have selected at least one reason (and often there will be multiple) you will be able to directly
submit your judgment to the UHRS backend. Simply hit <ENTER> and you will move to the next hit.

Use <ENTER> to Submit your judgment


QUICK GUIDES
Please print for your quick reference
OVERALL FLOW
KEYBOARD SHORTCUTS

Use the ← → ↑ ↓ keys to choose different options.


Use <ENTER> and <SPACE> to make your selections.
UNDERSTANDING SEARCH RESULTS
4 STEPS TO MAKING A COMPARATIVE JUDGMENT
