
[TYPING SOUND]

[MUSIC PLAYING]

NARRATOR: This is
Louis XIV, also known

as Louis the Great,


Louis the Grand Monarch

and Louis the Sun King.

Famous for supposedly


declaring "l'état, c'est moi."

But in 1685, even the


self-declared direct

representative of God on Earth


had questions he could not

answer on his own--

questions about the ruling


Qing dynasty in China.

How big is it?

How many people


live in the capital?

What can they teach


us about music?

Culture?

Astronomy?

So in the spring
of that year, Louis

sent members of France's


order of royal mathematicians

on a voyage that would span


three continents and three

oceans.

Their task-- gather


information that would

satisfy the King's curiosity.

It was a journey with


numerous hardships
and countless setbacks.

But five years, four months, and two days later, Louis's answers


finally arrived.

In the grandest of
human traditions,

he had become curious,


asked a question,

and learned a new


piece of information,

just like billions of people


who had come before him

and billions who


have come since.

People who had access to cave


walls, clay tablets, oracles,

scrolls, books, the


printing press, libraries,

semaphore towers, telegraphs,


the radio, the television,

Betamax tape, and the


short-lived French national

internet system called Minitel.

[ELECTRONIC VOICE SPEAKING FRENCH]

Which brings us to today.

Fishermen looking up when


tomorrow's tide comes in.

Careful cooks wondering


when anchovies expire.

Travelers trying
to figure out how

to say Chapstick in Turkish.


Friends settling a
bet about which team

won the '92 NBA Eastern


Conference Finals.

Job seekers looking


to make a move.

And a fourth grader looking


at facts about the Qing

dynasty for a history


paper that's due tomorrow.

Billions of King Louis


asking trillions of questions

in hundreds of languages,
expecting someone

to give them an answer


in under one second.

Now, who would sign up


for a challenge like that?

BEN GOMES: Interesting setup.

CREW: Yeah.

[MUSIC PLAYING]

NARRATOR: This is Ben Gomes.

BEN GOMES: Well, the correct


pronunciation is "Gah-mez."

NARRATOR: This is Ben Gomes.

BEN GOMES: But I say "Gohms."

It's a Portuguese name.

NARRATOR: This is Ben Gomes.

He knows a few
things about search.

Uh, that search.

Anyway, he's kind of a


big deal, even though he'd

try to convince you otherwise.


Ben worked on Search
for more than 20 years.

But that's not where his story started.

BEN GOMES: So I was born in


Dar Es Salaam in Tanzania.

But at a very early


age, my parents

moved back to
India to Bengaluru.

And there were a few books at home from my elder siblings.

And that's the information I had


access to, including I remember

one torn encyclopedia that I


think my grandfather had given

my mom.

So it was really out of date.

In 5th grade, I
got two presents--

a bike, which my parents thought


I'd be very excited about,

and a much better encyclopedia.

And I was actually


much more excited

about the encyclopedia--

this is where geeks come from--

than the bicycle.

And my parents didn't


know what to do with this.

[MUSIC PLAYING]

When I look back at how


we found information,

it was so dramatically
different from today.

When my mother was growing


up, where there was not even
access to a good library,
you would have just accepted

the fact that you didn't


have the information,

and that's the way


it was going to be.

When I was growing up, for


some kinds of information,

there was a decent library.

But you still had


to take this bus.

It took about an hour.

You had to look things


up in a card catalog.

That took time.

Now today, we measure in


fractions of a second the time

it takes for you


to get information.

I think that
reduction in friction

is absolutely dramatic,
because it can enable people

around the world to have


equal access to information.

It shouldn't just be people in some places who have access to the best libraries.

Everybody should have access


to the highest quality

information.

So that combination of a
deep technical problem and I

think a fundamental human


need to understand the world

around us, to know more


about the world around us,
is the heart of Google
Search, and what

keeps me coming to work still


so excited 20 years later.

So in the early days, I


wondered whether the company

had the infrastructure


to be a real company.

Because when I had come


for my interview actually,

there was not even


a sign indicating

that this was Google.

So I was not sure I'd


come to the right place.

But halfway up the


staircase, there

was a small neon sign


that said Google.

So that's when I knew.

[MUSIC PLAYING]

And it generally felt


completely chaotic.

And Jeff was there.

Jeff is also brilliant.

JEFF DEAN: Yeah, we were


a very small company.

We were maybe about 25 people.

We were all kind of wedged


into this second floor

area in downtown Palo Alto.

I was in an office
with Urs Holzle.

BEN GOMES: Urs was in charge


of all of engineering.
And at the time, I don't think I
knew how to pronounce his name.

But he put the three of us


named Ben in one office,

just so people would walk


by and say, hey, Ben.

URS HOLZLE: Yes,


we had the Ben Pen.

I think it was pure


coincidence, actually.

My first reaction to Google was,


I have no idea what Search is,

so it's probably not for me.

But then I was intrigued


by the problem.

It was clear that there


was some real value there.

Because without
really good ranking,

all that growth of


the web would be

wasted if nobody
could actually find

the things that were there.

BEN GOMES: So one of the


core aspects of Search

is, how do we rank


results and how do we find

the most relevant information.

So a lot of people work on that.

You'll get really good stuff


on this from Pandu, actually.

CREW: Pandu?

OK.

[MUSIC PLAYING]

NARRATOR: This is Pandu Nayak--

PANDU NAYAK: Hi, I'm Pandu.


NARRATOR: --head
of Search Ranking.

His personal motto--

PANDU NAYAK: No
query left behind.

NARRATOR: Before
working at Google,

Pandu worked at an artificial


intelligence lab at NASA.

PANDU NAYAK: Yeah, we built an


autonomous system that provided

high-level control to a
spacecraft called Deep Space 1,

really the most exciting


thing that has ever happened

in my life--

in my professional
life, I guess.

NARRATOR: After doing that,


he wanted a new challenge.

PANDU NAYAK: I oversee


the Ranking team.

So ranking is important
because if we simply

return the million pages


that match your search query,

that's not particularly helpful.

And so we need to rank the pages


that you might find useful.

Hopefully, these are at


the top of the results.

We're really trying


to bring information

to the world at large and


make it useful so people can

improve their day-to-day lives.

And I feel really lucky


to have the opportunity
to work on this mission.

[MUSIC PLAYING]

NARRATOR: Let's go back a bit.

Summer, 1999, room 300 and


something in the Gates building

at Stanford.

And these two guys,


Larry and Sergey,

who were about to announce


something so big it merited

matching polo shirts.

LARRY PAGE: OK, maybe


we should get started.

So what is our mission?

So how is Google different?

Basically, we want to organize


the world's information

and make it universally


accessible and useful.

NARRATOR: 20 years later,


bigger stage, same deal.

[CHEERING]

SPEAKER 2: And
today, our mission

feels as relevant as ever.

[MUSIC PLAYING]

NARRATOR: So what does


this actually mean?

Here are a few takes.

CATHY EDWARDS: I
think if we weigh up

the various parts


of the mission,

to me the most important


piece is organizing.
There are hundreds of
billions of web pages

that are out there.

Our job is to
filter through that

and to really give you


what you are looking

for at that moment in time.

NICK FOX: And then the next


part is the world's information.

So information means
really anything.

It started out for


Google with web pages,

but it's so much more than that.

DAVID BESBRIS: Whether it's


physical books that we need

to scan or maps that we build


of every place on Earth,

that's information, too.

And it's not web pages.

It's the kind of stuff


that we organize today.

TULSEE DOSHI: And then I


think that word universal

is important, because
universal means for everyone.

NICK FOX: Whether it's


someone that can't see,

whether it's someone


that can't hear,

people that speak


different languages, really

make it accessible to as broad


a set of people as possible.

DAVID BESBRIS: We
might be goofy people
who come to work in T-shirts
and desperately need

haircuts and things like that.

We may not look super


serious, but we know

how much people rely on this.

We take that mission really, really seriously.

SPEAKER 3: 1.0.

NARRATOR: So it sounds
like the mission is pretty

important to these folks.

But here's another


important question.

CREW: So how would you


explain how Search works?

BEN GOMES: Right.

Yeah, so how does Search work?

TULSEE DOSHI: How Search works?

[MUSIC PLAYING]

PANDU NAYAK: How Search


works, in a nutshell.

NARRATOR: This is
server rack 3349b.

It lives here in
Ballybane, Ireland,

along with cows, a golf course,


and Kavanagh's Auto Accident

Repair Center.

This is one of the places


where Search happens.

Search is a big
piece of software

that takes the words you


type in here and looks
for them here, on
the worldwide web.

It can do that because


first it downloads

a copy of the entire


web, scans it, and makes

a list of all the words


and lists of all the pages

each word appears on.

It's like the index of a


book, except 10 trillion times

longer.
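The index the narrator describes is what engineers call an inverted index: instead of scanning every page for a word, you precompute, for each word, the list of pages it appears on. A minimal Python sketch, with made-up example pages:

```python
from collections import defaultdict

# A tiny "web": page URL -> page text (hypothetical example pages).
pages = {
    "recipes.example/lasagna": "classic lasagna recipe with ricotta",
    "nutrition.example/lasagna": "lasagna nutrition facts and calories",
    "bio.example/louis-lasagna": "louis lasagna pioneered modern pharmacology",
}

# Build the inverted index: word -> set of pages that contain the word.
index = defaultdict(set)
for url, text in pages.items():
    for word in text.split():
        index[word].add(url)

# Looking up a word is now one dictionary access, not a scan of every page.
print(sorted(index["lasagna"]))
```

A real web index also stores positions, frequencies, and metadata for each entry, but the core word-to-pages mapping is the same idea.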

Lasagna appears on 59
million of those pages.

When you search for


lasagna, the software

puts these pages in


order with what it hopes

are the most useful at the top


and less useful at the bottom.

Most people
searching for lasagna

want a recipe for lasagna.

COOK: Look at how


delicious that looks.

NARRATOR: Some people want


nutrition facts for lasagna.

And a few people want to learn


about the life and research

of Louis C. Lasagna, MD.

They call him the father


of modern pharmacology.

The software living


on server rack 3349b

helps rank those


pages, depending

on where you live, whether


the page was updated recently,
[OVERLAPPING] how
many other pages link

to that page, how many times


the word lasagna appears

on the page, is lasagna in


the title, is lasagna bolded,

are there pictures


of the lasagna?

It does all this in less than


one second, billions of times

a day every day,


mostly for things

that are tougher to


figure out than lasagna.
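Those signals -- how often the word appears, whether it's in the title, link counts, freshness -- can be caricatured as a tiny scoring function. The weights below are invented for illustration; production ranking uses vastly more signals and learned parameters:

```python
def score(page, query_word):
    """Toy ranking score combining a few of the signals the narration
    mentions. All weights here are invented, not Google's."""
    s = 0.0
    s += page["text"].split().count(query_word)       # how often the word appears
    s += 5.0 if query_word in page["title"] else 0.0  # word in the title
    s += 0.1 * page["inbound_links"]                  # other pages linking here
    s += 2.0 if page["updated_this_year"] else 0.0    # freshness
    return s

pages = [
    {"title": "lasagna recipe", "text": "easy lasagna recipe lasagna",
     "inbound_links": 120, "updated_this_year": True},
    {"title": "pasta dishes", "text": "lasagna is one of many pasta dishes",
     "inbound_links": 10, "updated_this_year": False},
]
ranked = sorted(pages, key=lambda p: score(p, "lasagna"), reverse=True)
print(ranked[0]["title"])  # the recipe page ranks first
```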

BEN GOMES: So behind the


scenes of Google Search,

there are many


kinds of engineers

and many different


teams that come together

to bring you the experience you see,

teams around the world,


in many other countries--

Zurich, London,
India, Japan, so on.

You have teams that are working


on the interface by which we

present this information,


teams working on the evaluation

processes, processes that make sure that the changes that are happening are good changes.

And then there are teams of


engineers who work on ranking.

They might examine


the kinds of queries
where we are not
doing well today,

and think about, what are


the kinds of techniques

we could use to enable us


to do better in the future?

NARRATOR: Like the team that's


about to enter this meeting.

ELIZABETH TUCKER:
Anything we need to know?

CREW: Don't look at


the lens of the camera.

ELIZABETH TUCKER: OK.

All right, let's do it.

NARRATOR: Despite their lack


of on-camera experience,

they're working on what could be


the biggest change to Search in

over a decade.

SUNDEEP TIRUMALAREDDY:
Things are getting exposed.

SPEAKER 4: Part of
that is building--

NARRATOR: But we'll


get back to them later.

SPEAKER 4: But we will


actually see some--

[DOOR SLAMS]

BEN GOMES: So Search is


a pretty complex product.

It's a big effort to actually


make these things work,

to take all of these different


pieces of the system,

using a lot of mathematics,


and then trying

to bring them together


into something more real,
into something that can actually
be turned into an algorithm.

NARRATOR: All right, so behind


the scenes, people at Google

are working on algorithms.

[MUSIC PLAYING]

Let's dig into


that for a minute.

At its most basic,


an algorithm is just

a set of mathematical
instructions

that a computer follows,


kind of like a recipe.

Just like there are different


recipes for different dishes,

there are different


algorithms for different jobs.

Some make elevators


go up and down.

Some predict subway delays.

Some help cars park themselves.

The Google Search


algorithms exist

to return high-quality
information based on a user's

query, stuff like all of


the text, pictures, videos,

and ideas that people


have taken the time

to put on the open


web, stuff they

want other people to find and


read and watch and look at

and learn from.

PRESENTER: Hey, guys!


LUCAS: Lucas here.

ROOFER: In today's video,


I want to show you--

TEACHER: --how to simplify


a rational expression.

NARRATOR: We're talking about


the angle of the Leaning

Tower of Pisa, how to hit a 7-10


split, whatever this thing is.

This is the
information that Google

tries to organize
and make universally

accessible and useful, because


this is the kind of information

that people are out


there looking for.

But you know what


they're not looking for?

VOICE: Act now!

VOICE: We will be
with you shortly.

VOICE: A whiter, brighter smile!

VOICE: Hey!

NARRATOR: Spam.

Not the delicious


kind, the bad kind.

CATHY EDWARDS:
Yeah, so let me just

talk about spam for a


minute, because spam

is one of the biggest


problems that we face.

NARRATOR: This is Cathy Edwards,


head of User Trust for Search,

which basically means she


deals with a lot of crap
so the rest of us never have to.

CATHY EDWARDS:
Broadly, spam is what

we consider a low-quality
page that is artificially

boosted in our results.

NARRATOR: She's
talking about pages

that use AI-generated nonsense


text, hidden keywords,

and hijacked URLs to trick


their way into people's

Search results, pages


like fastcashonline.org,

topicalarticles.info,
the kind of websites

that, when you end


up on them, you

hit the Back button as


quickly as possible,

because they're [BLEEP].

Because they're spam.

DAVID BESBRIS: There's a


wide variety of motivations

why people do this.

Sometimes it's
commercial interests.

CATHY EDWARDS:
Spam, where they're

trying to sell things that are


a little bit dubious, right?

Or sometimes it can
just be to capture

more of the user's clicks.

And that's not right.

That site is not getting


those links organically.
It dilutes the value
of that signal.

It makes it even
harder for us and it

makes it harder for users


to find great information.

DAVID BESBRIS: It's a


very, very hard problem,

because people on the other side


are very motivated to succeed.

And they're smart, too.

And they have resources, and


they're working on it also.

We solve one part of


it, and they adapt

and they do something else.

CATHY EDWARDS: And that's


the reason that we keep

Google's Search algorithm a


very closely guarded secret,

recipe-for-Coke-level
guarded secret.

DAVID BESBRIS: Because if we


talk about our Search signals too much, then


people will manipulate them.

And that breaks Search entirely.

Fighting spam is a
cat and mouse game.

It's not something that I


think will ever be solvable.

CATHY EDWARDS: As
an example, 40%

of pages that we crawled


in the last year in Europe

were spam pages.

This is a war that we're


fighting, basically.
NARRATOR: So yeah, people
at Google hate spam,

which is one of the reasons


they're always making changes

to Search, to keep spam


out of your results

and to keep high-quality


information in.

[MUSIC PLAYING]

BEN GOMES: OK, so you've got the


Search engine and it's working.

And by all accounts,


it's working

better than any other search


engine has worked before.

And every day, you see


millions of queries.

And clearly users are happy.

But as an engineer,
you ask yourself,

how can I make this better?

You see many ways in which


we are still failing.

And you see a ton of opportunity


for us to make it even better.

And over a period of


time, the developments

we've made in the Search Engine


have had a dramatic impact

on how well it actually


works for users.

PANDU NAYAK: No,


no, I don't think

we had that particular problem.

Even though we've launched


a whole series of changes
over the years
that have, I think,

meaningfully and materially


improved the Search result

sets, I'm here to tell


you that Search is far

from a solved problem.

In fact--

BEN GOMES: There's


actually no end in sight,

in terms of when this


will actually be solved.

Because the world


keeps evolving.

We're coming up
with new devices.

We're coming up with new ways


of interacting with information.

We're coming up with


new information sources,

like videos and so on, that are


adding in new opportunities, as

well as new challenges.

CATHY EDWARDS: The content


on the web has changed.

Users have changed what


they're searching for

and how they search.

For example, 15%


of the queries--

PANDU NAYAK: 15%


of the queries--

BEN GOMES: 15% of queries


we see every day--

CATHY EDWARDS: --we


have never seen before.

That's just going


to keep happening,
and we're going to need to
constantly evolve to keep up.

It's a little bit like the Red


Queen says to Alice in "Alice

in Wonderland," you need


to run as fast as you

can to stay where you are.

SPEAKER: We're going


to add some friction.

We don't actually think


we have good results.

The idea is to add friction for


the worst of the worst results

to start with.

CATHY EDWARDS: We change the


Search algorithm, on average,

six times per day.

It's actually really frequent.

However, to get to those


six launches per day,

roughly a couple of
thousand launches in a year,

we're doing 200,000 to


300,000 experiments.

So the vast majority of changes


that we think about making,

that we might try,


actually fail.

PANDU NAYAK: Imagine you have


a smart engineer on the team,

and they come to


you and say, I've

got this great idea on


how to improve Search.

And you talk to the


engineer, and they come back
a little while later and
say, OK, I've got the change,

can I launch it?

And you're like, no,


you can't launch it.

You've got to prove that


this is actually good.

NARRATOR: Proof comes from data.

Data comes from experiments,


side-by-side tests where

results from the current


version of Google Search

are compared to the


proposed version.

If the proposed version gives


better quality results--

AKA, links to better


quality websites--

then it gets closer to being


put into production, which

is a fancy way of saying,


actually in use by people

around the world.
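The experiment pipeline the narrator describes can be caricatured as a tally of side-by-side verdicts. The `should_launch` helper and its 55% threshold are invented stand-ins for the real statistical analysis:

```python
def should_launch(ratings, margin=0.55):
    """Given side-by-side verdicts ("A" = current system better,
    "B" = proposed change better, "tie"), approve the launch only
    if the proposed version wins a clear majority of the non-tie
    comparisons. The threshold is a made-up simplification."""
    a = ratings.count("A")
    b = ratings.count("B")
    if a + b == 0:
        return False  # all ties: no evidence the change helps
    return b / (a + b) >= margin

print(should_launch(["B", "B", "A", "B", "tie", "B"]))  # True: B wins 4 of 5
print(should_launch(["A", "A", "B", "tie"]))            # False: B loses
```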

Which brings up a question.

Who decides what makes a


better quality website?

RAMI BANNA: Those


people that we asked

the question of, which


is better, A or B,

are known as Search


Quality Raters.

NICK FOX: The people at


Google aren't deciding what's

a good result from a bad


result. The people at Google

aren't determining what results


to show for any given query.
But rather, the
Raters are basically

teaching our computers


what's good and what's bad.

Is this a high-quality result?

Is this a low-quality result?

CATHY EDWARDS: And


they are trained

on what are called


our Rater Guidelines.

NARRATOR: The Search


Quality Evaluator Guidelines

are a 168-page document


establishing what makes

a good Search result good.

We're talking about


websites exhibiting

expertise, authoritativeness,
and trustworthiness.

These words are given


clear, detailed definitions

so the thousands of
independent evaluators keeping

an eye on Search know


what they're looking for.

Want your website to


show up higher in Search?

Read the guidelines-- seriously.

They're publicly available,


and the more people

that read them, the better the


web could be, for everybody.

All right, let's


get back to Ben.

BEN GOMES: Making


changes to Search

is a bit of a balancing act.

There are many


different things you're

trying to balance together--

quality, freshness,
relevance, but we also

have to balance the performance.

Some ideas may be


really good, but they

may result in Search


that takes a lot longer.

So we have to be
careful that we are not

making Search slower in


the process of giving you

slightly better results.

SUNDEEP TIRUMALAREDDY:
In some ways,

the key innovation worked.

BEN GOMES: And


what about latency?

Does this introduce


new latency or--?

SPEAKER: The distilled model's


pretty quick [INAUDIBLE]

SUNDEEP TIRUMALAREDDY: Yeah,


I think 10 milliseconds or so.

BEN GOMES: It seems like


a reasonable trade-off

for this level of win.

JEFF DEAN: From-- when


we first started really,

we were focused on how can we


make Search run very fast so we

respond more quickly with


better results to more

people every day, every week.

INTERVIEWER: I did a
search a couple days ago,
a complicated thing,
three-hundredths of a second.

I mean, it seems
inconceivable you

can do all that that quickly.

RAMI BANNA: We are about


finding the world's information

and bringing it to your


fingertips the second

you ask it--

in fact, less than 0.5 seconds.

BEN GOMES: It seems


incredibly difficult,

and yet that's an area that


works reliably 24 hours a day,

365 days a year,


around the world.

But how are you going to look up


an index that goes to the moon

and back several times in


a fraction of a second?

NARRATOR: I don't know, Ben.

Maybe we should ask the expert.

This guy you saw earlier,


this is Urs Holzle.

He manages the technical


infrastructure at Google.

This is Urs Holzle


in 1999, when he also

managed the technical


infrastructure at Google.

URS HOLZLE: My first


business card actually

said Search Engine


Mechanic because my job was

fixing things that were broken.

And the problem was hard,


because really everything
was broken, and it
was just about fixing

the thing that's most broken.

To people sometimes,
the internet

seems kind of like it's nowhere.

I'm using my phone, and


then here's wireless,

and I don't really see anything.

But when it comes to a Search


engine, when it comes to a data

center, these are really


physical, big machines,

so to speak.

A data center actually is


conceptually very simple.

It's a building with


lots and lots of servers.

And that's really it.

So in Dublin, we have one


of the data center campuses.

It's actually one


of the smaller ones.

PETRA: I think it's the


smallest data center we have.

JAMES: We're considered the


baby data center of the fleet.

We're--

DANIEL: --quite small.

KEVIN: Quite the snowflake.

PHILLIP: Actually,
this is quite big.

For any other company,


this is bewildering.

This is just [? not a ?] thing.

NARRATOR: This is Phillip,


Kevin, James, Daniel,

Petra, and the crew


we hired to film them.

And this is where they


and all their coworkers

work, the Google Data


Center in Dublin, Ireland.

PHILLIP: The scale of what we


do here can be kind of crazy.

PETRA: [INAUDIBLE]
searches a day

goes through those machines.

That's why they're very loud


and they produce lots of heat.

That means they're constantly


working, constantly

answering your queries.

URS HOLZLE: And so how do


we really store the web,

so to speak?

The way to think about it


is, we take the internet,

download it, index it, and


chop it into small pieces.

And then each server


has a small piece.

All of the servers


for that data center

work together to each


search their little part

of the internet.
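That sharding scheme can be sketched as a fan-out search: every shard looks up the word in its own slice of the index, and the per-shard hits are merged into one ranked list. The shard contents and scores here are made up:

```python
# Each "shard" holds its own slice of the index (word -> scored pages).
shards = [
    {"lasagna": [("recipes.example/lasagna", 0.9)]},
    {"lasagna": [("nutrition.example/lasagna", 0.7)],
     "tides": [("tides.example", 0.8)]},
    {"lasagna": [("forum.example/lasagna-tips", 0.4)]},
]

def search(word):
    """Fan the query out to every shard, then merge and re-rank the hits.
    In production the fan-out is parallel RPCs, not a loop."""
    hits = []
    for shard in shards:
        hits.extend(shard.get(word, []))
    return sorted(hits, key=lambda h: h[1], reverse=True)

print([url for url, _ in search("lasagna")])
```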

RAMI BANNA: And


it literally takes

millions of servers
and hard drives

to be able to support
the world's websites.
URS HOLZLE: So each
of these data centers

has a complete copy of the web.

RAMI BANNA: So if you're


in France or if you're

in South Africa,
you're not sending

a query that goes through


the wires, underwater cables,

and comes to Mountain


View, asks that question,

and we send it back.

That's just not possible.

That's never going to work


as a solution that's fast.

URS HOLZLE: How it actually


works is, if you go into Google

and you type in


a search, then we

direct your query to the


data center that is closest.

And so that's
actually the reason

why we have data


centers everywhere,

because we want to be close to


the users that we're serving.
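Picking the closest data center can be sketched as a nearest-neighbor lookup over great-circle distances. The coordinates below are hypothetical, and real routing also weighs load and network topology, not just straight-line distance:

```python
import math

def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points, in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

# Hypothetical data-center locations.
data_centers = {
    "dublin": (53.35, -6.26),
    "council-bluffs": (41.26, -95.86),
    "singapore": (1.35, 103.82),
}

def route(user_latlon):
    """Send the query to whichever data center is geographically closest."""
    return min(data_centers, key=lambda dc: haversine_km(user_latlon, data_centers[dc]))

print(route((48.85, 2.35)))  # a user in Paris lands on the Dublin campus
```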

RAMI BANNA: Because


that's the only way

to get you the most accurate


response as fast as possible.

CREW: So there's a lot of


expensive equipment here, huh?

KEVIN: Yeah.

CREW: How does that


all get paid for?
KEVIN: I have
absolutely no idea.

I guess it's from advertising.

JAMES: Ads keeps the lights


on and probably puts gas

in my car at the end of the day.

CREW: All right, yeah,


I think we might have

to talk about ads a bit here.

Any last thoughts before we cut?

JAMES: Keep it sweet.

NARRATOR: All right, ads.

Why are there ads?

Two reasons-- one, ads keep


Search universally accessible,

no paywalls, no subscriptions, no "you've used your last credit, want to buy a 50-pack?" Just search that's free for everyone.

And two, ads help people


who want to buy a thing

find people who sell that thing.

Like Bart here--

BART: Hi.

NARRATOR: --and his employees.

ALL: Hi!

NARRATOR: At Carr Hardware


in Pittsfield, Massachusetts.

BART: Yep, we sell 38,000 items.

NARRATOR: Like weed whackers,


tack hammers, wrenches,

and M10 metric castle nuts.

MARIE: I think the only thing


we don't sell is milk and bread.

NARRATOR: Bart
buys ads on Google

that only get shown when


someone near their town--

BART: Pittsfield!

NARRATOR: --searches,
for instance,

"lawn mower dealers near me."

And Google only gets paid if


the person doing the search,

maybe your neighbor or


your brother-in-law,

clicks on Bart's ad, which


is always labeled "Ad."

It helps people
find mowers to buy,

and it helps Bart and


the store get business.

BART: Have a nice day.

NARRATOR: And it helps


pay for all the stuff that

keeps Search and Maps and


Docs working and free.

That's why there are ads.

PANDU NAYAK: Since


I've been at Google

and worked on Search


for the last 14 years,

I have to say that no one,


absolutely no one, comes to me

and says, you know,


I did this search

and the results were great.

Nobody says this.

They only call to complain


that they did something

and it didn't work.

NARRATOR: And the


name of the man who's

been collecting
Google's dumbest Search

mistakes for the last 14 years?

[CHEERING]

Senior Software
Engineer Eric Lehman.

CREW: Eric L, take 1, mark it.

ERIC LEHMAN: Over


the years, I've

been gathering some of


my favorite bloopers.

I'll walk you through


some of those.

So how far from the coast


is Cambridge, Massachusetts?

It's actually a little


over 3,000 miles

from the West Coast.

How many calories in


330 tons of butter?

So this caused an
overflow error,

and we said about


minus 2 billion.

Mm-hmm.
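The butter blooper is a classic signed 32-bit overflow: the true answer (roughly 2.4 billion calories, assuming about 717 kcal per 100 g of butter and metric tons) exceeds the largest 32-bit signed integer, 2,147,483,647, so the value wraps around to a large negative number. A quick simulation of two's-complement wraparound:

```python
INT32_MAX = 2**31 - 1

def to_int32(n):
    """Wrap an integer the way a 32-bit two's-complement register would."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n > INT32_MAX else n

# Assumptions: ~717 kcal per 100 g of butter, 330 metric tons = 330,000 kg.
calories = 330_000 * 7_170
print(calories)            # 2,366,100,000 -- too big for int32
print(to_int32(calories))  # wraps to about minus 2 billion
```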

What color is green?

That's a tough one.

Blue?

Sure.

For the search "meat


nutrition facts,"
we brought up all kinds
of detailed information.

I think it's quite good.

The query's a little


ambiguous because it

didn't say what kind of meat.

And so the system


chose roasted muskrat.

[LAUGHS] Yeah.

Avogadro's number is a
sort of important constant

in chemistry.

It's also, apparently,


the name of a restaurant.

And so we've given a lot


of chemistry students

their phone number.

Is that what you


were shooting for?

CREW: Yes, yes, that's perfect.

NARRATOR: Since you


started watching,

people have done over


100 million searches,

enough results to
fill 27 libraries,

but none as cool as this one.

This is the Weston Library


on Oxford's campus.

Two buildings down,


you'll find the office

of Dr. John-Paul Ghobrial,


a professor of Early

Modern History.

He specializes in the history


of information and archives.
Suffice to say, he's an
expert on this stuff.

JOHN-PAUL GHOBRIAL:
It used to be,

before, say, the


16th or 17th century,

that if you were reading a


manuscript copied by someone,

perhaps someone you knew,


perhaps someone who you didn't

know but they were recommended


to you by someone else,

you could have a certain trust


that the text you are reading

was stable, was


authoritative, was right.

Printing changes all of this.

Sure, printed word


can flow everywhere.

But that worried lots of people.

Because for example, if we don't


know who printed it, well then,

what should we think


about this information?

If there's an error
in the printed word,

then everyone will get it wrong.

So we look now actually


at the print revolution,

which we used to think about


almost in a celebratory way,

and we think now that actually


the anxieties that people had

about print in many ways


paralleled the anxieties

that people have


today about fake news,

about origins of information.


NICK FOX: Google Search is
an index on what exists.

And so if that
content is out there,

sometimes we can surface it.

That can present


results that are

accurate when it comes to the


content of the web out there,

but not accurate in terms of


what the truth actually is.

But that can result


in some, what

I would consider to be
reprehensible or really

offensive results.

[MUSIC PLAYING]

BEN GOMES: A few years ago,


people were pointing out that,

for some queries, like,


"did the Holocaust happen,"

we were giving people


results that had the words

and were on the topic, but


were from low-quality sites.

And we viewed this as a


pretty profound failure.

PANDU NAYAK: This is clearly


bad because this is clearly

a case of misinformation,
because the Holocaust did

actually occur.

And so then we wanted


to understand why

it is that this was happening.


BEN GOMES: So we take a
very algorithmic approach.

We did not go in and


say, oh, for this query,

we've got to change the results.

PANDU NAYAK: The


fundamental reason

for that is, every problem that


is reported to us like this

is usually the tip


of the iceberg.

And it's usually


just a representation

of a whole class of problems,


in this case problems

of misinformation.

And just solving the specific


problem that was reported to us

does not solve the large


iceberg of problems

that were not reported to us.

FEDE LEBRON: Part of the reason


why we were all in Search

is because we want to give


good results to users.

We want to make
their lives better

by giving them good information.

This was contrary to


everything that we

wanted as employees in Search,


in a very egregious sense.

It wasn't just a misspelling or something like that.

MEG AYCINENA LIPPOW:


Every query is

going to have some


notion of relevance
and each one's going to
have some notion of quality.

And we're constantly trying to


trade off which set of results

balances those to the best.

SPEAKER: That's a good question.

MEG AYCINENA LIPPOW: But


if you type in the query,

"did the Holocaust happen,"


higher quality web pages

may not really bother


to explicitly say

that the Holocaust did happen.

They're talking
about the Holocaust

and taking for granted the fact


that we, as informed citizens,

are aware that the


Holocaust happened,

because we learned about


it in school and so on.

And so the only kinds of


websites that are actually

going to have the combination


of terms that seem to closely

match a query like that might


be ones which in fact say,

no, the Holocaust


didn't actually happen,

it's all a big hoax.

Those results are not


the high-quality results.

They tend to be
lower quality even

though they're more relevant.

And so what was happening on


the "did the Holocaust happen"

type of queries is that


the relevance signals

were overpowering
the quality signals

to a degree that was resulting


in low-quality results

for users.
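The tradeoff Meg describes can be sketched as a weighted combination of per-document signals. This is a toy illustration only, not Google's actual ranking function; the document names, scores, and weights below are all invented.

```python
# Toy model: rank documents by a weighted sum of a relevance signal
# and a quality signal. All numbers here are invented for illustration;
# real ranking systems combine many more signals in far more complex ways.

def rank(docs, w_relevance, w_quality):
    """Order documents by a weighted sum of two signals, best first."""
    return sorted(
        docs,
        key=lambda d: w_relevance * d["relevance"] + w_quality * d["quality"],
        reverse=True,
    )

docs = [
    # A denial site matches the query terms closely but is low quality.
    {"name": "denial-site", "relevance": 0.9, "quality": 0.1},
    # An encyclopedia article is high quality but matches less directly.
    {"name": "encyclopedia", "relevance": 0.5, "quality": 0.9},
]

# When relevance dominates, the low-quality page wins...
print([d["name"] for d in rank(docs, w_relevance=1.0, w_quality=0.2)])
# → ['denial-site', 'encyclopedia']

# ...but weighting quality more heavily surfaces the trustworthy source.
print([d["name"] for d in rank(docs, w_relevance=1.0, w_quality=1.0)])
# → ['encyclopedia', 'denial-site']
```

In this toy model, re-weighting the two signals is exactly the kind of change described next: emphasizing authority for classes of queries where term matching alone misleads.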

PANDU NAYAK: We
have long recognized

that there's a certain


class of queries,

like medical queries,


like finance queries,

in all of these cases,


authoritative sources are

incredibly important.

And so we emphasize expertise


over relevance in those cases.

So we try to get you results


from authoritative sources

in a more significant way.

MEG AYCINENA LIPPOW:


And by authoritative, we

mean that it comes from


trustworthy sources,

that the sources


themselves are reputable,

that they are upfront


about who they are,

where the information


has come from,

that they themselves


are citing sources.

PANDU NAYAK: And


so the change we

have made in the case


of misinformation

is to change the ranking


function to emphasize authority
a lot more, and this has
made all the difference.

SPEAKER: Actually, not these.

NARRATOR: Misinformation is
one of the challenges that

comes with helping people


find what they're looking for.

But it's not the only one.

Launched in 2010, the


Autocomplete feature

has saved millions of


hours of people's time

by guessing what
they're searching

for before they finish typing.

But when those guesses


have been wrong,

it's led to some pretty


disturbing predictions.

REESE PECOT: A few years back,


we started hearing from people

that sometimes folks were


typing things into Autocomplete

and they would be shocked


by some of the predictions

that they were getting.

Autocomplete was
designed to help people

complete their searches faster.

Instead, we were actually


returning them information

that they weren't searching for.

When we provide you with


something that's shocking,
that's not relevant, we've
really at that point not

lived up to our core principles.

PANDU NAYAK: I think I and


all the members of the team

felt a deep personal


responsibility

to try and develop the


systems to minimize

these kinds of occurrences


as much as possible.

First, we developed
a set of policies

that say what kind


of predictions

that we would not want


to offer to users.

REESE PECOT: Things


like violent content,

sexually explicit
content, hate speech.

But we also publish


those policies.

That way people can


see where we stand,

and then that gives us


some accountability.

PANDU NAYAK: With these


Autocomplete algorithms,

we try not to
surface predictions

that violate the policies.

Now, these algorithms are


very good at what they do,

but they're not perfect.

And every so often, we'll


get some predictions
that in fact violate them.

REESE PECOT: So you


can report if you've

seen a prediction that


violates those policies.

And every day we get


flags from our users

out there to tell


us where we might be

seeing problems in the product.

PANDU NAYAK: We
use those reports

to improve our algorithms to try


and see whether we can address

the whole class of problems


that the report might

be just pointing towards.

But one thing that I


would like to emphasize

is that this in no
way prevents users

from searching for whatever


it is that they want.

They're absolutely
free to do that.

NARRATOR: Think
about it this way.

Search is like a door


that leads to the web.

With Autocomplete,
it's the kind of door

that senses you walking


towards it and opens for you.

But if you're typing a query


that violates its policies,

the automatic part stops.

The content of the web


is still behind the door,
but you won't see
any results until you

complete the query yourself.
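The door metaphor can be sketched as a filter over candidate predictions. This is a minimal toy sketch, not Google's system: the candidate list, the placeholder policy terms, and the keyword check are all invented, and real systems use trained classifiers rather than keyword lists.

```python
# Toy sketch of policy filtering for autocomplete predictions.
# The terms below are hypothetical placeholders for illustration only.

POLICY_VIOLATING_TERMS = {"violentword", "hateword"}  # invented placeholders

def violates_policy(prediction):
    """Crude stand-in for a policy check; real systems use classifiers."""
    return any(term in prediction for term in POLICY_VIOLATING_TERMS)

def complete(prefix, candidates, max_results=3):
    """Return top predictions for a prefix, skipping policy violations."""
    matches = [c for c in candidates if c.startswith(prefix)]
    return [c for c in matches if not violates_policy(c)][:max_results]

candidates = ["weather today", "weather tomorrow", "weather hateword"]
print(complete("weather", candidates))
# → ['weather today', 'weather tomorrow']
```

Note that only the predictions are filtered: in this sketch, as in the product described above, nothing stops a user from typing and submitting the full query themselves.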

NICK FOX: Search isn't perfect.

We do make mistakes.

We make more mistakes than


we would like to make.

But we need to learn from them.

We need to get better.

And we need to continue


to improve to avoid

those cases in the future.

Each time that


something happens where

we become aware of a bad


result, we use that as learning.

We use all that


feedback to continue

to improve it and make sure that


Google one day from now, five

days from now, 10 days from


now, 10 years from now,

is continuing to get better.

BEN GOMES: Many


people tend to think

that Search is really easy.

You type in a few words,


you get a few documents,

and the process feels very easy.

And in many ways, that's


what we want to achieve.

We want Search to be
very easy for people.
But behind that is an extremely
hard technical problem

of actually understanding what


people mean when they type

in a query, not
just matching words,

but actually understanding


language much better

over time so that we


can match the thing

you asked to the concept


that you were really

looking for in the documents,


and we can bring these two

things together.

It's an absolutely
fascinating problem

to work on, because it lies at


the frontiers of what computers

and computer science can


do and our understanding

of basic aspects of how we


wish to interact with computers

as human beings.

NARRATOR: As long as
there have been machines,

humans have tried to get


those machines to do more.

Of course, for most of


history, the machines

couldn't speak human.

So humans had to
come up with new ways

to tell machines what to do.

Joseph Jacquard used cards


with holes punched in them

to tell his loom, put the


thread here and here and here.

It made weaving complex


patterns easier.

Punch cards were a big idea.

They're how early computers


took instruction, did math,

solved equations.

NARRATOR 2: Holes
punched in the card

represent data to be
placed in the computer.

NARRATOR: Then computers


got screens and keyboards.

But you still couldn't talk to


it like you'd talk to a human.

You had to write it in code.

C colon, slash carat


smartdrv dot exe.

Once Search came along,


things got a little easier.

You just put in the words


you were looking for

and Google came


back with websites.

But you were still


writing in code--

"ice cream shop


27705," when really you

meant, "where can I get


some ice cream around here?"

BEN GOMES: As we
understand language better,

you should be able to


ask a question in a much

more natural way.

[CHEERING]

NARRATOR: What time


is tonight's match on?
Who do I call for a
tow truck around here?

Does anyone make a nail


polish that's safe for dogs?

BEN GOMES: So rather than you


having to craft keyword-ese

that the search


engine can understand,

we want to be able
to understand what

you had in mind in


the most natural way

you can express


it so that we can

satisfy that information


need with information that we

have available.

NARRATOR: We call this problem


natural language processing.

BEN GOMES: So where


are we in the space

of solving this problem?

[MUSIC PLAYING]

I think we've come a long way,


but the journey's so long,

it's very hard to see


where it ends, right?

I mean, we began to
work on this problem

19 years ago with a system


that I worked on called Spelling

Correction.

We got to beyond that to


understanding synonyms

and how words are


related to each other.

But to go deeper, we needed


a different approach.
Google has been doing research
in something called machine

learning for almost a decade.

And Geoff Hinton was at


the forefront of that.

[APPLAUSE]

HOST: Please welcome Geoffrey


Hinton, the engineering fellow

at Google.

NEWSCASTER: When Geoffrey


Hinton began work in the 1970s,

people said artificial


intelligence was

the stuff of science fiction.

Today, he is
revolutionizing how we live.

BEN GOMES: Geoff Hinton


combined forces with Jeff Dean

at some point, and we began to


see these huge breakthroughs

in machine learning.

JEFF DEAN: If you look at


the last, say, 8 or 10 years,

machine learning has gone from


a small part of overall computer

science research to something


that is now affecting

many, many fields of endeavor.

BEN GOMES: And we realized


this could pay off

in a big way in helping


us do search better.

INTERVIEWER: What
kind of impact do

you hope deep learning


has on our future?

GEOFFREY HINTON: I
hope that it allows

Google to read documents and


understand what they say,

and so return much better


search results to you.

[CHEERING]

NARRATOR: A few years


later, a new development

in natural language
processing was announced.

They called it--

JEFF DEAN: Bi-directional


Encoder Representations from

Transformers--

it's a bit of a mouthful,


so we just call it BERT.

Research like this gets us


closer to technology that

can truly understand language.

NARRATOR: So BERT's a
big deal for Search.

At least it could be,


which brings us back

to this team from earlier.

It's going to be up to them--

Elizabeth, Jingcao, Sundeep,


Eric, and a few other folks,

to figure out how to get


BERT working in Search.

They named their


project DeepRank

after the deep


learning methods used

by BERT and the ranking


aspect of Search.

And also because it sounds cool.


SPEAKER: It's cool.

[MUSIC PLAYING]

ELIZABETH TUCKER: So
I think we're finally

getting going here.

One of the things


that we can do today

is talk through some


of the new evals.

When I first joined


the project, I

got really, really


excited thinking,

this system is doing


something pretty special

that most of our other systems


in Search probably can't do.

JINGCAO HU: We are still


at the very early stage

of building such a system that


truly understands human beings.

But this project is


very unique in the sense

that this is the


first time for Search

we have a signal which


understands the relationship

between different terms.

SUNDEEP TIRUMALAREDDY:
That's why

we are very excited


about DeepRank

because we are hoping that


this could help us make Google

Search more intuitive


to use and make

it feel like Google Search


actually understands our users.
ERIC LEHMAN: --is the
most ambiguous wording.

So people use
language every day.

We don't even really think about


how we put sentences together.

It's just a tremendously


subtle thing.

Some slight changes of


wording can change the meaning

of what we're saying.

And it's very hard to write


a computer program that

captures all of that subtlety.

So it's actually
sort of interesting.

Early on in information
retrieval, which

is the science
behind Search, people

would tend to just give


up on these things.

So like a lot of
little connector words,

they'd simply ignore them.

They call them stop words.

They'd just throw them out.

I think we've learned over


time that those words often

have an important role in


communicating what we're trying

to say, communicating an idea.

And so through machine


learning systems like DeepRank,

we hope to pick up on these


subtleties of language

that humans get so naturally


but are so difficult to program.

So hopefully people will


be able to phrase Search

queries in a more
natural way for humans

and not suffer from this


problem that machines

don't get the subtleties.
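The stop-word problem Eric describes can be seen in a toy sketch. The stop-word list here is a small invented sample (real retrieval systems used longer curated lists), but it shows how discarding connector words collapses queries with different meanings into the same bag of terms.

```python
# Illustration of why throwing out "stop words" loses meaning.
# The stop-word list is a small invented sample for illustration.

STOP_WORDS = {"can", "you", "get", "for", "to", "from", "a", "the"}

def strip_stop_words(query):
    """Drop stop words, as early IR systems did."""
    return [w for w in query.lower().split() if w not in STOP_WORDS]

# Two queries with opposite intents...
q1 = "flights from boston to new york"
q2 = "flights to boston from new york"

# ...reduce to identical term lists once "to" and "from" are discarded,
# so a pure term-matching system cannot tell them apart.
print(strip_stop_words(q1))  # → ['flights', 'boston', 'new', 'york']
print(strip_stop_words(q2))  # → ['flights', 'boston', 'new', 'york']
```

A system that models whole sentences, as DeepRank aims to, keeps those connector words and can distinguish the two intents.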

NARRATOR: Eric makes it all


sound pretty straightforward.

But actually getting BERT


to play nicely with Search,

it's not going to be easy.

[MUSIC PLAYING]

SPEAKER: These all


look like the queries

where we would expect to


see wins from DeepRank,

like the longer


natural language.

ELIZABETH TUCKER: I
would have guessed that--

NARRATOR: The team starts


by testing their theories.

Months go by.

Progress is slow.

PANDU NAYAK: And it's


not trying to make

a distinction in
that rank, so I'm

just not that thrilled


with this part of it.

With change that is so


positive and so powerful,

there is a tendency to feel


like, oh, we should just get it

out there as soon as possible.


And so you have to temper
that with some pragmatism.

If this is where your


IS win is coming from,

that's not so thrilling.

Let's put it that way.

NARRATOR: For each


result that gets better,

others are getting worse.

SPEAKER: Single term queries


are also way more negative.

When we don't know what we're


doing, we're doing great.

NARRATOR: Each failure


requires a new test.

Each test requires rewriting


big chunks of code.

They don't have all


the time in the world.

Even just experimenting


with a system based on BERT

takes thousands of servers,


crunching quadrillions

of numbers.

ERIC LEHMAN: So DeepRank


needs an enormous amount

of computing power.

Google has tremendous resources.

But even by Google's


standards, this is a lot.

We have enough TPUs to


launch DeepRank, but barely.

NARRATOR: If they don't


show progress soon,

the resources will


go to some other team

with a more promising idea.


PANDU NAYAK: It'll all hinge on
getting a strong quality rank.

Let's put it that way.

ELIZABETH TUCKER: We can get--

PANDU NAYAK: If
we don't get that,

then we're not


getting the resources.

NARRATOR: Time is running out.

ELIZABETH TUCKER: I
would say, in general,

on many of the examples I see


when we have optionalization

on both sides, this is


actually someplace where

DeepRank typically does better.

But if once we mix


in the localness--

So we have these
high-level measurements

that we do to say whether


something is good or not.

Because if something's not good


for people searching on Google,

we are not going to launch


it, period, no matter

how great the technology is.

So this was the week where


we saw some really nice

experimental results.

And that was so reassuring.

I would like us to
go through some wins.

So one of my favorites is, what


temperature should you preheat

your oven to when cooking fish?


I was kind of fascinated
with this one.

ERIC LEHMAN: It
is a tough query.

Holy cow.

That's really, really nice.

NARRATOR: Here's what


they're so excited about.

Without DeepRank, the


Google search algorithms

were surfacing some good


information about cooking fish,

but they were also getting


confused, showcasing

a recipe for baking cookies.

When DeepRank was


tested on this query,

it understood that the


result was about cookies,

reducing the prominence


of the incorrect recipe,

and instead elevating


useful, relevant information

about cooking fish.

These are the kinds


of wins the team will

need to see more of if


they want their project

to launch and start improving


search results for billions

of people around the world.

ELIZABETH TUCKER: However,


before we can launch,

we need to get launch approval.

It's a formal process


where any change to Search
gets a lot of scrutiny.

Hi, guys.

So I'm feeling a little


pressure to like--

I don't know.

TULSEE DOSHI: Yeah,


Launch Committee.

[LAUGHS]

So Launch Committee is
essentially the final review

before you actually choose


to launch a project.

ERIC LEHMAN: I mean, I feel like


that we've seen that pattern.

TULSEE DOSHI: So when you


go to Launch Committee,

you're essentially
saying, hey, we

have a project that


we've built. We

have all this data that we think


shows that it's a good thing.

And now we're getting


approval to actually put it

into production.

[MUSIC PLAYING]

ERIC LEHMAN: There's always


a little bit of anxiety,

because the outcomes


of these meetings

are really important to people.

People have put a lot


of work into them.

And to have a change rejected


is pretty dispiriting.

JINGCAO HU: Before


the meeting, I
always feel like there are
things that I forgot to catch.

So I was going over the


launch report again trying

to see if there was


anything I'm missing.

There's a lot of


stress, but also hope.

Like OK, no matter what, we will


have some reasonable feedback

from the launch discussion.

It may be over, or
it may be approved

and then we can launch it.

Regardless, it's
a big milestone.

NARRATOR: Jingcao has


every right to be nervous.

Around here, Launch Committee is


known for killing experiments.

Because despite
their best intentions,

despite the months of


work that went into them,

most experiments never make


it out of the building.

CATHY EDWARDS: If you talk


to the average engineer,

they will have their share


of war stories of moments

that have been incredibly


frustrating for them.

But the flip side of


that is, there's not

many products that are more


impactful than Google Search.

So when you can ship


something that's really great,
it's really an amazing feeling.

ELIZABETH TUCKER: All


right, are we ready?

So we are here to get launch


approval for DeepRank.

DAVID BESBRIS: Launch committee


is the meeting where we all

get together, look


at the metrics

and argue with each other.

[INTERPOSING VOICES]

PANDU NAYAK: That's not


what this is saying.

This is saying, when site


diversity increases, the--

DAVID BESBRIS:
Generally speaking,

the engineers don't


present their own work.

SPEAKER: So let's take a look


at the launch metrics.

DAVID BESBRIS: They're


there often for context

and to answer questions.

But your work is


presented by an analyst,

because we want the analyst to


be an impartial third party.

Because it can be
a little tough.

ELIZABETH TUCKER: There is


a slight issue in the way

the metrics are calculated.

PANDU NAYAK: It's


important to realize

that most of the changes


we make in Search
are not ones that are 100% good.

There are always


wins and losses.

BEN GOMES: There


is only one thing

that is [INAUDIBLE] positive


but is not out of the noise.

PANDU NAYAK: Actually,


the one that I

think is particularly
worth looking at

is the long tail aspect, right?

BEN GOMES: Yeah,


let's look at that.

PANDU NAYAK: So
one of the things

that the Launch


Committee is doing

is to weigh these
wins and losses.

BEN GOMES: Wow.

ELIZABETH TUCKER: It's pretty


clear from the wins and losses

there are some interesting


relationship understandings

going on in here.

However--

PANDU NAYAK:
DeepRank illustrates

some really nice wins we


get from understanding

language and the


nuance of language.

SPEAKER: This is my favorite win.

"Can you get medicine


for someone pharmacy."

It's a very beautiful


natural language one.

You see--

SPEAKER: Yeah, it's an


important question, right?

Can you pick up medicine


for somebody else?

SPEAKER: This is wonderful.

SPEAKER: And DeepRank brings


up this very relevant,

very specific result.

SPEAKER: You can imagine


why this happened.

Because before, all those words


like "for" and maybe "you"

and "get," they're all stop


words, largely ignored.

And now, because of


BERT, it actually

understands that those are


very important to [INAUDIBLE]..

SPEAKER: Yeah, yeah,


but "for someone"

is a really hard
concept to get in IR.

PANDU NAYAK: We saw some


wins that were really, really

beautiful in various ways.

BEN GOMES: So point


two is, I think,

one of the biggest changes


we have seen in a long time.

Because you're getting to more


semantics and all over here

when you're ranking.

PANDU NAYAK: When


that's all you have.
You don't have other
signals, right?

And so this is
where it can excel.

BEN GOMES: All right, this


seems like a great launch.

Really excited about this.

DAVID BESBRIS:
When it's all done,

the coordinator of
the Launch Meeting

just changes a field


in a spreadsheet,

changes it from blank to Yes.

It's a very momentous occasion.

SPEAKER: Approved-- we'll mark


this as Search [? Leads ?]

Flagged, I'm guessing?

[LAUGHTER]

ERIC LEHMAN: This was a very


positive launch meeting.

The decision is to
launch DeepRank.

ELIZABETH TUCKER: I thought


I wasn't feeling nervous.

But when the moment


came, it felt

so good to get that approval.

JINGCAO HU: [LAUGHS]

[SIGHS]

ELIZABETH TUCKER: Thanks, guys.

SPEAKER: Awesome.

SPEAKER: Pretty darn cool.

ERIC LEHMAN: Yeah,


so after a launch,
you might imagine there's
some great big celebration.

More typically, people stand


around the meeting room

a little awkwardly
for a few minutes,

and say, hey, good job.

And then they nervously


shuffle back to their desks

and try to catch up on life.

And probably
that'll happen here.

Maybe we'll do something a


little bit more in this case.

It was a pretty
remarkable project.

ELIZABETH TUCKER:
Congratulations.

NARRATOR: In the moment,


this approval feels big.

It feels significant.

But in the grand


scheme of things,

it's just another step


forward, an improvement,

just like all the


others that came

before it, that helps make


Search a little bit more useful

than it was yesterday.

ELIZABETH TUCKER: We
will work on that.

[LAUGHTER]

I think there was a


promise there of something.

[MUSIC PLAYING]
PANDU NAYAK: Solving
the Search problem

is not easy, that's for sure.

We've been at it for


20 years, and I think

there's still a lot to be done.

CATHY EDWARDS: Humans


have more access

to information than at
any other time in history.

And I really feel


like it's our job

to make sure that they're


connecting with the highest

quality, the most authoritative,


the most relevant information

for them, and that


they're really

able to access the


information that makes

a difference in their lives.

[NON-ENGLISH SPEECH]

PANDU NAYAK: This is


sort of a core value,

and we feel deeply


responsible to our users

to make this happen.

URS HOLZLE: What is


Google in 20 years?

It's very hard to


predict the future.

I would never have


predicted 20 years ago

how Google looks today.

The mission will still be


there, making information

accessible to people.
And I think the thirst
will still be there,

that people really want to


find the things that they're

looking for.

BEN GOMES: Information


really releases the potential

that is in people.

It enables them
to make decisions

that they couldn't make before.

It enables them to know about


things that they couldn't know

about before, to know


about things in the world,

to know about the


people around them.

And I hope it also improves


their understanding

of the world around


them as they do that.

[LAUGHTER]

And I believe that


our role in Search

is to actually help serve that


curiosity in people, to help

them find that information


that they are looking for,

that takes them on the next step


of their journey of curiosity.

NARRATOR: All kinds of people on


all kinds of journeys, curious

about the thing holding


them back, curious

about the thing


pushing them forward,
people searching for
themselves and their families,

just like people always


have and always will.

BEN GOMES: And while that


curiosity lives on in us,

I think our job here in


Search is never done.
