Search Off The Record - 26th Episode: John Mueller

[00:00:01] ♪ [music] ♪
[00:00:10] John Mueller: [00:00:10] Welcome, everyone, to the next episode of the Search Off the
Record podcast. Our plan is to talk a bit about what's happening at Google Search, how things work
behind the scenes, and who knows, maybe have some fun along the way.
[00:00:24] My name is John Mueller. I'm a Search Advocate on the Search Relations Team here at Google in
Switzerland. I'm joined today by Martin and Gary, both also on the Search Relations Team, and today, we'll
be talking about the future of SEO.
[00:00:41] ♪ [music] ♪
[00:00:46] Gary Illyes: [00:00:46] Talking about the future, what could possibly go wrong, John?
[00:00:49] John Mueller: [00:00:49] The future! I don't know. I mean, looking back and looking forward a little
bit. One of the big changes I've been seeing over, I don't know, maybe the last ten or so years is how more
and more sites are moving to hosted platforms where, basically, you don't have to run your own server
anymore or where it's almost better that you don't run your own server anymore because you don't have to
deal with all of the technical infrastructure that's associated with everything around... like a server.
[00:01:22] And to me that seems like, I don't know, it seems like a reasonable move because you're kind of
off-loading a lot of the technical details, and it gives you more room to focus on what actually matters, like the
content that you're creating. So I guess, in a sense, if we think about SEOs, it means SEOs won't need to
learn HTML anymore, right?
[00:01:45] Martin Splitt: [00:01:45] Oh no, no, no. Abort mission, abandon ship. No, John, no.
[00:01:51] John Mueller: [00:01:51] Well, I mean it's like if you just have a rich editor and you just type
things in and then you format your text properly and you add some links. What do you need to do with
HTML?
[00:02:03] Gary Illyes: [00:02:03] I mean, SEO is more than just doing the text part, right? It's not just about
writing the content.
[00:02:10] Basically, if you hire a copywriter, that could do the content for you. But SEO is also about link
tags and meta tags and title elements and all those weird things in the head section of the HTML that you
can put there.
[00:02:25] So you kind of want to know about them to control what your snippets look like or how your titles
show up in search results and the rel canonical tag to control what will be the-- or what should be the
canonical version of a URL. You kind of want to know that.
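To make the head-section elements mentioned here concrete, the following is a minimal sketch, not anything described in the episode, that reads the title element, the meta description, and the rel canonical link out of a page using Python's standard html.parser module. The sample page, the example.com URL, and the names HeadSignals and read_head are assumptions made up for illustration.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collect the title, meta description, and rel=canonical URL of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.canonical = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content")
        elif tag == "link" and "canonical" in (attrs.get("rel") or "").lower().split():
            self.canonical = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text inside <title> may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.title += data


# Hypothetical page used only for this illustration.
SAMPLE_PAGE = """<html><head>
<title>Blue Widgets | Example Shop</title>
<meta name="description" content="Hand-made blue widgets.">
<link rel="canonical" href="https://example.com/widgets/blue">
</head><body><p>Product text goes here.</p></body></html>"""


def read_head(html):
    """Return (title, description, canonical URL) for an HTML string."""
    parser = HeadSignals()
    parser.feed(html)
    return parser.title.strip(), parser.description, parser.canonical


print(read_head(SAMPLE_PAGE))
```

A check like this only shows what the page declares; which URL a search engine actually picks as canonical is its own decision.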
[00:02:44] John Mueller: [00:02:44] But couldn't you just have that in your CMS? It's like you just have like a
big field for text and then you have some extra fields for metadata.
[00:02:54] Gary Illyes: [00:02:54] But if you just started a website, then do you want to learn, for example,
what is rel canonical? Could you explain to me, as an absolute noob, what is a canonical URL and what is its
relation to rel canonical?
[00:03:12] It's very hard to simplify it. Or hreflang, for example. Try to explain hreflang to someone who just
came to the Internet and they've never even thought about localization before.
[00:03:23] Martin Splitt: [00:03:23] I mean, explaining hreflang to someone who has been on the Internet for
a decade is hard too.
[00:03:30] Gary Illyes: [00:03:30] I mean, there are people who understand it and can do a good job.
[00:03:33] Martin Splitt: [00:03:33] That's true. Also, having worked with CMSs in the past, I've seen that the
ones that are pretty much the more successful ones allow you to always insert custom HTML, because an
editor that needs to be as simple as possible for everyone to use it and to create content should also give
you the opportunity to say: "And now, I want to jump out of this simple mode and actually go and do
something more advanced."
[00:03:58] And that usually contains HTML, or means that you need to write HTML. So if you don't know any
HTML, you'll be very, very quickly out of your depth there. So I think that's a risky direction to take.
[00:04:11] Gary Illyes: [00:04:11] Also, when we are launching new meta tags for example, or directives,
robots directives, for example, then we kind of rely on SEOs to make use of them rather than on site owners,
and we kind of tailor the documentation towards SEOs versus site owners.
[00:04:31] Like, for site owners. If you look on .dev site, in our search documentation, then there, the
site-owner tailored content is very simplistic and for good reason. They need to get the basic information that
can get their site to do OK in search, but then if you want to take the next step, then you either learn more
about search and how you can do stuff on your website that makes your site do better in search or you hire
an SEO.
[00:05:03] And then we have the SEO documentation which is much more in-depth, I guess, and explains
things differently with the assumption that the person who's reading it already knows stuff about search and
how things work and how things connect to it. You probably don't want to throw the site owner into that water,
just when they started with their site.
[00:05:27] John Mueller: [00:05:27] OK, so I guess taking a step back, you're also saying that SEOs should
know HTML from the beginning, even now.
[00:05:35] Martin Splitt: [00:05:35] Yeah, I would say so. I think it's one of the fundamental technologies that
make the web what it is. And if you want to work with it, you should know at least a little bit.
[00:05:44] Gary Illyes: [00:05:44] Yeah, it's not like JavaScript where you don't have to know JavaScript to
create a good website. Oh, I love Martin's face, but with HTM-- you can't avoid HTML, it's the very basic
building block of the web.
[00:06:01] John Mueller: [00:06:01] So HTML is not going to go away?
[00:06:03] Gary Illyes: [00:06:03] No. It's also such a progressive technology in the sense that with a little bit
of it, you can do a bunch of stuff already, and then you can expand step by step. Whereas for instance,
JavaScript, you just need to know a lot of stuff to actually get something going. And if you make a mistake,
potentially, your entire code doesn't work. Whereas in HTML, the browser will do its best to actually still make
it work somehow. So the more you know the better it is, but I don't think everyone needs to be an HTML
expert, you just need to understand what HTML is and how to use it.
[00:06:36] John Mueller: [00:06:36] OK.
[00:06:37] Martin Splitt: [00:06:37] So John, do you think SEOs don't need to know HTML?
[00:06:41] John Mueller: [00:06:41] Oh, I don't know. I mean, it's a question that comes up every now and
then, and it's... I think a part of the reason behind that question is also that there's just so many different
things that SEOs do, and some of them focus purely on content or focus purely on building relationships with
other websites. And for some of those activities, you don't need to know HTML, but there are also lots of
technical on-page things that you do have to use.
[00:07:14] I mean it's... I don't know how that will evolve, if those directions will kind of merge more or
separate out more. But yeah, I mean it's always interesting to see HTML is not going to go away, so you
might as well get used to it. And if, as an SEO, you don't know anything about HTML, then maybe it's time to
actually try things out.
[00:07:41] ♪ [music] ♪
[00:07:44] John Mueller: [00:07:44] Another thing that I think is happening more and more and that we'll
see more of in the future is kind of the migration from native apps more and more into web apps which we've
seen with some of the PWA things that are happening, but also with more and more...
[00:08:05] I think the user is kind of expecting to be able to use any app that they have in any platform, any
device that they use. And it feels like that kind of work is going to continue as well. And probably, that means
things like understanding JavaScript will become more and more important for SEOs as well, which will
probably make Martin happy.
[00:08:29] But it probably also means that a lot of these apps suddenly have to think about SEO in general.
Like what do they actually want to have findable on the web, because in the past, they were just apps.
[00:08:42] Gary Illyes: [00:08:42] Yeah, that's actually a big topic and a very interesting one, and one that I
think is... has been in the past underrepresented or under investigated or under researched.
[00:08:57] In the future, I think a lot of our applications will just happen to run in the browser and you can
already see that like you have so many APIs, and opportunities. You can have a video chat in the browser.
No one has to necessarily install a client to-- or like a desktop app or mobile app to do a video chat. Video
chats have been quite popular in the last couple of years and I think that has shown that applications can
shift to the web even if they are a little more intricate. But then the question becomes, how do you represent
that to someone who searches the web? And what kind of content do you want to highlight?
[00:09:34] And we had this challenge at a company that I worked for in the past where they would create
interactive 3D models of real estate spaces like apartments or offices and you could furnish them and walk
around them. You could basically do like a virtual viewing in the browser of a different space. You could even
do that in AR and try out different furniture in your own home and you could-- and all of that in the browser.
[00:10:01] And then the question became: "OK, so we have this amazing application that happens to run in
the browser. But if a search engine looks at it, and because it is all visual, it is a black box for a bot for a
computer. As far as Googlebot or any other computer looking at the website would go, they would see a title,
a meta description and then a canvas, which is effectively a big rectangle of pixels that they have no idea what they represent or what
they mean." And then the question became: "How do we... how do we get that into a search engine?"
[00:10:34] So for instance we made a virtual model of Don Draper's apartment in Mad Men. We also made
the Simpsons family home, a 3D model that you could visit in the browser, which is really, really cool. But if
you search for the Simpsons house or Don Draper's apartment, you wouldn't really find our 3D model,
because as far as Google Search and other search engines are concerned, our page is really, really low
on content and not very relevant to the query that you enter.
[00:11:02] So how do you do that? And then there's obviously a bunch of strategies, but I think that will be a
topic in the future where SEOs need to identify together with the people making the app and using the app
on what content do we want to expose? How do we expose it so that it is useful and understandable to
search engines? And that will also be something that search engines will work on and we can already see
that 3D models for products or like white sharks or tiger or whatever. There are 3D models for this so that
you can get a spatial feeling for how these things look and interact with their environment.
[00:11:42] Maybe we'll see more of that in the future. I don't know. But it's going to be a huge field that is
really, really interesting, and with that, also a lot of JavaScript comes to the web that we, as a search engine,
need to understand and run, and that might mean that technical SEO might get a lot more technical and a lot
more tricky in terms of how they work with JavaScript. I don't know, but it sounds like something that not
many people I think have thought about at this point.
[00:12:10] John Mueller: [00:12:10] So it sounds a lot like you have to combine the SEO strategies that you
have for existing sites, suddenly, with completely different site models where--
[00:12:22] Gary Illyes: [00:12:22] Yeah.
[00:12:23] John Mueller: [00:12:23] maybe their developers have purely focused on the technical aspects,
like: "How does this 3D model actually work?" And at some point, you combine it with the marketing side of
SEO and kind of like: "How do I package information in here so that text-based search engines can actually
make use of that?"
[00:12:42] Gary Illyes: [00:12:42] Yeah. Yeah. And also how can users navigate this application? Because if
you give me, let's say, an empty paint application, I don't necessarily know if that is the application I need
to do what I need to do. So then, how do you package this content and how do you interlink functionality with
content? And yeah, that's going to be interesting to see.
[00:13:03] John Mueller: [00:13:03] Cool. OK. So that's like one area where I guess SEOs will be required,
almost. That sounds pretty good. What kind of things do you think will stay the same? You mentioned HTML,
both of you, the future will be based on HTML on the web.
[00:13:24] Gary Illyes: [00:13:24] I think so.
[00:13:24] John Mueller: [00:13:24] JavaScript, you mentioned, is JavaScript going to be the thing or is
there going to be a new Flash kind of technology that comes out and replaces JavaScript?
[00:13:34] Gary Illyes: [00:13:34] I think as JavaScript has evolved so much over the last decades, I pretty
much think that it will continue to evolve and efforts at replacing it, like Flash, or Java or Moonlight and
Silverlight and all the other lovely things that happened have historically been not very successful. So I
would be very surprised to see that go away and be changed for something else.
[00:13:56] John Mueller: [00:13:56] OK, so HTML and JavaScript, SEOs will need to continue working on
that. What about more basic things like URLs? Are URLs going to go away in favor of entities or IP
addresses, or I don't know... How do you see that evolving?
[00:14:15] Gary Illyes: [00:14:15] Fortunately, URLs cannot go away.
[00:14:17] John Mueller: [00:14:17] What do you mean?
[00:14:18] Gary Illyes: [00:14:18] At least not in the foreseeable future, because URLs are the
standard way to communicate addresses on the Internet. And without that the Internet is just not the Internet.
The same way domain names cannot go away because of how the Internet is built or IP addresses cannot
go away because of how the Internet is built. The same way URLs cannot go away.
[00:14:42] John Mueller: [00:14:42] OK.
[00:14:43] Gary Illyes: [00:14:43] If you think about it, how hard was it to introduce IPv6 to the Internet? It
took many, many years to introduce it. Not to replace IPv4, but to introduce a new format for IP addressing.
Changing URLs, that would be even crazier than changing IPs or IP formats.
[00:15:04] John Mueller: [00:15:04] OK.
[00:15:05] Martin Splitt: [00:15:05] I mean, we did change IP formats when we changed from IP version 4 to
IP version 6, but IPs will stay around. And so, I think, well, URLs, they might look different, but they'll stay
around, I think.
[00:15:16] John Mueller: [00:15:16] So by look different, do you mean, instead of path and filenames,
everything will be parameters and it'll be like a machine learning hash instead of words?
[00:15:30] Martin Splitt: [00:15:30] Maybe, or maybe we will decentralize the web somehow and everything
will be identified by the hash of the content. And then you might get it from different sources, who knows. But
I think URLs, in terms of addressing contents on the network, will stick around.
[00:15:50] John Mueller: [00:15:50] OK, so I guess like the URL-based mechanisms will also stick
around. Like at least, looking forward, maybe five or ten years, it feels like a long time for the web, but at the
same time, it's like looking back ten years, like what has changed on the web? Not much. It's like different
ads, more cat videos, which is kind of sad, but the other URL-based mechanisms, I guess would also be
similar.
[00:16:18] You mentioned the rel canonical. That seems like something that will stick around or do you think
something will come along and be able to replace the rel canonical?
[00:16:30] Gary Illyes: [00:16:30] I mean, there's no need to replace it. Usually these changes are prompted
by a need, and unless there is a need for changing rel canonical, something that's extremely widely used,
why would we want to try to change it? When we know that something is broken and we need to come up
with something else, then we might change it or try to find new solutions for the same problem. But if it's not
broken, we generally don't want to touch it.
[00:16:59] John Mueller: [00:16:59] But if we can... I mean, the rel canonical is kind of a mechanism to let
search engines know that two pieces of content are the same and you should pick this one. It feels like at
some point in the future, we will be able to look at pages and say things like: "Oh, it's like pretty much the
same. We'll just pick one of these."
[00:17:19] Gary Illyes: [00:17:19] I mean, technically, we could do that already. We appreciate the help that
we get from rel canonical, and we use it quite aggressively, in fact, in canonical selection but it would work
without it. If we remove that criterion from canonical selection, it would work. But then people would
have less control over their desired canonical URL, and that's not something that we want.
[00:17:46] We do want to give people control over what we show in search results, what the canonical
version of the URLs are and rel canonical is just a good match for that and that's why we also standardized
it. There's an internet draft for that, or I think it's an internet draft.
[00:18:02] John Mueller: [00:18:02] OK, so I guess I think that's also a really interesting aspect, because on
the one hand there's all of the machine learning work that's being done to kind of automatically understand
things better. But the control aspect is something that machine learning can't really replace there because
that's like... that's my personal preference kind of thing and less understanding of a piece of content.
[00:18:31] Gary Illyes: [00:18:31] Yeah, what about meta tags in general? I mean, we have talked about
hreflang and we have talked about canonicals, but do you think we will need more meta tags in the future, or will
the need for meta tags go away?
[00:18:45] Martin Splitt: [00:18:45] I hope that we are not introducing more meta tags. And usually, when
you see internal threads about, like, this search team wants to introduce a new meta tag. Then usually both
John and I jump on that thread and we are pushing back quite aggressively because there's very rarely a
good reason to introduce a new meta tag. Usually, there is already something that might be used for that, like
for example, someone wants to let people control whether the content can be translated. And then they want
to introduce a new meta tag.
[00:19:18] It's like, well, is your translate service a robot? Well, technically yes. Then just use the robots meta
tag and then just introduce a new directive there. Don't introduce a completely new meta tag, because then
people just have to learn yet another meta tag, which is not necessarily a good thing. I think we
don't want more meta tags, and I hope that we are not getting more meta tags. But then teams have weird
ideas and unfortunately, John and I are not always there to fight back.
[00:19:50] John Mueller: [00:19:50] I mean, it's also a matter of control that you mentioned. Where
sometimes, site owners have very strong preferences one way or the other and it's useful to listen to them
because we kind of want to work together. But it's always a bit tricky. So I guess robots.txt falls into
a similar category where, on the one hand, it's a URL, so it'll probably be the same. And on the other hand,
it's about site owners' preferences and controls. So probably, that will
remain as well.
[00:20:22] Gary Illyes: [00:20:22] One correction there. It's for URIs.
[00:20:25] John Mueller: [00:20:25] URIs. Oh, what's the difference?
[00:20:28] Gary Illyes: [00:20:28] I mean, URL is a form of URI.
[00:20:31] John Mueller: [00:20:31] OK.
[00:20:31] Martin Splitt: [00:20:31] Could you give us an example of a URI versus a URL?
[00:20:36] Gary Illyes: [00:20:36] App indexing, deep links, for example, that's a URI and not a...
[00:20:43] Martin Splitt: [00:20:43] not a URL.
[00:20:44] Gary Illyes: [00:20:44] ... URL.
[00:20:45] Martin Splitt: [00:20:45] Yeah, OK.
[00:20:46] John Mueller: [00:20:46] And you do that in robots.txt too?
[00:20:48] Gary Illyes: [00:20:48] So one of the things that we tried to do with robots.txt when we
started the process of standardizing it was to expand the language to accept URIs versus URLs because the
original de facto standard was describing the protocol to be used with URLs, and we had to change
some wording, I don't recall exactly how, but we had to change some wording, plus the ABNF language to
make it work on URIs, because if we already have a protocol that can do a very good job controlling crawling
on the Internet, then why would we want to introduce yet another one in case a new form of URI shows up
on the Internet and invent a new control mechanism for crawling those. And that's why we expanded the
language to accept URIs to be used with robots.txt versus just URLs.
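As a rough illustration of how robots.txt rules are matched against a path, here is a minimal sketch using Python's standard urllib.robotparser module. The rules, the "ExampleBot" user agent, and the example.com URLs are made up for illustration. Note one assumption that matters: Python's parser applies the first matching rule in file order, whereas the standardized protocol (RFC 9309) prefers the most specific, longest match, so the Disallow line is deliberately listed before the broader Allow line.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for url in ("https://example.com/blog/post",
            "https://example.com/private/report.html"):
    print(url, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

This sketch only covers ordinary http(s) URLs; how a crawler would apply the same rules to other kinds of URIs, such as the app deep links mentioned above, is not something this module demonstrates.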
[00:21:43] John Mueller: [00:21:43] Oh, cool. OK. I totally didn't know about that. That's pretty cool. So
basically, if you, as an SEO, work to understand the foundation of robots.txt and how the matching
goes there, then if some new form of URI pops up and becomes popular in the future, then they can keep
building on that. OK, that's cool. OK. What about things like structured data? You mark up a product, you put
a price...
[00:22:16] Martin Splitt: [00:22:16] Oh-h-h!
[00:22:17] John Mueller: [00:22:17] Do search engines really need to have structured data? Can't you just
look at a product page and recognize it's a product page? Come on!
[00:22:27] Gary Illyes: [00:22:27] I just want to say that I have very strong opinions about structured data.
[00:22:30] Martin Splitt: [00:22:30] I expect structured data, in terms of the data that you present, is superfluous if you
do things right, but I think of structured data as a way to opt in to certain features of Search
and other products that we offer so that you basically say I add structured data specifically so that I don't
accidentally end up in certain products or services, but I specifically say this is a website that contains
product information. So, you know, if there is someone out there who specifically looks for that, they might
pick it up and then use it in some sort of user experience or some sort of app or service or whatever. And I
like it for that. I like it as a kind of like implicit agreement to provide this information in a more structured form.
But I don't think many people think about it like that.
[00:23:30] John Mueller: [00:23:30] OK, so it's almost like a control mechanism,
[00:23:34] Martin Splitt: [00:23:34] Kind of, yeah.
[00:23:35] John Mueller: [00:23:35] where you kind of say: "Well, I'm OK with Google understanding this is a
product page."
[00:23:39] Martin Splitt: [00:23:39] Yeah.
[00:23:40] John Mueller: [00:23:40] Google probably understands what a product page is anyway. At least
looking into the future where machine learning is everywhere.
[00:23:49] Martin Splitt: [00:23:49] Yeah. I'm pretty sure we can understand: "Oh, this is a product, and the
product's name is this and the product's price is that and this is a product image." But it is kind of nice to
have this explicit machine-readable information where you can say: "Oh, so they specifically want us to think
of it as a product." It's basically a glorified meta tag that says is product page and then the value of that meta
tag would probably be true or something like that.
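To show what "explicit machine-readable information" for a product can look like, here is a minimal sketch that builds schema.org Product markup as JSON-LD and wraps it in a script element. The product values and the function names product_data and to_script_tag are hypothetical; which properties a given search feature actually requires is not covered here.

```python
import json


def product_data(name, image_url, price, currency):
    """Build a schema.org Product description as a plain dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "image": image_url,
        "offers": {
            "@type": "Offer",
            "price": price,
            "priceCurrency": currency,
        },
    }


def to_script_tag(data):
    """Serialize the dictionary as a JSON-LD script element for the page."""
    return '<script type="application/ld+json">' + json.dumps(data) + "</script>"


# Hypothetical product used only for this illustration.
widget = product_data("Blue Widget", "https://example.com/img/blue.jpg", "19.99", "USD")
print(to_script_tag(widget))
```

The point of the sketch is the one made in the conversation: the page states outright that it is a product page and what the name and price are, instead of leaving that to be inferred.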
[00:24:19] John Mueller: [00:24:19] OK, but that almost sounds like the rest of structured data is going to
be optional at some point in the future. Not like next week in the future, but in 10 years or so. Who knows
what machine learning progress will have happened and we should be able to look at a page and say: "Oh,
these are 12 attributes of this page, and if someone searches for an attribute we should be able to match
that."
[00:24:51] Martin Splitt: [00:24:51] Generally, yes, but there's many ways of doing things on the web, and
even with machine learning, there might be creative ways where a machine does not necessarily pick up the
information correctly. So having it spelled out quite literally is probably helpful nonetheless, even in the near
future.
[00:25:07] John Mueller: [00:25:07] Yeah.
[00:25:08] Martin Splitt: [00:25:08] I don't know, maybe not.
[00:25:09] John Mueller: [00:25:09] I don't know. Gary, what are your really strong opinions or...
[00:25:13] Gary Illyes: [00:25:13] These are more like internally strong opinions because we have a very
strong team and leadership that focuses a lot on structured data, and there's massive use of structured data
in indexing and also in understanding entities. And my opinion is that yes, structured data is amazing for
these kind of things and to power features, but we certainly can get to a point where we don't need it
anymore and we do have the granular controls that enable people to opt out of different presentations.
[00:25:49] For example, you could add a span tag, well, span element, and mark with data-nosnippet the
part of the text that you don't want to see in a rich result. For example, if you don't want your price to show
up in a rich result, then you could just opt that out, and you don't even have to wait for the price to end up in
a rich result. You could do that proactively as well as many people do it already with, for example, their news
articles where they actually opt out complete paragraphs or even complete news articles from the page or
from search results.
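The opt-out described here can be pictured with a small sketch: given a page fragment, collect only the text that sits outside any element carrying the data-nosnippet attribute. This is an illustration of the idea, not how any search engine implements it; the sample markup and the names SnippetText and snippet_candidates are assumptions.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class SnippetText(HTMLParser):
    """Collect text that is not inside an element marked data-nosnippet."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # greater than 0 while inside an opted-out element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._skip_depth or "data-nosnippet" in dict(attrs):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def snippet_candidates(html):
    """Return the text a snippet could be drawn from, with whitespace collapsed."""
    parser = SnippetText()
    parser.feed(html)
    return " ".join("".join(parser.parts).split())


# Hypothetical fragment: the price is opted out, the rest is not.
SAMPLE = ('<p>Blue widget, in stock. '
          '<span data-nosnippet>Price: $19.99</span> Ships worldwide.</p>')
print(snippet_candidates(SAMPLE))
```

The depth counter assumes reasonably well-formed markup; a production parser would need to cope with unclosed tags.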
[00:26:25] So I don't see a reason why people couldn't do that with rich snippets, especially because we are
getting better at understanding these, for example, product pages. We are getting there where we are really
good at figuring out that this is a product page and this is the image of the product and this is the price. This
is the stock, whether they have inventory of that particular item. I think it's a matter of time before we start using
that. It's not an if anymore, it's more like a when. And if we jump ahead 10 years or 15 years where our
computers are actually way better than they are now, then that will just enable us to do more of this and also
better.
[00:27:13] And when I say we, I mean search engines in general, not just Google because we definitely see
other search engines do amazing things with language understanding and machine learning in general. So it
is coming. The question is when it's going to land.
[00:27:29] John Mueller: [00:27:29] OK.
[00:27:31] Martin Splitt: [00:27:31] And I guess for, at least for the foreseeable future, it'll still be there as
something where if you have strong opinions about what your page should be and what the attributes should
be, then you can specify it there. And at some point, it'll be almost like, I don't know, like a rel canonical
where it's like if you don't care, then you don't have to do it, and we'll try to figure it out. But if you do care,
then you can tell us what your real product name is and we won't have to try to identify it on a page.
[00:28:03] Gary Illyes: [00:28:03] I mean, it could also work as an override. I imagine that there would be a
transition period where we move from structured data to machine learning data, and then in the transition
period, it could be also used as an override. Like for example, you provide something on the page, we
misunderstood it, you notice that we misunderstood it, then you could provide structured data to correct what
we show in search results if you want that piece of data to be shown.
[00:28:28] John Mueller: [00:28:28] OK.
[00:28:29] Gary Illyes: [00:28:29] Maybe that would work.
[00:28:31] John Mueller: [00:28:31] Cool. Yeah. So I guess SEOs still have to think about structured data,
and at the very least they'll have to think about what they actually put on their pages and to be clear with
regards to the actual page's content. So it's almost like good content will continue to be important for SEO!
[00:28:51] Gary Illyes: [00:28:51] Well, that's a shocker.
[00:28:52] John Mueller: [00:28:52] No shocker. OK.
[00:28:54] ♪ [music] ♪
[00:28:57] John Mueller: [00:28:57] Getting things into search engines, do you think that will change? Like
the crawling part of the web, that feels so antiquated. It's like: "Oh, you find one URL and then you look to
see if there are links to other pages and then you request those pages." Couldn't we, I don't know, just get a
dump of all HTML pages on a website and we'll just process that at once?
[00:29:23] Martin Splitt: [00:29:23] I am so excited to see how that's going because currently, there is this
push towards a more push based approach. So right now, it's a bit of a pull thing where search engines kind
of make a decision which URL to crawl when and how often. So they go and pull information from people's
websites. I know that other search engines, namely Bing, are experimenting with a push approach where I, as
the website owner, proactively tell the search engine: "Hey, there is information on this URL. Please come
get it."
[00:30:00] We are experimenting with it as well for certain use cases, I think, like livestreams,
or live blogs or something like that, and job adverts. I think we are using a push-based indexing approach. But
looking in the future, now this is kind of nice and useful because there's only very few people using it and
very few URLs being pushed, if you compare to the size of the Internet as a whole or of the web as a whole.
[00:30:29] And thus, obviously, we can give those priority that are pushing us proactively or pinging us
proactively. But if everyone does it in the future, let's paint a picture, in three years our indexing API opens for
all websites and all pages, and Bing does it as well and everyone wants to be in Bing. I think Bing is now
open for everyone already, but I'm not sure how many people are actually using it, I don't think they publish
this information anywhere. But let's assume like everyone pushes all their pages all the time.
[00:30:59] First things first, there is additional work on the side of the website creators, of the website
owners, because they have to feed this information to the API somehow. So there will be specific tooling that
does this for you or you have like a script that runs every hour and pushes any potentially updated pages to
this API. So it's actually more effort on your side.
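The kind of hourly script described here might look roughly like the following sketch, which only selects the pages changed since the previous run. Everything in it is hypothetical: the page list, the timestamps, and the function name pages_to_push. It deliberately makes no network calls, since the details of any real push API are not given in the conversation.

```python
from datetime import datetime, timezone


def pages_to_push(last_modified, last_push):
    """Return the URLs modified after the previous push, in sorted order."""
    return sorted(url for url, modified in last_modified.items()
                  if modified > last_push)


# Hypothetical site inventory: URL -> time of last modification.
SITE_PAGES = {
    "https://example.com/": datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc),
    "https://example.com/jobs/123": datetime(2024, 1, 1, 11, 30, tzinfo=timezone.utc),
    "https://example.com/about": datetime(2023, 12, 1, 8, 0, tzinfo=timezone.utc),
}
# Hypothetical time of the previous run of the script.
LAST_PUSH = datetime(2024, 1, 1, 11, 0, tzinfo=timezone.utc)

for url in pages_to_push(SITE_PAGES, LAST_PUSH):
    # A real script would notify the search engine's push API at this point;
    # that call is intentionally left out because it depends on the provider.
    print("would push:", url)
```

Even this small version shows the extra bookkeeping on the site owner's side: something has to track modification times and remember when the last push happened.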
[00:31:22] And then also, if everyone does it all the time, I don't see how anyone, Google, Bing, whoever else
would be able to process this with the same high priority as they do it today, and then we would be back to
square one, I feel, where they have to schedule things and then just pushing it to the API doesn't mean that it
gets indexed right away or indexed at all because there will be delays, there will be scheduling, there will be
dismissal of spammy or bad URLs and I wonder how that's going to look like in a couple of years and if there
are some certain solutions to this problem that I'm not seeing, but I can't really imagine one. But I would love
to be wrong on this.
[00:32:07] Gary Illyes: [00:32:07] I think one more problem with push is the amount of spam that you will
ingest, and we've seen this with the submit URL feature that we had on google.com where, I don't
remember the exact number, but the vast majority of the submissions was spam and not low quality content
or something. It was spam. It was very obvious that it was spam.
[00:32:38] And then I'm just very skeptical about exposing more push interfaces because of that reason,
because of spam, because it just opens yet another door into search engines for spammers to push spam.
And do we really want that or we want to just find our way to good content?
[00:33:05] Martin Splitt: [00:33:05] But is that a filtering problem or an inherent problem that can't be
solved?
[00:33:10] Gary Illyes: [00:33:10] That's a good question. It can be tailored or filtered to some extent, but
then, you end up with false positives where you filter stuff that you shouldn't have, and then people get
grumpy about it. People who are trying to spam also get grumpy about it and they create lots of noise,
externally, and then it's your and John's and my job to keep them at bay. So, is that nice?
[00:33:37] Martin Splitt: [00:33:37] No.
[00:33:38] Gary Illyes: [00:33:38] No! What I would actually really want to see, and we are kind of working
on some solutions is to be more intelligent about crawling. Because if we are more intelligent about crawling
and we are not hitting sites repeatedly for the same URL, for example, or we are more intelligent about
discovery, then we are not wasting resources, either on the site's side or on our side, and we are just
doing a better job at getting content into the index. And I think that's much nicer, and that also leaves the thing
that SEOs or technical SEOs do nowadays there for them to work on.
[00:34:20] John Mueller: [00:34:20] So you're saying links will not go away?
[00:34:23] Gary Illyes: [00:34:23] Why would you bring up links?
[00:34:25] John Mueller: [00:34:25] Well, it's like crawling.
[00:34:26] Gary Illyes: [00:34:26] Why?
[00:34:26] John Mueller: [00:34:26] It's like crawling through a website. You need links.
[00:34:29] Gary Illyes: [00:34:29] But why, why, why?

[00:34:32] John Mueller: [00:34:32] Oh, so you're saying links are going away?
[00:34:35] Martin Splitt: [00:34:35] I'm staying out of this.
[00:34:36] Gary Illyes: [00:34:36] No, why are you twisting my words? Why are you bringing this up, even?
Links shouldn't go away.
[00:34:45] John Mueller: [00:34:45] OK.
[00:34:45] Gary Illyes: [00:34:45] I think we shou-- Well, we got better at using them, and perhaps, we don't
need as many links as people believe to do ranking well. But I don't think they are going away. They are the
same as HTML, they are basic building blocks... Well, because they are HTML, and they just cannot go
away.
[00:35:14] John Mueller: [00:35:14] OK. OK. Cool. More things not going away. Sounds like SEOs will have
a future at work anyway. What about things like keyword research?
[00:35:25] Gary Illyes: [00:35:25] What's that?
[00:35:26] John Mueller: [00:35:26] It's like when you research specific topics that people are interested in
and then SEOs try to encourage writers to write about these topics because they would drive attention
and search.
[00:35:38] Gary Illyes: [00:35:38] I guess it will stay.
[00:35:39] John Mueller: [00:35:39] OK. OK. What about content in general? It's like with all of these text
generation algorithms, basically, you just tell the machine what the topic should be and it'll create a full page
for you, right? So writing will go away?
[00:35:58] Gary Illyes: [00:35:58] I think that could be a topic on its own for a future podcast episode
because we can see the pros and the cons of machine-generated content, and we are quite strict about what
we allow in our index. But on the flip side, you can also see very good and smart machine-generated-- I don't
know if smart is a good word, but very intelligent machine-generated content.
[00:36:26] I recently saw a short article about yeast, for example, and it was generated by GPT-3, search for
it on your favorite search engine if you don't know what it is, and it was very well written. I couldn't tell that it
was written by a machine. And then there's the thing that if you can't tell that it was written by a machine,
then does it matter if it's in search or not?
[00:36:54] John Mueller: [00:36:54] OK, yeah. So it's good content. It's OK. But what if the machine
makes stuff up? It's a topic about yeast and it tells you, you put gasoline into bread and then it generates
yeast and it's well-written English. But anyone who knows the topic is like: "This is wrong."
[00:37:14] Gary Illyes: [00:37:14] Well, but it also depends how the content was generated or what were
the sources for it, right? Like how it was taught. Yeah, I think this deserves its own podcast episode. We
could debate about this a lot. Right now, our stance on machine-generated content is that if it's without
human supervision, then we don't want it in search. If someone reviews it before putting it up for the public
then it's fine.
[00:37:45] John Mueller: [00:37:45] Cool, it sounds like one of those areas where SEOs could evolve and
try to learn more about fancy machine learning technologies and kind of build out a niche for themselves.
What about images, video, audio? That seems like another one of those areas where machine learning
could pick up and say: "Oh, this video is about cats. We will just rank it for cats."
[00:38:11] Gary Illyes: [00:38:11] Wait, audio. Why would you ever produce audio?
[00:38:15] John Mueller: [00:38:15] Audio?
[00:38:15] Martin Splitt: [00:38:15] Yeah, podcasts are overrated.
[00:38:17] John Mueller: [00:38:17] It's, it's...

[00:38:18] Gary Illyes: [00:38:18] Oh podca-- Well, oops. [all laugh]
[00:38:24] John Mueller: [00:38:24] I mean...
[00:38:25] Martin Splitt: [00:38:25] Or ASMR.
[00:38:25] John Mueller: [00:38:25] I mean, like with images, that's one of those things where the
machine learning teams, the research teams, always kind of try to show off how well they recognize the
objects on an image and what they're doing. Do you see that kind of going into SEO where suddenly, people
won't have to do alt attributes for images anymore and images will just rank perfectly?
[00:38:52] Gary Illyes: [00:38:52] I gave a presentation a couple of years ago at a conference called [Sipic],
and we were showing entities that we could detect from simple images, simple images being a picture of an
apple on a white background. I think the Eiffel Tower and then a person in a picture, again, white background
and we could tell the general topic of the image.
[00:39:18] So we could say that a red apple or Eiffel Tower or person, but for example, for the person, we
couldn't even tell the gender. And from the picture, if you were looking at the picture as a human, it was very
obvious, the perceived gender. But the machine just couldn't actually say anything about the perceived
gender of the person in the picture, and I think that's still true.
[00:39:43] In general we can tell the basic topics or topic of the picture or what's in the picture, but we can
get quite confused. If you put up a grapefruit and an orange and the perspective is off or confusing, then we
might not be able to tell the two apart. So yeah, I think for now, we are going to rely on alt attributes and the
surrounding text, quite a bit. I can see that eventually we get there where we can detect more concepts and
more accurately in images and then we can use that for ranking purposes. But I don't think that we are there
yet. But I definitely think that we are going to get there eventually.
[00:40:26] John Mueller: [00:40:26] Yeah, I think the aspect that I always think about is that usually, when
it comes to images, it's not that people are looking for images, but they're looking for kind of like what's
represented by the image. Where if you're looking for luggage, then you might use image search to kind of try
to find the luggage, but it's not that you're looking for a photo of a nice suitcase. You actually want to buy a
physical suitcase and you just want to see what it looks like.
[00:41:00] Gary Illyes: [00:41:00] Yeah, I think we called that visual exploration. It was actually the basis of
that presentation that I was referencing, and the vast majority of our users are actually doing that,
visually exploring the web versus going and finding an image for a meme or whatever.
[00:41:21] John Mueller: [00:41:21] OK. So working with images and videos will also continue to be
something for SEOs. What about voice search? Will SEOs have to optimize for voice search?
[00:41:32] Martin Splitt: [00:41:32] Oh God, the future that never will be. I think no, because if we learn
anything-- I remember a bunch of years ago, people were like: "Oh, we'll stop using keyboards and just do
voice." And I think that has been a recurring theme from the 90s. But I don't think that in the future it will
naturally or magically become the number one thing that we need to worry about, simply because it
changes the input modality, and it probably changes how queries are phrased, but it doesn't change the
fundamental use of natural language to retrieve information from the Internet.
[00:42:13] So I think you don't have to worry too much about it, to be honest, but that's maybe just me.
Maybe the future will be completely different and we'll... I don't know. I don't think so.
[00:42:25] Gary Illyes: [00:42:25] I think we are going to experiment with just projecting our thoughts into
search engines and then that's how we are going to find things.
[00:42:32] Martin Splitt: [00:42:32] But I wonder if our... I don't know. So apparently there's two different
kinds of people. There's one kind which has an inner monologue, the other one doesn't. I'm of the kind of
inner monologue, so my thoughts are fully formed sentences, so I would still use normal natural language in
my thoughts. But maybe the other kind doesn't, I don't know.
[00:42:54] John Mueller: [00:42:54] Then you have three voices in your head, two of your own, and then
one Google. And then you have Bing and the other search engines too. It's like you have-- you wake up
in the morning and you're like: "What should I have for breakfast?" It's like: "This, this, this." [Martin laughs]
[00:43:10] Martin Splitt: [00:43:10] And maybe it's like these video calls or a general conference call
situation, like: "Can you hear me now? Loud enough?" [all laugh]
[00:43:18] Gary Illyes: [00:43:18] Martin, you are muted. [Martin laughs]
[00:43:24] John Mueller: [00:43:24] Oh, my gosh. OK, cool, yeah. OK. So I guess we started-- we thought,
well, what is the future of SEO? Externally, people always bring that up: "Is SEO dead, and will it die
soon?" It sounds like, at least based on your expert opinions here, like SEO is not going to go away. Things
like URLs are going to remain in place. HTML is going to continue to be the basis and SEOs will continue to
need to know HTML and JavaScript, perhaps, probably.
[00:44:06] Martin Splitt: [00:44:06] Dead.
[00:44:07] John Mueller: [00:44:07] Understanding more about web apps.
[00:44:09] Martin Splitt: [00:44:09] Dead.
[00:44:10] John Mueller: [00:44:10] No, no. Web apps is like that new opportunity that sounds good.
Structured data seems like, well... There's like, if you plan to retire in the next 10-20 years, probably you'll
continue to do it. But maybe at some point less.
[00:44:30] Crawling and all of that probably stays the same or similar, and machine-generated content seems
like one of those research opportunities for people to kind of plan ahead on what might happen, I don't know,
maybe 5-10 years in the future. Is that about right?
[00:44:49] Gary Illyes: [00:44:49] Sounds about right to me, yeah, but maybe we are all wrong. Who knows?
That's the beauty about the future. We can't really predict things. But yeah, I think that encapsulates what we
think will happen. Let's see how right or wrong we are in the future.
[00:45:05] John Mueller: [00:45:05] OK. Well, it sounds like we'll continue to have search engines because
people will continue to ask is SEO dead, and for that, you need a search engine. So at least for that topic,
we'll continue to need SEOs.
[00:45:21] Gary Illyes: [00:45:21] True. Very meta.
[00:45:24] John Mueller: [00:45:24] Cool. Alright. And with that, I think we've kind of made it to the end of
our episode, which is pretty cool. Thank you two for joining in here. Thank you, all of the listeners who are
watching or hearing, I guess. Thanks for joining us here. We've been having fun with these podcast episodes
and I hope you all find them insightful and interesting too. And regardless, let us know if there's anything that
you think we should be talking about more in the future.
[00:45:57] Feel free to drop me a note on Twitter or chat with us at any of the virtual events that we
sometimes go to, which is not a lot at the moment. And of course, don't forget to like and subscribe and
update all of your links to point to these podcast episodes because Gary says links will not be going away. So
thank you and goodbye.
[00:46:20] Martin Splitt: [00:46:20] [speaks in a foreign language]
[00:46:22] Gary Illyes: [00:46:22] Goodbye.
[00:46:24] ♪ [music] ♪
