The Value of Indexing for Real Estate Industry Websites
Indexing Listings Whitepaper December 2010
WAV Group www.waves.wavgroup.com
Table of Contents
What is Indexing?
Indexing Dilemmas
WAV Group Opinions on Indexing
Impacts of Listing Indexing by Third Party Websites
Indexing Creates Search Engine Ranking on Keywords
Impact of Indexing on Agent and Broker Websites
Industry Wide Indexing Impact Theory
MLS Guidance on Indexing
What is Indexing?
In early November, the National Association of REALTORS MLS Issues and Policies Committee considered policy changes related to RSS listing syndication, social media, mobile devices, and listing indexing on IDX compliant websites. The board did not approve any changes, and the issue remains in committee. This paper addresses the impact of allowing indexing of IDX listings and the unclear language in the NAR policy.
In common parlance, the word indexing means nearly the same thing as cataloging. Today, IDX rules and regulations do not require participants to block indexing. Section 18.2.2 states: “This does not require participants to prevent indexing of IDX listings by recognized search engines.” However, Section 18.2.6 states: “Except as provided in these rules, an IDX site or a participant or user operating an IDX site may not distribute, provide, or make any portion of the MLS database available to any person or entity.”
The dilemma here is that the rules do not clearly indicate which activity is considered allowable indexing and which activity is electronic distribution of the MLS database. The pivotal differentiator seems to be who is doing the activity. Presumably, any third party who is recognized as some sort of search engine is free to copy and index the MLS database.
The old IDX rules and regulations that governed the display of MLS data on agent and broker websites contained a provision that obligated anyone displaying MLS data to make ‘reasonable’ efforts to prevent listings from being scraped (copied) from their website and duplicated on another website.
Indexing and scraping are very difficult to differentiate. The process for performing either activity is fundamentally the same: a software program called a “bot” (short for robot) visits a subject website and indexes, or catalogs, all of the information on that website. The difference between indexing and scraping is a matter of who is doing it and what they do with the indexed data. Scraping is loosely defined as copying information from one website and publishing it to another website.
Certain websites, like Google, MSN, Yahoo and others, are commonly understood by consumers to be Recognized Search Engines. To be clear, NAR has provided an unofficial definition of Recognized Search Engines as “intended to mean those facilities average consumer consider to be ‘search engines.’” These websites are the most prevalent users of bots that index content across the World Wide Web.
It is important to understand that the goal of this form of indexing today is to refer consumers to the websites that contain the best information to answer the search terms entered into the search field. To achieve this goal, the recognized search engine copies the indexed information, or a portion of it, to its own servers, where it may be manipulated and displayed as a search result to visitors to its website.
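As a rough illustration of the indexing described above, the following Python sketch catalogs the words found on a few pages into a lookup table and answers a query by returning the pages that contain every query word. The page URLs and text are hypothetical, and real search engines are far more elaborate; this is only a minimal sketch of the idea, not a description of how any particular search engine works.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            # Strip trailing punctuation so "Broadway," matches "broadway".
            index[word.strip(".,")].add(url)
    return index

def search(index, query):
    """Return the sorted URLs of pages that contain every word in the query."""
    words = [w.strip(".,") for w in query.lower().split()]
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)

# Hypothetical pages standing in for listing detail pages on a broker website.
pages = {
    "https://broker.example/listing/1": "123 Broadway, New York. 2 beds, 1 bath.",
    "https://broker.example/listing/2": "45 Main Street, Albany. 3 beds, 2 baths.",
}
index = build_index(pages)
results = search(index, "123 Broadway")
```

Note that the index holds a copy of the page content on the indexer's side, which is why the same process can serve either a search engine or a scraper.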
Indexing Dilemmas
One dilemma that we see in the policy under consideration is that it does not define Recognized Search Engines very well. At a technical level, websites may prevent certain ‘bots’ from indexing data and allow (or white list) certain other ‘bots’. If all indexing is allowed, any website that uses a ‘bot’ to crawl a real estate website could catalog the listing data.
The second dilemma in the NAR Model Provisions is the lack of any restriction on which components of listing data may be indexed or cataloged. In other words, our understanding is that all IDX listing content may be indexed or cataloged.
The third dilemma in the NAR Model Provisions relates to the use of indexed or cataloged IDX data. There is no guidance on how the data may be used or displayed once indexed by the recognized search engine.
The fourth dilemma is the issue of duplicate content. Search engines have long held a policy of favoring original content over content that is copied on the web. What will happen when hundreds or thousands of MLS participants and subscribers make duplicate IDX content available for indexing?
Here is how these dilemmas combine to create significant concerns for the indexing of IDX data. Any company or individual who claims to be a recognized search engine may crawl the full compilation of MLS IDX data on any authorized IDX website and use that data for any purpose, or repurpose it, so long as they provide a link to the site where the data originated. The site where the data originated may or may not be the original source of the data. And finally, there is no license that governs what the site does with the indexed IDX listing compilation. Oddly, NAR policy does not require all recognized search engines to provide a link to the site where the data originated.
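The white listing of bots mentioned in the first dilemma is commonly expressed in a robots.txt file. The Python sketch below, which uses the standard library's urllib.robotparser, shows one way such rules might look and how a crawler would check them. The robots.txt contents, the site address, and the bot name "SomeOtherBot" are assumptions made for illustration and are not taken from any MLS or NAR rule.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a broker website: one named crawler may index
# listing pages, and every other crawler is asked to stay out of them.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /listings/

User-agent: *
Disallow: /listings/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def is_allowed(bot_name, url):
    """Return True if the robots.txt rules permit bot_name to fetch url."""
    return parser.can_fetch(bot_name, url)

listing_url = "https://broker.example/listings/123-broadway"
googlebot_ok = is_allowed("Googlebot", listing_url)
other_bot_ok = is_allowed("SomeOtherBot", listing_url)
```

These rules are advisory: a bot that ignores robots.txt, or that misreports its name, is not stopped by them. That limitation is part of why a loose definition of a recognized search engine is hard to enforce at the technical level alone.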
WAV Group Opinions on Indexing
NAR’s consideration of a policy allowing search engines to index listing content on agent, broker, and franchise websites will create a shift in how search engines rank real estate industry websites. WAV Group is among the first to advise the industry of its position: only the listing broker should be allowed to have listings indexed, and only that broker’s own active and sold listings. We have formulated this opinion based upon a fundamental guiding principle.
The listing broker is the curator of property data and the liable party for the information appearing on the Internet. WAV Group also maintains that indexing is a form of scraping, a condition whereby copyrighted listing content is copied, maintained, and stored by third parties (recognized search engines) without any license agreement to protect the broker’s or MLS’s copyrighted content. WAV Group maintains that indexed content should not be used for any purpose other than linking back to the broker’s website. Furthermore, WAV Group maintains that only a shortlist of fields should be allowed for indexing: for example, primary photo, beds, bathrooms, price, and location.
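The shortlist idea can be sketched in Python as a simple filter that exposes only the approved fields of a listing to a crawler. The field names and the sample listing are hypothetical; an actual IDX feed would use its own schema.

```python
# Hypothetical field names modeled on the example shortlist in the text:
# primary photo, beds, bathrooms, price, and location.
INDEXABLE_FIELDS = ("primary_photo", "beds", "bathrooms", "price", "location")

def indexable_view(listing):
    """Return only the shortlisted fields that are present in the listing."""
    return {field: listing[field] for field in INDEXABLE_FIELDS if field in listing}

# Made-up listing record for illustration only.
listing = {
    "primary_photo": "https://broker.example/photos/1.jpg",
    "beds": 3,
    "bathrooms": 2,
    "price": 450000,
    "location": "Albany, NY",
    "agent_remarks": "Seller motivated.",
    "showing_instructions": "Call listing agent.",
}
public_view = indexable_view(listing)
```

Under this approach, the page served to a search engine would be built from public_view, so fields outside the shortlist would never reach the index.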
Impacts of Listing Indexing by Third Party Websites
We have known for a long time that indexing listings by property address is a key factor in driving traffic to third party listing websites. This practice is widely used by sophisticated web developers, and we commend third party listing websites like Zillow, Trulia, Realtor.com and others on their efforts and leadership. Because these third party websites are not bound by IDX display rules, they have long been able to allow search engines to index listing content, and they have held an advantage in search engine optimization as a result.
Although WAV Group has not done a formal study on the effectiveness of optimizing third party sites for address search results, our informal experience shows that these sites most often appear in the #1 position when a search for an address is performed on a popular search engine, using a computer or mobile browser. The impact of this commanding position is significant.
In web terminology, this practice improves “long tail search” results. Long tail search describes the condition whereby a user types a number of words into a Google search field to find what they are looking for. Rather than typing in “Real Estate”, the user will type in an address like “123 Broadway NY, NY.” In this case, the long tail search term is 4 words.
Indexing Creates Search Engine Ranking on Keywords
WAV Group recently partnered with Experian to produce online effectiveness reports based upon Hitwise data. Hitwise studies the traffic to the top 500,000 websites in the United States. Using Hitwise, we are able to look at the search terms that drive traffic to any of the top sites. To produce an example, we ran a Hitwise keyword report on Trulia.com. We found that most of their search traffic comes from Google. Their top search terms are Trulia (4%), Trulia.com (1%), and so on. However, a closer look at the report reveals the impact of indexing listings and long tail search.
About 25% of third party website traffic comes from short keywords like Trulia. 72% of third party search engine traffic comes from search terms of 4 words or more: again, long tail search. The most popular search terms include “homes for sale in city name,” but as you dig deeper, you see that these sites are getting an enormous amount of traffic on property address searches. We found similar tendencies on other third party websites comparable to Trulia.
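The kind of breakdown described above can be approximated by counting the words in each search term and summing the visits for terms of 4 words or more. The Python sketch below assumes a simple mapping of search term to visit count; the sample numbers are made up for illustration and are not Hitwise data.

```python
def long_tail_share(term_visits, min_words=4):
    """Return the fraction of visits from terms with at least min_words words."""
    total = sum(term_visits.values())
    if total == 0:
        return 0.0
    long_tail = sum(
        visits
        for term, visits in term_visits.items()
        if len(term.split()) >= min_words
    )
    return long_tail / total

# Made-up sample numbers for illustration only; these are not Hitwise data.
sample = {
    "trulia": 10,
    "homes for sale in springfield": 6,
    "123 Broadway NY, NY": 4,
}
share = long_tail_share(sample)
```

With this sample, the brand keyword accounts for half of the visits and the two long tail terms account for the other half.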
Impact of Indexing on Agent and Broker Websites
WAV Group talked about indexing listings with Wolfnet, a leading IDX provider in the United States that offers its clients the ability to index listings. Wolfnet was gracious enough to perform a test and benchmark the effects of indexing on large, medium, and small broker websites. You can see from the chart below that each website had improvements in search engine performance as a result of indexing. However, you will also notice that the bigger the site, the faster and higher the rise in traffic.
WAV Group believes that larger sites get more traffic from indexing listings because of the volume of traffic, and specifically search engine traffic, that they received before indexing. This opinion is based upon theory rather than science, since search engine ranking formulas are guarded secrets. It would appear from the results that search engines consider websites with higher traffic more likely to be the originator of indexed content. Larger sites therefore see larger increases in traffic as a result of indexing.
Industry Wide Indexing Impact Theory
As you extrapolate this out, you can imagine the potential impact of modifying IDX rules and regulations to allow indexing. The following is our theory of who will benefit most from this ruling.
1. Third Party Websites (already indexing and ranking high)
2. Franchise Websites
3. MLS Consumer Facing Websites
4. Large Broker Websites
5. Medium Broker Websites
6. Virtual Tour and other service sites
7. Small Broker Websites
8. Agent Websites
In our estimation, the big will get bigger and the small will get smaller.
MLS Guidance on Indexing
WAV Group would advise more careful consideration of indexing.
First Consideration: only allow the listing broker to index his or her own listings. Our reasoning is simple.
1. The listing broker is legally responsible for the listing content.
2. The listing broker and its agent collected the listing content and created a unique and aggregate piece of work in the MLS system.
3. The listing broker and its agent curated the listing content from inception to sale.
Second Consideration: only allow the buyer’s broker representative to index sold listings. Insofar as the seller’s representative is the responsible party for the active listing data, the buyer’s representative is the logical responsible party for the sold listing data and its accuracy.
Third Consideration: only allow indexing on a limited number of fields that satisfy the need of a recognized search engine to direct consumers to the source of the indexed data. Presumably only a short list of fields is necessary for this purpose.
Let the indexing begin! On your marks, get set, go! See you at the finish line in a few years as this shakes out.
Thanks to Brian Larson, of the Law Offices of Larson/Sobotka. In addition to Larson’s review of this paper in draft form, the firm’s blog, http://www.mlstesseract.com/, has been a great resource for this paper.
Thanks also to Michael Wurzer, CEO of FBS Data Systems. In addition to Wurzer’s review of this paper in draft form, the firm’s blog, http://www.flexmls.com/blog/, has been a great resource for this paper.
Thanks to the MLS policy committee of the National Association of REALTORS. Their many hours of effort and resources have generated policies that guide the industry into the future.