This report has been prepared to help SEOs understand the concepts and practical applications contained in Google's US Patent Application #20050071741 - Information Retrieval Based on Historical Data. My own advice and interpretation are offered throughout this paper; please conduct your own research before acting on the recommendations.

Sections in this Report:

I. Overview of the 5 Most Critical Concepts from this Paper
  Google's Concept of "Document Inception"
  How Changing Content can Affect Rankings
  Spam Detection & Punishment
  What Google is Attempting to Measure
  The Impact of this Patent

II. Analysis and Interpretation of 63 Patent Components
  History Data (1)
  Inception Date (4)
  Frequency of Document Changes over Time (6)
  Amount of Changes over Time (3)
  Click-Through Rate Data (2)
  Document Association to Search Terms (1)
  Queries that Remain the Same but have New Meanings over Time (1)
  Staleness of Documents (3)
  Link Behavior (4)
  Freshness of Links (4)
  Anchor Text Changes over Time (1)
  Content Changes in a Document compared to Linking Anchor Text (1)
  Freshness of Anchor Text (2)
  Traffic Characteristics of Site/Page (2)
  User Behavior (2)
  Domain Related Information (3)
  Prior Rankings Data (4)
  User Maintained Data (3)
  Growth Profiles of Anchor Text (1)
  Linkage of Independent Peers (1)
  Document Topics (1)
  Identifying Relevant Documents (1)
  Plurality of History Data (1)
  History Component (1)
  Ranking of Linked Documents (10)

III. Documentation on Description Elements
  Document Inception Date
  Content Updates/Changes
  Query Analysis
  Link-Based Criteria
  Anchor Text
  Traffic
  User Behavior
  Domain Related Information
  Ranking History
  User Maintained/Generated Data
  Unique Words, Bigrams, Phrases in Anchor Text
  Linkage of Independent Peers
  Document Topics

IV. List of Additional Coverage & Resources

Overview of the 5 Most Critical Concepts from this Paper

These 5 concepts are what I believe to be the most ground-breaking and important for search engine optimization professionals to understand in order to best conduct their work.

1. Google's Concept of "Document Inception"

The date of "document inception", which can refer to either a website as a whole or a single page, is used in many different areas by Google. This data can come from the registration info, the date Google first found a link to the site/page, or the date it first found the site/page itself. Google will be using this data to rank documents and establish credibility and relevance.

2. How Changing Content can Affect Rankings

Changing content over time has a huge impact on Google's measures according to this patent. They use changes to determine the "freshness" or "staleness" of websites and pages, and how that data impacts the value of the links on the page as well as its rankings. They'll also measure large, "real" content changes vs. superfluous changes and rank based on that data.

Google also says that for some types of queries, particular results are more valuable: stale results may be desirable for information that doesn't need updating, fresh content is good for results that require it, seasonal results may pop up or down in the rankings based on the time of month/year, etc.

3. Spam Detection & Punishment

Google is employing many new systems of spam detection and prevention according to the patent. These include:

  Watching for sites that rise in the rankings too quickly
  Watching for registration information, IP addresses, name servers, hosts, etc. that are on their "bad list"
  Growth of off-topic links
  Speed of link gain
  Percentage of similar anchor text
  Topic/Subject shifts or additions

4. What Google is Attempting to Measure

Google wants to measure or is attempting to actively measure each of the following:

Domain information:
  Registration date
  Length of renewal (10 years, 5 years, 1 year, etc.)
  Addresses and Names of admin & technical contacts
  DNS Records
  Address of Name Servers
  Hosting Location & Company
  Stability of this data

Information on User Behavior Online:
  CTR (Click-Through Rate) of individual results in the SERPs
  Length of time spent on a given site/page

Data contained on your computer:
  Favorites/Bookmarks List
  Cache & Temp Files
  Frequency of visits to particular sites/pages (history)

5. The Impact of this Patent

I believe that this patent will help to verify most of the theories surrounding Google's rankings. There has been speculation over the past 18-24 months on nearly every subject covered in this patent at the major SEO forums, but this will serve as verification.

Although it is long, I urge every SEO/Webmaster to read this page completely. I have attempted to make the information legible and readable, and only pulled out parts that are important to the active practice of SEO (which was almost 2/3 of the document, surprisingly). If you have any questions or corrections on this summary, please send me an email.

--------------------------------------------------------------------------------

Analysis & Interpretation of the 63 Patent Components

History Data

1. Documents may be scored in Google's rankings based on "one or more types of history data".

Inception Date

2. The "inception date" (read: registration date) may be considered as a scoring factor. I assume that older will be considered better, but this is not spelled out.

3. Google may determine how old each of the pages on a given website is and then determine the average age of pages on the website as a whole. The difference between a specific page's age and the average age of all documents on the site will be used in the ranking score.

4. The score for a website may include the amount of time since "document inception", i.e. how old the website is.

5. One methodology of discovering site age might include when Google first "discovered" (read: spidered) the site, when Google first finds a link to the site, and when the site contains a "predetermined number of pages". I interpret this to mean that Google has some kind of threshold for site size (number of pages) that, when reached, triggers a scoring effect (probably positive).

Frequency of Document Changes over Time

6. Google's scoring will (according to the patent) be based on "determining a frequency at which the content changes over time".

7. The "frequency at which the content changes" will be determined by the average time between changes, the number of changes over a particular time period, and the rate of change of one time period vs. the rate of change for another time period. So, if you are updating your website every day, then switch to updating once a week, your scoring in the historical measurements at Google will shift.

8. Scoring will also include how much of the site has changed over a given time period (new pages, changes, etc.).

9. The scoring based on changes (described in #8) will be determined by the number of new pages within a time period, the ratio of new pages vs. old pages and the total "percentage of the content of the document that has changed during a timed period."

10. The scoring of changes (from #8) will be based on the "perceived importance of the portions" that have been changed. The score will also take into account the changes as compared to the weighting(s) of each of the different pages of the site, i.e. if important pages change, it will have a different impact than if unimportant pages change. My guess is that importance is mostly determined by links (both internal and external) that point to a given page. So if your contact page changes, it's not a big deal, but if your home page changes, that's a bigger deal.
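
To make items 7 and 10 more concrete, here is a minimal Python sketch of how measures like these could be computed from a list of dates on which a page was seen to change. This is my own illustration, not Google's method: the patent application gives no formulas, and every function name, the 14-day window, and the importance weights below are assumptions made up for the example.

```python
from datetime import date, timedelta
from statistics import mean


def average_days_between_changes(change_dates):
    """Mean gap, in days, between consecutive observed changes (item 7)."""
    ordered = sorted(change_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return mean(gaps) if gaps else None


def changes_in_period(change_dates, start, end):
    """Number of observed changes with start <= date < end (item 7)."""
    return sum(start <= d < end for d in change_dates)


def rate_ratio(change_dates, split, window_days):
    """Changes in the window after `split` divided by changes in the window before.

    Both windows have the same length, so the ratio of counts equals the
    ratio of rates. A value below 1 means updates have slowed down.
    """
    window = timedelta(days=window_days)
    before = changes_in_period(change_dates, split - window, split)
    after = changes_in_period(change_dates, split, split + window)
    return after / before if before else None


def weighted_change_fraction(pages):
    """Importance-weighted share of a site that changed (item 10).

    `pages` is a list of (importance_weight, fraction_changed) pairs.
    The weights are hypothetical stand-ins for "perceived importance".
    """
    total_weight = sum(weight for weight, _ in pages)
    if not total_weight:
        return 0.0
    return sum(weight * fraction for weight, fraction in pages) / total_weight


# Hypothetical site: updated daily for two weeks, then only once a week.
daily = [date(2005, 1, 1) + timedelta(days=i) for i in range(14)]
weekly = [date(2005, 1, 14) + timedelta(weeks=i) for i in range(1, 5)]
changes = daily + weekly
split = date(2005, 1, 15)

print(average_days_between_changes(changes))
print(rate_ratio(changes, split, 14))

# Home page (weight 10) half rewritten vs. contact page (weight 1) half rewritten.
print(weighted_change_fraction([(10, 0.5), (1, 0.1)]))
print(weighted_change_fraction([(10, 0.1), (1, 0.5)]))
```

In this sketch the switch from daily to weekly updates shows up as a rate ratio well below 1, and the same amount of rewriting counts for more on the heavily weighted home page than on the contact page. How Google actually weights or combines such numbers, if it does at all, is not stated in the patent application.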
