
Restructuring Websites to Improve Indexability

Flow Chart Methodology

Problem: Google is struggling to crawl your site because a large number of URLs
point to identical or similar content. As a result, crawling consumes more
bandwidth than necessary, and Google is unable to index all of your site
content.

Effect: The current search system produces a unique URL for every possible
filter combination. Google will spend time crawling these pages and attempting
to index them. The number of potential pages created this way is huge: scaled
across the entire site, it could easily exceed a million crawlable pages.

Solution: This is crippling Google’s ability to crawl and understand your site. By
studying the job aggregator industry landscape, we can see that competitors are
currently in one of two positions:

(1) They are heavily limiting the pages that are indexable and as a result
miss out on the opportunity to rank for keywords with search volume;

(2) They aren’t limiting their pages and wind up with hundreds of low-value
pages showing up on the SERPs, which reduces the link equity passed to
their pages.

We are suggesting a radically different solution that uses actual search
volume data to show only the pages that are valuable to search engines.

Note: This flow chart is based on a job aggregator site, but the methodology can
be replicated and applied across different industries and sites.

[Flow chart]
User requests page
Rules exist? Yes: Load Page. No: Combination Script.
Job Available? No: Load Page (Noindex & Nocrawl). Yes: KW in Database?
KW in Database? All: Check Search Volume. Some/None: Fetch Search Volume.
Highest SV: Create Page (Index & Crawl).
SV<50: Load Page (Noindex & Nocrawl).

Flow Chart Breakdown

The flow chart starts off with a user requesting a page, which is typically done by
using the search option on the homepage.

If rules for this query don’t already exist, it then follows this path:

1) It passes the search query through a keyword combination script; this
script outputs different combinations of the search conducted by changing
the order of the keywords to see all possible combinations.

2) It then searches the database to see if this job is available; if it is not,
it loads a page stating so that is no-index and no-crawl.

3) If the job is available, it then searches the keyword database for all the
different keyword combinations that the script has output.

4) There are then two options – either all keywords exist or some/none of the
keywords exist; in which case:

a. All keywords exist

i. Search Volume > 50: The search volume for these keywords is
then checked and the keyword with the highest search volume
is picked. A page is then created using that keyword
combination and is both indexed and crawled.

ii. Search Volume < 50: Load page for users but no-index and
no-crawl it.

b. Some or none of the keywords exist: Fetch search volume data for
the keywords using keywordtool.io, then follow the process in point
(a).

If the rule for this query already exists, then it simply loads the relevant page.
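
The path above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used on any particular site: the function names, the `volume_db` dictionary, and the `fetch_volumes` callable are all hypothetical stand-ins for the keyword database and the search volume lookup. Tie handling and API failures are deliberately left out of this sketch.

```python
from itertools import permutations

SV_CUTOFF = 50  # cut-off used in the text; adjust for your industry


def keyword_combinations(query):
    """Return every ordering of the words in the query, original order first.

    Note: the number of orderings grows factorially with the word count,
    so long queries would need a cap in practice.
    """
    words = query.lower().split()
    combos = []
    for perm in permutations(words):
        phrase = " ".join(perm)
        if phrase not in combos:  # repeated words would yield duplicates
            combos.append(phrase)
    return combos


def decide(query, job_available, volume_db, fetch_volumes):
    """Return (page_keyword, indexable) for a query that has no rule yet.

    volume_db     -- hypothetical dict: keyword -> monthly average volume
    fetch_volumes -- hypothetical callable: list of keywords -> dict of volumes
    """
    if not job_available:
        return query, False  # "no jobs found" page, no-index and no-crawl

    combos = keyword_combinations(query)
    missing = [kw for kw in combos if kw not in volume_db]
    if missing:  # some or none of the keywords are in the database
        volume_db.update(fetch_volumes(missing))

    best = max(combos, key=lambda kw: volume_db.get(kw, 0))
    if volume_db.get(best, 0) > SV_CUTOFF:
        return best, True  # create the page, index and crawl it
    return query, False  # load the page for users only
```

The function returns a decision rather than rendering anything, so the same logic could be stored as the "rule" for the query and reused on the next request.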

It’s important to note that we recommend basing the keyword database’s search
volume data on monthly average search volume. This data only needs to be
refreshed yearly, but we advise against deleting any of the old data: keeping
it stored helps develop an understanding of changing market demand over time.
Your search volume cut-off can be updated at any time and is based on what
makes sense for your industry.
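
One way to keep old data while refreshing yearly is to store one row per keyword per year instead of overwriting. The sketch below uses SQLite from the Python standard library. The table name, column names, and helper functions are assumptions made for illustration and are not part of the original methodology.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a hypothetical search volume store."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS search_volume ("
        " keyword TEXT NOT NULL,"
        " year INTEGER NOT NULL,"
        " monthly_avg INTEGER NOT NULL,"
        " PRIMARY KEY (keyword, year))"
    )
    return conn


def record_volume(conn, keyword, year, monthly_avg):
    """Add this year's figure; rows for earlier years are left untouched."""
    conn.execute(
        "INSERT OR REPLACE INTO search_volume VALUES (?, ?, ?)",
        (keyword, year, monthly_avg),
    )
    conn.commit()


def latest_volume(conn, keyword):
    """Return the most recent monthly average, or None if never fetched."""
    row = conn.execute(
        "SELECT monthly_avg FROM search_volume"
        " WHERE keyword = ? ORDER BY year DESC LIMIT 1",
        (keyword,),
    ).fetchone()
    return row[0] if row else None
```

Because the cut-off is applied when the data is read rather than when it is stored, changing the cut-off later does not require re-fetching anything.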

Keywordtool.io

Keyword Tool API lets you automate access to keyword data from Google. You can
generate data for hundreds of thousands of keywords using very simple scripts.

Simply make an API request, specify location and/or language targeting, and
receive data in JSON or XML format in seconds. An API request can return Search
Volume data for up to 800 keywords, which means you can process a long list
of keywords very quickly.
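
Given the limit of 800 keywords per request, a long keyword list has to be split into batches. The sketch below shows only the batching. `request_volumes` is a hypothetical callable standing in for the real API call, whose endpoint and parameters are not shown here and should be taken from the API documentation.

```python
MAX_KEYWORDS_PER_REQUEST = 800  # per-request limit stated in the text


def batches(keywords, size=MAX_KEYWORDS_PER_REQUEST):
    """Yield consecutive slices of at most `size` keywords."""
    for start in range(0, len(keywords), size):
        yield keywords[start:start + size]


def fetch_all_volumes(keywords, request_volumes):
    """Collect search volumes for any number of keywords, one batch per request.

    request_volumes -- hypothetical callable: list of keywords -> dict mapping
                       each keyword to its search volume
    """
    volumes = {}
    for chunk in batches(keywords):
        volumes.update(request_volumes(chunk))
    return volumes
```

With this split, a list of 2,000 keywords would take three requests (800, 800, and 400 keywords).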

Keyword Tool offers very precise geographic and language targeting of the
Search Volume data. Using the API, you will be able to get Google Search Volume
data for keywords in 192 countries, 44,051 individual locations and 43 languages.

There are options for both monthly and annual subscriptions, and you can sign up
on keywordtool.io, where you will also find full API documentation and code
samples.

Error Case Scenarios

Different error case scenarios might occur, such as:

1) Tie Breaker: If, when checking the search volumes of several keyword
combinations, we find that they have the same search volume (and that
search volume is > 50), create the page based on the original search query.

2) API Down: If the API is down and search volume data cannot be retrieved,
load a page that is no-index and no-crawl for usability, and avoid storing
the query in the database, as there is currently no search volume data
for it.
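
Both error cases can be sketched in Python as follows. This is an illustration under assumptions, not a definitive implementation: `VolumeAPIError`, `pick_keyword`, `resolve`, and the `fetch_volumes` and `store` arguments are hypothetical names. The tie-breaker is read here as a tie for the highest search volume.

```python
class VolumeAPIError(Exception):
    """Hypothetical error raised when the search volume API is unreachable."""


def pick_keyword(original_query, volumes, cutoff=50):
    """Return the keyword to build the page on, or None if none beats the cut-off."""
    top = max(volumes.values(), default=0)
    if top <= cutoff:
        return None
    winners = [kw for kw, sv in volumes.items() if sv == top]
    if len(winners) > 1:
        return original_query  # tie breaker: fall back to the original query
    return winners[0]


def resolve(original_query, combos, fetch_volumes, store):
    """Return (page_keyword, indexable), handling an unavailable API."""
    try:
        volumes = fetch_volumes(combos)
    except VolumeAPIError:
        # API down: serve a no-index, no-crawl page and store nothing,
        # so the query is evaluated again on a later request.
        return original_query, False

    store.update(volumes)
    keyword = pick_keyword(original_query, volumes)
    if keyword is None:
        return original_query, False
    return keyword, True
```

Leaving the store untouched on failure means the missing data is not mistaken for a search volume of zero.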
