
Index Cards – An URL Naming Standard

The URL Naming Standard Index Cards is an open source standard. It describes a method
for including a concise group of tags in a web page URL to describe that page to search
engines, together with a user-fed filter mechanism on the search engine side that
prevents SEO gaming. Author: Jan Neirynck

We think of the internet as one big library in which the books are web pages: large
chunks of text indexed by search engines. And we think of the search terms we type into
search engines as the index cards of our global library. We are wrong. For now, at least.

The Web Contains Text Without Meaning

The internet links web pages and documents together in a web of text that reveals its
meaning only to the human web surfer who visits and reads it. These text sources are not
tagged, labeled or objectively described by metadata that would turn them into units of
information searchable by their identifying tags. Of course the web also contains images,
audio and video: content that is not tagged either. This makes it impossible to ask
search engines database-like questions, because there is no metadata about the indexed
web pages to query. Search engines return links to web pages containing the exact words
in your search terms. The accuracy of the search results is determined by user selection
(i.e. the most clicked links travel up in the sort order) and by counting the incoming
links to the web pages that contain your search terms. So the web is one global
interlinked source of untagged text.

What we need in our global library are index cards. But which information or tags should
we include on these cards?

How to organize what people look for online from the real world

The scope of questions a search engine gets asked encompasses the dreams and ideas of
all of mankind. It is impossible to explicitly develop an ontology – a logical group of
tags – describing all things in this world. Instead we should define a simplified model of
reality that can be used to let people themselves organize the web content they
themselves put on the internet.

People don’t look for the most linked-to or most clicked-on search result. People search
a digital world based on a mind map linked to the physical real world they know and
experience every day. They search for people and organizations who can tell them about
things and ideas they want to know about or act on. Search is not text based; it is based
on concepts that come from reality and just happen to be expressed in a nondescriptive
text code on the web. Right now people have no specific way to tell the search engine
explicitly what they are looking for. There is no list of tags to choose from to drill
down to a list of specific websites that contain exactly what you are searching for. We
should stop reengineering websites to contain the keywords people type into search
engines, while those people try to guess the keywords website builders settled on, while
those builders try to guess their desired visitors’ keywords… Sounds like a catch-22.

Just as you have a very simple list of keywords on index cards pointing to books in a
library, you should have a very simple group of tags to choose from when describing a
web page you are putting up. The ultimate index card of a web page is its URL. It points
directly to the web page technologically, and it should do the same logically, in its
meaning.

Text and Idea by Jan Neirynck – Belgium – 2009/07/11. 1/3

URL Naming Standard

These 7 concepts group all tags needed to define a web page in a concise manner.

IDENTITY=     (Main | Sub) Name:website-entity, Activity:value, Free Tags
CONTACT=      Email:value, Telephone:value, Fax:value, Postal Address:value
LOCATION=     Country:value, State:value, Zip-Code:value, City:value,
              County:value, Street:value, Building:value, Floor:value, Room:value
TIME=         Open:value, Closed:value
INFORMATION=  (Text | Pictures | Audio | Video) Name:value
              (Free | Price:valuecurrency) (Online | Physical) Used-In:activities
              Free Tags
SERVICE=      (Free | Price:valuecurrency) (Online | Physical) Name:value
              Transforms:x-To-y Used-In:activities
              Free Tags
PRODUCT=      (Online | Physical) Name:value, Product-Number:value
              Price:valuecurrency Brand:value
              Length:value, Width:value, Height:value, Weight:value …
              Transforms:x-To-y Used-In:activities
              Free Tags

The identity, contact, location and time tags should identify the real world entity
responsible for the website and the website as well. These groups can be combined in
one URL identifying the main website address. In case of multiple sub entities of the
main entity different URLs can be added to the main hostname thus describing multiple
locations. The main entity will then be identified by the Main tag, the sub entities by the
Sub tag. The same information should be reproduced on the web page the URL points to.

Information, service and product groups should each identify single web pages. Each of
the web pages described should contain the information pointed to by the URL.

All tags, including free tags, consist of a name:value pair: the name first, then a “:”,
then the value. The value is a word or a number. If it is a number, it should be
accompanied by a unit identifier. For example, Length:250mm denotes a length of 25 cm.
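
The name:value rule can be illustrated with a small parser. This is only a sketch under assumptions: the text does not define a formal grammar, so the regular expressions below, the function name parse_tag and the choice to return numbers as a (number, unit) tuple are all hypothetical.

```python
import re

# Assumed shape of a tag: a name, a colon, then the value.
TAG_RE = re.compile(r"^(?P<name>[^:]+):(?P<value>.+)$")
# Assumed shape of a numeric value: digits followed by a unit identifier, e.g. "250mm".
NUMBER_RE = re.compile(r"^(?P<number>\d+(?:\.\d+)?)(?P<unit>[A-Za-z]+)$")


def parse_tag(tag):
    """Split 'Name:value' into (name, value); numeric values become (number, unit)."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not a name:value pair: {tag!r}")
    name, value = match.group("name"), match.group("value")
    numeric = NUMBER_RE.match(value)
    if numeric:
        return name, (float(numeric.group("number")), numeric.group("unit"))
    return name, value


print(parse_tag("Length:250mm"))
print(parse_tag("City:London"))
```

Keeping the unit attached to the number means a search engine could, in principle, compare Length:250mm with a length given in another unit, but that conversion step is not part of this sketch.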

The time tags have a fixed syntax. They can describe opening and closing times at a
date, year, month, week, weekday, day_of_year, hour and minutes level. This is how:

17h00)FR(8h30-16h00)SA(8h30-12h00)) … DEC(MO(8h30-17h00)TU(8h30-12h00))
&CLOSED:2009(JAN(SU()) … DEC(SU())(25/12/2009))

&CLOSED:2009(1,4, …, 358)
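
The day_of_year form could be expanded into calendar dates along the following lines. This is a hedged sketch: it assumes the tag has the shape CLOSED:year(day,day,…) with every day number written out, and the pattern and the function name closed_dates are illustrative, not part of the standard.

```python
import re
from datetime import date, timedelta

# Assumed shape of the day_of_year form: CLOSED:<year>(<day>,<day>,...)
CLOSED_RE = re.compile(r"^CLOSED:(\d{4})\(([\d,\s]*)\)$")


def closed_dates(tag):
    """Return the calendar dates named by a day-of-year CLOSED tag."""
    match = CLOSED_RE.match(tag)
    if match is None:
        raise ValueError(f"unsupported CLOSED tag: {tag!r}")
    year = int(match.group(1))
    days = [int(d) for d in match.group(2).split(",") if d.strip()]
    first = date(year, 1, 1)
    # Day 1 is 1 January, so day n lies n - 1 days after it.
    return [first + timedelta(days=n - 1) for n in days]


for day in closed_dates("CLOSED:2009(1,4,358)"):
    print(day.isoformat())
```

The weekday and hour forms shown above would need a fuller parser; only the numeric day list is handled here.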


All these name:value pairs should be included in the URL and linked by & characters.
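
As a minimal sketch of that joining rule, the two helpers below build a tag string from name:value pairs and split it again. The function names, the hostname www.example.com, the sample values and the choice to place the tags directly after the hostname are assumptions made for illustration; the text does not fix them. Values are assumed to contain no & character.

```python
def build_tag_string(pairs):
    """Join (name, value) pairs as name:value, linked by '&'."""
    return "&".join(f"{name}:{value}" for name, value in pairs)


def parse_tag_string(tag_string):
    """Split a tag string back into (name, value) pairs."""
    pairs = []
    for part in tag_string.split("&"):
        # Split on the first ':' only, so a value may itself contain ':'.
        name, _, value = part.partition(":")
        pairs.append((name, value))
    return pairs


tags = [("Name", "Bistro-Example"), ("Country", "France"), ("City", "Nice")]
tag_string = build_tag_string(tags)
print("http://www.example.com/" + tag_string)
assert parse_tag_string(tag_string) == tags
```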



This can become:

A dedicated web page can become:

These tags will be queried by search engines to provide search results as an answer to
very precise questions asked by web surfers. For example: which free online service
converts MPG video files to WMV?
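
Such a query could be answered by a plain set comparison over the tags of each indexed page. The sketch below assumes a search engine has already extracted the tags from each URL; the index contents, the example hostnames and the function name search are hypothetical.

```python
# Hypothetical index: URL -> set of tags extracted from that URL.
INDEX = {
    "http://converter.example/": {"SERVICE", "Free", "Online", "Transforms:MPG-To-WMV"},
    "http://shop.example/": {"SERVICE", "Price:10EUR", "Online", "Transforms:MPG-To-WMV"},
    "http://audio.example/": {"SERVICE", "Free", "Online", "Transforms:WAV-To-MP3"},
}


def search(index, required_tags):
    """Return the URLs whose tag set contains every required tag."""
    required = set(required_tags)
    return sorted(url for url, tags in index.items() if required <= tags)


# "Which free online service converts MPG video files to WMV?"
print(search(INDEX, ["SERVICE", "Free", "Online", "Transforms:MPG-To-WMV"]))
```

Only the first entry carries all four tags, so it is the only URL returned; the paid converter and the audio converter are filtered out.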

Prevent SEO Gaming By User Feedback

Of course search results need continuous improvement through user feedback. Make this
simple and easy, and make users understand that they get better search results by
participating in user feedback. Let users score search results by answering three
questions: a] this site is what I was looking for: YES or NO; b] this site is not what it
says it is: YES or NO; c] this site is spam: YES or NO. These questions enable search
engines to reroute wrongly understood search queries, inform website owners of web pages
that they tagged in good faith but wrongly, and root out malicious spammers.
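
One way the three answers might be tallied per search result is sketched below. The text does not say how votes are weighed, so the simple majority threshold, the order in which the checks run and the action names are all illustrative assumptions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Feedback:
    wanted: bool      # a] this site is what I was looking for
    mislabeled: bool  # b] this site is not what it says it is
    spam: bool        # c] this site is spam


def suggest_action(votes, threshold=0.5):
    """Pick a follow-up for one search result from its feedback votes.

    The majority threshold and the action names are illustrative assumptions.
    """
    if not votes:
        return "keep"
    tally = Counter()
    for vote in votes:
        tally["unwanted"] += not vote.wanted
        tally["mislabeled"] += vote.mislabeled
        tally["spam"] += vote.spam
    total = len(votes)
    if tally["spam"] / total > threshold:
        return "flag-as-spam"
    if tally["mislabeled"] / total > threshold:
        return "notify-owner"
    if tally["unwanted"] / total > threshold:
        return "reroute-query"
    return "keep"


votes = [Feedback(False, True, False), Feedback(False, True, False), Feedback(True, False, False)]
print(suggest_action(votes))
```

With two of three users reporting that the site is not what it says it is, this sketch suggests notifying the website owner.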

In short, we should be able to query the index cards of our global library. We should be
able to ask a search engine questions like: Which shops in the region of London sell
product A from brand B? Or: give me the websites of all American universities that have
a course in writing and are in the state of New York. Who are the impressionist painters
and which are their most important paintings? Which American newspapers published a
review of that new Jackie Chan movie? Which bistros are open on Sunday in Nice, France
and what is their address? Search should be fast, easy and fun.

Purpose Of This Text

I have the feeling internet search becomes more and more cumbersome every day. Simply
said, a lot of the time I just can’t find what I am looking for. I get polluted search
results, websites trying to sell me stuff and, most of the time, things that just are not
what I am looking for, because there is no specific way to indicate what I am searching
for. I am certain I’m not the only one angered by the morass I sink into every time I
simply search for something on the internet. I might be wrong, but isn’t this something
an organization like the World Wide Web Consortium (W3C) should address full force? The
proposal in this text isn’t a be-all and end-all solution. I just hope somebody with
enough pull starts the gears on what needs to be done: agree on a simple tag list that
makes sense for non-techies and get the big internet players to actively use and enforce
it. Improving the ease of use and the value of the information already on the web should
be on everybody’s agenda.

The Author

Jan Neirynck (born 18/11/1974) has a bachelor’s degree in IT. He combines a fond interest
in mankind with a good basic insight into technology. In his spare time he writes fiction
as well as nonfiction. Before and after finishing his bachelor’s degree he gained
experience in different jobs inside and outside of IT. He lives in Belgium (Flanders) and
is a native Dutch speaker.
